Slack error handler: Resource exists but you don't have access to it
When trying to send a test message for the Slack error handler, the run fails with the following error:
Any idea what could be going wrong?
either you do not have access to it, or the error_handler group doesn't have access to it
it's not a resource I've created manually, f/slack_bot/bot_token is the resource that got created after setting up Slack OAuth as per https://www.windmill.dev/docs/misc/setup_oauth#slack
the weird bit is the test message worked once
then i changed the target channel
and it seems to have just stopped working
the resource perms are incorrect, it should give perms to the error_handler group
did you sync from git?
i do use the windmill cli for pulling and pushing, but i don't sync resources at all, so it shouldn't have been affected
I run wmill sync pull/push --skip-variables --skip-secrets --skip-resources
I see, so likely the perms changed when you changed the channel even though it shouldn't have
add the error_handler group to that resource as an admin
I am unsure how to change the permissions for a resource
and for some reason I am struggling to find relevant docs
resources pages -> find resource -> share
oh, right, thanks
lemme see
so I did this: https://img.qilin-qilin.ts.net/2024-08-13_10-30-57_OJX7L.webp
and nothing seems to have changed, the error is still the same: https://img.qilin-qilin.ts.net/2024-08-13_10-32-08_q8KGo.webp
you also need to share the variable
i see
there is a linked variable with the same name
yeah, that did it
i am still unsure why it worked initially then stopped working, but it seems to be working now
it's weird though, since the variable is linked, the sharing should have applied to it as well. What version are you on?
We will investigate and try to reproduce
EE v1.377.1-5-gd56a956b9
thanks
all that i did was configure Slack OAuth, after which I set one channel as the target, and it worked
then later on I changed the channel
which is the only change I remember doing
after which I noticed it just stopped working
@rubenf I appear to have hit this issue again:
I've checked the permissions of both the resource and the variable, I've tried recreating it entirely, but that doesn't fix it for whatever reason. Any ideas?
the user that is running this is neither an admin nor in those groups
it's working as expected
well, the job is supposed to be "permissioned as g/error_handler"
(this is the job triggered by the "Send test message" button in Workspace Settings > Error Handler > Send test message)
I tried sharing the resource and variable with g/all as well, which did nothing, I am not even sure what g/all stands for
the error you've shown is the error of that job?
yes
my "critical alerts" tab is filled with errors exactly like that one
due to the workspace error handler failing
apparently it's been happening for a while, I'm noticing just now
I can't reproduce and indeed it makes no sense since it's permissioned by g/error_handler
@invakid404 can you look at the value of that resource from the resources page
check if it's not empty or something
and on the very latest version, the error message includes more info about who you are authed as
the resource is linked to the variable, and the variable itself has a value
I am running EE v1.424.0
what is the resource value itself?
the json of it
and you're not using agent workers right?
Wait for latest to build, and paste here the new error message
no, i'm not using agent workers
and ok
just updated the server to EE v1.424.0-7-g44f3dcc2b, error message is exactly the same:
am i supposed to update the workers as well?
yes
that's going to be slightly harder, as we use ee-full, give me a moment
and those workers use the same DATABASE_URL as your servers, and you have not disabled RLS and aren't in a funky setup?
workers have their own postgres accounts, but they're connected to the same database
i don't remember disabling RLS
"workers have their own postgres accounts"
That's likely the issue, try giving them the normal accounts to see if that solves your issue
right, lemme try
also make sure you haven't done anything to the windmill_user user role
so I redeployed all workers with the same DATABASE_URL as the server and it's still erroring
would it make sense for this to be the only issue retrieving resources I have if i've indeed done something to the windmill_user role?
do you often run things as non admin?
Because if not then yes it would make sense
log in as a normal user/non admin
i may have an idea what's going on then, lemme try something
i tried granting permissions to windmill_user, but unfortunately nothing changed
it is very possible my windmill_user role is not okay though
i'm not sure how i'd go about fixing it
yep, my windmill_user role is very much not okay
ALTER ROLE windmill_user WITH BYPASSRLS;
fixes the issue, so something's wrong with the state of our database
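A quick way to confirm this kind of role drift, sketched under the assumption of direct psql access to the Windmill database: the `pg_roles` catalog view exposes each role's `rolbypassrls` attribute. `windmill_admin` is included on the assumption that it is the companion role created alongside `windmill_user`; adjust the names to match your setup.

```sql
-- Show the RLS-related attributes of the roles in question.
-- Any role name other than windmill_user is an assumption about the setup.
SELECT rolname, rolsuper, rolbypassrls, rolcanlogin
FROM pg_roles
WHERE rolname IN ('windmill_user', 'windmill_admin');

-- List which roles have been granted membership in windmill_user,
-- e.g. to check whether a custom worker account is missing from it.
SELECT m.rolname AS member, r.rolname AS granted_role
FROM pg_auth_members am
JOIN pg_roles r ON r.oid = am.roleid
JOIN pg_roles m ON m.oid = am.member
WHERE r.rolname = 'windmill_user';
```

Comparing this output against a fresh Windmill database may show which attributes or memberships were lost in the restore.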
I suppose my only option is to run all migrations from zero, then restore from Git
we did change database servers at one point, so i'm suspecting it was a bad restore
I would've never suspected it's RLS, so thanks for helping me debug this
it's necessarily RLS, but windmill_user shouldn't use BYPASSRLS
the permissions are enforced with RLS
indeed you might need to recreate the whole user
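Before recreating the role, it may help to look at what RLS actually enforces. This is a minimal sketch, assuming the relevant tables are named `resource` and `variable` (an assumption, not confirmed in this thread) and that you can query the PostgreSQL catalogs directly.

```sql
-- Is row-level security enabled on the tables involved?
-- Table names here are assumptions about the Windmill schema.
SELECT relname, relrowsecurity, relforcerowsecurity
FROM pg_class
WHERE relname IN ('resource', 'variable') AND relkind = 'r';

-- Which policies exist on those tables, and which roles do they apply to?
SELECT tablename, policyname, roles, cmd
FROM pg_policies
WHERE tablename IN ('resource', 'variable');

-- Once the role is repaired or recreated, remove the temporary workaround
-- so that permissions are enforced by RLS again.
ALTER ROLE windmill_user WITH NOBYPASSRLS;
```

If the policy list comes back empty or the policies do not reference the expected roles, that would be consistent with a bad restore rather than a bug in the error handler.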
yeah, I'll leave it like this for now, all of our jobs are permissioned as some admin right now, so it doesn't affect us that much, and I'll address this properly over the weekend when I wouldn't interrupt any of our clients
👍