A 400 error
It all starts with an old 400 error that had been occurring in Telescope for a while, first documented here.
I found that during the SAML sign-in flow for authentication, the session ID cookie is occasionally blocked on the callback, when the SAML response returns to us.
I could not reproduce this in the local setup, so I resolved to push tests for possible fixes to prod and David let me, which I am sure he regrets now.
This is the starting point for my experimentation and inevitable breaking of Telescope prod.
The first idea
This is my first PR and it outlines my first ideas.
Directly from the PR:
There is a 400 we sometimes get because the cookie that is sent to us with the post binding from Azure AD is sometimes blocked since it is from a different origin/site. Enabling sameSite 'none' should hopefully fix this. By default sameSite is set to 'Lax', which sometimes blocks incoming cookies from other sites, such as the request AD makes to our callback.
This is all outlined nicely in the following document David found https://web.dev/samesite-cookie-recipes/#unsafe-requests-across-sites.
This pattern is used for sites that may redirect the user out to a remote service to perform some operation before returning, for example redirecting to a third-party identity provider. Before the user leaves the site, a cookie is set containing a single use token with the expectation that this token can be checked on the returning request to mitigate Cross Site Request Forgery (CSRF) attacks. If that returning request comes via POST then it will be necessary to mark the cookies as SameSite=None; Secure.
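As a rough illustration of the cookie attributes that recipe describes, here is a minimal sketch of the options object one might hand to express-session's cookie setting. The helper name sessionCookieOptions is hypothetical and this is not the code from the PR; only the sameSite and secure values come from the text above.

```javascript
// Hypothetical helper (not from the Telescope PR): builds the `cookie` options
// for express-session so the session cookie can survive a cross-site POST callback.
function sessionCookieOptions() {
  return {
    // 'none' lets the browser attach the cookie to cross-site requests,
    // such as the POST an identity provider makes back to our callback.
    sameSite: 'none',
    // Modern browsers reject SameSite=None cookies unless they are also Secure.
    secure: true,
  };
}

// Intended use (assumes express-session is installed):
//   app.use(session({ /* other options */ cookie: sessionCookieOptions() }));
console.log(sessionCookieOptions());
```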
This took Telescope from sometimes not working and returning the 400 to ALWAYS not working and returning a 400. Luckily I am not out of ideas, but unfortunately none of them have worked since either...
Second idea
My second PR. Since the Telescope Express server in production sits behind two proxies, Traefik and nginx, I thought that using the trust proxy setting with 2 hops might be enough to let the requests keep their cookies now that they were set as secure with sameSite 'none'. This was not the case.
Example:
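const express = require('express')
const session = require('express-session')

const app = express()
// Trust the first proxy so Express honours the X-Forwarded-* headers
// (the PR described above tried 2 hops instead)
app.set('trust proxy', 1)
app.use(session({
  secret: 'keyboard cat', // placeholder secret for the example only
  resave: false,
  saveUninitialized: true,
  // secure: true means the session cookie is only set over HTTPS
  cookie: { secure: true }
}))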
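To make the link between the trust proxy setting and secure cookies more concrete, here is a deliberately simplified, dependency-free model. It is an assumption about the general behaviour, not Express or express-session source code, and both function names are hypothetical. The idea is that when a proxy terminates TLS, the app only sees plain HTTP unless it is told to trust the X-Forwarded-Proto header, and a cookie marked secure is only set when the request counts as HTTPS.

```javascript
// Simplified model, NOT the real Express implementation.
// request: { encrypted: boolean, forwardedProto?: string }
// trustProxy: false, or a hop count such as 1 or 2.
function effectiveProtocol(request, trustProxy) {
  const direct = request.encrypted ? 'https' : 'http';
  if (!trustProxy || !request.forwardedProto) return direct;
  // Several proxies may produce a comma-separated list; the left-most
  // entry describes the connection the client actually made.
  return request.forwardedProto.split(',')[0].trim();
}

// A cookie marked `secure` is only set when the request counts as HTTPS.
function wouldSetSecureCookie(request, trustProxy) {
  return effectiveProtocol(request, trustProxy) === 'https';
}

// The proxy terminated TLS, so the app itself received plain HTTP.
const behindProxy = { encrypted: false, forwardedProto: 'https' };
console.log(wouldSetSecureCookie(behindProxy, false)); // false
console.log(wouldSetSecureCookie(behindProxy, 2));     // true
```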
My final ideas
My final ideas before resetting prod, which I have not yet tested, are to change the CORS origin to be explicit instead of the wildcard *, because I have read that the origin needs to be specified when the cookie is secure. I also want to try setting express.urlencoded({ extended: true }) to false in the Satellite src, because every example I have seen so far sets it to false, which uses a different query string library, and relying on the default of true has been deprecated as per the docs. The odds that this is actually the problem are really low, but it's worth a shot.
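For context on the "different query string library" point: with extended: false the body parser uses Node's built-in querystring module, while extended: true uses the qs package. A small sketch using only the built-in module shows where the two would differ. The field names are made up for illustration, and the qs behaviour is described in a comment rather than run.

```javascript
const querystring = require('querystring');

// Bracketed keys stay as literal key names with the built-in parser.
// (The qs package would instead build a nested object: { user: { name: 'ana' } }.)
const nested = querystring.parse('user[name]=ana');
console.log(nested['user[name]']); // 'ana'

// A flat form body has no nesting for either parser to interpret.
const flat = querystring.parse('token=abc123&state=xyz');
console.log(flat.token, flat.state);
```

For flat bodies like the second one, both parsers should produce the same keys and values, which fits the guess that this setting is unlikely to be the culprit.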
The problem now is that the explicit origin differs depending on the environment Telescope runs in, so I have to find a way to set it dynamically based on where and how it's being run.
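One way this could be approached, sketched under assumptions: read the allowed origin from an environment variable and fall back to a local default. The variable name ALLOWED_ORIGIN, the fallback URL, and the helper name are all hypothetical, not existing Telescope settings; origin and credentials are options of the cors middleware.

```javascript
// Hypothetical helper: derives options for the `cors` middleware from the
// environment, so each deployment can name its own explicit origin.
function corsOptionsFromEnv(env = process.env) {
  // ALLOWED_ORIGIN is an assumed variable name, not an existing Telescope setting.
  const origin = env.ALLOWED_ORIGIN || 'http://localhost:8000';
  if (origin === '*') {
    // Browsers refuse credentialed requests when the allowed origin is a wildcard.
    throw new Error('ALLOWED_ORIGIN must be an explicit origin, not "*"');
  }
  return { origin, credentials: true };
}

// Intended use (assumes the cors package is installed):
//   app.use(cors(corsOptionsFromEnv()));
console.log(corsOptionsFromEnv({ ALLOWED_ORIGIN: 'https://example.com' }));
```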
Hopefully in any case I will be able to finish my testing and have it running at least as well as it was before I messed with it.