Why doesn't Access Token have the identity of user? - oauth-2.0

I am struggling with the concept of access tokens, specifically with the fact that they do not carry the identity of the person making the request.
And I think I may be using access tokens incorrectly in my project.
The thing I don't understand is access tokens are often explained as a hotel card key to get into a room, but this analogy just doesn't make sense to me when it comes to the vast majority of requests.
For example, say I have a resource for your private messages and a scope read:privateMessages.
If you want to perform some action like reading private messages, and you send me an access token, I have to know who you are; I can't send you back someone else's private messages, or all private messages.
A hotel key doesn't work here. Take a hotel card that has the scopes read:gymroom write:gymroom.
Then yes, you can access the room, but in every application I have worked on I really need to know who is making the request.
To go back to private messages if you post to private messages, maybe this takes a Message and Destination, I need to know who is sending the private message and I need that from the access token...
The analogy here also breaks a bit as the sub of the access token (if access token is jwt) is usually the user id and I look that up or call the user info endpoint.
But isn't it simpler to just put the user principal name or email in the access token and save the backend from doing the lookup request on each user request?
There is no security benefit that I can see in keeping the email / username out of the token if the same token can be used to call the user info endpoint.
Can I get some help with understanding this concept better as I feel maybe I am doing things incorrectly or have some kind of large misunderstanding on this concept.
I do know there are questions on this on stack overflow but I haven't found anything specifically to clear this up.

The analogy here also breaks a bit as the sub of the access token (if
access token is jwt) is usually the user id and I look that up or call
the user info endpoint.
But isn't it simpler to just put the user principal name or email in
the access token and save the backend from doing the lookup request on
each user request?
Yes, you can. Speaking specifically of JWT access tokens, the primary intent behind using them is stateless authentication, which means the token should contain sufficient information for the resource server to authenticate the request without a datastore lookup. Plain OAuth access tokens are often merely random keys, and the resource server then needs the authorization server to verify them. However, you can extend the OAuth authorization server with JWT support so that it issues JWT tokens.
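To make the "no datastore lookup" point concrete, here is a minimal sketch of how a resource server could validate an HS256-signed JWT locally and read the sub and email claims straight from it. It uses only the Python standard library; the shared secret, the claim values and the function names (issue_token, verify_token) are assumptions made up for this illustration, and a real service would normally rely on a maintained JWT library and asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-shared-secret"  # hypothetical key shared by issuer and API


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict) -> str:
    """Build a compact HS256 JWT (the authorization server's side)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(SECRET, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def verify_token(token: str) -> dict:
    """Check algorithm, signature and expiry locally, then return the claims."""
    header, payload, signature = token.split(".")
    if json.loads(b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header}.{payload}".encode("ascii")
    expected = hmac.new(SECRET, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

The API would call verify_token on each request and use claims["sub"] (and claims["email"], if the issuer chose to include it) without contacting the user info endpoint.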

The hotel card key is a good analogy for the access token because it deals with delegation. Whoever presents the hotel card key can get into the room. If needed, there can be identity information of the original user (Resource Owner) in the access token, but in that case it does not represent the presenter's identity, merely the owner's identity.
When you actually want to know who is entering the room (or, presenting the token), you'll need to revert to a different token and protocol such as the Identity Token in OpenID Connect.

I think, the hotel analogy helps to understand the concept of Bearer tokens. Anyone with a valid hotel card key can enter the room. This also means that if an attacker manages to steal a Bearer token (the hotel card key), the attacker can gain access to the API (the hotel room), as long as the token is valid.
In general, the API should use the claims from a JWT for authorization, then you can skip the call to the UserEndpoint. Consequently, this means that the JWT should contain all the claims that the API needs for proper authorization. However, be careful when designing the token. If the access tokens are JWTs then a malicious client (e.g. from an attacker) may still try to parse the JWT and access its claims even though the client is not intended to. Or an attacker may come across the token by other means and parse it. In such cases, the application leaks data to an unauthorized party (an attacker). Data leaks can imply a violation of data protection laws or other regulations and cause troubles depending on what data the token contained. Username and email may not be a big issue regarding compliance but they can still cause security issues. For example, username and email may both be used by an attacker for personalized attacks (think of phishing emails).
Note, that it doesn't matter if the token is valid or not - the data is still there.
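As a small illustration of why the data "is still there", the sketch below reads the claims of a JWT without any key and without checking the signature or expiry. The sample token is built by hand for the example; its claim values and the helper names (peek_claims, _segment) are made up.

```python
import base64
import json


def peek_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying anything."""
    payload_segment = token.split(".")[1]
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _segment(obj: dict) -> str:
    # base64url-encode a JSON object the way JWT segments are encoded.
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# A long-expired token with a meaningless signature segment.
sample_token = ".".join([
    _segment({"alg": "HS256", "typ": "JWT"}),
    _segment({"sub": "user-42", "email": "alice@example.com", "exp": 0}),
    "not-a-real-signature",
])

print(peek_claims(sample_token)["email"])  # prints alice@example.com
```

The token would be rejected by any API, yet the email is readable by whoever holds the string.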
At Curity, we often recommend using the Phantom Token Pattern. It places an API Gateway in front of the API. The Authorization Server only issues by-reference tokens to the client (i.e. some id). When the client calls the API via the API Gateway, the latter exchanges the by-reference token for a by-value token, i.e. a JWT. In this way, no data will be leaked to the client and the API can still benefit from the JWT and its claims.
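The token swap in that pattern can be sketched roughly as follows. This is only an in-memory toy under stated assumptions: the dictionary stands in for the Authorization Server's token store, and issue_reference_token and gateway_forward are hypothetical names, not any product's API. A real gateway would typically call the Authorization Server's introspection endpoint instead.

```python
import secrets

# Hypothetical in-memory store standing in for the authorization server.
_issued: dict[str, str] = {}


def issue_reference_token(jwt_for_api: str) -> str:
    """Authorization server: keep the JWT, hand the client only a random id."""
    reference = secrets.token_urlsafe(32)
    _issued[reference] = jwt_for_api
    return reference


def gateway_forward(headers: dict) -> dict:
    """API gateway: swap the by-reference token for the by-value JWT."""
    scheme, _, reference = headers.get("Authorization", "").partition(" ")
    jwt_for_api = _issued.get(reference) if scheme == "Bearer" else None
    if jwt_for_api is None:
        raise PermissionError("unknown or revoked token")
    # Only the upstream API ever sees the JWT and its claims.
    return {**headers, "Authorization": f"Bearer {jwt_for_api}"}
```

The client only ever holds the random reference, so there is nothing for it (or for someone who steals the reference) to decode.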

Related

Step-up authentication with OAuth2

I'm looking for guidance and/or best practices in implementing step-up authentication. The generic scenario is as follows: user logs in to web app using some identity provider, user then goes to some specific area of web site which needs to be protected by additional MFA, for example OTP. All functionality for the website is via REST API, authenticating with JWT bearer token.
The best description of the flow I found is from Auth0 here. Basically, user acquires the access token with scope which is just enough to access APIs that do not require additional protection. When there is a need to access secure API the authorization handler on backend would check if the token has the scope indicating that the user has completed the additional MFA check, otherwise it's just HTTP 401.
Some sources, including the one from Auth0, suggest using amr claim as an indication of passed MFA check. That means that identity provider must be able to return this claim in response to access token request with acr_values parameter.
Now, the detail that is bugging me: should the frontend know in advance the list of API that might require MFA and request the elevated permissions beforehand, or should frontend treat the HTTP 401 response from backend as a signal to request elevated permissions and try again?
Should the identity provider generate an additional, relatively short-lived token to access restricted APIs? Then, if the frontend has 2 tokens, it must definitely know which token to use with which API endpoint. Or maybe the identity provider can re-issue the access token with a normal lifespan but elevated permissions? That sounds less secure than the first approach.
Finally, do I understand the whole process right or not? Is there some well documented and time-tested flow at all?
CLIENTS
Clients can be made aware of authentication strength via ID tokens. Note that a client should never read access tokens - ideally they should use an opaque / reference token format. Most commonly the acr claim from the ID token is used.
This can be a little ugly and a better option can sometimes be to ask the API, eg 'can I make a payment'? The client sends an access token and gets a JSON response tailored to what the UI needs. The API's job is to serve the client after all.
APIs
APIs receive JWT access tokens containing scopes and claims. Often this occurs after an API gateway swaps an opaque token for a JWT.
The usual technique is that some logic in the Authorization Server omits high privilege ones, eg a payment scope or claim, unless strong authentication was used.
It is worth mentioning plain old API error responses here. Requests with an insufficient scope typically return 403, though a JSON error code in an error object can be useful for giving the client a more precise reason.
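A rough sketch of such an error response is below. The insufficient_scope error code comes from the OAuth bearer token specification; the required_scope and step_up fields, and the check_scope function itself, are assumptions added for this example so that the client can tell a step-up would help.

```python
import json


def check_scope(token_scopes: set[str], required_scope: str) -> tuple[int, str]:
    """Return (HTTP status, JSON body) for a request needing required_scope."""
    if required_scope in token_scopes:
        return 200, json.dumps({"status": "ok"})
    body = {
        "error": "insufficient_scope",
        "required_scope": required_scope,  # hypothetical field
        "step_up": True,                   # hypothetical hint for the client
    }
    return 403, json.dumps(body)
```

With this shape the frontend does not need to know in advance which endpoints are protected: it can react to the 403 and the error code by starting the step-up flow.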
STEP UP
As you indicate, this involves the client running a new code flow, with different scope / claims and potentially also with an acr_values parameter. The Curity MFA Approaches article has some notes on this.
It should not be overused. The classic use case is for payments in banking scenarios. If the initial sign in / delegation used a password, the step up request might ask for a second factor, eg a PIN, but should not re-prompt for the password.
CLIENTS AND ACCESS TOKENS
It is awkward for a client to use multiple access tokens. If the original one had scope read then the stepped up one could use scope read write payment and completely replace the original one.
In this case you do not want the high privilege scope to hang around for long. This is yet another reason to use short lived access tokens (~ 15 minutes).
Interestingly also, different scopes the user has consented to can have different times to live, so that refreshed access tokens drop the payment scope.
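One way to picture per-scope lifetimes is the sketch below, where each granted scope remembers when it was granted and a refresh only carries over the scopes that are still within their own lifetime. The lifetimes, the SCOPE_TTL table and scopes_for_refresh are illustrative assumptions, not the behaviour of any particular Authorization Server.

```python
# Hypothetical per-scope lifetimes in seconds, counted from the grant time.
SCOPE_TTL = {"read": 8 * 3600, "write": 8 * 3600, "payment": 15 * 60}


def scopes_for_refresh(granted_at: dict[str, float], now: float) -> set[str]:
    """Scopes a refreshed access token may still carry at time `now`."""
    return {
        scope
        for scope, granted in granted_at.items()
        if now - granted < SCOPE_TTL[scope]
    }


# The user stepped up at t=0 and got read + payment.
grants = {"read": 0.0, "payment": 0.0}
print(scopes_for_refresh(grants, now=20 * 60))  # prints {'read'}
```

After twenty minutes the refreshed token has quietly fallen back to the low-privilege scope, and another payment would require a new step-up.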
ADVANCED EXAMPLE
Out of interest, here is an interesting but complicated article on payment workflows in Open Banking. A second OIDC redirect is used to get a payment scope after normal authentication but also to record consent in an audited manner. In a normal app this would be overkill however.

Should JWT access tokens contain PII?

JWT access tokens shouldn't contain personally identifiable information (PII) as I understand it. This is to keep them small but also if intercepted, reduce the exposure of the information contained.
The OIDC protocol asks for a user info endpoint to be implemented. It can be called using the access token and it will return a bunch of claims about the user. Effectively what the id token contains, but potentially even more information.
So even though the access token doesn't carry this PII itself, if intercepted it can certainly be used to expose all this information anyway. So the argument about PII in the access token doesn't really stand up.
Does this mean I should be fine including email in the access token, because the API might want it in addition to the sub claim?
There are several points to be addressed here:
Not all access tokens must allow access to the userinfo endpoint. First, your system must expose a userinfo endpoint. Secondly, the user must have consented to release information in the userinfo endpoint to the given client. So in case of some access tokens there will be no threat that a malicious party could access the userinfo endpoint. And sometimes the user can consent to only expose their username, so even if you gain access to userinfo you'll still not be able to read the email. (of course it depends on the implementation of the OIDC Provider)
In the majority of cases oauth access tokens are used as bearer tokens. That means that anyone who has the token can access any data which can be accessed with that token. If someone manages to steal that token they can do whatever the original client could. If it is a concern for you, you can use sender constrained tokens instead of bearer tokens (e.g. mTLS constrained tokens or implement DPoP). These tokens are tied to the client which originally requested them. An attacker would have to steal not only the access token, but also a certificate used to verify proof-of-possession. The implementation is a bit more tricky than with bearer tokens, but security is greatly improved.
I would avoid putting any PII in a JWT. A JWT can be decoded just like that, and any information kept within can be read by anyone. Let's say that someone manages to get hold of a JWT issued from your system, but it's expired. They will not be able to access the API, or userinfo, but they can still extract data from the JWT. It's much better to use opaque tokens as access tokens and exchange them in your gateway (something which is called a Phantom Token approach).
Interestingly enough, I only recently gave a talk on that very subject: using JWTs as access tokens and the Phantom Token flow :)

Why do you need authorization grant when you can just give the token out directly?

Watching this video, it details in OAuth2 that the client application first has to get the authorization grant from the Authorization server and then use that grant to get a token before being able to access the resource server. What purpose does the grant serve? Why not give the client the token right away after the user signs on with his/her username and password?
Because it is more secure, for some application types.
What you describe is the so-called authorization code flow. It is normally used for "classical" web applications, where only the backend needs to access the resource server. The exchange of the authorization code for an access token happens on the backend, and the access token never leaves it. The exchange can be done only once and, in addition, the client id and secret (stored on the backend) are necessary.
Single-Page-Applications often use implicit-flow where access token is delivered to the frontend directly in the URL.
See more here:
IdentityServer Flows
EDIT: Q: "I still don't see how it is more secure given that you have to have the grant in order to get the token. Why need 2 things instead of just 1 thing to access the resource? If someone steals the token, they can access the resource anyway – stackjlei"
"Stealing" an access token will work independently of how your application acquires it. However, stealing an access token on the backend is much more difficult than on the frontend.
Authorization code is delivered to the backend also over the frontend but the risk that someone intercepts and uses it is tiny:
It can be exchanged only once.
You need client-id and client-secret in order to exchange it. Client-secret is only available on the backend.
Normally, the authorization code will be exchanged by your backend for an access token immediately, so its lifetime is just several seconds. It does not matter if someone gets hold of a used authorization code afterwards.
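The three properties listed above (single use, client secret required, very short lifetime) can be sketched as a toy token endpoint. Everything here is an assumption for illustration: the CLIENTS table, the 60 second lifetime and the function names are made up, and a real authorization server does considerably more (redirect URI checks, PKCE and so on).

```python
import hmac
import secrets
import time

CLIENTS = {"web-app": "s3cret"}  # hypothetical client_id -> client_secret
_codes: dict[str, tuple[str, float]] = {}  # code -> (client_id, expires_at)


def issue_code(client_id: str, ttl: float = 60.0) -> str:
    """Called after the user has logged in and consented."""
    code = secrets.token_urlsafe(16)
    _codes[code] = (client_id, time.time() + ttl)
    return code


def exchange_code(code: str, client_id: str, client_secret: str) -> str:
    """Backend-to-server call that turns a code into an access token."""
    entry = _codes.pop(code, None)  # pop makes the code single-use
    if entry is None:
        raise PermissionError("unknown or already used code")
    issued_to, expires_at = entry
    secret_ok = hmac.compare_digest(CLIENTS.get(client_id, ""), client_secret)
    if issued_to != client_id or not secret_ok or time.time() > expires_at:
        raise PermissionError("code exchange rejected")
    return secrets.token_urlsafe(32)  # the access token
```

A code intercepted in the browser is useless on its own: without the client secret the exchange is rejected, and once the backend has used the code it no longer exists.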
In your scenario there could be two servers, an Authorization and a Resource one.
It could be only one as well, but let's imagine this scenario.
The purpose of the Authorization Server is to issue short-lived access tokens to known clients. The clients identify themselves via their ClientID and ClientSecret.
The Authorization Server ( AS ) holds the list of clients and their secrets and first checks to make sure the passed values match its list. If they do, it issues a short lived token.
Then the client can talk to the Resource Server ( RS ), while the token is valid. Once the token expires, a new one can be requested or the expired one can be refreshed if that is allowed by the Authorization Server.
The whole point here is security. Normally, the access tokens are passed in the Authorization header of the request, and that request needs to be sent over https to make sure that the data can't be stolen. If, somehow, someone gets hold of an access token, they can only use it until it expires, which is why the short life of the tokens is actually very important. That's why you don't issue one token which never expires.
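A minimal sketch of the expiry idea, assuming a 15 minute lifetime and an in-memory table shared by the AS and the RS (both assumptions for the example, as are the function names). The current time is passed in explicitly so the behaviour is easy to follow; a real server would use time.time().

```python
import secrets

TOKEN_LIFETIME = 15 * 60  # hypothetical lifetime in seconds
_tokens: dict[str, float] = {}  # access token -> expiry timestamp


def issue_access_token(now: float) -> str:
    """Authorization Server: issue a token that expires after TOKEN_LIFETIME."""
    token = secrets.token_urlsafe(32)
    _tokens[token] = now + TOKEN_LIFETIME
    return token


def is_valid(token: str, now: float) -> bool:
    """Resource Server: accept the token only while it has not expired."""
    expires_at = _tokens.get(token)
    return expires_at is not None and now < expires_at
```

A stolen token stops working as soon as now passes the stored expiry, which bounds the damage to at most one token lifetime.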
There are different types of OAuth flows. One type doesn't require the authorization 'grant'. It depends on who the user/application, the resource owner and the API server are.
This way, you, as a user, don't send your password to the application. The application will only use the grant token to gain access to your resources.
I think this tutorial is a pretty good read if you want more details:
https://www.digitalocean.com/community/tutorials/an-introduction-to-oauth-2

How does an OAuth2 (Bearer) token translate to an ACL

I have been looking into OAuth2 lately and I think I understand the authorization process.
However, what I don't seem to understand is, once authorization has taken place and an access_token and a refresh_token have been established to make calls, how is the decision made based on the access_token if the request can or cannot access a specific resource?
I.e. a token is sent to the server to request a photo. How does the logic on the server determine, based on the given token, that access to that particular photo is allowed or denied?
The access_token is usually an opaque artifact. There's nothing intrinsic that associates it with a resource (e.g. a specific photo). When the authorization flow starts, you typically request a specific scope that defines the access you need. If the owner of the resource consents to this access, then the request succeeds. Users can revoke access too.
All this is app specific code. Each app defines what their scopes are and how they enforce the check.
You might want to explore Authorization Server as an example.
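As a hedged sketch of what such app-specific code might look like for the photo example: the server keeps its own record of what each opaque token was granted, and combines that with its own knowledge of who owns each photo. The tables, the photos:read scope name and can_read_photo are all invented for this illustration.

```python
# Hypothetical server-side records; the opaque token itself carries no meaning.
TOKENS = {"tok-abc": {"user_id": "alice", "scopes": {"photos:read"}}}
PHOTOS = {"photo-1": {"owner": "alice"}, "photo-2": {"owner": "bob"}}


def can_read_photo(access_token: str, photo_id: str) -> bool:
    """Allow access only if the token has the scope AND the photo belongs
    to the user who authorized the token."""
    grant = TOKENS.get(access_token)
    photo = PHOTOS.get(photo_id)
    if grant is None or photo is None:
        return False
    return "photos:read" in grant["scopes"] and photo["owner"] == grant["user_id"]


print(can_read_photo("tok-abc", "photo-1"))  # prints True
print(can_read_photo("tok-abc", "photo-2"))  # prints False
```

The scope answers "may this client read photos at all?", while the ownership check answers "which photos?".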
The access token can also be an encrypted object; this object defines the scopes and may re-establish the authorization.
Imagine the service provider giving you an encrypted token which makes no sense to you, but which the endpoint knows how to decrypt. (Note that HMAC by itself only signs data and does not hide it, so a token that must be unreadable to the client needs real encryption.) On decryption, it would have info like:
{"scope":"Photos", "userID":"3refefe"}
So, basically, the module handing over the token to you encrypts this JSON (or any other format) object and gives you the encrypted token. When you hit the API endpoint, it sends the token to the decryption logic, fetches this JSON object, and hence knows everything you are authorized to do.
This object can contain any type of info and in any format depending upon the service provider. I have described how an OAuth provider works here.
This should explain the basics of a minimalist OAuth framework.

Understanding the use of the user ID in a 3-legged OAuth session?

After a real brain bending session today I feel like I understand 3-legged OAuth authentication fairly well. What I'm still having trouble understanding is the use of the User ID. The examples I have seen so far all seem to just arbitrarily assign a user ID at the top of the sample script and go. That confuses me.
Most of the sample code I have seen seems to center around the concept of using a user ID and the OAuth server's consumer key for managing an OAuth "session" (in quotes because I'm not trying to conflate the term with a browser "session"). For example, the database sample code I've seen stores and retrieves the tokens and other information involved based on the user ID and consumer key field values.
I am now in that state of uncertainty where a few fragments of understanding are competing and conflicting:
1) If my understanding of the OAuth session details record or "OAuth store" lookups is correct, via the consumer key and user ID fields, then doesn't that mandate that I have a disparate user ID for each user using my application that connects with an OAuth server?
2) If #1 is correct, then how do I avoid having to create my own user accounts for different users, something I am trying to avoid? I am trying to write software that acts as a front end for an OAuth enabled service, so I don't need to have my own user records and the concomitant maintenance headaches. Instead I'll just let the OAuth server handle that end of the puzzle. However, it seems to follow that the downside of my approach would be that I'd have to reauthorize the user every session, since without my own persistent user account/ID I could not look up a previously granted "good until revoked" access token, correct?
3) What bothers me is that I have read about some OAuth servers not permitting the passing of a dynamically specified callback URL during the requesting of the unauthorized token, making the passing of a consumer key and a user ID back to yourself impossible. Instead you specify the callback URL when you register as a developer/consumer and that's that. Fortunately the OAuth server I'm dealing with does allow that feature, but still, if I was dealing with one that wasn't, wouldn't that throw a giant monkey wrench into the whole idea of using the consumer key and user id pair to index the OAuth session details?
This is an answer to the question by Lode:
Is it correct that not only the provider needs to have user ids (that sounds logical) but also the client? So the client (using OAuth as a login system) needs to create a user (with an ID) before successfully authenticating them via the OAuth server. Making you have a lot of empty user accounts when authentication fails or access is not granted.
It's possible to use OAuth for authentication of users without having local accounts at the consumer application, but you've got to have some kind of session mechanism (cookies/get params) in order to have some internal session representation in which you would store the oauth_token.
For example, if someone lands on your web application, and your application is a consumer of some OAuth provider, you will have to create a local session at your site for the end-user, for example with a cookie. Then you send the end-user to the OAuth provider for authorization of a token, so that your application can get protected resources from the provider. At this point you know nothing about the user and you don't care about his identity. You just want to use some protected information from the provider.
When the user comes back from the provider after successful authorization and brings back the oauth_token, you now have to store this token in the session that you previously created for the user. As long as you keep your session (and the token, if it's needed for further requests for resources), you can consider the end-user logged in. The moment you delete his session or the token expires, you can consider him no longer logged in. This way you don't have to make your own users DB table or storage mechanism.
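The session-only approach described above could look roughly like this. The in-memory dictionary stands in for whatever session storage backs your cookie, and all function names are assumptions for the sketch; note that no user table appears anywhere.

```python
import secrets

_sessions: dict[str, dict] = {}  # session id (the cookie value) -> session data


def start_session() -> str:
    """First visit: create an anonymous session and return the cookie value."""
    session_id = secrets.token_urlsafe(16)
    _sessions[session_id] = {"oauth_token": None}
    return session_id


def on_oauth_callback(session_id: str, oauth_token: str) -> None:
    """User came back from the provider: remember the token in the session."""
    _sessions[session_id]["oauth_token"] = oauth_token


def is_logged_in(session_id: str) -> bool:
    session = _sessions.get(session_id)
    return session is not None and session["oauth_token"] is not None


def logout(session_id: str) -> None:
    """Deleting the session is all it takes to forget the user."""
    _sessions.pop(session_id, None)
```

The trade-off is the one mentioned in the question: once the session is gone, so is the token, and the user has to authorize again.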
However, if you need to have some persistent information about the users in your application, that will be used between user sessions (logins), you have to maintain your own users in order to know with which user to associate the information.
As for the difference between openid and oauth - from the perspective of local accounts, there is no difference. It's all the same. The difference is only that with openid you receive immediately some basic user info (email, etc.) while with oauth you receive a token and you have to make one more request to get the basic user info (email, etc.)
There is no difference however in regard to local accounts, if you're going to use OpenID or OAuth.
I will try to tell my view on the issues that you raised and hope that will clear things a little bit...
First, the idea is that the OAuth server is protecting some API or DATA, which third party applications (consumers) want to access.
If you do not have user accounts or data at your API behind the OAuth server, then why would a consumer application want to use your service - what is it going to get from you? That being said, I can't imagine a scenario, where you have an OAuth server and you don't have user accounts behind it.
If you just want to use OAuth for login of users, without providing user data through API, then it's better to use OpenID, but again you will have to have user accounts at your side.
Your point is correct that you make lookups via Consumer Key and (Your) User ID, and that is because of the protocol design.
The general flow is:
OAuth server (Provider) issues unauthorized Request Token to consumer application
Consumer sends the end-user to authorize the Request Token at the OAuth server (Provider)
After end-user authorizes the token, an access token is issued and given to the consumer (I've skipped some details and steps here, as they are not important for what I want to say, e.g. the consumer receives valid access token at the end)
At the authorization step, it's your OAuth server that creates and saves, as a pair, which local user (local to the provider) authorized which consumer (consumer key-user id pair).
After that, when the consumer application wants to access the end-user's DATA or API from the Provider, it just sends the access token, but no user details.
The OAuth server (Provider) can then check, by the token, which local USER ID authorized that token earlier, in order to return user data or API functionality for that user to the consumer.
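On the provider side, the bookkeeping described in the steps above might be sketched like this. The tables, the sample user record and the function names (record_authorization, fetch_profile) are assumptions for illustration; signature checks and request tokens are left out.

```python
import secrets

# access token -> (consumer key, local user id at the provider)
_grants: dict[str, tuple[str, str]] = {}
USER_DATA = {"user-7": {"display_name": "Ann"}}  # hypothetical provider data


def record_authorization(consumer_key: str, local_user_id: str) -> str:
    """Authorization step: remember which user authorized which consumer."""
    token = secrets.token_urlsafe(24)
    _grants[token] = (consumer_key, local_user_id)
    return token


def fetch_profile(consumer_key: str, access_token: str) -> dict:
    """API call: the consumer sends only the token; the provider finds the user."""
    grant = _grants.get(access_token)
    if grant is None or grant[0] != consumer_key:
        raise PermissionError("token not valid for this consumer")
    return USER_DATA[grant[1]]
```

The consumer never has to send, or even know, the provider's user id; the token is the lookup key.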
I don't think that you can go without local users at your side, if you are a provider.
About the callback question, I think there's no difference if you have dynamic or static (on registration) callback URL in regard to how you handle OAuth sessions with consumer keys and user id. The OAuth specification itself, does not mandate to have a callback URL at all - it's an optional parameter to have, optional to send every time, or optional to register it only once in the beginning. The OAuth providers decide which option is best for them to use, and that's why there are different implementations.
When the provider has a static defined callback URL in the database, connected with a consumer, it is considered a more secure approach, because the end-user cannot be redirected to a 'false' callback URL.
For example, if an evil man steals the consumer key of GreatApp, then he can make himself a consumer EvilApp that impersonates the original GreatApp and sends requests to the OAuth server as if it were the original. However, if the OAuth server only allows a static (predefined) callback URL, the requests of EvilApp will always end at the GreatApp callback URL, and EvilApp will not be able to get the Access Token.
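A minimal sketch of the static-callback behaviour, assuming a made-up registration table and a hypothetical build_redirect helper: whatever callback the request asks for, the provider redirects to the URL registered for that consumer key.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical registrations made when each consumer signed up.
REGISTERED_CALLBACKS = {"greatapp-key": "https://greatapp.example/oauth/callback"}


def build_redirect(
    consumer_key: str,
    oauth_token: str,
    requested_callback: Optional[str] = None,
) -> str:
    """Build the post-authorization redirect for the end-user's browser."""
    # requested_callback is deliberately ignored: only the registered URL counts.
    base = REGISTERED_CALLBACKS[consumer_key]
    return f"{base}?{urlencode({'oauth_token': oauth_token})}"


print(build_redirect("greatapp-key", "abc", "https://evil.example/cb"))
# prints https://greatapp.example/oauth/callback?oauth_token=abc
```

Even with the stolen consumer key, EvilApp never receives the authorized token, because the browser is sent to GreatApp.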

Resources