I have been looking into OAuth2 lately and I think I understand the authorization process.
However, what I don't seem to understand is this: once authorization has taken place and an access_token and a refresh_token have been issued to make calls, how does the server decide, based on the access_token, whether the request can or cannot access a specific resource?
I.e. a token is sent to the server to request a photo. How does the logic on the server determine, based on the given token, whether access to that particular photo is allowed or denied?
The access_token is usually an opaque artifact. There's nothing intrinsic that associates it with a resource (e.g. a specific photo). When the authorization flow starts, you typically request a specific scope that defines the access you need. If the owner of the resource consents to this access, then the request succeeds. Users can revoke access too.
All this is app specific code. Each app defines what their scopes are and how they enforce the check.
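As a rough illustration of what such app-specific code could look like, here is a minimal sketch. Everything in it is an assumption made for the example: the in-memory tables, the `photos:read` scope name, and the function name are hypothetical, and a real server would typically back this with a database or a token introspection call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """What the server recorded when the resource owner consented."""
    user_id: str
    scopes: frozenset


# Hypothetical server-side tables; the opaque token is only a lookup key.
GRANTS = {"tok_abc123": Grant(user_id="alice", scopes=frozenset({"photos:read"}))}
PHOTO_OWNERS = {"photo_1": "alice", "photo_2": "bob"}


def can_read_photo(access_token: str, photo_id: str) -> bool:
    grant = GRANTS.get(access_token)
    if grant is None:
        return False  # unknown, expired, or revoked token
    if "photos:read" not in grant.scopes:
        return False  # the owner never consented to this kind of access
    # The scope allows reading photos in general; the ownership check
    # decides whether this particular photo is covered.
    return PHOTO_OWNERS.get(photo_id) == grant.user_id
```

Revoking access in this sketch would simply mean deleting the entry from `GRANTS`.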
You might want to explore Authorization Server as an example.
The access token is actually an encrypted object; this object defines the scopes and may re-establish the authorization.
Imagine the service provider giving you an encrypted (or HMAC-protected) token which makes no sense to you, but which the endpoint knows how to decrypt. On decryption, it would have info like:
{"scope":"Photos", "userID":"3refefe"}
So, basically the module handing over the token to you encrypts this JSON (or any other format) object and gives you the encrypted token. When you hit the API endpoint, it sends the token to the decryption logic, fetches this JSON object, and hence knows everything you are authorized to do.
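Here is a minimal sketch of that round trip, with one caveat: an HMAC signs data rather than encrypting it, so in this sketch the payload is only base64-encoded and tamper-proof, not secret. The key, function names, and token layout are all assumptions made for illustration, not any provider's actual format.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key, known only to the service provider.
SECRET_KEY = b"demo-secret-known-only-to-the-provider"


def issue_token(payload: dict) -> str:
    """Encode the payload and append an HMAC-SHA256 signature."""
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + signature


def read_token(token: str):
    """Return the payload if the signature checks out, otherwise None."""
    body, separator, signature = token.rpartition(".")
    if not separator:
        return None
    expected = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison, done before the body is decoded at all.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(body))


token = issue_token({"scope": "Photos", "userID": "3refefe"})
claims = read_token(token)
```

If the payload must also be hidden from the client, the provider would need real encryption on top of (or instead of) the signature.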
This object can contain any type of info and in any format depending upon the service provider. I have described how an OAuth provider works here.
This should explain the basics of a minimalist OAuth framework.
I am struggling with the concept of access tokens in regards to them not having the identity of the person making the request.
And I think I may be using access tokens incorrectly in my project.
The thing I don't understand is access tokens are often explained as a hotel card key to get into a room, but this analogy just doesn't make sense to me when it comes to the vast majority of requests.
For example, say I have a resource for your private messages and a scope read:privateMessages.
If you want to do some action like read private messages, and you send me an access token I have to know who it is, as in I can't send you back someone else's private messages or all private messages.
A hotel key doesn't work in this case. With a hotel card that has the scopes read:gymroom write:gymroom, then yeah, you can access the room, but in any application I have worked on I really need to know who is making the request.
To go back to private messages if you post to private messages, maybe this takes a Message and Destination, I need to know who is sending the private message and I need that from the access token...
The analogy here also breaks a bit as the sub of the access token (if access token is jwt) is usually the user id and I look that up or call the user info endpoint.
But isn't it simpler to just put the user principal name or email in the access token and save the backend from doing the lookup request on each user request?
There is no security benefit that I can see from having the email / username removed from the token, if the same token can be used to call the user info endpoint.
Can I get some help with understanding this concept better as I feel maybe I am doing things incorrectly or have some kind of large misunderstanding on this concept.
I do know there are questions on this on stack overflow but I haven't found anything specifically to clear this up.
The analogy here also breaks a bit as the sub of the access token (if
access token is jwt) is usually the user id and I look that up or call
the user info endpoint.
But isn't it simpler to just put the user principal name or email in
the access token and save the backend from doing the lookup request on
each user request?
Yes, you can. Speaking specifically about JWT access tokens, the primary intent behind using them is stateless authentication, which means the token should contain sufficient information for the resource server to authenticate the request without a datastore lookup. Plain OAuth access tokens are often merely random keys, and the resource server does need the authorization server to verify them. However, you can extend an OAuth authorization server with JWT support so that it issues JWT tokens.
The hotel card key is a good analogy for the access token because it deals with delegation. Whoever presents the hotel card key can get into the room. If needed there can be identity information of the original user (Resource Owner) in the access token, but in that case it does not represent the "presenter's" identity, merely the "owner's" identity.
When you actually want to know who is entering the room (or, presenting the token), you'll need to revert to a different token and protocol such as the Identity Token in OpenID Connect.
I think the hotel analogy helps to understand the concept of Bearer tokens. Anyone with a valid hotel card key can enter the room. This also means that if an attacker manages to steal a Bearer token (the hotel card key), the attacker can gain access to the API (the hotel room), as long as the token is valid.
In general, the API should use the claims from a JWT for authorization, then you can skip the call to the UserEndpoint. Consequently, this means that the JWT should contain all the claims that the API needs for proper authorization. However, be careful when designing the token. If the access tokens are JWTs then a malicious client (e.g. from an attacker) may still try to parse the JWT and access its claims even though the client is not intended to. Or an attacker may come across the token by other means and parse it. In such cases, the application leaks data to an unauthorized party (an attacker). Data leaks can imply a violation of data protection laws or other regulations and cause trouble depending on what data the token contained. Username and email may not be a big issue regarding compliance but they can still cause security issues. For example, username and email may both be used by an attacker for personalized attacks (think of phishing emails).
Note that it doesn't matter if the token is valid or not - the data is still there.
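To make the leak concrete, here is a small sketch that builds an HS256-style JWT by hand (standard library only) and then reads its claims without ever touching the signing key. The claim values, the key, and the helper names are made up for the example; a real API would use a maintained JWT library rather than this hand-rolled encoding.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_jwt(claims: dict, key: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"


def peek_claims(token: str) -> dict:
    """Read the claims WITHOUT verifying anything: no key is needed."""
    return json.loads(b64url_decode(token.split(".")[1]))


token = make_jwt(
    {"sub": "user-42", "email": "alice@example.com", "scope": "read:privateMessages"},
    b"server-only-key",
)
leaked = peek_claims(token)  # works for anyone holding the token
```

The signature only prevents tampering; it does nothing to hide `email` from whoever holds the token.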
At Curity, we often recommend using the Phantom Token Pattern. It places an API Gateway in front of the API. The Authorization Server only issues by-reference tokens to the client (i.e. some id). When the client calls the API via the API Gateway, the latter exchanges the by-reference token for a by-value token, i.e. a JWT. In this way, no data will be leaked to the client and the API can still benefit from the JWT and its claims.
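A toy sketch of the idea follows. It is not Curity's implementation: the in-memory store, the function names, and the use of a plain claims dictionary in place of a real signed JWT are all simplifying assumptions.

```python
import secrets

# Hypothetical authorization-server store: reference token -> claims.
_TOKEN_STORE = {}


def issue_reference_token(claims: dict) -> str:
    """The client only ever receives this random, meaningless string."""
    reference = secrets.token_urlsafe(32)
    _TOKEN_STORE[reference] = claims
    return reference


def introspect(reference: str):
    """Stand-in for the gateway's call to the authorization server."""
    return _TOKEN_STORE.get(reference)


def gateway_forward(reference_token: str, api_handler):
    """Swap the by-reference token for the by-value form, then call the API."""
    claims = introspect(reference_token)
    if claims is None:
        return 401, "invalid token"
    # In the real pattern a signed JWT would be forwarded to the API here.
    return 200, api_handler(claims)


def messages_api(claims: dict) -> str:
    # The API authorizes from the claims without any further lookup.
    return f"messages for {claims['sub']}"


ref = issue_reference_token({"sub": "user-42", "scope": "read:privateMessages"})
response = gateway_forward(ref, messages_api)
```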
I have a rest API service that enables API calls for some resources.
The API calls can be done from an API server using a secret, or from a web client using a bearer token.
In order to get to the web client flow, a URL must be generated by the API server call:
https://www.someapi.com/somelink
The response will be a link with a token I call "action token". This token is passed with the link, for example:
https://www.somepage.com/?action_token=somejwttoken
By entering the link, the action token will be exchanged with an access token, that enables a client to access some pre-defined resources.
I wonder if there is a best practice for such flow.
ONE TIME TOKENS
If using tokens in URLs you should also ensure that:
The token has a one time use
The token is short lived
The token is not guessable - eg make it a UUID
The final access token returned has limited privileges and also a short lifetime
The one time action token may then be available in the browser history or server logs, so be aware of that. Make the actual swap for the proper token a JSON to JSON POST operation, after the page has loaded.
Typically each action token issued would also be stored by the API in a database along with its expiry time, then used to verify links later.
This type of URL solution is sometimes used in operations such as verify email or to enable magic links, where users can quickly access limited data.
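The rules above could be sketched roughly as follows. The dictionary stands in for the database table mentioned earlier, and the function names and the five-minute default lifetime are assumptions for illustration only.

```python
import time
import uuid

# Stand-in for a database table: action token -> expiry timestamp.
_ACTION_TOKENS = {}


def issue_action_token(ttl_seconds: float = 300, now=None) -> str:
    now = time.time() if now is None else now
    token = str(uuid.uuid4())  # random, so not guessable
    _ACTION_TOKENS[token] = now + ttl_seconds  # short lived
    return token


def redeem_action_token(token: str, now=None) -> bool:
    now = time.time() if now is None else now
    # pop() removes the entry, so a second attempt always fails (one time use).
    expiry = _ACTION_TOKENS.pop(token, None)
    return expiry is not None and now < expiry
```

Only when `redeem_action_token` returns True would the API go on to issue the limited, short-lived access token.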
OAUTH FLOWS
One time tokens should not be overused and they are of course no substitute for signing the user in properly via OpenID Connect standards, then running an authenticated user session.
SCOPES AND API TO API FLOWS
A more secure option may be to forward JWTs from the original client to downstream APIs, as described in Scopes Best Practices. This enables user context to flow securely and also simplifies code. This may eliminate or reduce the need for any token exchange operations.
I am missing some understanding of the OAuth2 access_token and hope someone can explain or guide me to what I am missing.
I am using Microsoft Azure AD as an authentication provider for my application. After successful authentication, I take the returned id_token and extend it with some additional data custom to my application (to facilitate authorization).
I am doing this through JWT.sign: I decode the data from the id_token, add my data, and then sign it using a secret key saved on the server.
My question is, can I do the same for the access_token? When I tried to do so, I got unauthorized.
Am I doing something wrong? Or is this not possible? And why is this happening? I don't see any request made to MS to validate my newly signed access_token.
You should never change tokens issued - this is not a correct thing to do. But your point about using domain specific claims is totally valid - all real world systems need these for their authorization.
OPTION 1
Some specialist providers can reach out at time of token issuance and contact your APIs, to get domain specific data to include in tokens. See this Curity article for how that works. I don't think Azure AD supports this though.
PRIVACY
It is best to avoid revealing sensitive data in readable tokens returned to internet clients. If you include name, email etc in ID tokens or access tokens this may be flagged up in PEN tests, since it is Personally Identifiable Information and revealing it can conflict with regulations such as GDPR.
Curity recommends protecting access tokens by issuing them in an opaque reference token format - via the phantom token pattern.
OPTION 2
An option that would work for Azure AD is to adopt the following approaches:
Look up extra domain specific claims in your API when an access token is first received, then cache results for further API requests with the same access token. See this Azure AD Code Sample class of mine for some code that builds a custom ClaimsPrincipal. Note that the API continues to validate the JWT on every request.
If the UI needs extra domain specific claims then serve them from your API, which can return both OAuth User Info and domain specific data from its ClaimsPrincipal to the UI. See this API controller class for how that looks. Personally I always do this and never read ID tokens in UIs - which should also never read access tokens.
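The first approach could look something like the sketch below. It is not the linked code sample: the cache, the fake lookup function, and the claim names are hypothetical, and it assumes the JWT has already been validated before this code runs.

```python
import hashlib

# Hypothetical in-memory cache: hash of access token -> domain specific claims.
_CLAIMS_CACHE = {}
lookups = []  # records each (simulated) database hit, for demonstration


def load_domain_claims(user_id: str) -> dict:
    """Stand-in for a query against the API's own user data."""
    lookups.append(user_id)
    return {"role": "editor", "tenant": "acme"}


def claims_for_request(access_token: str, token_claims: dict) -> dict:
    """Merge validated token claims with cached domain specific claims."""
    # Key the cache by a hash so raw tokens are not kept in memory as keys.
    cache_key = hashlib.sha256(access_token.encode()).hexdigest()
    if cache_key not in _CLAIMS_CACHE:
        _CLAIMS_CACHE[cache_key] = load_domain_claims(token_claims["sub"])
    return {**token_claims, **_CLAIMS_CACHE[cache_key]}


first = claims_for_request("tok-1", {"sub": "u1"})
second = claims_for_request("tok-1", {"sub": "u1"})  # served from the cache
```

A production cache would also evict entries no later than the token's expiry time.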
Applications interacting with Azure AD receive ID tokens after authenticating the users. The applications use access tokens and refresh tokens while interacting with APIs.
The id_token is a JSON Web Token (JWT) which has user profile
attributes in the form of claims. The ID Token is consumed by the
application and used to get user information like the user's name,
email.
The Access Token on the other hand is a credential that can be
used by an application to access an API.
So if you need your application to access an API, the access token is what is used, and you may follow the steps suggested by Tiny Wang.
Similar to id tokens, access tokens are also signed, but they are not
encrypted. As per the IETF OAuth (RFC 6749) standard specification,
access tokens can have different formats and structures for each
service, whereas the id token should be in JWT format.
To validate an id_token or an access_token, your app has to validate
both the token's signature and the claims. To validate access tokens,
your app should also validate the issuer, the audience, and the
signing tokens.
So in a production application, you should get the id token by specifying
“id_token+code” or “id_token+token” as response_type to verify
whether authentication succeeded. It means it uses
the id_token for authentication and “code” to exchange for an access_token
to access the resource for authorization.
In short id_token is used to identify the authenticated user, and the
access token is used to prove access rights to protected resources.
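As a rough sketch of the claim checks mentioned above (issuer, audience, expiry), assuming the signature has already been verified by a proper JWT library. The function name, the error strings, and the example issuer and audience values are made up for illustration.

```python
import time


def validate_claims(claims: dict, expected_issuer: str, expected_audience: str, now=None) -> list:
    """Return a list of problems; an empty list means the claims look acceptable."""
    now = time.time() if now is None else now
    errors = []
    if claims.get("iss") != expected_issuer:
        errors.append("issuer mismatch")
    # The "aud" claim may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_audience not in audiences:
        errors.append("audience mismatch")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        errors.append("token expired or missing exp")
    return errors


example = {"iss": "https://issuer.example", "aud": "api://my-api", "exp": 2000}
problems = validate_claims(example, "https://issuer.example", "api://my-api", now=1000)
```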
Refer to this for information regarding access tokens and id tokens.
Imagine you're going through a standard OAuth2 process to retrieve an access_token for some third-party API. This is the usual process.
User is redirected to http://some-service.com/login
User successfully logs in and is redirected to some destination http://some-destination.com. In this step, there's usually a code parameter that comes with the request. So the actual URL looks like http://some-destination.com?code=CODE123
I need to use CODE123 to request an access_token that can be used to authorize my future API calls. To do this, there's a token endpoint (I am using Nylas as an example but it should be generic enough).
It requires me to POST the code from above (CODE123) along with client_id and client_secret like this: http://some-service.com/oauth/token?code=CODE123&client_secret=SECRET&client_id=ID. As a response, I get an access_token that looks like TOKEN123 and I can use this to make API calls.
QUESTION
Until step 2, everything happens in the client side. But in step 3, I need to have client_id and client_secret. I don't think it's a good idea to store these values in the client side. Does that mean I need to have a backend server that has these two values, and my backend should convert CODE123 to TOKEN123 and hand it to the client side?
As you probably know, the question describes the most common (and usually, the more secure) OAuth "Authorization Code" flow. To be clear, here's an approximation of the steps in this flow:
User indicates that they wish to authorize resources for our application (for example, by clicking on a button)
The application redirects the user to the third-party login page, where the user logs in and selects which resources to grant access to
The third-party service redirects the user back to our application with an authorization code
Our application uses this code, along with its client ID and secret to obtain an access token that enables the application to make requests on behalf of the user only for the resources that the user allowed
Until step 2, everything happens in the client side. But in step 3, I need to have client_id and client_secret. I don't think it's a good idea to store these values in the client side. Does that mean I need to have a backend server that has these two values[?]
You're correct, it's certainly not a good idea to store these values in the client-side application. These values, especially the client secret, must be placed on a server to protect the application's data. The user (and therefore the client application) should never have access to these values.
The server uses its client ID and secret, along with the authorization code, to request an access token that it uses for API calls. The server may store the token it receives, along with an optional refresh token that it can use in the future to obtain a new access token without needing the user to explicitly authorize access again.
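Here is a sketch of how the server might assemble that token request. It only builds the request and does not send it. The parameter names follow the authorization code grant in RFC 6749; the endpoint URL and the `redirect_uri` value are placeholders, and sending the credentials in the POST body (rather than the query string shown in the question) is a choice made for this example.

```python
from urllib.parse import parse_qs, urlencode

# Placeholder endpoint; the real URL comes from the third-party service.
TOKEN_ENDPOINT = "https://some-service.com/oauth/token"


def build_token_request(code: str, client_id: str, client_secret: str, redirect_uri: str):
    """Return (url, headers, body) for the code-for-token exchange."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,  # lives on the server only
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return TOKEN_ENDPOINT, headers, body


url, headers, body = build_token_request(
    "CODE123", "ID", "SECRET", "https://some-destination.com"
)
fields = parse_qs(body)  # decoded again here only to inspect the result
```

Keeping the secret in the body means it does not end up in URLs, which tend to be written to access logs.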
...and my backend should convert CODE123 to TOKEN123 and hand it to the client side?
At the very least, our server should handle the authorization flow to request an access token, and then pass only that token back to the client (over a secure connection).
However, at this point, the client-side application (and the user of that client) is responsible for the security of the access token. Depending on the security requirements of our application, we may want to add a layer to protect this access token from the client as well.
After the server-side application fetches the access token from the third-party service, if we pass the access token back to the client, malware running on the client machine, or an unauthorized person, could potentially obtain the access token from the client, which an attacker could then use to retrieve or manipulate the user's third-party resources through privileges granted to our application. For many OAuth services, an access token is not associated with a client. This means that anyone with a valid token can use the token to interact with the service, and illustrates why our application should only request the minimum scope of access needed when asking for authorization from the user.
To make API calls on behalf of a user more securely, the client-side application could send requests to our server, which, in turn, uses the access token that it obtained to interact with the third-party API. With this setup, the client does not need to know the value of the access token.
To improve performance, we likely want to cache the access token on the server for subsequent API calls for the duration of its lifetime. We may also want to encrypt the tokens if we store them in the application's database—just like we would passwords—so the tokens cannot be easily used in the event of a data breach.
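A minimal sketch of that proxy arrangement: the browser holds only a session id, and the server attaches the real token when it calls the third-party API. The in-memory store and the function names are assumptions, and the encryption at rest mentioned above is left out for brevity.

```python
import secrets

# Server-side only: session id -> third-party access token.
_SESSION_TOKENS = {}


def start_session(access_token: str) -> str:
    """Store the token and hand the client an unrelated random session id."""
    session_id = secrets.token_urlsafe(32)
    _SESSION_TOKENS[session_id] = access_token
    return session_id


def build_upstream_headers(session_id: str):
    """Headers the server would use when calling the third-party API."""
    token = _SESSION_TOKENS.get(session_id)
    if token is None:
        return None  # unknown session: reject the client's request
    return {"Authorization": f"Bearer {token}"}


sid = start_session("TOKEN123")
upstream_headers = build_upstream_headers(sid)
```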
I was watching this video, which explains that in OAuth2 the client application first has to get the authorization grant from the authorization server and then use that grant to get a token before being able to access the resource server. What purpose does the grant serve? Why not give the client the token right away after the user signs on with his/her username and password?
Because it is more secure, for some application types.
What you describe is the so-called authorization code flow. It is normally used for "classical" web applications, where only the backend needs to access the resource server. The exchange of the authorization code for an access token happens on the backend, and the access token never leaves it. The exchange can be done only once, and in addition the client id and secret (stored on the backend) are necessary.
Single-page applications often use the implicit flow, where the access token is delivered to the frontend directly in the URL.
See more here:
IdentityServer Flows
EDIT: Q: "I still don't see how it is more secure given that you have to have the grant in order to get the token. Why need 2 things instead of just 1 thing to access the resource? If someone steals the token, they can access the resource anyway – stackjlei"
"Stealing" access token will work independent on how your application acquires it. However, stealing access token on the backend is much more difficult than on the frontend.
The authorization code is also delivered to the backend via the frontend, but the risk that someone intercepts and uses it is tiny:
It can be exchanged only once.
You need the client-id and client-secret in order to exchange it. The client-secret is only available on the backend.
Normally, the authorization code will be exchanged by your backend for an access token immediately. So its lifetime is just several seconds. It does not matter if someone gets hold of a used authorization code afterwards.
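Those three properties can be sketched from the authorization server's point of view like this. The client table, the 60 second lifetime, and the function names are illustrative assumptions, not any particular server's behavior.

```python
import hmac
import secrets
import time

CLIENTS = {"ID": "SECRET"}  # hypothetical registered client and its secret
_CODES = {}  # authorization code -> (client_id, expiry timestamp)


def issue_code(client_id: str, ttl: float = 60, now=None) -> str:
    now = time.time() if now is None else now
    code = secrets.token_urlsafe(16)
    _CODES[code] = (client_id, now + ttl)
    return code


def exchange_code(code: str, client_id: str, client_secret: str, now=None):
    """Return a new access token, or None if the exchange is refused."""
    now = time.time() if now is None else now
    expected = CLIENTS.get(client_id)
    if expected is None or not hmac.compare_digest(expected.encode(), client_secret.encode()):
        return None  # an intercepted code is useless without the secret
    entry = _CODES.pop(code, None)  # pop: the code can be exchanged only once
    if entry is None:
        return None
    issued_to, expiry = entry
    if issued_to != client_id or now >= expiry:
        return None
    return secrets.token_urlsafe(32)
```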
In your scenario there could be two servers, an Authorization and a Resource one.
It could be only one as well, but let's imagine this scenario.
The purpose of the Authorization Server is to issue short-lived access tokens to known clients. The clients identify themselves via their ClientID and ClientSecret.
The Authorization Server (AS) holds the list of clients and their secrets and first checks to make sure the passed values match its list. If they do, it issues a short-lived token.
Then the client can talk to the Resource Server (RS) while the token is valid. Once the token expires, a new one can be requested or the expired one can be refreshed if that is allowed by the Authorization Server.
The whole point here is security. Normally, the access tokens are passed in the Authorization header of the request, and that request needs to be made over HTTPS to make sure that the data can't be stolen. If, somehow, someone gets hold of an access token, they can only use it until it expires, which is why the short life of the tokens is actually very important. That's why you don't issue one token which never expires.
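Here is a toy version of the two roles, assuming both servers share one in-memory token table (in practice the RS would validate a signed token or ask the AS). The client credentials, the five-minute lifetime, and the function names are made up for the example.

```python
import secrets
import time

CLIENTS = {"my-client": "my-secret"}  # hypothetical list held by the AS
_TOKENS = {}  # access token -> expiry timestamp


def issue_access_token(client_id: str, client_secret: str, lifetime: float = 300, now=None):
    """AS side: only known clients get a token, and it is short lived."""
    now = time.time() if now is None else now
    if CLIENTS.get(client_id) != client_secret:
        return None
    token = secrets.token_urlsafe(32)
    _TOKENS[token] = now + lifetime
    return token


def resource_server_allows(authorization_header: str, now=None) -> bool:
    """RS side: accept 'Authorization: Bearer <token>' only until expiry."""
    now = time.time() if now is None else now
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    expiry = _TOKENS.get(token)
    return expiry is not None and now < expiry
```

A stolen token in this sketch stops working as soon as `now` passes its expiry, which is the point made above.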
There are different types of OAuth flows. One type doesn't require using the 'grant' authorization. It depends on who the user/application, the resource owner, and the API server are.
This way, you, as a user, don't send your password to the application. The application will only use the grant token to gain access to your resources.
I think this tutorial is a pretty good resource if you want more details:
https://www.digitalocean.com/community/tutorials/an-introduction-to-oauth-2