Something I can't wrap my head around.
As I understand the authorization code flow is supposed to be more secured than the implicit flow, because the tokens are not directly sent to the client from the authorization server, but rather retrieved by your backend.
So the flow is basically:
Browser gets the authorization code (as a URL parameter of sort).
Sends it to a public backend endpoint.
The backend sends the code + client secret to the authorization server, retrieves the tokens and stores them in the client's cookie/local storage for further use.
In this flow all the tutorials describe the authorization code as useless to the hacker, why is that? Can't a hacker use Postman or some other client and access your (public) API directly, make it go through step 3 and thus retrieve the tokens just the same?
What am I missing here?
The code is used exactly once. In many scenarios that an attacker might get access to the code, it's already been exchanged for an access token and therefore useless.
The authorization_code is a one-time token.
Authorization Code aka auth code is used publicly so that the client can establish a secure back channel between him and the authorization server so that he can exchange it with the access token without the use of a browser.
The auth code is public and can be intercepted via a proxy since it appears in the query of the redirect_uri and is used via the browser (which is considered insecure). The access token depends on the auth_code (public) and the client_secret (private) for the exchange. Without the client_secret an attacker can get the access token with brute-forcing this way through.
Summary: even if the attacker knows the authcode he can do anything without the client_secret given to the client at registration (or dynamically) and assumed to be secured.
I have a back-end processor, (imagine a chron job once a day generating reports), that needs to integrate with a third-party system. Their APIs only support the "Authorization code" grant type. The problem is I can't even fill out a request for a token as I don't have a redirect_uri (no website), and I definitely don't have a user of any kind. I'll just have the OAuth clientId and secret I provisioned via their developer portal, (Mashery), for my back-end report processor app.
I want to use the "Client credentials" grant type/flow since I'm just a back-end service.
Is there any way to fake this or hack it so my little back-end service can somehow work with authorization code flow?
Thanks in advance
No, there is no way to hack it. Client credentials only authenticate the client. A token issued for client credentials have no information about the user. If their API needs information about the user (you probably get information only about your user), then you need to have a token issued with Code Flow.
What you can do is to generate the OAuth token yourself. E.g. you can use oauth.tools to perform a Code Flow with their Authorization Server, or you can perform the flow from browser with a dummy redirect URI (e.g. http://localhost), the get the code returned from authorization request and perform a token request from curl.
Once you have an access and refresh token you can hard code them in your script (or read them from an env variable or file, etc). You can then call the API as long as the access token is valid, and use refresh token to get a new access token when it expires. You will not have to perform a new Code Flow for as long as the refresh token is valid.
If I authenticate with OpenID connect, I can authenticate my SPA ok.
How do I use the obtained access token now to access my own REST resources?
It's a simple question, but I don't find satisfactory answers.
A prominent answer I always find is 'use oidc when you don't have a backend'.
Now that makes me wonder if ever a webapp was created that didn't need a backend.
Oidc is almost always the answer when the question of storing a refresh token in the client pops up (like in 'use oidc, it's a better architecture and ditch the refresh token') but it doesn't really explain anything.
So when the user logs in with, say Google, he obtains an identity and an access token (to ensure that the user is who he claims he is).
So how do you use this to authenticate at your own REST service?
The only real way I see it as stateless is by sending another request at the server to the provider on every request to the REST api, to match the identity to the validity of the access token there.
If not, we fall back to the good 'ol session vs jwt discussion, which doesn't quite seem to click with the oidc because now we're duplicating authentication logic.
And the good 'ol refresh token in the browser is generally promoted as a bad idea, although you can keep access tokens in the browser session storage (according to the js oidc client library), autorefresh them with the provider and that's fine then (-.-).
I'm running again circles.
Anybody can lay this out for me and please break the loop?
Your SPA (frontend) needs to add an authorization header with access token to each API request. Frontend should implement the authorization code flow + PKCE (implicit flow is not recommended anymore) + it needs to refresh access token.
Your API (backend) needs to implement OIDC (or you can use "oidc auth" proxy in front of backend) - it just validates access token, eventually returns 401 (Unauthorized) for request with invalid/expired/... token. This token validation is stateless, because it needs only public key(s) to verify token signature + current timestamp. Public keys are usually downloaded when backends is starting from OIDC discovery URL, so they doesn't need to be redownloaded during every backend request.
BTW: refresh token in the browser is bad idea, because refresh token is equivalent of your own credentials
I am facing a custom implementation of OpenId Connect. But (there is always a but) I have some doubts:
I understand the process of obtainning an acces_token an a id_token, except the step when the OP provides an authorization_code to the client. If it is done by a redirect (using the redirect uri)
HTTP/1.1 302 Found
Location: https://client.example.org/cb?
code=SplxlOBeZQQYbYS6WxSbIA
&state=af0ifjsldkj
The end-user is able to see that authorization code? It does not expire? Imagine we catch it and we use later (some days later) Is it a security hole? Should the state be expired in the Token Endpoint?
The flow continues and we got at the client the Access_token and the id_token in the client.
How the Access_token should be used on the OP side ? It should be stored in a database? Or be self containing of the information required to validate it ?What would you recommend?
And in the client-side , both tokens should be sent in every request?
And the last doubt, if we have an Access_token the existence of an id_token is for representing authorization and authentication in separated tokens?
Extra doubts:
I know the process to obtain an access token but I have doubts of how the OP ,once generated and sent, it validates the access_token that comes with every request
How the OP knows an access token is valid? As far as I know, the OP should say that an access_token is valid/invalid. There should be some way to check it right? How it gets to know that a token represents a valid authenticated user if it is not stored in DB?
Is it a bad idea to store access_token in a cookie? Because sometimes we call to some webservices and we want to send access_token as parameter. Or there is another workaroundsolution?
How the access token should be stored in the Client , for example, in ASP.NET, in the session?
Thanks very much to all of you, I will give upvote and mark as answer as soon as you give me the explanations.
Thanks!
The end-user is able to see that authorization code?
Yes. Although, even if the authorization code can be seen, the token request requires that the client's secret be sent as well (which the browser does not see)
it does not expires? Imagine we catch it and we use later (some days later) It is a security hole? Should the state be expired in the Token Endpoint?
The spec says that the authorization code should expire. See https://www.rfc-editor.org/rfc/rfc6749#section-4.1.2.
How the Access_token should be used on the OP side ? It should be stored in a database? Or be self containing of the information required to validate it ?What would you recommend?
The access token should be stored on the OP if you want to be able to revoke the tokens. If you don't, the token will be in JWT format (self-contained)...but you should store it if you want to be able to revoke it whether it's a JWT or not.
And in the client-side , both tokens should be sent in every request?
No, just the access token.
And the last doubt, if we have an Access_token the existance of an id_token is for representing authorization and authentication in separeted tokens?
Yes, they are separate tokens for different purposes. Access token is for authorization and Id token is self contained and used to communicate to the client that the user is authenticated.
How the OP knows an access token is valid? As far as i know, the OP should say that an access_token is valid/invalid. There should be some way to check it right? How it gets to know that a token represents a valid authenticated user if it is not stored in DB?
see How to validate an OAuth 2.0 access token for a resource server? about thoughts on how the resource server should validate the access token before letting the request from the client go through.
It´s a bad idea to store access_token in a cookie? because sometimes we call to some webservices and we want to send access_token as parameter. Or there is another workaroundsolution?
I'm assuming you're using the authorization code grant flow (...from your questions). If that's the case, the reason why an authorization code is, first of all, passed back from the OP rather than the access token is so that the access token can stay hidden on the server side--away from the browser itself. With the authorization code grant flow, the access token should stay out of the browser. If you're wanting to send api requests to the resource server directly from the browser, then look into the oauth2 implicit flow (https://www.rfc-editor.org/rfc/rfc6749#section-4.2).
How the access token should be stored in the Client , for example, in ASP.NET, in the session?
In the OpenID Connect flavour of OAuth2, the access token is for offline_access (i.e. outside of an authenticated "session"). The access token could be used during the session of the user but it might be better to store the refresh token in the database so that your client app can request new access tokens whenever it needs to and as long as the refresh token is valid...even when the user's authentication is expired. The access token should be short-lived so storing it in the database is an option but not necessary.
OAuth 2.0 has multiple workflows. I have a few questions regarding the two.
Authorization code flow - User logs in from client app, authorization server returns an authorization code to the app. The app then exchanges the authorization code for access token.
Implicit grant flow - User logs in from client app, authorization server issues an access token to the client app directly.
What is the difference between the two approaches in terms of security? Which one is more secure and why?
I don't see a reason why an extra step (exchange authorization code for token) is added in one work flow when the server can directly issue an Access token.
Different websites say that Authorization code flow is used when client app can keep the credentials secure. Why?
The access_token is what you need to call a protected resource (an API). In the Authorization Code flow there are 2 steps to get it:
User must authenticate and returns a code to the API consumer (called the "Client").
The "client" of the API (usually your web server) exchanges the code obtained in #1 for an access_token, authenticating itself with a client_id and client_secret
It then can call the API with the access_token.
So, there's a double check: the user that owns the resources surfaced through an API and the client using the API (e.g. a web app). Both are validated for access to be granted. Notice the "authorization" nature of OAuth here: user grants access to his resource (through the code returned after authentication) to an app, the app get's an access_token, and calls on the user's behalf.
In the implicit flow, step 2 is omitted. So after user authentication, an access_token is returned directly, that you can use to access the resource. The API doesn't know who is calling that API. Anyone with the access_token can, whereas in the previous example only the web app would (it's internals not normally accessible to anyone).
The implicit flow is usually used in scenarios where storing client id and client secret is not recommended (a device for example, although many do it anyway). That's what the the disclaimer means. People have access to the client code and therefore could get the credentials and pretend to become resource clients. In the implicit flow all data is volatile and there's nothing stored in the app.
I'll add something here which I don't think is made clear in the above answers:
The Authorization-Code-Flow allows for the final access-token to never reach and never be stored on the machine with the browser/app. The temporary authorization-code is given to the machine with the browser/app, which is then sent to a server. The server can then exchange it with a full access token and have access to APIs etc. The user with the browser gets access to the API only through the server with the token.
Implicit flow can only involve two parties, and the final access token is stored on the client with the browser/app. If this browser/app is compromised so is their auth-token which could be dangerous.
tl;dr don't use implicit flow if you don't trust the users machine to hold tokens but you do trust your own servers.
The difference between both is that:
In Implicit flow,the token is returned directly via redirect URL with "#" sign and this used mostly in javascript clients or mobile applications that do not have server side at its own, and the client does not need to provide its secret in some implementations.
In Authorization code flow, code is returned with "?" to be readable by server side then server side is have to provide client secret this time to token url to get token as json object from authorization server. It is used in case you have application server that can handle this and store user token with his/her profile on his own system, and mostly used for common mobile applications.
so it is depends on the nature of your client application, which one more secure "Authorization code" as it is request the secret on client and the token can be sent between authorization server and client application on very secured connection, and the authorization provider can restrict some clients to use only "Authorization code" and disallow Implicit
Which one is more secure and why?
Both of them are secure, it depends in the environment you are using it.
I don't see a reason why an extra step (exchange authorization code
for token) is added in one work flow when the server can directly
issue an Access token.
It is simple. Your client is not secure. Let's see it in details.
Consider you are developing an application against Instagram API, so you register your APP with Instagram and define which API's you need. Instagram will provide you with client_id and client_secrect
On you web site you set up a link which says. "Come and Use My Application". Clicking on this your web application should make two calls to Instagram API.
First send a request to Instagram Authentication Server with below parameters.
1. `response_type` with the value `code`
2. `client_id` you have get from `Instagram`
3. `redirect_uri` this is a url on your server which do the second call
4. `scope` a space delimited list of scopes
5. `state` with a CSRF token.
You don't send client_secret, You could not trust the client (The user and or his browser which try to use you application). The client can see the url or java script and find your client_secrect easily. This is why you need another step.
You receive a code and state. The code here is temporary and is not saved any where.
Then you make a second call to Instagram API (from your server)
1. `grant_type` with the value of `authorization_code`
2. `client_id` with the client identifier
3. `client_secret` with the client secret
4. `redirect_uri` with the same redirect URI the user was redirect back to
5. `code` which we have already received.
As the call is made from our server we can safely use client_secret ( which shows who we are), with code which shows the user have granted out client_id to use the resource.
In response we will have access_token
The implicit grant is similar to the authorization code grant with two distinct differences.
It is intended to be used for user-agent-based clients (e.g. single page web apps) that can’t keep a client secret because all of the application code and storage is easily accessible.
Secondly instead of the authorization server returning an authorization code which is exchanged for an access token, the authorization server returns an access token.
Please find details here
http://oauth2.thephpleague.com/authorization-server/which-grant/
Let me summarize the points that I learned from above answers and add some of my own understandings.
Authorization Code Flow!!!
If you have a web application server that act as OAuth client
If you want to have long lived access
If you want to have offline access to data
when you are accountable for api calls that your app makes
If you do not want to leak your OAuth token
If you don't want you application to run through authorization flow every time it needs access to data. NOTE: The Implicit Grant flow does not entertain refresh token so if authorization server expires access tokens regularly, your application will need to run through the authorization flow whenever it needs access.
Implicit Grant Flow!!!
When you don't have Web Application Server to act as OAuth Client
If you don't need long lived access i.e only temporary access to data is required.
If you trust the browser where your app runs and there is limited concern that the access token will leak to untrusted users.
Implicit grant should not be used anymore, see the IETF current best practices for details. https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics-18#section-2.1.2
As an alternative use a flow with response type code; for clients without possibility to securely store client credentials the authorization code with PKCE flow should be your choice.
From practical perspective (What I understood), The main reason for having Authz code flow is :
Support for refresh tokens (long term access by apps on behalf of User), not supported in implicit: refer:https://www.rfc-editor.org/rfc/rfc6749#section-4.2
Support for consent page which is a place where Resource Owner can control what access to provide (Kind of permissions/authorization page that you see in google). Same is not there in implicit . See section : https://www.rfc-editor.org/rfc/rfc6749#section-4.1 , point (B)
"The authorization server authenticates the resource owner (via the user-agent) and establishes whether the resource owner grants or denies the client's access request"
Apart from that, Using refresh tokens, Apps can get long term access to user data.
There seem to be two key points, not discussed so far, which explain why the detour in the Authorization Code Grant Type adds security.
Short story: The Authorization Code Grant Type keeps sensitive information from the browser history, and the transmission of the token depends only on the HTTPS protection of the authorization server.
Longer version:
In the following, I'll stick with the OAuth 2 terminology defined in the RFC (it's a quick read): resource server, client, authorization server, resource owner.
Imagine you want some third-party app (= client) to access certain data of your Google account (= resource server). Let's just assume Google uses OAuth 2. You are the resource owner for the Google account, but right now you operate the third-party app.
First, the client opens a browser to send you to the secure URL of the Google authorization server. Then you approve the request for access, and the authorization server sends you back to the client's previously-given redirect URL, with the authorization code in the query string. Now for the two key points:
The URL of this redirect ends up in the browser history. So we don't want a long lived, directly usable access token here. The short lived authorization code is less dangerous in the history. Note that the Implicit Grant type does put the token in the history.
The security of this redirect depends on the HTTPS certificate of the client, not on Google's certificate. So we get the client's transmission security as an extra attack vector (For this to be unavoidable, the client needs to be non-JavaScript. Since otherwise we could transmit the authorization code via fragment URL, where the code would not go through the network. This may be the reason why Implicit Grant Type, which does use a fragment URL, used to be recommended for JavaScript clients, even though that's no longer so.)
With the Authorization Code Grant Type, the token is finally obtained by a call from the client to the authorization server, where transmission security only depends on the authorization server, not on the client.