As soon as I log in successfully to the authentication server, the server redirects back to the application with an authorization code. This authorization code is then used in the backend to get the access token. My doubt is this: if somebody has seen, captured, or copied my authorization code before I have used it, can he also log in with my credentials? Is my thinking correct, or am I missing some security step in the process?
Edit: I am mostly concerned with the case where somebody has seen the authorization code in my browser history and then sends this code from another machine to get an access token. How can we prevent that?
You are correct: this is why in OAuth 2.0 there should be a fixed, registered callback URI that the Client receives the Authorization Code on, which is enforced by profiles such as OpenID Connect. The security considerations section of the specification analyzes the risks of the authorization code concept in more depth: https://www.rfc-editor.org/rfc/rfc6749#section-10.5
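A minimal sketch of the registered callback check, assuming the authorization server keeps a per-client set of exact redirect URIs. The registry contents, the client name, and the function name are hypothetical and only illustrate the idea of exact matching:

```python
# Hypothetical registry: client_id -> exact redirect URIs registered in advance.
REGISTERED_REDIRECT_URIS = {
    "example-client": {"https://app.example.com/callback"},
}


def is_allowed_redirect(client_id: str, redirect_uri: str) -> bool:
    """Return True only if redirect_uri exactly matches a registered value."""
    return redirect_uri in REGISTERED_REDIRECT_URIS.get(client_id, set())
```

With exact matching, a request that names an attacker-controlled callback is rejected before any code is issued.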
Your client's secret is required to exchange the authorization code for an access token. So if you keep your client secret on your backend, an attacker may be able to capture your authorization code but won't be able to get an access token.
Then, a lot depends on your redirect URI. If it belongs to your server, uses HTTPS (so the attacker can't listen to the traffic), and your server doesn't allow open redirects, the attacker will probably fail to capture the code.
Next, authorization codes are expected to be single-use. So if an attacker finds the code in your browser history, it has probably already been exchanged for a token, which is why it cannot be used a second time.
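The single-use behaviour can be sketched roughly as follows. This is an illustration of how an authorization server might track codes, not how any particular server does it; the class name, the in-memory dictionary, and the 60-second lifetime are all assumptions:

```python
import secrets
import time


class AuthorizationCodeStore:
    """Hypothetical in-memory store for short-lived, single-use codes."""

    def __init__(self, ttl_seconds: float = 60.0):
        self._ttl = ttl_seconds
        self._codes = {}  # code -> (client_id, expiry on the monotonic clock)

    def issue(self, client_id: str) -> str:
        code = secrets.token_urlsafe(32)
        self._codes[code] = (client_id, time.monotonic() + self._ttl)
        return code

    def redeem(self, code: str, client_id: str) -> bool:
        # pop() removes the code on the first attempt, so a replay always fails.
        entry = self._codes.pop(code, None)
        if entry is None:
            return False
        issued_to, expires_at = entry
        return issued_to == client_id and time.monotonic() < expires_at
```

A code copied from the browser history after the legitimate exchange is no longer in the store, so redeeming it fails.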
Please note that here I assume a well-designed OAuth 2.0 implementation. OAuth 2.0 implementations that ignore the requirements of RFC 6749 can be subject to a large variety of attacks.
I understand the OAuth code flow, which involves the mobile app, app server, auth server, and resource server. The app server is registered with the auth server using the client ID and secret. The idea is that the mobile app calls an endpoint of the app server, which triggers the code flow, eventually resulting in a callback from the auth server to the app server with the auth code. The app server presents the secret and code to the auth server to get the access token.
The other legacy option where there is no clientid and secret is the implicit flow wherein the mobile app receives the redirect url with the auth code (assuming redirect url destination is a SPA) which will invoke auth server endpoint to get the access token.
This is insecure because anyone can steal the access code from the url.
The solution to this for clients like mobile apps is to use PKCE. A hash of a random value is sent in the initial request, which is verified later on when the auth code is passed to retrieve the access token.
This prevents the compromise of the access code from the url if an attacker is snooping because without initial hash the auth code is useless.
However how can the situation where the mobile phone is hacked and the secret and auth code is recorded by an attacker be handled to prevent misuse?
This is out of scope for OAuth 2.0 and related specifications. The issue is similar to storing encryption details on a server: the server can still be attacked by someone who gains physical access. It's a different attack vector altogether. It is the user's duty to make sure their devices are safe from other vulnerabilities.
However, PKCE provides an extra security layer for public clients' usage of OAuth flow. It prevents attacks based on redirect (authorization code stealing), by establishing secondary validation at the authorization server.
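As a rough sketch of that secondary validation, here is the S256 method from RFC 7636 using only the Python standard library. The function names are my own, and a real authorization server would store the challenge alongside the issued code:

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    # 32 random bytes encode to 43 URL-safe characters (RFC 7636 allows 43 to 128).
    return secrets.token_urlsafe(32)


def make_code_challenge(code_verifier: str) -> str:
    # S256: base64url(SHA-256(verifier)) with the trailing "=" padding removed.
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verifier_matches(code_verifier: str, stored_challenge: str) -> bool:
    # Server side: recompute the challenge from the verifier sent with the token request.
    recomputed = make_code_challenge(code_verifier)
    return hmac.compare_digest(recomputed.encode("ascii"), stored_challenge.encode("ascii"))
```

The client sends only the challenge in the authorization request and keeps the verifier to itself, so a code captured from the redirect cannot be redeemed without it.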
In general, read through OAuth 2.0 Threat Model and Security Considerations & OAuth 2.0 for Native Apps for best practice suggestions.
These are the standard options:
PKCE uses a different code_verifier and code_challenge for every login attempt. If an authorization code is somehow captured from the system browser by an attacker, it cannot be exchanged for tokens. No client secret is used, since a mobile app is a public client.
Use HTTPS redirect URIs (based on mobile deep links) so that if an attacker steals your client_id and redirect_uri they cannot receive the response containing the authorization code and will not be able to get tokens.
See this previous answer of mine for some further details, though claimed HTTPS schemes are tricky to implement.
Of course, if an attacker has full control over a device, including authentication factors such as autofilled passwords, there may still be attack vectors.
In Oauth Open ID - Authorization Code grant type flow,
We will call the Oauth service provider with the client_id = '..', redirect_uri='...', response_type='code', scope='...', state='...'.
Then from Oauth Service Provider, we will get the authorization code instead of the token.
Q1. So what is the next step? Do we send the code to the back end, where the token request will happen, or do we call the OAuth service provider from the browser itself?
Q2. Why do we need these additional calls? What problem do they solve?
Q3. After the token is received, how do we use it in a typical web application?
p.s: I have read lot of blogs, but unable to get the whole picture. Could you please help me?
Q1. In 2021 it is recommended to keep tokens out of the browser, so send the code to the back end, which will exchange it for tokens and issue secure SameSite HTTP Only cookies to the browser. The cookies can contain tokens if they are strongly encrypted.
Q2. The separation is to protect against browser attacks, where login redirects take place. An authorization code can only be used once but can potentially be intercepted - by a 'man in the browser' - eg some kind of plugin or malicious code. If this happens then the attacker cannot exchange it for tokens since a code_verifier and client_secret are also needed.
Q3. The token is sent from the browser to APIs, but the browser cannot store tokens securely. So it is recommended to unpack tokens from cookies in a server side component, such as a reverse proxy. This limits the scope for tokens to be intercepted in the browser, and also deals well with token renewal, page reloads and multi tab browsing.
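To make the cookie attributes from Q1 concrete, here is a small sketch using Python's standard `http.cookies` module. The cookie name and value are placeholders, and encrypting the value is assumed to happen elsewhere in the back end:

```python
from http.cookies import SimpleCookie


def build_session_cookie(name: str, encrypted_value: str) -> str:
    """Return the value of a Set-Cookie header for a browser-facing cookie."""
    cookie = SimpleCookie()
    cookie[name] = encrypted_value
    cookie[name]["httponly"] = True      # not readable from JavaScript
    cookie[name]["secure"] = True        # only sent over HTTPS
    cookie[name]["samesite"] = "Strict"  # not sent on cross-site requests
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()
```

The back end would send this string in a `Set-Cookie` response header after exchanging the code for tokens.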
APPROACHES
The above type of solution can be implemented in two different ways:
Use a website based technology that does OAuth work and also serves web content
Use an SPA and implement OAuth work in an API driven manner
Unfortunately OAuth / OpenID in the browser is difficult. At Curity we have provided some resources based on the benefit of our experience, and we hope that this provides a 'whole picture' view of overall behaviour for modern browser based apps:
Code
Docs
I went thru multiple posts saying how implicit grant is a security risk and why auth code grant with AJAX request to Authorization server should be used after redirecting to application (without client_secret passed to Auth server).
Now in 2019 there is no CORS issue as I can allow app domains on the authorization server.
I have following concerns
If I use implicit grant:
Now implicit grant has security issues as Authorization server redirects to application server with token in url.
If I set expiration time to 5 to 10 minutes, after expiration, user will be redirected to login and its problematic especially if he is filling up important form on application. What to do in this scenario? Note that there is no refresh token in Implicit grant to update with new token, so refresh token is out of the picture.
If I use Auth code grant:
Suppose if I hit AJAX request after getting redirected to my main application site, and get token in exchange of code,
Auth code grant uses client_secret. And in a JavaScript app, where anyone can see the code, we can't use a secret.
Assume now that I don't use client_secret. There are multiple sites that use the auth server, say sites 1, 2, and 3. If we don't use a secret, anyone can make a host entry in an nginx server that has my site's domain name but his own IP address. In this case host injection is the issue. How to deal with it?
What approach should be taken here? I am more inclined towards auth_code for SPA but the issue is how to deal with client_secret?
Thank you for reading.
There are multiple links that recommend using the auth code grant instead of the implicit grant for SPAs. A few of them:
https://www.oauth.com/oauth2-servers/single-page-apps/
https://medium.com/oauth-2/why-you-should-stop-using-the-oauth-implicit-grant-2436ced1c926
You've already linked references that make it clear that you should NOT use implicit grant in an SPA. Your confusion seems to come from your assumption that the code grant flow requires use of a client secret, but that is not the case. A SPA, like any browser or device app, is (should be) a PUBLIC client, and cannot be trusted with a secret. Therefore, it does not use a secret. The client secret is suitable ONLY for use with private clients, which is to say, server-side code calling the auth api.
The 2019 recommendation is to use a PKCE variation of the Authorization Code Flow that does not need a client secret:
https://brockallen.com/2019/01/03/the-state-of-the-implicit-flow-in-oauth2/
There's some write ups and code samples on my blog that might be useful to you - I updated this one on messages recently:
https://authguidance.com/2017/09/26/basicspa-oauthworkflow/
OAuth 2.0 has multiple workflows. I have a few questions regarding the two.
Authorization code flow - User logs in from client app, authorization server returns an authorization code to the app. The app then exchanges the authorization code for access token.
Implicit grant flow - User logs in from client app, authorization server issues an access token to the client app directly.
What is the difference between the two approaches in terms of security? Which one is more secure and why?
I don't see a reason why an extra step (exchange authorization code for token) is added in one work flow when the server can directly issue an Access token.
Different websites say that Authorization code flow is used when client app can keep the credentials secure. Why?
The access_token is what you need to call a protected resource (an API). In the Authorization Code flow there are 2 steps to get it:
User must authenticate and returns a code to the API consumer (called the "Client").
The "client" of the API (usually your web server) exchanges the code obtained in #1 for an access_token, authenticating itself with a client_id and client_secret
It then can call the API with the access_token.
So, there's a double check: the user that owns the resources surfaced through an API and the client using the API (e.g. a web app). Both are validated for access to be granted. Notice the "authorization" nature of OAuth here: the user grants access to his resource (through the code returned after authentication) to an app, the app gets an access_token, and calls on the user's behalf.
In the implicit flow, step 2 is omitted. So after user authentication, an access_token is returned directly, which you can use to access the resource. The API doesn't know who is calling that API. Anyone with the access_token can, whereas in the previous example only the web app would (its internals are not normally accessible to anyone).
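The "anyone with the access_token can call" point follows from how bearer tokens are sent. A minimal sketch, assuming the API accepts the standard `Authorization: Bearer` header; the function name and the URL in the usage note are placeholders:

```python
from urllib.request import Request


def build_api_request(url: str, access_token: str) -> Request:
    # The token is the only credential on the request; nothing identifies the caller.
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})
```

For example, `build_api_request("https://api.example.com/me", token)` produces the same request regardless of which machine or application holds the token.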
The implicit flow is usually used in scenarios where storing client id and client secret is not recommended (a device for example, although many do it anyway). That's what the disclaimer means. People have access to the client code and therefore could get the credentials and pretend to be resource clients. In the implicit flow all data is volatile and there's nothing stored in the app.
I'll add something here which I don't think is made clear in the above answers:
The Authorization-Code-Flow allows for the final access-token to never reach and never be stored on the machine with the browser/app. The temporary authorization-code is given to the machine with the browser/app, which is then sent to a server. The server can then exchange it with a full access token and have access to APIs etc. The user with the browser gets access to the API only through the server with the token.
Implicit flow can only involve two parties, and the final access token is stored on the client with the browser/app. If this browser/app is compromised so is their auth-token which could be dangerous.
tl;dr don't use implicit flow if you don't trust the users machine to hold tokens but you do trust your own servers.
The difference between both is that:
In the Implicit flow, the token is returned directly via the redirect URL after the "#" sign. This is used mostly in JavaScript clients or mobile applications that do not have a server side of their own, and in some implementations the client does not need to provide its secret.
In the Authorization code flow, the code is returned after the "?" so that it is readable by the server side. The server side then has to provide the client secret to the token URL to get the token as a JSON object from the authorization server. It is used when you have an application server that can handle this and store the user's token with his/her profile on its own system, and it is mostly used for common mobile applications.
So it depends on the nature of your client application. The more secure one is "Authorization code", as it requires the client's secret, and the token can be sent between the authorization server and the client application over a very secure connection. The authorization provider can also restrict some clients to use only "Authorization code" and disallow Implicit.
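The "#" versus "?" difference can be illustrated with the standard URL parser. Browsers do not include the fragment in the HTTP request they send, so only the query part reaches the server. The URLs in the usage note are made-up examples:

```python
from urllib.parse import parse_qs, urlsplit


def split_redirect(url: str):
    """Return (query parameters, fragment parameters) of a redirect URL."""
    parts = urlsplit(url)
    return parse_qs(parts.query), parse_qs(parts.fragment)


def request_target(url: str) -> str:
    """What the server sees in the request line: path and query, never the fragment."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")
```

For `https://app.example.com/cb?code=abc&state=s` the code is in the query and reaches the server, while for `https://app.example.com/cb#access_token=tok` the server only sees `/cb` and the token stays in the browser.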
Which one is more secure and why?
Both of them are secure; it depends on the environment you are using them in.
I don't see a reason why an extra step (exchange authorization code
for token) is added in one work flow when the server can directly
issue an Access token.
It is simple. Your client is not secure. Let's look at it in detail.
Consider you are developing an application against the Instagram API, so you register your app with Instagram and define which APIs you need. Instagram will provide you with a client_id and client_secret.
On your web site you set up a link which says "Come and Use My Application". When the user clicks on it, your web application should make two calls to the Instagram API.
First send a request to Instagram Authentication Server with below parameters.
1. `response_type` with the value `code`
2. `client_id` you got from `Instagram`
3. `redirect_uri` this is a URL on your server which makes the second call
4. `scope` a space delimited list of scopes
5. `state` with a CSRF token.
You don't send client_secret, because you cannot trust the client (the user and/or his browser which tries to use your application). The client can see the URL or JavaScript and find your client_secret easily. This is why you need another step.
You receive a code and state. The code here is temporary and is not saved anywhere.
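The first call and the handling of its response can be sketched like this. The endpoint URL is a placeholder rather than Instagram's real one, and the function names are mine:

```python
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"  # placeholder URL


def build_authorization_url(client_id, redirect_uri, scopes, state):
    # Note that client_secret is deliberately absent: this URL passes through the browser.
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # space-delimited list
        "state": state,             # CSRF token tied to the user's session
    }
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params)


def extract_code(callback_url, expected_state):
    query = parse_qs(urlsplit(callback_url).query)
    state = query.get("state", [""])[0]
    if not hmac.compare_digest(state.encode("utf-8"), expected_state.encode("utf-8")):
        raise ValueError("state mismatch, possible CSRF")
    return query["code"][0]
```

The `state` value would typically come from something like `secrets.token_urlsafe()` and be kept in the user's session until the callback arrives.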
Then you make a second call to Instagram API (from your server)
1. `grant_type` with the value of `authorization_code`
2. `client_id` with the client identifier
3. `client_secret` with the client secret
4. `redirect_uri` with the same redirect URI the user was redirected back to
5. `code` which we have already received.
As the call is made from our server, we can safely use client_secret (which shows who we are) together with the code, which shows that the user has granted our client_id access to the resource.
In response we will have access_token
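A sketch of that second, server-to-server call, using only the standard library. The token endpoint URL is a placeholder, and the request is only built here, not sent:

```python
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_ENDPOINT = "https://auth.example.com/token"  # placeholder URL


def build_token_request(code, client_id, client_secret, redirect_uri) -> Request:
    body = urlencode({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "client_secret": client_secret,  # never leaves the server
        "redirect_uri": redirect_uri,    # must match the first call
        "code": code,
    }).encode("ascii")
    return Request(
        TOKEN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the exchange; the JSON response is expected to contain the `access_token`.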
The implicit grant is similar to the authorization code grant with two distinct differences.
It is intended to be used for user-agent-based clients (e.g. single page web apps) that can’t keep a client secret because all of the application code and storage is easily accessible.
Secondly instead of the authorization server returning an authorization code which is exchanged for an access token, the authorization server returns an access token.
Please find details here
http://oauth2.thephpleague.com/authorization-server/which-grant/
Let me summarize the points that I learned from above answers and add some of my own understandings.
Authorization Code Flow!!!
If you have a web application server that act as OAuth client
If you want to have long lived access
If you want to have offline access to data
when you are accountable for api calls that your app makes
If you do not want to leak your OAuth token
If you don't want your application to run through the authorization flow every time it needs access to data. NOTE: The Implicit Grant flow does not support refresh tokens, so if the authorization server expires access tokens regularly, your application will need to run through the authorization flow whenever it needs access.
Implicit Grant Flow!!!
When you don't have Web Application Server to act as OAuth Client
If you don't need long lived access i.e only temporary access to data is required.
If you trust the browser where your app runs and there is limited concern that the access token will leak to untrusted users.
Implicit grant should not be used anymore, see the IETF current best practices for details. https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics-18#section-2.1.2
As an alternative use a flow with response type code; for clients without possibility to securely store client credentials the authorization code with PKCE flow should be your choice.
From a practical perspective (as I understand it), the main reasons for having the authorization code flow are:
Support for refresh tokens (long term access by apps on behalf of User), not supported in implicit: refer:https://www.rfc-editor.org/rfc/rfc6749#section-4.2
Support for consent page which is a place where Resource Owner can control what access to provide (Kind of permissions/authorization page that you see in google). Same is not there in implicit . See section : https://www.rfc-editor.org/rfc/rfc6749#section-4.1 , point (B)
"The authorization server authenticates the resource owner (via the user-agent) and establishes whether the resource owner grants or denies the client's access request"
Apart from that, Using refresh tokens, Apps can get long term access to user data.
There seem to be two key points, not discussed so far, which explain why the detour in the Authorization Code Grant Type adds security.
Short story: The Authorization Code Grant Type keeps sensitive information from the browser history, and the transmission of the token depends only on the HTTPS protection of the authorization server.
Longer version:
In the following, I'll stick with the OAuth 2 terminology defined in the RFC (it's a quick read): resource server, client, authorization server, resource owner.
Imagine you want some third-party app (= client) to access certain data of your Google account (= resource server). Let's just assume Google uses OAuth 2. You are the resource owner for the Google account, but right now you operate the third-party app.
First, the client opens a browser to send you to the secure URL of the Google authorization server. Then you approve the request for access, and the authorization server sends you back to the client's previously-given redirect URL, with the authorization code in the query string. Now for the two key points:
The URL of this redirect ends up in the browser history. So we don't want a long lived, directly usable access token here. The short lived authorization code is less dangerous in the history. Note that the Implicit Grant type does put the token in the history.
The security of this redirect depends on the HTTPS certificate of the client, not on Google's certificate. So we get the client's transmission security as an extra attack vector. (For this to be unavoidable, the client needs to be non-JavaScript, since otherwise we could transmit the authorization code via a fragment URL, where the code would not go through the network. This may be the reason why the Implicit Grant Type, which does use a fragment URL, used to be recommended for JavaScript clients, even though that's no longer so.)
With the Authorization Code Grant Type, the token is finally obtained by a call from the client to the authorization server, where transmission security only depends on the authorization server, not on the client.
Using oAuth 2.0, in "authorization-code" Authorization Grant, I first call to "/authorize", get the code, and then use this code within a call to "/token" to get the access-token.
My question: why this is the flow? I guess it is from a security reason, but I cannot figure it out. Why the implementation is this way, and not getting the access-token immediately after the first call ("/authorize")?
What do we need this code for?
Could it also be that by having this intermediate step prevents the client from seeing the access token?
From O'Reilly book:
Authorization code This grant type is most appropriate for server-side web applications. After the resource owner has
authorized access to their data, they are redirected back to the web
application with an authorization code as a query parameter in the
URL. This code must be exchanged for an access token by the client
application. This exchange is done server-to-server and requires
both the client_id and client_secret, preventing even the resource
owner from obtaining the access token. This grant type also allows for
long-lived access to an API by using refresh tokens.
Implicit grant for browser-based client-side applications The implicit grant is the most simplistic of all flows, and is optimized
for client-side web applications running in a browser. The resource
owner grants access to the application, and a new access token is
immediately minted and passed back to the application using a #hash
fragment in the URL. The application can immediately extract the
access token from the hash fragment (using JavaScript) and make API
requests. This grant type does not require the intermediary
“authorization code,” but it also doesn’t make available refresh
tokens for long-lived access.
UPDATE - yes indeed:
When Should the Authorization Code Flow Be Used? The Authorization
Code flow should be used when
Long-lived access is required.
The OAuth client is a web application server.
Accountability for API calls is very important and the OAuth token shouldn’t be leaked to the browser, where the user may have access to
it.
More:
Perhaps most importantly—because the access token is never sent
through the browser— there is less risk that the access token will be
leaked to malicious code through browser history, referer headers,
JavaScript, and the like.
The authorization code flow is meant for scenarios where 3 parties are involved.
These parties are:
Client
The user with his web browser. He wants to use your application.
Provider
Has information about the user. If somebody wants to access this data, the user has to agree first.
Your (web) application
Wants to access information about the user from the provider.
Now your app says to the user (redirecting his browser to the /authorize endpoint):
Hey user, here is my client id. Please talk to the provider and grant him to talk to me directly.
So the user talks to the provider (requests the authorization code and returns it to your app by opening your callback URL in his browser):
Hey provider, I want to use this app, so they require to access my data. Give me some code and I give this code to the application.
Now your app has the authorization code which is already known by client AND the provider. By handing this over to the provider your app can now prove, that it was allowed by the client to access his data. The provider now issues your (web) app an access token, so your (web) app won't have to redo these steps each time (at least for a while).
In case of other application types where your app is running directly at the client side (such as iPhone/Android apps or Javascript clients), the intermediate step is redundant.
Data on client side is generally considered unsafe. In the case of implicit calls where token is granted in the initial step itself, anyone with the access_token can request for data, the API doesn't know who is calling that API.
But, in the case of web-server apps where the application wants to identify itself, client_id with client_secret is sent along with authorization_code to get access_token, which in future can be sent independently.
Suppose the access_token were granted in the initial step itself. Then, since client_id and access_token would still be considered exposed, the app would have to send client_secret in addition to access_token every time to assure that the request is really coming from it.
While in the current scenario, after getting access_token, further requests can be made independently without needing client_secret.
One important point is
Perhaps most importantly—because the access token is never sent through the browser— there is less risk that the access token will be leaked to malicious code through browser history, referer headers, JavaScript, and the like.
I think it is like this;
When we use the authorization code, we have 2 verification parts;
1; to verify ownership of the user, because he logs in
2; we know that the client, is really who he says he is because the client is sending his client_secret.
So if we returned the access token at the moment the user authenticates, instead of the authorization code, we would know that it is the user requesting it, but we wouldn't know that it will be used by the registered client (for example, your web app).
When we use the 'implicit grant'; (or return the access token instead of authorization code)
1; We know it is the user who is receiving the access token, but there is no need in getting a authorization code because the 'user-agent' based application is not checkable. It is checkable, if you think about it but it is usable for everyone. The client_secret is publicly viewable in the source code of the 'user-agent' based application so everyone can just 'view source code' and copy the client_secret and use this method to verify ownership of the client.
#ksht's answer is basically correct. For those looking for the simple, brief answer, it is this:
Because the client app (browser or native app) can have the delivered token intercepted. The OAuth implicit flow does allow this, but only under very specific circumstances. In all other cases either the browser can leak info (hacks in the OS, browser bugs, plugins) or, for native apps, your custom URL scheme that maps the redirect URL to the app can be intercepted. So the workaround is to send back a code instead of a token (over TLS) and use PKCE to ensure that the code can be securely exchanged for a token.