I have been reading about the different OAuth flows and am confused about the Authorization Code flow. It is said that the Authorization Code flow is more secure because even if the authorization code is hijacked in transit, it is useless to the hacker, since the hacker would also need the client id and client secret to acquire the access token. But what if, when the client requests the access token after receiving the authorization code, the hacker intercepts that transmission and gets the access token?
I don't know, but it looks like the authorization code only adds an extra layer of security rather than completely securing the access tokens.
Am I right? Please correct me.
The typical use case for the Authorization Code flow is that the token request (i.e. the one that exchanges the authorization code for an access token) is done over a TLS-protected back channel, which means that the attacker cannot get to it unless he is able to break TLS, in which case there are much bigger problems.
But for the front-channel use case, i.e. an in-browser JavaScript application or Single Page Application, you are right: a hacker could intercept the token request almost as easily as the redirect. That is also why that use case cannot use a confidential client: the secret would be too easily exposed.
The Authorization code flow makes sense when you have a frontend client which also has access to a backend which can securely talk to the auth server.
The flow would be as follows:
The frontend client redirects the user to the auth server URL.
The auth server (after login) redirects the user back to the frontend client's redirectUri.
The frontend client extracts the code from this URL and passes it on to the backend.
The backend then exchanges this code for an access token by interacting directly with the auth server.
The backend to auth server communication is what the hacker can't intercept (easily).
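The last step, the backend-to-auth-server exchange, might look roughly like the following. This is a minimal sketch that only builds the request and does not send it. The endpoint URL, the credentials and the function name are hypothetical placeholders; the parameter names follow RFC 6749 section 4.1.3, and the client credentials are placed in the request body, which that RFC permits as an alternative to HTTP Basic authentication.

```python
import urllib.parse
import urllib.request

# Hypothetical token endpoint; a real one comes from the provider's documentation.
TOKEN_ENDPOINT = "https://auth.example.com/token"


def build_token_request(code, redirect_uri, client_id, client_secret):
    """Build (but do not send) the back-channel POST that trades a code for a token."""
    body = urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        # The secret only ever travels from the backend to the auth server over TLS.
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_token_request(
    code="SplxlOBeZQQYbYS6WxSbIA",
    redirect_uri="https://client.example.com/cb",
    client_id="s6BhdRkqt3",
    client_secret="hypothetical-client-secret",
)
```

Passing `request` to `urllib.request.urlopen` would perform the exchange. The browser never sees this request, so intercepting the redirect alone does not reveal the secret or the token.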
Something I can't wrap my head around.
As I understand it, the authorization code flow is supposed to be more secure than the implicit flow, because the tokens are not sent directly to the client from the authorization server, but rather retrieved by your backend.
So the flow is basically:
1. The browser gets the authorization code (as a URL parameter of sorts).
2. It sends the code to a public backend endpoint.
3. The backend sends the code + client secret to the authorization server, retrieves the tokens and stores them in the client's cookie/local storage for further use.
In this flow, all the tutorials describe the authorization code as useless to the hacker. Why is that? Can't a hacker use Postman or some other client to access your (public) API directly, make it go through step 3, and thus retrieve the tokens just the same?
What am I missing here?
The code can be used exactly once. In many scenarios where an attacker might get access to the code, it has already been exchanged for an access token and is therefore useless.
The authorization_code is a one-time token.
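The single-use property can be illustrated with a toy, in-memory authorization server. This is only a sketch: the class and method names are made up for illustration, and a real server would also check the client, the redirect URI and an expiry time.

```python
import secrets


class ToyAuthServer:
    """Hypothetical in-memory server that issues single-use authorization codes."""

    def __init__(self):
        self._pending_codes = {}  # code -> user the code was issued for

    def issue_code(self, user):
        code = secrets.token_urlsafe(16)
        self._pending_codes[code] = user
        return code

    def redeem_code(self, code):
        # pop() removes the code on first use, so any replay is rejected.
        user = self._pending_codes.pop(code, None)
        if user is None:
            raise ValueError("invalid_grant")
        return {"access_token": secrets.token_urlsafe(32), "user": user}


server = ToyAuthServer()
code = server.issue_code("alice")
first = server.redeem_code(code)      # the legitimate exchange succeeds
try:
    server.redeem_code(code)          # an attacker replaying the same code
    replay_rejected = False
except ValueError:
    replay_rejected = True
```

An attacker who finds the code in browser history after the legitimate exchange ends up in the `except` branch.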
The authorization code (aka auth code) travels over the public front channel so that the client can then use a secure back channel between itself and the authorization server to exchange the code for the access token, without involving the browser.
The auth code is public and can be intercepted via a proxy, since it appears in the query string of the redirect_uri and passes through the browser (which is considered insecure). The exchange for an access token depends on both the auth code (public) and the client_secret (private). Without the client_secret, an attacker cannot get the access token short of brute-forcing his way through.
Summary: even if the attacker knows the auth code, he cannot do anything without the client_secret, which is given to the client at registration (or dynamically) and is assumed to be kept secure.
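A rough sketch of why a stolen code alone is not enough: the token endpoint authenticates the client before it looks at the code. The registry contents and function names below are hypothetical, and `hmac.compare_digest` is used so that the comparison takes constant time.

```python
import hmac

# Hypothetical registry: client_id -> client_secret issued at registration.
REGISTERED_CLIENTS = {"s6BhdRkqt3": "hypothetical-client-secret"}


def authenticate_client(client_id, client_secret):
    expected = REGISTERED_CLIENTS.get(client_id)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), client_secret.encode("utf-8"))


def exchange_code(code, valid_codes, client_id, client_secret):
    """Return a dummy access token only for an authenticated client with a valid code."""
    if not authenticate_client(client_id, client_secret):
        raise PermissionError("invalid_client")
    if code not in valid_codes:
        raise ValueError("invalid_grant")
    valid_codes.remove(code)  # codes are single-use
    return "access-token-for-" + client_id


codes = {"intercepted-code"}
try:
    exchange_code("intercepted-code", codes, "s6BhdRkqt3", "attacker-guess")
    attacker_blocked = False
except PermissionError:
    attacker_blocked = True

token = exchange_code("intercepted-code", codes, "s6BhdRkqt3", "hypothetical-client-secret")
```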
Imagine this attack
An attacker intercepts the first call to the authorization server, then they have the code-challenge. (step 1 in the diagram)
The attacker now intercepts the response from the authorization server with the authorization code. (step 2 in the diagram)
Then the attacker can POST the authorization-code and the code-verifier to get the access token. (step 3)
Refer to this diagram:
[Diagram: flow]
Questions
What prevents the attacker intercepting the first call to the authorization server? This is what is meant to make authorization code + PKCE more secure than implicit flow.
Perhaps it does not matter if the call is intercepted because the code-challenge is hashed and therefore the attacker does not have the code-verifier required for the 2nd call. But what if the code-challenge is not hashed?
PKCE is meant to address the threat of the access token / authorization code being leaked from the URL, which is relatively likely compared to an attacker intercepting TLS traffic:
URLs are visible in the address bar
URLs are saved in the browser history
On native platforms multiple applications may be registered to use the same custom URI scheme
That said, it's recommended that the code challenge be a SHA-256 hash of the code verifier. Therefore, even if the attacker were to intercept the code challenge, they would be unable to complete the token exchange without being able to reverse SHA-256.
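As a minimal sketch of that recommendation, this is how a client could derive the challenge from the verifier using the S256 method of RFC 7636 (base64url-encoded SHA-256, without padding). The function names are illustrative and not taken from any particular library.

```python
import base64
import hashlib
import secrets


def make_code_verifier():
    # 32 random bytes encode to 43 URL-safe characters (RFC 7636 allows 43 to 128).
    return secrets.token_urlsafe(32)


def s256_challenge(code_verifier):
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    # base64url without the trailing "=" padding, as RFC 7636 specifies.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()          # kept private by the client
challenge = s256_challenge(verifier)     # sent in the authorization request
```

Only `challenge` appears in the front channel. With the "plain" method the challenge would equal the verifier, so anyone who saw the first request could complete the exchange, which is why S256 is the recommended method.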
Also see: What is PKCE actually protecting?
PKCE is meant to ensure that the client that requested the user to be authenticated is the same client that exchanges the authorization code for an access token.
All communication with the authorization server uses HTTPS, so it's difficult to intercept. But some mobile platforms allow (malicious) apps to register the same redirect_uri as the legitimate client. This means that when the authorization server redirects back to the client with the authorization code, both the legitimate client and the malicious client are called with the code. This allows the malicious client to exchange the code for an access token, since no client authentication is done.
PKCE solves this by including the code_challenge, which is a hash of the code verifier, in the authorization request. It then requires the actual verifier in the token exchange call. The malicious client does not have the verifier and therefore cannot prove that it started the flow, and thus cannot exchange the code for a token.
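The server side of that check can be sketched as follows. This is a toy in-memory model with hypothetical names, assuming the S256 method; it only shows that a party holding the code but not the verifier is rejected.

```python
import base64
import hashlib
import hmac
import secrets


def s256(code_verifier):
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


class ToyPkceServer:
    """Hypothetical authorization server that binds each code to a code_challenge."""

    def __init__(self):
        self._challenges = {}  # authorization code -> code_challenge

    def authorize(self, code_challenge):
        code = secrets.token_urlsafe(16)
        self._challenges[code] = code_challenge
        return code

    def token(self, code, code_verifier):
        challenge = self._challenges.get(code)
        if challenge is None or not hmac.compare_digest(challenge, s256(code_verifier)):
            raise PermissionError("invalid_grant")
        del self._challenges[code]  # single use once redeemed
        return secrets.token_urlsafe(32)


server = ToyPkceServer()
verifier = secrets.token_urlsafe(32)         # known only to the legitimate app
code = server.authorize(s256(verifier))      # both apps may see this code

try:
    server.token(code, secrets.token_urlsafe(32))   # malicious app guesses a verifier
    malicious_blocked = False
except PermissionError:
    malicious_blocked = True

access_token = server.token(code, verifier)  # legitimate app succeeds
```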
In the OAuth 2.0 Authorization Code Grant as specified in RFC 6749, the token request requires the client secret according to section 4.1.3; however, the authorization request does not, according to section 4.1.1.
Does anyone know why? It seems using client secret for both authorization and token request makes the process more secure.
They are different because they are two different types of requests. The authorization request in section 4.1.1
GET /authorize?response_type=code&client_id=s6BhdRkqt3&state=xyz
&redirect_uri=https%3A%2F%2Fclient%2Eexample%2Ecom%2Fcb HTTP/1.1
Host: server.example.com
is used to display the actual consent screen to the user.
Once the user has accepted, the authorization server redirects back to the client with the authorization code, which the client then exchanges for an access token:
HTTP/1.1 302 Found
Location: https://client.example.com/cb?code=SplxlOBeZQQYbYS6WxSbIA
&state=xyz
No secret is needed because you are currently in the Authorization Code section of the document.
4.1. Authorization Code Grant
The authorization code grant type is used to obtain both access
tokens and refresh tokens and is optimized for confidential clients.
Since this is a redirection-based flow, the client must be capable of
interacting with the resource owner's user-agent (typically a web
browser) and capable of receiving incoming requests (via redirection)
from the authorization server.
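To make the point about the authorization request concrete, here is a small sketch that builds the request from section 4.1.1 and parses the redirect from section 4.1.2. The function names and the `authorize_endpoint` argument are illustrative assumptions; the parameter values are the ones from the RFC example above.

```python
import urllib.parse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, state):
    # Only public values go into the front-channel request: no client secret.
    query = urllib.parse.urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}"


def parse_authorization_response(redirect_url, expected_state):
    params = urllib.parse.parse_qs(urllib.parse.urlsplit(redirect_url).query)
    if params.get("state") != [expected_state]:
        raise ValueError("state mismatch")
    return params["code"][0]


url = build_authorization_url(
    "https://server.example.com/authorize",
    client_id="s6BhdRkqt3",
    redirect_uri="https://client.example.com/cb",
    state="xyz",
)
code = parse_authorization_response(
    "https://client.example.com/cb?code=SplxlOBeZQQYbYS6WxSbIA&state=xyz",
    expected_state="xyz",
)
```

Everything in `url` passes through the user's browser, which is why a secret placed there would no longer be a secret.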
By contrast, the grant that skips this exchange is referred to as the Implicit flow, as the access token is sent back to the client application directly, without an intermediate authorization code. This makes the whole flow pretty easy, but also less secure. Because the client application, which is typically JavaScript running within a browser, is less trusted, no refresh tokens for long-lived access are returned. Returning an access token to JavaScript clients also means that your browser-based application needs to take special care: think of XSS (Cross-Site Scripting) attacks that could leak the access token to other systems.
Basically, a user implicitly trusts their PC, so there is really no need for the client secret validation step. The client secret is only needed for server-side applications, where the user does not have access to the server, so the server must authenticate itself.
In OAuth 2.0 flows the authorization server sends the authorization code to the redirect endpoint and then the webpage has to hit the server again to get a separate access token to query the protected API with.
Why do there have to be two tokens? Specifically could someone provide example(s) of security attacks/vulnerabilities that emerge without this design.
There is this post Facebook OAuth 2.0 "code" and "token" but it doesn't really fully explain the reasoning behind the design.
One (the authorization code) is exchanged in the front channel, the other (the access token) in the back channel. The end goal is obtaining the access token. Since the front channel is inherently less secure, it makes sense to send a very short-lived, single-use temporary credential (i.e. the authorization code) in the front channel, which the web server can use to obtain the longer-lived, reusable access token in the back channel. That back-channel call also allows the web server (the Client) to authenticate itself to the Authorization Server, increasing the assurance that it is dealing with the right party.
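The "very short-lived" part can be sketched with a code store that attaches an expiry to every code. The 60-second lifetime and all names are assumptions made for illustration (RFC 6749 recommends a maximum lifetime of 10 minutes), and the clock is passed in explicitly to keep the example deterministic.

```python
import secrets

CODE_TTL_SECONDS = 60  # assumed lifetime for this sketch


class ShortLivedCodeStore:
    """Hypothetical store for authorization codes that expire and are single-use."""

    def __init__(self):
        self._expiry = {}  # code -> timestamp after which the code is invalid

    def issue(self, now):
        code = secrets.token_urlsafe(16)
        self._expiry[code] = now + CODE_TTL_SECONDS
        return code

    def redeem(self, code, now):
        # pop() enforces single use; the comparison enforces the short lifetime.
        expires_at = self._expiry.pop(code, None)
        return expires_at is not None and now <= expires_at


store = ShortLivedCodeStore()
fresh = store.issue(now=1000)
stale = store.issue(now=1000)

redeemed_in_time = store.redeem(fresh, now=1010)   # ten seconds later: accepted
redeemed_late = store.redeem(stale, now=2000)      # long after expiry: rejected
```

A code leaked through browser history or a log file is therefore usually dead by the time anyone finds it, whereas a leaked access token stays usable until it expires.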
I'm learning about OAuth 2 from here
I was wondering, in the step "Authorization server redirects user agent to client with authorization code", why doesn't the server just give the access token instead? Why give an authorization code that is then used to get the access token? Why not just give the access token directly? Is it because there is a different access token for each resource, so that you need to go through OAuth again to access a different resource?
The authorization code can pass through unsecured or potentially risky environments, such as a plain HTTP connection (not HTTPS) or a browser, but it is worthless without a client secret. The client can be a backend application. If the OAuth2 server returned a token directly, the token could get compromised.
There is another OAuth2 flow, the Implicit flow, which returns an access token right after the authentication, but it's designed mainly for JavaScript applications or other deployments where it's safe to use it.
If a malicious app gets hold of the client id of your app (which is easily available; for example, one can inspect the source), then it can use that to retrieve the token without the use of the client secret. All the malicious app needs to do is somehow either specify a redirect URI pointing to itself or tap into the registered redirect URI.
That is the reason for breaking the flow as such. Note, when the client secret is not to be used as in SPA (Single Page Apps) or Mobile Apps, then PKCE comes to the rescue.
The authorization flow is broken up so as to keep the resource owner's interaction with the authorization server isolated from the client's interactions with the authorization server. Therefore we need two interactions with the authorization server: one in which the resource owner authenticates with its credentials to the authorization server, and another in which the client sends its client secret to the authorization server.
Please also see PKCE that deals with SPA (SinglePageApp)/Mobile apps.