I am writing some code to get Twitter and Instagram feeds. Before I write any code, I want to get a good understanding of oAuth, because I have this nagging feeling that it is not all that secure and that most times, for instance when accessing public tweets, it is an unnecessary hassle. I started reading the oAuth 2 specification to get a better understanding, and I am still in the middle of it. I have a host of questions.
Let's use Twitter as an example.
A user accesses your site. You redirect them to Twitter for authentication and to obtain the authorization_grant code.
I understand this part is secure because the user authentication and the redirect to your website will happen over ssl. Is it enough for Twitter to support SSL or does your site also have to support SSL for the redirect to be secure? You wouldn't want the authorization code to be transferred insecurely, right?
Now that you have your authorization_grant code, your site will send a request to Twitter to obtain an access token. When making this request your site will send the authorization_grant code, your client id and client secret. Again I guess the communication is secure because this will happen over ssl. But what if the site has included the client id and secret somewhere in its HTML or Javascript, especially if it is a static site with no server side code?
Should the redirect url always be handled by server side code and the server side code should make the request for access token without ever going through HTML or Javascript?
Once you have the access token, you will include it in your request to obtain the user's tweets, to post tweets on their behalf etc. Again if the site in question were to include the access token inside its HTML or JavaScript along with the client id and secret, that would be pretty insecure, right?
It seems all the security of oAuth stems from ssl and the client's ability to keep their client secret secret. Am I right in this conclusion?
Another thing - in the first step of the process, when the client redirects the user to Twitter to authenticate and obtain the authorization_grant code, they could send in their client id and secret and get the access token directly instead of making a second request for it. I think this is what they mean by the implicit method in the specification.
So, why was this extra step of sending a second request to obtain access token added in the specification? Does it increase security?
I am not sure about the Twitter API; I am answering with respect to the Stack Exchange API:
https://api.stackexchange.com/docs/authentication
Again I guess the communication is secure because this will happen
over ssl. But what if the site has included the client id and secret
somewhere in its HTML or Javascript, especially if it is a static site
with no server side code?
client_secret is sent only in the explicit flow. The explicit flow should be used by server-side applications, and care should be taken to keep the client_secret safe.
So, why was this extra step of sending a second request to obtain
access token added in the specification?
Well, the implicit flow is less secure than the explicit flow, since the access token is sent to the user agent. But in the implicit flow the token carries an expiry attribute, so it will expire unless you have specified the scope no_expiry. Also, the server-side flow can be used only by apps that are registered.
It seems all the security of oAuth stems from ssl and the client's
ability to keep their client secret secret. Am I right in this
conclusion?
Again, client_secret is used in the server-side flow. But yes, the client should take care that the access_token is not given out.
Check out this link. It gives an example of a possible vulnerability in OAuth.
I am wondering whether, in the OpenID Connect Auth Code Flow, there is still a need to validate the access_token and id_token, given that they are obtained by my web server rather than by the browser (i.e. using the back channel rather than the front channel).
By "auth code flow" I am referring to the flow where browser only receives an "authorization code" from the authorization server (i.e. no access_token, no id_token), and sends the auth code to my web server. My web server can therefore directly talk to the authorization server, presenting the auth code, and exchange it for the access_token and id_token. It looks like I can simply decode the access_token and id_token to get the information I want (mainly just user id etc.)
My understanding of the need for validating the access_token is that because access_token is not encrypted, and if it is transmitted through an insecure channel, there is a chance that my web server can get a forged token. Validating the token is basically to verify that the token has not been modified.
But what if the access_token is not transmitted on any insecure channel? In the auth code flow, web server directly retrieves the access_token from the auth server, and the token will never be sent to a browser. Do I still need to validate the token? What are the potential risks if I skip the validation in such flows?
You should always validate the tokens and apply the well-known validation patterns for tokens, because otherwise you open up your architecture to various vulnerabilities. For example, you have the man-in-the-middle issue if an attacker is intercepting your "private" communication with the token service.
Also most libraries will do the validation automatically for you, so the validation is not a problem.
When you develop identity systems you should follow the various best current practices, because that is what the users of the system expect from you.
As a client, you use HTTPS to get the public key from the IdentityServer, so you know that you got it from the right server. To add additional security layers, you could also use client-side HTTPS certificates, so that the IdentityServer only issues tokens to clients that authenticate using a certificate.
In this way, a man-in-the-middle attack is very unlikely. However, in data centers in the backend, you sometimes don't use HTTPS everywhere internally. Often TLS is terminated in a proxy; you can read more about that here
When you receive an ID token, you typically create the user session from that. And yes, as the token is sent over the more secure back channel, you could be pretty secure with that. But many attacks today still occur on the inside, and to do a good job according to all best practices, you should validate the tokens: both the signature and the claims inside the token (expiry, issuer, audience...).
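The claim checks mentioned above can be sketched roughly as follows. This is only an illustration using the Python standard library: the function names and the example issuer and audience values are made up for this sketch, and it deliberately does not verify the signature, which in practice is best left to a maintained JWT/OIDC library.

```python
import base64
import json
import time
from typing import List, Optional


def decode_segment(segment: str) -> dict:
    """Base64url-decode one JWT segment into a dict. No key is needed,
    which is why decoding alone proves nothing about authenticity."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def validate_claims(claims: dict, issuer: str, audience: str,
                    now: Optional[float] = None) -> List[str]:
    """Return a list of problems; an empty list means the claims look acceptable.
    Hypothetical helper: signature verification must happen separately."""
    now = time.time() if now is None else now
    problems = []
    if claims.get("iss") != issuer:
        problems.append("unexpected issuer")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        problems.append("unexpected audience")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        problems.append("token expired or missing exp")
    return problems
```

A session would only be created when the signature check succeeds and `validate_claims` returns an empty list.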
For the access token, it is important that the API that receives it validates it according to all best practices, because anyone can send requests with tokens to it.
Also, it's fun to learn :-)
ps, this video is a good starting point
I went through multiple posts saying that the implicit grant is a security risk and that the auth code grant, with an AJAX request to the authorization server after redirecting to the application (without client_secret passed to the auth server), should be used instead.
Now in 2019 there is no CORS issue, as I can allow the app domains on the authorization server.
I have following concerns
If I use implicit grant:
The implicit grant has security issues, as the authorization server redirects to the application server with the token in the URL.
If I set the expiration time to 5 to 10 minutes, the user will be redirected to login after expiration, which is problematic, especially if he is filling in an important form in the application. What should be done in this scenario? Note that there is no refresh token in the implicit grant to obtain a new token, so a refresh token is out of the picture.
If I use Auth code grant:
Suppose I send an AJAX request after getting redirected to my main application site, and get a token in exchange for the code.
The auth code grant uses client_secret, and in a JavaScript app where anyone can see the code, we can't use a secret.
Assume now that I don't use client_secret. There are multiple sites that use the auth server, say site 1, 2, 3. If we don't use a secret, anyone can make a host entry in an nginx server that has my site's domain name but his own IP address. In this case host injection is the issue. How to deal with it?
What approach should be taken here? I am more inclined towards auth_code for the SPA, but the issue is how to deal with client_secret.
Thank you for reading.
There are multiple links that recommend using the auth code grant instead of the implicit grant for SPAs. A few of them:
https://www.oauth.com/oauth2-servers/single-page-apps/
https://medium.com/oauth-2/why-you-should-stop-using-the-oauth-implicit-grant-2436ced1c926
You've already linked references that make it clear that you should NOT use the implicit grant in an SPA. Your confusion seems to come from your assumption that the code grant flow requires use of a client secret, but that is not the case. An SPA, like any browser or device app, is (or should be) a PUBLIC client, and cannot be trusted with a secret. Therefore, it does not use a secret. The client secret is suitable ONLY for use with private (confidential) clients, which is to say, server-side code calling the auth API.
The 2019 recommendation is to use the PKCE variation of the Authorization Code Flow, which does not need a client secret:
https://brockallen.com/2019/01/03/the-state-of-the-implicit-flow-in-oauth2/
There's some write ups and code samples on my blog that might be useful to you - I updated this one on messages recently:
https://authguidance.com/2017/09/26/basicspa-oauthworkflow/
Imagine you're going through a standard OAuth2 process to retrieve an access_token for some third-party API. This is the usual process.
1. User is redirected to http://some-service.com/login
2. User successfully logs in and is redirected to some destination http://some-destination.com. In this step, there's usually a code parameter that comes with the request. So the actual URL looks like http://some-destination.com?code=CODE123
3. I need to use CODE123 to request an access_token that can be used to authorize my future API calls. To do this, there's an endpoint that usually looks like this (I am using Nylas as an example but should be generic enough):
As you can see, it requires me to POST the code from above (CODE123) along with client_id and client_secret like this: http://some-service.com/oauth/token?code=CODE123&client_secret=SECRET&client_id=ID. As a response, I get an access_token that looks like TOKEN123 and I can use this to make API calls.
QUESTION
Until step 2, everything happens in the client side. But in step 3, I need to have client_id and client_secret. I don't think it's a good idea to store these values in the client side. Does that mean I need to have a backend server that has these two values, and my backend should convert CODE123 to TOKEN123 and hand it to the client side?
As you probably know, the question describes the most common (and usually, the more secure) OAuth "Authorization Code" flow. To be clear, here's an approximation of the steps in this flow:
1. User indicates that they wish to authorize resources for our application (for example, by clicking on a button)
2. The application redirects the user to the third-party login page, where the user logs in and selects which resources to grant access to
3. The third-party service redirects the user back to our application with an authorization code
4. Our application uses this code, along with its client ID and secret, to obtain an access token that enables the application to make requests on behalf of the user only for the resources that the user allowed
Until step 2, everything happens in the client side. But in step 3, I need to have client_id and client_secret. I don't think it's a good idea to store these values in the client side. Does that mean I need to have a backend server that has these two values[?]
You're correct, it's certainly not a good idea to store these values in the client-side application. These values, especially the client secret, must be placed on a server to protect the application's data. The user, and therefore the client application, should never have access to these values.
The server uses its client ID and secret, along with the authorization code, to request an access token that it uses for API calls. The server may store the token it receives, along with an optional refresh token that it can use in the future to obtain a new access token without needing the user to explicitly authorize access again.
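A minimal server-side sketch of that exchange is shown below. It only builds the request and does not send it. The endpoint URL and parameter values are placeholders, and the parameters are placed in a form-encoded POST body, as RFC 6749 describes, rather than in the query string; a particular provider may expect something slightly different.

```python
import urllib.parse
import urllib.request


def build_token_request(token_url: str, code: str, client_id: str,
                        client_secret: str, redirect_uri: str) -> urllib.request.Request:
    """Build (but do not send) the code-for-token request made by the backend."""
    body = urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,  # lives only on the server
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` over HTTPS and parsing the JSON response.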
...and my backend should convert CODE123 to TOKEN123 and hand it to the client side?
At the very least, our server should handle the authorization flow to request an access token, and then pass only that token back to the client (over a secure connection).
However, at this point, the client-side application (and the user of that client) is responsible for the security of the access token. Depending on the security requirements of our application, we may want to add a layer to protect this access token from the client as well.
After the server-side application fetches the access token from the third-party service, if we pass the access token back to the client, malware running on the client machine, or an unauthorized person, could potentially obtain the access token from the client, which an attacker could then use to retrieve or manipulate the user's third-party resources through privileges granted to our application. For many OAuth services, an access token is not associated with a client. This means that anyone with a valid token can use the token to interact with the service, and illustrates why our application should only request the minimum scope of access needed when asking for authorization from the user.
To make API calls on behalf of a user more securely, the client-side application could send requests to our server, which, in turn, uses the access token that it obtained to interact with the third-party API. With this setup, the client does not need to know the value of the access token.
To improve performance, we likely want to cache the access token on the server for subsequent API calls for the duration of its lifetime. We may also want to encrypt the tokens if we store them in the application's database—just like we would passwords—so the tokens cannot be easily used in the event of a data breach.
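One way to picture this setup is a small server-side store that hands the browser an opaque session ID while keeping the real access token on the server. The class below is a hypothetical in-memory sketch (the `TokenStore` name and its methods are invented here); a real application would use a persistent store and would likely encrypt the tokens at rest.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class TokenStore:
    """Keeps access tokens server-side, keyed by an opaque session ID."""

    def __init__(self) -> None:
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def save(self, access_token: str, expires_in: float,
             now: Optional[float] = None) -> str:
        """Store the token and return a random session ID for the client."""
        now = time.time() if now is None else now
        session_id = secrets.token_urlsafe(32)
        self._tokens[session_id] = (access_token, now + expires_in)
        return session_id

    def lookup(self, session_id: str, now: Optional[float] = None) -> Optional[str]:
        """Return the cached token, or None if unknown or expired."""
        now = time.time() if now is None else now
        entry = self._tokens.get(session_id)
        if entry is None:
            return None
        token, expires_at = entry
        if expires_at <= now:
            del self._tokens[session_id]  # drop expired tokens eagerly
            return None
        return token
```

The client only ever sees the session ID; the server looks up the token when it forwards a request to the third-party API.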
I'm building a web app that uses the Oauth2.0 protocol. I have registered my app with the authorization server and received my client id and client secret.
I'm now working on the authorization part, specifically using the Authorization Code grant type. In that process I'm sending the user to the authorize endpoint with the following query parameters: code, client_id, redirect_uri, scope and state (omitting the client_secret).
The problem that I'm dealing with is that I'm getting an error back saying I need to provide the client_secret as well.
I was under the impression the client_secret is not needed at this part and shouldn't be sent in this request but rather when the client sends the authorization code (along with id & secret) to obtain the access token.
So my question is, Is it wrong (against oauth 2 protocol) that the authorization server requires the client secret to be sent in the request for the authorization code?
I am not 100% sure of this, but I did some research myself, and what I found is that failing to keep the "client secret" a secret is not a real problem. Someone malicious is still prevented from getting through the authorization process by these facts:
1. The client needs to get the authorization code directly from the user, not from the service
Even if user indicates the service that he/she trusts the client, the
client cannot get authorization code from the service just by showing
client id and client secret. Instead, the client has to get the
authorization code directly from the user. (This is usually done by
URL redirection, which I will talk about later.) So, for the malicious
client, it is not enough to know client id/secret trusted by the user.
It has to somehow involve or spoof user to give it the authorization
code, which should be harder than just knowing client id/secret.
2. Redirect URL is registered with client id/secret
Let’s assume that the malicious client somehow managed to involve the
user and make her/him click "Authorize this app" button on the service
page. This will trigger the URL redirect response from the service to
user’s browser with the authorization code with it. Then the
authorization code will be sent from user’s browser to the redirect
URL, and the client is supposed to be listening at the redirect URL to
receive the authorization code. (The redirect URL can be localhost
too, and I figured that this is a typical way that a “public client”
receives authorization code.) Since this redirect URL is registered at
the service with the client id/secret, the malicious client does not
have a way to control where the authorization code is given to. This
means the malicious client with your client id/secret has another
obstacle to obtain the user’s authorization code.
// copy paste of hideaki answer
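The second point can be pictured as a check the authorization server performs before redirecting. The sketch below is an assumption about how such a check might look, using exact string matching against a registry; the client ID, URLs, and the `redirect_allowed` helper are invented for illustration.

```python
from typing import Dict, Set

# Hypothetical registry: redirect URIs each client registered in advance.
REGISTERED_REDIRECTS: Dict[str, Set[str]] = {
    "client-abc": {"https://app.example/callback"},
}


def redirect_allowed(client_id: str, redirect_uri: str,
                     registry: Dict[str, Set[str]] = REGISTERED_REDIRECTS) -> bool:
    """Allow the redirect only if the URI exactly matches a registered one."""
    return redirect_uri in registry.get(client_id, set())
```

With a check like this, someone who knows the client ID still cannot have the authorization code delivered to a URL they control.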
Concluding
OAuth2 specifies that you need to include your secret in the request if your application is a server-side app (as opposed to a single-page or mobile application) that does not make its source code available. However, if you can't control where your code runs, as in a native mobile application, you should look for another solution.
OAuth 2.0 has multiple workflows. I have a few questions regarding two of them.
Authorization code flow - User logs in from client app, authorization server returns an authorization code to the app. The app then exchanges the authorization code for access token.
Implicit grant flow - User logs in from client app, authorization server issues an access token to the client app directly.
What is the difference between the two approaches in terms of security? Which one is more secure and why?
I don't see a reason why an extra step (exchange authorization code for token) is added in one work flow when the server can directly issue an Access token.
Different websites say that Authorization code flow is used when client app can keep the credentials secure. Why?
The access_token is what you need to call a protected resource (an API). In the Authorization Code flow there are 2 steps to get it:
1. The user must authenticate, which returns a code to the API consumer (called the "Client").
2. The "client" of the API (usually your web server) exchanges the code obtained in #1 for an access_token, authenticating itself with a client_id and client_secret.
3. It then can call the API with the access_token.
So, there's a double check: the user that owns the resources surfaced through an API, and the client using the API (e.g. a web app). Both are validated for access to be granted. Notice the "authorization" nature of OAuth here: the user grants access to his resource (through the code returned after authentication) to an app, the app gets an access_token, and calls the API on the user's behalf.
In the implicit flow, step 2 is omitted. So after user authentication, an access_token is returned directly, which you can use to access the resource. The API doesn't know who is calling it. Anyone with the access_token can, whereas in the previous example only the web app would (its internals are not normally accessible to anyone).
The implicit flow is usually used in scenarios where storing the client id and client secret is not recommended (a device for example, although many do it anyway). That's what the disclaimer means. People have access to the client code and therefore could get the credentials and pretend to be resource clients. In the implicit flow all data is volatile and there's nothing stored in the app.
I'll add something here which I don't think is made clear in the above answers:
The Authorization-Code-Flow allows for the final access-token to never reach and never be stored on the machine with the browser/app. The temporary authorization-code is given to the machine with the browser/app, which is then sent to a server. The server can then exchange it with a full access token and have access to APIs etc. The user with the browser gets access to the API only through the server with the token.
The implicit flow can only involve two parties, and the final access token is stored on the client with the browser/app. If this browser/app is compromised, so is its access token, which could be dangerous.
tl;dr don't use the implicit flow if you don't trust the user's machine to hold tokens but you do trust your own servers.
The difference between both is that:
In the implicit flow, the token is returned directly in the redirect URL after a "#" sign. It is used mostly by JavaScript clients or mobile applications that do not have a server side of their own, and in some implementations the client does not need to provide its secret.
In the authorization code flow, the code is returned after a "?" so that the server side can read it. The server side then has to provide the client secret to the token URL to get the token as a JSON object from the authorization server. It is used when you have an application server that can handle this and store the user's token with his/her profile on its own system, and it is mostly used for common mobile applications.
So it depends on the nature of your client application. The authorization code flow is the more secure one, because it requires the client's secret and the token can be sent between the authorization server and the client application over a very secure connection. The authorization provider can also restrict some clients to use only the authorization code flow and disallow implicit.
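The "?" versus "#" distinction can be seen by splitting the two kinds of redirect URL. This is a small sketch with the standard library; the URLs and helper names are examples only. Browsers do not include the fragment in the HTTP request, so only script running in the page can read a token delivered that way.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_code(redirect_url: str) -> Optional[str]:
    """Authorization code flow: the code arrives in the query string,
    which is sent to the server as part of the request."""
    return parse_qs(urlsplit(redirect_url).query).get("code", [None])[0]


def extract_fragment_token(redirect_url: str) -> Optional[str]:
    """Implicit flow: the token arrives in the fragment, which stays
    in the browser and never reaches the server."""
    return parse_qs(urlsplit(redirect_url).fragment).get("access_token", [None])[0]
```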
Which one is more secure and why?
Both of them are secure; it depends on the environment you are using them in.
I don't see a reason why an extra step (exchange authorization code
for token) is added in one work flow when the server can directly
issue an Access token.
It is simple: your client is not secure. Let's look at it in detail.
Suppose you are developing an application against the Instagram API, so you register your app with Instagram and define which APIs you need. Instagram will provide you with a client_id and client_secret.
On your web site you set up a link which says "Come and Use My Application". When this is clicked, your web application should make two calls to the Instagram API.
First, send a request to the Instagram authentication server with the parameters below.
1. `response_type` with the value `code`
2. `client_id` that you got from `Instagram`
3. `redirect_uri` this is a URL on your server which makes the second call
4. `scope` a space delimited list of scopes
5. `state` with a CSRF token.
You don't send client_secret, because you cannot trust the client (the user and/or his browser, which is trying to use your application). The client can see the URL or JavaScript and find your client_secret easily. This is why you need another step.
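The first call described above amounts to building a URL and redirecting the browser to it. The sketch below assumes a hypothetical authorize endpoint and client values; note that client_secret is not among the parameters.

```python
import secrets
from typing import List, Tuple
from urllib.parse import urlencode


def build_authorization_url(authorize_endpoint: str, client_id: str,
                            redirect_uri: str, scopes: List[str]) -> Tuple[str, str]:
    """Return the URL to redirect the user to, plus the state value
    that should be stored in the user's session for the later check."""
    state = secrets.token_urlsafe(16)  # random CSRF token
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # space-delimited list of scopes
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state
```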
You receive a code and state. The code here is temporary and is not saved anywhere.
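Before using the code, the server would normally confirm that the returned state equals the one it generated for this user's session; otherwise the callback may have been forged. A minimal sketch of that comparison follows, where the function name is illustrative and where the expected value is stored is left as an assumption.

```python
import hmac
from typing import Optional


def state_matches(expected_state: Optional[str], returned_state: Optional[str]) -> bool:
    """Compare the stored and returned state values in constant time.
    A missing value on either side is treated as a failure."""
    if not expected_state or not returned_state:
        return False
    return hmac.compare_digest(expected_state.encode("utf-8"),
                               returned_state.encode("utf-8"))
```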
Then you make a second call to Instagram API (from your server)
1. `grant_type` with the value of `authorization_code`
2. `client_id` with the client identifier
3. `client_secret` with the client secret
4. `redirect_uri` with the same redirect URI the user was redirected back to
5. `code` which we have already received.
As the call is made from our server, we can safely use client_secret (which shows who we are) together with the code, which shows that the user has granted our client_id permission to use the resource.
In response we will receive the access_token.
The implicit grant is similar to the authorization code grant with two distinct differences.
It is intended to be used for user-agent-based clients (e.g. single page web apps) that can’t keep a client secret because all of the application code and storage is easily accessible.
Secondly instead of the authorization server returning an authorization code which is exchanged for an access token, the authorization server returns an access token.
Please find details here
http://oauth2.thephpleague.com/authorization-server/which-grant/
Let me summarize the points that I learned from above answers and add some of my own understandings.
Authorization Code Flow!!!
If you have a web application server that act as OAuth client
If you want to have long lived access
If you want to have offline access to data
When you are accountable for the API calls that your app makes
If you do not want to leak your OAuth token
If you don't want your application to run through the authorization flow every time it needs access to data. NOTE: The Implicit Grant flow does not support refresh tokens, so if the authorization server expires access tokens regularly, your application will need to run through the authorization flow whenever it needs access.
Implicit Grant Flow!!!
When you don't have Web Application Server to act as OAuth Client
If you don't need long lived access i.e only temporary access to data is required.
If you trust the browser where your app runs and there is limited concern that the access token will leak to untrusted users.
Implicit grant should not be used anymore, see the IETF current best practices for details. https://datatracker.ietf.org/doc/html/draft-ietf-oauth-security-topics-18#section-2.1.2
As an alternative use a flow with response type code; for clients without possibility to securely store client credentials the authorization code with PKCE flow should be your choice.
From a practical perspective (as I understand it), the main reasons for having the authorization code flow are:
Support for refresh tokens (long-term access by apps on behalf of the user), which is not supported in implicit. Refer to: https://www.rfc-editor.org/rfc/rfc6749#section-4.2
Support for a consent page, which is a place where the resource owner can control what access to provide (the kind of permissions/authorization page that you see with Google). The same is not there in implicit. See section https://www.rfc-editor.org/rfc/rfc6749#section-4.1 , point (B):
"The authorization server authenticates the resource owner (via the user-agent) and establishes whether the resource owner grants or denies the client's access request"
Apart from that, using refresh tokens, apps can get long-term access to user data.
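To make the refresh-token point concrete, a server might decide shortly before expiry to ask for a new access token and then build a refresh_token grant request body, as in the sketch below. The helper names and the 60-second safety margin are assumptions for illustration, and some providers authenticate the client differently.

```python
from urllib.parse import urlencode


def needs_refresh(expires_at: float, now: float, skew_seconds: float = 60) -> bool:
    """True once the token is within skew_seconds of expiring."""
    return now >= expires_at - skew_seconds


def build_refresh_body(refresh_token: str, client_id: str, client_secret: str) -> str:
    """Form-encoded body for a refresh_token grant, sent from the server."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })
```

No user interaction is involved in this exchange, which is what allows long-term access without repeating the authorization step.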
There seem to be two key points, not discussed so far, which explain why the detour in the Authorization Code Grant Type adds security.
Short story: The Authorization Code Grant Type keeps sensitive information from the browser history, and the transmission of the token depends only on the HTTPS protection of the authorization server.
Longer version:
In the following, I'll stick with the OAuth 2 terminology defined in the RFC (it's a quick read): resource server, client, authorization server, resource owner.
Imagine you want some third-party app (= client) to access certain data of your Google account (= resource server). Let's just assume Google uses OAuth 2. You are the resource owner for the Google account, but right now you operate the third-party app.
First, the client opens a browser to send you to the secure URL of the Google authorization server. Then you approve the request for access, and the authorization server sends you back to the client's previously-given redirect URL, with the authorization code in the query string. Now for the two key points:
The URL of this redirect ends up in the browser history. So we don't want a long lived, directly usable access token here. The short lived authorization code is less dangerous in the history. Note that the Implicit Grant type does put the token in the history.
The security of this redirect depends on the HTTPS certificate of the client, not on Google's certificate. So the client's transmission security becomes an extra attack vector. (For this to be unavoidable, the client has to be non-JavaScript; otherwise the authorization code could be transmitted in the URL fragment, which is not sent over the network. This may be the reason why the Implicit Grant Type, which does use a URL fragment, used to be recommended for JavaScript clients, even though that is no longer so.)
With the Authorization Code Grant Type, the token is finally obtained by a call from the client to the authorization server, where transmission security only depends on the authorization server, not on the client.