I think the best way to truly understand when to use a tool is to understand why it was made in the first place. To do that, we need to go back in time. It is 2007 and you are presented with this form:
As you can see, in order to send an invite to your contacts, Facebook is asking you to enter your Gmail address and password. They have this nice message explaining that they won’t store the password or do anything without your permission. Still, this is a really bad thing to do. If we agree to this, we are giving full control of our Gmail account to Facebook – but the only thing Facebook needs is to see our contact list.
This was very common back then since there was no other way. Many websites asked for your password from another service in order to integrate with it. Facebook did ask for permission to fetch your contacts, but nothing prevented them from sending an email from your account, deleting your emails, accessing your documents, etc. In the eyes of Gmail, Facebook was you.
So, the need was born for a way to authorize one service to consume another service without a password, and to allow access to only specific parts of that service. The practical problem sounded something like this:
“I want to give permission to Facebook to read my Gmail contacts and nothing else. And I don’t want to give my password to anyone but Gmail.”
This is the problem the OAuth protocol was trying to solve.
We won’t discuss OAuth in detail since OAuth2 is now the preferred solution.
OAuth 1.0 was largely based on two existing proprietary protocols: Flickr’s authorization API and Google’s AuthSub. The work that became OAuth 1.0 was the best solution based on actual implementation experience at the time, but it was hard to implement and had some design flaws.
OAuth 2.0 represents years of discussions between a wide range of companies and individuals including Yahoo!, Facebook, Salesforce, Microsoft, Twitter, Deutsche Telekom, Intuit, Mozilla, and Google.
OAuth 2.0 is a complete rewrite of OAuth 1.0 from the ground up. OAuth 2.0 is not backward compatible with OAuth 1.0 or 1.1 and should be thought of as a completely new protocol.
One of the hardest things about OAuth2 is the terminology. Everything has 3 or 4 terms that are used interchangeably. Unfortunately, the authors of the specification used terms that are not commonly used by developers. This led to a large number of alternative terms on websites that tried to make OAuth2 easier to understand.
OAuth defines four roles:
Resource owner – The user who gives permission to the service to act on their behalf. In our case, it is the user who wants to allow Facebook to read contacts from Gmail. Note that OAuth2 also supports machine-to-machine authorization. In that case the resource owner is not a person.
Resource server – The API, in our case Gmail. Usually, it doesn’t know anything about the client.
Client – The application that wants to access the resource owner’s data, in our case Facebook. It needs the user’s permission before it can call the API.
Authorization server – The authorization server is what the user interacts with when an application is requesting access to their account. This is the server that displays the OAuth prompt, and where the user approves or denies the access request. The authorization server is also responsible for issuing access tokens after the user authorizes the application. It can be the same server as the API.
and 3 types of tokens:
Authorization code is an intermediate token used in the server-side app flow. An authorization code is returned to the client after the authorization step, and then the client exchanges it for an access token.
Access token is the string used when making authenticated requests to the API. The string itself has no meaning to the application using it but represents that the user has authorized a third-party application to access their account. The token has a corresponding duration of access, scope, and potentially other information the server needs.
Refresh token is used to get a new access token when an access token expires. Usage of refresh tokens is optional.
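To make the access token less abstract, here is a minimal sketch of how a client might attach it to an API call. It assumes the API accepts the token as a Bearer token in the Authorization header, which is the common convention; the URL and the function name `build_api_request` are hypothetical and only for illustration.

```python
from urllib.request import Request


def build_api_request(url: str, access_token: str) -> Request:
    """Build (but do not send) a GET request that carries the access token."""
    # The client treats the token as an opaque string and simply forwards it.
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


# Hypothetical resource server URL and token value.
request = build_api_request("https://api.example.com/contacts", "example-access-token")
```

Passing `request` to `urllib.request.urlopen` would actually send it. The sketch stops short of that so it runs without a real server.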
There are 6 Grant types (also known as “Grant flows” or “Authorization flows”):
Authorization Code – is used by confidential and public clients to exchange an authorization code for an access token. After the user returns to the client via the redirect URL, the application will get the authorization code from the URL and use it to request an access token. It is recommended that all clients use the PKCE (Proof Key for Code Exchange) extension with this flow as well to provide better security.
Client Credentials – is used by clients to obtain an access token outside of the context of a user. This is typically used by clients to access resources about themselves rather than to access a user’s resources.
Device Code – is used by browserless or input-constrained devices to exchange a previously obtained device code for an access token.
Refresh Token – is used by clients to exchange a refresh token for an access token when the access token has expired. This allows clients to continue to have a valid access token without further interaction with the user.
Password Grant (Legacy) – is a way to exchange a user’s credentials for an access token. Because the client application must collect the user’s password and send it to the authorization server, this grant is no longer recommended. This flow provides no mechanism for things like multifactor authentication or delegated accounts, so it is quite limiting in practice.
I will cover step by step Authorization Code and Client Credentials flows as they are the most commonly used.
Scope is a mechanism in OAuth 2.0 to limit an application’s access to a user’s account. An application can request one or more scopes. This information is then presented to the user on the consent screen, and the access token issued to the application is limited to the scopes granted.
OAuth does not define any values for scopes, since it is highly dependent on the service’s internal architecture and needs.
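Since the user may grant fewer scopes than the application asked for, a client usually compares the two. The sketch below assumes scopes travel as a single space-separated string, which is how OAuth 2.0 transmits them; the helper names and the scope values are made up for illustration.

```python
def parse_scopes(scope: str) -> set[str]:
    """Split a space-separated scope string into a set of scope names."""
    return set(scope.split())


def missing_scopes(requested: str, granted: str) -> set[str]:
    """Return the scopes the client asked for but did not receive."""
    return parse_scopes(requested) - parse_scopes(granted)


# Hypothetical example: the user approved "photo" but not "offline_access".
not_granted = missing_scopes("photo offline_access", "photo")
```

An empty result means the token covers everything the application requested.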
Authorization code with PKCE example
This flow/type is considered best practice when using Single Page Apps (SPA) or Mobile Apps.
Here is the step by step example of this flow:
The Client application (SPA or mobile app) generates a secret code verifier and the challenge.
The code verifier is a cryptographically random string using the characters A-Z, a-z, 0-9, and the punctuation characters -._~ (hyphen, period, underscore, and tilde), between 43 and 128 characters long.
Once the client has generated the code verifier, it uses that to create the code challenge. For devices that can perform a SHA256 hash, the code challenge is a BASE64-URL-encoded string of the SHA256 hash of the code verifier. Otherwise, the same verifier string is used as the challenge.
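The verifier and challenge described above can be sketched in a few lines. This is only an illustration using the Python standard library; the function names are my own, and it assumes the client can compute SHA256 (the S256 method).

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes: int = 32) -> str:
    """Return a cryptographically random verifier (43 characters for 32 bytes)."""
    # token_urlsafe emits only A-Z, a-z, 0-9, '-' and '_',
    # which is a subset of the characters allowed for a verifier.
    verifier = secrets.token_urlsafe(num_bytes)
    if not 43 <= len(verifier) <= 128:
        raise ValueError("verifier must be between 43 and 128 characters long")
    return verifier


def make_code_challenge(verifier: str) -> str:
    """Return the BASE64-URL-encoded SHA256 hash of the verifier, without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
```

The client keeps `verifier` to itself and sends only `challenge` in the authorization request.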
In this URL you can see that we request an authorization code as the response (response_type=code), we want to receive it at the specified URL (redirect_uri), and we want permissions for photos and offline access (scope=photo+offline_access).
The Client application opens the URL and the user is presented with a form to enter their username and password. Note that this form is opened on the domain of the authorization server and you need a web browser to handle it.
If the username and password are correct, the user is presented with a consent screen:
The user is redirected back to the client application with a few additional query parameters in the URL:
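Here is a rough sketch of how a client could assemble such an authorization URL. The client_id and redirect_uri are the ones used in the token request later in this example; the `/authorize` path, the function name, and the placeholder challenge and state values are assumptions of mine.

```python
from urllib.parse import urlencode

# Assumed endpoint path; check your authorization server's documentation.
AUTHORIZE_ENDPOINT = "https://authorization-server.com/authorize"


def build_authorization_url(client_id: str, redirect_uri: str, scope: str,
                            code_challenge: str, state: str) -> str:
    """Return the URL the user's browser should be sent to."""
    params = {
        "response_type": "code",          # we want an authorization code back
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,                   # space-separated, encoded as '+'
        "state": state,                   # random value, checked on return
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",  # challenge is a SHA256 hash
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


url = build_authorization_url(
    client_id="Sw3ocQf1A0ePo-GhCxBJFTne",
    redirect_uri="https://www.oauth.com/playground/authorization-code-with-pkce.html",
    scope="photo offline_access",
    code_challenge="placeholder-challenge",  # hypothetical value
    state="placeholder-state",               # hypothetical value
)
```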
The Client application must validate that the returned state value is the same as the one that was sent in the Authorization URL.
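The state check can be sketched as follows. I am assuming the redirect carries the values in `state` and `code` query parameters (with `error` on failure), and the function name and sample values are hypothetical.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def extract_authorization_code(redirect_url: str, expected_state: str) -> str:
    """Validate the returned state and return the authorization code."""
    query = parse_qs(urlsplit(redirect_url).query)

    returned_state = query.get("state", [""])[0]
    # Constant-time comparison of the state we sent with the one we got back.
    if not hmac.compare_digest(returned_state.encode("utf-8"),
                               expected_state.encode("utf-8")):
        raise ValueError("state mismatch: the response does not belong to our request")

    if "error" in query:
        raise ValueError(f"authorization failed: {query['error'][0]}")

    codes = query.get("code")
    if not codes:
        raise ValueError("no authorization code in the redirect URL")
    return codes[0]
```

If the state does not match, the client should discard the response instead of exchanging the code.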
The Client application exchanges the authorization code for an access token.
The client will build a POST request to the token endpoint with the following parameters:
POST https://authorization-server.com/token

grant_type=authorization_code
&client_id=Sw3ocQf1A0ePo-GhCxBJFTne
&redirect_uri=https://www.oauth.com/playground/authorization-code-with-pkce.html
&code=ZgoLC0ZVh0lebknnC4uH3Z8yDHlbB3dbI7dxx187dN_Aw1I0
&code_verifier=G4vxIPr5wAr5b1DCF_p3ZOOyWxstFzAY9Q0UoKUSJru4MbOV
The code_verifier must be sent with the token request. The authorization server will check whether the verifier matches the challenge that was used in the authorization request. This ensures that a malicious party that intercepted the authorization code will not be able to use it. The authorization code can be intercepted easily since it is returned to the client in the URL. If the client is confidential (a web server), PKCE is not strictly required – we can use the regular authorization code flow and send the client secret instead of the code verifier in this step, although PKCE is still recommended for all clients.
The authorization server responds from the token endpoint. The response includes the access token and the refresh token.
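To show what that check amounts to on the authorization server, here is a small sketch. It is not taken from any particular server implementation; the function name is hypothetical, and it assumes the server stored the challenge and its method ("S256" or "plain") alongside the authorization code.

```python
import base64
import hashlib
import hmac


def verifier_matches_challenge(code_verifier: str, code_challenge: str,
                               method: str = "S256") -> bool:
    """Return True if the verifier from the token request fits the stored challenge."""
    if method == "S256":
        # Recompute BASE64-URL(SHA256(verifier)) without padding.
        digest = hashlib.sha256(code_verifier.encode("utf-8")).digest()
        expected = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    elif method == "plain":
        # The verifier itself was used as the challenge.
        expected = code_verifier
    else:
        raise ValueError(f"unsupported code challenge method: {method}")
    return hmac.compare_digest(expected.encode("utf-8"), code_challenge.encode("utf-8"))
```

An attacker who only saw the authorization code and the hashed challenge cannot produce a verifier that passes this check.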
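On the client side, handling that response could look roughly like this. I am assuming the usual JSON body with `access_token`, `token_type`, `expires_in`, `refresh_token` and `scope` fields; the `TokenSet` class, the function name, and the sample values are my own.

```python
import json
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenSet:
    access_token: str
    token_type: str
    expires_at: Optional[float]   # absolute UNIX time, None if not provided
    refresh_token: Optional[str]  # refresh tokens are optional
    scope: Optional[str]


def parse_token_response(body: str, now: Optional[float] = None) -> TokenSet:
    """Turn the token endpoint's JSON body into a TokenSet."""
    data = json.loads(body)
    if "error" in data:
        raise ValueError(f"token request failed: {data['error']}")
    now = time.time() if now is None else now
    expires_in = data.get("expires_in")
    return TokenSet(
        access_token=data["access_token"],
        token_type=data["token_type"],
        expires_at=now + expires_in if expires_in is not None else None,
        refresh_token=data.get("refresh_token"),
        scope=data.get("scope"),
    )


# Hypothetical response body.
sample = '{"access_token": "abc", "token_type": "Bearer", "expires_in": 3600, "refresh_token": "def"}'
tokens = parse_token_response(sample)
```

Storing an absolute expiry time makes it easy to decide later when to use the refresh token.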
A flow that relies on a client secret can be used only on confidential clients that can protect that secret.
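The Client Credentials flow mentioned earlier is one that depends on a client secret. Below is a minimal sketch of its token request, assuming the server accepts the client id and secret via HTTP Basic authentication and that neither contains characters needing extra escaping. The function name and all values are hypothetical.

```python
import base64
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_client_credentials_request(token_endpoint: str, client_id: str,
                                     client_secret: str,
                                     scope: Optional[str] = None) -> Request:
    """Build (but do not send) a client credentials token request."""
    # Assumption: the server expects "client_id:client_secret" as Basic credentials.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")

    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope

    return Request(
        token_endpoint,
        data=urlencode(form).encode("ascii"),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


request = build_client_credentials_request(
    "https://authorization-server.com/token", "example-client", "example-secret", "read")
```

There is no user, browser redirect, or consent screen involved here, so the secret is the only thing standing between a caller and a token.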
Authentication or Authorization
Authentication is the process of verifying who a user is, while authorization is the process of verifying what they have access to. Everything we covered so far was about authorization. Developers liked OAuth2 and tried to use it for something it was not designed for – authentication.
For example, if we had 2 services that could provide us with user information (such as username, first name, email, etc.), we could use those services to register a new user on our system.
This worked, but it was not perfect. The first service could require us to have a UserInfo scope while the second could have a different name for this scope, such as UserData. Also, we could get FirstName / LastName from the first service and GivenName / FamilyName from the second. To solve this issue, the OpenID Connect protocol was created.
OpenID Connect (OIDC) is an authentication protocol based on the OAuth 2.0 specification. It is just an extension of everything that we mentioned so far. It adds a new type of token (an ID token in JWT format) that you request just like any other OAuth2 token: response_type=id_token. User data is standardized and easier to consume.
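To give a feel for what an ID token contains, here is a small sketch that reads the payload of a JWT. It deliberately skips signature verification, so treat it as a way to inspect a token and not as a way to trust one; a real client must validate the signature and claims, typically with a JWT library. The function name and the sample claims are made up for illustration.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a signed JWT has three dot-separated parts")
    payload = parts[1]
    # JWT segments drop base64 padding, so restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(raw: bytes) -> str:
    """Helper for building the demo token below."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Build a fake, unsigned-looking token just to demonstrate decoding.
claims = {"sub": "user-123", "given_name": "Ana", "family_name": "Example"}
demo_token = ".".join([
    _b64url(b'{"alg":"none"}'),
    _b64url(json.dumps(claims).encode("utf-8")),
    "signature-placeholder",
])
decoded = decode_jwt_payload(demo_token)
```

Because the claim names are standardized, the client reads `given_name` and `family_name` the same way regardless of which provider issued the token.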
OIDC is out of the scope of this article and I might cover it in detail if there is an interest.