Avoiding false positives in OAuth 2.0 refresh token theft detection

Dan Mercer

This article explores a specific edge case that can happen when OAuth 2.0 authorization servers use rotating refresh tokens to detect refresh token theft. That’s a mouthful, so let’s explore those ideas one by one.

What is OAuth 2.0?

OAuth 2.0 is a framework for authorization on the web, where a user can give one service (known as the client) access to data stored in another service (known as a resource server). The framework explains how to do a three-way “handshake” of sorts, where the user grants access via an authorization server and the client obtains an access token. The client can then use that access token to make authorized requests to the resource server. (A deeper explainer of OAuth 2.0 is beyond the scope of this article, but Aaron Parecki has written a really good explanation.)

What are refresh tokens?

Because access tokens can be used to access protected resources, they are usually short-lived (think lifetimes measured in hours). Refresh tokens are a way for an app to get a new access token without re-prompting the user to grant access. The client sends the refresh token to the authorization server in exchange for a new access token. This is often referred to as “offline access” because it allows the app to continue acting on the user’s behalf even when the user is not present.
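As a rough sketch of what that exchange looks like, the snippet below builds the form-encoded refresh request defined in RFC 6749 (section 6). The endpoint URL and the `build_refresh_request` helper are made up for illustration, and client authentication is left out, so treat it as a shape to adapt rather than a working client.

```python
from urllib.parse import urlencode


def build_refresh_request(refresh_token: str) -> dict:
    """Describe the HTTP request a client sends to refresh its tokens."""
    # grant_type and refresh_token are the two required parameters;
    # client authentication (omitted here) depends on the client type.
    body = urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    })
    return {
        "method": "POST",
        "url": "https://auth.example.com/oauth2/token",  # hypothetical endpoint
        "headers": {"Content-Type": "application/x-www-form-urlencoded"},
        "body": body,
    }


request = build_refresh_request("rt-123")
print(request["body"])  # grant_type=refresh_token&refresh_token=rt-123
```

The response to a successful request carries a new access token and, when rotation is in use, a new refresh token as well.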

Why rotate refresh tokens?

If refresh tokens never expire, then a malicious actor with a stolen refresh token can easily get persistent access to the token’s resources. But if refresh tokens do expire, then apps that should have persistent access to certain resources will need a way to do that. Enter: token rotation. Each time the app uses a refresh token, the authorization server issues a new access token and a new refresh token (with a new expiration time). The authorization server then invalidates the refresh token that was just used, since it’s not needed anymore. This has a few benefits:

It allows apps to have persistent access without relying on non-expiring tokens. Yay!
It also allows the authorization server to invalidate refresh tokens more often — the old refresh token is invalidated each time a new refresh token is issued, which happens frequently since access tokens have short lifetimes. This reduces the window of time in which a stolen refresh token is useful.
Lastly, using a clever technique, the authorization server can detect stolen refresh tokens to further mitigate abuse. Let’s explore that next.

Detecting stolen refresh tokens 🕵

As described above, when refresh token rotation is used, each refresh token should only be used once. Because of that, if the authorization server receives a second “refresh” request with a token that has already been used, it can assume that one of the two requests came from a malicious actor with a stolen token. There’s no way to tell which request was the illegitimate one, so the server invalidates both requesters’ tokens. In other words, it invalidates the whole “grant”, requiring the user to reauthorize the client. This stops the attack by invalidating any tokens the malicious actor had stolen from that grant. (See Section 4.14.2 of the current “OAuth 2.0 Security Best Current Practice” document for more about this technique.)
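To make rotation and reuse detection concrete, here is a minimal in-memory sketch. The `AuthServer` class, its method names, and the grant IDs are all hypothetical, and a real authorization server would also persist state, expire tokens, and revoke access tokens.

```python
import secrets


class InvalidGrant(Exception):
    """Raised when a refresh token is unknown, revoked, or reused."""


class AuthServer:
    def __init__(self):
        self.active = {}   # refresh token -> grant id (currently valid)
        self.retired = {}  # refresh token -> grant id (already used once)

    def issue(self, grant_id):
        """Issue a fresh access/refresh token pair for a grant."""
        refresh_token = secrets.token_urlsafe(16)
        self.active[refresh_token] = grant_id
        return {
            "access_token": secrets.token_urlsafe(16),
            "refresh_token": refresh_token,
        }

    def revoke_grant(self, grant_id):
        """Drop every valid refresh token that belongs to the grant."""
        self.active = {t: g for t, g in self.active.items() if g != grant_id}

    def refresh(self, refresh_token):
        if refresh_token in self.retired:
            # A rotated token came back: assume theft, revoke the whole grant.
            self.revoke_grant(self.retired[refresh_token])
            raise InvalidGrant("refresh token reuse detected; grant revoked")
        grant_id = self.active.pop(refresh_token, None)
        if grant_id is None:
            raise InvalidGrant("unknown or revoked refresh token")
        # Rotation: retire the used token and hand out a new pair.
        self.retired[refresh_token] = grant_id
        return self.issue(grant_id)


server = AuthServer()
first = server.issue("grant-1")
second = server.refresh(first["refresh_token"])  # normal rotation
try:
    server.refresh(first["refresh_token"])       # same token again
except InvalidGrant as error:
    print(error)  # refresh token reuse detected; grant revoked
```

After the reuse, the token pair in `second` is no longer usable either, which is exactly the “invalidate the whole grant” behavior.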

Side note: How clients do refreshing

In practice, there are at least two ways for a client to use refresh tokens.

Eagerly: The client uses the refresh token as soon as (or right before) the access token expires, using some kind of recurring refresh job. If access tokens last 1 hour, this means refreshing tokens every hour.
Lazily: When the client needs to use an access token, it checks its expiration time. If it’s expired, it refreshes it, stores the new tokens, and then continues as normal.

Many clients choose the lazy approach because (1) it’s easier to implement and (2) it can be less resource-heavy, because if the access token isn’t used for a while, it won’t be refreshed until it’s needed again.
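A lazy client can be sketched in a few lines. Everything here is an assumption for illustration: the `LazyTokenClient` name, the shape of the token dictionary (including an `expires_at` timestamp), and the `refresh_fn` callable that stands in for the actual call to the authorization server.

```python
import time


class LazyTokenClient:
    def __init__(self, tokens, refresh_fn, clock=time.time):
        self.tokens = tokens          # access_token, refresh_token, expires_at
        self.refresh_fn = refresh_fn  # refresh token -> new tokens dict
        self.clock = clock            # injectable so tests can control time

    def get_access_token(self):
        # Only refresh when the token is actually needed and has expired.
        if self.clock() >= self.tokens["expires_at"]:
            self.tokens = self.refresh_fn(self.tokens["refresh_token"])
        return self.tokens["access_token"]
```

Note that nothing stops two callers from reaching the expiry check at the same time, which matters in the next section.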

Where things go wrong: False positives 💥

Maybe you can already see where I’m going with this. The theft detection strategy described above causes a false positive if a legitimate client uses the same refresh token more than once. This can easily happen when (1) a client uses the lazy refreshing described above and (2) the token is sometimes needed for multiple things at the same time, such as if it’s used by multiple end users at once, or if an end user does two actions concurrently.

For example, imagine a web app that uses OAuth2 to load some data. If the user has multiple tabs open at the same time, each tab tries to request the data, each request invokes a refresh (because the current access token is expired), and whichever refresh happens second then triggers the theft detection, revoking the app’s access.

[Figure: step-by-step walkthrough of how the problem happens, made in Lucidchart]

Oddly, the current OAuth2 Security BCP doc doesn’t mention this risk, but the older OAuth2 threat model RFC mentions it offhandedly: “This [theft detection] measure may cause problems in clustered environments, since usage of the currently valid refresh token must be ensured.”

In addition, refresh token rotation can cause problems even without the theft detection technique. If a refresh token is used, but the response never makes it to the client (e.g. the network fails to deliver the response), then the client is left with an invalid refresh token and no recourse except asking the user to re-authorize.
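The multi-tab race can be replayed deterministically. The sketch below uses a deliberately tiny stand-in server (the `TinyAuthServer` class and the `rt-N` token names are invented for this example) and two “tabs” that both read the stored refresh token before either one stores the rotated result.

```python
class TinyAuthServer:
    """Just enough server to show reuse detection; not a real implementation."""

    def __init__(self, initial_token):
        self.current = initial_token
        self.used = set()
        self.revoked = False
        self.counter = 0

    def refresh(self, token):
        if self.revoked:
            return "error: grant revoked"
        if token in self.used:
            self.revoked = True
            return "error: reuse detected, grant revoked"
        if token != self.current:
            return "error: unknown token"
        self.used.add(token)
        self.counter += 1
        self.current = f"rt-{self.counter}"
        return self.current


server = TinyAuthServer("rt-0")
stored_token = "rt-0"

# Both tabs see an expired access token and read the same refresh token.
tab_a_token = stored_token
tab_b_token = stored_token

result_a = server.refresh(tab_a_token)  # first refresh wins
result_b = server.refresh(tab_b_token)  # second looks like theft
result_next = server.refresh(result_a)  # tab A's new token is dead too

print(result_a)     # rt-1
print(result_b)     # error: reuse detected, grant revoked
print(result_next)  # error: grant revoked
```

No attacker is involved at any point, yet the user ends up having to reauthorize the app.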

Mitigation for clients

To mitigate this problem from the client’s side, you’ll need some kind of locking or mutex around the refresh token. Whenever you refresh a token, you set a lock that tells other threads/flows/etc to wait until it’s refreshed. Like database locking in general, this can be finicky. You have lots of edge cases to consider. What happens if the process crashes? Or the refresh request fails? Or times out? Or…?

If you’re building a generic OAuth2 client system, you probably ought to handle this. But if you’re building an OAuth2 provider (i.e. the authorization server), you should think twice before asking clients to do this. There’s a better way!
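For the client-side locking described above, here is a minimal single-process sketch using `threading.Lock`. The `LockedTokenClient` class, the token dictionary shape, and `refresh_fn` are assumptions for illustration. A client spread across several processes or machines would need a shared lock instead (for example, in its database), and that is where the crash and timeout edge cases start to bite.

```python
import threading
import time


class LockedTokenClient:
    def __init__(self, tokens, refresh_fn, clock=time.time):
        self.tokens = tokens          # access_token, refresh_token, expires_at
        self.refresh_fn = refresh_fn  # refresh token -> new tokens dict
        self.clock = clock
        self.lock = threading.Lock()

    def get_access_token(self):
        tokens = self.tokens
        if self.clock() < tokens["expires_at"]:
            return tokens["access_token"]  # fast path: no lock needed
        with self.lock:
            # Re-check after acquiring the lock: another thread may have
            # refreshed while this one was waiting.
            if self.clock() >= self.tokens["expires_at"]:
                self.tokens = self.refresh_fn(self.tokens["refresh_token"])
            return self.tokens["access_token"]
```

The `with` block releases the lock even if `refresh_fn` raises, so a failed refresh does not leave other threads waiting forever. It does not help if the process dies mid-refresh after the server has already rotated the token.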

Building better authorization servers 💡

To solve the false-positive problem on the authorization server’s side, add a small grace period to refresh token rotation. After a refresh token is used, for a short window of time, allow it to be used again (and return the same new tokens as the first time it was used). In other words, make the request “idempotent” during that window of time. The window should be no more than 60 seconds or so. Because the “refresh-and-store-new-tokens” process hopefully only takes a few seconds, a minute is plenty of time to allow concurrent requests to settle.

This does weaken the theft detection, but only slightly. A malicious actor would have to guess when the true client is going to refresh the token. And even if they guess successfully, as long as you give both of them (the malicious actor and the true client) the same new tokens, then the malicious actor will have to make another lucky guess the next time the client refreshes its tokens. (Note: Due to the tighter security requirements of public clients, you might decide to shorten or disable the grace period for those clients.)

Although this idea isn’t formalized in the spec yet, several authorization servers support some kind of grace period on refresh tokens, including Auth0, Okta, Fitbit, Slack, and Lucid, which we’ll discuss further next. (I’ve also found it in Fauna’s auth “blueprints” code and the Django OAuth toolkit library.)
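One way to picture the grace period is the in-memory sketch below. It is not how any of the servers listed above implement it; the `GraceAuthServer` class, the `GRACE_SECONDS` constant, and the data layout are all assumptions, and the tokens are kept in plain dictionaries that never expire purely to keep the example short.

```python
import secrets
import time

GRACE_SECONDS = 60  # assumed window, following the "60 seconds or so" advice


class InvalidGrant(Exception):
    """Raised when a refresh token is unknown, revoked, or reused too late."""


class GraceAuthServer:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.active = {}      # refresh token -> grant id
        self.rotated = {}     # used token -> (grant id, used_at, new tokens)
        self.revoked = set()  # grant ids that were revoked

    def issue(self, grant_id):
        tokens = {
            "access_token": secrets.token_urlsafe(16),
            "refresh_token": secrets.token_urlsafe(16),
        }
        self.active[tokens["refresh_token"]] = grant_id
        return tokens

    def revoke_grant(self, grant_id):
        self.revoked.add(grant_id)
        self.active = {t: g for t, g in self.active.items() if g != grant_id}

    def refresh(self, refresh_token):
        if refresh_token in self.rotated:
            grant_id, used_at, tokens = self.rotated[refresh_token]
            in_grace = self.clock() - used_at <= GRACE_SECONDS
            if in_grace and grant_id not in self.revoked:
                # Idempotent replay: same tokens as the first use.
                return tokens
            # Reuse outside the window still counts as theft.
            self.revoke_grant(grant_id)
            raise InvalidGrant("refresh token reuse detected; grant revoked")
        grant_id = self.active.pop(refresh_token, None)
        if grant_id is None:
            raise InvalidGrant("unknown or revoked refresh token")
        tokens = self.issue(grant_id)
        self.rotated[refresh_token] = (grant_id, self.clock(), tokens)
        return tokens
```

With this in place, two tabs that refresh a few seconds apart receive identical tokens, while a replay two minutes later still revokes the grant.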

How we do this at Lucid

Lucid's REST API uses OAuth 2.0 with refresh tokens, so we mitigated this edge case in the authorization server to make it easier to build integrations. Whenever a refresh token is rotated, we store the new tokens in an encrypted cache with a short TTL. We then return the tokens from the cache if the refresh token is reused within the grace period. This mitigates the problem for clients without introducing too much complexity in our authorization server implementation.

Conclusion

Security is a constant balancing act between risk and usability. Industry standards like OAuth 2.0 are constantly evolving as risks emerge, usability requirements change, and the industry learns as a whole. The best solutions are ones that give benefits on one side of the balance with no harm to the other. If you’re careful about it, implementing theft detection for refresh tokens can be that kind of idea: it can increase the security of your OAuth 2.0 system, and be completely transparent to the clients consuming your APIs, no matter how they’re architected.

Interested in getting started with Lucid’s APIs? Check out our new developer platform!

This post was originally published on Dan Mercer’s blog.
