
Web Cache Deception: What It Is and How to Test for It?

This blog covers web cache deception, a vulnerability that occurs when caching systems mistakenly store and serve sensitive user data. It explores how improper caching configurations and URL tricks can lead to sensitive data exposure.

What is Web Caching?

Web caching is the process of temporarily storing copies of web content, such as pages, images, CSS files, or data, so that it can be delivered faster the next time someone requests it.

It improves the user experience by serving content faster, and it reduces load on the origin server because repeat requests are answered from the cache.

Important Terms for Caching:

Cache Hit: A cache hit occurs when the requested content is already stored in the cache (for example, on a CDN) and can be delivered without contacting the origin server. Caches often report this in a response header such as X-Cache: Hit.

Cache Miss: A cache miss occurs when the content is not stored, so the request is forwarded to the origin server to fetch it.

Cache Key: The cache key is the set of request components, typically the URL path and query parameters and sometimes selected request headers, that the cache uses to decide whether a stored response matches an incoming request.
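
To make these three terms concrete, here is a minimal Python sketch of a cache lookup. The TinyCache class, its key layout (method, host, path, query), and the origin callable are illustrative assumptions, not how any particular CDN is implemented.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str, str, str]


class TinyCache:
    """Toy shared cache keyed on (method, host, path, query)."""

    def __init__(self, origin: Callable[[str, str], str]) -> None:
        self.origin = origin  # hypothetical origin: (path, query) -> body
        self.store: Dict[CacheKey, str] = {}

    def cache_key(self, method: str, host: str, path: str, query: str) -> CacheKey:
        # Assumption: cookies and other headers are NOT part of the key.
        return (method.upper(), host.lower(), path, query)

    def get(self, host: str, path: str, query: str = "") -> Tuple[str, str]:
        key = self.cache_key("GET", host, path, query)
        if key in self.store:
            return "HIT", self.store[key]   # served from the cache
        body = self.origin(path, query)     # forwarded to the origin
        self.store[key] = body
        return "MISS", body
```

Because the key in this sketch ignores cookies, two different users requesting the same path share one cache entry, which is the property web cache deception relies on.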

How does Caching work?

Caching rules define what types of web content should be stored, where they should be stored, and how long they should remain available. These rules help in improving performance, especially for static files like images, CSS, or JavaScript files, but if not configured correctly, could lead to issues like web cache deception. To control caching behavior, several key headers and rules are used.

Here are the most important ones to understand:

1. Cache-Control Header
[Figure: Cache-Control header. Source: https://www.cloudflare.com/img/learning/cdn/glossary/what-is-cache-control/cache-control-header.png]

Servers use the Cache-Control header to tell browsers and CDNs how to handle caching for responses. 

Cache-Control: public, max-age=3600

In this example, public allows both browsers and intermediary systems like CDNs to cache the content. The max-age=3600 tells them to keep the content cached for one hour. This setup works well for static content. However, if developers apply it carelessly to dynamic or sensitive responses that contain session tokens, API keys, or users’ personally identifiable information (PII), they risk caching private data and potentially exposing it. 

Cache-Control: private

Another directive, private, allows only the user's browser to cache the response and prevents shared caches like CDNs from storing it. This directive offers more security for user-specific content.

Cache-Control: no-store

The no-store directive tells caches not to store the response at all, either in memory or on disk. Developers use it for pages that handle sensitive or user-specific data, such as login screens, bank transactions, or profile settings.

When security is a priority, no-store provides the safest option by ensuring that the data never gets saved, even temporarily.
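
The three directives above can be checked programmatically. The following Python sketch parses a Cache-Control value and applies a deliberately simplified rule for whether a shared cache (such as a CDN) may store the response. The function names are hypothetical, and real caches also apply their own rules and heuristics that this sketch ignores.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_store(value: str) -> bool:
    """Simplified check: may a CDN-style shared cache store this response?"""
    d = parse_cache_control(value)
    if "no-store" in d or "private" in d:
        return False
    return "public" in d or "max-age" in d or "s-maxage" in d


print(shared_cache_may_store("public, max-age=3600"))  # True
print(shared_cache_may_store("private"))               # False
print(shared_cache_may_store("no-store"))              # False
```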

2. File Type Rules

File types also determine caching behavior. Static assets like .css, .js, or .jpg are typically safe to cache because they don’t change often and aren’t user-specific.

Responses from dynamic files like .php or .aspx, or from app routes like /dashboard, often contain personalized data. These should be cached only under strict conditions, and usually with the right Cache-Control and Vary headers in place.

3. Query String Handling

Some caches exclude the query string from the cache key, so URLs like /page?id=1 and /page?id=2 are treated as identical. This is risky when those IDs correspond to different users or sensitive data, because one user's private content can be cached and then served to someone else. This risk is explained in more detail later.
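
As a rough illustration of the query string problem, the Python sketch below builds two hypothetical cache keys for the same pair of URLs: one that drops the query string and one that keeps it. Both key functions are assumptions made for this example.

```python
from urllib.parse import urlsplit


def key_ignoring_query(url: str) -> str:
    """Cache key that drops the query string (the risky configuration)."""
    parts = urlsplit(url)
    return parts.netloc + parts.path


def key_with_query(url: str) -> str:
    """Cache key that keeps the query string."""
    parts = urlsplit(url)
    return parts.netloc + parts.path + "?" + parts.query


a = "https://example.com/page?id=1"
b = "https://example.com/page?id=2"
print(key_ignoring_query(a) == key_ignoring_query(b))  # True: both map to one entry
print(key_with_query(a) == key_with_query(b))          # False: separate entries
```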

What is Web Cache Deception?

Web cache deception is a vulnerability that tricks website caching mechanisms into storing private and sensitive information in a place meant only for public data.

Let’s take an example where a user visits an endpoint like https://example.com/B2B/profile. Based on cookies or unique HTTP request values, the server returns unique information such as the user’s PII data, API keys, and other sensitive details. Here's an example of the request:

GET /B2B/profile HTTP/1.1 
Host: example.com
User-Agent: Mozilla/5.0
Cookie: asvaghdva....
Accept: text/html

The server uses the cookie value to build a response containing sensitive information and marks it as cacheable with the Cache-Control: public header.

HTTP/1.1 200 OK
Content-Type: application/json
Cache-Control: public, max-age=3600

{
 "user_id": "298374",
 "name": "Jane Doe",
 "email": "jane.doe@example.com",
 "address": "1234 Elm Street, Springfield, IL",
 "phone": "+1-555-1234"
}

To test for web cache deception, craft a request like the following by appending an extra path segment that looks like a static file. Any arbitrary filename with a static extension (e.g., .js, .css, .png) can be used for testing.

GET /B2B/profile/abc.js HTTP/1.1 
Host: example.com 
User-Agent: Mozilla/5.0 
Cookie: asvaghdva.... 
Accept: text/html 

The abc.js path doesn't actually exist, but the application still serves the same response because the server ignores the extra path segment and loads the original resource. This behavior often results from improper normalization or routing logic that doesn't validate the full URL. The cache, seeing a URL that ends in .js, treats the response as a static asset and stores the personal information under https://example.com/B2B/profile/abc.js. An attacker can then read the victim's data simply by visiting that URL, which is the web cache deception vulnerability.
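
The flow described above can be reproduced with a small in-memory simulation. In this Python sketch, origin_response, ExtensionCache, and the extension list are all hypothetical stand-ins: the origin ignores trailing path segments under /B2B/profile, and the cache stores anything that ends in a static-looking extension without looking at the cookie.

```python
import posixpath
from typing import Dict, Optional

STATIC_EXTENSIONS = {".js", ".css", ".png"}  # assumed cache rule


def origin_response(path: str, session_user: Optional[str]) -> str:
    """Hypothetical origin: anything under /B2B/profile renders the profile."""
    if path == "/B2B/profile" or path.startswith("/B2B/profile/"):
        return f"profile of {session_user}" if session_user else "login required"
    return "404"


class ExtensionCache:
    """Toy cache that stores any response whose path ends in a static extension."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def get(self, path: str, session_user: Optional[str]) -> str:
        if path in self.store:
            return self.store[path]  # the cookie is never checked
        body = origin_response(path, session_user)
        if posixpath.splitext(path)[1] in STATIC_EXTENSIONS:
            self.store[path] = body
        return body


cache = ExtensionCache()
cache.get("/B2B/profile/abc.js", session_user="jane")       # victim, logged in
print(cache.get("/B2B/profile/abc.js", session_user=None))  # prints: profile of jane
```

The second request carries no session at all, yet it receives the victim's profile because the lookup happens purely on the path.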

Why does it happen?

Web cache deception mainly happens because caching layers are not set up correctly. When an application serves both static content (like images or scripts) and dynamic content (like user profiles) through the same caching system without clear rules, the cache cannot reliably tell them apart. It might store sensitive data that should never be cached, which exposes private information to unintended users.

Another common cause is the lack of proper handling of user-specific or session-specific data. Without a proper mechanism to separate cached content based on who is requesting it, the cache may serve one user's private data to someone else. This mix-up usually happens when cache controls aren't strict or specific enough, making it easier for attackers to exploit these weaknesses.

Using Delimiters to Identify Web Cache Behavior


A delimiter is a character used to separate different parts of data in URLs. A delimiter that the origin server honors but the cache ignores can make the server stop parsing the path early, while the cache still considers the full URL. Here's the table of how delimiters can behave:

web-cache-deception-table

How can normalization trick the Backend and Cache servers?

URL normalization is the process where a server "cleans up" a URL before doing anything with it. This means decoding characters such as %2f (/) and resolving dot-segments such as ../ in the path.

When testing for normalization on the origin server (the backend), the goal is to check if it treats unusual-looking paths the same as normal ones. For example, if there is an endpoint /dashboard/settings but the request is made to /test/..%2fdashboard/settings, and the server still loads the settings page, it is normalizing the path before routing it. However, if a 404 or a different response is returned, the server is treating the modified path literally, which leads to an error.

CDNs often have rules like caching everything under endpoints such as /assets, but not caching /profile because the response contains sensitive information. This is where things get interesting. Suppose a request is sent to /assets/..%2f/profile. After normalization, this points to /profile, a page that normally wouldn't be cached. However, if the origin server normalizes the path while the cache does not, and the cache blindly applies its rule to the raw /assets prefix, it might store the sensitive /profile page and serve it from the cache to other users. This behavior can be tested on paths that serve static content.
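
The mismatch can be seen by comparing how the two sides interpret the same raw path. The Python sketch below is an approximation under stated assumptions: normalize stands in for an origin that decodes once and resolves dot-segments, and cache_rule_matches stands in for a CDN rule that checks only the raw prefix. Neither reflects a specific product.

```python
import posixpath
from urllib.parse import unquote


def normalize(path: str) -> str:
    """Decode percent-encoding once, then resolve dot-segments."""
    return posixpath.normpath(unquote(path))


def cache_rule_matches(raw_path: str) -> bool:
    """Assumed CDN rule: cache anything whose RAW path starts with /assets/."""
    return raw_path.startswith("/assets/")


payload = "/assets/..%2f/profile"
print(cache_rule_matches(payload))  # True: the cache sees an /assets/ URL
print(normalize(payload))           # /profile: what a normalizing origin routes to
```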


As another example, a payload like /profile%2f%2e%2e%2fstatic may be interpreted differently by the cache and the origin server: the cache decodes and resolves it to /static, while the origin server does not resolve the encoded traversal and returns an error.

To get around this, test for delimiters (like ;) that the origin server uses to truncate the path but the cache ignores. For instance, with /profile;%2f%2e%2e%2fstatic, the cache still sees /static, but the origin server stops at /profile and returns sensitive dynamic content. This content may then be cached and served to unauthorized users, leading to a cache deception vulnerability.
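
A short Python sketch of this delimiter discrepancy follows. It assumes the payload carries a ; directly after /profile, that the origin cuts the path at the first delimiter, and that the cache treats ; as an ordinary character before decoding and normalizing. The functions origin_view and cache_view are hypothetical models of those two behaviors.

```python
import posixpath
from urllib.parse import unquote


def origin_view(raw_path: str, delimiter: str = ";") -> str:
    """Hypothetical origin: cuts the path at the first delimiter, then routes."""
    return raw_path.split(delimiter, 1)[0]


def cache_view(raw_path: str) -> str:
    """Hypothetical cache: ignores the delimiter, decodes, then normalizes."""
    return posixpath.normpath(unquote(raw_path))


payload = "/profile;%2f%2e%2e%2fstatic"
print(origin_view(payload))  # /profile
print(cache_view(payload))   # /static
```

When testing a real target, the delimiter argument can be swapped for other candidate characters to see which ones the origin honors.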

Case Studies: Web Cache Deception Reports from HackerOne

To better understand how web cache deception plays out in real environments, let's look at actual vulnerability reports submitted to HackerOne and analyze how the attacks were carried out.

Web Cache Deception in Action: The Lyst Report

The report explains a common attack trick where someone adds a fake file extension, like .css, to a URL. This fools the caching system into thinking it’s a static file. As a result, it creates a unique cache key, which can cause sensitive content to get stored in the cache and later exposed to other users.

web-cache-deception-5

As shown in the screenshot, the username, email, user ID, user slug, and other session-specific details are clearly exposed in a response that should never have been cached or made accessible to other users.

https://www.lyst.com/shop/trends/mens-dress-shoes/blahblah.css
web-cache-deception-6

This response was stored in the web cache, meaning it could be served to anyone else who visited the same URL, even if they weren't logged in. As seen in the screenshot above, the researcher tested the attack by opening the cached URL in incognito mode, which revealed the victim's logged-in session details.

So, the attack scenario is as follows: when a logged-in user visits a URL with a fake file extension, their data gets cached at that particular path. Since, in this case, the content could not be cached with a single request, the attacker hosts an HTML page on an attacker-controlled domain containing a script that sends multiple requests on behalf of the user. This automates the process of caching the content at a random path.

Here’s the original HackerOne submission for further reference: https://hackerone.com/reports/631589

Inside the Shopify Web Cache Deception Report

This report is an excellent example of web cache deception, in which the caching server mistakenly stored user-specific data that should have remained private. It also shows why 404 pages are worth checking while testing.

The issue occurred when the Shopify application returned a custom 404 error page that included personal details of the authenticated user, like their name, email, profile picture, and even a valid CSRF token. The researcher exploited this by crafting URLs that looked like static files (e.g., .css) using a path confusion technique. By appending a random string and a .css extension to valid Shopify paths, the researcher caused the cache server to treat the page as a static asset and store it. For example, a URL like:

https://help.shopify.com/es/manual/your-account/copyright-and-trademark/abcdefg.css
web-cache-deception-7

From the screenshot above, we can see that the cf-cache-status header's value shows "HIT" even though the response contains data for an authenticated user.

An interesting point to note in this case is that for this web cache deception attack to succeed, both the victim and the attacker must be served by the same public cache node, for example, both located in Europe. If they're served by different cache regions, like the victim in Europe and the attacker in the US, the attacker won't receive the cached data. It's also worth mentioning that this wasn't limited to just one subdomain; several Shopify subdomains were found to be vulnerable. Read the original report.

With access to this kind of data, an attacker could take over accounts or carry out more targeted attacks. The report shows that even something as ordinary as a 404 error page can become a serious security risk if it's cached incorrectly.

Methodology for Web Cache Deception


Here are the key steps to effectively identify and test for this vulnerability.

Identify a target URL that returns sensitive data

Start by finding a page or endpoint on the website that shows personal or sensitive information like email, API key, address, payment info, account settings, order history, or any page that’s customized for a logged-in user. Make sure the content changes based on the user’s session.

Check the caching headers on the response

Look closely at the HTTP response headers for cache-related fields such as Cache-Control, Expires, and Pragma. Headers like Cache-Control: public or a positive max-age value indicate that the content might be cached. If the headers specify no-store or private, caching is likely disabled for that response. In such cases, look for static resources on the site that are cached, and use the fake static file method described above to make the sensitive URL match the same caching rule.

Modify the URL by adding a fake static file extension or path

Now, try to trick the cache by adding what looks like a static file extension or an extra path segment to the URL. For example, if the original URL is /profile, try /profile.css or /profile/abc.js. The idea is that caches often aggressively store responses for static file types (like .css, .js, .png).

Experiment with adding extra path segments, like /profile/static or /profile/assets/script.js, to see if the cache treats these differently. This technique, combined with the methods mentioned in the Normalization section, can help in finding abnormalities.
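
Generating these variants by hand gets tedious, so a small helper can list them. This Python sketch is only a starting point: the function name candidate_paths, the token wcdtest, and the choice of extensions and segments are assumptions for illustration.

```python
from typing import List


def candidate_paths(base_path: str, token: str = "wcdtest") -> List[str]:
    """Build fake-static variants of a dynamic path for manual testing."""
    base = base_path.rstrip("/")
    extensions = [".css", ".js", ".png"]
    variants = [f"{base}{ext}" for ext in extensions]            # /profile.css
    variants += [f"{base}/{token}{ext}" for ext in extensions]   # /profile/wcdtest.js
    variants += [f"{base}/static", f"{base}/assets/script.js"]   # extra segments
    return variants


for path in candidate_paths("/profile"):
    print(path)
```

Using a fresh token for each test run helps avoid reading an entry cached during an earlier attempt.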

Send a request to this modified URL and analyze the response

Make a request to the modified URL and check if the server returns the same sensitive content as before. If the response includes user-specific data, it indicates that the cache might treat it as a static file and store it.

Look for caching headers that confirm the response is cached

Check if the response now includes headers that indicate caching, such as an Age or X-Cache: Hit header or headers added by CDN providers. This can help confirm that the content is actually being cached somewhere.
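
A minimal Python sketch of this header check is shown below. It looks only at Age, X-Cache, and CF-Cache-Status; other CDN providers use different header names, so the list here is an assumption that should be extended per target.

```python
from typing import Dict, List


def cache_indicators(headers: Dict[str, str]) -> List[str]:
    """Return hints that a response was served from a cache."""
    h = {k.lower(): v for k, v in headers.items()}
    hints: List[str] = []
    age = h.get("age", "").strip()
    if age.isdigit() and int(age) > 0:
        hints.append(f"Age: {age}")
    for name in ("x-cache", "cf-cache-status"):
        if "hit" in h.get(name, "").lower():
            hints.append(f"{name}: {h[name]}")
    return hints


print(cache_indicators({"Age": "120", "CF-Cache-Status": "HIT"}))
# ['Age: 120', 'cf-cache-status: HIT']
```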

Try accessing the modified URL from a different session or browser

To confirm the vulnerability, use a different browser or an incognito window to access the same modified URL. If the original user’s private information appears without any authentication, it indicates that cached sensitive data is being served to unauthorized users.

Note: In some cases, the caching configuration includes the User-Agent as part of the cache key, which means that the cache deception may only occur when using the same user agent.
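
The final confirmation step can be sketched as a two-request check. In the Python example below, the fetch callable is a placeholder for whatever HTTP client is in use, and looks_deceivable and marker are hypothetical names. The marker should be a string unique to the test account, such as its email address, and both requests should send the same User-Agent in case it is part of the cache key.

```python
from typing import Callable, Dict, Optional, Tuple

# fetch(url, cookie) -> (status, headers, body); supply your own HTTP client.
Fetch = Callable[[str, Optional[str]], Tuple[int, Dict[str, str], str]]


def looks_deceivable(fetch: Fetch, url: str, cookie: str, marker: str) -> bool:
    """True if marker leaks to a request that carries no session."""
    fetch(url, cookie)             # 1. authenticated request primes the cache
    _, _, body = fetch(url, None)  # 2. same URL, no session cookie
    return marker in body
```

Keeping the HTTP client behind a callable means the same check works with any library and can be exercised against a fake cache before it is pointed at a real target.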

How to Prevent Web Cache Deception?

  1. Avoid Caching Sensitive Content: User-specific pages such as account settings, billing information, or user dashboards should not be cached. To enforce this, set response headers such as Cache-Control: no-store (optionally combined with private and no-cache) and, for older HTTP/1.0 caches, Pragma: no-cache, so that caches do not store sensitive content.
  2. Use Whitelisted Caching: Define which URLs are allowed to be cached, typically static assets like images, stylesheets, and JavaScript files. Configure the CDN or reverse proxy to cache responses only from specific folders or file extensions.
  3. Validate URLs: Web cache deception attacks often rely on appending fake file extensions (like .css, .jpg) to dynamic URLs. In such cases, applications should strictly validate the structure and patterns of URLs to ensure that invalid or manipulated paths do not resolve to legitimate content.
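
The three measures above can be combined in the routing layer. The Python sketch below is framework-agnostic and hypothetical: the respond function, the DYNAMIC_ROUTES set, and the /assets/ prefix are assumptions chosen to mirror the examples in this post, not a drop-in configuration.

```python
from typing import Dict, Set, Tuple

DYNAMIC_ROUTES: Set[str] = {"/profile", "/dashboard/settings"}  # exact paths only
STATIC_PREFIX = "/assets/"


def respond(path: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) with strict routing and explicit cache policy."""
    if path in DYNAMIC_ROUTES:
        # Exact match only: /profile/abc.js does NOT fall through to /profile.
        return 200, {"Cache-Control": "no-store"}
    if path.startswith(STATIC_PREFIX) and ".." not in path:
        # Only the allowlisted static prefix is marked cacheable.
        return 200, {"Cache-Control": "public, max-age=3600"}
    # Error pages may carry user context too, so they are not cacheable either.
    return 404, {"Cache-Control": "no-store"}


print(respond("/profile/abc.js"))  # (404, {'Cache-Control': 'no-store'})
```

Marking the 404 response as no-store addresses the case from the Shopify report, where an error page carrying user details was cached.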

Conclusion

Web cache deception is a vulnerability that happens when the caching server and the origin server handle the same URL differently. By adding things like encoded characters, special symbols, or fake file extensions to a URL, attackers may trigger the caching of sensitive data. Understanding these behaviors is key to finding and fixing the web cache deception vulnerability. Proper testing and correct cache configurations are essential to prevent private data from being exposed.

About Meet Sodha
Meet aka Smilehacker is a seasoned cybersecurity researcher with an impressive four-year track record in the field. His journey began in 2019 when he delved into the world of bug bounties and application security, rapidly honing his skills and expertise. Meet possesses a profound understanding of application security and continually augments his knowledge by devouring articles, engaging in hands-on labs, and conducting extensive research in specialized areas. With a keen eye for vulnerabilities and a passion for safeguarding digital landscapes, Meet is a trusted guardian of online security, dedicated to staying one step ahead of cyber threats.