I am implementing an app that makes a lot of network calls to a REST API that we also control. Recently we decided to introduce caching headers on the server side to save some valuable network and server time. As we do not know beforehand how long the data will be valid, we are not sending Cache-Control: max-age or Expires headers; all we do is send a Last-Modified header together with an ETag, so we always hit the server, but responses are pretty fast most of the time with a 304.
Everything seemed to work fine at first, with many requests being cached. However, I am experiencing some random data errors on the app due to the caching.
For some reason I cannot understand, at some point requests are being locally cached and used as "updated" data without hitting the server, when they actually are not up to date. The problem persists until some time passes. Then everything goes to the server normally again, exactly as it would behave with a Cache-Control header, but without one! So, my question is:
How can NSURLCache together with NSURLConnection decide that a particular request does not need to go online when the original response did not come with Cache-Control: max-age or Expires headers? Has anyone experienced similar effects? And how can I solve it without removing the whole cache?
Some more background info:
I am using AFNetworking, but it relies on NSURLConnection so I do not think it changes anything.
The cache used is the default [NSURLCache sharedURLCache] instance
It is a GET request, and when I check the headers from the cached response this is what I get:
po [response allHeaderFields]
"Access-Control-Allow-Headers" = "Content-Type";
"Access-Control-Allow-Methods" = "GET, POST, DELETE, PUT";
"Access-Control-Allow-Origin" = "*";
Connection = "keep-alive";
"Content-Encoding" = gzip;
"Content-Length" = 522;
"Content-Type" = "application/json";
Date = "Mon, 02 Sep 2013 08:00:38 GMT";
Etag = "\"044ad6e73ccd45b37adbe1b766e6cd50c\"";
"Last-Modified" = "Sat, 31 Aug 2013 10:36:06 GMT";
Server = "nginx/1.2.1";
"Set-Cookie" = "JSESSIONID=893A59B6FEFA51566023C14E3B50EE1E; Path=/rest-api/; HttpOnly";
I cannot predict or reproduce when the error is going to happen, so solutions that rely on deleting the cache are not an option.
I am targeting iOS 5+.
How can NSURLCache together with NSURLConnection decide that a
particular request does not need to go online when...
Section 13.2.2 of RFC 2616 says:
Since origin servers do not always provide explicit expiration times,
HTTP caches typically assign heuristic expiration times, employing
algorithms that use other header values (such as the Last-Modified
time) to estimate a plausible expiration time. The HTTP/1.1
specification does not provide specific algorithms, but does impose
worst-case constraints on their results. Since heuristic expiration
times might compromise semantic transparency, they ought to used
cautiously, and we encourage origin servers to provide explicit
expiration times as much as possible.
So, it's possible for the URL loading system to decide that the cached data is "fresh enough" even though you haven't provided a specific lifetime for the data.
For best results, you should try to provide a specific lifetime in your response headers. If adding such a header is impossible, perhaps you could change the request instead. An If-Modified-Since or Cache-Control request header could each help you avoid stale cached data.
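To get a feel for how long a heuristic lifetime could be, here is a rough Swift sketch. It applies the 10% fraction that RFC 2616 (section 13.2.4) mentions as a "typical setting" to the Date and Last-Modified values from the question. The helper names parseHTTPDate and heuristicFreshnessLifetime and the 10% default are assumptions for illustration only; the algorithm NSURLCache actually uses is not documented.

```swift
import Foundation

// Parses an HTTP date such as "Mon, 02 Sep 2013 08:00:38 GMT".
func parseHTTPDate(_ string: String) -> Date? {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    formatter.dateFormat = "EEE, dd MMM yyyy HH:mm:ss zzz"
    return formatter.date(from: string)
}

// Hypothetical helper: estimates a freshness lifetime as a fraction of the
// time elapsed between Last-Modified and Date. The 10% default is the
// "typical setting" from RFC 2616, not a documented NSURLCache value.
func heuristicFreshnessLifetime(date: String,
                                lastModified: String,
                                fraction: Double = 0.1) -> TimeInterval? {
    guard let responseDate = parseHTTPDate(date),
          let modified = parseHTTPDate(lastModified),
          responseDate > modified else { return nil }
    return responseDate.timeIntervalSince(modified) * fraction
}

// Header values taken from the cached response in the question.
let lifetime = heuristicFreshnessLifetime(
    date: "Mon, 02 Sep 2013 08:00:38 GMT",
    lastModified: "Sat, 31 Aug 2013 10:36:06 GMT")
if let lifetime = lifetime {
    print("Heuristic lifetime: \(lifetime / 3600) hours")
}
```

With the headers from the question the two dates are about 45 hours apart, so a 10% rule would treat the response as fresh for roughly four and a half hours. That would be consistent with requests skipping the server "until some time passes".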
Given the statement "at some point requests are being locally cached and used as "updated" data without hitting the server", I am pretty sure that your requests are being cached in memory.
NSURLCache caches the data in memory, not on disk. So let me explain what might be happening in your case.
You launch the app.
You make a web service call.
It fetches the data from the server.
You make the call again, and it fetches the response from memory without contacting the server and displays the result.
You leave the app for some time or restart it. It checks whether the data is still in memory. If it is not, it calls the server again and repeats the same behaviour.
I would recommend writing your own disk cache instead of relying on NSURLConnection and NSURLCache, because some of the caching policies are still not implemented by Apple.
Please make sure that your NSURLRequest cache policy is set to NSURLRequestReturnCacheDataElseLoad.
If you are using AFNetworking, you can override this method in AFHTTPClient.m:
- (NSMutableURLRequest *)requestWithMethod:(NSString *)method
path:(NSString *)path
parameters:(NSDictionary *)parameters
Replace line 470 with this:
NSMutableURLRequest *request = [[NSMutableURLRequest alloc] initWithURL:url cachePolicy:NSURLRequestReturnCacheDataElseLoad timeoutInterval:15];
What you are actually doing is telling the request to load from the cache if the server has not been updated; if the server has been updated, it will ignore the cache and download the content from the server.
FYI: NSURLCache stores the data in memory. If you want to store the data on disk, you can use my class here:
https://github.com/shoeb01717/Disc-Cache-iOS
According to the documentation if you use the default useProtocolCachePolicy the logic is as follows:
If a cached response does not exist for the request, the URL loading system fetches the data from the originating source.
Otherwise, if the cached response does not indicate that it must be revalidated every time, and if the cached response is not stale (past its expiration date), the URL loading system returns the cached response.
If the cached response is stale or requires revalidation, the URL loading system makes a HEAD request to the originating source to see if the resource has changed. If so, the URL loading system fetches the data from the originating source. Otherwise, it returns the cached response.
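The three documented steps can be written down as a small decision function. This is only a sketch of the logic as the documentation describes it, not of what the URL loading system really does; CachedEntry, CacheDecision and decide(cached:now:) are hypothetical names made up for the illustration.

```swift
import Foundation

// Hypothetical model of the information a cache would keep per response.
struct CachedEntry {
    var mustRevalidate: Bool   // response said it must be revalidated every time
    var expirationDate: Date   // explicit or heuristic expiration
}

// The three outcomes described by the documentation.
enum CacheDecision {
    case fetchFromSource   // no cached response exists
    case returnCached      // cached, fresh, and no forced revalidation
    case revalidate        // stale or must-revalidate: ask the server first
}

// Sketch of the documented useProtocolCachePolicy logic.
func decide(cached: CachedEntry?, now: Date) -> CacheDecision {
    guard let entry = cached else { return .fetchFromSource }
    if !entry.mustRevalidate && now < entry.expirationDate {
        return .returnCached
    }
    return .revalidate
}

let now = Date()
let freshEntry = CachedEntry(mustRevalidate: false,
                             expirationDate: now.addingTimeInterval(60))
let staleEntry = CachedEntry(mustRevalidate: false,
                             expirationDate: now.addingTimeInterval(-60))
print(decide(cached: nil, now: now))
print(decide(cached: freshEntry, now: now))
print(decide(cached: staleEntry, now: now))
```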
However, my experimentation (see below) has shown this to be completely false. Even if the response is cached it is never used, and no HEAD request is ever made.
Scenario
I am making a request to a URL that returns ETag and Last-Modified headers which never change. I have made the request at least once, so the response is already cached (which I can verify by looking at the cache DB for the app on the iOS simulator).
Using useProtocolCachePolicy (the default)
If I have a URLSession with a URLSessionConfiguration with requestCachePolicy set to useProtocolCachePolicy then the response is cached (I can see it in the cache DB), but the cached response is never used. Repeated requests to the same URL always make a new GET request without If-None-Match or If-Modified-Since headers, so the server always returns HTTP 200 with the full response. The cached response is ignored.
Using reloadRevalidatingCacheData on every URLRequest
If I set the cachePolicy on each URLRequest to reloadRevalidatingCacheData then I see caching in action. Each time I make the request, a GET request is made with the If-None-Match and If-Modified-Since headers set to the values of the ETag and Last-Modified headers, respectively, of the cached response. As nothing has changed, the server responds with a 304 Not Modified, and the locally cached response is returned to the caller.
Using reloadRevalidatingCacheData only on the URLSessionConfiguration
If I only set requestCachePolicy = .reloadRevalidatingCacheData on the URLSessionConfiguration (instead of on each URLRequest), then when the app starts only the first request uses cache headers and gets a 304 Not Modified response. Subsequent requests are normal GET requests without any cache headers.
Conclusion
All the other cache policy settings are basically variants of "only use cached data" or "never use the cache" so are not relevant here.
There is no scenario in which URLSession makes a HEAD request as the documentation claims, and no situation in which it just uses cached data without revalidation based on expiration date information in the original response.
The workaround I will use is to set cachePolicy = .reloadRevalidatingCacheData on every URLRequest to get some level of local caching. A 304 Not Modified response only returns headers and no data, so there is a saving in network traffic.
If anyone has any better solutions, or knows how to get URLSession working as documented, then I would love to know.
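For reference, the per-request workaround could look roughly like the sketch below. The helper name makeRevalidatingRequest(for:) and the api.example.com URL are placeholders, not part of the original experiment.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // only needed on non-Apple platforms
#endif

// Hypothetical helper: builds a request whose cached response, if any,
// has to be revalidated with the server before it is used.
func makeRevalidatingRequest(for url: URL) -> URLRequest {
    var request = URLRequest(url: url)
    request.cachePolicy = .reloadRevalidatingCacheData
    return request
}

// Placeholder URL for illustration.
let url = URL(string: "https://api.example.com/items")!
let request = makeRevalidatingRequest(for: url)
// Pass `request` to URLSession.shared.dataTask(with:completionHandler:)
// and call resume() on the returned task as usual.
```

Routing every request through one helper like this keeps the policy in a single place, which matters here because setting it only on the URLSessionConfiguration did not behave the same way in the experiment above.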
Service response headers should include:
Cache-Control: must-revalidate
With this directive present, Apple's URL loading system implements .useProtocolCachePolicy as described in the documentation.
I've used NSURLRequestReturnCacheDataElseLoad but it does not work, i.e., it always loads data from the server, as the data I receive is different every time.
I'm wondering:
Even with the NSURLRequestReturnCacheDataElseLoad policy, does it still obey the cache control headers from the server's response, despite the documentation saying the expiration date is ignored?
What is the storage policy for [NSURLCache sharedURLCache]? If it is in memory only, does that mean the cache won't be on disk the next time I start the app?
I found this very interesting:
NSURLRequestReturnCacheDataElseLoad not loading from cache on first request?
which says
it seems this problem only exists when there's a query in the url.
Is that a confirmed problem?
Thanks
Well, this topic is almost 5 years old, but I just recently had the same problem. In my case, I was using an NSURLSessionDownloadTask which, according to Can I use HTTP caching with an NSURLSessionDownloadTask on iOS?, doesn't use caching no matter which caching policy is used. I switched my code to use NSURLSessionDataTask and the NSURLRequestReturnCacheDataElseLoad policy worked as expected.
We are building an iOS app that uses AFNetworking to connect to a server running Tornado. The server includes the header Cache-Control: private, max-age=900 in the response. When running the server on my local machine, I can tell that AFNetworking uses the cached values because there are no requests received by the server on repeated requests from the app. When we deploy the same Tornado server to the test machine, each request from the app results in a request received on the server, ignoring the cached value.
The only differences between the two setups are the URL of the server and the fact that the test server is accessed over an HTTPS connection, while the localhost uses HTTP. Does HTTPS affect the caching by AFNetworking, and if so, how can we get AFNetworking to respect the cache header?
Not sure if it's going to be of any help, but here it is anyway:
AFNetworking uses NSURLConnection, which uses the NSURLCache shared cache. AFNetworking is completely transparent with regard to caching and doesn't do anything specific.
My requests are https and were caching just fine.
Cache-Control response directives allow an origin server to override the default cacheability of a response:
private
Indicates that all or part of the response message is intended for a single user and MUST NOT be cached by a shared cache. This allows an origin server to state that the specified parts of the response are intended for only one user and are not a valid response for requests by other users. A private (non-shared) cache MAY cache the response.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1
If acceptable, try changing the directive to public.
Log the response headers in the app and look at your cache.db to see whether anything is in fact being cached there.
Try to configure shared cache - something along the lines of
int cacheSizeMemory = 4*1024*1024; // 4MB
int cacheSizeDisk = 100*1024*1024; // 100MB
[[NSURLCache sharedURLCache] setMemoryCapacity:cacheSizeMemory];
[[NSURLCache sharedURLCache] setDiskCapacity:cacheSizeDisk];
Another good read about this is here:
http://petersteinberger.com/blog/2012/nsurlcache-uses-a-disk-cache-as-of-ios5/
If I download a document using NSURLConnection/NSURLCache that gets cached, edit that document on the server (so the Last-Modified and ETag headers change) and then download the document again, the previously cached version is returned. NSURLCache/NSURLConnection makes no attempt to check for a newer resource using If-Modified-Since/If-None-Match headers in the request (which would return a newer version of the resource).
Should NSURLCache, used in conjunction with NSURLConnection, check for an updated resource on the server using the Last-Modified/ETag headers that have been previously cached? I can't seem to find any documentation to say whether this should happen or if checking for HTTP 304 content is up to the developer.
I'll let other people comment on how to use NSURLCache. I found that the most reliable way to prevent caching with NSURLConnection, proxy servers, and misconfigured web servers, was to append an incrementing number to your URL.
So rather than using http://mycompany.com/path, use http://mycompany.com/path?c=1, http://mycompany.com/path?c=2, http://mycompany.com/path?c=3, etc, etc.
It's a hack, but a good one.
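As a rough illustration of the incrementing-number trick, something like the following would produce those URLs. CacheBuster and its bust(_:) method are hypothetical names; only the c query parameter comes from the example above.

```swift
import Foundation

// Hypothetical helper: appends an incrementing "c" query item so that
// every request URL is unique and never matches an existing cache entry.
struct CacheBuster {
    var counter = 0

    mutating func bust(_ url: URL) -> URL? {
        guard var components = URLComponents(url: url,
                                             resolvingAgainstBaseURL: false) else {
            return nil
        }
        counter += 1
        // Keep any query items the URL already has.
        var items = components.queryItems ?? []
        items.append(URLQueryItem(name: "c", value: String(counter)))
        components.queryItems = items
        return components.url
    }
}

var buster = CacheBuster()
let base = URL(string: "http://mycompany.com/path")!
let first = buster.bust(base)
let second = buster.bust(base)
print(first?.absoluteString ?? "invalid URL")
print(second?.absoluteString ?? "invalid URL")
```

Keep in mind that each unique URL may still be stored as a separate cache entry, so this avoids stale reads rather than saving cache space.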
I was looking at Chirpy for CSS/JS minification, compression, etc.
I noticed it doesn't support caching. It doesn't have any logic for sending Expires headers, ETags, etc.
The absence of this feature made me wonder whether caching content is less of a concern than I thought, yet YSlow grades it, so I'm a little confused. Now I'm researching caching and cannot explain why this CSS file, SuperFish.css, is being retrieved from cache.
Visit http://www.weirdlover.com (developer of Chirpy)
Look at the initial network trace. Notice that there is no expiration header for SuperFish.css.
Revisit the page and inspect the network trace again. Now SuperFish.css is retrieved from cache.
Why is SuperFish.css retrieved from cache upon revisiting the page? This happens even when I close all instances of Chrome and then revisit the page.
This seems to fall within the HTTP specification.
13.4 Response Cacheability
Unless specifically constrained by a cache-control (section 14.9) directive, a caching system MAY always store a successful response (see section 13.8) as a cache entry, MAY return it without validation if it is fresh
13.2.2 Heuristic Expiration
Since origin servers do not always provide explicit expiration times, HTTP caches typically assign heuristic expiration times, employing algorithms that use other header values (such as the Last-Modified time) to estimate a plausible expiration time.
It would seem that when the server provides neither a Cache-Control header nor an Expires header, the client is free to use a heuristic to generate an expiry date and then cache the response based upon that.
The presence of an ETag has no effect on this, since the ETag is used to revalidate an expired cache entry, and in this case Chrome considers the cached entry to be fresh (the same applies to Last-Modified), so it hasn't yet expired.
The general principle is that if the origin server is concerned with freshness, it should state it explicitly.
In this case (when the server doesn't return an Expires header), the browser should make an HTTP request with an If-Modified-Since header, and if the server returns HTTP 304 Not Modified then the browser gets the data from the cache.
But, as far as I can see, browsers nowadays don't make any request when the data is in the cache. I think they behave this way for better response time.
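For a client that wants to force this revalidation itself, a conditional request can be built by hand from the validators of the cached response. The Swift sketch below is one way to do that under stated assumptions: headerValue(_:in:) and conditionalRequest(for:cachedHeaders:) are hypothetical helpers, the URL is a placeholder, and the header values are the ones shown in the first question.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // only needed on non-Apple platforms
#endif

// Case-insensitive header lookup, since servers vary in capitalization.
func headerValue(_ name: String, in headers: [String: String]) -> String? {
    return headers.first { $0.key.caseInsensitiveCompare(name) == .orderedSame }?.value
}

// Hypothetical helper: copies the validators of a cached response
// into a new conditional GET request.
func conditionalRequest(for url: URL, cachedHeaders: [String: String]) -> URLRequest {
    var request = URLRequest(url: url)
    // Skip the local cache so the validators below actually reach the server.
    request.cachePolicy = .reloadIgnoringLocalCacheData
    if let etag = headerValue("ETag", in: cachedHeaders) {
        request.setValue(etag, forHTTPHeaderField: "If-None-Match")
    }
    if let lastModified = headerValue("Last-Modified", in: cachedHeaders) {
        request.setValue(lastModified, forHTTPHeaderField: "If-Modified-Since")
    }
    return request
}

let cachedHeaders = [
    "Etag": "\"044ad6e73ccd45b37adbe1b766e6cd50c\"",
    "Last-Modified": "Sat, 31 Aug 2013 10:36:06 GMT"
]
let url = URL(string: "https://api.example.com/items")!  // placeholder
let request = conditionalRequest(for: url, cachedHeaders: cachedHeaders)
print(request.allHTTPHeaderFields ?? [:])
```

A 304 status on the response then means the previously stored body is still valid, while a 200 carries the new content along with new validators.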