Can I preload specific assets using service workers?

I'm new to service workers and still unsure how they work.
I'm building an educational game using React, ES6, and TypeScript. This game will eventually have numerous assets, such as animations exported from Adobe Animate, audio files, etc.
I want to preload all the assets for an individual level before the game starts. I've been messing with Workbox, but I'm not sure if it allows preloading specific assets according to what level of the game the user is at. In other words, if a user is on Level 1, I want to preload all assets for Level 1, and then, after the game starts, preload all assets for Level 2 so the user doesn't have to wait once Level 1 is completed.
Is this possible with service workers and/or Workbox?

Workbox has support for precaching a set of URLs determined at build time. A best practice for applications that might have many megabytes' worth of content is to only precache the assets that you think the majority of users will end up using.
It sounds like, in your scenario, it would make sense to precache the shared HTML, JavaScript, and CSS that's common throughout your web app, and then use a different strategy for the level-specific assets.
You can use the Cache Storage API directly (either from within your web app's window context, or inside your service worker) to add URLs to a cache outside of the precaching step.
What I would recommend is taking advantage of the Cache Storage API to add items to the cache proactively when certain conditions are met in your web app (e.g., the user is about to complete Level 1), and then setting up a Workbox route that matches those URLs with a cache-first strategy (or stale-while-revalidate, if you want to include a check for updates to cached resources). If a given asset has already been added to the cache, it will be used right away; if it hasn't (because the user progressed from Level 1 to Level 2 before the cache.addAll() call could complete), things will still work, but they'll be blocked on the network.
This might look like the following in your web app's window context:
const level2Urls = [
  '/assets/level2.png',
  // ...
];

const level3Urls = [
  '/assets/level3.png',
  // ...
];

async function cacheAssetUrls(urls) {
  // Open (or create) the named cache, then fetch and store each URL.
  const cache = await caches.open('assets');
  await cache.addAll(urls);
}
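You could then call this at whatever trigger point makes sense for your game logic; as a rough sketch (the player object and its methods are hypothetical):

// Warm the cache for the next level once the current one is nearly done.
if (player.level === 1 && player.isNearLevelEnd()) {
  cacheAssetUrls(level2Urls);
}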
And then in your service worker:
workbox.routing.registerRoute(
  new RegExp('/assets/'),
  workbox.strategies.cacheFirst({
    cacheName: 'assets',
  })
);

Related

Save on-demand URLs to cache

I'm learning Workbox and I want to add some article URLs to the cache for X amount of days, and I don't know how to do it.
I can handle URLs that I know using precacheAndRoute.
Example:
precacheAndRoute([
  { url: '/index.html', revision: '...' },
  { url: '/contact.html', revision: '...' },
]);
Now, I want to cache some URLs on demand whose paths I don't know in advance. This is because my project is a blog and each post has its own path.
My proposed scenario is:
A user enters an article, and that article is cached for 30 days so they can view it offline later.
What you're after is called runtime caching. It works as you describe: content is cached as the user navigates through the website. Afterwards the content is available for offline viewing.
Runtime caching may be implemented with different strategies. They can, e.g., serve data only from the cache, from the cache or the network depending on speed, or from the cache first while updating it in the background. There are multiple different strategies, and they may even be manually configured to fit your needs.
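As a sketch of your 30-day scenario (assuming the articles live under a /posts/ path prefix, which is my assumption, not something from your question), a cache-first runtime route with an expiration plugin might look like this:

import { registerRoute } from 'workbox-routing';
import { CacheFirst } from 'workbox-strategies';
import { ExpirationPlugin } from 'workbox-expiration';

registerRoute(
  // Match article pages; the '/posts/' prefix is hypothetical.
  ({ url }) => url.pathname.startsWith('/posts/'),
  new CacheFirst({
    cacheName: 'articles',
    plugins: [
      // Evict cached articles once they are older than 30 days.
      new ExpirationPlugin({ maxAgeSeconds: 30 * 24 * 60 * 60 }),
    ],
  })
);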
Reading: https://developers.google.com/web/tools/workbox/modules/workbox-strategies#what_are_workbox_strategies, https://developers.google.com/web/fundamentals/instant-and-offline/offline-cookbook, https://web.dev/runtime-caching-with-workbox/
Advice: before implementing anything READ A LOT. That way you can grasp the concepts before you try anything. It might also be that you find something you never thought about in the beginning.

Cache-first Service Worker: how to bypass cache on updated assets?

Here is the scenario:
You have a site that is currently cached via a SW. You deploy a new version that includes an updated SW with a cache-busting version. The company then announces the new features. People visit the site; however, even though the SW busts the old cache, it still serves up the previous cached version while updating its cache in the background. So visitors who come for the new features don't see them.
Is this the expected experience with ServiceWorkers? What are the recommended strategies to get around this?
It's the expected behavior whenever you serve resources with a cache-first strategy, yes.
There are two options:
Don't use a cache-first strategy. Unfortunately, you lose out on most of the performance benefits of service workers if you use a network-first strategy. I wouldn't recommend going network-first if you can help it.
Adopt the UX pattern of displaying a "Reload for the latest updates" toast message on the screen letting the user know that the cached content has been refreshed, and allowing them to take action to see the latest content. This is, I think, the best approach. If you're using a service worker which gets updated whenever your cached content changes (e.g., one generated by sw-precache), then you can detect these updates by listening for specific service worker controller events, and use those to trigger the message. (Here's an example.)
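A minimal sketch of that detection from the page's window context (showToast() is a hypothetical UI helper):

navigator.serviceWorker.register('/sw.js').then((registration) => {
  registration.addEventListener('updatefound', () => {
    const newWorker = registration.installing;
    newWorker.addEventListener('statechange', () => {
      // 'installed' while a controller already exists means fresh content
      // has been cached and will be served on the next navigation.
      if (newWorker.state === 'installed' && navigator.serviceWorker.controller) {
        showToast('Reload for the latest updates');
      }
    });
  });
});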

Storage of user data

When looking at how websites such as Facebook store profile images, the URLs seem to use randomly generated values. For example, the profile picture for Google's Facebook page has the following URL:
https://scontent-lhr3-1.xx.fbcdn.net/hprofile-xft1/v/t1.0-1/p160x160/11990418_442606765926870_215300303224956260_n.png?oh=28cb5dd4717b7174eed44ca5279a2e37&oe=579938A8
However, why not just organise it like so:
https://scontent-lhr3-1.xx.fbcdn.net/{{ profile_id }}/50x50.png
Clearly this would be much easier in terms of storage and simplicity. Am I missing something? Thanks.
Companies like Facebook have fairly intense CDNs. The URLs may look randomly generated, but they aren't; each individual route is deliberate and programmed to be handled in that manner.
They aren't after simplicity of storage like you would be if you were just using FTP to connect to a basic marketing website server. While you might put all your images in an /images folder, Facebook is much too complex for this: dozens of different types of applications accessing hundreds if not thousands of CDNs and servers worldwide.
If you ever build a web app, such as a Ruby on Rails app, and you work with a service such as AWS (Amazon Web Services), you'll also encounter what seem like nonsensical URLs. But it's all part of the fast delivery network provided within the architecture. Every time you "push" your app up to the server, new URLs are generated automatically for each unique resource: CSS files, JavaScript files, image files, etc., all dynamically created. You don't have to type in each of these unique URLs individually each time you publish the app; the code simply knows where to look for them as part of the publishing process.
Example: you tell the web app to look for
//= require jquery
and it returns you http://example.com/assets/jquery-eb3e278249152b5b5d5170b73d9dbf52.js?body=1 in your header.
It doesn't matter that the URL is more complex than it needs to be; the application recognizes it, and that's all that matters.
Simply put, I think it boils down to two main reasons: security and cache:
Security - adding these long, unpredictable hashes prevents others from guessing photo URLs and makes it pretty hard to download photos you aren't supposed to.
Consider what would happen if I could easily guess your profile photo URL and download it, even when you explicitly chose to share it only with friends.
Cache - by adding "random" query params to each photo, you make sure each photo instance gets its own URL. Thus you can store the photo in the browser's cache for a long time, knowing that whenever you replace it with a new one, the new photo will have a fresh URL and the browser won't keep showing you the old photo.
If you were to keep the same URL for each user's profile photo (e.g. https://scontent-lhr3-1.xx.fbcdn.net/{{ profile_id }}/50x50.png), and then upload a new photo, either one of these can happen:
If you stored the photo in the browser's cache for a long time, the browser will keep showing you the cached version (as long as the URL is the same and the cache hasn't expired, there's no need to re-download the image).
If, instead, you only keep the image in the cache for a short period of time, you end up hitting your server much more than actually needed, increasing the load and hurting performance.
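To make the fresh-URL idea concrete, here is a minimal sketch (Node.js; all paths and names are hypothetical) of deriving a content-fingerprinted URL, so that a newly uploaded photo automatically gets a new URL and the old one can safely be cached for a long time:

const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Hash the file contents so the URL changes whenever the photo changes;
// unchanged photos keep their URL and stay valid in long-lived caches.
function fingerprintedUrl(filePath) {
  const hash = crypto
    .createHash('sha256')
    .update(fs.readFileSync(filePath))
    .digest('hex')
    .slice(0, 16);
  return `/photos/${hash}/${path.basename(filePath)}`;
}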
I hope this clarifies it.
With your route scheme, how would you prevent strangers from accessing the pictures of a private account? The hash also prevents bots from downloading all the pictures.
I get your pain :-) Rather than describing further how this problem comes about, let me speak to a solution. It's normal for code dealing with hashed or even Base64-encoded values to seem like a mess, but with an identifier explaining each part, it isn't so bad!
I used to work at a company where we collated Facebook posts: we used the Graph API to get each post's Insights object, extracted information from it for easy passing around within the UI, and sent it back to our Redis cache store. Once we defined a data structure in TaffyDB describing how an object would be organized, everything just made sense, thanks to its ability to query the useful bits out of a long, junk-looking stream of minified JavaScript.
Refer: http://www.taffydb.com/
The extra values in the URL are useful to:
Track access. This is like when a newspaper appends "&homepage" vs. "&email" to an article URL, so their system knows how a reader found the page.
Avoid abuse and control access. Imagine that a user uploaded a small, popular pornographic image as a profile image. They could then hijack the CDN as a free web host for their porn site. But the extra code in the URL is used internally by the CDN to limit the number of views.

How would HTML5 offline manifest/functionality work with ASP.NET MVC 4?

Ok, I'm building a PoC for a mobile application that needs to have offline capabilities, and I have several questions about whether I'm designing the application correctly and also what behavior I will get from the cache manifest.
This question is about including URLs of Controller actions in both the CACHE section of the manifest as well as in the NETWORK section.
I believe I've read some conflicting information about this online. On a few sites I read that including the wildcard in the NETWORK section would make the browser try to retrieve everything from the server when it's online, and just use whatever is cached if there is no internet connection.
However, this morning I read the following on Dive into HTML5 : Let's take this offline:
The line marked NETWORK: is the beginning of the "online whitelist" section. Resources in this section are never cached and are not available offline. (Attempting to load them while offline will result in an error.)
So, which information is correct? How would the application behave if I added the URL for a controller action in both the CACHE and the NETWORK sections?
I have a very simple and small PoC working so far, and this is what I've observed regarding this question:
I have a controller action that just generates 4 random numbers and sets them on the ViewBag, and the View will display them on a UL.
I'm not using Output caching at all. The only caching comes from the manifest file.
Before adding the manifest attribute to my Layout.cshtml's html tag, each time I requested the View, I'd get different random numbers every time, and a breakpoint set on the controller action would be hit.
The first time I requested the URL/View after adding the manifest attribute, the breakpoint on the controller was hit 3 times (as opposed to just once before). This is already weird, and I'll post a separate question about it; I'm just writing it here for reference.
After the manifest and the resources are cached (verified by looking at the Console window in Chrome Dev Tools), every time I request the View/URL I get the cached version and the breakpoint is never hit again.
This behavior makes me believe that whatever is in the CACHE section will override or ignore anything in the NETWORK section. But, like I said, I'm new to working with this (which is why I'm asking here), and I'm not sure if this is how it's supposed to work or if I'm missing something or not using it correctly.
Any help is greatly appreciated
Here's the relevant section of the cache.manifest:
CACHE MANIFEST
#V1.0
CACHE:
/
/Content/Site.css
/Content/themes/base/jquery-ui.css
NETWORK:
*
/
FALLBACK:
As it turns out, HTML5 AppCache (manifest caching) does work differently than I expected it to.
Here's a quote from whatwg.org, which explains it nicely:
Offline Web Applications
The application cache feature works best if the application logic is
separate from the application and user data, with the logic (markup,
scripts, style sheets, images, etc) listed in the manifest and stored
in the application cache, with a finite number of static HTML pages
for the application, and with the application and user data stored in
Web Storage or a client-side Indexed Database, updated dynamically
using Web Sockets, XMLHttpRequest, server-sent events, or some other
similar mechanism.
Legacy applications, however, tend to be designed so that the user
data and the logic are mixed together in the HTML, with each operation
resulting in a new HTML page from the server.
The mixed-content model does not work well with the application cache
feature: since the content is cached, it would result in the user
always seeing the stale data from the previous time the cache was
updated.
While there is no way to make the legacy model work as fast as the
separated model, it can at least be retrofitted for offline use using
the prefer-online application cache mode. To do so, list all the
static resources used by the HTML page you want to have work offline
in an application cache manifest, use the manifest attribute to select
that manifest from the HTML file, and then add the following line at
the bottom of the manifest:
SETTINGS:
prefer-online
NETWORK:
*
So, as it turns out, the application cache is not a good fit for pages with dynamic information that are rendered on the server; whatwg.org calls these types of apps "legacy".
For a natural fit with the application cache, you'd need to have only the display and generic logic in your HTML page and retrieve any dynamic information through AJAX requests, e.g. as sketched below.
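Using your random-numbers example, the separation might look like this (the /api/randomnumbers endpoint and the #numbers element are hypothetical):

// The static page itself is listed in the manifest's CACHE section;
// only this small request hits the network for fresh data.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/api/randomnumbers');
xhr.onload = function () {
  var numbers = JSON.parse(xhr.responseText);
  document.getElementById('numbers').innerHTML =
    numbers.map(function (n) { return '<li>' + n + '</li>'; }).join('');
};
xhr.send();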
Hope this helps.

Is there a way to query the HTML5 application cache?

Is there a way to query the contents of the HTML5 application cache?
I'm writing an iOS application that uses a lot of cached web content. Before loading a given page when the app is offline, I'd like to check whether the page exists in the cache. If it doesn't, I'll notify the user that they have to be online to see that content; if it does, I'll go ahead and load it.
Now, iOS has its own URL caching system, and I initially just assumed that I could check the contents of the cache this way:
if ([[NSURLCache sharedURLCache] cachedResponseForRequest:myRequest] != nil) {
    // go ahead and load the page
}
else {
    // notify the user that the content isn't available
}
Silly me. It seems that iOS's cache and HTML5's cache are unrelated: -cachedResponseForRequest: returns nil for any request, even when I can see that the URL is in the HTML5 application cache (using the Safari web debugger).
So, is there some way that I can query the contents of the HTML5 application cache? It doesn't matter if the answer uses Objective-C code or Javascript, since I can always just execute the relevant JS from Objective-C.
There are two properties of HTML5 AppCache which mean that in normal operation there shouldn't be a need to do so:
AppCache update operations are atomic: either the entire cache is updated, or none of it is.
Once an AppCache has been created, all files that are in the cache are served from the cache.
The end result is that for any given version of the manifest file, any file listed in it that gets loaded into the browser will be consistent with all the other files listed in the manifest. All you should need to do is check window.applicationCache.status and verify that it is not UNCACHED.
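A one-line sketch of that check:

// Anything other than UNCACHED means the manifest's files are available
// offline (updates are atomic, per the properties above).
if (window.applicationCache.status !== window.applicationCache.UNCACHED) {
  // safe to load any page listed in the manifest
}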
There is another possibility. If you are "lazily adding" files to the AppCache as described in Dive Into HTML5, then it could be that you're not sure which files are cached. In this case you could adapt one of the approaches for detecting online state. I'm not going to give you a fully tested solution, but here is the general idea:
Create a web page containing a unique identifier, something that's unlikely to ever appear normally in a page. The identifier can be in hidden content in an otherwise normal page.
Set this page as the generic FALLBACK in your manifest.
Request pages with AJAX.
Scan the response for the unique identifier; if you find it, then you know the requested page is not in the AppCache.
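A rough sketch of steps 3 and 4 (the sentinel string is hypothetical):

// The generic FALLBACK page contains this sentinel string. If an AJAX
// response includes it, the fallback was served, meaning the requested
// URL is not in the AppCache.
function checkCached(url, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.onload = function () {
    callback(xhr.responseText.indexOf('__FALLBACK_SENTINEL__') === -1);
  };
  xhr.send();
}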
Yes, the cache is stored in the Application.db.
