Using Workbox to manage caches with several service worker clients - service-worker

I've recently been implementing service workers on our site with Workbox. Due to the structure of our project we're implementing a service worker for each page, for instance:
/foo/XXX/
/foo/XYZ/
/foo/XXY/
This means we end up registering a separate service worker for each of these pages.
On the other hand, we're using precaching in our build process in order to precache css and js assets.
I know Workbox creates two caches, one for precaching and one for runtime caching. Because we have several service workers, our customers get a new cache entry each time they visit a new page:
workbox-precache-https://www.example.com/foo/XXX-https://www.example.com
workbox-precache-https://www.example.com/foo/XYZ-https://www.example.com
workbox-precache-https://www.example.com/foo/XXY-https://www.example.com
I know Workbox provides an option to set the cache name.
workbox.core.setCacheNameDetails({
  prefix: 'my-app',
  suffix: 'v1',
  precache: 'custom-precache-name',
  runtime: 'custom-runtime-name'
});
My question is: can I use this option to set a single, shared cache name? My idea is that if all assets live in the same cache, Workbox will be in charge of deduplicating entries and managing the cache. Does that make sense?
Thanks a lot

If you call workbox.core.setCacheNameDetails({suffix: 'my-suffix'}) at the very start of your service worker script, and you do that for each service worker registered on your origin, that would be enough to have all of the service workers use a common cache for their precached assets. (Normally the scope of the current service worker is used as the suffix, to prevent collisions and ensure that each service worker gets its own cache, so you'd be overriding that behavior.)
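For illustration only, here's a minimal sketch of what the top of each page's service worker could look like, assuming the CDN-hosted workbox-sw loader and Workbox v4 syntax; the suffix value and the manifest entry are placeholders, since your build tool would normally inject the real manifest:

// /foo/XXX/sw.js -- the same pattern would go in every page's service worker.
importScripts('https://storage.googleapis.com/workbox-cdn/releases/4.3.1/workbox-sw.js');

// Override the default per-scope suffix so every service worker
// on the origin shares one precache.
workbox.core.setCacheNameDetails({suffix: 'shared'});

// Placeholder manifest entries; in a real build these are injected
// by workbox-build / workbox-webpack-plugin.
workbox.precaching.precacheAndRoute([
  {url: '/static/app.css', revision: 'abc123'},
]);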
But... I'd be hesitant to actually do this, or at least to test thoroughly before you do, as you're opening yourself up to possible issues. Some things that I'd worry about:
Normally, the install and activate service worker lifecycle events are used to trigger downloading new assets (install) and deleting out-of-date assets (activate). By default (unless you're using skipWaiting), the activate step won't fire until all tabs with active clients are closed, to ensure that nothing is deleted while it's still being used by a tab. If you have multiple service workers, each with their own scope and their own lifecycle events, managing the same cache via precaching, then one service worker's activate event might fire while a tab controlled by a different service worker is still open. This could cause entries to be deleted from the precache while they might still be needed by that second tab.
I'd be worried about any relative URLs in your precache manifest, as each of those relative URLs would be resolved using the location of the current service worker as the base. If each of the paths of your site have different URL structures, or if /foo/XXX/app.js is fundamentally different than /foo/XYZ/app.js, then an entry of ./app.js in a precache manifest will end up being pretty dangerous if you share a single cache.
What I'd recommend as an alternative, if you really can't go with a single, higher-level service worker, is not to force all the precached assets into a single cache but instead maintain separate, potentially smaller precaches for each service worker, and then use runtime caching with a common cacheName parameter to share the resources that you know are common. I think that's much less likely to be error prone.
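As a rough sketch of that alternative, assuming the Workbox v4 global API and a made-up cache name, the runtime route in each service worker could look something like this:

// Shared runtime cache for assets you know are common across pages.
workbox.routing.registerRoute(
  /\.(?:js|css)$/,
  new workbox.strategies.CacheFirst({
    // Every service worker on the origin passes this same cacheName,
    // so the common resources end up in one shared runtime cache.
    cacheName: 'shared-runtime-assets',
  })
);

Each service worker keeps its own (smaller) precache, while anything matched by this route is written to and read from the one shared cache.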

Related

New build deployed in a domain (https://example.com) is not getting reflected as the previous build has a service worker running [duplicate]

I'm playing with the service worker API on my computer so I can grasp how I can benefit from it in my real-world apps.
I came across a weird situation where I registered a service worker which intercepts the fetch event so it can check its cache for the requested content before sending a request to the origin.
The problem is that this code has an error which prevents the function from making the request, so my page is left blank; nothing happens.
As the service worker has been registered, the second time I load the page it intercepts the very first request (the one which loads the HTML). Because of this bug, that fetch handler fails, the HTML is never requested, and all I see is a blank page.
In this situation, the only way I know to remove the bad service worker script is through chrome://serviceworker-internals/ console.
If this error gets to a live website, what is the best way to solve it?
Thanks!
I wanted to expand on some of the other answers here, and approach this from the point of view of "what strategies can I use when rolling out a service worker to production to ensure that I can make any needed changes?" Those changes might include fixing any minor bugs that you discover in production, or they might (but hopefully don't) include neutralizing the service worker due to an insurmountable bug—a so-called "kill switch".
For the purposes of this answer, let's assume you call
navigator.serviceWorker.register('service-worker.js');
on your pages, meaning your service worker JavaScript resource is service-worker.js. (See below if you're not sure of the exact service worker URL that was used—perhaps because you added a hash or versioning info to the service worker script.)
The question boils down to how you go about resolving the initial issue in your service-worker.js code. If it's a small bug fix, then you can obviously just make the change and redeploy your service-worker.js to your hosting environment. If there's no obvious bug fix, and you don't want to leave your users running the buggy service worker code while you take the time to work out a solution, it's a good idea to keep a simple, no-op service-worker.js handy, like the following:
// A simple, no-op service worker that takes immediate control.
self.addEventListener('install', () => {
  // Skip over the "waiting" lifecycle state, to ensure that our
  // new service worker is activated immediately, even if there's
  // another tab open controlled by our older service worker code.
  self.skipWaiting();
});

/*
self.addEventListener('activate', () => {
  // Optional: Get a list of all the current open windows/tabs under
  // our service worker's control, and force them to reload.
  // This can "unbreak" any open windows/tabs as soon as the new
  // service worker activates, rather than users having to manually reload.
  self.clients.matchAll({type: 'window'}).then(windowClients => {
    windowClients.forEach(windowClient => {
      windowClient.navigate(windowClient.url);
    });
  });
});
*/
That should be all your no-op service-worker.js needs to contain. Because there's no fetch handler registered, all navigation and resource requests from controlled pages will end up going directly against the network, effectively giving you the same behavior you'd get if there were no service worker at all.
Additional steps
It's possible to go further, and forcibly delete everything stored using the Cache Storage API, or to explicitly unregister the service worker entirely. For most common cases, that's probably going to be overkill, and following the above recommendations should be sufficient to get you in a state where your current users get the expected behavior, and you're ready to redeploy updates once you've fixed your bugs. There is some degree of overhead involved with starting up even a no-op service worker, so you can go the route of unregistering the service worker if you have no plans to redeploy meaningful service worker code.
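If you do decide to go further, here's a hedged sketch of what that heavier-handed cleanup could look like inside the no-op worker's activate handler; only do this if you're sure nothing else on the origin needs those caches:

// Optional, heavier-handed cleanup inside the no-op service worker.
self.addEventListener('activate', event => {
  event.waitUntil(
    // Delete every cache this origin created via the Cache Storage API...
    caches.keys()
      .then(cacheNames => Promise.all(cacheNames.map(name => caches.delete(name))))
      // ...and, if you have no plans to ship another service worker,
      // unregister this one entirely.
      .then(() => self.registration.unregister())
  );
});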
If you're already in a situation in which you're serving service-worker.js with HTTP caching directives giving it a lifetime that's longer than your users can wait for, keep in mind that a Shift + Reload on desktop browsers will force the page to reload outside of service worker control. Not every user will know how to do this, and it's not possible on mobile devices, though. So don't rely on Shift + Reload as a viable rollback plan.
What if you don't know the service worker URL?
The information above assumes that you know what the service worker URL is—service-worker.js, sw.js, or something else that's effectively constant. But what if you included some sort of versioning or hash information in your service worker script, like service-worker.abcd1234.js?
First of all, try to avoid this in the future—it's against best practices. But if you've already deployed a number of versioned service worker URLs and you need to disable things for all users, regardless of which URL they might have registered, there is a way out.
Every time a browser makes a request for a service worker script, regardless of whether it's an initial registration or an update check, it will set an HTTP request header called Service-Worker:.
Assuming you have full control over your backend HTTP server, you can check incoming requests for the presence of this Service-Worker: header, and always respond with your no-op service worker script response, regardless of what the request URL is.
The specifics of configuring your web server to do this will vary from server to server.
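As one illustration only, assuming a Node/Express front end and a placeholder path for the no-op script, the check could look roughly like this:

const express = require('express');
const app = express();

app.use((req, res, next) => {
  // Browsers include the Service-Worker header on every request for a
  // service worker script (initial registrations and update checks alike).
  if (req.get('Service-Worker')) {
    // Always answer with the no-op kill-switch worker, whatever URL was requested.
    return res.sendFile('/srv/static/noop-service-worker.js'); // placeholder path
  }
  next();
});

app.listen(8080);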
The Clear-Site-Data: response header
A final note: some browsers will automatically clear out specific data and potentially unregister service workers when a special HTTP response header is returned as part of any response: Clear-Site-Data:.
Setting this header can be helpful when trying to recover from a bad service worker deployment, and kill-switch scenarios are included in the feature's specification as an example use case.
It's important to check the browser support story for Clear-Site-Data: before your rely solely on it as a kill-switch. As of July 2019, it's not supported in 100% of the browsers that support service workers, so at the moment, it's safest to use Clear-Site-Data: along with the techniques mentioned above if you're concerned about recovering from a faulty service worker in all browsers.
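For example, a hedged Express sketch of attaching the header while you're in recovery mode; "cache" clears the HTTP cache for the origin, and "storage" clears origin storage, which includes service worker registrations:

const express = require('express');
const app = express();

app.use((req, res, next) => {
  // The quotes around each directive are required by the Clear-Site-Data spec.
  res.set('Clear-Site-Data', '"cache", "storage"');
  next();
});

app.listen(8080);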
You can unregister the service worker using JavaScript.
Here is an example:
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.getRegistrations().then(function (registrations) {
    // Returns the list of installed service worker registrations.
    if (registrations.length) {
      for (const registration of registrations) {
        registration.unregister();
      }
    }
  });
}
That's a really nasty situation, that hopefully won't happen to you in production.
In that case, if you don't want to go through the developer tools of the different browsers (chrome://serviceworker-internals/ for Blink-based browsers; about:serviceworkers, or about:debugging#workers in the future, for Firefox), there are two things that come to my mind:
Use the service worker update mechanism. Your user agent will check whether the registered worker has changed, fetch the new version, and go through the activate phase again. So you can potentially change the service worker script, fix any weird situation (purge caches, etc.), and continue working. The only downside is that you will need to wait until the browser updates the worker, which could take up to a day.
Add some kind of kill switch to your worker: have a special URL you can point users to that restores the state of your caches, etc.
I'm not sure if clearing your browser data will remove the worker, so that could be another option.
I haven't tested this, but there are unregister() and update() methods on the ServiceWorkerRegistration object. You can get the registration from navigator.serviceWorker.
navigator.serviceWorker.getRegistration('/').then(function(registration) {
  registration.update();
});
update() should then immediately check if there is a new service worker and, if so, install it. This bypasses the 24-hour waiting period and will download the service worker script every time this JavaScript is encountered.
For live situations you need to alter the service worker at the byte level (put a comment on the first line, for instance) and it will be updated within the next 24 hours. You can emulate this via chrome://serviceworker-internals/ in Chrome by clicking the Update button.
This should work even in situations where the service worker itself got cached, as step 9 of the update algorithm sets a flag to bypass the service worker.
We had moved a site from godaddy.com to a regular WordPress install. The client (not us) had a service worker file (sw.js) cached in all their browsers, which completely messed things up. Our site, a normal WordPress site, has no service workers.
It's like a virus, in that it's on every page, it does not come from our server and there is no way to get rid of it easily.
We made a new empty file called sw.js on the root of the server, then added the following to every page on the site.
<script>
if (navigator && navigator.serviceWorker && navigator.serviceWorker.getRegistration) {
  navigator.serviceWorker.getRegistration('/').then(function(registration) {
    if (registration) {
      registration.update();
      registration.unregister();
    }
  });
}
</script>
In case it helps someone else, I was trying to kill off service workers that were running in browsers that had hit a production site that used to register them.
I solved it by publishing a service-worker.js that contained just this:
self.globalThis.registration.unregister();

Aren't PWAs user unfriendly if the service worker is not immediately active?

I posted another question as a brute-force solution to this one (Angular: fully install service worker before anything else) but I thought I'd make a separate one to discuss the use case for when a service worker is used as intended.
According to the service worker lifecycle (https://developers.google.com/web/fundamentals/primers/service-workers/lifecycle), the SW is installed but it's only active once you then reload the page (you can claim() the page, but that only applies to calls that happen after the service worker is installed). The reasoning is that if an existing version is updated, the old one and the new one do not mix states and caches. I can agree with that decision.
What I have trouble understanding is why it is not immediately active once it is initially installed. Instead, it requires a page reload unless you explicitly define precaching rules in the SW. If you define caching rules with wildcards, it's not possible to precache those so you need the reload.
Given a single-page PWA (like Angular), a user will discover the site and browse around on it, but the page will never be reloaded during that session. If they then want to use the site offline later, they need to have refreshed or re-opened the tab at least one other time. That seems like a pretty big pitfall to me.
Am I missing something here?
Your understanding of the service worker lifecycle is correct but I do not think the pitfall you mentioned is as severe as you think it is.
If I understand you correctly, the user experience will only be negatively affected if the user loses connectivity during the initial browsing of the page (before the service worker is active) and is missing an offline asset. If this is truly a scenario you want to account for then that offline asset can be pre-cached in the browser-side javascript. Alternatively, as you mentioned, you can skipWaiting() and claim() to make the service worker active without the user refreshing the page.
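For reference, a minimal sketch of that opt-in, using only standard service worker APIs and assuming you accept the consistency trade-offs it implies:

self.addEventListener('install', event => {
  // Don't wait for old tabs to close; move straight past the "waiting" state.
  self.skipWaiting();
});

self.addEventListener('activate', event => {
  // Take control of already-open pages, including the one that registered us.
  event.waitUntil(self.clients.claim());
});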

Why is self.skipWaiting() and self.clients.claim() not default behaviour for service workers

I'm researching service workers for my thesis. I understand how the lifecycle works, but I'm having trouble understanding the default update behaviour of service workers.
When installing a new service worker, while an old one is installed, the service worker will have to wait to activate. With self.skipWaiting() and self.clients.claim() it is possible to fully activate the service worker and control the pages. I don't get why this is not default behaviour. The main reason I can find is to preserve code and data consistency (https://redfin.engineering/service-workers-break-the-browsers-refresh-button-by-default-here-s-why-56f9417694). With some basic understanding of the lifecycle, shouldn't it be possible to preserve both code and data consistency when a service worker updates or am I missing something? Are there any additional reasons?
Also has this behaviour been different in the past? Have skipWaiting() and clients.claim() been added afterwards?
The default - as it is now - is safer in general and doesn't force everyone to come up with all sorts of solutions.
User loads page with main1.js, SWv1 registers 1 second later, site now fully cached
User loads the page again - this time from cache by SWv1, super fast. New SWv2 registers 1 second later, caches new assets (main1.js is now main2.js), takes control via skipWaiting and clientsClaim
Two things can happen now:
Page has loaded with main1.js and the browser has executed whatever that script said. User has interacted with the page etc. Page is running main1.js which expects to be talking to SWv1 but actually the SW in control is SWv2. The script, main1.js, could be sending messages and trying to interact with the SW in a way that only SWv1 understood but v2 doesn't have any idea about. Now the page breaks because of the mismatch.
SWv1 cached all assets that site v1 needed. Thus if main1.js was to lazyload something etc. when user interacted with the page, browser would get that from the cache. As SWv2 has taken control and cached its idea of the assets (these are now newer assets), when main1.js tries to lazyload something originally cached by SWv1 it's not found. Also, because this is now a new deployment, the asset is not on the HTTP server anymore. It would have been in caches handled by SWv1 but SWv2 doesn't know about it. SWv2 knows about a newer version of that file. Page breaks.
It is important to understand that this might not be the case for every site/SW combination. If you have very little logic in the SW script and the main.js doesn't communicate with sw.js too much, it is possible to build a combination where skipWaiting and clientsClaim don't cause any problems. You can also code in such a way that if an error happens, you'll show the user a notification to refresh.
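As a sketch of that last idea, page-side code (not part of the service worker) could watch for a new worker taking control and prompt a reload; showRefreshBanner() is a hypothetical helper you'd implement yourself:

if ('serviceWorker' in navigator) {
  let refreshing = false;
  navigator.serviceWorker.addEventListener('controllerchange', () => {
    if (refreshing) return; // guard against reload loops
    refreshing = true;
    // A new service worker (e.g. one that called skipWaiting/clients.claim)
    // is now controlling the page; ask the user to reload.
    showRefreshBanner();
  });
}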

How to deploy updates to service workers running on customers' sites?

Suppose I provide a push notification service used by different websites. This service requires a service worker to be installed on my customers' sites. I want the architecture to have a few properties:
Completely static resources. The process of installing the service worker files and configuring the JS snippet, etc. only needs to be done once.
The ability to update the service worker at any time. I want to be able to update the service worker at any time to fix bugs, deploy improvements, etc.
Satisfying both of these constraints is difficult because the browser only installs a new version of the service worker if the content of service worker script itself changes. (That is, without considering dependencies specified via importScripts().)
If you can't change the contents of the service worker itself, consider creating a "new" service worker by appending a hash to the service worker URL. This will cause the browser to install the "new" service worker every time the hash changes.
That is, replace code like
navigator.serviceWorker.register("/sw.js");
with
navigator.serviceWorker.register(`/sw.js?hash=${HASH}`);
When the "new" service worker is installed, the browser will re-check all imported scripts. (This applies even if the "new" service worker is byte-for-byte identical to the "old" one, because the URLs are different.)
How to generate the hash?
There are a few different ways to generate the hash. A completely random HASH will lead to the browser updating the service worker on every page load, which is unlikely to be what you want.
Two different approaches:
(Best) You know when the imported script changes. In this case, only change HASH when the imported script changes. Ideally, HASH would be a hash of the contents of the imported script itself.
(Okay) Derive the hash from the time. Math.floor(Date.now() / (3600 * 1000)) will cause a "new" service worker to be installed every hour, which will also result in the dependencies being checked. (You'll probably also want to apply some jitter to avoid all clients updating at the same time.)
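Here is a hedged sketch of that time-derived approach, with a per-client jitter kept in localStorage so clients don't all roll over to a "new" service worker URL at the same moment; the storage key is an arbitrary choice:

const HOUR = 3600 * 1000;
// Pick a stable, per-client random offset within the hour.
let jitter = Number(localStorage.getItem('sw-update-jitter'));
if (!jitter) {
  jitter = Math.floor(Math.random() * HOUR);
  localStorage.setItem('sw-update-jitter', String(jitter));
}
// Changes roughly once an hour, at a different moment for each client.
const HASH = Math.floor((Date.now() + jitter) / HOUR);
navigator.serviceWorker.register(`/sw.js?hash=${HASH}`);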
Suggested architecture
If you provide a service-worker backed service to other websites (e.g. a push notification service), you can provide a completely static service worker and JS install snippet to your customers which allows you to control and trigger updates completely from your site.
Code for customer.com (all static):
JS snippet for the customer to include on all HTML pages of their site (static):
<script src="https://provider.com/register-sw.js?customer_id=8932e4288bc8">
</script>
Service worker for the customer to install at https://customer.com/sw.js (static):
importScripts("https://provider.com/imported-sw.js?customer_id=8932e4288bc8");
Code for your site (can be dynamic):
Service worker registration code on your site at https://provider.com/register-sw.js?customer_id=8932e4288bc8 (dynamic):
// hash_of_file() is pseudocode: your server computes the hash of the
// current imported-sw.js contents and injects it when serving this script.
const HASH = hash_of_file("imported-sw.js");
navigator.serviceWorker.register(`/sw.js?hash=${HASH}`);
"Real" service worker on your site at https://provider.com/imported-sw.js?customer_id=8932e4288bc8 (dynamic):
// when service worker is updated, all clients receive
// update because service worker itself is "new"
self.addEventListener(...);
NOTE: The byte-for-byte requirement is in the process of being changed so that this check extends to imported scripts by default (not just the registered URL itself), but as of April 2017 no browser implements this behavior.

Multiple requests are being made from service worker to cache a resource

While working on building progressive web apps, we are facing weird behaviour from the service worker. Steps to reproduce:
1. Clear cache and unregister service worker
2. Go to www.example.com
3. Examine the network calls for resources (JS/CSS)
Expected result:
Only single network request should go for one resource.
Actual result:
Two network requests are being made for each resource.
What you're seeing is sw-precache fetching the resources to populate its caches. That happens independently of the initial request made by the controlled page. It's a fairly common model, whether or not you're using sw-precache.
(As an aside, I see that you're explicitly versioning your JS and CSS resources, which is great. You'll notice that sw-precache appends a cache-busting URL parameter to its precaching requests right now, meaning that they'll always go against the network instead of the HTTP browser cache. The upcoming 4.0.0 release of sw-precache, which you can use now via the master branch, has a new dontCacheBustUrlsMatching option, which allows you to opt out of cache-busting for resources that you're explicitly versioning via filenames. Using that option means that the additional sw-precache request to populate its caches will be fulfilled via the HTTP browser cache, skipping a trip to the network.)
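For example, a hedged sketch of an sw-precache 4.x+ build config using that option; the globs, output path, and hash pattern are assumptions about how your files are named:

const swPrecache = require('sw-precache');

swPrecache.write('service-worker.js', {
  staticFileGlobs: ['dist/**/*.{js,css,html}'],
  stripPrefix: 'dist/',
  // Skip cache-busting for URLs that already carry a content hash in the
  // filename (e.g. main.a1b2c3d4.js), so the precaching requests can be
  // satisfied from the HTTP browser cache.
  dontCacheBustUrlsMatching: /\.\w{8}\./,
}).then(() => console.log('service-worker.js written'));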
