Workbox Warming Cache Questions - service-worker

I have api's that I am caching in my app. I would like to cache the api while the service worker is installing. I came across warming the cache:
import {cacheNames} from 'workbox-core';
self.addEventListener('install', (event) => {
const urls = [/* ... */];
const cacheName = cacheNames.runtime;
event.waitUntil(caches.open(cacheName).then((cache) => cache.addAll(urls)));
});
If you use strategies configured with a custom cache name you can do the same thing; just assign your custom value to cacheName.
1) I am using custom cache names. Would I use an array for multiple cache names? ie const cacheName = [ 'foo-api', 'bar'api']?
2) The url's I use are regexp /foo/. Will those rexexp urls work here?
3) Will I be able to cache the api while the service worker is installing before the browser consumes the api?

You can add as many items to as many caches as you'd like inside of your install handler.
Workbox can use RegExps for routing incoming fetch requests to an appropriate response handler, and I assume that's what you're referring to here. The answer is no, you can't just provide a RegExp if you want to cache URLs in advance—you need to provide a complete list of URLs.
Any caching that you perform inside of an install handler is guaranteed to happen before the service worker activates, and therefore before your fetch handlers start intercepting requests. So yes, this is a way of ensuring that your caches are pre-populated.
A modification of your code could look like:
self.addEventListener('install', (event) => {
const cacheURLs = async () => {
const cache1 = await caches.open('my-first-cache');
await cache1.addAll([
'/url1',
'/url2',
]);
const cache2 = await caches.open('my-second-cache');
await cache2.addAll([
'/url3',
'/url4',
]);
};
event.waitUntil(cacheURLs());
});

Related

Does a service worker allow one to simply use long expiration headers on static assets?

Say I have a service worker that populates the cache with the following working code when its activated:
async function install() {
console.debug("SW: Installing ...");
const cache = await caches.open(CACHE_VERSION);
await cache.addAll(CACHE_ASSETS);
console.log("SW: Installed");
}
async function handleInstall(event) {
event.waitUntil(install());
}
self.addEventListener("install", handleInstall);
When performs cache.addAll(), will the browser use its own internal cache, or will it always download the content from the site? This is important, because it one creates a new service worker release, and there are new static assets, the old version maybe be cached by the service worker.
If not then I guess one still has to do named hashed/versioned static assets. Something I was hoping service workers would make none applicable.
cache.addAll()'s behavior is described in the service worker specification, but here's a more concise summary:
For each item in the parameter array, if it's a string and not a Request, construct a new Request using that string as input.
Perform fetch() on each request and get a response.
As long as the response has an ok status, call cache.put() to add the response to the cache, using the request as the key.
To answer your question, the most relevant step is 1., as that determines what kind of Request is passed to fetch(). If you just pass in a string, then there are a lot of defaults that will be used when implicitly constructing the Request. If you want more control over what's fetch()ed, then you should explicitly create a Request yourself and pass that to cache.addAll() instead of passing in strings.
For instance, this is how you'd explicitly set the cache mode on all the requests to 'reload', which always skip the browser's normal HTTP cache and go against the network for a response:
// Define your list of URLs somewhere...
const URLS = ['/one.css', '/two.js', '/three.js', '...'];
// Later...
const requests = URLS.map((url) => new Request(url, {cache: 'reload'}));
await cache.addAll(requests);

Downloading whole websites with k6

I'm currently evaluating whether k6 fits our load testing needs. We have a fairly traditional website architecture that uses Apache webservers with PHP und a MySQL database. Sending simple HTTP requests with k6 looks simple enough and I think we will be able to test all major functionality with it, as we don't rely on JavaScript that much and most pages are static.
However, I'm unsure how to deal with resources (stylesheets, images, etc.) that are referenced in the HTML that is returned in the requests. We need to load them as well, as this sometimes leads to database requests, which must be part of the load test.
Is there some out-of-the-box functionality in k6 that allows you to load all the resources like a browser would? I'm aware that k6 does NOT render the page and I don't need it to. I only need to request all the resources inside the HTML.
You basically have two options, both with their caveats:
Record your session - you can either export har directly from the browser as shown there or use an extension made for your browser here is firefox and chromes. Both should be usable without a k6 cloud account you just need to set them to download the har and it will automatically (and somewhat silently) download them when you hit stop. And then either use the in k6 har converter (which is deprecated, but still works) or the new har-to-k6 one which.
This method is particularly good if you have a lot of pages and/or resources and even works if you have a single page style of application as it just gets what the browser requested as a HAR and then transforms it into a script. And if there were no dynamic things that need to be inputed (username/password) the final script can be used as is most of the time.
The biggest problem with this approach is that if you add a css file you need to redo this whole exercise. This is even more problematic if you css/js file name change on each change or something like that. Which is what the next method is good for:
Use parseHTML and then find the elements you care about and make a request for them.
import http from "k6/http";
import {parseHTML} from "k6/html";
export default function() {
const res = http.get("https://stackoverflow.com");
const doc = parseHTML(res.body);
doc.find("link").toArray().forEach(function (item) {
console.log(item.attr("href"));
// make http gets for it
// or added them to an array and make one batch request
});
}
will produce
NFO[0001] https://cdn.sstatic.net/Sites/stackoverflow/img/favicon.ico?v=4f32ecc8f43d
INFO[0001] https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon.png?v=c78bd457575a
INFO[0001] https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon.png?v=c78bd457575a
INFO[0001] /opensearch.xml
INFO[0001] https://cdn.sstatic.net/Shared/stacks.css?v=53507c7c6e93
INFO[0001] https://cdn.sstatic.net/Sites/stackoverflow/primary.css?v=d3fa9a72fd53
INFO[0001] https://cdn.sstatic.net/Shared/Product/product.css?v=c9b2e1772562
INFO[0001] /feeds
INFO[0001] https://cdn.sstatic.net/Shared/Channels/channels.css?v=f9809e9ffa90
As you can see some of the urls are relative and not absolute so you will need to handle this. And in this example only some are css, so probably more filtering is needed.
The problem here is that you need to write the code and if you add a relative link or something else you need to handle it. Luckily k6 is scriptable so you can reuse the code :D.
I've followed Михаил Стойков suggestion and written my own function to load resources. You can set the way resources are loaded (batch or sequential gets with options.concurrentResourceLoading).
/**
* #param {http.RefinedResponse<http.ResponseType>} response
*/
export function getResources(response) {
const resources = [];
response
.html()
.find('*[href]:not(a)')
.each((index, element) => {
resources.push(element.attributes().href.value);
});
response
.html()
.find('*[src]:not(a)')
.each((index, element) => {
resources.push(element.attributes().src.value);
});
if (options.concurrentResourceLoading) {
const responses = http.batch(
resources.map((r) => {
return ['GET', resolveUrl(r, response.url), null, {
headers: createHeader()
}];
})
);
responses.forEach(() => {
check(response, {
'resource returns status 200': (r) => r.status === 200,
});
});
} else {
resources.forEach((r) => {
const res = http.get(resolveUrl(r, response.url), {
headers: createHeader(),
});
!check(res, {
'resource returns status 200': (r) => r.status === 200,
});
});
}
}

Workbox redirect the clients page when resource is not cached and offline

Usually whenever I read a blog post about PWA's, the tutorial seems to just precache every single asset. But this seems to go against the app shell pattern a bit, which as I understand is: Cache the bare necessities (only the app shell), and runtime cache as you go. (Please correct me if I understood this incorrectly)
Imagine I have this single page application, it's a simple index.html with a web component: <my-app>. That <my-app> component sets up some routes which looks a little bit like this, I'm using Vaadin router and web components, but I imagine the problem would be the same using React with React Router or something similar.
router.setRoutes([
{
path: '/',
component: 'app-main', // statically loaded
},
{
path: '/posts',
component: 'app-posts',
action: () => { import('./app-posts.js');} // dynamically loaded
},
/* many, many, many more routes */
{
path: '/offline', // redirect here when a resource is not cached and failed to get from network
component: 'app-offline', // also statically loaded
}
]);
My app may have many many routes, and may get very large. I don't want to precache all those resources straight away, but only cache the stuff I absolutely need, so in this case: my index.html, my-app.js, app-main.js, and app-offline.js. I want to cache app-posts.js at runtime, when it's requested.
Setting up runtime caching is simple enough, but my problem arises when my user visits one of the potentially many many routes that is not cached yet (because maybe the user hasn't visited that route before, so the js file may not have loaded/cached yet), and the user has no internet connection.
What I want to happen, in that case (when a route is not cached yet and there is no network), is for the user to be redirected to the /offline route, which is handled by my client side router. I could easily do something like: import('./app-posts.js').catch(() => /* redirect user to /offline */), but I'm wondering if there is a way to achieve this from workbox itself.
So in a nutshell:
When a js file hasn't been cached yet, and the user has no network, and so the request for the file fails: let workbox redirect the page to the /offline route.
Option 1 (not always useful):
As far as I can see and according to this answer, you cannot open a new window or change the URL of the browser from within the service worker. However you can open a new window only if the clients.openWindow() function is called from within the notificationclick event.
Option 2 (hardest):
You could use the WindowClient.navigate method within the activate event of the service worker however is a bit trickier as you still need to check if the file requested exists in the cache or not.
Option 3 (easiest & hackiest):
Otherwise, you could respond with a new Request object to the offline page:
const cacheOnly = new workbox.strategies.CacheOnly();
const networkFirst = new workbox.strategies.NetworkFirst();
workbox.routing.registerRoute(
/\/posts.|\/articles/,
async args => {
const offlineRequest = new Request('/offline.html');
try {
const response = await networkFirst.handle(args);
return response || await cacheOnly.handle({request: offlineRequest});
} catch (error) {
return await cacheOnly.handle({request: offlineRequest})
}
}
);
and then rewrite the URL of the browser in your offline.html file:
<head>
<script>
window.history.replaceState({}, 'You are offline', '/offline');
</script>
</head>
The above logic in Option 3 will respond to the requested URL by using the network first. If the network is not available will fallback to the cache and even if the request is not found in the cache, will fetch the offline.html file instead. Once the offline.html file is parsed, the browser URL will be replaced to /offline.

Workbox precache putting items inside -temp cache, and not using it later [duplicate]

To enable my app running offline. During installation the service worker should:
fetch a list of URLs from an async API
reformat the response
add all URLs in the response to the precache
For this task I use Googles Workbox in combination with Webpack.
The problem: While the service worker successfully caches all the Webpack assets (which tells me that the workbox basically does what it should), it does not wait for the async API call to cache the additional remote assets. They are simply ignored and neither cached nor ever fetched in the network.
Here is my service worker code:
importScripts('https://storage.googleapis.com/workbox-cdn/releases/3.1.0/workbox-sw.js');
workbox.skipWaiting();
workbox.clientsClaim();
self.addEventListener('install', (event) => {
const precacheController = new
workbox.precaching.PrecacheController();
const preInstallUrl = 'https://www.myapiurl/Assets';
event.waitUntil(fetch(preInstallUrl)
.then(response => response.json()
.then((Assets) => {
Object.keys(Assets.data.Assets).forEach((key) => {
precacheController.addToCacheList([Assets.data.Assets[key]]);
});
})
);
});
self.__precacheManifest = [].concat(self.__precacheManifest || []);
workbox.precaching.suppressWarnings();
workbox.precaching.precacheAndRoute(self.__precacheManifest, {});
workbox.routing.registerRoute(/^.*\.(jpg|JPG|gif|GIF|png|PNG|eot|woff(2)?|ttf|svg)$/, workbox.strategies.cacheFirst({ cacheName: 'image-cache', plugins: [new workbox.cacheableResponse.Plugin({ statuses: [0, 200] }), new workbox.expiration.Plugin({ maxEntries: 600 })] }), 'GET');
And this is my webpack configuration for the workbox:
new InjectManifest({
swDest: 'sw.js',
swSrc: './src/sw.js',
globPatterns: ['dist/*.{js,png,html,css,gif,GIF,PNG,JPG,jpeg,woff,woff2,ttf,svg,eot}'],
maximumFileSizeToCacheInBytes: 5 * 1024 * 1024,
})
It looks like you're creating your own PrecacheController instance and also using the precacheAndRoute(), which aren't actually intended to be used together (not super well explained in the docs, it's only mentioned in this one place).
The problem is the helper methods on workbox.precaching.* actually create their own PrecacheController instance under the hood. Since you're creating your own PrecacheController instance and also calling workbox.precaching.precacheAndRoute([...]), you'll end up with two PrecacheController instances that aren't working together.
From your code sample, it looks like you're creating a PrecacheController instance because you want to load your list of files to precache at runtime. That's fine, but if you're going to do that, there are a few things to be aware of:
Your SW might not update
Service worker updates are usually triggered when you call navigator.serviceWorker.register() and the browser detects that the service worker file has changed. That means if you change what /Assets returns but the contents of your service worker files haven't change, your service worker won't update. This is why most people hard-code their precache list in their service worker (since any changes to those files will trigger a new service worker installation).
You'll have to manually add your own routes
I mentioned before that workbox.precaching.precacheAndRoute([...]) creates its own PrecacheController instance under the hood. It also adds its own fetch listener manually to respond to requests. That means if you're not using precacheAndRoute(), you'll have to create your own router and define your own routes. Here are the docs on how to create routes: https://developers.google.com/web/tools/workbox/modules/workbox-routing.
I realised my mistake. I hope this helps others as well. The problem was that I did not call precacheController.install() manually. While this function will be executed automatically it will not wait for additional precache files that are inserted asynchronously. This is why the function needs to be called after all the precaching happened. Here is the working code:
importScripts('https://storage.googleapis.com/workbox-cdn/releases/3.1.0/workbox-sw.js');
workbox.skipWaiting();
workbox.clientsClaim();
const precacheController = new workbox.precaching.PrecacheController();
// Hook into install event
self.addEventListener('install', (event) => {
// Get API URL passed as query parameter to service worker
const preInstallUrl = new URL(location).searchParams.get('preInstallUrl');
// Fetch precaching URLs and attach them to the cache list
const assetsLoaded = fetch(preInstallUrl)
.then(response => response.json())
.then((values) => {
Object.keys(values.data.Assets).forEach((key) => {
precacheController.addToCacheList([values.data.Assets[key]]);
});
})
.then(() => {
// After all assets are added install them
precacheController.install();
});
event.waitUntil(assetsLoaded);
});
self.__precacheManifest = [].concat(self.__precacheManifest || []);
workbox.precaching.suppressWarnings();
workbox.precaching.precacheAndRoute(self.__precacheManifest, {});
workbox.routing.registerRoute(/^.*\.(jpg|JPG|gif|GIF|png|PNG|eot|woff(2)?|ttf|svg)$/, workbox.strategies.cacheFirst({ cacheName: 'image-cache', plugins: [new workbox.cacheableResponse.Plugin({ statuses: [0, 200] }), new workbox.expiration.Plugin({ maxEntries: 600 })] }), 'GET');

Apostrophe CMS: Do we have middleware/hook which can set global.data from API?

My requiment is I have an API which will provide user data. In the Apostrophe CMS I need to access the user data from all the layouts (Header, Main, Footer).
I can see gobal.data which is avaiable everywhere in the template. Likewise I need a hook which will call the API and store the response data in the Apostrophe's global.data.
Please let me know if you need further informations.
You could hit that API on every page render:
// index.js of some apostrophe module
// You should `npm install request-promise` first
const request = require('request-promise');
module.exports = {
construct: function(self, options) {
self.on('apostrophe-pages:beforeSend', async function(req) {
const apiInfo = await request('http://some-api.com/something');
req.data.apiInfo = apiInfo;
// now in your templates you can access `data.apiInfo`
});
}
}
But this will hit that API on every single request which will of course make your site slow down. So I would recommend that you cache the information for some period of time.

Resources