NetworkFirst cache strategy not saving the html content of an url - service-worker

I am using a service worker with Workbox for the first time and have no prior experience. When a user visits a URL, I want the corresponding HTML to be saved to the cache so that, when the app is offline, the content of that URL is served from the cache. But I don't see the content in the cache in the browser devtools. What am I missing here?
import { registerRoute } from "workbox-routing";
import { precacheAndRoute } from "workbox-precaching";
import {skipWaiting} from 'workbox-core';
import {NetworkFirst, CacheFirst} from 'workbox-strategies';
// ...
function jsonResponse(value) {
return new Response(JSON.stringify(value), {
headers: { "Content-Type": "application/json" },
status: 202,
});
}
console.log("Workbox is loaded");
skipWaiting();
/* injection point for manifest files. */
precacheAndRoute(self.__WB_MANIFEST);
const pagesCache = new NetworkFirst({
cacheName: 'pages-cache',
});
registerRoute(
new RegExp("/projects/[-a-z0-9]+/edit$"),
args => {
console.log('Got in here..');
return pagesCache.handle(args);
}
);
so when I go to offline I see.

Your code is mixing both precaching and runtime caching parts from the Workbox library. Some parts of it might work, some not.
By saying "I want when a user visits a url, the corresponding html must be saved to cache, so when the app is offline, the content of the url serves from the cache." it seems that you want what is called a network first caching strategy. The simplest configuration for that using Workbox would be:
import {registerRoute} from 'workbox-routing';
import {NetworkFirst} from 'workbox-strategies';
registerRoute(
new RegExp('/'),
new NetworkFirst()
);
More info about the strategy here: https://developers.google.com/web/tools/workbox/modules/workbox-strategies#network_first_network_falling_back_to_cache
I also suggest you read a primer on Service Workers themselves. You can find one here, for example: https://developers.google.com/web/fundamentals/primers/service-workers/
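To make the behaviour concrete, here is a minimal sketch of what a network-first strategy does, written as plain JavaScript rather than with Workbox. It is not Workbox's implementation: `networkFirst`, `fetchFn` and `cache` are names assumed for illustration, and the fetch function and cache are passed in so the logic can run outside a service worker.

```javascript
// Network-first: try the network, keep a copy on success, fall back to the cache on failure.
// In a real service worker, `fetchFn` would be `fetch` and `cache` a Cache from `caches.open(...)`.
async function networkFirst(request, fetchFn, cache) {
  try {
    const response = await fetchFn(request);
    if (response.ok) {
      // Store a clone: a Response body can only be read once.
      await cache.put(request, response.clone());
    }
    return response;
  } catch (networkError) {
    const cached = await cache.match(request);
    if (cached) {
      return cached;
    }
    // Nothing cached either: surface the original network failure.
    throw networkError;
  }
}
```

The point to notice is that a page only ends up in the cache after it has been fetched successfully at least once while the route was active; a URL that was never visited online has nothing to fall back to.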

Related

Notify a web page served from service-worker cache that it has been served from SW cache

I developed a service worker that serves pages network-first (caching them as it goes) and, when offline, serves them from the cache.
Ideally, I would like to inform the user (with a banner or something similar) that the page was served from the cache because we detected that they were offline.
Do you have an idea of how to implement this?
Some ideas I had (but didn't manage to implement):
Inject some code into the cached response body (for example, JS code that triggers an offline-detected event, which may or may not have a listener on the page, depending on whether I want that page to react while offline).
=> I couldn't find how to append content to the body of a response coming from the cache.
Send a postMessage from the service worker to the page telling it that it was rendered from cached content.
=> This doesn't seem possible, as no MessagePort is available in the service worker's fetch event, so I can't send any postMessage() to the page.
If you have any idea how to implement this, I would be happy to discuss it.
Thanks in advance :)
I looked at different solutions:
Use the navigator.onLine flag on the HTML page: this is not reliable in my case, as my service worker might serve a cached page (because of, say, a timeout) while the browser still considers itself online.
Inject custom headers when serving the HTML response from cache: I don't see how this could work, since the response headers of the page itself are generally not accessible client-side.
Call the service worker's postMessage() a few seconds after the content is served: the problem is that the fetch event for the "starting" HTML page doesn't have a clientId yet (a chicken-and-egg problem, since the service worker is not yet attached to the client when the root HTML page is served from cache).
The only remaining solution I found was to inject some code into the cached response.
Something like this on the service worker side (the main idea is to inject code into the cached response body that triggers a served-offline-page event once the DOM content has loaded):
async function createClonedResponseWithOfflinePageInjectedEvent(response) {
// response.text() reads and decodes the whole body in one go, so multi-byte
// characters split across stream chunks are decoded correctly
let content = await response.text();
// Important part here, as we're "cloning" response by injecting some JS code in it
content = content.replace("</body>", `
<script type='text/javascript'>
document.addEventListener('DOMContentLoaded', () => document.dispatchEvent(new Event('served-offline-page')));
</script>
</body>
`);
return new Response(content, {
headers: response.headers,
status: response.status,
statusText: response.statusText
});
}
async function serveResponseFromCache(cache, request) {
try {
const response = await cache.match(request);
if(response) {
console.info("Used cached asset for : "+request.url);
// isCacheableHTMLPage() will return true on html pages where we're ok to "inject" some js code to notify about
if(isCacheableHTMLPage(request.url)) {
return createClonedResponseWithOfflinePageInjectedEvent(response);
} else {
return response;
}
} else {
console.error("Asset neither available from network nor from cache : "+request.url);
// Redirecting on offline page
return new Response(null, {
headers: {
'Location': '/offline.html'
},
status: 307
})
}
}catch(error) {
console.error("Error while matching request "+request.url+" from cache : "+error);
}
}
On the HTML page, this is simple, we only have to write this in the page :
document.addEventListener('served-offline-page', () => {
console.log("It seems like this page has been served as some offline content !");
})

Downloading whole websites with k6

I'm currently evaluating whether k6 fits our load testing needs. We have a fairly traditional website architecture that uses Apache webservers with PHP and a MySQL database. Sending simple HTTP requests with k6 looks simple enough and I think we will be able to test all major functionality with it, as we don't rely on JavaScript that much and most pages are static.
However, I'm unsure how to deal with resources (stylesheets, images, etc.) that are referenced in the HTML that is returned in the requests. We need to load them as well, as this sometimes leads to database requests, which must be part of the load test.
Is there some out-of-the-box functionality in k6 that allows you to load all the resources like a browser would? I'm aware that k6 does NOT render the page and I don't need it to. I only need to request all the resources inside the HTML.
You basically have two options, both with their caveats:
Record your session: you can either export a HAR file directly from the browser or use the k6 browser extension for Firefox or Chrome. Both should be usable without a k6 cloud account; you just need to set the extension to download the HAR, and it will do so automatically (and somewhat silently) when you hit stop. Then convert the HAR with either the converter built into k6 (which is deprecated, but still works) or the newer har-to-k6.
This method is particularly good if you have a lot of pages and/or resources, and it even works for a single-page style of application, as it just takes what the browser requested as a HAR and transforms it into a script. If nothing dynamic needs to be input (such as a username/password), the final script can be used as is most of the time.
The biggest problem with this approach is that if you add a CSS file you need to redo the whole exercise. This is even more problematic if your CSS/JS file names change on every change. That is what the next method is good for:
Use parseHTML and then find the elements you care about and make a request for them.
import http from "k6/http";
import {parseHTML} from "k6/html";
export default function() {
const res = http.get("https://stackoverflow.com");
const doc = parseHTML(res.body);
doc.find("link").toArray().forEach(function (item) {
console.log(item.attr("href"));
// make http gets for it
// or add them to an array and make one batch request
});
}
will produce
INFO[0001] https://cdn.sstatic.net/Sites/stackoverflow/img/favicon.ico?v=4f32ecc8f43d
INFO[0001] https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon.png?v=c78bd457575a
INFO[0001] https://cdn.sstatic.net/Sites/stackoverflow/img/apple-touch-icon.png?v=c78bd457575a
INFO[0001] /opensearch.xml
INFO[0001] https://cdn.sstatic.net/Shared/stacks.css?v=53507c7c6e93
INFO[0001] https://cdn.sstatic.net/Sites/stackoverflow/primary.css?v=d3fa9a72fd53
INFO[0001] https://cdn.sstatic.net/Shared/Product/product.css?v=c9b2e1772562
INFO[0001] /feeds
INFO[0001] https://cdn.sstatic.net/Shared/Channels/channels.css?v=f9809e9ffa90
As you can see some of the urls are relative and not absolute so you will need to handle this. And in this example only some are css, so probably more filtering is needed.
The problem here is that you need to write the code and if you add a relative link or something else you need to handle it. Luckily k6 is scriptable so you can reuse the code :D.
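As a rough sketch of the two follow-up steps mentioned above (filtering and resolving relative URLs), the helper below keeps only stylesheet links and turns each href into an absolute URL. The function name `stylesheetUrls` and the `{ rel, href }` input shape are assumptions made for this example, and it relies on the standard `URL` class; whether `URL` is available as a global in your k6 version is something to verify, otherwise a URL polyfill would be needed.

```javascript
// Turn raw <link> data into absolute stylesheet URLs.
// `links` is an array of { rel, href } objects (a hypothetical shape for this sketch),
// for example built from item.attr("rel") and item.attr("href") in the loop above.
function stylesheetUrls(links, pageUrl) {
  return links
    .filter((link) => link.rel === 'stylesheet' && link.href)
    // new URL(relative, base) resolves relative paths and leaves absolute URLs unchanged.
    .map((link) => new URL(link.href, pageUrl).href);
}
```

The resulting array could then be passed to a batch request, so that relative entries such as /feeds no longer need special handling at the call site.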
I've followed Михаил Стойков's suggestion and written my own function to load resources. You can choose how resources are loaded (as a batch or as sequential GETs) with options.concurrentResourceLoading.
/**
* @param {http.RefinedResponse<http.ResponseType>} response
*/
export function getResources(response) {
const resources = [];
response
.html()
.find('*[href]:not(a)')
.each((index, element) => {
resources.push(element.attributes().href.value);
});
response
.html()
.find('*[src]:not(a)')
.each((index, element) => {
resources.push(element.attributes().src.value);
});
if (options.concurrentResourceLoading) {
const responses = http.batch(
resources.map((r) => {
return ['GET', resolveUrl(r, response.url), null, {
headers: createHeader()
}];
})
);
// Check each batched resource response, not the original page response
responses.forEach((res) => {
check(res, {
'resource returns status 200': (r) => r.status === 200,
});
});
} else {
resources.forEach((r) => {
const res = http.get(resolveUrl(r, response.url), {
headers: createHeader(),
});
check(res, {
'resource returns status 200': (r) => r.status === 200,
});
});
}
}

Workbox redirect the clients page when resource is not cached and offline

Usually whenever I read a blog post about PWA's, the tutorial seems to just precache every single asset. But this seems to go against the app shell pattern a bit, which as I understand is: Cache the bare necessities (only the app shell), and runtime cache as you go. (Please correct me if I understood this incorrectly)
Imagine I have this single page application, it's a simple index.html with a web component: <my-app>. That <my-app> component sets up some routes which looks a little bit like this, I'm using Vaadin router and web components, but I imagine the problem would be the same using React with React Router or something similar.
router.setRoutes([
{
path: '/',
component: 'app-main', // statically loaded
},
{
path: '/posts',
component: 'app-posts',
action: () => { import('./app-posts.js');} // dynamically loaded
},
/* many, many, many more routes */
{
path: '/offline', // redirect here when a resource is not cached and failed to get from network
component: 'app-offline', // also statically loaded
}
]);
My app may have many many routes, and may get very large. I don't want to precache all those resources straight away, but only cache the stuff I absolutely need, so in this case: my index.html, my-app.js, app-main.js, and app-offline.js. I want to cache app-posts.js at runtime, when it's requested.
Setting up runtime caching is simple enough, but my problem arises when my user visits one of the potentially many many routes that is not cached yet (because maybe the user hasn't visited that route before, so the js file may not have loaded/cached yet), and the user has no internet connection.
What I want to happen, in that case (when a route is not cached yet and there is no network), is for the user to be redirected to the /offline route, which is handled by my client side router. I could easily do something like: import('./app-posts.js').catch(() => /* redirect user to /offline */), but I'm wondering if there is a way to achieve this from workbox itself.
So in a nutshell:
When a js file hasn't been cached yet, and the user has no network, and so the request for the file fails: let workbox redirect the page to the /offline route.
Option 1 (not always useful):
As far as I can see, and according to this answer, you cannot open a new window or change the URL of the browser from within the service worker. You can open a new window only if clients.openWindow() is called from within the notificationclick event.
Option 2 (hardest):
You could use the WindowClient.navigate method within the activate event of the service worker; however, this is a bit trickier, as you still need to check whether the requested file exists in the cache.
Option 3 (easiest & hackiest):
Otherwise, you could respond with the offline page by building a new Request object for it:
const cacheOnly = new workbox.strategies.CacheOnly();
const networkFirst = new workbox.strategies.NetworkFirst();
workbox.routing.registerRoute(
/\/posts.|\/articles/,
async args => {
const offlineRequest = new Request('/offline.html');
try {
const response = await networkFirst.handle(args);
return response || await cacheOnly.handle({request: offlineRequest});
} catch (error) {
return await cacheOnly.handle({request: offlineRequest})
}
}
);
and then rewrite the URL of the browser in your offline.html file:
<head>
<script>
window.history.replaceState({}, 'You are offline', '/offline');
</script>
</head>
The logic in Option 3 responds to the requested URL using the network first. If the network is not available, it falls back to the cache, and if the request is not found in the cache either, it serves the offline.html file (which must already be cached) instead. Once offline.html is parsed, the browser URL is replaced with /offline.
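For comparison, the client-side alternative mentioned in the question (catching the failed dynamic import and redirecting) could be sketched roughly as follows. `loadOrGoOffline`, `loadModule` and `navigate` are hypothetical names for this example; in the app they might correspond to `() => import('./app-posts.js')` and a call into the client-side router.

```javascript
// Wrap a dynamic import so that a failed load sends the user to the offline route.
// `loadModule` returns a promise for the module; `navigate` changes the client-side route.
async function loadOrGoOffline(loadModule, navigate) {
  try {
    return await loadModule();
  } catch (error) {
    // The module is neither cached nor reachable over the network.
    navigate('/offline');
    return undefined;
  }
}
```

This keeps the redirect decision in the router, at the cost of wrapping every dynamically loaded route, which is the repetition the service worker approach tries to avoid.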

Apostrophe CMS: Do we have middleware/hook which can set global.data from API?

My requirement: I have an API that provides user data. In Apostrophe CMS I need to access this user data from all the layouts (Header, Main, Footer).
I can see global.data, which is available everywhere in the templates. Likewise, I need a hook that will call the API and store the response data in Apostrophe's global.data.
Please let me know if you need further information.
You could hit that API on every page render:
// index.js of some apostrophe module
// You should `npm install request-promise` first
const request = require('request-promise');
module.exports = {
construct: function(self, options) {
self.on('apostrophe-pages:beforeSend', async function(req) {
const apiInfo = await request('http://some-api.com/something');
req.data.apiInfo = apiInfo;
// now in your templates you can access `data.apiInfo`
});
}
}
But this will hit that API on every single request which will of course make your site slow down. So I would recommend that you cache the information for some period of time.
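One way to cache the information for a period of time is a small in-memory wrapper around the API call. This is only a sketch under assumptions: `cacheFor` is a hypothetical helper, not part of Apostrophe, and the cache lives in a single Node process, so it is lost on restart and not shared between processes.

```javascript
// Wrap an async loader so its result is reused for `ttlMs` milliseconds.
// `loader` is any function returning a promise; `now` is injectable to make testing easy.
function cacheFor(ttlMs, loader, now = Date.now) {
  let value;
  let expiresAt = 0;
  return async function cachedLoader() {
    if (now() >= expiresAt) {
      // Cache is empty or stale: call the real loader and remember the result.
      value = await loader();
      expiresAt = now() + ttlMs;
    }
    return value;
  };
}
```

In the module above, the handler could then create the wrapper once, for instance `const getApiInfo = cacheFor(60000, () => request('http://some-api.com/something'));`, and assign `req.data.apiInfo = await getApiInfo();` inside the beforeSend handler. Note that requests arriving while the cache is stale may each trigger their own API call in this simple version.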

Service worker to save form data when browser is offline

I am new to Service Workers, and have had a look through the various bits of documentation (Google, Mozilla, serviceworke.rs, Github, StackOverflow questions). The most helpful is the ServiceWorkers cookbook.
Most of the documentation seems to point to caching entire pages so that the app works completely offline, or redirecting the user to an offline page until the browser can redirect to the internet.
What I want to do, however, is store my form data locally so my web app can upload it to the server when the user's connection is restored. Which "recipe" should I use? I think it is Request Deferrer. Do I need anything else to ensure that Request Deferrer will work (apart from the service worker detector script in my web page)? Any hints and tips much appreciated.
Console errors
The Request Deferrer recipe and code doesn't seem to work on its own as it doesn't include file caching. I have added some caching for the service worker library files, but I am still getting this error when I submit the form while offline:
Console: {"lineNumber":0,"message":
"The FetchEvent for [the form URL] resulted in a network error response:
the promise was rejected.","message_level":2,"sourceIdentifier":1,"sourceURL":""}
My Service Worker
/* eslint-env es6 */
/* eslint no-unused-vars: 0 */
/* global importScripts, ServiceWorkerWare, localforage */
importScripts('/js/lib/ServiceWorkerWare.js');
importScripts('/js/lib/localforage.js');
//Determine the root for the routes. I.e, if the Service Worker URL is http://example.com/path/to/sw.js, then the root is http://example.com/path/to/
var root = (function() {
var tokens = (self.location + '').split('/');
tokens[tokens.length - 1] = '';
return tokens.join('/');
})();
//By using Mozilla’s ServiceWorkerWare we can quickly setup some routes for a virtual server. It is convenient you review the virtual server recipe before seeing this.
var worker = new ServiceWorkerWare();
//So here is the idea. We will check if we are online or not. In case we are not online, enqueue the request and provide a fake response.
//Else, flush the queue and let the new request to reach the network.
//This function factory does exactly that.
function tryOrFallback(fakeResponse) {
//Return a handler that…
return function(req, res) {
//If offline, enqueue and answer with the fake response.
if (!navigator.onLine) {
console.log('No network availability, enqueuing');
return enqueue(req).then(function() {
//As the fake response will be reused but Response objects are one use only, we need to clone it each time we use it.
return fakeResponse.clone();
});
}
//If online, flush the queue and answer from network.
console.log('Network available! Flushing queue.');
return flushQueue().then(function() {
return fetch(req);
});
};
}
//A fake response with a joke for when there is no connection. A real implementation could have cached the last collection of updates and keep a local model. For simplicity, not implemented here.
worker.get(root + 'api/updates?*', tryOrFallback(new Response(
JSON.stringify([{
text: 'You are offline.',
author: 'Oxford Brookes University',
id: 1,
isSticky: true
}]),
{ headers: { 'Content-Type': 'application/json' } }
)));
//For deletion, let’s simulate that all went OK. Notice we are omitting the body of the response. Trying to add a body with a 204, deleted, as status throws an error.
worker.delete(root + 'api/updates/:id?*', tryOrFallback(new Response(null, {
status: 204
})));
//Creation is another story. We can not reach the server so we can not get the id for the new updates.
//No problem, just say we accept the creation and we will process it later, as soon as we recover connectivity.
worker.post(root + 'api/updates?*', tryOrFallback(new Response(null, {
status: 202
})));
//Start the service worker.
worker.init();
//By using Mozilla’s localforage db wrapper, we can count on a fast setup for a versatile key-value database. We use it to store queue of deferred requests.
//Enqueue consists of adding a request to the list. Due to the limitations of IndexedDB, Request and Response objects cannot be saved, so we need an alternative representation.
//This is why we call serialize().
function enqueue(request) {
return serialize(request).then(function(serialized) {
return localforage.getItem('queue').then(function(queue) {
/* eslint no-param-reassign: 0 */
queue = queue || [];
queue.push(serialized);
return localforage.setItem('queue', queue).then(function() {
console.log(serialized.method, serialized.url, 'enqueued!');
});
});
});
}
//Flush is a little more complicated. It consists of getting the elements of the queue in order and sending each one, keeping track of not yet sent request.
//Before sending a request we need to recreate it from the alternative representation stored in IndexedDB.
function flushQueue() {
//Get the queue
return localforage.getItem('queue').then(function(queue) {
/* eslint no-param-reassign: 0 */
queue = queue || [];
//If empty, nothing to do!
if (!queue.length) {
return Promise.resolve();
}
//Else, send the requests in order…
console.log('Sending ', queue.length, ' requests...');
return sendInOrder(queue).then(function() {
//Requires error handling. Actually, this is assuming all the requests in queue are a success when reaching the Network.
// So it should empty the queue step by step, only popping from the queue if the request completes with success.
return localforage.setItem('queue', []);
});
});
}
//Send the requests inside the queue in order. Waiting for the current before sending the next one.
function sendInOrder(requests) {
//The reduce() chains one promise per serialized request, not allowing to progress to the next one until completing the current.
var sending = requests.reduce(function(prevPromise, serialized) {
console.log('Sending', serialized.method, serialized.url);
return prevPromise.then(function() {
return deserialize(serialized).then(function(request) {
return fetch(request);
});
});
}, Promise.resolve());
return sending;
}
//Serialize is a little convoluted because headers is not a simple object.
function serialize(request) {
var headers = {};
//for(... of ...) is ES6 notation but current browsers supporting SW, support this notation as well and this is the only way of retrieving all the headers.
for (var entry of request.headers.entries()) {
headers[entry[0]] = entry[1];
}
var serialized = {
url: request.url,
headers: headers,
method: request.method,
mode: request.mode,
credentials: request.credentials,
cache: request.cache,
redirect: request.redirect,
referrer: request.referrer
};
//Only if method is not GET or HEAD is the request allowed to have body.
if (request.method !== 'GET' && request.method !== 'HEAD') {
return request.clone().text().then(function(body) {
serialized.body = body;
return Promise.resolve(serialized);
});
}
return Promise.resolve(serialized);
}
//Compared, deserialize is pretty simple.
function deserialize(data) {
return Promise.resolve(new Request(data.url, data));
}
var CACHE = 'cache-only';
// On install, cache some resources.
self.addEventListener('install', function(evt) {
console.log('The service worker is being installed.');
// Ask the service worker to keep installing until the returning promise
// resolves.
evt.waitUntil(precache());
});
// On fetch, use cache only strategy.
self.addEventListener('fetch', function(evt) {
console.log('The service worker is serving the asset.');
evt.respondWith(fromCache(evt.request));
});
// Open a cache and use `addAll()` with an array of assets to add all of them
// to the cache. Return a promise resolving when all the assets are added.
function precache() {
return caches.open(CACHE).then(function (cache) {
return cache.addAll([
'/js/lib/ServiceWorkerWare.js',
'/js/lib/localforage.js',
'/js/settings.js'
]);
});
}
// Open the cache where the assets were stored and search for the requested
// resource. Notice that in case of no matching, the promise still resolves
// but it does with `undefined` as value.
function fromCache(request) {
return caches.open(CACHE).then(function (cache) {
return cache.match(request).then(function (matching) {
return matching || Promise.reject('no-match');
});
});
}
Here is the error message I am getting in Chrome when I go offline:
(A similar error occurred in Firefox - it falls over at line 409 of ServiceWorkerWare.js)
ServiceWorkerWare.prototype.executeMiddleware = function (middleware,
request) {
var response = this.runMiddleware(middleware, 0, request, null);
response.catch(function (error) { console.error(error); });
return response;
};
This is a little more advanced than beginner level. But you will need to detect when you are offline or in a lie-fi state (apparently connected, but with a connection too poor to use). Instead of POSTing data to an API or endpoint, you need to queue that data to be synced when you are back online.
This is what the Background Sync API should help with. However, it is not supported across the board just yet. Plus Safari...
So maybe a good strategy is to persist your data in IndexedDB and, when you can connect (Background Sync fires an event for this), POST the data. It gets a little more complex for browsers that don't support service workers (Safari) or don't yet have Background Sync (that will level out very soon).
As always design your code to be a progressive enhancement, which can be tricky, but worth it in the end.
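The replay step of such a queue could look roughly like the sketch below. It is independent of Background Sync and of any storage library: `flushPending` and `send` are hypothetical names, and loading or saving the queue (for example in IndexedDB) is left to the caller.

```javascript
// Send queued items in order and return the ones that still need sending.
// `send` is a hypothetical async function that rejects when the network is unavailable.
async function flushPending(queue, send) {
  const remaining = queue.slice();
  while (remaining.length > 0) {
    try {
      await send(remaining[0]);
    } catch (error) {
      // Stop at the first failure so the order is preserved for the next attempt.
      break;
    }
    // Only drop an item once it has been sent successfully.
    remaining.shift();
  }
  return remaining;
}
```

The caller would store the returned array back as the new queue, so items are only removed after a successful send. If `send` is built on `fetch`, keep in mind that `fetch` resolves for HTTP error statuses, so `send` would need to reject on those explicitly if they should count as failures.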
Service Workers tend to cache the static HTML, CSS, JavaScript, and image files.
I need to use PouchDB and sync it with CouchDB.
Why CouchDB?
CouchDB is a NoSQL database consisting of a number of documents created with JSON.
It has versioning (each document has a _rev property identifying its current revision).
It can be synchronised with PouchDB, a local JavaScript application that stores data in local storage via the browser using IndexedDB. This allows us to create offline applications.
The two databases are both “master” copies of the data.
PouchDB is a local JavaScript implementation of CouchDB.
I still need a better answer than my partial notes towards a solution!
Yes, this type of service worker is the correct one to use for saving form data offline.
I have now edited it and understood it better. It caches the form data, and loads it on the page for the user to see what they have entered.
It is worth noting that the paths to the library files will need editing to reflect your local directory structure, e.g. in my setup:
importScripts('/js/lib/ServiceWorkerWare.js');
importScripts('/js/lib/localforage.js');
The script is still failing when offline, however, as it isn't caching the library files. (Update to follow when I figure out caching)
Just discovered an extra debugging tool for service workers (apart from the console): chrome://serviceworker-internals/. In this, you can start or stop service workers, view console messages, and the resources used by the service worker.
