I am using Google's workbox-cli tool to precache some of the files on my website. Is it possible to set up the webserver to return the following HTTP response header for all files by default:
cache-control: s-maxage=2592000, max-age=86400, must-revalidate, no-transform, public
But have the web browser use the following instead only if the file is going to be precached by the service worker:
cache-control: s-maxage=2592000, max-age=0, must-revalidate, no-transform, public
So, I would like the service worker to change max-age=86400 into max-age=0 in the webserver's response header before precaching the file. This makes the service worker fetch files that have changed according to the revision in sw.js from the webserver instead of retrieving them from the local cache. Any files not managed by the service worker are cached for 86400 seconds by default.
Some background info
Currently, I am using the following bash script to set up my sw.js:
#!/bin/bash
if [ ! -d /tmp/workbox-configuration ]; then
  mkdir /tmp/workbox-configuration
fi
cat <<EOF > /tmp/workbox-configuration/workbox-config.js
module.exports = {
"globDirectory": "harp_output/",
"globPatterns": [
EOF
( cd harp_output && find assets de en -type f ! -name "map.js" ! -name "map.json" ! -name "markerclusterer.js" ! -name "modal.js" ! -name "modal-map.html" ! -name "service-worker-registration.js" ! -name "sw-registration.js" ! -path "assets/fonts/*" ! -path "assets/img/*-1x.*" ! -path "assets/img/*-2x.*" ! -path "assets/img/*-3x.*" ! -path "assets/img/maps/*" ! -path "assets/img/video/*_1x1.*" ! -path "assets/img/video/*_4x3.*" ! -path "assets/js/workbox-*" ! -path "assets/videos/*" ! -path "de/4*" ! -path "de/5*" ! -path "en/4*" ! -path "en/5*" | sort | sed 's/^/"/' | sed 's/$/"/' | sed -e '$ ! s/$/,/' >> /tmp/workbox-configuration/workbox-config.js )
cat <<EOF >> /tmp/workbox-configuration/workbox-config.js
],
"swDest": "/tmp/workbox-configuration/sw.js"
};
EOF
workbox generateSW /tmp/workbox-configuration/workbox-config.js
sed -i 's#^importScripts(.*);$#importScripts("/assets/js/workbox-sw.js");\nworkbox.setConfig({modulePathPrefix: "/assets/js/"});#' /tmp/workbox-configuration/sw.js
sed -i 's/index.html"/"/' /tmp/workbox-configuration/sw.js
uglifyjs /tmp/workbox-configuration/sw.js -c -m -o harp_output/sw.js
On my Nginx webserver the following HTTP header is delivered by default:
more_set_headers "cache-control: s-maxage=2592000, max-age=0, must-revalidate, no-transform, public";
But, if the requested resource is not handled by the service worker, the default cache-control setting is overridden:
location ~ ^/(assets/(data/|fonts/|img/(.*-(1|2|3)x\.|maps/|video/.*_(1x1|4x3)\.)|js/(map|markerclusterer|modal|service-worker-registration|sw-registration)\.js|videos/)|(de|en)/((4|5).*|modal-map\.html)) {
    more_set_headers "cache-control: s-maxage=2592000, max-age=86400, must-revalidate, no-transform, public";
}
Problem with the current approach (see background info)
I have to keep track of the files and update nginx.conf accordingly.
max-age=0 is also used for web browsers that don't support service workers, so they request the resources from the webserver on each page visit.
1st Update
My desired precaching behaviour can be illustrated with two of the workbox strategies. I want the service worker to behave as described in scenarios 1 and 2, although cache-control: max-age=86400 is delivered in the HTTP header by the webserver for an asset (e.g. default.js).
Scenario 1: revision in sw.js didn't change
The webpage is accessed, the sw.js file is retrieved from the webserver due to max-age=0, and the web browser notices that the revision for default.js didn't change. In this case, default.js is retrieved from the precache cache.
Scenario 2: revision in sw.js did change
The webpage is accessed, the sw.js file is retrieved from the webserver due to max-age=0, and the web browser notices that the revision of default.js changed. In this case, default.js is retrieved from the webserver.
2nd Update
Basically, the desired strategy is similar to the network-first strategy, but step 2 is only taken if the revision of the file in sw.js has changed.
3rd Update
If I am not mistaken, there is already some work on this:
self.addEventListener('install', event => {
  event.waitUntil(
    caches.open(`static-${version}`)
      .then(cache => cache.addAll([
        new Request('/styles.css', { cache: 'no-cache' }),
        new Request('/script.js', { cache: 'no-cache' })
      ]))
  );
});
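Extending that idea, a hand-rolled sketch of the behaviour described in scenarios 1 and 2 could look roughly like this. This is a plain service worker, not the generated workbox sw.js; MANIFEST, the cache names and the revision values are placeholders made up for illustration:
self.addEventListener('install', event => {
  event.waitUntil((async () => {
    // MANIFEST is a hypothetical stand-in for the precache manifest in sw.js.
    const MANIFEST = [{ url: '/assets/js/default.js', revision: 'abc123' }];
    const cache = await caches.open('precache');
    const meta = await caches.open('precache-meta');
    for (const { url, revision } of MANIFEST) {
      const stored = await meta.match(url);
      if (stored && await stored.text() === revision) continue; // scenario 1: keep the cached copy
      // scenario 2: revision changed – bypass the HTTP cache and refresh the entry
      const response = await fetch(new Request(url, { cache: 'no-cache' }));
      await cache.put(url, response);
      await meta.put(url, new Response(revision));
    }
  })());
});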
I don't think you have a comprehensive enough understanding of how service workers actually work.
You define one or many caches for a service worker to use. You specify what goes in which cache, whether to cache future requests, and so on.
The service worker then intercepts all network requests from the client and responds to them however you have programmed it to. It can return cached content if available, serve cached content first while updating it over the network, try the network first and fall back to the cache when there is no connection, cache images but nothing else, cache only GET requests, only certain domains or file types, and so on.
What it caches and how long each cache is valid is entirely up to you and not influenced by the server's response headers at all. If you tell your service worker to make a fetch request for a resource, it will load that resource over the network regardless of what is already in the service worker's caches (note, though, that fetch() still goes through the browser's HTTP cache unless the request is created with { cache: 'no-cache' } or { cache: 'reload' }, which is exactly what the snippet in the 3rd update does).
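As a minimal illustration (a hand-written handler, not the generated workbox sw.js), a cache-first response with a network fallback and runtime caching can be written like this; the cache name 'runtime' is just a placeholder:
self.addEventListener('fetch', event => {
  if (event.request.method !== 'GET') return; // only GET requests can be cached
  event.respondWith(
    caches.match(event.request).then(cached => {
      if (cached) return cached; // serve from the service worker cache if present
      return fetch(event.request).then(response => {
        const copy = response.clone(); // a response body can only be read once
        caches.open('runtime').then(cache => cache.put(event.request, copy));
        return response;
      });
    })
  );
});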
You have total control over the entire caching process, which is very useful but has its own set of pitfalls.
I used s-max-age instead of s-maxage in the cache-control HTTP header, which led to some unexpected behaviour with my reverse proxy and the workbox service worker. After the fix, the service worker works as expected.
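For reference, the broken and the corrected header differ only in the name of the proxy directive (s-max-age is not a valid cache-control directive, so it was ignored):
cache-control: s-max-age=2592000, max-age=0, must-revalidate, no-transform, public
cache-control: s-maxage=2592000, max-age=0, must-revalidate, no-transform, public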
Related
Before Chrome makes a cross-domain AJAX call, it makes a preflight OPTIONS check like this:
curl \
'https://fubar.com/users/sign_in' \
-X OPTIONS \
-H 'Access-Control-Request-Method: POST' \
-H 'Origin: http://snafu.com' \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36' \
-H 'Access-Control-Request-Headers: content-type' \
--compressed \
--insecure \
--verbose
(I added --insecure and --verbose for testing.)
I can see this request in the Apache logs but it doesn't get to Rails.
127.0.0.1 - - [27/Jul/2018:09:22:44 -0400] "OPTIONS /users/sign_in HTTP/1.1" 200 -
If I remove either the Access-Control-Request-Method or Origin headers then it does pass the request to Rails.
Something about the combination of these two headers seems to be causing Apache to handle the request itself and not give Rails a chance to process it.
I am not setting any headers or defining any rewrite rules in the Apache config; it's basically a vanilla install.
I'm not able to find any documentation or configurations explaining why this would be happening and how to prevent it.
1) An OPTIONS HTTP call to 'https://fubar.com/users/sign_in' is performed by the Chrome browser with the request headers 'Access-Control-Request-Method: POST' and 'Origin: http://snafu.com'
2) The server at 'https://fubar.com/users/sign_in' receives the request
STEP 1
The routing is handled first by either the nginx or Apache web server, which applies its own config rules. These settings are included in the files inside /etc/apache2 or /etc/nginx.
For example, with nginx you can define rules to add a header to the HTTP response, or settings to redirect to another URL. For example:
add_header 'Access-Control-Allow-Origin' '*';
This will add the header Access-Control-Allow-Origin: * to all responses.
If, for example, you apply this header to all OPTIONS requests, then all subsequent HTTP requests will be whitelisted for any HTTP origin.
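As an illustration of this step (a sketch only; the location, origins, headers and upstream name are placeholders, not your actual config), the web server itself could answer the preflight before the application server ever sees it:
location / {
    # Answer CORS preflight requests directly at the web-server layer.
    if ($request_method = OPTIONS) {
        add_header 'Access-Control-Allow-Origin' '*';
        add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';
        add_header 'Access-Control-Allow-Headers' 'Content-Type';
        return 204;
    }
    # Everything else is passed on to the application server as usual.
    proxy_pass http://app_upstream;
}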
STEP 2
Once the nginx/Apache rules are applied, the Rails router receives the request and routes it to your controller.
Here you can still add any header you want inside the controller. You can route OPTIONS requests to a specific controller action, which can add a specific header; be careful not to add the same header twice, as that can cause issues.
In this action you can rewrite the Access-Control-Allow-Origin header in the response to the OPTIONS request to whitelist only specific origin domains (you just need to write a routing rule which is applied only to OPTIONS requests).
The origin domain is written in the headers of the request:
request headers 'Access-Control-Request-Method: POST' and 'Origin: http://snafu.com'
I need some batch guru to assist me in getting this resolved. I have a couple of files via which we are monitoring the response from the websites using wget. When a site is down, we get the following response in test1.txt:
Connecting to 10.x.x.x:443... failed: Bad file descriptor.
whilst when the site is running, the response in test2.txt is:
Connecting to 10.x.x.x:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
I do not see any common pattern in the two outputs above on which I could base the logic. I need some assistance in determining, from these outputs, how to do the following:
if the website is running, do nothing
if the website is down, start service.
Note, we need to do this only on the basis of the output from these files.
I tried the provided solution, but it didn't work:
TestScript>wget-1.14.exe --spider --no-check-certificate https://somesite | find "Bad file descriptor" 1>nul
Spider mode enabled. Check if remote file exists.
--2015-10-08 18:15:21-- https://somesite
Connecting to 10.x.x.x:443... failed: Bad file descriptor.
TestScript>if errorlevel 1 (echo site is up ) else (echo site is down )
site is up
Pipe the output of wget to find to look for Bad file descriptor and then use errorlevel:
wget --spider http://someurl 2>&1 | find "Bad file descriptor" >nul
if errorlevel 1 (
    echo site is up
) else (
    echo site is down
)
2>&1 redirects the messages into standard output so that they can be piped
--spider makes wget only check the URL without saving the result
Alternatively use the file you already have:
if exist test1.txt find "Bad file descriptor" test1.txt >nul
if not errorlevel 1 (echo start the service)
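Putting the check and the recovery action together, a minimal sketch could look like this; "MyService" is a placeholder for the actual Windows service name:
@echo off
rem Check the site with wget; stderr is merged so find can see the error text.
wget --spider http://someurl 2>&1 | find "Bad file descriptor" >nul
if errorlevel 1 (
    echo site is up - nothing to do
) else (
    echo site is down - starting the service
    net start "MyService"
)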
I am looking for functionality where we have a directory with some files in it.
Whenever anyone makes a change to any of the files in the directory, Jenkins should trigger a build.
Is there any plugin or method for this functionality? Please advise.
Thanks in advance.
I have not tried it myself, but the FSTrigger plugin seems to do what you want:
FSTrigger provides polling mechanisms to monitor a file system and
trigger a build if a file or a set of files have changed.
If you can monitor the directory with a script, you can trigger the build with an HTTP GET, for example with wget or curl:
wget -O- $JENKINS_URL/job/JOBNAME/build
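For example, a small watcher script along these lines could do it on Linux; it assumes the inotify-tools package is installed and that the job allows remote build triggers (JOBNAME and the token are placeholders):
#!/bin/bash
# Watch the directory recursively and trigger the Jenkins job on every change.
WATCH_DIR="/path/to/watched/dir"
JENKINS_URL="http://myserver:8080"
while inotifywait -r -e modify,create,delete,move "$WATCH_DIR"; do
    curl -X POST "$JENKINS_URL/job/JOBNAME/build?token=MY_TRIGGER_TOKEN"
done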
Although only slightly related: this issue seems to be about monitoring static files on the system; however, there are many version control systems for exactly this purpose.
I answered this in another post if you're using git to track changes on the files themselves:
#!/bin/bash
set -e
job_name="whatever"
JOB_URL="http://myserver:8080/job/${job_name}/"
FILTER_PATH="path/to/folder/to/monitor"
python_func="import json, sys
obj = json.loads(sys.stdin.read())
ch_list = obj['changeSet']['items']
_list = [ j['affectedPaths'] for j in ch_list ]
for outer in _list:
for inner in outer:
print inner
"
_affected_files=`curl --silent ${JOB_URL}${BUILD_NUMBER}'/api/json' | python -c "$python_func"`
if [ -z "`echo \"$_affected_files\" | grep \"${FILTER_PATH}\"`" ]; then
echo "[INFO] no changes detected in ${FILTER_PATH}"
exit 0
else
echo "[INFO] changed files detected: "
for a_file in `echo "$_affected_files" | grep "${FILTER_PATH}"`; do
echo " $a_file"
done;
fi;
You can add the check directly to the top of the job's exec shell, and it will exit 0 if no changes are detected. Hence, you can always poll the top level of the repo for check-ins to trigger a build, and only complete the build if the files in question have changed.
I am running nginx + php-fpm. Is there any way I can know what each of the PHP processes is doing? Something like the extended mod_status in Apache, where I can see that the Apache process with PID x is processing URL y. I'm not sure if the PHP process knows the URL, but getting the script path and name would be sufficient.
After some hours of googling and browsing the PHP.net bug tracking system, I have found the solution. It has been available since PHP 5.3.8 or 5.3.9, but doesn't seem to be documented. Based on feature request #54577, the status page supports the option full, which displays the status of each worker separately. So, for example, the URL would be http://server.com/php-status?full and sample output looks like:
pid: 22816
state: Idle
start time: 22/Feb/2013:15:03:42 +0100
start since: 10933
requests: 28352
request duration: 1392
request method: GET
request URI: /ad.php?zID=597
content length: 0
user: -
script: /home/web/server.com/ad/ad.php
last request cpu: 718.39
last request memory: 1310720
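For a quick look from the command line, the page can simply be fetched with curl; recent PHP-FPM versions also let you combine full with the json or html output switches, though that is worth verifying against your version:
curl 'http://server.com/php-status?full'
curl 'http://server.com/php-status?full&json'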
PHP-FPM has a built-in status monitor, though it's not as detailed as mod_status. From the php-fpm config file /etc/php-fpm.d/www.conf (on CentOS 6):
; The URI to view the FPM status page. If this value is not set, no URI will be
; recognized as a status page. By default, the status page shows the following
; information:
; accepted conn - the number of request accepted by the pool;
; pool - the name of the pool;
; process manager - static or dynamic;
; idle processes - the number of idle processes;
; active processes - the number of active processes;
; total processes - the number of idle + active processes.
; The values of 'idle processes', 'active processes' and 'total processes' are
; updated each second. The value of 'accepted conn' is updated in real time.
; Example output:
; accepted conn: 12073
; pool: www
; process manager: static
; idle processes: 35
; active processes: 65
; total processes: 100
; By default the status page output is formatted as text/plain. Passing either
; 'html' or 'json' as a query string will return the corresponding output
; syntax. Example:
; http://www.foo.bar/status
; http://www.foo.bar/status?json
; http://www.foo.bar/status?html
; Note: The value must start with a leading slash (/). The value can be
; anything, but it may not be a good idea to use the .php extension or it
; may conflict with a real PHP file.
; Default Value: not set
;pm.status_path = /status
If you enable this, you can then pass the path from nginx to your socket/port for PHP-FPM and you can view the status page.
nginx.conf:
location /status {
    include fastcgi_params;
    fastcgi_pass unix:/var/lib/php/php-fpm.sock;
}
The cgi command line is more convenient:
SCRIPT_NAME=/status \
SCRIPT_FILENAME=/status \
REQUEST_METHOD=GET \
cgi-fcgi -bind -connect 127.0.0.1:9000
You can use strace to show the scripts being run - and many other things - in real time. It's pretty verbose, but it can give you a good overall picture of what's going on:
# switch php-fpm7.0 for process you're using
sudo strace -f $(pidof php-fpm7.0 | sed 's/\([0-9]*\)/\-p \1/g')
The above will attach to the forked processes of php-fpm. Use -p to attach to a particular pid.
The above will get the script path. To get the URLs, you would look at your nginx/Apache access logs.
As a side note, to see the syscalls and which ones are taking longest:
sudo strace -c -f $(pidof php-fpm7.0 | sed 's/\([0-9]*\)/\-p \1/g')
Wait a while, then hit Ctrl-C.
I'm trying to monitor actual URLs, and not only hosts, with Nagios, as I operate a shared server with several websites, and I don't think it's enough just to monitor the basic HTTP service (I'm including at the very bottom of this question a small explanation of what I'm envisioning).
(Side note: please note that I have Nagios installed and running inside a chroot on a CentOS system. I built nagios from source, and have used yum to install into this root all dependencies needed, etc...)
I first found check_url, but after installing it into /usr/lib/nagios/libexec, I kept getting a "return code of 255 is out of bounds" error. That's when I decided to start writing this question (but wait! There's another plugin I decided to try first!)
After reviewing This Question, which had practically the same problem I'm having with check_url, I decided to open up a new question on the subject because:
a) I'm not using NRPE with this check
b) I tried the suggestions made on the earlier question to which I linked, but none of them worked. For example...
./check_url some-domain.com | echo $0
returns "0" (which indicates the check was successful)
I then followed the debugging instructions on Nagios Support to create a temp file called debug_check_url, and put the following in it (to then be called by my command definition):
#!/bin/sh
echo `date` >> /tmp/debug_check_url_plugin
echo $* >> /tmp/debug_check_url_plugin
/usr/local/nagios/libexec/check_url $*
Assuming I'm not in "debugging mode", my command definition for running check_url is as follows (inside command.cfg):
'check_url' command definition
define command{
    command_name    check_url
    command_line    $USER1$/check_url $url$
}
(Incidentally, you can also view what I was using in my service config file at the very bottom of this question)
Before publishing this question, however, I decided to give 1 more shot at figuring out a solution. I found the check_url_status plugin, and decided to give that one a shot. To do that, here's what I did:
mkdir /usr/lib/nagios/libexec/check_url_status/
downloaded both check_url_status and utils.pm
Per the user comment / review on the check_url_status plugin page, I changed "lib" to the proper directory of /usr/lib/nagios/libexec/.
Run the following:
./check_url_status -U some-domain.com
When I run the above command, I kept getting the following error:
bash-4.1# ./check_url_status -U mydomain.com
Can't locate utils.pm in @INC (@INC contains: /usr/lib/nagios/libexec/ /usr/local/lib/perl5 /usr/local/share/perl5 /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5 /usr/share/perl5) at ./check_url_status line 34.
BEGIN failed--compilation aborted at ./check_url_status line 34.
So at this point, I give up, and have a couple of questions:
Which of these two plugins would you recommend? check_url or check_url_status?
(After reading the description of check_url_status, I feel that this one might be the better choice. Your thoughts?)
Now, how would I fix my problem with whichever plugin you recommended?
At the beginning of this question, I mentioned I would include a small explanation of what I'm envisioning. I have a file called services.cfg which is where I have all of my service definitions located (imagine that!).
The following is a snippet of my service definition file, which I wrote to use check_url (because at that time, I thought everything worked). I'll build a service for each URL I want to monitor:
###
# Monitoring Individual URLs...
#
###
define service{
    host_name               {my-shared-web-server}
    service_description     URL: somedomain.com
    check_command           check_url!somedomain.com
    max_check_attempts      5
    check_interval          3
    retry_interval          1
    check_period            24x7
    notification_interval   30
    notification_period     workhours
}
I was making things WAY too complicated.
The built-in / installed-by-default plugin, check_http, can accomplish what I wanted and more. Here's how I accomplished this:
My Service Definition:
define service{
    host_name               myers
    service_description     URL: my-url.com
    check_command           check_http_url!http://my-url.com
    max_check_attempts      5
    check_interval          3
    retry_interval          1
    check_period            24x7
    notification_interval   30
    notification_period     workhours
}
My Command Definition:
define command{
    command_name    check_http_url
    command_line    $USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$
}
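To sanity-check the command outside of Nagios, the plugin can be run by hand; the path and options below are illustrative (--ssl and -f follow are only needed for HTTPS URLs and redirects):
/usr/local/nagios/libexec/check_http -I my-url.com -u /some/page
/usr/local/nagios/libexec/check_http -I my-url.com -u /some/page --ssl -f follow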
A better way to monitor URLs is by using WebInject, which can be used with Nagios.
The problem below is due to the missing Perl package utils.pm; try installing it.
bash-4.1# ./check_url_status -U mydomain.com Can't locate utils.pm in @INC (@INC contains:
You can make a script plugin. It is easy; you only have to check the URL with something like:
`curl -Is $URL -k| grep HTTP | cut -d ' ' -f2`
$URL is what you pass to the script as a parameter.
Then check the result: if you get a status code greater than 399, you have a problem; otherwise everything is OK. Then return the right exit code and message for Nagios, as sketched below.
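A minimal sketch of such a plugin, assuming the URL is passed as the first argument and using the standard Nagios exit codes (0 = OK, 2 = CRITICAL):
#!/bin/bash
# Minimal URL check for Nagios: extract the HTTP status code and map it to an exit code.
URL="$1"
CODE=$(curl -Is "$URL" -k | grep HTTP | cut -d ' ' -f2)

if [ -z "$CODE" ] || [ "$CODE" -gt 399 ]; then
    echo "CRITICAL - $URL returned HTTP status '$CODE'"
    exit 2
else
    echo "OK - $URL returned HTTP status $CODE"
    exit 0
fi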