Make one instance of multiple uWSGI workers perform an extra function - uwsgi

I have a Python Flask app running on uWSGI with a config file that specifies that it should spawn multiple workers (which I am assuming are identical processes).
Everything works well except for one part: the Python app runs a bash command to download and update a database every day using a scheduler. The command needs to run only once, but with multiple identical workers it runs several times simultaneously, corrupting the downloaded file.
Is there a way to run this bash command in only one of the uWSGI workers? I can't run the bash command as a separate cron job (the database update has to integrate seamlessly with the app).

Check the uWSGI cron-like interface.
uWSGI's master process has an internal cron-like facility that can generate events at predefined times. Because the scheduling is handled by the master, the command is launched only once per scheduled time, no matter how many workers you spawn.
You can set the options, for example, to:
[uwsgi]
; every two hours
cron = 0 -2 -1 -1 -1 /usr/bin/backup_my_home --recursive
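For the daily database update in the question, something along these lines should work; the script path is a hypothetical placeholder for whatever command actually performs your update:
[uwsgi]
; run the update once per day at 03:00, scheduled by the master process
cron = 0 3 -1 -1 -1 /path/to/update_database.sh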
Is that sufficient?

Related

Implement 'Entrypoint'-like functionality in Cloud Native Buildpacks

I have a multi-process web app. The processes are contributed by different buildpacks. The default process will start the web application. I have a use case in which a given shell script should be executed before the default process invocation.
I have tried the following approach:
Create a custom buildpack
Create a script that needs to be executed and invoke the web process in it
Create a new process based on the above shell script by specifying it in the launch.toml definition
Make the buildpack launchable
The entrypoint.sh
#!/usr/bin/env bash
# Some fancy stuff..
#Invoke the web process
/cnb/process/web
Create launch.toml from the build script of the custom buildpack. Make the entrypoint process the default one.
cat > "$layers_dir/launch.toml" << EOL
[[processes]]
type = "entrypoint"
command = "bash"
args = ["$scriptlayer/bin/entrypoint.sh"]
default = true
EOL
echo -e '[types]\nlaunch = true' > "$layers_dir/assembly-scripts.toml"
Truncated pack inspect-image output
Processes:
TYPE SHELL COMMAND ARGS
entrypoint (default) bash bash /layers/gw_assembly-scripts/assembly-scripts/bin/entrypoint.sh
task bash catalina.sh run
tomcat bash catalina.sh run
web bash catalina.sh run
Is there any better CNB native approach to achieve this use case?
You have a couple of options here:
The simplest option would be to add a .profile script to the root of your application. It's a bash script, so anything you can write in bash can be done there; however, it's primarily intended for initializing your app and setting additional environment variables.
This file runs prior to the command in your process type. I looked for documentation on this behavior, but only found it briefly mentioned in the buildpacks spec.
As an example, if I put a .profile file in the root of my application and inside that file write echo 'Hello World!', I'll see Hello World! printed before any of my process types execute.
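A minimal sketch of such a file, assuming you only want to export an extra variable and print something (the variable name is a made-up placeholder):
# .profile -- run by the launcher before your selected process type starts
export APP_MODE="staging"   # hypothetical extra environment variable
echo 'Hello World!'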
If you want to create a buildpack, you can achieve something similar to the .profile script by having your buildpack include an exec.d binary.
This is a binary that's part of your launch image and gets run prior to any of your process types. It allows you to take actions to initialize an application and set additional environment variables dynamically before your application starts.
This mechanism is often used by buildpack authors to provide dynamic behavior at runtime based on changes to environment variables or Kubernetes service bindings. For example, turning on/off features like APM tools, debugging, and metrics.
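As a rough sketch, assuming the exec.d contract of writing TOML name/value pairs to file descriptor 3 (check the exec.d section of the buildpacks spec for the exact interface; the file name and variable below are made up):
#!/usr/bin/env bash
# placed at <layer>/exec.d/configure by your buildpack
# TOML written to fd 3 becomes environment variables for the app process
echo 'APM_ENABLED = "true"' >&3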
A few other miscellaneous notes.
Neither of the options above allows you to change the actual process type. The process type that will be executed is selected prior to these options (.profile and exec.d) running and you cannot influence that from within. You can only use them to run things prior to the process type running.
The buildpack spec does not allow for a buildpack to modify the process types for another buildpack. So you cannot create a buildpack that wraps or modifies process types set by another buildpack. That said, a buildpack can override the process types set by another buildpack. Buildpacks that are later in the order group will override earlier buildpacks.
From the spec: "A combined processes list derived from all launch.toml files such that process types from later buildpacks override identical process types from earlier buildpacks."
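So if you do want to change what web runs, a buildpack later in the order can write its own launch.toml that redefines that process type, reusing the same structure as the question's example (the script path is only illustrative):
[[processes]]
type = "web"
command = "bash"
args = ["$scriptlayer/bin/entrypoint.sh"]
default = true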
With buildpacks, the entrypoint is always the launcher. The launcher is a process that runs and implements the application side of the buildpack specification. It runs .profile and exec.d binaries, sets up buildpack-provided environment variables, and eventually launches the specified process type.
If you override the entrypoint for a container then the launcher won't run and none of the things it is supposed to do will happen. Sometimes this is desired, like if you're troubleshooting, but usually you want the launcher to be the entrypoint.

How to split Apache Druid historical and middleManager nodes

In a clustered deployment of Druid, the documentation doesn't mention how to split Historical and MiddleManager services onto two (or more) separate nodes. It only mentions the port ranges they can accept. For running data servers, Druid says we must execute this script:
./bin/start-cluster-data-server
If we run this script on two different machines, how can one of them act as a MiddleManager and the other as a Historical node?
If you cat the quickstart scripts, you'll see that each one calls a Perl script called supervise. Here is what you'll see inside start-cluster-data-server, for example:
exec "$WHEREAMI/supervise" -c "$WHEREAMI/../conf/supervise/cluster/data.conf"
The configuration file following -c contains information about the processes to start. For example, here's the data.conf file:
:verify bin/verify-java
historical bin/run-druid historical conf/druid/cluster/data
middleManager bin/run-druid middleManager conf/druid/cluster/data
You can see then how you can configure your own start scripts and a .conf file to go with it that will spawn only those processes that you want to run on a particular node.
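For example, you could create two single-purpose supervise configs next to the stock data.conf (the file names here are hypothetical). A historical-only.conf:
:verify bin/verify-java
historical bin/run-druid historical conf/druid/cluster/data
and a middlemanager-only.conf:
:verify bin/verify-java
middleManager bin/run-druid middleManager conf/druid/cluster/data
Then start each machine with the config that matches its role, something like:
./bin/supervise -c conf/supervise/cluster/historical-only.conf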

Continuous deployment using LFTP gets "stuck" temporarily after about 10 files

I am using GitLab Community Edition and a GitLab Runner CI setup to deploy (synchronize) a bunch of JSON files on a server using LFTP. This job, however, seems to "freeze" for a few minutes every 10 files or so. Having to synchronize roughly 400 files sometimes, the job simply crashes because it can take more than an hour to complete. The JSON files are all about 1 KB. Neither the source nor the target server should have any firewall rate limiting the FTP traffic. Both are hosted at OVH.
The following LFTP command is executed in order to synchronize everything:
lftp -v -c "set sftp:auto-confirm true; open sftp://$DEVELOPMENT_DEPLOY_USER:$DEVELOPMENT_DEPLOY_PASSWORD@$DEVELOPMENT_DEPLOY_HOST:$DEVELOPMENT_DEPLOY_PORT; mirror -Rev ./configuration_files configuration/configuration_files --exclude .* --exclude .*/ --include ./*.json"
The job is run in Docker, using this container to deploy everything. What could cause this?
For those of you coming from Google: we had the exact same setup. The way to get LFTP to stop hanging when running in Docker or some other CI is to use this command:
lftp -c "set net:timeout 5; set net:max-retries 2; set net:reconnect-interval-base 5; set ftp:ssl-force yes; set ftp:ssl-protect-data true; open -u $USERNAME,$PASSWORD $HOST; mirror dist / -Renv --parallel=10"
This does several things:
It makes it so it won't wait forever or get into a continuous loop when it can't do a command. This should speed things along.
It makes sure we are using SSL/TLS. If you don't need this, remove those options.
It synchronizes one folder to the new location. The options -Renv are explained here: https://lftp.yar.ru/lftp-man.html
Lastly, in the GitLab CI I set the job to retry if it fails. This will spin up a new Docker instance that gets around any open file or connection limitations. If the first attempt doesn't succeed, the above LFTP command will run again, but since we are using the -n flag it will only move over the files that were missed the first time. This gets everything moved over without hassle. You can read more about CI job retries here: https://docs.gitlab.com/ee/ci/yaml/#retry
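In .gitlab-ci.yml the retry is just one extra key on the job; a minimal sketch, with the job name and script as placeholders:
deploy:
  script:
    - lftp -c "..."
  retry: 2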
Have you looked at using rsync instead? I'm fairly sure you can benefit from the incremental copying of files as opposed to copying the entire set over each time.

Bash Script to run three different rails apps on the local server?

I have three apps that I want to run with the rails server at the same time, and I also want the option to kill all the servers from one location.
I don't have much experience with Bash, so I'm not sure what command I would use to launch the server for a specific app. Since the script won't be in the app directory, a plain rails s won't work.
From there, I suppose if I can gather the PIDs of the processes the three servers are running on, I can have the script prompt for user input and whenever something is entered kill the three processes. I'm just unsure of how to get the PIDs.
Additionally, each app has a few environment variables that I want to have different values than those assigned in the apps' config files. Previously, I was using export var=value before rails s, but I'm not sure how to guarantee each separate process is getting the right variables.
Any help is much appreciated!
The Script
You could try something like the following:
#!/bin/bash
case "$1" in
start)
    pushd app/directory
    # Each server starts in its own subshell so the exported variables stay local to it.
    # The & backgrounds the server, so $! holds its PID, which we save for "stop".
    (export FOO=bar; rails s ... & echo $! > "$OLDPWD/pid1")
    (export FOO=bar; rails s ... & echo $! > "$OLDPWD/pid2")
    (export FOO=bar; rails s ... & echo $! > "$OLDPWD/pid3")
    popd
    ;;
stop)
    # assumes you run "stop" from the same directory you ran "start" from
    kill $(cat pid1)
    kill $(cat pid2)
    kill $(cat pid3)
    rm pid1 pid2 pid3
    ;;
*)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac
exit 0
Save this script into a file such as script.sh and chmod +x script.sh. You'd start the servers with ./script.sh start, and you can kill them all with ./script.sh stop. You'll need to fill in all the details in the three lines that start up the servers.
Explanation
First is the pushd: this will change the directory to where your apps live. The popd after the three startup lines will return you to the directory you ran the script from. The parentheses around the (export blah blah) create a subshell, so the environment variables that you set inside the parentheses, via export, shouldn't exist outside of the parentheses. Additionally, if your three apps live in different directories, you could put a cd inside each of the three parentheses to move to the app's directory before the rails s. The lines would then look something like: export FOO=bar; cd app1/directory; rails s ... & echo $! > "$OLDPWD/pid1". Don't forget that semicolon after the cd command! In this case, you can also remove the pushd and popd lines.
In Bash, $! is the process ID of the most recently backgrounded command, which is why the rails s commands are followed by & before the echo. We echo that PID and redirect (with >) to a file called pid1 (or pid2 or pid3) in the directory the script was run from ($OLDPWD after the pushd). Later, when we want to kill the servers, we run kill $(cat pid1). The $(...) runs a command and returns the output inline. Since the pid files only contain the process ID, cat pid1 will just return the process ID number, which is then passed to kill. We also delete the pid files after we've killed the servers.
Disclaimer
This script could use some more work in terms of error checking and configuration, and I haven't tested it, but it should work. At the very least, it should give you a good starting point for writing your own script.
Additional Info
My favorite bash resource is the Advanced Bash-Scripting Guide. Bash is actually a fairly powerful language with some neat features. I definitely recommend learning how bash works!
Why don't you try Capistrano, a framework for executing commands in parallel on multiple remote machines via SSH? It has lots of recipes to do this.
You are probably better off setting up pow.cx, which would run each server as it's needed, rather than having to spin up and shut down servers manually.
You could use Foreman to run, monitor, and manage your processes.
I realize I'm late to the party here, but after searching the internet for a good solution to this (and finding this page but few others, and none with a full solution), and after trying unsuccessfully to get prax working, I decided to write my own solution to this problem and give it back to the community!
Check out my rdev bash script gist - a bash script you put in your ~/bin directory. This will create a new tab in gnome-terminal for each rails app with the app name and port in the tab's title. It verifies the app launched successfully by checking the port is in use and the process is actually running. It also verifies the rails app shutdown is successful by ensuring the port is no longer in use and the process is no longer running.
Setup is super easy, just change these two config values:
# collection of rails apps you want to start in development (should match directory name of rails project)
# note: the first app in the collection will receive port 3000, the second 3001 and so on
#
rails_apps=(app1 app2 app3 etc)
#
# The root directory of your rails projects (~/ is assumed, do not include)
#
projects_root="ruby/projects/root/path"
With this script you can start all your rails apps with one command, or stop them all, and you can stop, start, and restart individual rails apps as well. While the OP asked about running 3 apps, this will let you run as many as you need, with ports assigned in order starting at 3000 for the first app in the list. Each app is started using the proper Ruby version thanks to chruby, and the .env is sourced on the way up, so your app will have everything it needs. Once you are done developing, just run rdev stop and all your rails apps will be killed and the terminal windows closed.
# Usage Examples:
#
# Show Help
# ~/> rdev
# Usage: rdev {start|stop|restart} [app port]
#
# start all rails apps
# ~/> rdev start
#
# start a single rails app
# ~/> rdev start app port
#
# stop all rails apps
# ~/> rdev stop
#
# stop a single rails app
# ~/> rdev stop app port
#
# restart a single rails app
# ~/> rdev restart app port
For the record, all testing was done on Ubuntu 18.04. This script requires bash, chruby, gnome-terminal, and lsof, and takes advantage of the BASH_POST_RC trick.

When calling a shell command via Ruby, what context does the command run in?

In a rails application (or sinatra), if I make a call to a shell command, under what context does this command run?
I'm not sure if I am asking my question correctly, but does it run in the same thread as the rails process?
When you shell out, is it possible to make this an asynchronous call? If yes, does this mean at the operating system level it will start a new thread? Can it start in a pool of threads instead of a new thread?
If you are using system('cmd') or simply backticks:
`cmd`
Then the command will be executed in the context of a subshell.
If you wish to run multiple of these at a time, you can use Ruby's fork functionality:
fork { system('cmd') }
fork { system('cmd') }
This will create multiple subprocesses which run the individual commands in their respective subshells.
Read up on forking here: http://www.ruby-doc.org/core-2.0/Process.html#method-c-fork
It's more than just a new thread; it's a completely separate process. It will be synchronous, and control will not return to Ruby until the command has completed. If you want a fire-and-forget solution, you can simply background the task:
$ irb
irb(main):001:0> system("sleep 30 &")
=> true
irb(main):002:0>
$ ps ax | grep sleep
3409 pts/4 S 0:00 sleep 30
You can start as many processes as you want via system("foo &") or `foo &`.
If you want more control over launching background processes from Ruby, including properly detaching ttys and a host of other things, check out the daemons gem. That's more suitable for long-running processes that you want to manage, with PID files, etc., but it's also possible to just launch tasks with it.
There are alternative solutions for managing background processes depending on your needs. The resque gem is popular for queuing and managing background jobs. It requires Redis and some setup, but it's good if you need that level of control.
