What is the uWSGI master process for? - uwsgi

What is the uWSGI master process for?
and/or where can I read more about it?
I have found zero documentation for it.

You can find some information about the master process in the uWSGI documentation.
Generally speaking, the master process is responsible for gracefully reloading your app server (so there is as little downtime as possible when you reload your app), it manages preforking and enables threading for app instances, and it handles some advanced logging features. It also keeps your app instances up and running: when one of the instances crashes, the master relaunches it. It probably also manages harakiri mode (killing hung workers).
Using a master process is recommended for your apps, unless you're using the Emperor.
Running the Emperor (not the vassals, but the Emperor process by itself) with a master is recommended only if you need some of the benefits the master gives, for example advanced logging. You can also skip the master for your vassals, because the Emperor will take over some of the work the master would normally do, but not all of it. I personally use a master for my vassals.
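To make that concrete, here is a minimal ini sketch of the options mentioned above (master, processes, threads and harakiri are real uWSGI options; the module value is just a placeholder):

[uwsgi]
# placeholder WSGI entry point, replace with your own app
module = myapp:application
# enable the master process: worker supervision and graceful reloads
master = true
# preforked worker processes, and threads per worker
processes = 4
threads = 2
# harakiri: kill and respawn any worker stuck for more than 30 seconds
harakiri = 30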

Related

Distributed Erlang - network split recovery and using heart with distributed applications

I have a standard situation: two distributed Erlang nodes, one master, one standby.
When I stop the master, the standby comes on (failover); when I start the master, the standby stops (takeover). Everything works fine as long as heart is not turned on and there is no network split.
However, when I disconnect the master from the network, after 60 seconds or so the standby gives me the error message ** removing (timedout) connection ** and starts up as if the master node had stopped. This makes sense to me: it doesn't know whether the master is alive or not, and epmd can't connect to the master node, so it is removed from the nodes() list. Let's pretend for a moment that this is the desired outcome.
The problem is that, when the connection is restored, I have the master and the standby running at the same time, and the standby is oblivious to the fact that the master is running. Pinging the standby during the master's init does not solve the issue. I checked nodes() on the standby after doing so; it sees the master node but still continues to run.
My solution for now has been to create a process that monitors all nodes above it in the hierarchy, and if any of them are online (can be pinged), the process calls erlang:halt() to terminate the standby node. It works for simple situations, but maybe someone can tell me if there is a better way? I found a similar problem described on the Elixir forum, so it's probably a known Erlang problem without an easy solution. https://elixirforum.com/t/distributed-application-network-split/10007
If, during a network split, you don't want to have two nodes running in parallel, I'm guessing an outside monitoring application needs to be used?
The second major issue is heart. If heart is turned on as is, the failover never happens. If heart runs with a sleep before it calls start, it stops the failover node when it calls the application start. So even when it can't start the master, due to it not having access to vital resources for example, it stops the failover node and doesn't bring it back up after it fails to start the master. I don't know whether heart is simply not supposed to be used with a distributed application, or whether there is an option to run some Erlang code that checks whether the resources are available before attempting to start the node and before stopping the failover node.
The documentation on heart is not great; it's very hard to find any examples of HEART_COMMAND. I found a way to set HEART_COMMAND to a script from within my application, but there is a limit to how long the argument can be, and from what I can tell it's not as long as stated in the documentation. The following, for example, sets a 60-second sleep before calling application start again. It doesn't solve any issues, because after 60 seconds it stops the failover node and hangs if the master node can't start.
heart:set_cmd("sleep 60; ./bin/myapp start")
The solution I've ended up with for now is letting heart of the main release start another release, a pre-loader, which does a preliminary check that all resources are available; if they are, it starts the main release/application, and if they are not, it keeps checking forever. This way the main app runs on the failover node without interruption. So the main release has heart turned on, and the pre-loader does not. I ended up using a bash script because I needed to do more work than I could fit in heart:set_cmd/1, so I'm still calling heart:set_cmd(Dir ++ "/priv/myHeartScript.sh " ++ Arg1 ++ " " ++ Arg2), but don't get carried away with the Args as there is a limit on size! I also used environment variables, which I set in vm.args using -env, to pass data to the script, such as the pre-loader path/name. This allowed me to avoid having to edit the script, too, during deployment.
If anyone has a better solution PLEASE let me know.
UPDATE
The team at Erlang Solutions was kind enough to shed some light on the subject. Basically, nobody they know uses Erlang's built-in distributed model. Everything revolves around the data, and as long as it is available in redundant databases you can spin up new applications at any time. They recommend using cloud hosts that can spin up new servers when one goes down, or using a redundant node design: have, say, 5 nodes up in parallel, and if a few go down you can restart them manually or by other means.
As for me, I can say that getting heart to start a pre-loader release/app gets the job done, but it gets complicated fast. Launching the app now requires provisioning several extra sys.config/vm.args/rebar.config files. I will be looking into their suggestions for the next iteration.
UPDATE
Moved away from using the Erlang distributed model. We now use RabbitMQ to send heartbeats to all nodes, including the sending node itself. If a node is receiving heartbeats from itself and no other node, it's the master; if it's receiving heartbeats from more than one node, use any attribute, like the node name, to choose the master. You don't have to use RabbitMQ, but you need to make sure all nodes can reach the same destination and consume from it.
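Roughly, the idea looks like this in Python with pika (the node name, exchange name and intervals are made up for illustration; this isn't the code we actually run):

# Every node publishes its own name to a fanout exchange and consumes everything
# published there; a node that only ever hears itself becomes the master.
import time
import pika

NODE_NAME = "node_a"            # placeholder node id
HEARTBEAT_INTERVAL = 5          # seconds between heartbeats
ELECTION_WINDOW = 3 * HEARTBEAT_INTERVAL

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="heartbeats", exchange_type="fanout")

# Each node gets its own throwaway queue bound to the shared exchange.
queue = channel.queue_declare(queue="", exclusive=True).method.queue
channel.queue_bind(exchange="heartbeats", queue=queue)

last_seen = {}                  # node name -> timestamp of last heartbeat

def elect_master():
    now = time.time()
    alive = sorted(n for n, t in last_seen.items() if now - t < ELECTION_WINDOW)
    # Lowest node name wins; any deterministic attribute works.
    return alive and alive[0] == NODE_NAME

while True:
    channel.basic_publish(exchange="heartbeats", routing_key="", body=NODE_NAME)
    deadline = time.time() + HEARTBEAT_INTERVAL
    while time.time() < deadline:
        _method, _props, body = channel.basic_get(queue=queue, auto_ack=True)
        if body is None:
            time.sleep(0.2)
            continue
        last_seen[body.decode()] = time.time()
    print("master" if elect_master() else "standby")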
Also, DevOps oppose using heart. They prefer to use standard Linux tools to monitor application status and restart it after a crash or a server reboot.

What is worker , dyno and zero-downtime deploys in heroku

These three terms are given a lot of importance in Heroku tutorials when explaining the different app servers, but I can't figure out their purpose and definitions.
Can anybody share some info about them?
Thanks
The Heroku reference guide has a lot of information on all of this, and lots more, but in answer to your question:
A dyno is effectively a small virtual server instance set up to run one app (it's behind an invisible load balancer, so you can have any number of them running side-by-side). You don't need to worry about the server admin side of things, as it just takes your source code from a Git push and runs it.
A worker is a type of dyno, usually designed to process tasks in the background (in contrast to a web dyno, which just serves web pages). For example, Rails has ActiveJob, which plugs into something like Resque or Sidekiq and completes tasks that would slow down the web interface if it had to handle them itself, like sending e-mails or geocoding addresses.
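As a concrete (hypothetical) example, a Rails app's Procfile typically declares one of each dyno type, and Heroku then lets you scale the two process types independently:

web: bundle exec puma -C config/puma.rb
worker: bundle exec sidekiq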
Zero-downtime deploys is really marketing speak for "if you push your code, it will wait until the new version is up and running before swapping the web interface over to it". It means you don't need to do anything, and your web app won't go offline while the switch happens.

Any additional considerations when using Faye in a Node.js cluster?

We're planning to run an Express-based server on Node.js in "cluster mode" using Node.js' cluster support. So there will be 1 master process and 'n' (where 'n' is calculated based on the number of CPUs) child processes running on a single machine. We already have a testbed set up using Faye for pubsub in non-cluster mode and it works great.
Are there any additional considerations we need to be aware of when using Faye on top of a Node cluster? For example, since there will be 'n' HTTP server instances, will it be a problem creating a Faye NodeAdapter in each Node process and attaching it to the HTTP server instance in that process?
Thanks.
I just realized that the answer to my question is fairly obvious. One thing to be aware of is that Faye will need to access shared state across multiple server instances (processes). In a single-server config, you could probably get away with using Faye's memory engine. In a clustered config, you'd need to use Faye's redis engine or some other engine that allows state to be shared by different processes. I'd prefer not to introduce another persistence component just for this purpose so I may look into implementing my own on top of my current persistent store (Neo4j).

How to (or should I) monitor or ensure running of a monitoring software?

I'm writing system/service monitoring software, and my primary goal is to make it as fail-safe as possible.
Right now I have a binary script that starts the master process, which forks off children that do the actual monitoring and reporting. The master only manages restarting the children if they fail, plus some communication between them.
Given this level of fail-safety, is it advisable to add another layer of monitoring for the master process?
Supposing my code is in a high-level language (Python et al.), would it make sense to wrap my software in an init script or shell script that watches it, or would that be redundant?
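For concreteness, the master/children arrangement I described amounts to something like this minimal Python sketch (the worker names and intervals are invented):

# The master's only jobs: start the workers, restart any that die,
# and collect/relay simple messages from them.
import time
import multiprocessing as mp

def worker(name, outbox):
    """Stand-in for one monitoring child; reports periodically."""
    while True:
        outbox.put((name, "ok"))
        time.sleep(10)

def main():
    outbox = mp.Queue()
    specs = ["disk_check", "service_check"]     # invented child names
    children = {}

    def spawn(name):
        p = mp.Process(target=worker, args=(name, outbox), daemon=True)
        p.start()
        children[name] = p

    for name in specs:
        spawn(name)

    while True:                                 # the master's supervision loop
        for name, p in list(children.items()):
            if not p.is_alive():                # child died: restart it
                spawn(name)
        while not outbox.empty():               # relay child reports
            print(outbox.get())
        time.sleep(1)

if __name__ == "__main__":
    main()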
This reminds me of an old worm that consisted of two processes: if one of them was killed, the other would respawn it, and vice versa.
If this software is supposed to run on Linux, you can simply use /etc/inittab with the respawn option.
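An inittab entry for that would look roughly like this (the id, runlevels and path are placeholders); init then restarts the process whenever it exits:

mon1:2345:respawn:/usr/local/bin/my-monitor-master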

Rails best practice: background process/thread?

I'm coming from a PHP environment (at least in terms of web dev) into the beautiful world of Ruby, so I may have some dumb questions. I imagine there are some fundamentally different options available when not using PHP.
In PHP, we use memcache to store alerts we want to display in a bar along the top of the page. When something happens that generates an alert (such as a new blog post being made), a cron script that runs once every 5 minutes or so puts that information into memcache.
Now when a user visits the site, we look in memcache to find any alerts that they haven't already dismissed and we display them.
What I'm guessing I can do differently in Rails is to bypass the need for a cron script, and also the need to look in memcache on every request, by using a Singleton and a polling process running in a separate thread to copy from memcache into this singleton. This would, in theory, be more optimized than checking memcache once per request, and would also encapsulate the polling logic in one place, rather than splitting it between a cron task and the lookup logic.
My question is: are there any caveats to having some sort of runloop in the background while a Rails app is running? I understand the implications of multithreading, from Objective-C/Java, but I'm asking specifically about the Rails (3) environment.
Basically something like:
class SiteAlertsMap < Hash
  include Singleton

  def initialize
    super
    begin_polling
  end

  # ... SNIP, any specific methods etc ...

  private

  def begin_polling
    # Create some other Thread here, which polls at set intervals
  end
end
This leads me into a similar question. We push (encrypted) tasks onto an SQS queue for things related to e-commerce and for long-running background tasks. We don't use cron for this; instead we have a worker daemon written in PHP, which runs in the background. Right now when we deploy, we have to shut down this worker and start it again from the new code-base. In Rails, could I somehow have this process start and stop with the Rails server (unicorn) itself? I don't think it's something I'd run on the main process in a separate thread, since we often want to control it as a process by itself, but it would be nice if it just conveniently ran whenever the web application was running.
Threading for background processes in Ruby would be a terrible mistake, especially since you're using a multi-process server. Using unicorn with, say, 4 worker processes would mean that you'd be polling from each of them, which is not what you want. Ruby doesn't really have real threads: it has green threads in 1.8 and a global interpreter lock in 1.9, IIRC. Many gems and libraries are also obnoxiously unthreadsafe.
Using memcache is still your best option and, if you have it set up correctly, you should only see it adding a millisecond or two to the request time. Another option which would give you the benefit of persisting these alerts while incurring minimal additional overhead would be to store these alerts in redis. This would better protect you against things like memcache crashing or server reboots.
For the background jobs you should use a similar approach to what you have now, but there are several off-the-shelf handlers for this, like resque, delayed_job, and a few others. If you absolutely have to use SQS as the backend queue, you might be able to find some code to help you, but otherwise you could write it yourself. This still requires the other daemon to be rebooted whenever there is a code change. In practice this isn't a huge concern, as best practice dictates using a deployment system like Capistrano, where a rule can easily be added to bounce the daemon on deploy. I use monit to watch the daemon process, so restarting it is as easy as telling monit to restart it.
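For reference, the monit stanza for watching such a daemon is only a few lines (the process name, pidfile and init script paths are placeholders):

check process background_worker with pidfile /var/run/background_worker.pid
  start program = "/etc/init.d/background_worker start"
  stop program = "/etc/init.d/background_worker stop"

monit restarts the daemon if it disappears, and "monit restart background_worker" bounces it on demand.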
In general, Ruby is not like Java/Objective-C when it comes to threads. It follows the more Unix-like model of process-based isolation, but the community has come up with best practices and ways to make this less painful than in other languages. Ruby does require a bit more attention when setting up its stack, as it is not as simple as enabling mod_php and copying some files around, but once the choices and architecture are understood, it is easier to reason about how your application works. The process model, in my opinion, is much better for web apps, as it isolates code and state from the effects of other running operations. The isolation also makes the app easier to work with in a distributed system.
