Erlang/Elixir on Docker and Hot Code Swap

One of the features of Erlang (and, by extension, Elixir) is that you can do hot code swapping. However, this seems to be at odds with Docker, where you would need to stop your instances and start new ones from images holding the new code. This essentially seems to be what everyone does.
That being said, I also know that it is possible to use one hidden node to distribute updates to all other nodes over the network. Of course, put like that it sounds like asking for trouble, but...
My question would be the following: has anyone tried, with reasonable success, to set up a Docker-based infrastructure for Erlang/Elixir that allows hot code swapping? If so, what are the do's, don'ts and caveats?

The story
Imagine a system to handle mobile phone calls or mobile data access (that's what Erlang was created for). There are gateway servers that maintain the user session for the duration of the call, or the data access session (I will call it the session going forward). Those servers have an in-memory representation of the session for as long as the session is active (the user is connected).
Now there is another system that calculates how much to charge the user for the call or the data transferred (call it PDF - Policy Decision Function). Both systems are connected in such a way that the gateway server creates a handful of TCP connections to PDF, and it drops user sessions if those TCP connections go down. The gateway can handle a few hundred thousand customers at a time. Whenever there is an event that the user needs to be charged for (the next data transfer, another minute of the call), the gateway notifies PDF about it and PDF subtracts a specific amount of money from the user account. When the user account is empty, PDF notifies the gateway to disconnect the call (you've run out of money, you need to top up).
Your question
Finally, let's talk about your question in this context. We want to upgrade a PDF node, and the node is running on Docker. We create a new Docker instance with the new version of the software, but we can't shut down the old version (there are hundreds of thousands of customers in the middle of their calls; we can't disconnect them). But we need to move the customers somehow from the old PDF to the new version. So we tell the gateway node to create any new connections with the updated node instead of the old PDF. Customers can be chatty, and some of them may have long-running data connections (downloading a Windows 10 ISO), so the whole operation takes 2-3 days to complete. That's how long it can take to upgrade one version of the software to another in case of a critical bug. And there may be dozens of servers like this one, each handling hundreds of thousands of customers.
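To make the draining approach concrete, here is a minimal Python sketch of the routing decision the gateway would have to make. The class, the node names and the session ids are all hypothetical; a real gateway would track TCP connections rather than strings.

```python
class PdfRouter:
    """Routes sessions to PDF nodes while an old node is being drained."""

    def __init__(self, active_node):
        self.active_node = active_node  # node that receives new sessions
        self.sessions = {}              # session id -> node serving it

    def switch_to(self, new_node):
        # Only affects sessions opened from now on; existing ones stay put.
        self.active_node = new_node

    def open_session(self, session_id):
        self.sessions[session_id] = self.active_node
        return self.active_node

    def close_session(self, session_id):
        self.sessions.pop(session_id, None)

    def can_retire(self, node):
        # A node can be shut down once no open session refers to it.
        return node != self.active_node and node not in self.sessions.values()


router = PdfRouter("pdf-old")
router.open_session("alice")   # lands on the old node
router.switch_to("pdf-new")
router.open_session("bob")     # lands on the new node
```

The old node can only be retired once `can_retire("pdf-old")` becomes true, which is exactly why the operation takes as long as the longest-lived session.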
But what if we used the Erlang release handler instead? We create the relup file with the new version of the software. We test it properly and deploy to PDF nodes. Each node is upgraded in-place - the internal state of the application is converted, the node is running the new version of the software. But most importantly, the TCP connection with the gateway server has not been dropped. So customers happily continue their calls or are downloading the latest Windows iso while we are upgrading the system. All is done in 10 seconds rather than 2-3 days.
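The key step in an in-place upgrade is converting the old in-memory state to the shape the new code expects, without touching the open connection. In OTP this conversion lives in the `code_change` callback of the behaviour (for example `gen_server`); the Python below only illustrates the idea and is not Erlang release handling. The state layout and the added currency field are made up for the example.

```python
def convert_state(old_state):
    """Convert a version-1 session state to version 2.

    Hypothetical change: v1 stored the balance as a bare number,
    v2 stores it together with a currency.
    """
    new_state = dict(old_state)  # shallow copy: the socket object is reused
    new_state["version"] = 2
    new_state["balance"] = {"amount": old_state["balance"], "currency": "GBP"}
    return new_state


sock = object()  # stands in for the live TCP connection to the gateway
old = {"version": 1, "balance": 500, "socket": sock}
new = convert_state(old)
```

After the conversion, `new["socket"]` is the very same object as before, which mirrors the point above: the state changes shape, the connection is never dropped.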
The answer
This is an example of a specific system with specific requirements. Docker and Erlang's release handling are orthogonal technologies. You can use either or both; it all boils down to the following:
Requirements
Cost
Will you have enough resources to test both approaches predictably and enough patience to teach your Ops team so that they can deploy the system using either method? What if the testing facility costs millions of pounds (because of the required hardware) and can use only one of those two methods at a time (because the test cycle takes days)?
The pragmatic approach might be to deploy the nodes initially using Docker and then upgrade them with Erlang release handler (if you need to use Docker in the first place). Or, if your system doesn't need to be available during the upgrade (as the example PDF system does), you might just opt for always deploying new versions with Docker and forget about release handling. Or you may as well stick with release handler and forget about Docker if you need quick and reliable updates on-the-fly and Docker would be only used for the initial deployment. I hope that helps.

Related

Use Docker to keep track of software versions/installations?

I have a data processing application which is updated on a regular basis. This application has a bunch of dependencies which are also updated every now and then. However, different versions of the software (+dependencies) might produce different results (this is expected). The application is run on a remote computer and it can be accessed through a Web page. Every time a user uses the Web page to do some processing, they also choose which version of the software they want to use.
Now I am trying to decide which is the best way of keeping track of the different software (+dependencies) versions. The simplest way of course is to just compile and install each version of my software and its dependencies in a different folder, and then, based on the request the user sends, select the appropriate folder. However, this sounds very clunky to me. So I thought I could use Docker to keep track of the different software versions. Do you think that it is a good idea? If yes, what is most appropriate to do every time I have a new version of the software (and/or dependencies): 1) Create a new container from scratch with the new version (and end up having multiple containers), or 2) Update the existing container and commit the changes? (I suppose I can access the older commits of the container, right?)
PS: Keep in mind that the reason I looked into Docker and not a simple virtual machine solution is that the application I am running is a high-performance GPU-based software.
Docker is a reasonable choice. Your repository would contain all of the app versions you wish to publish. Note, you will only realize savings if you organize the resulting app filesystem into layers, of which the lower layers are the least likely to change between versions. This will keep the storage requirements at a minimum.
Then you have to decide how you will process each job. A robust (but complex) solution would be to have one or more API containers which take in processing jobs from your user and "dole" them out to worker containers (one or more from each release version). This would provide the lowest response latency and be non-blocking. You can look at different service discovery models to see how your "worker" containers can register with your "manager" containers. This is probably more than you'd like to bite off, but consider using a good key-value database (another container!) like etcd or a 3rd party service discovery tool like zookeeper/eureka/consul.
A much simpler model would have a single API container with one each of the release containers created, but not started. The API container would start, direct, and then stop the appropriate release container. You would incur the startup latency, but this is the least resource intensive... and easiest to manage. But this is a blocking operation.
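As a rough sketch of what the single API container could do per request, the Python below maps the version chosen by the user to an image tag and builds the command line to run it. Note this is a variation on the model above: it uses `docker run --rm`, which creates and removes a fresh container per job, instead of starting and stopping pre-created containers. The image names and the job arguments are hypothetical.

```python
# Hypothetical mapping from the version shown on the web page to an image tag.
RELEASES = {
    "1.0": "myapp:1.0",
    "1.1": "myapp:1.1",
}


def docker_run_command(version, job_args):
    """Build the argv list for running one job on the requested release."""
    image = RELEASES.get(version)
    if image is None:
        raise KeyError(f"unknown release {version!r}")
    # --rm removes the container once the job has finished.
    return ["docker", "run", "--rm", image, *job_args]


command = docker_run_command("1.1", ["--input", "job42.dat"])
```

The resulting list can be handed to `subprocess.run(command, check=True)`; because that call waits for the job to finish, it is blocking, as described above.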
Somewhere in the middle, but less user friendly, is to have each release container running but listening on a different host port (the app always sees the same port). The user would connect to the port which is servicing the desired release of the app. You'd have to provide some sort of index to make this useful.

Software setup for hardware enabled application

I have a Raspberry PI that is tightly coupled with a device that I want to control.
The desired setup I want to have would look something like this:
The physical device with interactive hardware controls on the device (speaker, mic, buttons)
A Raspberry PI coupled to the device
On the PI:
A daemon app that reacts to changes from the hardware
A web interface that shows the current state of the device and allows the device to be configured
The system should somehow be able to update itself with new software when it becomes available (apt-get or some other mechanism).
For the Webinterface I am going to use a rails app, which is not a problem as such. What is not clear to me is the event-driven software that is talking to the hardware through gpio. Firstly, I would prefer to do this using ruby, so that I don't have a big technology gap when developing the solution.
How can I ensure that both apps start up and run in the background when the Raspberry Pi starts?
How do I notify the web app of an event (e.g. a button was pressed)?
I wonder if it makes sense for the two pieces of software to have a shared database to communicate.
How do I best set up an auto-update mechanism for both pieces of software without requiring the user to take any action?
Apps
This will be dependent on the operating system
If you install a lightweight version of Linux, you might be able to create some runtime applications or something. I've never done anything like this; but I know from Windows you can create startup programs -- likewise, you should be able to do something similar in Linux
BTW you wouldn't "run" the Rails app - you'll fire up the server to capture any requests. You'd basically run your app locally in "production" mode - allowing you to send requests, either through localhost, or setup a pseudo domain in the HOSTS file of your box
--
Web App
The web app itself is RESTful, meaning (I believe), it will only act upon having requests sent to it. Because this works over the HTTP protocol, it essentially means you'll need some sort of (web) service to send requests to the web app:
Representational state transfer (REST) is a way to create, read,
update or delete information on a server using simple HTTP calls
Although I've never done this myself, I would use the Ruby app on your Pi to send HTTP requests to your Rails app. This will certainly add a level of complexity, but it will give you an interface between the two types of data transfer.
The difference you have is Rails / any other web app will only act on request. "Native" applications will run as long as the operating system is operating; meaning you can "listen" for updates from the hardware etc.
What I would do is split the functionality:
Hardware input > send to service
Service > sends to Rails
Rails > sends response to service
Service > processes response
This may seem inefficient, but I think it's the best way to capture local-based input from your hardware. You'll have to use a localhost rails app, running with something like nginx or some other efficient server
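The "service sends to Rails" step can be sketched as a plain HTTP POST. The example is in Python for brevity (the same shape applies to a Ruby daemon), and the URL, route and JSON field names are assumptions, not something the Rails app is known to expose.

```python
import json
import urllib.request

RAILS_URL = "http://localhost:3000/events"  # hypothetical route in the Rails app


def build_event_request(name, payload):
    """Build (but do not send) a POST request describing a hardware event."""
    body = json.dumps({"event": name, "payload": payload}).encode("utf-8")
    return urllib.request.Request(
        RAILS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_event_request("button_pressed", {"button": 1})
# To actually send it: urllib.request.urlopen(request, timeout=2)
```

The daemon would call this from its GPIO event handler and treat a failed request as non-fatal, so that the hardware loop keeps running while the web app is down.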
--
Database
It would only make sense if they shared the data. You should remember that a database is different from a datatable. A database stores many tables, and is generally meant for a single purpose, whilst a datatable stores a single type of data.
From what you've written, I would recommend using two databases running on the same db server. This will give you the ability to create as many tables as you want for these databases - giving you scope to add as many different pieces of data you wish to each. Sharing data can be done using an API or a web service
--
Updating
Rails app will not need to be "updated" - you'll just need to deploy a fresh version. The beauty of Internet-centric software :)
In terms of your Raspberry Pi "on-board" software update - I don't have much experience with this, so can only recommend

Distributing an Erlang Chat system

I just finished Erlang in Practice screencasts (code here), and have some questions about distribution.
Here's the overall architecture:
Here is what the supervision tree looks like:
Reading Distributed Applications leads me to believe that one of the primary motivations is for failover/takeover.
However, is it possible, for example, the Message Router supervisor and its workers to be on one node, and the rest of the system to be on another, without much changes to the code?
Or should there be 3 different OTP applications?
Also, how can this system be made to scale horizontally? For example if I realize now that my system can handle 100 users, and that I've identified the Message Router as the main bottleneck, how can I 'just add another node' where now it can handle 200 users?
I've developed Erlang apps only during my studies, but generally we had many small processes doing only one thing and sending messages to other processes. And the beauty of Erlang is that it doesn't matter whether you send a message within the same Erlang VM, within the same computer, on the same LAN or over the Internet; the call and the reference to the other process always look the same to the developer.
So you really want to have one application for every small part of the system.
That being said, it doesn't make it any simpler to construct an application which can scale out. A rule of thumb says that if you want an application to work on a factor of 10-times more nodes, you need to rewrite, since otherwise the messaging overhead would be too large. And obviously when you start from 1 to 2 you also need to consider it.
So if you have found a bottleneck, an application which is particularly slow when handling too many clients, you want to run a second instance of it, and then you need to have some additional load balancing implemented before you start the second instance.
Let's assume the supervisor checks the message content for inappropriate content and is therefore slow. In this case the node everyone is talking to would be a simple router application which forwards the messages to different instances of the supervisor application in a round-robin manner. In case those 1 or 2 instances are not enough, you could write the router in such a way that you can change the number of instances by sending it control messages.
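The router described here can be sketched in a few lines. This is Python rather than Erlang and only shows the round-robin selection and the add/remove control operations; the worker names are hypothetical placeholders for the instances of the slow application.

```python
class RoundRobinRouter:
    """Forwards each message to the next worker in turn."""

    def __init__(self, workers):
        self.workers = list(workers)
        self._next = 0  # index of the worker that gets the next message

    def add_worker(self, worker):
        # Control operation: scale out by one instance.
        self.workers.append(worker)

    def remove_worker(self, worker):
        # Control operation: scale in; restart the rotation to stay in range.
        self.workers.remove(worker)
        self._next = 0

    def route(self, message):
        if not self.workers:
            raise RuntimeError("no workers available")
        worker = self.workers[self._next % len(self.workers)]
        self._next = (self._next + 1) % len(self.workers)
        return worker, message


router = RoundRobinRouter(["filter-1", "filter-2"])
first_four = [router.route(m)[0] for m in ["a", "b", "c", "d"]]
router.add_worker("filter-3")
next_three = [router.route(m)[0] for m in ["e", "f", "g"]]
```

In an Erlang version the router would be a process, `route` would be a message send to the chosen worker, and `add_worker`/`remove_worker` would be the control messages mentioned above.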
However, for this to work automatically, you would need another process monitoring the servers and discovering when they are overloaded or underutilized.
I know that dynamically adding and removing resources always sounds great when you hear about it, but as you can see it is a lot of work and you need to have some messaging system built which allows it, as well as a monitoring system which can monitor the need.
Hope this gives you some idea of how it could be done, unfortunately it's been over a year since I wrote my last Erlang application, and I didn't want to provide code which would be possibly wrong.

Is this the right way of building an Erlang network server for multi-client apps?

I'm building a small network server for a multi-player board game using Erlang.
This network server uses a local instance of Mnesia DB to store a session for each connected client app. Inside each client's record (session) stored in this local Mnesia, I store the client's PID and NODE (the node where a client is logged in).
I plan to deploy this network server on at least 2 connected servers (Node A & B).
So in order to allow a Client A who is logged in on Node A to search (query Mnesia) for a Client B who is logged in on Node B, I replicate the Mnesia session table from Node A to Node B or vice versa.
After Client A queries the PID and NODE of the Client B, then Client A and B can communicate with each other directly.
Is this the right way of establishing connection between two client apps that are logged-in on two different Erlang nodes?
Creating a system where two or more nodes are perfectly in sync is by definition impossible. In practice however, you might get close enough that it works for your particular problem.
You don't say the exact reason behind running on two nodes, so I'm going to assume it is for scalability. With many nodes, your system will also be more available and fault-tolerant if you get it right. However, the problem could be simplified if you know you will only ever run on a single node, and need the other node as a hot-slave to take over if the master is unavailable.
To establish a connection between two processes on two different nodes, you need some global addressing (user id 123 is pid <123,456,0>). If you also care about only one process for User A running at a time, you also need a lock, or you must allow only unique registrations in the addressing. If you also want to grow, you need a way to add more nodes, either while your system is running or when it is stopped.
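The "addressing plus unique registration" requirement boils down to the following contract, sketched here in Python as a single in-memory table. This is deliberately not distributed; it only shows the behaviour that any of the distributed solutions has to provide, and the key and address values are made up.

```python
class Registry:
    """Maps a key (e.g. a user id) to the address of the process serving it."""

    def __init__(self):
        self._names = {}

    def register(self, key, address):
        # Unique registration: a second process for the same key is refused.
        if key in self._names:
            return False
        self._names[key] = address
        return True

    def unregister(self, key):
        self._names.pop(key, None)

    def whereis(self, key):
        return self._names.get(key)


registry = Registry()
first = registry.register("user-123", "node_a/pid-1")
second = registry.register("user-123", "node_b/pid-7")  # refused, key is taken
```

The hard part, which this sketch skips entirely, is keeping that table consistent across nodes; that is what the trade-offs between the options are about.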
Now, there are already some solutions out there that helps solving your problem, with different trade-offs:
gproc in global mode allows registering a process under a given key (which gives you addressing and locking). This is distributed to the entire cluster, with no single point of failure; however, the leader election (at least when I last looked at it) works only for nodes that were available when the system started. Adding new nodes requires an experimental version of gen_leader or stopping the system. Within your own code, if you know two players are only ever going to talk to each other, you could start them on the same node.
riak_core allows you to build on top of the well-tested and proven architecture used in Riak KV and Riak Search. It maps the keys into buckets in a fashion that allows you to add new nodes and have the keys redistributed. You can plug into this mechanism and move your processes. This approach does not let you decide where to start your processes, so if you have much communication between them, it will go across the network.
Using mnesia with distributed transactions allows you to guarantee that every node has the data before the transaction is committed. This would give you distribution of the addressing and locking, but you would have to do everything else on top of it (like releasing the lock). Note: I have never used distributed transactions in production, so I cannot tell you how reliable they are. Also, due to being distributed, expect latency. Note 2: You should check exactly how you would add more nodes and have the tables replicated, for example whether it is possible without stopping mnesia.
ZooKeeper/doozer/roll your own provides a centralized, highly available database which you may use to store the addressing. In this case you would need to handle unregistering yourself. Adding nodes while the system is running is easy from the addressing point of view, but you need some way to have your application learn about the new nodes and start spawning processes there.
Also, it is not necessary to store the node, as the pid contains enough information to send the messages directly to the correct node.
As a cool trick which you may already be aware of, pids may be serialized (as may all data within the VM) to a binary. Use term_to_binary/1 and binary_to_term/1 to convert between the actual pid inside the VM and a binary which you may store in whatever accepts binary data without mangling it in some stupid way.

offline web application design recommendation

I want to know which is the best architecture to adopt for this case :
I have many shops that connect to a web application developed using Ruby on Rails.
internet is not reachable all the time
The solution was to develop an offline system which requires installing a local copy of the distant database.
All this was already developed.
Now what I want to do :
Work always on the local copy of the database.
Any change on the local database should be synchronized with distant database.
All the local copies should have the same data in other local copies.
To resolve this problem I thought about using JMS-like software, possibly RabbitMQ.
This consists of pushing every SQL request into a queue that will be executed on the distant instance of the application, which will insert into the distant DB and push the insert or SQL statement into another queue that will be read by all the local instances. This seems complicated and would probably slow down the application.
Is there a design or recommendation that I must apply to resolve this kind of problem ?
You can do that but essentially you are developing your own replication engine. Those things can be a bit tricky to get right (what happens if m1 and m3 are executed on replica r1, but m2 isn't?) I wouldn't want to develop something like that unless you are sure you have the resources to make it work.
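To illustrate the m1/m2/m3 problem, here is a small Python sketch of one thing a home-grown replication engine has to get right: applying messages strictly in sequence and holding back anything that arrives after a gap. The sequence numbers and the `Replica` class are assumptions for the example; they are not part of RabbitMQ or any database.

```python
class Replica:
    """Applies replicated statements strictly in sequence order."""

    def __init__(self):
        self.applied = []   # statements applied so far, in order
        self.next_seq = 1   # sequence number we are waiting for
        self.pending = {}   # seq -> statement that arrived too early

    def receive(self, seq, statement):
        if seq < self.next_seq:
            return  # duplicate delivery, already applied
        self.pending[seq] = statement
        # Apply everything that is now contiguous with what we already have.
        while self.next_seq in self.pending:
            self.applied.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


r1 = Replica()
r1.receive(1, "m1")
r1.receive(3, "m3")                 # held back: m2 has not arrived yet
applied_before_m2 = list(r1.applied)
r1.receive(2, "m2")                 # fills the gap, m3 is applied as well
```

Even this small piece needs persistence, retransmission requests and conflict handling before it is usable, which is the reason to prefer an existing replication solution.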
I would look into existing off-the-shelf replication solutions. If you are already using a SQL DB, it probably has some support for it. Look here for more details if you are using MySQL
Alternatively, if you are willing to explore other backends, I heard that CouchDB has great support for replication. I also heard of people using git libraries to do that sort of thing.
Update: After your comment, I realize you already use MySQL replication and are looking for a solution for re-syncing the databases after being offline.
Even in that case RabbitMQ doesn't help you at all, since it requires a constant connection to work, so you are back to square one. The easiest solution would be to just write all the changes (SQL commands) into a text file at the remote location; then, when you get the connection back, copy that file (scp, ftp, email or whatever) to the master server, run all the commands there and then just resync all the replicas.
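A minimal sketch of the "write changes to a file, replay them on the master" idea, in Python. It uses SQLite from the standard library as a stand-in for the MySQL master so that it runs anywhere; the table, file name and JSON-lines journal format are assumptions made for the example.

```python
import json
import os
import sqlite3
import tempfile


def journal(path, sql, params):
    """Append one change to the local journal file (one JSON object per line)."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({"sql": sql, "params": params}) + "\n")


def replay(path, conn):
    """Run every journaled statement, in order, inside one transaction."""
    with open(path, encoding="utf-8") as fh:
        entries = [json.loads(line) for line in fh if line.strip()]
    with conn:  # commits on success, rolls back if any statement fails
        for entry in entries:
            conn.execute(entry["sql"], entry["params"])
    return len(entries)


# While offline, the shop records its changes locally.
path = os.path.join(tempfile.mkdtemp(), "changes.jsonl")
journal(path, "INSERT INTO sales (item, qty) VALUES (?, ?)", ["tea", 2])
journal(path, "INSERT INTO sales (item, qty) VALUES (?, ?)", ["coffee", 1])

# Once the file has been copied over, the master replays it.
master = sqlite3.connect(":memory:")
master.execute("CREATE TABLE sales (item TEXT, qty INTEGER)")
replayed = replay(path, master)
```

Replaying inside a single transaction means a half-copied or conflicting journal leaves the master unchanged, which makes the conflict handling discussed next easier to reason about.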
Depending on your specific project you may also need to make sure there are no conflicts when running commands from different remote location but there is no general technical solution to this. Again, depending on the project, you may want to cancel one of the transactions, notify the users that it happened and so on.
I would recommend taking a look at CouchDB. It's a non-SQL database that does exactly what you are describing automatically. It's used especially in phone applications that often don't have internet or data connectivity. The idea is that you have a local copy of a CouchDB database and one or more remote CouchDB databases. The CouchDB server then takes care of the replication between the distributed systems and you always work off your local database. This approach is nice because you don't have to build your own distributed replication engine. For more details I would take a look at the 'Distributed Updates and Replication' section of their documentation.
