Smallest server setup for Raspberry Pi - scalability not important - ruby-on-rails

I'm building a system that manages home HVAC and uses a Raspberry Pi running Ruby on Rails 3.2 with Ruby 2.0. Nearly all web content is dynamic. Scalability is not important, as it's only used within a home network and by only one or two simultaneous users. What's important is using minimal memory. Reliability is also very important. Fast is good. Does it make sense to just use the default Webrick server in production, or is there value in fronting it with say Apache or Cherokee and/or replacing it with Passenger or Puma or something else?

Don't use WEBrick in production environments. It lacks support for handling concurrent requests and is optimized for development, not production. Adding a full-blown web server like Apache would not help you reduce memory usage either.
I recommend using Thin in your scenario. It's a fast and lightweight web server written in Ruby.
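For the Raspberry Pi case the switch is small. A minimal sketch, assuming the standard thin gem and a plain Rails 3.2 app (the port and environment below are just examples):

    # Gemfile
    gem 'thin'

    # then, from the app root:
    #   bundle install
    #   RAILS_ENV=production bundle exec thin start -e production -p 3000 -d

The -d flag daemonizes Thin; drop it if you run it under a supervisor that expects a foreground process.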

Related

Ruby on Rails server requirements

I use Rails for small applications, but I'm not at all an expert. I'm hosting them on a Digital Ocean server with 512 MB of RAM, which seems to be insufficient.
I was wondering what the Ruby on Rails server requirements are (in terms of RAM) for a single app.
Besides, how can I measure whether my server is able to support the number of applications running on it?
Many thanks
It depends on how much traffic you think you need to handle. We have two machines (32 GB of RAM each; usage see below) with 32 unicorn workers to serve one app with loads of traffic, and we have one machine with lots of 2-worker apps that get very little traffic.
We also have to consider the database (which needs the most RAM by far in our case, due to the big caches we granted it). And on top of all that, *nix caches the filesystem in otherwise unused RAM.
Conclusion: It is very hard to tell without you telling us what sort of traffic you expect.
Our memory usage on one of the two servers for the big app: https://gist.github.com/2called-chaos/bc2710744374f6e4a8e9b2d8c45b91cf
The output is from a little ruby script I made called unistat: https://gist.github.com/2called-chaos/50fc5412b34aea335fe9
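If you just want a rough number for your own 512 MB server, a small sketch along these lines can total the resident set size of your app-server processes (it assumes a Linux host with a procps-style ps, and note that RSS double-counts memory shared between copy-on-write workers, so treat it as an upper bound):

    # memcheck.rb - rough RSS total for processes matching a pattern
    pattern = ARGV[0] || 'unicorn'
    lines = `ps -eo rss,args`.lines.select { |l| l.include?(pattern) && !l.include?('memcheck') }
    total_kb = lines.inject(0) { |sum, l| sum + l.split.first.to_i }
    puts "#{lines.size} matching processes, ~#{total_kb / 1024} MB RSS in total"

Run it as, for example, ruby memcheck.rb unicorn (or pass puma, thin, etc.).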

Neo4j rest server v/s embedded

In which mode should the Neo4j database be used: embedded or REST server?
My main concerns are:
Performance
Horizontal scaling (HA, clustering) - essential, as the application is very big.
Transactional support (in frameworks like SDN, the Grails plugin, structr, etc.)
Deployment/hosting support (Amazon, GrapheneDB, etc.)
Ease of switching from one to the other
Scaling (size of the database)
Disclaimer: I'm one of the founders of GrapheneDB.
I'm not an expert in embedded mode so my answer might be biased but I will try my best:
Embedded mode is currently more performant than server mode.
Clustering is supported in both embedded and server mode.
Transactional support is available in both modes, AFAIK. Spring Data, however, currently has poor performance over REST/server.
From my POV embedded has the disadvantage of being coupled to your app/server deployment.
There is one more option which you haven't brought up, which is using unmanaged server extensions.
Using extensions you can get the best of both modes:
You write your code on top of the Java API and it's executed locally, so you get extremely good performance.
You can run the server in server mode, making operations easier and also enabling you to host on a separate remote host, on any cloud environment.
GrapheneDB supports unmanaged extensions and it's the option we currently recommend for scenarios where extra performance is needed.
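For completeness, here is what server/REST mode looks like from a plain Ruby client. This is only a sketch, assuming a Neo4j 2.x server on localhost:7474 with its transactional Cypher endpoint and no authentication; adjust the host, port and credentials for your own deployment:

    require 'net/http'
    require 'json'

    # POST one Cypher statement to the transactional endpoint and commit in a single call
    uri = URI('http://localhost:7474/db/data/transaction/commit')
    payload = { statements: [{ statement: 'MATCH (n) RETURN count(n) AS nodes' }] }

    req = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json', 'Accept' => 'application/json')
    req.body = payload.to_json
    res = Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
    puts JSON.parse(res.body)['results'].inspect

An unmanaged extension would sit behind a similar HTTP call, just against your own endpoint instead of the generic Cypher one.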

Optimal Hosting for a Statistical Analysis Rails App

I am developing a web app that performs regression analysis on user data.
On the backend, RoR takes care of the application logic, and all statistical analysis is done by R (since Ruby has poor statistics packages).
Given that both R and RoR are single-threaded, and that the app is expected to be used concurrently by several users, I need your advice on the optimal configuration.
For example: should I run R and RoR on separate instances and have RoR communicate with R via REST? Run both on the same machine, which can be clustered? Use Revolution Analytics?
What would be a good hosting configuration to allow my app to scale?
You could create a proxy to communicate to multiple webservers, and in turn each of these webservers communicates via a proxy to several R_servers. To have the proxy servers balance the load, you can look into something like Nginx's upstream directives.
The diagram below shows 3 webservers (which are exact clones of each other), and 3 R_servers (which are exact clones of each other). Use however many you need, since it's easy to add/remove the webservers or R_servers to scale horizontally.
             webserver1            R_server1
            /          \          /
    proxy -  webserver2 - proxy -  R_server2
            \          /          \
             webserver3            R_server3
Look at Rserve which, when hosted on Linux, forks off a new instance on every connection.
The connection is made over the network, and there are Ruby clients available, as indicated by a Google search.
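As a concrete example of the Rserve route, here is a rough sketch from the Ruby side. It assumes the third-party rserve-client gem and an Rserve daemon already listening on the default port 6311; check the gem's documentation for the exact API, since the details here are from memory:

    require 'rserve'   # gem install rserve-client (assumed gem name)

    conn = Rserve::Connection.new   # connects to localhost:6311 by default
    # run a simple regression in R and pull the coefficients back into Ruby
    coefs = conn.eval('coef(lm(mpg ~ wt, data = mtcars))').to_ruby
    puts coefs.inspect
    conn.close

Opening a connection per Rails request (or per background job) maps naturally onto Rserve's fork-per-connection model on Linux.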

How to deploy a [Ruby on Rails] site in a scalable way?

I have been working on my [first] startup for a month now, and while it's probably at least one more month away from an alpha release, I want to know how to deploy it the right way. The site will have an initially high load (network + CPU) for a new user, so I am thinking of having a separate server/queue for this initial process, so that it doesn't slow down the site for existing users.
Based on my research so far, I am currently leaning towards nginx + haproxy + unicorn/thin + memcached + mysql, and deploying on Linode. However, I have no prior experience in any of the above; hence I am hoping to tap the community's experience.
Does the above architecture seem reasonable? Any suggestions/articles/books that you would recommend?
Is Linode a good choice? Heroku/EY seem too expensive for me (at least until I have enough revenue), but am I missing some other, better option? MediaTemple?
Any good suggestions on the load balancing architecture? I am still reading up on this.
Is it better to have 2 separate Rails server instances on 2 separate Linodes, or to run 1 instance on a Linode of twice the capacity (in terms of RAM/storage/bandwidth)? How many Linodes should I start with?
Which Linux distribution should I choose? (Linode offers 8 different ones - http://www.linode.com/faq.cfm) Are there any relative advantages/disadvantages between them for a Rails site?
I apologize if any of my questions are stupid or contradictory; please attribute it to my inexperience.
Architecture
You're on the right track. I personally prefer Passenger over thin/unicorn (having run nginx to thin backends for a long while) just for the convenience, but your proposed setup is fairly standard. If you're on Ruby 1.8.7, I'd recommend that you consider REE + Passenger for framework memory savings, though.
Hosting & Load Balancing
Linode is fantastic, and I use them for just about everything I can, but you will need to be aware of RAM limits. Each Rails process uses a nontrivial amount of RAM, and you'll want to avoid getting the machine into swap. Plan on running enough Rails instances per machine so that your memory allocation is about 90% of the memory on the Linode. You'll likely want another Linode dedicated to your database, though you can start with them both on the same machine; just be prepared to split off MySQL as you grow. You can set up communications between Linodes in the same data center on private IPs, which don't count against your bandwidth quota.
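If you end up on the unicorn side of the unicorn/Passenger choice, that sizing advice translates roughly into a config like this (a minimal config/unicorn.rb sketch; the worker count, timeout and socket path are assumptions to tune against your Linode's RAM, not recommendations):

    # config/unicorn.rb (sketch)
    worker_processes 4                        # tune so total worker RSS stays around 90% of the box's RAM at most
    preload_app true                          # load the app once and share memory across workers via copy-on-write
    timeout 30
    listen '/tmp/unicorn.sock', backlog: 64   # nginx or haproxy proxies to this socket

The same process-count reasoning applies under Passenger; it just lives in PassengerMaxPoolSize in the web server config instead of a Ruby file.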
Your scaling strategy should be as horizontal as possible, so I'd recommend just getting a second Linode and adding it to your haproxy pool when you need more horsepower - Linode charges you $20 for 512 MB more RAM, or you can get a whole 'nother Linode (with its own CPU, RAM, HDD, and bandwidth quota) for that same $20, which seems a no-brainer! In Rails' case, an instance is an instance is an instance, so it really doesn't matter whether it's on the same VM or not, as long as the time to connect to your database machine and so on is more or less the same. You could be running 10 Linodes with 10 Rails processes apiece without much of an issue. Linode also offers IP failover, so that if your primary Linode (with haproxy) goes down, it can fail over automatically to a secondary Linode, which you would then have haproxy running on, ready to act in the same capacity as the first.
Distribution
Honestly, this is up to you! Many folks go with Ubuntu or Redhat (CentOS/Fedora) distros - I like CentOS myself - but it's really just about what you feel most comfortable with. If you don't have a favorite distro, I would recommend trying Ubuntu/CentOS, as they tend to be quite friendly to the beginner, and have extremely robust community support.
You will probably want to pick a 32-bit distro unless you have a compelling reason to pick a 64-bit distro; 64-bit executables require more RAM than their 32-bit counterparts, and since RAM is likely to be your most precious resource, it makes sense to save it where you can.

Proxy choices: mod_proxy_balancer, nginx + proxy balancer, haproxy?

We're running a Rails site at http://hansard.millbanksystems.com on a dedicated Accelerator. We currently have Apache set up with mod_proxy_balancer, proxying to four Mongrels running the application.
Some requests are rather slow and in order to prevent the situation where other requests get queued up behind them, we're considering options for proxying that will direct requests to an idle mongrel if there is one.
Options appear to include:
recompiling mod_proxy_balancer for Apache as described at http://labs.reevoo.com/
compiling nginx with the fair proxy balancer for Solaris
compiling haproxy for Open Solaris (although this may not work well with SMF)
Are these reasonable options? Have we missed anything obvious? We'd be very grateful for your advice.
Apache is a bit of a strange beast to use for your balancing. It's certainly capable but it's like using a tank to do the shopping.
Haproxy/Nginx are more specifically tailored for the job. You should get higher throughput and use fewer resources at the same time.
HAProxy offers a much richer set of features for load-balancing than mod_proxy_balancer, nginx, and pretty much any other software out there.
In particular for your situation, the log output is highly customisable so it should be much easier to identify when, where and why slow requests occur.
Also, there are a few different load distribution algorithms available, with nice automatic failover capabilities too.
37Signals have a post on Rails and HAProxy here (originally seen here).
If you want to avoid Apache, it is possible to deploy a Mongrel cluster with an alternative web server, such as nginx or lighttpd, and a load balancer of some variety, such as Pound or a hardware-based solution.
Pound (http://www.apsis.ch/pound/) worked well for me!
The only issue with haproxy and SMF is that you can't use its soft-restart feature to implement the 'refresh' action, unless you write a wrapper script. I wrote about that in a bit more detail here.
However, IME haproxy has been absolutely bomb-proof on Solaris, and I would recommend it highly. We ship anything from a few hundred GB to a couple of TB a day through a single haproxy instance on Solaris 10, and so far (touch wood) in 2+ years of operation we've not had any problems with it.
Pound is an HTTP load balancer that I've used successfully in the past. It includes a dynamic scaling feature that may help with your specific problem:
    DynScale (0|1): Enable or disable the dynamic rescaling code (default: 0).
    If enabled, Pound will periodically try to modify the back-end priorities
    in order to equalise the response times from the various back-ends.
    This value can be overridden for specific services.
Pound is small, well documented, and easy to configure.
I've used mod_proxy_balancer + mongrel_cluster successfully (small traffic website).
