Ruby/Rails thread safety

I have been hacking with Ruby from time to time, but I haven't done anything big or multithreaded with it. I have heard that MRI only supports green threads and JRuby supports native threads via JVM. However, I stumble upon comments on blogs and discussion groups which say that "Rails is not thread-safe" or that Ruby itself is not thread safe. For example someone commented that there is a problem with the require statement. That sounds a bit fundamental.
I have seen a lot of Java apps which don't handle concurrency properly and I have nightmares about them from time to time :-) But at least you can write thread-safe applications in Java if you really know what you are doing (it's just not easy).
This all sounds quite alarming. Can someone elaborate: what exactly is the problem, and how does Rails manage to work at all if this is the case? Can I write multithreaded Ruby code which works correctly without race conditions and deadlocks? Is it portable between JRuby and MRI, or do I have to hack in JVM-specific code to take advantage of the JVM native threads properly?
EDIT:
I should have asked two questions, because people only seem to answer the rails threading stuff (which is nice in itself) and green threading vs. native threading. My concerns on core Ruby issues about thread safety haven't really been addressed. There seems to be at least an (unresolved?) issue with require in certain cases.

First and foremost, Ruby 1.9 (the most recent official release) now uses native (kernel) threads. Previous versions of Ruby used green threads. To answer your question succinctly, prior to 1.9, threads have not commonly been used in Ruby applications large or small precisely because they're not particularly safe or reliable.
This is not particularly alarming because prior to version 2.2 Rails made no attempt to be threadsafe, and so we typically handle asynchronous processing through the use of multiple processes, database record locking, and message queues like Starling. This is generally a pretty reliable way to scale a web application--at least as reliable as incorrectly multithreaded Java applications--and has the added advantage that it becomes easier to scale your application sideways to multiple processors and servers.
I have no idea whether the 'require' issue you've mentioned has been resolved as of 1.9, but I do humbly venture that if you're requiring libraries dynamically in new threads then you have more than one maintainability problem.
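For what it's worth, the cautious approach can be sketched like this: load everything on the main thread before any worker thread starts, so that no thread ever triggers a require. The method name encode_in_threads is made up for illustration, and json is just a stand-in for whatever library you depend on.

```ruby
# Load libraries once, on the main thread, before any threads exist.
require 'json'

# Hypothetical helper: each thread only uses the already-loaded library.
def encode_in_threads(items)
  threads = items.map do |item|
    Thread.new { JSON.generate({ "item" => item }) }
  end
  threads.map(&:value) # Thread#value joins and returns the block's result
end

puts encode_in_threads([1, 2, 3]).inspect
```

This does not prove anything about the require issue itself; it only sidesteps it by keeping require out of the threads.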
If you'd like to avoid threads entirely, Ruby 1.9 also supports fibers, which are scheduled cooperatively (control changes hands only where the code explicitly yields) and are, from what I gather, generally easier to write and maintain than threads. Performance numbers here.
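A minimal sketch of what a fiber looks like, assuming Ruby 1.9 or later. The counter_fiber helper is hypothetical; Fiber.new, Fiber.yield and Fiber#resume are the core API.

```ruby
# Fiber is part of core Ruby from 1.9 on; no require is needed.
# A fiber runs until it calls Fiber.yield, then hands control back
# to whoever called resume.
def counter_fiber(limit)
  Fiber.new do
    1.upto(limit) { |n| Fiber.yield n } # pause and hand n to the caller
    :done                               # value returned by the final resume
  end
end

fiber = counter_fiber(3)
4.times { p fiber.resume } # prints 1, 2, 3, then :done
```

Nothing runs in parallel here: the fiber only makes progress when the caller resumes it, which is why there is no locking.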

I really suggest you watch Jim Weirich's talk from RubyConf 2008 (it's very funny and informative :) ):
https://www.youtube.com/watch?v=fK-N_VxdW7g
This one is nice too:
http://rubyconf2008.confreaks.com/summer-of-code-rails-thread-safety.html

The normal solution for MRI is to run multiple Rails instances, each handling requests independently. Since MRI isn't multithreaded anyway, you can't run multiple Rails instances inside a single MRI process. This means you take a memory hit since Rails is loaded once per Ruby process.
Since JRuby supports native threads, you could always run several Rails instances in a single JVM. But with Rails being thread-safe, you can cut it down to one, which means lower memory usage and less JIT compilation.
Charles Nutter (JRuby) has a nice summary.

I think the previous posters covered the Rails cases pretty well, so I won't bother going into that sort of stuff.
It's certainly possible to write threaded Ruby applications. One problem with Ruby threads is that they are 'green', in that they are managed by the virtual machine. Currently, the default interpreter (MRI) has only one true system thread, which has to be shared by all of the threads the interpreter controls.
The downside to this is that if you have a computer with multiple processors or cores you can't have a thread in your application running on some other core. This is a pretty big deal for people running servers and high performance apps.
As for your interpreter-specific-code question: I don't think so. AFAIK you don't have to do anything special to take care of JRuby/JVM threads.
Also: this article over on Igvita takes a good look at the state of concurrency in Ruby.
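To illustrate the portability point, here is a small sketch of threaded code that uses only core classes and no JVM-specific calls. SafeCounter is a made-up class; Thread and Mutex are core Ruby (on 1.8 you would need require 'thread' first).

```ruby
# A counter whose read-modify-write step is protected by a Mutex.
class SafeCounter
  attr_reader :count

  def initialize
    @count = 0
    @lock = Mutex.new
  end

  # Only one thread at a time may run the block passed to synchronize.
  def increment
    @lock.synchronize { @count += 1 }
  end
end

counter = SafeCounter.new
threads = 10.times.map do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)
puts counter.count # 10 threads x 1000 increments
```

The same file should run unchanged on MRI and JRuby; the lock matters most on implementations where threads really do run in parallel.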

Related

The need to write code in another language for rails production app?

I have seen several references to developers having to write certain parts (background processing, performance-intensive tasks) of their Rails apps in a language other than Ruby (Java or C) once user load increased (it seems that Twitter is an example of this). It is possible that these were issues that appeared mostly during the Rails 1.0-2.0 era. However, I wanted to confirm with devs who run real production applications whether this is indeed still the case.
In my experience there is rarely, if ever, a need to write the server-side parts of your Rails application in a language other than Ruby.
The issues suffered by companies like Twitter, are well outside the realm of what most web applications will ever have to deal with, and so for the sake of this discussion, they should be ruled out. Twitter is an extreme outlier.
However, in most non-trivial Rails apps, there is usually a requirement for background job queuing and processing, although these jobs are also usually written in Ruby.
A good rule to adhere to is that you should never make a user wait, and return responses to HTTP requests immediately, without unnecessarily tying up (Ruby-backed) request processors. For example, if a large image is uploaded and it requires a number of intensive transformations - this should be done asynchronously and offline. This is also true of other web application frameworks and languages, regardless of their performance.
If you need more performance for background jobs, you can simply scale out the number of worker processes that are processing jobs. This is also true for serving Rails requests - you can for example add more Unicorn worker processes to process more requests.
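As a rough in-process stand-in for that idea, the sketch below feeds jobs through a queue to a configurable number of workers. Real deployments would use separate worker processes and a job queue server; here threads and the core Queue class play those roles, and process_jobs is a hypothetical helper.

```ruby
# Queue is thread-safe and part of core Ruby (on old Rubies: require 'thread').
def process_jobs(jobs, worker_count)
  queue = Queue.new
  results = Queue.new
  jobs.each { |job| queue << job }
  worker_count.times { queue << :stop } # one stop marker per worker

  workers = worker_count.times.map do
    Thread.new do
      # Queue#pop blocks until an item is available.
      while (job = queue.pop) != :stop
        results << job.call
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

# Stand-ins for slow work such as image transformations.
jobs = (1..5).map { |n| lambda { n * n } }
p process_jobs(jobs, 2).sort
```

Scaling out is then a matter of raising worker_count, which mirrors adding worker processes in a real setup.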
Whilst Ruby is indeed significantly slower than C or a warmed up JVM, it is usually "fast enough".
If you are using MRI Ruby, then you should be using Ruby 1.9.3, which is at least twice as fast as the 1.8.x series, and has slightly improved garbage collection. JRuby also performs well, and gets significantly more performant with every release.
Finally, modern Rails apps offload a great deal more processing to the client, and typically contain more JavaScript / CoffeeScript than they used to. They frequently use more partial AJAX updates to the View, which results in less time spent rendering by the Rails stack, and smaller, lighter database queries with significantly less server-side Ruby processing, than would have been seen, say, 4 years ago.
The biggest risks to your application are not performance-related, and will usually be rapidly changing requirements and the need to deliver on time and on budget. This is true of any technology choice, although Rails is known to help mitigate in these areas. I am writing this from the position of a developer that has used many frameworks and languages over the years, which include measurably faster languages and stacks, e.g. PHP, C# / ASP.NET MVC.

What things do Ruby or Rails not handle well? Are there any situations or cases that they're suboptimal for?

I'm trying to come up with things that Ruby (or Rails) either doesn't handle well, or things that are way too hard to do in Ruby.
So far I'm having a tough time, but I figured some people on here MUST know some things that Ruby or Rails don't handle too well.
Anyone?
Ruby is a language. Rails is a framework. Many of the things Rails isn't good at, such as anything not relating to a web framework, Ruby handles with ease.
The other question of what Ruby as a language is not good at is simple. Anything extremely performance intensive is probably better written in C. Ruby won't run natively on most smart phone devices so mobile apps are out. Ruby is not designed for embedded devices, so powering the next space shuttle launch is also a no go. Furthermore the lack of a maternal instinct makes Ruby a bad choice to watch young infants.
There is nothing, it's simply perfect. ;-)
Ok, some downsides:
Ruby has questionable parallelism and threading support. See for more details: http://www.igvita.com/2008/11/13/concurrency-is-a-myth-in-ruby/
Windows support isn't up to par since most Ruby developers simply don't care (like me)
The stuff you'll most commonly hear about scaling issues is a myth. Unless you're making a second twitter perhaps.
There's very little you flat-out couldn't do in Ruby, but there's a few things you wouldn't want to do, mainly involving highly numerical computation. For most of those you could easily write a binding to a C-based API (or some other more performant library.) Image processing, for example, is something that would be dog slow for any non-trivial example in pure Ruby, but you can use RMagick to do it, which is a binding to the much-faster ImageMagick library.
Just about any other use for Ruby is fair game. I've written GUI apps with it, a lot of system services and more one-off scripts than I could count.
Well, it's a framework, so it optimizes for the most common cases. If your app requires unordinary and bizarre things (eg huge performance requirements, needs to use non-Ruby libraries), then Rails might not be suitable.
It seems to me that whenever a company hits these cases (usually performance rather than functionality or integration with other systems) they have to write their own stuff - Google has Big Table, Facebook has their own webserver, etc.
If you're in this position, you're most likely rolling in money and spending some of it to rewrite your code isn't going to be an issue.
However, Rails is great for most normal apps! I don't think it has any gaps that might cause pitfalls in normal cases.

Ruby on Rails Drawbacks

Ruby on Rails is maybe the most praised web development framework in existence. There are tons of reasons for that, but every framework, even the best of its kind, has its drawbacks.
I'd like to know the most common problems you run into when developing Ruby on Rails applications and the issues you often struggle with.
Frequent version updates to the framework make it hard to keep your app up to date - upgrades can break in obscure places.
MonkeyPatching - can cause serious problems for anyone that has to maintain a large codebase over a long period of time.
Performance/Power - Easy to shoot yourself in the foot with memory-consuming database queries using Ruby iterators with ActiveRecord.
But I'll take it over Java or PHP any day
I'm not thrilled about Rails' use of global variables for everything. Model classes find the database connection through a global variable (ActiveRecord::Base.connection), there is the Rails class which is a global access point for things like the logger, the current environment, for caching, etc. ActionMailer makes global variables out of your mailers, and so on and so forth. Rails is built around the use of global variables, so that whatever you do, at any level of the application, you can always reach for a global variable.
This makes testing ugly. If Rails was built on Java, it would make testing really, really hard, but since it's Ruby it just gets ugly. Tests need to stub out a lot of global context in order to run in isolation, and that can easily make tests seem nonsensical. It's not uncommon to see five or ten lines of code that stubs out different global variables, followed by one or two lines of actual test. It's not that five or ten lines of setup for a test is a problem, but without reading the code under test you can't easily see what impact the global state will have, and how it is significant. This makes many tests unnecessarily ugly.
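A rough sketch of the alternative, for contrast: pass the collaborator in instead of reaching for a global. Greeter is a made-up class and not part of Rails; Logger and StringIO come from the standard library.

```ruby
require 'logger'
require 'stringio'

# The logger is handed in rather than looked up through a global access point.
class Greeter
  def initialize(logger)
    @logger = logger
  end

  def greet(name)
    @logger.info("greeting #{name}")
    "Hello, #{name}!"
  end
end

# In a test, hand in a logger that writes to memory; nothing global is stubbed.
log = StringIO.new
greeter = Greeter.new(Logger.new(log))
puts greeter.greet("Matz")
puts log.string.include?("greeting Matz")
```

With this shape the test setup shows exactly which state the code under test depends on, which is the part that gets lost when everything is stubbed globally.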
I find it a bit ironic that the Rails community is the most test savvy of any that I've been part of.
Having said that I wouldn't trade Rails for anything that is currently available. The speed at which you can get things done, and the huge number of plugins and gems that removes all the tedious work just blows me away every day.
Like any tech, there's a learning curve. But as a relatively new framework, DHH et al, have been able to "stand on the shoulders of giants" (predecessors) and have produced a great framework.
I've been very happy with choosing Rails as the framework for my commercial sw.
Disadvantages? Not as many libraries as older frameworks such as Java and Perl. -- But there are ways around that problem. Eg call those libraries from Rails or port them.
Performance is usually mentioned in the disadvantages category but cheaper hardware and improvements in later versions of rails have taken care of that. Same with "stability."
For multi-threaded applications, Ruby Threads are called green threads, which are not OS level threads. This can't provide true multi-threading.

What is the current state of affairs on threading, concurrency and forked processes, in Ruby on Rails?

Ruby on Rails does not do multithreaded request-responses very well, or at least, ActiveRecord doesn't.
The notion of only one request-response active at the same time can be a hassle when creating web applications which fork off a shell-command that takes long to finish.
What I'd like are some of your views on these kinds of setups? Is Rails maybe not a good fit for some applications?
Also, what is the current state of affairs in regard to concurrency in Ruby on Rails? What are the best practices? Are there workarounds to the shortcomings?
Rails currently doesn't handle concurrent requests within a single MRI (Matz's Ruby Interpreter) Ruby process. Each request is essentially wrapped with a giant mutex. A lot of work has gone into making the forthcoming Rails 2.2 thread-safe, but you're not going to get a lot of benefit from this when running under Ruby 1.8.x. I can't comment on whether Ruby 1.9 will be different because I'm not very familiar with it, but probably not I'd have thought.
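To picture what "a giant mutex" means in practice, here is a toy sketch. SerializedApp is a hypothetical stand-in, not Rails code: every request takes the same lock, so requests from different threads never overlap.

```ruby
# Toy dispatcher: one lock around the whole request.
class SerializedApp
  attr_reader :log

  def initialize
    @lock = Mutex.new
    @log = []
  end

  def handle_request(id)
    @lock.synchronize do
      @log << "start #{id}"
      sleep 0.01 # pretend to do work while holding the lock
      @log << "end #{id}"
    end
  end
end

app = SerializedApp.new
threads = (1..3).map { |id| Thread.new { app.handle_request(id) } }
threads.each(&:join)
puts app.log
```

Each "start" line is immediately followed by its matching "end" line, in whatever order the threads got the lock, so extra threads add no throughput.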
One area that does look very promising in this regard is running Rails using JRuby, because the JVM is generally acknowledged as being good at multi-threading. Arun Gupta from Sun Microsystems gave some interesting performance figures on this setup at RailsConf Europe recently.
Neverblock allows for non blocking functionality without modifying the way you write programs. It really is an exciting looking project, and was backported to work on Ruby 1.8.x (it relies on Ruby 1.9's fibers). It works with both PostgreSQL and MySQL to perform non-blocking queries. The benchmarks are crazy...
Matz's Ruby 1.8 uses green threads, and Matz's Ruby 1.9 will use native O/S threads. Other implementations of Ruby 1.8, such as JRuby and IronRuby, use native O/S threads. YARV, short for Yet Another Ruby VM, also uses native O/S threads but has a global interpreter lock to ensure that only one Ruby thread is executing at any given time.
If what you run at the shell is not necessary for the rendering of the page (e.g. you're only triggering maintenance tasks or something), you should start them as background processes. Check out starling and workling.
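A minimal sketch of starting a command in the background from Ruby 1.9 or later, assuming a Unix-like system. The run_in_background helper is made up; Process.spawn and Process.detach are core methods, and a child Ruby process stands in for the slow shell command.

```ruby
require 'rbconfig'

# Hypothetical helper: launch a command without waiting for it to finish.
def run_in_background(*command)
  pid = Process.spawn(*command)  # returns immediately with the child's pid
  watcher = Process.detach(pid)  # reaper thread, so no zombie is left behind
  [pid, watcher]
end

pid, watcher = run_in_background(RbConfig.ruby, "-e", "sleep 0.2")
puts "request can return now; task pid is #{pid}"
puts watcher.value.success? # only here so the example waits before exiting
```

In a request handler you would skip the final wait and simply return the response; a queue such as starling with workling adds retries and monitoring on top of this basic idea.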
If this doesn't apply to your situation, you'll have to make sure multiple instances of your app servers run. Traditionally people would start multiple instances of Mongrel. But now I'd say the easiest way to have a solid setup is by far using Phusion Passenger. It's an Apache module that lets you easily specify how many instances (min and max) of your app servers you want to have running. Passenger does the rest. And if I recall correctly, it doesn't do stupid round robin for dispatching requests. I think it's by availability.
Ruby 1.9 is adding lightweight Fibers:
http://www.infoq.com/news/2007/08/ruby-1-9-fibers

What can you NOT do in Rails that you can do in another framework?

I'd like to know situations in which I should consider using a framework other than Rails.
Two things. First, Ruby is a relatively young language, and you may run into brick walls when trying to do slightly more esoteric things (like connect to non mainstream or older types of datasources). It also has poor GC, and no kernel threads, both of which are very important for a high performance platform. The main codebase (MRI) is quite hacky (lots of clever obfuscating programmer tricks like macros) and there are parts that are poorly written (gc and thread scheduling leap to mind). Again, it is a very young platform that got very popular very fast.
Secondly, while ruby the language and rails the ideas/paradigm are both phenomenal, ruby and rails the platforms are not. There is a hell of a lot in both ruby and rails that is downright ugly, and deployment solutions are in the dark ages compared to what is considered normal for other platforms (php/asp/jsp).
Being accused of trolling here, so I will expound a bit. Due to the threading model, Rails cannot process requests concurrently unless you launch multiple full instances of your rails app. To do that you have two options, the relatively young and still under development passenger (mod_rails), or the tried and tested apache load balancer with multiple mongrel instances behind it.
Either way, the lack of the ability to just spawn workers means you will want 5-10 full instances of your application running, which incurs a very large overhead (can easily be 300-500 MB per app depending on your gems and how big your app is). Because of that, the infrastructure needed to serve rails is a hell of a lot more complicated than most other things.
Now, that being said, the situation has been continuously getting better (I mean, passenger is usable now, it wasn't the last time I had to deal with deploying a rails app). I would be very surprised if rails doesn't catch up in the next few years.
Also, rubinius/jruby are doing things the right way, and are moving along at a great pace. I wouldn't be surprised if MRI gets dropped in the next few years in favor of one of those implementations for mainstream rails work.
Ruby on Rails isn't trying to be an end-all be-all web development framework. If you're going to build an application that is predominantly built using CRUD operations, you want to use a lot of AJAX, and you have total control over the database, then Ruby on Rails is one of a few excellent options. If you're doing something else, then there is probably another framework that is a better match for your requirements.
edit: Matt amended his answer tastefully :) I've removed my own comments pointing out the things he's fixed.
Yes, Ruby definitely has some shortcomings. Green threads being a huge one. But as Matt has said, things are moving along in better directions.
The other posts are pretty much on the money. Decently simple CRUD apps are best suited for rails, though there are other frameworks you can try in Ruby that offer more flexibility.
Here's a great (and might I add objective) example of where not to use rails: Does the Rails ORM limit the ability to perform aggregations?
Wow, way to start a flamewar!
I'm going to start it off by saying that Rails will work for most apps. However, if you need to do a lot of async type work (like messaging between systems, such as getting a request, placing it in a queue and processing it in a different thread, or even on another machine), Rails is probably not your best choice. Ruby, at least at the current time, is not really strong on multithreaded code.
Let the insults fly!

Resources