Recently I upgraded our EC2 instances to c3.large in order to run more Unicorn workers to handle increased traffic to our site (6 workers per machine, 2 EC2 instances).
When I did this, I started seeing duplicate records being created! It looks like two Unicorn workers may somehow be attempting to process the same request?
In my nginx logs I see ONE [POST] request to a particular Rails controller, yet two records are being created in the database. Obviously I'm hitting some kind of race condition, but I have no idea how to debug it. It doesn't happen all the time, but when it does it's frustrating. I also noticed that the two database records are created within 5 seconds of each other. (And yes, I disable the submit button when the user clicks it.)
Unicorn 4.8.2
Ruby 1.9.3
Rails 3.2.14
Any help would be great! Thanks!
Although it may not fix your problem, upgrade to Ruby 2.0.0. There are blog posts suggesting that it is a good idea to upgrade to Ruby 2.0.0 if you are using Unicorn.
For example: https://www.digitalocean.com/community/articles/how-to-optimize-unicorn-workers-in-a-ruby-on-rails-app
Specifically:
If you are using Ruby 1.9, you should seriously consider switching to Ruby 2.0. To understand why, we need to understand a little bit about forking.

Forking and Copy-on-Write (CoW)

When a child process is forked, it is an exact copy of the parent process. However, the physical memory does not actually need to be copied: since they are exact copies, both child and parent processes can share the same physical memory. Only when a write is made is the affected memory copied for the child process.

So how does this relate to Ruby 1.9/2.0 and Unicorn?

Recall that Unicorn uses forking. In theory, the operating system would be able to take advantage of CoW. Unfortunately, Ruby 1.9 does not make this possible. More accurately, the garbage collection implementation of Ruby 1.9 does not make this possible. An extremely simplified version is this: when the garbage collector of Ruby 1.9 kicks in, a write will have been made, thus rendering CoW useless. Without going into too much detail, it suffices to say that the garbage collector of Ruby 2.0 fixes this, and we can now exploit CoW.
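To make the fork/CoW behaviour described above concrete, here is a minimal Ruby sketch (my own illustration, not from the article): the child starts as a copy of the parent and shares its memory pages until it writes to them.

    # Minimal sketch of fork semantics; CoW itself happens at the OS level.
    # What we can observe from Ruby is that the child's writes don't affect the parent.
    data = "app code loaded by the Unicorn master"

    pid = Process.fork do
      puts "child sees:  #{data}"      # reading shares the parent's memory pages
      data << " (modified in child)"   # the first write makes the OS copy the page
      puts "child now:   #{data}"
    end

    Process.wait(pid)
    puts "parent still: #{data}"       # the parent's copy is unchanged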
Our Rails 4 app is running in a thin -s 16 ... multi-process server with Apache as the frontend and its reverse proxy handling the internal requests. Everything works fine and the performance is OK for our number of users.
Because everything works just out of the box, I never really bothered to learn how Thin actually works. I did stumble upon all the Fibers and EventMachine goodness recently, though, and read up a lot on it.
Thin uses EventMachine for handling Rack requests, so by design it should be able to handle multiple requests in parallel with one Ruby process. Unfortunately, while the Thin documentation is good enough to get the server running in a few minutes, it is rather quiet about the internal workings.
Am I correct in assuming that all talk about "concurrency" is moot as soon as I run Thin with Rails? ActiveRecord, the DB drivers, the template processing, etc. are likely not EM enabled, not using Fibers etc., hence will block the single process anyways, while at the same time using most of the server-side processing time in DB-heavy applications.
All my research seems to lead to this conclusion; unfortunately the Thin website/docs say nothing about it. Frankly I am baffled how the word "concurrent" is even entering the picture here...
Can someone clear that up for me? Am I missing some fundamental piece of information, or is the "concurrency" touted on http://code.macournoyer.com/thin/ meant in regards to manually crafted Rack servers which don't use Rails or other "heavy" middlewares?
Thanks!
You are right: when used with Rails it will not be concurrent. Any call to the DB driver / file system will block the entire Ruby interpreter. This may partially change in Rails 5, but for now you just need to run a lot of worker processes.
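A rough illustration of why (my own sketch, not from the Thin docs): anything that blocks inside the EventMachine reactor stops all other work until it returns, which is effectively what a synchronous ActiveRecord call or file read does under Thin.

    require 'eventmachine'

    EM.run do
      # Something else the reactor would like to keep doing.
      EM.add_periodic_timer(0.5) { puts "tick #{Time.now.strftime('%H:%M:%S')}" }

      EM.add_timer(1) do
        sleep 3                          # stands in for a synchronous DB query or file read
        puts "blocking work finished"    # note: no ticks were printed while we slept
      end

      EM.add_timer(5) { EM.stop }
    end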
If we say Node.js is single threaded and therefore there is just one thread that handles all the requests, what is Rails?
As I understand, Node.js is both the application and the server, but I am lost on what Rails would be? How does Rails handle requests in terms of threads/processes?
Rails can be single-threaded, it can be multi-threaded, it can be multi-process (where each process is single-threaded), or it can be multi-process where each process is multi-threaded.
It really all depends upon the app server you're using, and it kind of depends upon which Ruby implementation you're using. MRI Ruby supports native threads as of 1.9, but it still maintains what's known as a global interpreter lock. The GIL prevents the Ruby interpreter from running in multiple threads at a time. In most cases that's not really a big deal though, because the thing threads are helping with the most is waiting for I/O. If you're using either JRuby or Rubinius, they can actually run Ruby code in multiple threads at a time.
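To make that concrete, here is a small sketch (my own, using placeholder URLs): even under MRI's GIL, threads can overlap the time spent waiting on I/O, so several slow HTTP calls finish in roughly the time of one.

    require 'net/http'
    require 'uri'

    # Placeholder URLs; swap in real endpoints to try it.
    urls = %w[http://example.com/a http://example.com/b http://example.com/c]

    threads = urls.map do |url|
      # Each thread releases the GIL while it is waiting on the network.
      Thread.new { Net::HTTP.get_response(URI(url)).code }
    end

    puts threads.map(&:value).inspect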
Check out the different app servers and what they offer in terms of concurrency features. Unicorn is a common one for deploying multi-process/single-threaded applications. Puma is a newer app server that's capable of running multi-threaded applications, and I believe they're either adding (or maybe have added by now, I'm not sure) the ability to run multi-process as well. Passenger seems to be able to work in every model I've listed above.
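For example, Puma can combine both models in its config file. This is only a sketch with illustrative numbers (the file path and values are assumptions; tune them for your own app):

    # config/puma.rb -- illustrative values only
    workers 2          # multi-process: fork 2 worker processes
    threads 1, 16      # multi-threaded: each worker runs between 1 and 16 threads
    preload_app!       # load the app before forking so workers can share memory (CoW)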
I hope this helps a little. It should at least give you some things to Google for to find more information.
Running a pretty sizable rails app, we've recently gotten around to upgrading it to rails 3.
Our stack is ruby-1.9.3p484, rails 3.2.16 and passenger 4.0.23 running on top of apache.
After throwing some traffic at a couple of our machines, we started noticing a few really strange errors coming down.
Things like random methods not being defined on objects that would obviously have them, instance variables being nil inside AR associations, and objects just being randomly replaced with 'false'. Just all around strange behavior.
Inspecting apache's logs gave us another bit of info, namely that as these errors were coming in, more often than not, their respective processes would keel over as well, on random bits of the app.
Sometimes it would be a Ruby node getting passed in as null, other times some random string overflowing; just random stuff getting mangled.
None of this happened during testing, so the only 'reliable' way of reproducing this thus far has been to just throw traffic at the respective machines and see when / if they start exhibiting this behavior.
Having gone through all of this, here's a list of things we've ruled out up to now:
Passenger's out-of-band garbage collection
rails 3 itself ( apparently we'd been getting these before as well, but they were far enough apart to not set off any alarms )
serialization / shoving things in and out of memcached
libxml - there were some reports of version 2.5.0 causing memory corruption; upgrading to 2.7.0 didn't really make a difference
turning off prelinking ( this can cause memory corruption, per https://www.ruby-forum.com/topic/205897 )
Returning the GC settings to stock seems to have alleviated the problem, but we don't really have anything conclusive in that regard. It would seem though that more collections result in a lower occurrence rate for the issue.
Any thoughts on what might be causing this or we could use to help us pinpoint the issue?
I've had 1.9.3-p484 segfault at_exit on test runs as well but I haven't looked into it yet. In my case it seems to be triggered by certain dependencies for the test suite.
We've also had problems with another project that ended up abandoning its port to Rails 3 and sticking with 2.3 ):
Have you tried running the app outside of Apache / Passenger?
I work with several Rails apps, some on Rails 3.2/Ruby 2.0, and some on Rails 2.3/Ruby 1.8.7.
What they have in common is that, as they've grown and added more dependencies/gems, they take longer and longer to start. Development, Test, Production, console, it doesn't matter; some take 60+ seconds.
What is the preferred way to, first, profile what is causing load times to be so slow, and second, improve the load times?
There are a few things that can cause this.
Too many GC passes and general VM shortcomings - See this answer for a comprehensive explanation. Ruby <2.0 has some really slow bits that can dramatically increase load times; compiling Ruby with the Falcon or railsexpress patches can help massively. All versions of MRI Ruby use default GC settings that are inappropriate for Rails apps.
Many legacy gems that have to be iterated over in order to load files. If you're using bundler, try bundle clean. If you're using RVM, you could try creating a fresh gemset.
As far as profiling goes, you can use ruby-prof to profile what happens when you boot your app. You can wrap config/environment.rb in a ruby-prof block, then use that to generate profile reports of a boot cycle with something like rails r ''. This can help you track down where you're spending the bulk of your time in boot. You can profile individual sections, too, like the bundler setup in boot.rb, or the #initialize! call in environment.rb.
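As a sketch (my own, with a placeholder application name), wrapping the boot in ruby-prof might look something like this; remove it again once you have the report:

    # config/environment.rb -- illustrative; MyApp stands in for your application's module
    require File.expand_path('../application', __FILE__)
    require 'ruby-prof'

    RubyProf.start
    MyApp::Application.initialize!   # the expensive part we want to profile
    result = RubyProf.stop

    printer = RubyProf::FlatPrinter.new(result)
    printer.print(File.open('tmp/boot_profile.txt', 'w'), min_percent: 1)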
Something you might not be considering is DNS timeouts. If your app is performing DNS lookups on boot, which it is unable to resolve, these can block the process for $timeout seconds (which might be as high as 30 in some cases!). You might audit the app for those, as well.
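If you suspect that, a quick check (a sketch; the hostname is hypothetical) is to time the lookups your app performs at boot:

    require 'resolv'
    require 'benchmark'

    host = 'some-internal-service.example.com'   # hypothetical host your app resolves at boot
    seconds = Benchmark.realtime do
      begin
        Resolv.getaddress(host)
      rescue Resolv::ResolvError
        puts "lookup failed for #{host}"
      end
    end
    puts "DNS lookup took #{seconds.round(2)}s"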
Ryan has a good tutorial about speeding up tests, console, rake tasks: http://railscasts.com/episodes/412-fast-rails-commands?view=asciicast
I have checked every method there and found Spring to be the best. Just run the tasks like:
$ spring rspec
The time for your first run with Spring will be the same as before, but the second and later runs will be much faster.
Also, from my experience, there will be times when you need to stop the Spring server and restart it because of a weird error, but that is rare.
For ruby 2 apps, try zeus - https://github.com/burke/zeus
1.8 apps seem to boot much faster than 1.9, spork might help? http://railscasts.com/episodes/285-spork
I have done a few projects using Ruby on Rails. I am going to use JRuby on Rails and host it on GAE. In that case, what do I need to know while developing JRuby apps? I have read that:
JRuby has the same syntax
I can access Java libraries
JRuby does not have access to some gems/plugins
A JRuby app would take some time to load for the first time, so I have to keep it alive by sending a request every 5 mins or so
I cannot use ActiveRecord and instead must use DataMapper
Please correct me if I am wrong about any of the statements I have made, and is there anything else that I must know? Do I need to start reading about JRuby from scratch, or can I go about developing Ruby apps as usual?
I use JRuby everyday.
True:
JRuby has the same syntax
JRuby does not have access to some gems/plugins
I can access Java libraries
Some gems/plugins have jruby-specific versions, some don't work at all. In general, I have found few problems and as the libraries and platforms have matured a lot of the problems have gone away (JRuby has become a lot better).
You can access Java (see the sketch at the end of this answer), but in general why would you want to?
False:
JRuby app would take some time to load for the first time, so I have to keep it alive by sending request every 5 mins or so
I cannot use ActiveRecord and instead I must DataMapper
Although I guess it is possible to imagine a server setup where the initial startup/warmup cost of the JVM means you need to ping the server, there is nothing inherent in JRuby that makes this true. If you need to keep the server alive, you should look at your deployment environment. Something similar happens in shared hosting with Passenger, where an app can be unloaded from memory after a period of inactivity.
Also, we use ActiveRecord with no problems at all.
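Regarding the "access Java libraries" point, this is what it looks like in practice (a minimal sketch of JRuby's Java integration):

    require 'java'

    list = java.util.ArrayList.new    # instantiate a Java class directly
    list.add('from Ruby')
    list.add('into Java')

    puts list.size                                        # => 2
    puts java.lang.System.get_property('java.version')    # Java methods get snake_case aliases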
afaik, rails 3 is 100% compatible with jruby, so there should be no problem on that path.
as with every new platform, you should make yourself comfortable with it by playing around with jruby. i recommend using RVM to do that.
as far as your questions go:
JRuby is just another runtime like MRI or Rubinius
since JRuby runs in the JVM, using Java is very easy, but you can also use RJB from MRI
some gems are not compatible when they use native C extensions that do not run on JRuby
the JVM and your application container need startup time and some time to load your app, but that is all; there is no need for a keep-alive, that is wrong
you can use whatever you want, most gems are updated to be compatible with JRuby
@TobyHede mostly covered the issues you thought you might have, so I'll leave it at that.
As for other things to have in mind, it's simply a different interpreter and funny discrepancies will crop up that will take some adaptation.
some methods are implemented differently; for example sleep 10.seconds will throw an exception (you have to sleep 10.seconds.to_i), and I remember getting a NoMethodError on the Symbol class when switching from MRI to JRuby (I don't remember which method wasn't implemented) - just keep in mind that slight variations will be there
you will experience hangs and exceptions in gems that otherwise worked for you (pry for example when listing more than one page)
some gems may work differently, pry (again) will exit if you press ctrl+c for example, pretty annoying
slightly slower load times of everything and no zeus
you'll get occasional Java exception stack traces with no indication of which line of Ruby code triggered them
Timeout.timeout often will not work as expected when it's wrapped around net code and the stars align badly (this has mostly been fixed in JRuby core, but it seems to still be an issue with gems that do their own net code in pure Java)
hidden problems with thread-safety in third-party code (see "How do you choose gems for a high throughput multithreaded Rails app?") - stay away from EventMachine for example
threads will be awesome (due to nativeness and no GIL) and fibers will suck (due to no coroutine support in the JVM they're ordinary threads), which is why you often won't get a performance boost with Celluloid compared to MRI
you used to run your Rails app with MRI Ruby as processes in an OS; you knew how to track their PIDs, bloat and run times, kill them, monitor them, etc. This part is not evident when you switch to JRuby, because everything turns into threads in a single process. The Java world has very good tools to handle these issues, but it's something you'll have to learn
killall -9 ruby doesn't do the trick with JRuby when your console hangs (which it does more often than before); you have to ps -ef and then track down the proper processes without killing your NetBeans etc. (minor, but annoying)
due to my last point, knowing Java and the JVM will help you get out of tight spots in certain situations (depending on what you intend to do this may be something you actually really need), choice of deployment server will increase or decrease this need (torquebox for example is a bit notorious for this, other deployment options might be simpler, see http://thenerdings.blogspot.com/2012/09/pulling-plug-on-torquebox-and-jruby-for.html)
...
Also, see what jruby team says about differences, https://github.com/jruby/jruby/wiki/DifferencesBetweenMriAndJruby
But yeah, otherwise its "just the same as MRI Ruby" :)