In my Ruby/Rails app (using the default interpreter), I don't believe I've configured anything to make it use multiple threads. But how does this affect opening a Rails console to a production server that's handling regular traffic? Is Rails giving my console its own hardware thread to execute my commands? Does this mean I have to worry about thread safety when modifying mutable storage via the console, for example a file on disk?
When you start the Rails console, it loads a completely separate copy of your application from the one the server is running; the only thing they share is the database. So thread safety isn't an issue, but you still need to be mindful when accessing or mutating shared resources like database records or files.
Due to a limitation of a third-party library, I need to use a file with a static name. What happens in Rails if multiple users try to write to that file at the same time? An EACCES error?
Is there a way I could circumvent this?
At the Ruby level, what happens when multiple processes write to the file depends on how the library uses it: whether and how it locks the file before opening it, and what mode it opens the file in. It might just work; it might raise an error; or, most likely (if the library does nothing to handle this situation), multiple writers might silently interleave their writes in a way that corrupts the file, or the last writer might simply win.
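For reference, advisory file locking in Ruby looks like this. A minimal sketch, assuming you get a chance to wrap the library's write call (ThirdPartyLib.write_report is a made-up stand-in, and every writer has to take the lock for it to help):

File.open("/tmp/report.lock", File::RDWR | File::CREAT) do |lock|
  lock.flock(File::LOCK_EX)    # blocks until no other process holds the lock
  ThirdPartyLib.write_report   # hypothetical stand-in for the library call
end                            # closing the file releases the lock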
At the Rails level, it depends on how you run Rails. If you run a single, normally configured Rails instance on a given server, you won't have any problems, since Rails itself is single-threaded by default. If you run multiple Rails instances (presumably controlled by an application server like Passenger or Unicorn), you might have problems.
Assuming the library doesn't handle multiple writers for you, you can work around it in a couple of ways:
Run only one instance of your Rails app on each server (or Docker container or chrooted environment).
Fork the library and change it to include the process ID in the file name. That's what I'd do.
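The change in the forked library can be as small as something like this (the file names are hypothetical):

# Before: every process contends for the same name
path = "output.dat"
# After: each Rails process gets its own file
path = "output-#{Process.pid}.dat"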
I'm currently developing a small-scale web app using Rails, and have been looking into the best way to go about keeping backups of the database. I've decided to use SQLite3 for the database as it's small-scale enough that it makes sense (there will ideally be barely any traffic to the website), but as the application needs to be accessible on-demand 24/7, I want to make sure that any backup method doesn't interrupt things too much.
I've found a fair few old resources online that suggest just copying the database file, but that has obvious locking problems if the file is written to while the copy happens. SQLite's built-in .backup command seems to be what I'm after to avoid this, but I can't find a way to trigger it properly from within Rails. Using ActiveRecord's connection.execute('.backup') doesn't work because it's not valid SQL syntax, and while there are appropriate methods to call the backup from inside the SQLite3 gem, I'm not sure whether it's possible to get down to that object level from within ActiveRecord.
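To illustrate, this is the sort of thing I'm imagining, dropping from ActiveRecord down to the raw SQLite3::Database handle and using the gem's online backup API (untested sketch; I don't know if raw_connection is the sanctioned way in, and the destination path is a placeholder):

require 'sqlite3'

src = ActiveRecord::Base.connection.raw_connection  # the underlying SQLite3::Database
dst = SQLite3::Database.new('db/backup.sqlite3')    # placeholder destination file
backup = SQLite3::Backup.new(dst, 'main', src, 'main')
backup.step(-1)   # -1 copies all remaining pages in one pass
backup.finish
dst.close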
I could just set up a cron job/script that runs the sqlite3 command-line tool and executes the backup command, but I'm worried that running that concurrently with the Rails server could still present concurrency issues.
We use Unicorn to run 16 instances of a RoR app. We are implementing automated reporting, with the report results emailed and/or FTP'd. The reports can take up to a few minutes to generate, so we use a thread pool.
Since we have 16 instances, we don't want potentially 16 x #_threads connections into our database. Ideally we would have just one of the instances running the scheduled reports.
I can think of a couple of ways to do it:
1) Have one of the 16 instances somehow distinguishable from the others, and make it the only instance that can run the reports. I think this would require some coding against the Unicorn API, or possibly we could use a lockfile (sketched after this list) or a database column holding the instance number allowed to run the reports.
The disadvantage of this approach is that the instance will be included in the Unicorn load balancing, so users will be on the instance while the reports are being generated. However, if the thread pool is working properly, it shouldn't be an issue.
2) Have a separate Unicorn deployment of one instance that runs the reports and isn't included in the Apache/Unicorn pool. No one will interact with this instance via the UI; it just runs the reports.
The disadvantage of this approach is that I have to remember to update this instance when deploying and it's another instance to monitor for problems.
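To make option 1 concrete, here's the sort of lockfile election I have in mind, run once per worker at boot (sketch only; start_report_threadpool is a made-up method):

# The first worker to grab a non-blocking exclusive lock becomes the
# reporting instance; the other 15 just skip the thread pool.
REPORT_LOCK = File.open('tmp/reporter.lock', File::RDWR | File::CREAT)
if REPORT_LOCK.flock(File::LOCK_EX | File::LOCK_NB)
  start_report_threadpool   # winner keeps REPORT_LOCK open for its lifetime
end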
I'd prefer #1 for support simplicity, but I'm fine with #2 too.
Does anyone have experience in this?
I originally went with the approach of using a dedicated reporting instance that ran the reports in a thread pool of size 1 (and later just a plain single thread). It appeared to work, but when I put it under load testing I quickly found situations where activity in the main thread and the reporting thread would block or cause problems on fetches (like returning nil instead of an array).
I did some research, and Rails/ActiveRecord 2 (which we're currently on) isn't thread-safe.
So now I'm going to try using the whenever gem to run the reports in a rake process. I had this working for a while but decided against it because I didn't want to maintain an external cron (even though it's configured in the app, which is nice for git).
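For reference, the whenever side of it is just a few lines in config/schedule.rb (the schedule and task name here are placeholders):

every 1.day, :at => '2:30 am' do
  rake "reports:generate"   # hypothetical rake task that generates and sends the reports
end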
My application is twofold. First, it runs a thread that reads the Arduino's output to get the data and store it in my model; it does that forever, as the Arduino is sensing data forever. Second, the main web application analyses the data and draws fancy graphs.
When I start the rails server, I would like the reading thread to run once and only once.
My current implementation is a thread in the main controller of my application (as I need the model to store the data). But each time I refresh the page, a new thread is created, and that is not what I want, as concurrent access to the Arduino creates false readings.
I guess it is a pretty classic problem, but I do not see how to get that behaviour with Rails. I have been looking and googling for a week now but I am still stuck.
Thanks in advance.
Pierre
Agreed with @rovermicrover: you should consider separating the task of interacting with the Arduino from the web app.
To that end, I'd recommend that you create a rake task for that piece. Then you might consider managing the start/stop/restart of that rake task via foreman. It's a nice, clean way to go, and you can also do Capistrano integration pretty easily.
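A rough sketch of what that rake task might look like (the task name, device path, and Reading model are all made up):

# lib/tasks/arduino.rake
namespace :arduino do
  desc 'Read the Arduino output forever and store readings'
  task :read => :environment do       # :environment loads your Rails models
    serial = File.open('/dev/ttyUSB0', 'r')   # placeholder serial device
    while line = serial.gets
      Reading.create!(:value => line.to_f)    # hypothetical model
    end
  end
end

A one-line Procfile entry like arduino: bundle exec rake arduino:read then lets foreman start and supervise it alongside the web process.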
Wrong tool for the job. Kind of. You're not going to want the Rails app to monitor the Arduino output; Rails isn't really meant for something like that. You're better off having a separate, dedicated app read the Arduino output and then save the information to a database.
Arduino Output ---> Application Parsing Output ---> DB ---> Rails App
This way your web application can focus on the web, and not be torn between jobs.
An interesting way to do this would be to have the parsing application be a Ruby app, and use ActiveRecord outside of Rails in this instance. While I have never done it, people have used ActiveRecord in similar setups in pure Ruby apps. Here is an old example:
http://blog.aizatto.com/2007/05/21/activerecord-without-rails/
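A standalone script along those lines might look something like this (the connection settings, model, and arduino_io handle are all placeholders):

require 'rubygems'
require 'active_record'

# Point ActiveRecord at the same database the Rails app uses.
ActiveRecord::Base.establish_connection(
  :adapter  => 'sqlite3',
  :database => 'db/production.sqlite3'
)

class Reading < ActiveRecord::Base; end   # assumes a `readings` table exists

# Hypothetical parse loop: read a sample from the Arduino, persist it.
while line = arduino_io.gets              # arduino_io wraps your serial port
  Reading.create!(:value => line.to_f)
end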
This way, when you redeploy your Rails app, data can still be collected. It also creates a firewall: if your Rails app explodes or goes down, data collection will be unaffected.
I've got a Rails application in which a small number of actions require significant computation time. Rather than going through the complexity of managing these actions as background tasks, I've found that I can split the processing into multiple threads, and by using JRuby with a multicore server I can ensure that all threads complete in a reasonable time. (The customer has already expressed a strong interest in keeping this approach vs. running the tasks in the background.)
The problem is that writing to the Rails logger doesn't work within these threads: nothing shows up in the log file. I found a few references to this problem but no solutions. I wouldn't mind inserting puts calls in my code to help with debugging, but stdout seems to be eaten up by the glassfish gem app server.
Has anyone successfully done logging inside a Rails ruby thread without creating a new log each time?
I was scratching my head with the same problem. For me the answer was as follows:
Thread.new do
  begin
    ...   # the thread's actual work (elided)
  ensure
    # Rails buffers log lines and normally flushes them at the end of each
    # request; a hand-spawned thread never hits that point, so flush explicitly.
    Rails.logger.flush
  end
end
I understand your concerns about background tasks, but remember that spinning off threads in Rails can be a scary thing. The framework makes next to no provisions for multithreading, which means you have to treat all Rails objects as not being thread-safe. Even the database connection gets tricky.
As for the logger: the standard Ruby Logger class should be thread-safe. But even if Rails uses it, you have no control over what the Rails app is doing to it. For example, the benchmarking mechanism will "silence" the logger by switching log levels.
I would avoid using the Rails logger here. If you want to use the threads, create a new logger inside each thread that logs the messages for that operation. If you don't want a new log file per thread, you can instead create one thread-safe logging object at boot that all of the threads share.
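A minimal sketch of the shared-object approach (the path is just an example; Ruby's Logger synchronizes its writes, so one instance can be shared across threads):

require 'logger'

THREAD_LOG = Logger.new('log/threads.log')   # created once at boot, e.g. in an initializer

Thread.new do
  THREAD_LOG.info "report thread #{Thread.current.object_id} started"
  # ... do the work, logging through THREAD_LOG ...
end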
In your place, I'd probably have another look at the background job solutions. While DRb looks like a nightmare, "bj" seems nice and easy, although it required some work to get it running with JRuby. There's also the option of using a Java scheduler from JRuby; see http://www.jkraemer.net/2008/1/12/job-scheduling-with-jruby-and-rails