We're thinking about using the rufus-scheduler gem on a Ruby on Rails project to do regular monitoring of a communication queue. Does anyone have experience using this gem on a Rails project? Does anyone have a strong preference for an alternative scheduler?
I find cron and script/runner are usually enough, but I don't think you'll really go wrong with rufus-scheduler.
Just make sure the scheduled tasks you're running are sufficiently abstracted so that if you decide to change your mind later on about how these tasks are run, it isn't a big problem.
I say experiment and run with it if you like it.
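One way to keep a scheduled task abstracted, sketched in plain Ruby: the work lives in an ordinary class that knows nothing about the scheduler, so cron with script/runner, a rake task, or rufus-scheduler can all call the same entry point. The class name QueueMonitor, the max_depth threshold, and the queue object (anything that responds to size) are hypothetical and not taken from the question.

```ruby
# Hypothetical task object: every name here is an illustrative assumption.
class QueueMonitor
  Result = Struct.new(:depth, :alert)

  # queue: any object that responds to #size (assumption for this sketch)
  # max_depth: backlog size above which an alert is wanted
  def initialize(queue, max_depth: 100)
    @queue = queue
    @max_depth = max_depth
  end

  # Single entry point; it contains no scheduling logic at all.
  def call
    depth = @queue.size
    Result.new(depth, depth > @max_depth)
  end
end

# Whatever scheduler is used only needs to make this one call.
result = QueueMonitor.new([1, 2, 3], max_depth: 2).call
puts "depth=#{result.depth} alert=#{result.alert}"
```

Switching schedulers later then only means changing the line that invokes call.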
Related
I have a Rails application that is importing data from various third parties. The jobs are taking a long time and I am looking into how I can use threads to speed this up. I know nothing about Java so apologies if this makes no sense.
No, JRuby is an alternate Ruby interpreter, so you cannot "switch" to it in the middle of running MRI (the standard Ruby interpreter, written in C).
You can create threads in MRI, but many people use a background job queue to handle this type of problem. If you really wanted to, you could also write a second application in JRuby that your first application made remote calls to.
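A minimal sketch of the thread approach in MRI, using only the standard library. Because MRI has a global interpreter lock, threads mostly help when the work is I/O-bound, such as waiting on third-party APIs. Here fetch_source is a hypothetical stand-in for a real import call, and the source names are made up.

```ruby
# Worker-pool sketch; `fetch_source` and the source names are hypothetical.
def fetch_source(name)
  sleep 0.01 # stands in for a slow network call
  "#{name}: imported"
end

def import_all(sources, workers: 4)
  jobs = Queue.new
  sources.each { |s| jobs << s }
  results = Queue.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        source = begin
          jobs.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break # no work left, so this worker exits
        end
        results << fetch_source(source)
      end
    end
  end
  threads.each(&:join)

  # Drain the results queue into a plain array (order is not guaranteed).
  Array.new(results.size) { results.pop }
end

puts import_all(%w[vendor_a vendor_b vendor_c]).sort
```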
Yes, you can, as long as you intend to run it in a separate process, such as a rake task.
In the Gemfile you'll have to handle both platforms where the gem sets differ, for example:
platform :ruby do
  gem "pg"
  gem "ruby-oci8", '>= 2.1.4'
end
platform :jruby do
  gem "activerecord-jdbcpostgresql-adapter"
end
You also need to decide what to do with Gemfile.lock. One option is to keep two lockfiles in git, Gemfile.lock.mri for the MRI Ruby gems and Gemfile.lock.jruby for the JRuby gems, and then symlink Gemfile.lock to the right one when running bundle.
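The symlink step could be scripted roughly as follows. This is only a sketch: the helper names lockfile_for and link_lockfile are hypothetical, while the two lockfile names are the ones suggested above. It relies on RUBY_ENGINE, which is "jruby" under JRuby and "ruby" under MRI.

```ruby
require "fileutils"

# Picks the lockfile name for an interpreter, following the
# Gemfile.lock.mri / Gemfile.lock.jruby naming described above.
def lockfile_for(engine = RUBY_ENGINE)
  engine == "jruby" ? "Gemfile.lock.jruby" : "Gemfile.lock.mri"
end

# Points Gemfile.lock at the matching file inside `dir`.
def link_lockfile(dir, engine = RUBY_ENGINE)
  target = lockfile_for(engine)
  Dir.chdir(dir) do
    FileUtils.ln_sf(target, "Gemfile.lock") # replaces an existing link
  end
  target
end

puts lockfile_for # the result depends on the running interpreter
```

Running link_lockfile(Dir.pwd) before bundle would then select the lockfile for the current interpreter.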
The other question is whether you should. The above approach is a bit finicky, and I generally use it only for building gems or for playing with porting an app to JRuby.
Spawning multiple processes and using background job queues is generally how things are done in MRI Ruby. It seems like overkill to introduce a whole new Ruby for the purpose of a single import task.
That said, I've had situations (on massive data imports done through Ruby) where proper multithreading would have saved me a LOT of hassle, but in the end I always fell back to partitioning the problem and running it in separate processes: every odd id in one rake task and every even id in another, for example. Bear in mind that you might need to rebuild tables in the database if you do this, or you may face very slippery regression issues. Definitely run a benchmark.
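The odd/even split generalizes to any number of workers by taking the id modulo the worker count. A small sketch, where partition_ids is a hypothetical helper and the id range is made up:

```ruby
# Splits ids into `workers` buckets by id modulo the worker count.
# With workers = 2 this is the even/odd split described above.
def partition_ids(ids, workers)
  buckets = Array.new(workers) { [] }
  ids.each { |id| buckets[id % workers] << id }
  buckets
end

# Each bucket would be handed to its own process (for example one rake
# task per bucket); here the buckets are only printed.
partition_ids((1..10).to_a, 2).each_with_index do |bucket, i|
  puts "worker #{i}: #{bucket.inspect}"
end
```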
I'm new to Ruby on Rails and I'm researching job schedulers for Rails, but I'm quite confused because there are so many of them, such as rufus, whenever, resque... Could you point me to some information, documents, or advice? Thank you so much!
Ruby Toolbox is a good resource to know about when you are considering among various options. It shows which gems are most popular for a particular type of task.
The two categories of tools that apply to your question are Scheduling and Background jobs.
Any of resque, delayed_job, rufus-scheduler, Sidekiq, whenever, and the other gems listed above will be able to help with the requirement. I would recommend delayed_job for a total beginner, as it is easy to set up and learn.
Best to check out the Railscasts episode on delayed_job to start with.
If you are interested in exploring the other options, it is likely there is a Railscasts episode for that.
Resque, delayed_job and Sidekiq are for background jobs through a job queue.
rufus and whenever are for scheduling.
Rufus runs inside the application and starts when the server initializes; 'whenever' runs outside it, through the system environment (cron), and is set up when you deploy the application or start it manually. So Rufus doesn't work unless the application is running, while whenever is something you need to keep an eye on separately.
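To illustrate what "runs inside the application" means, here is a toy in-process scheduler built on a plain thread. It is not how rufus-scheduler is implemented, only a sketch of the idea, and the every helper and the interval are hypothetical. Because the thread lives in the application process, the schedule stops when the application stops, whereas a crontab entry written by whenever keeps firing independently.

```ruby
# Toy in-process scheduler; `every` is a hypothetical helper, not a gem API.
def every(seconds, &job)
  Thread.new do
    loop do
      sleep seconds
      job.call
    end
  end
end

ticks = Queue.new
scheduler = every(0.01) { ticks << Time.now }

3.times { ticks.pop } # block until the job has run three times
scheduler.kill        # the schedule ends with its thread (or process)
scheduler.join
puts "job ran at least 3 times; alive=#{scheduler.alive?}"
```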
As far as I know, the Observer pattern in Ruby on Rails is not designed to be asynchronous, meaning that an Observer's execution will block the action being processed.
I know about delayed_job gem and I really like it but sometimes it looks a bit too heavy for certain purposes.
What about launching a new thread in the Observer's callback?
I spent some time trying to find the pros and cons of such an approach and failed.
So the question is: are there any serious drawbacks of Observer's threading?
Have you heard about Sidekiq? It's the new "hot" gem for background processing (vs resque or delayed_job).
From the FAQ:
sidekiq uses redis for storage and processes messages in a multi-threaded process.
It's just as easy to set up as resque but more efficient in terms of
raw processing speed. Your worker code does need to be thread-safe.
There's also a railscast about it here.
I would recommend using that compared to creating your own thread.
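For comparison, a hand-rolled thread in a callback might look like the sketch below. It uses a plain Ruby class rather than the real Rails Observer API, and the work is faked with a block, so every name is hypothetical. It shows one cost of this approach: an exception raised in the thread does not reach the caller unless something joins the thread. The work is also lost if the process exits before the thread finishes.

```ruby
# Stand-in for an observer callback; this is not the Rails Observer API.
class SignupObserver
  attr_reader :threads

  def initialize(&work)
    @work = work
    @threads = []
  end

  # Returns immediately; the work runs in a background thread.
  def after_create(record)
    @threads << Thread.new do
      Thread.current.report_on_exception = false # keep demo output quiet
      @work.call(record)
    end
    :returned
  end
end

observer = SignupObserver.new { |record| raise "delivery failed for #{record}" }
puts observer.after_create("user-1") # the caller never sees the failure

begin
  observer.threads.each(&:join) # the error only surfaces on join
rescue RuntimeError => e
  puts "surfaced on join: #{e.message}"
end
```

A job queue such as Sidekiq or delayed_job persists the job and handles retries, which is the part a bare thread leaves to you.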
DelayedJob and Sidekiq are both good options, and Rails now offers full ActiveJob support. Here are the official docs: https://guides.rubyonrails.org/v4.2/active_job_basics.html
DelayedJob has, I think, been around the longest; it was created by the crew that built Shopify.com. It creates a table in your Rails app's database and queues and works jobs off of that. For me it is the simplest option, as it has no dependencies beyond your Rails app.
Sidekiq offers a great alternative as well. It too is very simple and well-maintained, but instead of using your database, it uses a redis server to manage the jobs. That makes development a bit trickier, as you have to install redis and remember to start it when running your app. Not hard, just a bit of extra setup.
Here's a quick guide comparing the two - DelayedJob vs. Sidekiq, hope that helps.
I'm going to set up some functionality for my app, which is on Rails 3.2.3 and hosted on Heroku. The idea is to have a task, or job (or whatever you want to call it), run every day to make sure the user information from the external API is up to date with the user information in my db. I'm curious what the best way to set this up is. Should it be a cron job that runs a rake task?
Seems like there are quite a few ways to do this and I'm interested in the ways others are doing this. The only way I can think to do it is to run a rake task in a cron job, but would love to figure out what best practices are, or the most simple way to do it. Seems like there are a lot of ways to skin this cat... lots of different tools out there too.
If there was a pure rails way to do this, I think that would be better so I don't have to screw around with every system I place my app onto.
For a simple sync job that runs once a day, I believe having a cronjob would be sufficient and likely more stable in the long run.
Honestly, solutions such as Resque and Sidekiq are a bit overkill for your needs, in my opinion. You would still need a scheduler to send messages to these systems.
Check out the gem 'whenever' if you're looking at making the deployment and writing of crontabs easier: https://github.com/javan/whenever/
Railscasts regarding 'whenever': http://railscasts.com/episodes/164-cron-in-ruby
There are two options. They're better than the options you mentioned in your question:
Resque.
Sidekiq.
Try the latter. It is faster, lightweight, and based on multithreading, so it doesn't interfere with the system. You'll need to look into the scheduler for whichever gem you choose in order to run the job every day.
Hope this helps!
Use the Heroku Scheduler add-on to handle the scheduling itself. You can have it run a rake task, resque, or whatever.
Here are a few to choose from:
resque (with resque-scheduler, but you have to use redis with it)
rufus-scheduler (if you want something simple; resque-scheduler uses rufus-scheduler itself)
You may try delayed_job with a few tricks like this one. Not that great for scheduling but can use your application database.
What is the preferred way to create a background task for a Rails application? I've heard of Starling/Workling and the good ol' script/runner, but I am curious which is becoming the de facto way to manage this need?
Thanks!
Clarification: I like the idea of Rake in Background, but the problem is, I need something that is running constantly or every 10 hours. I am not going to have the luxury of sitting on a web request, it will need to be started by the server asynchronous to the activities occurring on my site.
Ryan Bates created three great screencasts that might really help you:
Rake in Background
Starling and Workling
Custom Daemon
He talks about the various pros and cons for using each one. This should help you get started.
It depends on your needs.
Try out delayed_job, which was created by Tobi, a Shopify founder: tobi's delayed_job (last updated 2011).
There are forks by DHH, delayed_job (last updated 2008), and by collectiveidea, delayed_job (last updated 20 days ago as of 6/28/2018).
I usually rely on cronjob scheduling as it gives the flexibility without having to write separate code to schedule it. Anything that can be executed from shell, can be scheduled! Be it any script (ruby / rake task / py / bash / any other you like), cronjob scheduling can be easily achieved.
If running on Windows, one can use Scheduled Tasks.
Hope this helps.
async_observer is the best. It doesn't do all kinds of dumb busy wait stuff or lose jobs on worker crashes like starling, no DB polling, etc... and it integrates into rails remarkably well.
I push tons of jobs through it and it pretty much doesn't care.
Most of the plugins that have been mentioned will do the job, but if all you need is a Rake task run on a set schedule, then there's really no need to start throwing more architecture at it.
Just add a cron job which executes
"cd /path/to/rails/app; RAILS_ENV=production rake run:my:task"
Why reinvent the wheel, when Unix like operating systems have been running tasks on a schedule for decades?
I have used the daemons plugin in the past.
While I don't know if it is becoming a standard, I have had great success with BackgroundRB. I have several workers, some are long running tasks triggered by a user action while others are started on a schedule.
Have a look at Taskr. It's basically like cron, but with a RESTful web interface. You can use it to schedule tasks to periodically connect to your Rails app and trigger arbitrary code (via the Taskr4rails plugin). It's meant to fit nicely into a system built around RESTful services, plus it can notify you if a task returns an error, fails to run, etc.