Rails - run rake task or new thread from controller?

I have a class (/lib/updater.rb) that does a large update of the database (calling an external server, doing calculations, ...). Normally this task is started by the server's cron (a rake task in /lib/tasks/launch_updater.rake that starts updater.rb), but I would like to give the user the option to start it manually from the client too.
At the moment the user can click a button on the client and launch it like this:
# the controller
Thread.new {
  Updater.start
}
Is this a good solution, or is it better to launch it directly via a rake task?
# something like this from the controller
Rake::Task[params[:task]].reenable
Rake::Task[params[:task]].invoke
The task must be non-blocking (the user should be able to navigate the app normally without waiting for the task to finish).
Which is better and why?

Working a little more on my question, I found the following notes:
When you use a Thread, the work runs inside the same process as the app, so it competes with the app for the same CPU (and under MRI's GIL only one thread executes Ruby code at a time, even on a multi-core server). If you want to use a Thread, the task should not be "heavy", or you can run into CPU problems (slow app processing).
When you start a rake task from the terminal or from the server's cron, it runs as a separate process, so the OS can schedule it on a less loaded core. But if you start the task from inside the application, it runs in the app's process, so the CPU situation is the same as with a Thread.
The best solution for heavy tasks is a background job service: the job runs in a separate worker process, so it can use a different core than the app's without hurting the app's performance (see the sketch after this list):
Sidekiq
Delayed Job
...
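For illustration, a minimal sketch of the background-job route using Sidekiq (the worker class and controller action are invented names, not from the original post):

# app/workers/updater_worker.rb
class UpdaterWorker
  include Sidekiq::Worker

  def perform
    Updater.start # runs in the Sidekiq process, not in the web process
  end
end

# in the controller: enqueue the job and return immediately
def start_update
  UpdaterWorker.perform_async
  redirect_to root_path, notice: "Update started"
end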

Related

Real-time profiling of Rails in development environment

How can I do real-time profiling of Rails when it's in a development environment?
I've got a large task being run (it isn't done in the background on the development machine, but that wouldn't help anyway because I need to have the task run before I can interact with the web application further), and the task is taking so long I haven't yet seen it finish. So I assume something like ruby-prof probably won't work, because that relies upon there being a start, and a finish, and calculating how much time was taken up in the meantime by what methods.
Control-C profiling won't work either, because the Rails application doesn't respond to control-C, and doing kill -9 kills it, but doesn't produce a stacktrace.
I'd like something that can display how much time was spent in what methods (or in garbage collection) in the past few seconds or the past few minutes, preferably without killing the Rails process in the process.
I could try reducing the size of the task being run, but that feels like a hack.
I came across https://stackoverflow.com/questions/28370195/rails-real-time-profiler , but that's for applications in production, and doesn't have any answers.
I'm not able to run the application in production mode on my local machine.
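The thread never got an answer here, but one dependency-free workaround is to trap a signal and dump every thread's backtrace, then send that signal repeatedly: frames that recur across dumps are where the time is going, and the process stays alive. A sketch only; the signal choice is arbitrary, and your app server may already trap some signals:

# config/initializers/backtrace_dump.rb
Signal.trap("USR1") do
  Thread.list.each do |thread|
    $stderr.puts "--- thread #{thread.object_id} ---"
    $stderr.puts(thread.backtrace ? thread.backtrace.join("\n") : "(no backtrace)")
  end
end

# then, from another terminal (repeat a few times):
#   kill -USR1 <rails_pid>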

Running delayed_job inside the main web process

Can I run delayed_job or a similar scheduling framework inside the web server, e.g. Thin or Unicorn?
If yes, how do I start it? (A code example would be very cool!)
The reason is that I want to save money while my application is still in its build-up phase, and it is hosted on Heroku.
Officially
No, there is no supported way to run delayed_jobs asynchronously within the web framework. From the documentation on running jobs, it looks like the only supported way to run a job is to run a rake task or the delayed job script. Also, it seems conceptually wrong to bend a Rack server, which was designed to handle incoming client requests, to support pulling tasks off of some queue somewhere.
The Kludge
That said, I understand that saving money sometimes trumps being conceptually perfect. Take a look at these rake tasks. My kludge is to create a special endpoint in your Rails server that you hit periodically from some remote location. Inside this endpoint, instantiate a Delayed::Worker and call .start on it with the exit_on_complete option. This way, you won't need a new dyno or command.
Be warned, it's kind of a kludgy solution and it will tie up one of your Rails processes until all delayed jobs are complete. That means that unless you have other Rails processes, all incoming requests will block until this queue request is finished. Unicorn provides facilities to spawn worker processes. Whether or not this solution will work also depends on your jobs, how long they take to run, and your application's delay tolerance.
Edit
With the spawn gem, you can wrap your instantiation of the Delayed::Worker with a spawn block, which will cause your jobs to be run in a separate process. This means your rails process will be available to serve web requests immediately instead of blocking while delayed jobs are run. However, the spawn gem has some dependencies on ActiveRecord and I do not know what DB/ORM you are using.
Here is some example code, because it's becoming a bit hazy:
class JobsController < ApplicationController
  def run
    spawn do
      @options = {} # you'll have to get these from that rake file
      Delayed::Worker.new(@options.merge(exit_on_complete: true)).start
    end
  end
end
Here's a link to a similar question:
Is it feasible to run multiple processes on a Heroku dyno?
Bear in mind, as the post says, if you're only using one web dyno, it will be shut down if there's no traffic going to it.
In a similar vein, you might look into:
http://blog.codeship.io/2012/05/06/Unicorn-on-Heroku.html
To save on the need for multiple web dynos whilst you're building your app (although it's still subject to the above shutdown issue).
I would suggest you might look at running on a VPS directly, rather than Heroku (check out the railscast):
http://railscasts.com/episodes/337-capistrano-recipes
Once set up, it's pretty easy to deploy to. Heroku cuts out the devops part for you.
You can run it inside a separate Unicorn worker, so that it shares memory with the master process and gets restarted together with the app.
See https://gist.github.com/brauliobo/11298486
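In spirit (not the gist's exact code), the idea looks roughly like this; the worker count, preload setting, and thread-based job loop are assumptions:

# config/unicorn.rb
worker_processes 3
preload_app true

after_fork do |server, worker|
  # each forked process needs its own database connection
  ActiveRecord::Base.establish_connection

  # run a delayed_job loop inside the last Unicorn worker; it shares the
  # preloaded app with the master and is restarted together with it
  if worker.nr == server.worker_processes - 1
    Thread.new { Delayed::Worker.new.start }
  end
end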

Background tasks executing immediately and in parallel in Rails

Our Rails web app has to download and unpack archives of HTML pages from FTP on request, for users to view in the browser.
An archive can be quite big, so the user has to wait while it downloads and unpacks on the server.
I implemented a progress bar by calling fork/Process.detach in the user's request: the request finishes, but the download/unpack process keeps running in the background, and JavaScript in the browser polls the server for status until everything is ready, then redirects to the unpacked HTML pages.
As long as the user requests one archive, everything goes smoothly, but if he runs two or more requests at the same time (so that more forks are started), it seems only one of them completes and the rest expire/time out/get killed by Passenger(?). I suppose it's an issue with Passenger and forking.
I'm not sure it's possible to fix this, so I guess I need to switch to another solution. The solution needs to permit immediate and parallel processing of downloads, so that if the user requests multiple archives, he sees download/decompression progress on all of them at the same time.
I was thinking about running a background rake job immediately, but it seems very slow to start up (there are also a lot of cron rake tasks running every minute on our server). The reason I liked fork was that it was very fast to start. I know about delayed_job; we use it heavily for other tasks too. But can it start multiple processes at the same time, immediately, without queues?
Solved by keeping the fork and using a single DJ worker. This way I can have as many processes starting at the same time as needed, without trouble with Passenger and without modifying our product's gemset (which we are trying to avoid, since it caused bugs in the past).
I'm not sure if forking inside a DJ worker can cause any trouble, so I asked at:
running fork in delayed job
If I were free to modify the gemset, I'd probably use Resque as wrdevos suggested, or Sidekiq, or girl_friday (but that's less likely, because it depends on the server it runs on).
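As I read the poster's solution, the job itself forks and detaches a child per archive, so a single DJ worker can kick off many parallel downloads. A sketch with invented names (UnpackArchiveJob, ArchiveFetcher):

class UnpackArchiveJob < Struct.new(:url)
  def perform
    pid = fork do
      # download, unpack, and update the status record the progress-bar JS polls
      ArchiveFetcher.new(url).download_and_unpack
    end
    Process.detach(pid) # don't leave zombies behind
  end
end

# enqueue from the controller; the request returns immediately
Delayed::Job.enqueue UnpackArchiveJob.new(params[:url])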
Use Resque: https://github.com/defunkt/resque
More on background jobs and Resque here:
https://github.com/blog/542-introducing-resque
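A minimal Resque job for the same scenario might look like this (class and method names are placeholders):

# app/jobs/archive_job.rb
class ArchiveJob
  @queue = :archives

  def self.perform(url)
    # each job runs in its own forked Resque child process
    ArchiveFetcher.new(url).download_and_unpack
  end
end

# enqueue from the controller
Resque.enqueue(ArchiveJob, params[:url])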

Rails: Delayed_job for queuing but running jobs through cron

OK, so this is probably evil, however... here's the question! I want to run a pretty lightweight app in a shared environment (Site5). Ideally I would like to use delayed_job for the ease of queueing the mails (~200+ every so often). However, being a shared environment, they don't want background processes running all the time (fair enough).
So, my plan, such as it is, is to queue the mails using delayed job, and then every hour or something, spin up a cron job, send a few emails (10 or something small) and then kill the process. And repeat.
Q) Is there a rake jobs:works:1 equivalent task that'd be easy to set up? A pointer would be handy.
I'm quite open to "this is a terrible idea, don't even go there" being the answer... in which case I might look at another queuing strategy (or Heroku hire-fire, perhaps...).
You can get delayed job to process only a certain number of jobs by doing:
Delayed::Worker.new.work_off(10)
You could fire a script to do that from cron or use "rails runner":
rails runner -e production 'Delayed::Worker.new.work_off(10)'
I guess the main issue in deciding whether it's a good idea is working out what small value is actually high enough to make sure you process all your jobs in a reasonable time frame. You also have the overhead of firing up the Rails environment every time you want to process, or even check whether you should process, any jobs; that might cause problems in a shared environment if they are particularly strict about spikes of memory or CPU usage.
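For example, a crontab entry for that runner one-liner might look like this (the path, schedule, and log file are placeholders):

# crontab -e: work off up to 10 jobs at the top of every hour
0 * * * * cd /var/www/myapp && bundle exec rails runner -e production 'Delayed::Worker.new.work_off(10)' >> log/cron_jobs.log 2>&1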
Why not skip the 'workers' (which are just daemons that look for work, else sleep) and have your cron fire a custom rake task that does 10.times { MailerJob.first.perform }?
You'd just need to require your app in the line before that so it's loaded, of course.
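A sketch of such a task, using work_off from the answer above rather than calling perform directly (so failures go through delayed_job's own bookkeeping); the task name and count are made up:

# lib/tasks/jobs.rake
namespace :jobs do
  desc "Work off a bounded number of delayed jobs, then exit"
  task work_off: :environment do
    success, failure = Delayed::Worker.new.work_off(ENV.fetch("COUNT", 10).to_i)
    puts "#{success} jobs succeeded, #{failure} failed"
  end
end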

Can I start and stop delayed_job workers from within my Rails app?

I've got an app that could benefit from delayed_job and some background processing. The thing is, I don't really need/want delayed_job workers running all the time.
The app runs in a shared hosting environment and in multiple locations (for different users). Plus, the app doesn't get a large amount of usage.
Is there a way to start and stop processing jobs (either with the script or rake task) from my app only after certain actions/events?
You could call out to system:
system "cd #{Rails.root} && rake delayed_job:start RAILS_ENV=production"
You could just change delayed_job to check less often too. Instead of the 5 second default, set it to 15 minutes or something.
Yes, you can, but I'm not sure what the benefit will be. You say you don't want workers running all the time - what are your concerns? Memory usage? Database connections?
To keep the impact of delayed_job low on your system, I'd run only one worker, and configure it to sleep most of the time.
Delayed::Worker::sleep_delay = 60 * 5 # in your initializer.rb
A single worker will only wake up and check the db for new jobs every 5 minutes. Running this way keeps you from 'customizing' too much.
But if you really want to start a Delayed::Worker programmatically, look in that class for work_off, and implement your own script/run_jobs_and_exit script. It should probably look much like script/delayed_job does - 3 lines.
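A sketch of what such a script might look like (the body is my assumption, modeled on the stock script/delayed_job):

#!/usr/bin/env ruby
# script/run_jobs_and_exit
require File.expand_path('../config/environment', __dir__)

# drain whatever is currently queued, then exit instead of sleeping
Delayed::Worker.new.work_off(100)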
I found this because I was looking for a way to run some background jobs without spending all the money to run them all the time when they weren't needed. Someone made a hack using google app engine to run the background jobs:
http://viatropos.com/blog/how-to-run-background-jobs-on-heroku-for-free/
It's a little outdated though. There is an interesting comment in the thread:
"When I need to send an e-mail, copy a file, etc I basically add it to the queue. At the end of every request it checks if there is anything in the queue. If so then it uses the Heroku API to set the worker to 1. At the end of a worker getting a task done it checks to see if there is anything left in the queue. If not then it sets the workers back to 0. The end result is the background worker will just work for a few seconds here and there. I can do all the background processing that I need and the bill at the end of the month rarely ever reaches 1 hour total worth of work. Even if it does no problem, I'll pay $0.05 for background processing. :)"
If you go to stop a worker, you are given the PID. You can simply kill -9 PID if all else fails.
