Real-time profiling of Rails in development environment - ruby-on-rails

How can I do real-time profiling of Rails when it's in a development environment?
I've got a large task being run (it isn't done in the background on the development machine, but that wouldn't help anyway, because I need the task to finish before I can interact with the web application further), and the task is taking so long that I haven't yet seen it finish. So I assume something like ruby-prof probably won't work, because it relies on there being a start and a finish, and on calculating how much time was taken up in between by which methods.
Control-C profiling won't work either, because the Rails application doesn't respond to Control-C, and kill -9 kills it without producing a stack trace.
I'd like something that can display how much time was spent in which methods (or in garbage collection) over the past few seconds or minutes, preferably without having to kill the Rails process.
I could try reducing the size of the task being run, but that feels like a hack.
I came across https://stackoverflow.com/questions/28370195/rails-real-time-profiler , but that's for applications in production, and doesn't have any answers.
I'm not able to run the application in production mode on my local machine.
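One approach that fits these constraints is to trap a spare signal and dump every thread's current backtrace, so you can see where the process is stuck without killing it. This is a minimal sketch (the initializer path, signal choice, and output target are assumptions, not anything from the linked question):
# Hypothetical file: config/initializers/stack_dump.rb
# Send `kill -TTIN <rails pid>` to print a backtrace for every live
# thread to stderr without stopping the process.
Signal.trap("TTIN") do
  Thread.list.each do |thread|
    $stderr.puts "--- thread #{thread.object_id} ---"
    $stderr.puts((thread.backtrace || ["<no backtrace>"]).join("\n"))
  end
end
Sending the signal every few seconds gives a crude sampling profile: whichever methods keep showing up in the dumps are where the time is going.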

Related

Rails - run rake task or new thread from controller?

I have a class (/lib/updater.rb) that does a large update of the database (calling an external server, calculations, ...). Normally this task is run by the server's cron (a rake task in /lib/tasks/launch_updater.rake that starts updater.rb), but I would like to give users the opportunity to start it manually from the client too.
At the moment, the user can click a button on the client and launch it like this:
# the controller
Thread.new {
  Updater.start
}
Is this a good solution, or is it better to launch it directly from a rake task?
# something like this from the controller
Rake::Task[params[:task]].reenable
Rake::Task[params[:task]].invoke
The task should be non-blocking (the user should be able to navigate the app normally without waiting for the task to finish).
Which is better and why?
After working on my question a little, I found the following notes:
When using a Thread, the work runs inside the same process as the app and shares its CPU (even if you have a multi-core server, it's the same process). If you want to use a Thread, the threaded task should not be "heavy", or you can run into CPU problems (slow app processing).
When you start a rake task from the terminal or from a server cron, it runs as its own process and the OS can schedule it on the least-loaded CPU. But if you start the task from inside the application, I think it again shares the app's CPU.
The better solution for heavy tasks is to use a delayed/background job service; that way the job runs in a separate worker process, on a different CPU than the app's, without hurting the app's performance (see the sketch after this list), for example:
Sidekiq
Delayed Job
...
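A minimal sketch of that approach, assuming a Rails app with Active Job and one of the backends above configured (the UpdaterJob class name is illustrative):
# Hypothetical file: app/jobs/updater_job.rb
class UpdaterJob < ApplicationJob
  queue_as :default
  def perform
    Updater.start
  end
end
# In the controller, instead of Thread.new { Updater.start }:
UpdaterJob.perform_later
The job then runs in the worker process (Sidekiq, Delayed Job, etc.) rather than inside a request-handling thread.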

running fork in delayed job

We use Delayed Job in our web application and we need multiple Delayed Job workers running in parallel, but we don't know how many will be needed.
The solution I'm currently trying is running one worker and calling fork/Process.detach inside the task that needs it.
I previously tried running fork directly in the Rails application, but it didn't work too well with Passenger.
This solution seems to work well. Could there be any caveats in production?
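For reference, a rough sketch of the fork/Process.detach pattern being described, assuming a plain Delayed Job payload class (the class and method names are illustrative):
# Hypothetical job payload that forks its heavy work off the worker process.
class HeavyTask
  def perform
    pid = fork do
      # The forked child must not share the parent's database connection.
      ActiveRecord::Base.establish_connection
      do_the_heavy_work
    end
    # Detach so the worker doesn't leave a zombie process behind.
    Process.detach(pid)
  end
  def do_the_heavy_work
    # ... long-running work ...
  end
end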
One issue that happened to me today, and that anyone trying this should take care of, was the following:
I noticed the worker was down, so I started it. What I didn't think about was that there were 70 jobs waiting in the queue. Since the processes are forked, they pretty much killed our server for around half an hour by all starting almost immediately and eating all the memory in the process. :]
So making sure there is something like god watching over the worker is important.
The worker also seems to die often, but I'm not sure yet whether that's connected to the forking.

Rails: Delayed_job for queuing but running jobs through cron

Ok, so this is probably evil, however.. here's the question! I want to run a pretty lightweight app on a shared environment (site5). Ideally I would like to use delayed_job for the ease of queueing the mails (~200+ every so often). However, being a shared environment, they don't want background processes running all the time (fair enough).
So my plan, such as it is, is to queue the mails using delayed_job, and then every hour or so spin up a cron job, send a few emails (10 or something small), and then kill the process. And repeat.
Q) Is there a rake jobs:works:1 equivalent task that'd be easy to set up? A pointer would be handy.
I'm quite open to "this is a terrible idea, don't even go there" being the answer, in which case I might look at another queueing strategy (or Heroku hire-fire perhaps).
You can get delayed job to process only a certain number of jobs by doing:
Delayed::Worker.new.work_off(10)
You could fire a script to do that from cron or use "rails runner":
rails runner -e production 'Delayed::Worker.new.work_off(10)'
I guess the main issue with whether it's a good idea or not is working out what small value is actually high enough to make sure you process all your jobs in a reasonable time frame. You've also got the overhead of firing up the Rails environment every time you want to process, or even check whether you should process, any jobs. That might cause problems in a shared environment if they are particularly strict about spikes of memory or CPU usage.
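A rough sketch of wiring that up as a rake task that cron can call, assuming delayed_job is installed (the task name is illustrative):
# Hypothetical file: lib/tasks/jobs_work_off.rake
namespace :jobs do
  desc "Work off up to 10 queued delayed jobs, then exit"
  task work_off: :environment do
    Delayed::Worker.new.work_off(10)
  end
end
Cron can then run something like cd /path/to/app && bundle exec rake jobs:work_off RAILS_ENV=production once an hour.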
Why not skip the 'workers' (which are just daemons that look for work and otherwise sleep) and have your cron fire a custom rake task that does 10.times { MailerJob.first.perform }?
You'd just need to require your app on the line before that so it's loaded, of course.

How can I monitor recurrent rake tasks run by heroku scheduler?

I just got last month's Heroku bill, and the scheduled rake tasks were a relatively heavy burden. We are pretty early in our development process, so we recently wrote some rake tasks just to get the job done, without much concern for their optimization.
Now we want to improve their performance and their usage of Heroku processing hours. We use New Relic to monitor the web app's performance, but apparently this type of rake task is ignored by default, and it's unclear how to override that.
Has anyone had a similar problem? How can I track the scheduled tasks in close to real time to monitor performance, optimize, and avoid surprise bills?
Whilst you can't really monitor rake tasks that well, there are a few little things you can do. One is logging: output the start and end times of tasks to the logs, and you can then see what's been happening duration-wise. If you couple this with something like the Papertrail add-on, you can do additional interrogation later on.
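A minimal sketch of that logging idea; the helper and task names are assumptions, and you'd wrap each scheduled task's body in the helper:
# Hypothetical file: lib/tasks/timed_tasks.rake
def with_task_timing(name)
  started = Time.now
  Rails.logger.info "[#{name}] started at #{started}"
  yield
ensure
  finished = Time.now
  Rails.logger.info "[#{name}] finished at #{finished} (#{(finished - started).round(1)}s)"
end
task some_scheduled_task: :environment do
  with_task_timing("some_scheduled_task") do
    # ... the task's actual work ...
  end
end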
As for running the jobs themselves, there are a couple of ways you can run background processes, depending on how they need to run:
If you need to run jobs on a schedule, there are a few options available. Firstly there's the Heroku Scheduler, which is pretty good but doesn't guarantee that executions will happen. Normally you would use this to kick off a rake task that brings up a one-off dyno for the duration of the task, so you need to ensure in development that these tasks are as efficient as possible.
Alternatively, if you're looking at jobs that need a little more control, you can use a clock process. Essentially this is a dyno running 24/7 that does nothing but kick off other jobs at preset intervals and times. This would normally be done using the clockwork gem. The downside of this approach is that you need to pay for the clock process all the time.
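A minimal clock process along those lines, assuming the clockwork gem; the job name, interval, and the HourlyReportJob payload it enqueues are all illustrative:
# Hypothetical file: clock.rb, run on its own dyno via the Procfile
# (e.g. clock: bundle exec clockwork clock.rb).
require 'clockwork'
require './config/boot'
require './config/environment'
module Clockwork
  every(1.hour, 'reports.hourly') do
    # Enqueue the actual work instead of doing it in the clock process.
    Delayed::Job.enqueue(HourlyReportJob.new)
  end
end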
A third approach that might work is delayed_job with its run_at option, allowing you to queue a job to be run in the future (and jobs can re-queue themselves). There are a few issues with this, in that a failure can kill the whole chain, and you need a full-time worker running to process them all.
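A sketch of that self-re-queueing pattern, assuming delayed_job (the class name is illustrative):
# Hypothetical job that re-queues itself an hour after each successful run.
class RecurringCleanup
  def perform
    # ... the actual work ...
    # Schedule the next run; if this line never executes (e.g. the work
    # above raises), the chain stops, which is the failure mode noted above.
    Delayed::Job.enqueue(RecurringCleanup.new, run_at: 1.hour.from_now)
  end
end
# Kick off the chain once:
Delayed::Job.enqueue(RecurringCleanup.new, run_at: 1.hour.from_now)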
Therefore, in order to minimize your bills, ensure that your rake tasks are as performant and reliable as possible, and then choose the scheduling option that suits you. If you're looking at schedules plus user-created events, delayed_job might be the best option. If you're looking at a few tasks running periodically, go with the Scheduler. If you're looking at running lots of time-critical jobs on a regular basis, go with clockwork.
Either way, you should be able to constrain a fair amount of processing into just one or two processes depending on your approach.
I know this question is almost 10 years old, but there is a new way!
You can now monitor your Heroku Scheduler jobs using One-off Dyno Metrics. This Heroku add-on gathers metrics for all detached one-off dynos running in your Heroku app. It was created to be an extension of Heroku's Application Metrics and works out of the box.
When you are running on Heroku Cedar there is a way to get a free setup for your workers. This is no answer to your monitoring question, but it might be interesting anyway: http://blog.nofail.de/2011/07/heroku-cedar-background-jobs-for-free/
You can force the New Relic agent to start in your rake tasks and report their performance data.
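A rough sketch of that, assuming the newrelic_rpm gem is in the bundle; the task name is illustrative, and manual_start/shutdown are the agent's documented calls for one-off processes:
# Hypothetical file: lib/tasks/heavy_report.rake
task heavy_report: :environment do
  require 'newrelic_rpm'
  NewRelic::Agent.manual_start
  begin
    # ... the task's actual work, now reported to New Relic ...
  ensure
    # Flush remaining data before the one-off process exits.
    NewRelic::Agent.shutdown
  end
end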
Not the answer to the specific question, but...
One method of reducing overhead is using the Unicorn server to get multiple workers running on one dyno. It depends on your setup, but most people who've taken the time to test it can comfortably run 3-4 worker processes concurrently. It's a huge boost for clearing queues or tasks. Just be careful not to max out the dyno's allocated memory.
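A minimal config/unicorn.rb along those lines; the worker count and timeout are illustrative and should be tuned against the dyno's memory:
# Hypothetical file: config/unicorn.rb
worker_processes Integer(ENV.fetch("WEB_CONCURRENCY", 3))
timeout 30
preload_app true
before_fork do |server, worker|
  ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)
end
after_fork do |server, worker|
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end
The Procfile's web entry would then be something like web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb.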

Rails keeps being rebooted in production Passenger

I'm running an application that kicks off a Rufus Scheduler process in an initializer. The application is running with Passenger in production and I've noticed a couple weird behaviors:
First, in order to restart the server and make sure the initializer gets run, you have to both touch tmp/restart.txt and load the app in a browser. At that point, the initializer fires. The horrible thing is that if you only do the touch, the processes scheduled by Rufus get reset and aren't rescheduled until you load the app in a browser.
This alone I can deal with. But this leads to the second problem: I'll notice that the scheduled process hasn't run, so I load a page and suddenly the log file is telling me that it's running the initializers as if I'd rebooted. So, at some point, Passenger is randomly rebooting as if I'd touched tmp/restart.txt and wiping out my scheduled processes.
I have an incredibly poor understanding of Passenger and Rails's integration, so I don't know whether this occasional rebooting is aberrant or all part of the architecture. Can anyone offer any wisdom on this situation?
What you describe is the way Passenger works. It spawns new instances of the application when traffic warrants them, and shuts them down after periods of inactivity to free resources.
You should read the Passenger documentation, particularly the Resource Control and Optimization section. There are settings which can prevent the application from being shut down by Passenger, if that is what you want.
Using the PassengerPoolIdleTime setting, you could keep at least one process running, but you'll almost certainly want Passenger to start up other instances of the app as necessary. This thread on the Rufus Scheduler Google Group mentions using lock files to prevent more than one process from starting the scheduler, that may be useful to you.
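A rough sketch of that lock-file idea, assuming rufus-scheduler 3.x; the initializer path, lock path, and schedule are illustrative:
# Hypothetical file: config/initializers/scheduler.rb
require 'rufus-scheduler'
lock = File.open(Rails.root.join('tmp', 'rufus.lock'), File::RDWR | File::CREAT, 0644)
# Only the first application process to grab the lock starts the scheduler;
# the others skip it, so Passenger spawning extra instances won't
# double-schedule the jobs.
if lock.flock(File::LOCK_EX | File::LOCK_NB)
  scheduler = Rufus::Scheduler.new
  scheduler.every '10m' do
    # ... recurring work ...
  end
end
When the process holding the lock shuts down, the lock is released, and the next process Passenger spawns can acquire it when it runs this initializer.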

Resources