Rails ActiveJob for infinite loop - ruby-on-rails

The application I am developing needs to have an infinite loop to handle business logic that is completely separate from user input (they would only view it). Since this is a break from tradition MVC, I thought an Active Job would be a good place to put it.
The logic in the loop would poll microcontrollers on the same network. I have little to no ability to change the code on these so I have to adapt to the unique protocol they are using. When the microcontrollers respond, the server will need to do some calculations and store those in the database.
The job would be launched when the server application is. Only one instance of it should exist so I don't want to put it in my models or controllers. I have tried launching it from a couple of places in the config folder but it would cause an initialized constant NameError.
What would be the correct way to launch a job when the server is initialized? Is there another approach you would take?
I am a webdev newb using Ruby 2.2.0 and Rails 4.2.0.

How about using scheduled jobs? For example you could schedule it so that every minute a job is scheduled to run that will poll for whatever you need to be polling for?
Have a look at this Gem:
https://github.com/javan/whenever

Related

Interval based API access and processing different DSL

Background
I'm currently working on a small Rails 5 project that needs to access and process an external API. There is a ruby wrapper gem available for the API, so accessing the data is not a problem.
Problem description
There are two parts of the equation that I am currently missing, and hoping someone out there can help me with.
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
1: I need to call the API, via Rails, every 15 minutes. How can I realize this? I was looking towards Active Job for this, but my research kind of stalled after getting no useful results.
The first problem you can solve using recurring tasks. The main idea is to run the process that will perform some operations every x minutes (or days or whatever fits your problem.
There are several tools that you can use. One of them is built-in the unix system and it is cron. You can read about it in system's manual. You can easily manage it using whenever gem. The main disadvantage is that you need an access to the system's cron which may be non-trivial on non-bare machines (for example Platform as a Service hosts such as Heroku).
You should also take a look at clockwork which does not rely on the system's cron. It uses approach where you have a separate process running all time and it keeps an eye on defined tasks.
In the second approach (having a separate process) you need to remember that time-consuming instructions may "lock" the process and postpone another tasks. In this case, you may want to use background processing such as sidekiq or delayed_job. The idea is to use one process for scheduling tasks at certain time and another process to process those tasks as soon as they appear in the queue.
2: The external API has different domain models and a different domain-specific language than my application. How can I map the different models without changes in Active Record?
You need to create a client that will consume the API and map its responses into models that you have in your application. This way, you don't need to make your model's scheme dependent on the API scheme. Take a look at resource_kit gem - this is a sample solution that uses this approach.
HI hdauven,
processing the API every 15 minutes will affect your server performance,so done it by using sidekiq, it is a background job and use sidetiq it will help you to perform the task every 15 min automatically
You are accessing API, Then why are you worrying about different domain.

Rails best practice: background process/thread?

I'm coming from a PHP environment (at least in terms of web dev) and into the beautiful world of Ruby, so I may have some dumb questions. I imagine there are some fundamentally different options available when not using PHP.
In PHP, we use memcache to store alerts we want to display in a bar along the top of the page. When something happens that generates an alert (such as a new blog post being made), a cron script that runs once every 5 minutes or so puts that information into memcache.
Now when a user visits the site, we look in memcache to find any alerts that they haven't already dismissed and we display them.
What I'm guessing I can do differently in Rails, is to by-pass the need for a cron script, and also the need to look in memcache on every request, by using a Singleton and a polling process running in a separate thread to copy from memcache to this singleton. This would, in theory, be more optimized than checking memcache once-per-request and also encapsulate the polling logic into one place, rather than being split between a cron task and the lookup logic.
My question is: are there any caveats to having some sort of runloop in the background while a Rails app is running? I understand the implications of multithreading, from Objective-C/Java, but I'm asking specifically about the Rails (3) environment.
Basically something like:
class SiteAlertsMap < Hash
include Singleton
def initialize
super
begin_polling
end
# ... SNIP, any specific methods etc ...
private
def begin_polling
# Create some other Thread here, which polls at set intervals
end
end
This leads me into a similar question. We push (encrypted) tasks onto an SQS queue, for things related to e-commerce and for long-running background tasks. We don't use cron for this, but rather we have a worker daemon written in PHP, which runs in the background. Right now when we deploy, we have to shut down this worker and start it again from the new code-base. In Rails, could I somehow have this process start and stop with the rails server (unicorn) itself? I don't think that's something I'd running on the main process in a separate thread, since we often want to control it as a process by itself, but it would be nice if it just conveniently ran when the web application was running.
Threading for background processes in ruby would be a terrible mistake, especially since you're using a multi-process server. Using unicorn with say 4 worker processes would mean that you'd be polling from each of them, which is not what you want. Ruby doesn't really have real threads, it has green threads in 1.8 and a global interpreter lock in 1.9 IIRC. Many gems and libraries are also obnoxiously unthreadsafe.
Using memcache is still your best option and, if you have it set up correctly, you should only see it adding a millisecond or two to the request time. Another option which would give you the benefit of persisting these alerts while incurring minimal additional overhead would be to store these alerts in redis. This would better protect you against things like memcache crashing or server reboots.
For the background jobs you should use a similar approach to what you have now, but there are several off the shelf handlers for this like resque, delayed_job, and a few others. If you absolutely have to use SQS as the backend queue, you might be able to find some code to help you, but otherwise you could write it yourself. This still requires the other daemon to be rebooted whenever there is a code change. In practice this isn't a huge concern as best practices dictate using a deployment system like capistrano where a rule can easily be added to bounce the daemon on deploy. I use monit to watch the daemon process, so restarting it is as easy as telling monit to restart it.
In general, Ruby is not like Java/Objective-C when it comes to threads. It follows the more Unix-like model of process based isolation, but the community has come up with best practices and ways to make this less painful than in other languages. Ruby does require a bit more attention to setting up its stack as it is not as simple as enabling mod_php and copying some files around, but once the choices and architecture is understood, it is easier to reason about how your application works. The process model, in my opinion, is much better for web apps as it isolates code and state from the effects of other running operations. The isolation also makes the app easier to work with in a distributed system.

Load Ruby on Rails models without loading the entire framework

I'm looking to create a custom daemon that will run various database tasks such as delaying mailings and user notifications (each notice is a separate row in the notifications table). I don't want to use script/runner or rake to do these tasks because it is possible that some of the tasks only require the create of one or two database rows or thousands of rows depending on the task. I don't want the overhead of launching a ruby process or loading the entire rails framework for each operation. I plan to keep this daemon in memory full time.
To create this daemon I would like to use my models from my ruby on rails application. I have a number of rails plugins such as acts_as_tree and AASM that I will need loaded if I where to use the models. Some of the plugins I need to load are custom hacks on ActiveRecord::Base that I've created. (I am willing to accept removing or recoding some of the plugins if they need components from other parts of rails.)
My questions are
Is this a good idea?
And - Is this possible to do in a way that doesn't have me manually including each file in my models and plugins?
If not a good idea
What is a good alternative?
(I am not apposed to doing writing my own SQL queries but I would have to add database constraints and a separate user for the daemon to prevent any stupid accidents. Given my lack of familiarity with configuring a database, I would like to use active record as a crutch.)
It sounds like your concern is that you don't want to pay the time- or memory- cost to spin up the rails stack every time your task needs to be run? If you plan on keeping the daemon running full-time, as you say, you can just daemonize a process that has loaded your rails stack and will only have to pay that memory- or time-related penalty for loading the stack one time, when the daemon starts up.
Async_worker is a good example of this sort of pattern: It uses beanstalk to pass messages to one or more worker processes that are each just daemons that have loaded the full rails stack.
One thing you have to pay attention to when doing this is that you'll need to restart your daemonized processes upon a deploy so they can reload your updated rails stack. I'm using this for a url-shortener app (the single async worker process I have running sits around waiting to save referral data after the visitor gets redirected), and it works well, I just have an after:deploy capistrano task that restarts any async worker(s).
You can load up one aspect of Rails such as ActiveRecord but when you get right down to it the cost of loading the entire environment is not much more than just loading ActiveRecord itself. You could certainly just not include aspects like ActionMailer or some of the side bits but I'm going to guess that you're not going to see much win out of it.
What I would suggest instead is either running through runner/console like you said you didn't want to but rather than bootstrapping each time, try to batch things so that you're doing 1000 at a time instead of 1. There are a lot of projects that use this style, some of the bulk mailers spring to mind if you want examples. DJ (delayed_job) does similar by storing a bit in the database saying that this code needs to be run at some point in the future using the environment stack but it tries to batch together as much as it can so you may get win from that.
The other option is to have a persistent mini-rails app with as much stripped out as possible so that the memory usage is lower which can listen for requests and do your bidding when you want it to. This would be more memory but the latency of bootstrapping would be essentially nullified.
Lastly, as an afterthought, this would be a great use for Postgres.

Best rails solution for a mailer that runs every minute

I have an application that checks a database every minute for any emails that are supposed to be sent out at that time. I was thinking about making this a rake task that would be run by a cron job every minute. Would there be a better solution to this?
From what I have read, this isn't ideal because rake has to load the entire rails environment every minute and this becomes expensive.
Thoughts?
Thanks.
You can use backgroundrb. This, however, will eat up memory away from your main Rails app as it will spawn one Ruby instance exclusive to backgroundrb.
You can also define a SystemController (or equivalent) in your main application, with various actions corresponding to the various household tasks your application should perform. You can "prod" it from crontab using wget or curl, the advantage being that it shares resources with your main application. Depending on how paranoid or you are, or on how vulnerable to DOS (or other types of attacks) exposing such a controller to, possibly, the outside world, you may choose to block access to this controller's URL from addresses other than the loopback (ideally in your reverse proxy, alternatively from the controller itself.)
One really simple method would be to have a script that does..
while true do
check_and_send_messages()
sleep 60
end
..which means you are not constantly respawning the Rails environment.
Obviously it has various flaws, but also has some benefits (for example, with your 1-Rake-per-minute, it the Rake task takes more than one minute, Rake will be running multiple times at once)
Also, the Railscasts episodes Rake in Background, Starling and Workling, and Custom Daemon might give you some ideas (they are describing exactly this task)
It turns out there's actually something built just for this: ar_mailer. ar_mailer queues up the e-mails into the DB and then sends them out periodically using the ar_mailer command. You can call ar_mailer every minute.
The nice thing about ar_mailer is that it basically requires very little change in terms of how you already send e-mails. You just need to inherit from ar_mailer instead of ActiveMailer. Using this method, you won't have to worry about running rake tasks in the background, forking processes, or anything like that - and in effect you get a real mail server with queued messages that are deleted when the mail is actually sent. This feature is important if you have a system that sends out large numbers of e-mail enmass. I've used ar_mailer to build a social network - so I can attest to its robustness.
Here's a good article that talks about ar_mailer in depth. I would strongly advise against rolling your own solution here as Eric has built a time-tested solution to this very problem.
I do what Vlad suggested (#2), with only local requests honored, and I'm paranoid enough to also require a specific query string tacked on to the url.
I have several periodic actions set up this way.

Best practice for Rails App to run a long task in the background?

I have a Rails application that unfortunately after a request to a controller, has to do some crunching that takes awhile. What are the best practices in Rails for providing feedback or progress on a long running task or request? These controller methods usually last 60+ seconds.
I'm not concerned with the client side... I was planning on having an Ajax request every second or so and displaying a progress indicator. I'm just not sure on the Rails best practice, do I create an additional controller? Is there something clever I can do? I want answers to focus on the server side using Rails only.
Thanks in advance for your help.
Edit:
If it matters, the http request are for PDFs. I then have Rails in conjunction with Ruport generate these PDFs. The problem is, these PDFs are very large and contain a lot of data. Does it still make sense to use a background task? Let's assume an average PDF takes about one minute to two minutes, will this make my Rails application unresponsive to any other server request during this time?
Edit 2:
Ok, after further investigation, it seems my Rails application is indeed unresponsive to any other HTTP requests after a request comes in for a large PDF. So, I guess the question now becomes: What is the best threading/background mechanism to use? It must be stable and maintained. I'm very surprised Rails doesn't have something like this built in.
Edit 3:
I have read this page: http://wiki.rubyonrails.org/rails/pages/HowToRunBackgroundJobsInRails. I would love to read about various experiences with these tools.
Edit 4:
I'm using Passenger Phusion "modrails", if it matters.
Edit 5:
I'm using Windows Vista 64 bit for my development machine; however, my production machine is Ubuntu 8.04 LTS. Should I consider switching to Linux for my development machine? Will the solutions presented work on both?
The Workling plugin allow you to schedule background tasks in a queue (they would perform the lengthy task). As of version 0.3 you can ask a worker for its status, this would allow you to display some nifty progress bars.
Another cool feature with Workling is that the asynchronous backend can be switched: you can used DelayedJobs, Spawn (classic fork), Starling...
I have a very large volume site that generates lots of large CSV files. These sometimes take several minutes to complete. I do the following:
I have a jobs table with details of the requested file. When the user requests a file, the request goes in that table and the user is taken to a "jobs status" page that lists all of their jobs.
I have a rake task that runs all outstanding jobs (a class method on the Job model).
I have a separate install of rails on another box that handles these jobs. This box just does jobs, and is not accessible to the outside world.
On this separate box, a cron job runs all outstanding jobs every 60 seconds, unless jobs are still running from the last invocation.
The user's job status page auto-refreshes to show the status of the job (which is updated by the jobs box as the job is started, running, then finished). Once the job is done, a link appears to the results file.
It may be too heavy-duty if you just plan to have one or two running at a time, but if you want to scale... :)
Calling ./script/runner in the background worked best for me. (I was also doing PDF generation.) It seems like the lowest common denominator, while also being the simplest to implement. Here's a write-up of my experience.
A simple solution that doesn't require any extra Gems or plugins would be to create a custom Rake task for handling the PDF generation. You could model the PDF generation process as a state machine with states such as submitted, processing and complete that are stored in the model's database table. The initial HTTP request to the Rails application would simply add a record to the table with a submitted state and return.
There would be a cron job that runs your custom Rake task as a separate Ruby process, so the main Rails application is unaffected. The Rake task can use ActiveRecord to find all the models that have the submitted state, change the state to processing and then generate the associated PDFs. Finally, it should set the state to complete. This enables your AJAX calls within the Rails app to monitor the state of the PDF generation process.
If you put your Rake task within your_rails_app/lib/tasks then it has access to the models within your Rails application. The skeleton of such a pdf_generator.rake would look like this:
namespace :pdfgenerator do
desc 'Generates PDFs etc.'
task :run => :environment do
# Code goes here...
end
end
As noted in the wiki, there are a few downsides to this approach. You'll be using cron to regularly create a fairly heavyweight Ruby process and the timing of your cron jobs would need careful tuning to ensure that each one has sufficient time to complete before the next one comes along. However, the approach is simple and should meet your needs.
This looks quite an old thread. However, what I have down in my app, which required to run multiple Countdown Timers for different pages, was to use Ruby Thread. The timer must continue running even if the page was closed by users. Ruby makes it easy to write multi-threaded programs with the Thread class. Ruby threads are a lightweight and efficient way to achieve parallelism in your code. I hope this will help other wanderers who is looking to achieve background: parallelism/concurrent services in their app. Likewise Ajax makes it a lot easier to call a specific Rails [custom] action every second.
This really does sound like something that you should have a background process running rather than an application instance(passenger/mongrel whichever you use) as that way your application can stay doing what it's supposed to be doing, serving requests, while a background task of some kind, Workling is good, handles the number crunching. I know that this doesn't deal with the issue of progress, but unless it is absolutely essential I think that is a small price to pay.
You could have a user click the action required, have that action pass the request to the Workling queue, and have it send some kind of notification to the user when it is completed, maybe an email or something. I'm not sure about the practicality of that, just thinking out loud, but my point is that it really seems like that should be a background task of some kind.
I'm using Windows Vista 64 bit for my
development machine; however, my
production machine is Ubuntu 8.04 LTS.
Should I consider switching to Linux
for my development machine? Will the
solutions presented work on both?
Have you considered running Linux in a VM on top of Vista?
I recommend using Resque gem with it's resque-status plug-in for your heavy background processes.
Resque
Resque is a Redis-backed Ruby library for creating background jobs,
placing them on multiple queues, and processing them later.
Resque-status
resque-status is an extension to the resque queue system that provides
simple trackable jobs.
Once you run a job on a Resque worker using resque-status extension, you will be able to get info about your ongoing progresses and ability to kill a specific process very easily. See examples:
status.pct_complete #=> 0
status.status #=> 'queued'
status.queued? #=> true
status.working? #=> false
status.time #=> Time object
status.message #=> "Created at ..."
Also resque and resque-status has a cool web interface to interact with your jobs which is so cool.
There is the brand new Growl4Rails ... that is for this specific use case (among others as well).
http://www.writebetterbits.com/2009/01/update-to-growl4rails.html
I use Background Job (http://codeforpeople.rubyforge.org/svn/bj/trunk/README) to schedule tasks. I am building a small administration site that allows Site Admins to run all sorts of things you and I would run from the command line from a nice web interface.
I know you said you were not worried about the client side but I thought you might find this interesting: Growl4Rails - Growl style notifications that were developed for pretty much what you are doing judging by the example they use.
I've used spawn before and definitely would recommend it.
Incredibly simple to set up (which many other solutions aren't), and works well.
Check out BackgrounDRb, it is designed for exactly the scenario you are describing.
I think it has been around for a while and is pretty mature. You can monitor the status of the workers.
It's a pretty good idea to develop on the same development platform as your production environment, especially when working with Rails. The suggestion to run Linux in a VM is a good one. Check out Sun xVM for Open Source virtualization software.
I personally use active_messaging plugin with a activemq server (stomp or rest protocol). This has been extremely stable for us, processing millions of messages a month.

Resources