how do you convert a separate Ruby program into a rake task? - ruby-on-rails

Sorry for what seems like a basic question, but I'm stumped.
I have a Rails app that relies in part on third party data that I download periodically (generally daily) and integrate into the app's database. The Ruby code that I use to get the third party data is in separate Ruby files, i.e., not integrated as code in any of my controllers. So, I run these programs when needed via rails runner program.rb (probably not relevant, but all of them use the mechanize gem in gathering the third party data).
I want to try using Heroku's scheduler to make my data gathering more automated, and the recommendation for doing this is to set up rake tasks. Is there a rake task equivalent for rails runner program.rb?
Thanks

Converting a rails command into a rake task is pretty easy. You just put it in a namespaced task like this:
# /lib/tasks/my_task.rake
namespace :mytask do
  desc 'the description of your task'
  task :my_task => :environment do
    puts 'my task was executed'
  end
end
Then you'd call it with 'rake mytask:my_task', which you should be able to do from Heroku's scheduler. I'd also recommend the 'whenever' gem for setting things up on a schedule; I have no experience with Heroku, but I've found that gem to be fantastic.
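For the specific case in the question (an existing file run via rails runner program.rb), one possible shape is a task that depends on :environment and then loads the script. This is only a sketch: the data namespace, the fetch task name, and the script path are hypothetical placeholders, not anything from the original post.

```ruby
# lib/tasks/data.rake
# A Rails Rakefile already loads rake; the require only matters if you
# try this file on its own.
require 'rake'

# Hypothetical names: the `data` namespace, the `fetch` task, and the
# script path below are placeholders for your own.
namespace :data do
  desc 'Fetch third-party data and load it into the database'
  task :fetch => :environment do
    # :environment has booted Rails by this point, so the script sees the
    # same models it would see under `rails runner`.
    load Rails.root.join('script', 'fetch_third_party_data.rb').to_s
  end
end
```

With that in place, rake data:fetch would be the command to run locally or to hand to a scheduler.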

Related

Rake :environment does not work in production

I have a bunch of rake tasks that modify models in my rails project. They all work just fine in development, but in production they fail to load up associated model and service classes.
The problem seems to come from the :environment declaration. My tasks take the form
task :my_task => :environment do
  # modify models
end
The documentation says that :environment loads the rails environment so that you can interact with any file in the rails system, but apparently this is not the case in production?
Is there a way to load needed files in production? Or should I not be using the :environment task at all? Seems really weird to have the code behave one way in development and another in production (testing this is gonna be a pain).
Seems to be an issue with the way rake tasks don't eager-load. This answer may be the droids you're looking for: Rails 3 rake task can't find model in production
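One workaround that is sometimes used for this (it may or may not be what the linked answer proposes) is to eager-load the application explicitly at the top of the task. The sketch below assumes a Rails 3+ app, where Rails.application.eager_load! is available; the namespace and task name are hypothetical.

```ruby
# lib/tasks/models.rake
# A Rails Rakefile already loads rake; the require only matters if you
# try this file on its own.
require 'rake'

# Hypothetical namespace and task name.
namespace :models do
  desc 'Modify models with all app classes loaded up front'
  task :fix_up => :environment do
    # Assumption: classes are not being autoloaded in production, so load
    # everything under the app's eager-load paths before touching models.
    Rails.application.eager_load!

    # modify models here
  end
end
```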

How to test rake tasks, cron jobs, and

I know in Rails application, I can write tests for controllers and models by using Rspec.
But:
What about testing a rake task? What is a good way to do that?
What about testing a cron job that runs a certain rake task every day at a fixed time?
Can Rspec also be used for these two scenarios in Rails app development, or are there other ways to implement those tests?
In addition:
I have a rake task that updates the Rails app's database by fetching data from another database and inserting it into the app database (after cleaning the app database first, of course).
I would like to test this. How do I do it?
Instead of having code in your rake tasks, do something like this:
desc "Charge Customers Daily"
task :charge_customers => :environment do
  CustomerCharges.create
end
That way, you can write rspec tests in the customer_charges_spec.rb file as you normally would.
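To make that concrete, here is a rough sketch of what such a class might look like when it is written to be easy to test. Everything in it is an assumption for illustration: the keyword arguments, the hash-based customers, and the FakeGateway class are not from the answer, which only shows the call CustomerCharges.create.

```ruby
# Hypothetical stand-in for the answer's CustomerCharges class. The customer
# list and the gateway are passed in so a spec can supply fakes.
class CustomerCharges
  # Charges every customer with a positive balance and returns how many
  # customers were charged.
  def self.create(customers:, gateway:)
    due = customers.select { |customer| customer[:balance] > 0 }
    due.each { |customer| gateway.charge(customer[:id], customer[:balance]) }
    due.size
  end
end

# A fake gateway that records calls instead of charging anything.
class FakeGateway
  attr_reader :charges

  def initialize
    @charges = []
  end

  def charge(customer_id, amount)
    @charges << [customer_id, amount]
  end
end
```

A spec can then call CustomerCharges.create with a FakeGateway and assert on gateway.charges, without involving rake at all.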
Maybe this helps :
http://www.philsergi.com/2009/02/testing-rake-tasks-with-rspec.html
To test a cron job, you can just redirect its output to a log file in the crontab entry itself, like:
command > /tmp/log.txt 2>&1
Generally, if you intend to do db insertions and the like in a rake task, I would reconsider and write that as a separate Ruby module. I think it's much more flexible this way.

How to handle one-off deployment tasks with capistrano?

I am currently trying to automate the deployment process of our rails app as much as possible, so that a clean build on the CI server can trigger an automated deployment on a test server.
But I have run into a bit of a snag with the following scenario:
I have added the friendly_id gem to the application. There's a migration that creates all the necessary tables. But to fill these tables, I need to call a rake task.
Now, this rake task only has to be called once, so adding it to the deployment script would be overkill.
Ideally, I am looking for something like migrations, but instead of the database, it should keep track of scripts that need to be called during a deployment. Does such a beast already exist?
Looks like the after_party gem does exactly what you want.
I can't think of anything that does exactly what you want, but if you just need to be able to run tasks on remote servers in a one off fashion you could always use rake through capistrano.
There's an SO question for that here: How do I run a rake task from Capistrano?, which also links to this article http://ananelson.com/said/on/2007/12/30/remote-rake-tasks-with-capistrano/.
Edit: I wonder if it's possible to create a migration which doesn't do any database changes, but just invokes a rake task? Rake::Task["task:name"].invoke. Worth a try?
I would consider running that rake task to be part of the migration to using friendly_id. Sure, you've created the tables, but you're not done yet! You still have to do some data updates before you've truly migrated.
Call the rake task from your migration. It'll update the existing data and new records will be handled by your app logic in the future.
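As a rough illustration of the Rake::Task[...].invoke call mentioned above, the sketch below defines a stand-in task and invokes it the way a migration's up method might. The slugs:backfill task and the RUN_LOG constant are made up for the example; in a real migration the task would come from the gem, and tasks may need to be loaded first (for example with Rails.application.load_tasks) if the migration is not being run through rake.

```ruby
require 'rake'

# Records each time the task body runs, purely for demonstration.
RUN_LOG = []

# Hypothetical task standing in for the gem's one-off data task.
namespace :slugs do
  task :backfill do
    RUN_LOG << :backfill
  end
end

# What the body of a migration's `up` method could do:
Rake::Task['slugs:backfill'].invoke

# invoke only runs a task once per process, so a second call is a no-op...
Rake::Task['slugs:backfill'].invoke

# ...unless the task is re-enabled first.
Rake::Task['slugs:backfill'].reenable
Rake::Task['slugs:backfill'].invoke
```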

Recurring tasks in a Ruby On Rails application: Cron or other?

I am currently writing an application that pulls new information from RSS sources and has to update those RSS sources at a certain frequency. Currently I am pulling only when the user requests a feed, but I want to change that behavior to automatic periodic fetching.
I started writing a shell script that would interact with the database and be run periodically via cron, but that duplicates a lot of effort, so I was wondering what the "Rails Way" or "Ruby Way" to do this would be. I am using Ubuntu, Apache and Passenger. Can you suggest better methods, perhaps ones included in the application itself, so I can easily deploy the app to another machine without having to meddle with cron?
I would suggest doing something like a rake task and using the whenever gem to generate your cron job to run the rake task.
Check out http://railscasts.com/episodes/164-cron-in-ruby for more information on the whenever gem.
The main benefit of the whenever gem is that it keeps your application's scheduling requirements (e.g. a cron job running every x hours) inside your application, which makes the application more portable.
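For a sense of what that looks like, here is a minimal schedule file in whenever's DSL. It is a configuration fragment that the gem reads, not a script to run directly, and the feeds:refresh task name is a hypothetical placeholder for your own rake task.

```ruby
# config/schedule.rb (read by the whenever gem, not executed on its own)
# Hypothetical task name: feeds:refresh stands in for your own rake task.
every 1.day, :at => '12:10 am' do
  rake 'feeds:refresh'
end
```

Running whenever --update-crontab from the app directory should then write the corresponding crontab entry.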
I recommend a combination of the two above. You want a rake task, even if you have a direct method already created. This is because server admin stuff that you'd want to run in cron, you might also want to run from the command line occasionally, and this is what rake tasks are good for.
The whenever plugin sounds cool, although I can't vouch for it. Of course, it's good to know how to do things from scratch, then use plugins to make your life easier. Here's the from-scratch way.
Create a new file, lib/tasks/admin.rake
Inside, create the task itself:
namespace :admin do
  desc "Updates all RSS feeds"
  task :rss => :environment do
    RssFeed.update_all
  end
end
This assumes you have an RssFeed class, and the update_all method does what you'd expect. You can call this from the command line:
rake admin:rss
And you can add this to cron by running crontab -e as the web user and adding this line:
10 0 * * * cd /path/to/rails/app && rake RAILS_ENV=production admin:rss
There are a variety of solutions. For the simplest setup, you can use script/runner in your crontab something like so:
10 0 * * * /home/myuser/myproject/script/runner -e production ModelName.methodname
The method must be a class method on your model. Reference the project by its full path; otherwise it will most likely not be found in the cron environment. Check your crontab man page for info on the crontab syntax if you're not familiar with it. The above, for example, runs the script at the 10th minute of the 0th hour of every day (at 12:10am, in short).
If you need a more powerful solution, you could use BackgroundRB. BackgroundRB runs a daemon, supports scheduled tasks, and can put results in a database. It even has a simple communication protocol that lets your web processes request that a task be completed and then retrieve the result. This allows you to control background jobs right from the web interface, rather than from a crontab which just "happens".
There is a good bit more setup needed for BackgroundRB to work, but it may be worth it if jobs need to be controlled.
Try using whenever. Even though it will create a cron job in the end, the schedule definition is written inside your application using a Ruby DSL.
For small teams and personal projects, the whenever gem is great. But if your company has an ops team separate from the development team, it might not be ideal.
At my last job, the ops team needed to be able to see the cron we were installing so they could be confident it wouldn't have any side effects for the system. So a DSL solution wasn't going to work. But we (the developers) wanted the cron scripts in version control.
So to compromise, we checked in text files with the raw cron entries, similar to this:
10 0 * * * cd /path/to/rails/app && rake RAILS_ENV=production admin:rss
And we added a step to the capistrano script that installed that to the crontab as part of the deploy.
Try setting up Webmin on your server, if your hosting provider offers it. Go to the URL below. It's easy to set up and user-friendly.
URL is:
http://your_ip_address:10000/
I have used this in many of my applications, and it has worked for me for scheduling cron jobs.

I have a Rails task: should I use script/runner or rake?

For ad hoc Rails tasks we have a few implementation alternatives, chief among which would seem to be:
script/runner some_useful_thing
and:
rake some:other_useful_thing
Which option should I prefer? If there's a clear favourite then when, if ever, should I consider using the other? If never, then why would you suppose it's still present in the framework without deprecation warnings?
The difference between them is that script/runner boots Rails whereas a Rake task doesn't unless you tell it to by making the task depend on :environment, like this:
task :some_useful_task => :environment do
  # do some useful task
end
Since booting Rails is expensive, it might be worth skipping if you can avoid it.
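The prerequisite mechanics can be seen without Rails at all. In the sketch below, the :environment task is a stub standing in for the real one that Rails defines, and the ORDER constant and the other task names are made up for the demonstration.

```ruby
require 'rake'

# Records the order in which task bodies run, purely for demonstration.
ORDER = []

# Stub standing in for Rails' real :environment task, which boots the app.
task :environment do
  ORDER << :rails_booted
end

# Declares :environment as a prerequisite, so the "boot" runs first.
task :needs_rails => :environment do
  ORDER << :needs_rails
end

# No prerequisite: nothing is booted before the body runs.
task :plain do
  ORDER << :plain
end

Rake::Task['plain'].invoke
Rake::Task['needs_rails'].invoke
```

Only the task that names :environment pays the boot cost; the plain task runs its body straight away.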
Other than that, they are roughly equivalent. I use both, but lately I've more often used script/runner to execute a separate script.
Passing parameters to a rake task is a pain in the butt, to say the least. You either need to resort to environment variables or a very hackish parameter system that is not intuitive and has lots of caveats.
If your task needs to handle command line arguments gracefully then writing a script is the way to go.
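For reference, the two parameter styles mentioned above look roughly like this. The feeds namespace, the task names, the URL, and the RESULTS constant are all hypothetical; the sketch only shows the mechanics and makes no claim about which style is less painful.

```ruby
require 'rake'

# Collects what each task received, purely for demonstration.
RESULTS = []

namespace :feeds do
  # Style 1: rake's bracket arguments.
  # From a shell this would be: rake "feeds:fetch[http://example.com/rss,5]"
  task :fetch, [:url, :limit] do |_task, args|
    args.with_defaults(:limit => '10') # arguments always arrive as strings
    RESULTS << [args[:url], args[:limit].to_i]
  end

  # Style 2: environment variables.
  # From a shell this would be: rake feeds:fetch_env URL=http://example.com/rss
  task :fetch_env do
    RESULTS << [ENV.fetch('URL', 'none'), 10]
  end
end

# Programmatic equivalent of the bracket syntax above.
Rake::Task['feeds:fetch'].invoke('http://example.com/rss', '5')
```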
Luke Francl mentions script/runner booting up Rails. That's true. But if you don't want to boot up Rails, then just run the script as is, without script/runner. So the only real difference between scripts and rake tasks is aesthetic. Choose whatever feels right to you.
I use rake tasks for little tasks (one or two lines). Anything more complicated goes into the script/ directory. I'll break this rule if I think other developers will expect the code to live in one place over another.
FWIW there seems to be some movement away from using script runner in favor of rake:
Update (4/25/2009): I recommend using rake tasks as opposed to script/runner for recurring tasks.
Also, as per this post you can use rake for recurring tasks just fine:
If I then wanted this to run nightly on my production database at midnight, I might write a cronjob that looks something like this:
0 0 * * * cd /var/www/apps/rails_app/ && /usr/local/bin/rake RAILS_ENV=production utils:send_expire_soon_emails
Corrected based on comment 2 down. Give them the karma!
FWIW - Rails 3.0+ changes how you initialize the Rails system in a standalone script.
require File.dirname(__FILE__) + '/config/environment'
As mentioned above you can also do:
rails runner script/<script name>
Or put all the code in a Rake task, but I have a lot of legacy code from Rails 2; so I didn't want to go down that path immediately.
Each has its advantages and disadvantages.
One thing I've done is just write normal ruby scripts and put them in the script/maintenance directory.
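All you need to do to load Rails and get access to all your models, etc., is put require File.expand_path('../../config/environment', File.dirname(__FILE__)) at the top of your file, then you're away. (A bare require '../../config/environment.rb' is resolved against the current working directory, so it only works when the script is run from its own directory.)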
For one off commands script/runner can be fine. For anything repeated, a rake task is easier in the long-run, and has a summary if you forget what it does.
In Rails 3.0+, config/environment.rb requires config/application.rb, which in turn requires config/boot.rb.
So, to load an app in Rails 3, you still only have to require environment.rb.
I got the impression script/runner was primarily for periodic tasks. E.g., a cron job that runs:
SomeClass.update_from_web('http://www.sourcefordata.gov/')
