Rails polling RSS every 6 hours - ruby-on-rails

I am seeking advice on how to poll an RSS feed at a fixed interval, such as every 6 hours.
The following code works: it reads and parses the feed and adds the entries to the database, skipping entries that are already stored. Here is the class method:
def self.update_from_feed(feed_url)
  # Example feed_url: "http://feeds.feedburner.com/PaulDixExplainsNothing"
  feed = Feedzirra::Feed.fetch_and_parse(feed_url)
  feed.entries.each do |entry|
    # Only create entries whose guid is not stored yet.
    unless exists? :guid => entry.id
      create!(
        :name => entry.title,
        :url => entry.url,
        :published_at => entry.published,
        :guid => entry.id
      )
    end
  end
end
How do I run this entire class method, let's say, every 6 hours? I'm new to Ruby (and Rails), so any help with an example would be appreciated. I want to avoid running an external cron job if possible; I want it to run every 6 hours from within the code, if that makes sense. Thanks

No, it doesn't make sense to not use cron for this. This is what cron is for, it's in the name of the program.
If you don't like the cron syntax, that's cool: there's a gem called whenever (https://github.com/javan/whenever) that gives you a nice Ruby syntax for defining cron jobs and a command that writes them to your crontab.
However, for the love of god, do not try to invent a new way of doing this, unless you're adding some killer features. Use cron, move on.
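As a rough sketch of what that looks like with whenever, a config/schedule.rb along these lines would run the class method from the question every 6 hours. The model name FeedEntry is an assumption (the question never names its model), and the URL is just the example feed from the question.

```ruby
# config/schedule.rb: read by the whenever gem, not loaded by Rails itself.
# FeedEntry is a hypothetical model name; substitute the class that
# actually defines update_from_feed.
every 6.hours do
  runner "FeedEntry.update_from_feed('http://feeds.feedburner.com/PaulDixExplainsNothing')"
end
```

Running whenever --update-crontab from the application root should then install the matching crontab entry.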

Related

Background thread for email sending in Ruby on Rails

I need to run a background thread in a Ruby on Rails application that sends emails when a certain date is reached, depending on values in the DB (the email body should also contain some info from the DB).
What is the best way to achieve such behavior?
I'm using Ruby on Rails 4.1.4 btw.
Thanks in advance.
You would be better off using a framework like Sidekiq or Resque than doing it yourself.
With Sidekiq, you can use Sidekiq Pro or various third-party projects to schedule jobs. See "Recurring jobs" on Sidekiq's wiki for projects that provide scheduling capability.
You can use the whenever gem to run background jobs on whatever schedule you need.
Check out the Whenever docs on GitHub.
Ryan Bates created a great Railscast about Whenever: Cron jobs Intro in rails
In config/schedule.rb
every 2.hours do
  runner "User.send_email_to_users"
end
In User model
def self.send_email_to_users
  # Write your selection logic here, then send the mails.
  # find_each visits every user; narrow the query to the users who are due.
  User.find_each do |user|
    UserMailer.send_mail_persons(user).deliver
  end
  # Pass any other data the mailer needs as extra arguments.
end
In app/mailers/user_mailer.rb
def send_mail_persons(user)
  @user = user
  mail :to => @user.email, :subject => "Amusement.", :from => MAIL_ADDRESS
end
Create an HTML template as required at app/views/user_mailer/send_mail_persons.html.erb
lib/tasks/mail_task.rake
namespace :mail_task do
  desc "Send Mail"
  task :send_mail => :environment do
    ...
  end
end
Command line
rake mail_task:send_mail

Using FeedJira to create RSS aggregator/reader

I am trying to create my own rss reader app in ruby on rails. I want to be able to store various news stories in my database that I can pull from later to display each story with its headline, image, summary, etc. in a nice layout. I am working with the feedjira library and am also pretty new to RoR. I know that these two commands in the rails console fetch rss feeds and somehow parse them:
urls = %w[http://feedjira.com/blog/feed.xml https://github.com/feedjira/feedjira/feed.xml]
feeds = Feedjira::Feed.fetch_and_parse urls
While these two commands work on rss feeds, I was wondering how I could configure my database/model and then save the news entries I get from Feedjira into the db. I tried watching the railscast on this issue but it seemed a bit out of date. Any help on this issue would be immensely appreciated! Thanks in advance!
Here's one way:
Create a model such as this:
class Entry < ActiveRecord::Base
  # Must list every attribute that create! assigns below.
  attr_accessible :entry_id, :feed_id, :url, :title, :summary, :description, :published_at
  def self.update_from_feed(feed_name)
    feed = Feed.find_by_name(feed_name)
    feed_data = Feedjira::Feed.fetch_and_parse(feed.feed_url)
    add_entries(feed_data.entries, feed)
  end
  def self.add_entries(entries, feed)
    entries.each do |entry|
      # Stop at the first entry that is already stored.
      break if exists? :entry_id => entry.id
      create!(
        :entry_id => entry.id,
        :feed_id => feed.id,
        :url => entry.url,
        :title => entry.title.sanitize,
        :summary => entry.summary.sanitize,
        :description => entry.content.sanitize,
        :published_at => entry.published
      )
    end
  end
  # A bare `private` has no effect on methods defined with `def self.`.
  private_class_method :add_entries
end
You can then call this from the cli / cron or whatever with, for example:
rails runner -e development 'Entry.update_from_feed("feedname")'
This runs the update_from_feed method in the context of your Rails app using a separate rails instance (a bit like rails console), but doesn't impact the running Rails instance.
In this example, there's a separate Feed model with name and feed_url columns, so the URL is looked up from the provided name.
This code doesn't use Feedjira's ability to check for updates, so duplicate checking is baked in.
(This GitHub issue says to avoid using the #update method.)
Note that the use of break assumes that new entries are always added to the top of the feed. If you don't trust the feed, replace break if with next if, so that known entries are skipped instead of ending the loop. The url can be used as an alternative unique id.
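To make the difference concrete, here is a small plain-Ruby sketch (no Rails or Feedjira needed). The Item struct, the import helper and the Set standing in for the database are all made up for illustration; only the break-versus-skip behaviour is the point.

```ruby
require "set"

# Stand-in for a parsed feed entry; only the id matters here.
Item = Struct.new(:id, :title)

# Import entries into `seen` (a Set of ids standing in for the database).
# stop_at_first_known: true mirrors `break if exists?`,
# stop_at_first_known: false mirrors skipping known entries.
def import(entries, seen, stop_at_first_known:)
  added = []
  entries.each do |entry|
    if seen.include?(entry.id)
      break if stop_at_first_known
      next
    end
    seen << entry.id
    added << entry.id
  end
  added
end

# A feed that is NOT newest-first: new entry "c" sits below known entry "a".
feed = [Item.new("b", "New"), Item.new("a", "Old"), Item.new("c", "Also new")]

with_break = import(feed, Set["a"], stop_at_first_known: true)
with_skip  = import(feed, Set["a"], stop_at_first_known: false)

puts "break: #{with_break.inspect}" # "c" is never reached
puts "skip:  #{with_skip.inspect}"  # "c" is imported
```

With break the import is cheaper on well-ordered feeds, because it stops as soon as it hits known territory; skipping costs one lookup per entry but does not miss out-of-order items.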
Edit:
Here's a version of the update_from_feed method that takes advantage of Feedjira's ability to process multiple feeds:
def self.update_all
  feed_urls = Feed.pluck :feed_url
  feeds = Feedjira::Feed.fetch_and_parse(feed_urls)
  feed_urls.each do |feed_url|
    feed = Feed.find_by_feed_url(feed_url)
    add_entries(feeds[feed_url].entries, feed)
  end
end
pluck returns all the rows of the specified column(s) (:feed_url in this case) in an array. Equally you could change it to accept an array of names, from which it looks up an array of URLs to pass to feedjira.
Finally, if you wanted a self-looping method, you could include:
def self.update_all_periodically(frequency = 15.minutes)
  loop do
    update_all
    sleep frequency.to_i
  end
end
Then this:
rails runner -e development 'Entry.update_all_periodically'
won't return until you break the process, and will update all feeds at the default frequency, or that specified as an optional argument.
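One thing to watch with a bare loop like this is that a single exception (say, an unreachable feed) ends the whole process. Below is a hedged plain-Ruby sketch of a slightly more defensive loop. run_periodically, max_runs and sleeper are names invented here; the last two exist only so the loop can be exercised without actually waiting.

```ruby
# Run the given block repeatedly, pausing `interval` seconds between runs.
# `max_runs` and `sleeper` are test hooks; leave them at their defaults
# for a loop that runs until the process is stopped.
def run_periodically(interval:, max_runs: nil, sleeper: method(:sleep))
  runs = 0
  loop do
    begin
      yield
    rescue StandardError => e
      # One failed update should not kill the loop.
      warn "update failed: #{e.class}: #{e.message}"
    end
    runs += 1
    break if max_runs && runs >= max_runs
    sleeper.call(interval)
  end
  runs
end

# Demo: three runs, the second one raising, with sleeping stubbed out.
attempts = 0
slept = []
completed = run_periodically(interval: 900, max_runs: 3, sleeper: ->(s) { slept << s }) do
  attempts += 1
  raise "feed unreachable" if attempts == 2
end

puts "runs=#{completed} attempts=#{attempts} sleeps=#{slept.inspect}"
```

Inside the Rails app the call would presumably be something like run_periodically(interval: 15.minutes.to_i) { Entry.update_all }.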
If you wanted to run the updates asynchronously in your main Rails process, then a background worker such as Sidekiq, Resque or DelayedJob will do the... job. :)
Scheduling the fetching and parsing of all these feeds can be incredibly hard and time consuming, which means you should absolutely not do it from inside the Rails app itself. At best, you should do it using an 'offline' script.
You could also simply rely on existing APIs like Superfeedr and its rack middleware.

Communicating between 2 Rails applications

I have 2 separate rails applications (app a, app b). Both of these apps maintain a customer list. I would like to run a rake task once a day and have app b pull in select customers from app a.
This is the way I have attempted to solve this. If I am going down the wrong road please let me know.
I am using JBuilder to generate the JSON
My issue is how to have app B set an id in app A, so that the system knows the customer has already been transferred over.
I'm assuming I have to make a PUT request similar to the request I use to get the customer list, but I am having trouble getting that to work.
App A
Customers Model
scope :for_export, :conditions => {:verified => true, :new_system_id => nil ...}
Customers Controller
skip_before_filter :verify_authenticity_token, :only => [:update]
#
def index
  @customers = Customer.for_export
end
def update
  @customer = Customer.find(params[:id])
  if @customer.update_attributes(params[:customer])
    render :text => 'success', :status => 200
  end
end
App B
rake task
task :import_customers => :environment do
  c = Curl::Easy.new("http://domain.com/customers.json")
  c.http_auth_types = :basic
  c.username = 'username'
  c.password = 'password'
  c.perform
  customers = JSON.parse(c.body_str)
  customers.each do |attrs|
    customer = Customer.create(attrs)
    # PUT request back to app A to update the field
  end
end
What I have is currently working; I'm just not sure whether this is the correct approach, or how to initiate a PUT request that calls the update method in the customers controller.
Thanks!
Ryan
I'm sorry that I'm not answering your question, but I am giving you an alternative. What you are trying to create sounds a lot like an ETL job. You may want to consider having a batch job move a copy of your customers table from app a over to app b periodically, and then have another batch job import that table into app b's database. I know, it's a little clunky, but it's a very popular and reliable pattern to solve your problem.
Also, if both apps are in the same data center, then you may want to create a read-only database view of app a's customer data and then have app b read that using SQL calls. It's a slightly cheaper and easier way to integrate the two apps than the option that I listed above.
Good luck!
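If you do stay with the HTTP approach, the PUT request the question asks about could look roughly like the sketch below, using only Ruby's standard library. The URL layout (/customers/:id.json), the ids and the credentials are assumptions; new_system_id is the field from the for_export scope, nested under customer so that it shows up as params[:customer] in app A's update action. The request is only built here, not sent.

```ruby
require "net/http"
require "json"
require "uri"

# Build (but do not send) the PUT request that tells app A a customer
# has been imported. All arguments are placeholders for illustration.
def build_export_ack(base_url, customer_id, new_system_id, username, password)
  uri = URI.join(base_url, "/customers/#{customer_id}.json")
  request = Net::HTTP::Put.new(uri)
  request.basic_auth(username, password)
  request["Content-Type"] = "application/json"
  # Nested under "customer" to match params[:customer] in the controller.
  request.body = JSON.generate(customer: { new_system_id: new_system_id })
  [uri, request]
end

uri, request = build_export_ack("http://domain.com", 42, 1001, "username", "password")
puts "#{request.method} #{request.path}"
puts request.body

# Sending it would look like this (commented out to avoid a network call):
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# raise "update failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
```

In the rake task this would go where the placeholder comment sits, with the id of the freshly created record in app B passed as new_system_id.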

delayed_job vs. cron

I have a system where users come in to go through an application process that has multiple parts - sometimes users will save their progress and come back later.
I want to send users an e-mail if they haven't come back in 48 hours - would it be best to do this using cron, delayed_job, or whenever?
I've noticed that whenever I run operations in the console (such as bundle install or rake db:migrate) it runs cron as well, which makes me suspicious that we may have instances where users get multiple reminders in the same day.
What are your recommendations for this?
First of all, Whenever and Cron are synonymous. All Whenever does is provide a way for you to write cronjobs using Ruby (which is awesome, I love Whenever).
Delayed_job is not the answer here. You definitely want to use cronjobs. Create a method on your Application model that will get applications which have an updated_at value of < 2.days.ago and e-mail its applicant.
def self.notify_stale_applicants
  stale_applications = Application.where('updated_at < ?', 2.days.ago) # or 48.hours.ago
  stale_applications.each do |app|
    UserMailer.notify_is_stale(app).deliver
  end
end
And your UserMailer:
def notify_is_stale(application)
  @application = application
  mail(:to => application.user.email, :from => "Application Status <status@yourdomain.com>", :subject => "You haven't finished your Application!")
end
Using whenever to create this cron:
every :day, :at => '8am' do
  runner 'Application.notify_stale_applicants'
end
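On the worry about multiple reminders: a daily job that selects on updated_at alone will match the same stale applications every day. One possible guard, sketched here in plain Ruby, is to record when the reminder went out. The reminded_at attribute is hypothetical (it is not in the question), and the Struct merely stands in for an Application record.

```ruby
# Plain-Ruby stand-in for an Application row; reminded_at is a hypothetical
# column recording when the reminder was sent.
StaleCandidate = Struct.new(:id, :updated_at, :reminded_at)

HOUR = 3600

# Applications untouched for more than `threshold_hours` and not yet reminded.
def due_for_reminder(applications, now:, threshold_hours: 48)
  cutoff = now - threshold_hours * HOUR
  applications.select { |app| app.updated_at < cutoff && app.reminded_at.nil? }
end

now = Time.utc(2024, 1, 10, 8, 0, 0)
apps = [
  StaleCandidate.new(1, now - 72 * HOUR, nil),             # stale, never reminded
  StaleCandidate.new(2, now - 72 * HOUR, now - 24 * HOUR), # stale, already reminded
  StaleCandidate.new(3, now - 2 * HOUR, nil)               # recently active
]

due = due_for_reminder(apps, now: now)
due.each { |app| app.reminded_at = now } # in Rails: save the column after delivering

puts "first run:  #{due.map(&:id).inspect}"
puts "second run: #{due_for_reminder(apps, now: now).map(&:id).inspect}"
```

In ActiveRecord terms the same idea would be an extra condition such as reminded_at IS NULL on the where clause, plus an update after each delivery.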

how do i use delayed_job on heroku to send out emails that are batched?

I read the documentation on workers and delayed_job and couldn't follow exactly, so wanted to get head-start with some strategy and sample code.
I have a controller which I use to send emails one by one. Now each day I want to check which emails need to be sent for the day, and then send them through heroku as a delayed_job.
How do I begin to approach this? thanks.
This is what I'm coming up with based on the answers:
Using the 'whenever' gem, I created the following schedule.rb
every 1.day, :at => '4:30 am' do
  heroku = Heroku::Client.new(ENV['HEROKU_USER'], ENV['HEROKU_PASS'])
  heroku.set_workers(ENV['HEROKU_APP'], 1)
  Contact.all.each do |contact|
    contact_email = contact.email_today
    unless contact.email_today == "none"
      puts contact.first_name
      puts contact_email.days
      puts contact.date_entered
      Delayed::Job.enqueue OutboundMailer.deliver_campaign_email(contact,contact_email)
    end
  end
  heroku.set_workers(ENV['HEROKU_APP'], 0)
end
To determine whether I should send an email today or not, I created the method for contact.rb:
def email_today
  next_event_info = self.next_event_info # invokes method for contact
  next_event = next_event_info[:event]
  delay = next_event_info[:delay]
  if next_event.class.name == "Email" && from_today(self, next_event.days) + delay < 0 #helper from_today
    return next_event
  else
    return "none"
  end
end
Does this look right? I am developing on windows and deploying to heroku so don't know how to test it...thanks!
If you're sending emails out once a day, you probably want to start by adding the cron addon to your application, which will fire a rake task once per day.
Obviously, you'll also need to add the delayed_job plugin (http://docs.heroku.com/delayed-job). In addition, your app will need to be running at least one worker.
Then it's just a matter of doing your mail work from within your cron rake task. For example, if you had a mailer called 'UserMailer', your cron could look something like this:
# lib/cron.rb
task :cron => :environment do
  User.all.each do |user|
    Delayed::Job.enqueue UserMailer.deliver_notification(user)
  end
end
If you're only using background tasks to send these emails, you could probably also add some logic to your cron task and your mailer methods to add and remove workers as required, which will save you paying for workers while they're not in use.
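A note on the enqueue call: as far as I know, Delayed::Job.enqueue expects an object that responds to perform, and a deliver_* call made inline sends the mail straight away rather than deferring it. The plain-Ruby sketch below shows the shape of such a job object. FakeQueue is an in-memory stand-in written for this example, not delayed_job's real implementation, and NotificationJob is a hypothetical name.

```ruby
# Records which user ids were "mailed", so the demo has something to show.
DELIVERED = []

# A job is any object that responds to #perform; the work happens there,
# not at enqueue time.
NotificationJob = Struct.new(:user_id) do
  def perform
    # In the real app this is where the user would be looked up and mailed.
    DELIVERED << user_id
  end
end

# In-memory stand-in for the job queue, for illustration only.
class FakeQueue
  def initialize
    @jobs = []
  end

  def enqueue(job)
    raise ArgumentError, "job must respond to perform" unless job.respond_to?(:perform)
    @jobs << job
  end

  # What a worker process does: take jobs off the queue and run them.
  def work_off
    @jobs.shift.perform until @jobs.empty?
  end
end

queue = FakeQueue.new
[1, 2, 3].each { |id| queue.enqueue(NotificationJob.new(id)) }

before = DELIVERED.dup
queue.work_off

puts "before worker: #{before.inspect}"
puts "after worker:  #{DELIVERED.inspect}"
```

With the real gem the equivalent would be Delayed::Job.enqueue NotificationJob.new(user.id) inside the cron task, with the worker dyno doing the delivery.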
