Passing a Block to a delayed_job - ruby-on-rails

I have a function that is marked to be handled asynchronously by delayed_job:
class CapJobs
  def execute(params, id)
    begin
      unless Rails.env == "test"
        Capistrano::CLI.parse(params).execute!
      end
    rescue
      site = Site.find(id)
      site.records.create!(:date => DateTime.now, :action => "Task Failure: #{params[0]}", :type => :failure)
      site.save
    ensure
      yield id
    end
  end
  handle_asynchronously :execute
end
When I run this function I pass in a block:
capjobs = CapJobs.new
capjobs.execute(parameters, @site.id) do |id|
  asite = Site.find(id)
  asite.records.create!(:date => DateTime.now, :action => "Created", :type => :init)
  asite.status = "On Demo"
  asite.dev = true
  asite.save
end
This works fine when run without delayed_job, but when run with it I get the following error:
2012-08-13T09:24:36-0300: [Worker(delayed_job host:eagle pid:12089)] SitesHelper::CapJobs#execute_without_delay failed with LocalJumpError: no block given (yield) - 0 failed attempts
2012-08-13T09:24:36-0300: [Worker(delayed_job host:eagle pid:12089)] PERMANENTLY removing SitesHelper::CapJobs#execute_without_delay because of 1 consecutive failures.
2012-08-13T09:24:36-0300: [Worker(delayed_job host:eagle pid:12089)] 1 jobs processed at 0.0572 j/s, 1 failed ...
It seems not to pick up the block that was passed in. Is this not the correct way of doing this or should I find a different method?

delayed_job works by saving your jobs into a data store (most often your primary database) and then loading them out of that data store in a background process, where they are handled/executed.
To save a job into the database, delayed_job needs to record which method to call on which object with what arguments. It does this by serializing everything into a string (delayed_job uses YAML for that). Unfortunately, blocks cannot be serialized. So the background worker does not know about the block argument and calls the method without it, which results in the LocalJumpError when the method tries to yield to the block.
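To make the limitation concrete, here is a rough, purely illustrative picture of what ends up in the jobs table. The field names are simplified and are not delayed_job's exact internals; the point is that there is simply no slot a block could be stored in:
require 'yaml'

# Roughly what gets persisted for the queued call: the receiver, the method
# name, and the plain arguments. A block is live code attached to the call
# site, so it never makes it into this serialized payload.
payload = {
  'object' => 'CapJobs instance',        # simplified stand-in for the serialized receiver
  'method' => :execute_without_delay,
  'args'   => [['deploy:setup'], 42]     # example arguments only
}.to_yaml

puts payload  # plain data round-trips through YAML; a block has no such representation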

I found a way of doing this. It is kind of hacky, but it works well. I found this article that talks about creating a SerializableProc class. If I pass this to the function, everything works great.

Most people would treat this as an abstraction problem.
The proc's code is probably not changing from run to run (except for its variables), so you can turn the block's code into a class or instance method, pass the name of that method, and then call it in your execute method, like:
@some_data = CapJobs.send( target_method )
or, perhaps better:
@some_data = DomainSpecificModel.send( target_method )
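A sketch of that refactor (the callback method name and its body are illustrative assumptions, not taken from the original post): the block's body becomes an ordinary method on Site, and only a serializable symbol travels through delayed_job.
class CapJobs
  def execute(params, id, callback = nil)
    unless Rails.env == "test"
      Capistrano::CLI.parse(params).execute!
    end
  rescue
    site = Site.find(id)
    site.records.create!(:date => DateTime.now, :action => "Task Failure: #{params[0]}", :type => :failure)
  ensure
    Site.find(id).send(callback) if callback  # a Symbol serializes cleanly, unlike a block
  end
  handle_asynchronously :execute
end

class Site < ActiveRecord::Base
  # Formerly the block passed to execute
  def mark_on_demo
    records.create!(:date => DateTime.now, :action => "Created", :type => :init)
    self.status = "On Demo"
    self.dev = true
    save
  end
end

# Enqueueing then looks like:
CapJobs.new.execute(parameters, @site.id, :mark_on_demo)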

Related

Speed up rake task by using typhoeus

So I stumbled across this: https://github.com/typhoeus/typhoeus
I'm wondering if this is what I need to speed up my rake task.
Event.all.each do |row|
  begin
    url = urlhere + row.first + row.second
    doc = Nokogiri::HTML(open(url))
    doc.css('.table__row--event').each do |tablerow|
      table = tablerow.css('.table__cell__body--location').css('h4').text
      next unless table == row.eventvenuename
      tablerow.css('.table__cell__body--availability').each do |button|
        buttonurl = button.css('a')[0]['href']
        if buttonurl.include? '/checkout/external'
        else
          row.update(row: buttonurl)
        end
      end
    end
  rescue Faraday::ConnectionFailed
    puts "connection failed"
    next
  end
end
I'm wondering if this would speed it up, or whether it wouldn't because I'm doing a .each?
If it would, could you provide an example?
Sam
If you set up Typhoeus::Hydra to run parallel requests, you might be able to speed up your code, assuming that the Kernel#open calls are what's slowing you down. Before you optimize, you might want to run benchmarks to validate this assumption.
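As a quick illustration (not part of the original answer), a rough benchmark of a single fetch can show where the time goes; some_event_url below is a placeholder for one of your built URLs, and the open call mirrors the open-uri call in your loop:
require 'benchmark'
require 'open-uri'
require 'nokogiri'

some_event_url = "https://example.com/events"  # placeholder

# Time one fetch-and-parse in isolation; if this dominates the per-row time,
# parallelizing the HTTP requests is the right lever to pull.
puts Benchmark.measure { Nokogiri::HTML(open(some_event_url)) }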
If it is true, and parallel requests would speed it up, you would need to restructure your code to load events in batches, build a queue of parallel requests for each batch, and then handle them after they execute. Here's some sketch code.
class YourBatchProcessingClass
  def initialize(batch_size: 200)
    @batch_size = batch_size
    @hydra = Typhoeus::Hydra.new(max_concurrency: @batch_size)
  end

  def perform
    # Get an array of records
    Event.find_in_batches(batch_size: @batch_size) do |batch|
      # Store all the requests so we can access their responses later.
      requests = batch.map do |record|
        request = Typhoeus::Request.new(your_url_build_logic(record))
        @hydra.queue request
        request
      end
      @hydra.run # Run requests in parallel

      # Process responses from each request
      requests.each do |request|
        your_response_processing(request.response.body)
      end
    end
  rescue WhateverError => e
    puts e.message
  end

  private

  def your_url_build_logic(event)
    # TODO
  end

  def your_response_processing(response_body)
    # TODO
  end
end

# Run the service by calling this in your Rake task definition
YourBatchProcessingClass.new.perform
Ruby can be used for pure scripting, but it functions best as an object-oriented language. Decomposing your processing work into clear methods can help clarify your code and help you catch things like Tom Lord mentioned in the comments on your question. Also, instead of wrapping your whole script in a begin..rescue block, you can use method-level rescues as in #perform above, or just wrap @hydra.run.
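For instance, the narrower rescue mentioned above might look like this sketch (the error class and logging are placeholders to adjust to the failures you actually expect):
begin
  @hydra.run
rescue StandardError => e
  # Only the parallel network phase is guarded; unrelated errors still surface.
  Rails.logger.warn("Typhoeus batch failed: #{e.message}")
end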
As a note, .all.each is a memory hog, and is thus considered a bad solution to iterating over records: .all loads all of the records into memory before iterating over them with .each. To save memory, it's better to use .find_each or .find_in_batches, depending on your use case. See: http://api.rubyonrails.org/classes/ActiveRecord/Batches.html
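A small illustration of the difference (process here is a placeholder for your per-record work):
# Loads every Event into memory at once before iterating:
Event.all.each { |event| process(event) }

# Fetches records in batches (1000 by default; 500 here) and yields them one
# at a time, keeping memory usage roughly constant:
Event.find_each(batch_size: 500) { |event| process(event) }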

Ruby on Rails - weird behaviour logic by delayed job

I am using delayed_job (by tobi), and when I run the job the fbLikes count is all wrong; it seems to increment each time I add one more company. I'm not sure where the logic is wrong. The fbLikes method worked when I tested it before I changed it to use delayed_job.
I'm not sure where the "1" comes from...
[output]
coca-cola
http://www.cocacola.com
Likes: 1 <--- Not sure why fbLikes is 1; it increments, so the second company's fbLikes is 2, and so on...
.
[Worker(host:aname.local pid:1400)] Starting job worker
[Worker(host:aname.local pid:1400)] CountJob completed after 0.7893
[Worker(host:aname.local pid:1400)] 1 jobs processed at 1.1885 j/s, 0 failed ...
I am running delayed_job from the model, trying to run a job that counts the Facebook likes. Here is my code.
[lib/count_job.rb]
require 'net/http'

class CountJob < Struct.new(:fbid)
  def perform
    uri = URI("http://graph.facebook.com/#{fbid}")
    data = Net::HTTP.get(uri)
    return JSON.parse(data)['likes']
  end
end
[Company model]
before_save :fb_likes

def fb_likes
  self.fbLikes = Delayed::Job.enqueue(CountJob.new(self.fbId))
end
The issue is coming from:
before_save :fb_likes

def fb_likes
  self.fbLikes = Delayed::Job.enqueue(CountJob.new(self.fbId))
end
The enqueue method will not return the result of running the CountJob. I believe it returns whether the job was successfully enqueued or not, and when you save that into the fbLikes attribute it evaluates to 1 when the job is enqueued successfully.
You should be setting fbLikes inside the job that is being run by delayed_job, not as a result of the enqueue call.
before_save :enqueue_fb_likes

def enqueue_fb_likes
  Delayed::Job.enqueue(CountJob.new(self.fbId))
end
Your perform method in the CountJob class should probably take the model id, so it can look up the record and access both the fbId and fbLikes attributes, instead of just taking the fbId.
class CountJob < Struct.new(:id)
  def perform
    company = Company.find(id)
    uri = URI("http://graph.facebook.com/#{company.fbId}")
    data = Net::HTTP.get(uri)
    company.fbLikes = JSON.parse(data)['likes']
    company.save
  end
end
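For completeness, a sketch of the matching enqueue call under the revised job, assuming the record already has an id when the callback fires (saving inside the job will re-run model callbacks, so in practice you may want a guard):
def enqueue_fb_likes
  Delayed::Job.enqueue(CountJob.new(self.id))
end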

Rails find_or_create_by where block runs in the find case?

The ActiveRecord find_or_create_by dynamic finder method allows me to specify a block. The documentation isn't clear on this, but it seems that the block only runs in the create case, and not in the find case. In other words, if the record is found, the block doesn't run. I tested it with this console code:
User.find_or_create_by_name("An Existing Name") do |u|
puts "I'M IN THE BLOCK"
end
(nothing was printed). Is there any way to have the block run in both cases?
As far as I understand, the block will be executed only if nothing is found. A use case looks like this:
User.find_or_create_by_name("Pedro") do |u|
u.money = 0
u.country = "Mexico"
puts "User is created"
end
If no user is found, it will create a new User with the name "Pedro" and everything else set inside the block, and will return the newly created user. If the user exists, it will just return that user without executing the block.
You can also use this "block style" with other methods, like:
User.create do |u|
  u.name = "Pedro"
  u.money = 1000
end
It will do the same as User.create( :name => "Pedro", :money => 1000 ), but looks a little nicer,
and
User.find(19) do |u|
..
end
etc
It doesn't seem to me that this question has actually been answered, so I will. This is the simplest way, I think, that you can achieve that:
User.find_or_create_by_name("An Existing Name or Non Existing Name").tap do |u|
puts "I'M IN THE BLOCK REGARDLESS OF THE NAME'S EXISTENCE"
end
Cheers!

Memcached always miss (rails)

I have a class with this method:
def telecom_info
  Rails.cache.fetch("telecom_info_for_#{ref_num}", :expires_in => 3.hours) do
    info = Hash.new(0)
    Telecom::SERVICES.each do |source|
      results = TelecomUsage.find(:all,
        :joins => [:telecom_invoice => { :person => :org_person }],
        :conditions => "dotted_ids like '%#{ref_num}%' and telecom_usages.ruby_type = '#{source}'",
        :select => "avg(charge) #{source.upcase}_AVG_CHARGE,
                    max(charge) #{source.upcase}_MAX_CHARGE,
                    min(charge) #{source.upcase}_MIN_CHARGE,
                    sum(charge) #{source.upcase}_CHARGE,
                    avg(volume) #{source.upcase}_AVG_VOLUME,
                    max(volume) #{source.upcase}_MAX_VOLUME,
                    min(volume) #{source.upcase}_MIN_VOLUME,
                    sum(volume) #{source.upcase}_VOLUME
                   ")
      results = results.first
      ['charge', 'volume'].each do |source_type|
        info["#{source}_#{source_type}".to_sym]     = results.send("#{source}_#{source_type}".downcase).to_i
        info["#{source}_min_#{source_type}".to_sym] = results.send("#{source}_min_#{source_type}".downcase).to_i
        info["#{source}_max_#{source_type}".to_sym] = results.send("#{source}_max_#{source_type}".downcase).to_i
        info["#{source}_avg_#{source_type}".to_sym] = results.send("#{source}_avg_#{source_type}".downcase).to_i
      end
    end
    return info
  end
end
As you can see, this is an expensive call, and it is called a lot for each request, so I want to cache it. The problem is that memcached does not seem to work; in the log file, I am getting:
Cache read: telecom_info_for_60000000
Cache miss: telecom_info_for_60000000 ({})
The weird thing is that I know memcached is working since it does cache the results of some other functions I have in another model.
Any suggestions? I am running Rails 2.3.5 on REE 1.8.7
Replace return info with info.
Rails.cache.fetch("telecom_info_for_#{ref_num}", :expires_in=> 3.hours) do
# ...
info
end
The return keyword always returns from the current method, which means that info is never returned to your call to Rails.cache.fetch, nor is the rest of that method ever executed. When the last statement simply is info, this is the value that will be given to Rails.cache.fetch, and you will allow the method to finish its duty by storing this value in the cache.
Compare the following:
def my_method
1.upto(3) do |i|
# Calling return immediately causes Ruby to exit the current method.
return i
end
end
my_method
#=> 1
As a rule of thumb: always omit return unless you really mean to exit the current block and return from the current method.

How can I output a calculated value using .detect in Ruby on Rails? (or alternative to .detect)

I currently have the following code:
events.detect do |event|
  # detect does the block until the statement goes false
  self.event_status(event) == "no status"
end
What this does is output the instance of event (where events is a series of different models that are collectively called Events) when the event_status method outputs "no status".
I would like the output to also include the value for delay where:
delay = delay + contact.event_delay(event)
The event_delay method hasn't been written yet, but it would be similar to event_status (maybe redundant, but I'll deal with that later) in that it looks at the delay between when an event was done and when it was supposed to be done.
Here is how event_status looks currently for reference:
def event_status(target)
  # check Ticket #78 for source
  target_class = target.class.name
  target_id = target_class.foreign_key.to_sym
  assoc_name = "contact_#{target_class.tableize}"
  r = send(assoc_name).send("find_by_#{target_id}", target.id)
  return "no status" unless r
  "sent (#{r.date_sent.to_s(:long)})"
end
My concept of output should be [event,delay] so that, for example, I can access it as Array[:event] or Array[:delay] to get at the value.
I was thinking maybe I should use yield on a method, but haven't quite put the pieces together (should the block passed to the method be the delay += part, for example? I think it should be).
I am not wed to the .detect method; it's what I started with and it appears to work, but it isn't allowing me to run the tally alongside it.
It's not entirely clear what you're asking for, but it sounds like you're trying to add up a delay until you reach a certain condition, and return the record that triggered the condition at the same time.
You might approach that using Enumerable#detect like you have, but by keeping a tally on the side:
def next_event_info
  delay = 0

  next_event = events.detect do |event|
    case (self.event_status(event))
    when "no status"
      true
    else
      delay += contact.event_delay(event)
      false
    end
  end

  [ next_event, delay ]
end
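Usage would then look something like this sketch (called from wherever next_event_info is defined):
# The Array return value destructures directly into two locals:
next_event, delay = next_event_info
puts "#{next_event.inspect} reached after a delay of #{delay}"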
Update: if you want to add up the delays for all events, but also find the first event with the status of "no status":
def next_event_info
  next_event = nil
  delay = 0.0

  events.each do |event|
    case (self.event_status(event))
    when "no status"
      # Only assign to next_event if it has not been previously
      # assigned in this method call.
      next_event ||= event
    end

    # Tally up the delays for all events, converting to floating
    # point to ensure they're not native DB number types.
    delay += contact.event_delay(event).to_f
  end

  {
    :event => next_event,
    :delay => delay
  }
end
This will give you a Hash in return that you can interrogate as info[:event] or info[:delay]. Keep in mind not to abuse this method; for example:
# Each of these makes a method call, which is somewhat expensive
next_event = next_event_info[:event]
delay_to_event = next_event_info[:delay]
This will make two calls to this method, both of which will iterate over all the records and do the calculations. If you need to use it this way, you might as well make a special purpose function for each operation, or cache the result in a variable and use that:
# Make the method call once, save the results
event_info = next_event_info
# Use these results as required
next_event = event_info[:event]
delay_to_event = event_info[:delay]
