I am trying to save a new record with delayed_job. The code in question is below:
# method being called:
ibo.add_to_database(params[:url])
# method definition
def add_to_database(url)
  feed = Feeds.new do |f|
    f.url = url
    f.title = self.feed_title if self.feed_title
    f.link = self.site_link if self.site_link
    f.image = self.feed_image if self.feed_image
  end
  feed.save!
end
handle_asynchronously :add_to_database
I get absolutely no errors, and the job is removed from the database as it should be. Except there is no change to the Feeds model. Anyone have any ideas what gives?
delayed_job runs as a separate daemon process, so the first thing to do is check whether it is running:
ps ax | grep delayed_job
The next thing I would check is the delayed_job log itself; it will probably contain a description of the error:
less log/delayed_job.log
Other than that, your code snippet looks fine.
I have a sidekiq worker that waits for a change to happen to a record made by a remote client. Something like the following:
# myworker: async process that waits for the client to confirm status
def perform(myRecordID)
  sendClient(myRecordID)
  didClientAcknowledge = false
  while !didClientAcknowledge
    didClientAcknowledge = myRecords.find(myRecordID).status == :ACK_OK
    if didClientAcknowledge
      break
    end
    # wait for client to perform an update on the record to confirm status
    sleep 5.seconds
  end
  Rails.logger.info("client got the message")
end
My problem is that although I can see that the client has in fact performed the acknowledgement and updated the record with the correct status (ACK_OK), my Sidekiq thread continues to see the old status for myRecord.
I'm sure my logic is flawed here, but it seems like the Sidekiq process does not "see" changes to the DB. If I use my Rails console, I can see that the client has in fact updated the DB as expected.
Thanks!
Edit 1
OK, so here's a thought: instead of the loop, I'll schedule another call to the worker in 5 seconds. Here's the updated code:
def perform(myRecordID, retry_count)
  retry_count -= 1
  if retry_count < 1
    return
  end
  sendClient(myRecordID)
  didClientAcknowledge = myRecords.find(myRecordID).status == :ACK_OK
  if didClientAcknowledge
    Rails.logger.info("client got the message")
    return
  end
  # wait for client to perform an update on the record to confirm status;
  # re-enqueue with the same arguments so the retry count carries over
  myWorker.perform_in(5.seconds, myRecordID, retry_count)
end
This seems to work, but I will test a bit more. One challenge is the retry count, which means I need to maintain some sort of variable between calls to the worker.
Edit 2: Possibly this can be done by passing the start time to the first call and then checking whether a timeout has been exceeded before invoking the next instance (assuming time does not stand still inside the async call as well).
Edit 3: Adding the retry_count argument allows us to control how many times this worker will be spawned.
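The timeout idea from Edit 2 can be sketched in plain Ruby as a small decision function. This is only an illustration: `next_poll_action` and its parameters are hypothetical names, and the times are plain integer seconds so they could be passed as ordinary job arguments.

```ruby
# Decide what a polling job should do next, given when polling started.
# Returns :done, :give_up, or :retry. All names here are illustrative.
def next_poll_action(acknowledged:, started_at:, now:, timeout:)
  return :done if acknowledged
  return :give_up if now - started_at >= timeout
  :retry
end

started = 1_000 # epoch seconds recorded by the first call (made-up value)
puts next_poll_action(acknowledged: false, started_at: started, now: started + 10, timeout: 60)
puts next_poll_action(acknowledged: false, started_at: started, now: started + 61, timeout: 60)
puts next_poll_action(acknowledged: true,  started_at: started, now: started + 61, timeout: 60)
```

In a worker, a :retry result would correspond to scheduling the next run with the same started_at value, so no state has to be kept between runs.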
So I'm building a website that calls a third-party API that can take from 20 seconds to 30 minutes to return a result. But I can't know this duration in advance so need to poll it frequently to check if the work is done (returns "COMPLETE" and the result) or not (returns "IN_PROGRESS"). Also, this API might be called many times from many users at the same time.
So I created a Sidekiq worker that checks the API every 5 seconds until it receives "COMPLETE", and only then does it end. But I've read that Sidekiq should only be doing short-lived jobs, and I'm struggling to get my head around how I should do this. I've also been trying to search for an answer, but I suspect I don't know the right words for what I'm looking for.
I'm sure there is a way I can tell my workers to call the API once, and if the result is "IN_PROGRESS" end but make sure another worker will do another API call to check, and so on and so on until the result is "COMPLETE".
I guess this would also help distribute the load better when many users call said API, because fewer workers can handle more of these short-lived jobs.
This is my worker, which I hope clarifies what I'm doing right now:
class ThingProgressWorker
  include Sidekiq::Worker
  def perform(id)
    @thing = Thing.find(id)
    @thing_api_call = ThingAPICall.new # This uses the Ruby library of the API
    completed = false
    while completed == false
      result = @thing_api_call.get_result({ thing_job_name: @thing.job_name })
      if !result.include? "COMPLETED"
        completed = false
        sleep 5
      else
        completed = true
        @thing.status = "completed"
        @thing.save
        break
      end
    end
  end
end
So if the API takes ten minutes to go from "IN_PROGRESS" to "COMPLETED", this worker will be busy for that long, which I reckon is not advised at all.
I've been thinking about this for some hours now and can't work out how to make each API call its own job without keeping a worker busy until the API is done.
The only solution I've come up with so far is a master worker that calls another worker for each API call, but then I'd still have a worker busy for as long as the API takes to send the result.
I'd appreciate any help or directions!
Thanks in advance
Try calling the worker with a delay. For example:
class ThingProgressWorker
  include Sidekiq::Worker
  def perform(id)
    @thing = Thing.find(id)
    @thing_api_call = ThingAPICall.new # This uses the Ruby library of the API
    result = @thing_api_call.get_result({ thing_job_name: @thing.job_name })
    if !result.include? "COMPLETED"
      # not done yet: check again in a minute instead of blocking this worker
      ThingProgressWorker.perform_in(1.minute, id)
    else
      @thing.status = "completed"
      @thing.save
    end
  end
end
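This adds the job to the queue, but instead of running immediately it runs after the delay you specify. Each run is short: it makes one API call and then either finishes or schedules the next check.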
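To see the poll-once-then-reschedule flow without Sidekiq or Redis, here is a minimal plain-Ruby sketch. `FakeScheduler`, `check_once`, and the attempts cap are assumptions made for illustration and are not part of Sidekiq; in a real worker the `schedule` call would be `ThingProgressWorker.perform_in`.

```ruby
# Hypothetical in-memory stand-in for a job queue, so the pattern can run
# without Sidekiq. It only records what would have been scheduled.
class FakeScheduler
  attr_reader :scheduled

  def initialize
    @scheduled = []
  end

  def schedule(delay, *args)
    @scheduled << [delay, args]
  end
end

# One short-lived check: poll once, then either finish or re-enqueue.
# `fetch_status` is any callable that returns the API status string.
def check_once(id, attempts_left, scheduler, fetch_status)
  return :completed if fetch_status.call(id).include?("COMPLETED")
  return :gave_up if attempts_left <= 1

  scheduler.schedule(60, id, attempts_left - 1)
  :rescheduled
end

# Simulate an API that reports COMPLETED on the third poll.
responses = ["IN_PROGRESS", "IN_PROGRESS", "COMPLETED"]
fetch = ->(_id) { responses.shift }
scheduler = FakeScheduler.new

outcome = check_once(42, 5, scheduler, fetch)
while outcome == :rescheduled
  _delay, (id, attempts_left) = scheduler.scheduled.shift
  outcome = check_once(id, attempts_left, scheduler, fetch)
end
puts outcome
```

The attempts cap is one possible safeguard against a job that reschedules itself forever if the API never reports completion; a deadline timestamp would work just as well.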
In a normal manual test with a human and a browser, everything works as expected. However, when I use RSpec, I can see the following in the log:
D, [2014-08-16T13:48:09.510013 #19418] DEBUG -- : SQL (0.6ms) UPDATE "system_flights_cacheds" SET "client_stuff" = '{"captcha":"656556"}' WHERE "system_flights_cacheds"."guid" = '5647046e-4194-498e-a0d7-512614b147d8'
Yet the database record is not actually updated. Previously I used .save, also with no success; in that case the log shows a SAVEPOINT being created.
My code in trouble is basically an API endpoint:
cache = System::Flights::Cached.search_cache options
# update database, when the captcha is present. this way, the worker
# when updating the database can see the changes and act accordingly!
if cache && params[:captcha]
  # remember, anyone can (basically) see the captcha. thus,
  # this is a bit paranoid, only allow captcha update
  # if the user is same! in the json, if not forgotten,
  # captcha is only displayed when the user_id is equal
  server_stuff = cache.server_stuff.with_indifferent_access
  if server_stuff[:user_id] == current_user.id
    cache.time_renewed = 10
    cache.client_stuff_will_change!
    cache.client_stuff ||= {}
    cache.client_stuff[:captcha] = params[:captcha]
    # cache.save!
    cache.update_columns(client_stuff: cache.client_stuff)
  end
else
  # only spawn worker if there is no captcha parameter passed
  spawn_search_worker({user_id: current_user.id, options: options})
end
The client can hit this endpoint at any time and it will spawn a worker. When a record is already in the database but is_processed is false, the worker quits. Thus, calling the endpoint multiple times is fine, and it also serves as a way to check whether the work is done.
The worker waits for the client to enter a captcha. So we have a class like WaitableLogin, which basically does this:
max_repeat = 3 # 14
# annul flag, if set to true, the data will not get persisted.
annul = false
while max_repeat > 0
  # interval of 5 secs that worker can check the database
  sleep 5
  max_repeat -= 1
  # break if captcha already entered by client
  # seek from the database if the client has posted
  # the captcha text
  cache = System::Flights::Cached.search_cache options
  client_stuff = nil
  client_stuff = cache.client_stuff.with_indifferent_access if cache && cache.client_stuff
  if client_stuff && client_stuff[:captcha]
    captcha_text = client_stuff[:captcha]
    airline.fill_captcha(captcha_text).finalize_login
    puts "SOMEHOW I AM HERE: #{captcha_text}"
    # remove all server's stuff
    cache.server_stuff_will_change!
    cache.server_stuff.clear
    cache.save!
    annul = airline.in_login_page?
  end
end
So, WaitableLogin checks whether client_stuff has been updated. If it has, we know the client has submitted the captcha (the endpoint checks whether captcha is a param and updates the database if it is present).
Control is then transferred back to the worker. You can see that a lot of code uses cache in many places across files; cache is just a variable name and has nothing to do with caching in the Rails sense.
When I run this normally in the browser, I don't see any problem. In fact, there is no SAVEPOINT even if I use .save. I thought the SAVEPOINT was causing a bug somewhere, so I decided to try .update_columns instead. But, again, with no success.
This is what the test looks like
before(:each) do
  System::Flights::Cached.delete_all
end
describe "requests" do
  it "should process 2a1c1i" do
    cached = nil
    post("/api/v1/x.json", {
      access_token: CommonFlightData::ACCESS_TOKEN,
      business_token: CommonFlightData::BUSINESS_TOKEN,
      captcha: ""
    }.merge!(CommonFlightData.oneway_1a(from: "8-9-2014")))
    puts "enter the captcha: "
    captcha = STDIN.gets.chomp
    puts "Entered: #{captcha}"
    post("/api/v1/x.json", {
      access_token: CommonFlightData::ACCESS_TOKEN,
      business_token: CommonFlightData::BUSINESS_TOKEN,
      captcha: captcha
    }.merge!(CommonFlightData.oneway_1a(from: "8-9-2014")))
    sleep 10
So what am I missing? I'm tired. No error is raised. When I check .inspect after update_columns, everything appears to be updated. But when you look at the database, nothing has changed.
EDIT: I added a lock_version column so that I get optimistic locking (enabled by default, I think). As expected, it turned out to be set to 2.
EDIT 2: If I run an update from a Rails console while the code is waiting for the captcha, it DOES update the data. So why won't the RSpec spec that hits the API endpoint to submit a captcha update the row? Outside of specs, all the real-life code executes fine.
Nokogiri works fine for me in the console, but if I put it anywhere... Model, View, or Controller, it times out.
I'd like to use it 1 of 2 ways...
Controller
def show
  @design = Design.find(params[:id])
  doc = Nokogiri::HTML(open(design_url(@design)))
  images = doc.css('.well img') ? doc.css('.well img').map{ |i| i['src'] } : []
end
or...
Model
def first_image
  doc = Nokogiri::HTML(open("http://localhost:3000/blog/#{self.id}"))
  image = doc.css('.well img')[0] ? doc.css('.well img')[0]['src'] : nil
  self.update_attribute(:photo_url, image)
end
Both result in a timeout, though they work perfectly in the console.
When you run your Nokogiri code from the console, you're referencing your development server at localhost:3000. Thus, there are two processes running: one making the call (your console) and one answering the call (your server).
When you run it from within your app, the app is requesting a page from itself. With a single server instance this deadlocks: the only process that could answer the request is the one blocked waiting for the response, so the call eventually times out. You would need to run multiple instances with something like Unicorn (or simply another localhost instance on a different port), and at least one of those instances would have to be free to answer the Nokogiri request.
If you plan to run this in production, just know that this setup requires an available instance to answer the Nokogiri request, so you're essentially tying up 2 instances with each call. If you have 4 instances and all 4 happen to make the call at the same time, your whole application is stuck. You'll probably see pretty severe degradation with only 1 or 2 calls at a time as well.
I'm not sure what the default timeout value is, but you can specify one explicitly, like this:
require 'net/http'
http = Net::HTTP.new('localhost', 3000) # port 3000, matching the URL in the question
http.open_timeout = 100
http.read_timeout = 100
Nokogiri.parse(http.get("/blog/#{self.id}").body)
Once you control the timeout value, it becomes easier to track down what the actual problem is.
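As a small self-contained sketch of the same idea using only the standard library, the timeouts can be set through a helper. `build_http` is a hypothetical name, and the values are arbitrary; in recent Ruby versions both timeouts default to 60 seconds, as far as I know.

```ruby
require 'net/http'

# Build a Net::HTTP client with explicit timeouts (in seconds).
# No connection is opened until a request is actually made.
def build_http(host, port, open_timeout:, read_timeout:)
  http = Net::HTTP.new(host, port)
  http.open_timeout = open_timeout
  http.read_timeout = read_timeout
  http
end

http = build_http('localhost', 3000, open_timeout: 5, read_timeout: 10)
puts http.port
puts http.read_timeout
```

A request made with this client raises Net::OpenTimeout or Net::ReadTimeout when the corresponding limit is exceeded, which can be rescued to fail fast instead of hanging.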
So, with tyler's advice I dug into what I was doing a bit more. Because of the disconnect that ckeditor has with the images, due to carrierwave and S3, I can't get any info directly from the uploader (at least it seems that way to me).
Instead, I'm sticking with Nokogiri, and it's working wonderfully. I realized what I was actually doing with the open() command, and it was completely unnecessary. Nokogiri parses HTML. I can give it HTML in the form of @design.content! Duh, on my part.
So, this is how I'm scraping my own site to get the images associated with a blog entry:
designs_controller.rb
def create
  params[:design][:photo_url] = Nokogiri::HTML(params[:design][:content]).css('img').map{ |i| i['src']}[0]
  @design = Design.new(params[:design])
  if @design.save
    flash[:success] = "Design created"
    redirect_to designs_url
  else
    render 'designs/new'
  end
end
def show
  @design = Design.find(params[:id])
  @categories = @design.categories
  @tags = @categories.map {|c| c.name}
  @related = Design.joins(:categories).where('categories.name' => @tags).reject {|d| d.id == @design.id}.uniq
  set_meta_tags og: {
    title: @design.name,
    type: 'article',
    url: design_url(@design),
    image: Nokogiri::HTML(@design.content).css('img').map{ |i| i['src']},
    article: {
      published_time: @design.published_at.to_datetime,
      modified_time: @design.updated_at.to_datetime,
      author: 'Alphabetic Design',
      section: 'Designs',
      tag: @tags
    }
  }
end
The Update action has the same code for Nokogiri as the Create action.
Seems kind of obvious now that I'm looking at it, lol. I dwelled on this for longer than I'd like to admit...
I'm trying to call two lengthy commands in a single when branch of a case statement, but for some reason, perhaps because of the syntax, both commands are performed twice when that branch is hit:
@email = Email.find(params[:id])
delivery = case @email.mail_type
  # when "magic_email" these two delayed_jobs perform 2x instead of 1x. Why is that?
  when "magic_email" then Delayed::Job.enqueue MagicEmail.new(@email.subject, @email.body)
    Delayed::Job.enqueue ReferredEmail.new(@email.subject, @email.body)
  when "org_magic_email" then Delayed::Job.enqueue OrgMagicEmail.new(@email.subject, @email.body)
  when "all_orgs" then Delayed::Job.enqueue OrgBlast.new(@email.subject, @email.body)
  when "all_card_holders" then Delayed::Job.enqueue MassEmail.new(@email.subject, @email.body)
end
return delivery
How can I make it so that when I hit when "magic_email", each of those two delayed jobs runs only once?
I tried this with the following example:
q = []
a = case 1
  when 1 then q.push 'ashish'
    q.push 'kumar'
  when 2 then q.push 'test'
  when 4 then q.push 'another test'
end
puts a.inspect # => ["ashish", "kumar"]
This works fine, which means your case-when syntax is OK. You probably have some other problem.
You are calling return delivery, and the delivery variable holds whatever the last expression in the matched branch returned. The caller may be using that value in a way that triggers the delayed job again, so try not to return anything if possible. I believe you just want to enqueue the delayed jobs, not return a value from the method.
Perhaps you should just use the case statement on its own and not store its result in a variable. The delivery variable serves no purpose here.
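A minimal sketch of that suggestion, using a plain array as a stand-in for Delayed::Job.enqueue so it runs without delayed_job. `enqueue_for` and the symbols pushed onto the queue are hypothetical and exist only for this illustration.

```ruby
# Stand-in for the enqueue calls: each branch records job names in `queue`.
# The case result is not assigned to anything, and the method returns nil.
def enqueue_for(mail_type, queue)
  case mail_type
  when "magic_email"
    queue << :magic_email
    queue << :referred_email
  when "org_magic_email"
    queue << :org_magic_email
  end
  nil
end

queue = []
result = enqueue_for("magic_email", queue)
p queue   # each job appears exactly once
p result
```

Putting each statement on its own line under the when (rather than after then) also makes it obvious that both belong to the same branch.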