Andrew
I am new to Ruby on Rails development. I have a table into which I need to insert car images, but the images are remote URLs. I have to insert 60,000 rows, and I get an error like "error execution terminated". Can you help me fix this issue?
Here is my code:
namespace :db do
  task :load_photo => :environment do
    require 'rubygems'
    require 'open-uri'
    require 'net/http'
    require 'paperclip'
    Website.find_in_batches(:conditions=>["image_url is not null"]) do |websites|
      websites.each do |website|
        begin
          url = URI.parse(website.image_url)
          Net::HTTP.start(url.host, url.port) do |http|
            if http.head(url.request_uri).code == "200"
              Car.update_attribute(:photo,open(url))
            end
          end
        rescue Exception => e
        end
      end
    end
  end
end
I would suggest not rescuing every Exception, as you did with:
rescue Exception => e
end
Without that empty rescue, you will have (and be able to give us) more information about the error that is raised.
Note that it is good practice to rescue only the exceptions you expect.
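As a minimal sketch of rescuing only what you expect (the helper name `attempt_download` and the chosen list of error classes are assumptions, not part of the original task), the block below records network-style failures and lets anything else, such as a typo that raises NoMethodError, surface normally:

```ruby
require 'socket'   # defines SocketError
require 'timeout'  # defines Timeout::Error

# Hypothetical helper: runs the block and rescues only the network-related
# errors listed below. Any other exception propagates to the caller.
def attempt_download(label, failures)
  yield
  true
rescue SocketError, Timeout::Error, Errno::ECONNREFUSED => e
  # Record the failure instead of silently discarding it.
  failures << "#{label}: #{e.class}: #{e.message}"
  false
end

failures = []
attempt_download('car-1', failures) { :ok }
attempt_download('car-2', failures) { raise SocketError, 'host not found' }
puts failures
```

In the real task, the body of the block would be the HTTP check and the photo update, and the collected failures could be printed at the end of the run.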
Related
I have a rake task that loops through the rows of a CSV file, and inside that loop there's a begin/rescue block to catch any raised exception. But when I run it, it keeps saying 'rake aborted!' and never enters the rescue block.
CSV.foreach(path, :headers => true) do |row|
  id = row.to_hash['id'].to_i
  if id.present?
    begin
      # call to mymethod
    rescue => ex
      puts "#{ex} error executing task"
    end
  end
end
...
def mymethod(...)
  ...
  begin
    response = RestClient.post(...)
  rescue => ex
    raise Exception.new('...')
  end
end
Expected: It should finish looping all the rows of the CSV
Actual result: It stops after reaching the 'raise' exception saying that:
rake aborted!
Exception: error message here
...
Caused by:
RestClient::InternalServerError: 500 Internal Server Error
You can use next to skip the faulty iteration of the loop:
CSV.foreach(path, :headers => true) do |row|
  id = row.to_hash['id'].to_i
  if id.present?
    begin
      method_which_doing_the_staff
    rescue SomethingException
      next
    end
  end
end
And raise the exception inside your method:
def method_which_doing_the_staff
  stuff
  ...
  raise SomethingException.new('hasd')
end
I solved this issue by just commenting out the line that raises the exception, because it seemed like the quickest fix for now.
# raise Exception.new('...')
I'm still open to other suggestions if there are any better ways to do it.
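A likely reason the rescue block in the question is never entered: a bare `rescue => ex` only catches StandardError and its subclasses, while `raise Exception.new('...')` raises the root Exception class, which is not a StandardError. Below is a minimal sketch of an alternative, assuming a hypothetical `RowImportError` class and an `import_row` method standing in for the real ones:

```ruby
# Hypothetical error class; inheriting from StandardError lets a bare
# `rescue => ex` catch it.
class RowImportError < StandardError; end

# Stand-in for the real method: fails on odd ids to simulate a bad response.
def import_row(id)
  raise RowImportError, "row #{id} failed" if id.odd?
  id
end

# Processes every id, logging failures and moving on to the next one.
def import_all(ids)
  imported = []
  ids.each do |id|
    begin
      imported << import_row(id)
    rescue => ex
      warn "#{ex} error executing task"
      next
    end
  end
  imported
end

# Reports whether a bare rescue catches the given exception class.
def bare_rescue_catches?(error_class)
  begin
    raise error_class, 'boom'
  rescue
    true
  end
rescue Exception
  false
end

p import_all([1, 2, 3, 4])             # => [2, 4]
p bare_rescue_catches?(StandardError)  # => true
p bare_rescue_catches?(Exception)      # => false
```

With the error class changed this way, the raise can stay in the method and the loop still finishes all rows.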
I use the capybara-webkit gem to scrape data from certain pages in my Rails application. I've noticed that, seemingly at random, the application will crash with the following error:
Capybara::Webkit::ConnectionError: /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/bin/webkit_server failed to start.
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit/server.rb:56:in `parse_port'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit/server.rb:42:in `discover_port'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit/server.rb:26:in `start'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit/connection.rb:67:in `start_server'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit/connection.rb:17:in `initialize'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit/driver.rb:16:in `new'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit/driver.rb:16:in `initialize'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit.rb:15:in `new'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-webkit-1.11.1/lib/capybara/webkit.rb:15:in `block in <top (required)>'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-2.7.1/lib/capybara/session.rb:85:in `driver'
from /home/daveomcd/.rvm/gems/ruby-2.3.1/gems/capybara-2.7.1/lib/capybara/session.rb:233:in `visit'
It happens even after it's already connected and accessed a website multiple times before. Here's a code snippet of what I'm using currently...
if site.url.present?
  begin
    # Visit the URL
    session = Capybara::Session.new(:webkit)
    session.visit(site.url) # here is where the error occurs...
    document = Nokogiri::HTML.parse(session.body)
    # Load configuration options for Development Group
    roster_table_selector = site.development_group.table_selector
    header_row_selector = site.development_group.table_header_selector
    row_selector = site.development_group.table_row_selector
    row_offset = site.development_group.table_row_selector_offset
    header_format_type = site.config_header_format_type
    # Get the Table and Header Row for processing
    roster_table = document.css(roster_table_selector)
    header_row = roster_table.css(header_row_selector)
    header_hash = retrieve_headers(header_row, header_format_type)
    my_object = process_rows(roster_table, header_hash, site, row_selector, row_offset)
  rescue ::Capybara::Webkit::ConnectionError => e
    raise e
  rescue OpenURI::HTTPError => e
    if e.message == '404 Not Found'
      raise "404 Page not found..."
    else
      raise e
    end
  end
end
I've even thought that I don't necessarily need to find out why it's happening, as long as I can recover when it does. So I was going to do a "retry" in the rescue block for the error, but it appears the server is just down, so I get the same result when retrying. Perhaps someone knows of a way I can check if the server is down, restart it, and then perform a retry? Thanks for the help!
So after further investigating it appears that I was generating a new Capybara::Session for each iteration of my loop. I moved it outside of the loop and also added Capybara.reset_sessions! at the end of my loop. Not sure if that helps with anything -- but the issue seems to have been resolved. I'll monitor it for the next hour or so. Below is an example of my ActiveJob code now...
class ScrapeJob < ActiveJob::Base
  queue_as :default
  include Capybara::DSL
  def perform(*args)
    session = Capybara::Session.new(:webkit)
    Site.where(config_enabled: 1).order(:code).each do |site|
      process_roster(site, session)
      Capybara.reset_sessions!
    end
  end
  def process_roster(site, session)
    if site.roster_url.present?
      begin
        # Visit the Roster URL
        session.visit(site.roster_url)
        document = Nokogiri::HTML.parse(session.body)
        # processing code...
        # pass the session that was created as the final parameter..
        my_object = process_rows( ..., session)
      rescue ::Capybara::Webkit::ConnectionError => e
        raise e
      rescue OpenURI::HTTPError => e
        if e.message == '404 Not Found'
          raise "404 Page not found..."
        else
          raise e
        end
      end
    end
  end
end
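For the retry idea raised in the question, one option is to rebuild the session before each retry instead of retrying against the same dead connection. This is a minimal sketch in plain Ruby, where `with_fresh_session`, `build_session`, and the `ConnectionLost` class are hypothetical stand-ins. In the real job they would presumably be Capybara::Session.new(:webkit) and Capybara::Webkit::ConnectionError, and whether a new session actually brings webkit_server back up is an assumption that has not been verified here:

```ruby
# Hypothetical stand-in for Capybara::Webkit::ConnectionError.
class ConnectionLost < StandardError; end

# Calls the block with a session from build_session. On ConnectionLost it
# builds a brand-new session and retries, up to max_attempts in total.
def with_fresh_session(build_session, max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    yield build_session.call
  rescue ConnectionLost
    retry if attempts < max_attempts
    raise
  end
end

# Usage: the first two "sessions" fail, the third one succeeds.
created = 0
build_session = lambda { created += 1 }
result = with_fresh_session(build_session) do |session|
  raise ConnectionLost, 'server down' if session < 3
  "visited with session #{session}"
end
puts result
```

Capping the number of attempts matters here: if the server really is down for good, the error is re-raised rather than looping forever.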
I have a file alert_import in lib/models/alert_import, and I would like to use it in my task, something like this:
task :send_automate_alerts => :environment do
  # STDERR.puts "Path is #{$:}"
  Rake.application.rake_require '../../lib/models/alert_import'
  ai = AlertImport::Alert.new(2)
  ai.send_email_with_notifcations
end
With this code I get the error:
Can't find ../../lib/models/alert_import
In AlertImport I have:
module AlertImport
  class Alert
    def initialize(number_days)
      @number_days = number_days
    end
    def get_all_alerts
      alerts = { }
      Organization.automate_import.each do |o|
        last_import = o.import_histories.where(import_type: "automate").last
        last_successful_import = ImportHistory.last_automate_successful_import(o)
        if last_import
          if last_import.created_at + @number_days.days >= Time.now
            alerts[o.id] = "Error during last automate import Last successful import was #{ last_successful_import ? last_successful_import.created_at : "never"}" if last_import.status == "failure"
            alerts[o.id] = "Error during last automate import - status pending Last successful import was #{ last_successful_import ? last_successful_import.created_at : "never"}" if last_import.status == "pending"
          else
            alerts[o.id] = "There were no new files uploaded within #{@number_days} days"
          end
        else
          alerts[o.id] = "The import was never triggered at all"
        end
      end
      alerts
    end
    def send_email_with_notifcations
      alerts = get_all_alerts
      unless alerts.empty?
        AlertMailer.email_notifications(alerts).deliver
      end
    end
  end
end
The correct solution is:
desc "Send alerts about automate imports"
task :send_automate_alerts => :environment do
  require "#{Rails.root}/lib/models/alert_import"
  ai = AlertImport::Alert.new(2)
  ai.send_email_with_notifcations
end
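The reason an absolute path helps is that `require` with an absolute path does not depend on the current directory or on `$LOAD_PATH`. Here is a small self-contained sketch of that behavior, using a temporary directory as a stand-in for `Rails.root` and a hypothetical `AlertImportDemo` module written to disk only for the demonstration:

```ruby
require 'tmpdir'
require 'fileutils'

# Stand-in for Rails.root: a throwaway directory with lib/models inside it.
fake_root = Dir.mktmpdir
models_dir = File.join(fake_root, 'lib', 'models')
FileUtils.mkdir_p(models_dir)

# Hypothetical library file, created here only so the example is runnable.
File.write(File.join(models_dir, 'alert_import_demo.rb'),
           "module AlertImportDemo\n  DAYS = 2\nend\n")

# An absolute path makes require independent of the working directory.
loaded = require File.join(fake_root, 'lib', 'models', 'alert_import_demo')
puts loaded
puts AlertImportDemo::DAYS
```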
In Rails 3.x, I've had success by first importing the file using require and then including the module in the namespace. Here's how it would look:
require 'models/alert_import'
namespace :alerts do
  include AlertImport
  desc 'Send alerts about automate imports'
  task send_automate_alerts: :environment do
    ai = AlertImport::Alert.new(2)
    ai.send_email_with_notifcations
  end
end
I tried a few options, most notably rake_require, but it looks like the documentation for rake_require is incorrect. Specifically, it will not load files that don't end in .rake.
So in the end, I did it "from scratch", something like this:
```
namespace :my_namespace do
  task :my_task do
    require File.join(Rails.root, 'app', 'services', 'my_module.rb')
    class Wrapper
      include MyModule
    end
    Wrapper.new.the_method_I_need(args)
  end
end
```
Done.
Most probably your path is wrong. You can do as follows:
task :send_automate_alerts => :environment do
  # STDERR.puts "Path is #{$:}"
  Rake.application.rake_require "#{Rails.root}/lib/models/alert_import"
  ai = AlertImport::Alert.new(2)
  ai.send_email_with_notifcations
end
"#{Rails.root}" gives you the root path of your project.
Your path is wrong, you can try:
task :send_automate_alerts => :environment do
  # STDERR.puts "Path is #{$:}"
  Rake.application.rake_require "#{Rails.root}/lib/models/alert_import"
  ai = AlertImport::Alert.new(2)
  ai.send_email_with_notifcations
end
Regards!
Check out http://rake.rubyforge.org/classes/Rake/Application.html#M000099
and define the correct path.
Having trouble setting up a Rake Task. Here is the code:
task :fetch_games => :environment do
  require 'nokogiri'
  require 'open-uri'
  doc = Nokogiri::XML(open(url))
  games = doc.xpath('//game')
  games.each do |game|
    @data = Game.new(
      :name => game.at('name').text,
      :publisher => game.at('publisher').text,)
    @data.save
    if @data.save
      puts "Success"
    else
      puts "Didn't work"
    end
  end
end
It runs without error, but in the database the entries show: "--- !ruby/object:Nokogiri::XML::Element {}"
Any help would be awesome. Thanks!
Figured it out myself: the XPath syntax was incorrect. I needed to use:
/game
instead of
//game
I have a rake task that loads car images from the websites table using Paperclip. Each image is stored in the database as a remote link.
Here is my code. I'm using Ruby 1.8.7, Rails 2.3.8, and MySQL.
namespace :db do
  task :load_photo => :environment do
    require 'rubygems'
    require 'open-uri'
    require 'net/http'
    require 'paperclip'
    begin
      images =Website.find(:all,:conditions=>["image_url is not null"])
      images.each do |photo|
        url = URI.parse(photo.image_url)
        Net::HTTP.start(url.host, url.port) do |http|
          if http.head(url.request_uri).code == "200"
            Car.update_attribute(:photo,open(url))
          end
        end
      end
    rescue Exception => e
    end
  end
end
I run the above rake task with rake db:load_photo. My table (Website) has 60,000 rows. The rake task runs for only about 10,000 rows, and then execution terminates with the error message "execution expired".
Can anyone help me figure this out?
Thanks in advance.
You may find it more performant to run it in batches. Active Record has a find_in_batches method, which avoids loading all the records into memory at one time.
http://ryandaigle.com/articles/2009/2/23/what-s-new-in-edge-rails-batched-find
You could change your code to look like:
namespace :db do
  task :load_photo => :environment do
    require 'rubygems'
    require 'open-uri'
    require 'net/http'
    require 'paperclip'
    Website.find_in_batches(:conditions=>["image_url is not null"]) do |websites|
      websites.each do |website|
        begin
          url = URI.parse(website.image_url)
          Net::HTTP.start(url.host, url.port) do |http|
            if http.head(url.request_uri).code == "200"
              Car.update_attribute(:photo,open(url))
            end
          end
        rescue Exception => e
        end
      end
    end
  end
end
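To illustrate what batching buys you, here is a sketch in plain Ruby that uses `each_slice` on an array as a stand-in for `find_in_batches`. This is an analogy, not the Active Record implementation, and `process_in_batches` with its `pause` parameter is hypothetical. Only one slice is handled at a time, and an optional pause between batches gives the remote server some breathing room:

```ruby
# Hypothetical helper: yields each record, batch by batch, sleeping between
# batches when pause is positive. Returns the number of batches processed.
def process_in_batches(records, batch_size, pause = 0)
  batches = 0
  records.each_slice(batch_size) do |batch|
    batch.each { |record| yield record }
    batches += 1
    sleep(pause) if pause > 0
  end
  batches
end

seen = []
count = process_in_batches((1..10).to_a, 4) { |n| seen << n }
puts "#{count} batches, #{seen.size} records"  # prints "3 batches, 10 records"
```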
I can only guess, but it looks like you're effectively running a small DoS attack on the server you're pulling images from.
You can try adding a short delay between sequential requests (like "sleep 1").
Also, if your "execution expired" is a Timeout::Error exception, note that on Ruby 1.8 a bare
rescue => e
will not catch it, because there Timeout::Error is not a subclass of StandardError; it is a subclass of the Interrupt class. Your rescue Exception => e does catch it, but the empty rescue body then discards it silently. To handle timeouts on their own, catch them explicitly, like so:
rescue Timeout::Error => e
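A minimal sketch of rescuing Timeout::Error explicitly, using the standard library's Timeout.timeout with a hypothetical `fetch_with_timeout` wrapper (in the real task, the HTTP call would go inside the block):

```ruby
require 'timeout'

# Hypothetical wrapper: runs the block with a time limit and returns nil
# instead of aborting the whole task when the limit is exceeded.
def fetch_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error => e
  warn "skipped: #{e.message}"
  nil
end

p fetch_with_timeout(1) { :image_data }  # => :image_data
p fetch_with_timeout(0.05) { sleep 1 }   # => nil
```

Returning nil lets the loop skip the slow image and continue with the next row.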