how to skip/ignore malformed CSV when using CSV.foreach? - ruby-on-rails

I am trying to read a large CSV file, but the file is in bad shape, so some of its lines raise CSV::MalformedCSVError.
I just want to ignore the offending line and move on to the next one.
I tried adding begin/rescue, but my code does not work: it stops at the error.
My current code:
require 'csv'
begin
  CSV.foreach(filename, :headers => true) do |row|
    Moulding.create!(row.to_hash)
  end
rescue
  next
end

I don't think you can do this with foreach, because the exception does not seem to be raised within the block but rather within the foreach method itself. Something like this should work, though: here the exception is raised on the call to shift, which you can rescue before moving on.
require 'csv'
csv_file = CSV.open("test.csv", :headers => true)
loop do
  begin
    row = csv_file.shift
    break unless row
    p row
  rescue CSV::MalformedCSVError
    puts "skipping bad row"
  end
end
By the way, your code above does not run because, once begin/rescue surrounds the foreach call, next is no longer valid in that context. With the next statement commented out, the code runs, but when the exception is raised inside foreach, the method simply ends, the program moves on to the rescue block, and no more lines are read from the file.

Related

Escaping ensure in console/rake/rails

Given something like this
def infinite
  puts Time.now
rescue => err
  puts err.message
ensure
  infinite
end
When you run this in console/rake and hit ctrl-c - nothing happens. How do you escape this with CTRL-C?
Use catch instead which is an alternative control flow.
catch executes its block. If throw is not called, the block executes normally, and catch returns the value of the last expression evaluated.
Ruby searches up its stack for a catch block whose tag has the same object_id as the throw (symbols are almost always used, as they have the same object_id). When found, the block stops executing and returns val (or nil if no second argument was given to throw).
def infinate
  catch(:done) do
    begin
      infinite
    rescue SystemExit, Interrupt => e
      puts e.message
      throw :done
    end
  end
end
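As a minimal sketch of the catch/throw behavior described above (the method name first_pair_over and the number ranges are made up for illustration), throw can leave two nested loops at once, and catch returns whatever was thrown:

```ruby
# Returns the first [a, b] pair whose product exceeds `limit`,
# leaving both nested loops at once via throw.
def first_pair_over(limit)
  catch(:found) do
    (1..9).each do |a|
      (1..9).each do |b|
        # The second argument to throw becomes the return value of catch.
        throw :found, [a, b] if a * b > limit
      end
    end
    nil # reached only if throw was never called
  end
end

p first_pair_over(20)
p first_pair_over(100)
```

If nothing is thrown, catch simply returns the value of the last expression in its block, which is nil here.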
Using ensure with a condition like that is semantically wrong as the whole point of ensure is to run code that always should be run.
Creating an infinite loop via recursion in ensure seems overly complicated and could eventually raise a SystemStackError.
Why not use an actual loop:
def infinite
  loop do
    begin
      puts Time.now
    rescue => err
      puts err.message
    end
  end
end
With the above, Ctrl-C works just fine, because rescue without an explicit exception class only handles StandardError and its subclasses, and Interrupt is not one of them.
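A small sketch of why a bare rescue lets Ctrl-C through. The helper name handled_by_bare_rescue? is hypothetical, and the outer rescue Exception exists only so the demonstration can report the result:

```ruby
# Reports whether a bare `rescue => err` would handle the given exception class.
def handled_by_bare_rescue?(exception_class)
  begin
    raise exception_class
  rescue => _err # same as `rescue StandardError => _err`
    return true
  end
rescue Exception # demonstration only; avoid rescuing Exception in real code
  false
end

puts handled_by_bare_rescue?(ArgumentError) # a StandardError subclass
puts handled_by_bare_rescue?(Interrupt)     # raised by Ctrl-C
puts handled_by_bare_rescue?(SystemExit)    # raised by exit
```

Interrupt and SystemExit descend from Exception without passing through StandardError, so the inner rescue never sees them.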
I'm not sure if this is the proper solution but this worked for me:
def infinite
  puts Time.now
rescue SystemExit, Interrupt
  @skip_ensure = true
  puts 'SystemExit/Interrupt'
rescue => err
  puts err.message
ensure
  infinite unless @skip_ensure
end

Exception not falling in models

I wrote a method in a Ruby on Rails model like the one below:
def sorted_exp
  begin
    exp.order("currently_active = true DESC, TO_DATE(to_date, 'MM-DD-YYYY') DESC")
  rescue
    exp.order("currently_active = true DESC")
  end
end
However, a few entries in the to_date column, such as 'september 2018', do not match the format and cause an exception. When I tried to handle the exception in the model, it failed: execution never reaches the rescue section. I don't know why the error is not caught in the model, or why the query in the rescue section is not returned.
The exception raised is the following:
PG::InvalidDatetimeFormat: ERROR: invalid value "se" for "MM"
In the sorted_exp method, the output of the query is not being used. Rails actually executes the call to the DB when the value of the call is being used. In this case, the value of this is probably being used in some other function and the error is being raised from there, pointing to this line: exp.order("currently_active = true DESC, TO_DATE(to_date, 'MM-DD-YYYY') DESC")
I'm not sure of your exact use case, but the only way of catching the exception in this block would be to use values that the query is supposed to return, like counting the number of objects returned (again, it depends on your use case).
For example, the following query raises an error in spite of being in a begin..rescue block:
begin
  User.order("TO_DATE(users.created_at, 'MONTH-YYYY') DESC")
rescue
  puts "In rescue block"
end
This raises the following error:
ActiveRecord::StatementInvalid: PG::UndefinedFunction: ERROR: function to_date(timestamp without time zone, unknown) does not exist
However, when the output of this query is used in the begin block itself, the exception gets caught. For example:
begin
  sorted_users = User.order("TO_DATE(users.created_at, 'MONTH-YYYY') DESC")
  count = sorted_users.count
rescue
  puts "In rescue block"
end
The output for this is:
In rescue block
This is because the query was actually executed in the begin block itself, and hence, got caught by our rescue block.
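The deferred execution described above can be imitated without a database. The sketch below is only an analogy: it uses Ruby's Enumerator::Lazy, not ActiveRecord, and the method names are hypothetical. Building the lazy chain inside a rescue-protected method raises nothing; the error surfaces only where the chain is forced.

```ruby
# Builds a lazy chain whose block would raise, but does not run it yet.
def build_rows
  [1, 2, 3].lazy.map { |n| raise ArgumentError, "bad value #{n}" }
end

# Only builds the chain: the block never runs, so nothing is rescued here.
def rescued_when_only_building?
  build_rows
  false
rescue ArgumentError
  true
end

# Forces the chain with to_a: the block runs inside this method,
# so the error is raised and rescued here.
def rescued_when_forcing?
  build_rows.to_a
  false
rescue ArgumentError
  true
end

puts rescued_when_only_building?
puts rescued_when_forcing?
```

In the Rails case, forcing the relation inside the begin block (for example by calling load or to_a on it) plays the same role as to_a does in this sketch, at the cost of running the query at that point.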

Why doesn't calling next within a rescue block within a transaction within a loop work?

I have a loop like this:
# Iterate a list of items
req_wf_list.each do |req_wf|
  # Begin a transaction
  ReqWf.transaction do # ReqWf is an ActiveRecord model class
    # Do some things
    # ...
    # 1. I want to be able to continue processing with the
    # next iteration of the loop if there is an error here
    # 2. I also want to rollback the transaction associated with
    # this particular iteration if I encounter an error
    begin
      # Do something that might return an error
    rescue
      # Do some error processing
      puts "Caught such and such error"
      # Don't complete transaction (rollback),
      # don't "do some more things",
      # proceed to next item in req_wf_list
      next
    end
    # Do some more things
    # Shouldn't make it here if there is an error but I do indeed make it here
    # ...
    # End transaction
  end
  # End loop
end
Now, I would expect that calling "next" within the rescue block would cause the transaction associated with that particular iteration of the loop to rollback and for execution to resume at the top of the next iteration of the loop. Instead, execution appears to resume at the "Do some more things" line. It is as if the "next" statement is completely ignored. What am I missing?
Most likely, next applies to the transaction block in this case, so you are effectively in a nested-loop situation.
This is an example of what can be done to solve the issue
req_wf_list.each do |req_wf|
  catch :go_here do # :go_here acts as a label
    ReqWf.transaction do
      throw :go_here unless something # break out of both blocks
    end
  end # You end up here if :go_here is thrown
end
But in general, it is not a good practice to use next. You should be able to put a global begin .. rescue and have all the conditions inside of it, so that nothing else gets executed once you catch an error.
Update
I ran a small test, and the behavior is as you expect it.
loop = [1,2,3]
loop.each do |value|
  puts "value => #{value}"
  ActiveRecord::Base.transaction do
    puts "Start transaction"
    begin
      raise
    rescue
      puts "ActiveRecord::StatementInvalid"
      next
    end
    puts "Should not get here!"
  end
end
The output is the following:
value => 1
Start transaction
ActiveRecord::StatementInvalid
value => 2
Start transaction
ActiveRecord::StatementInvalid
value => 3
Start transaction
ActiveRecord::StatementInvalid
Is it possible that another error occurred in your code before next was called?
In any case, using the next statement is not the best option as I said before.
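To make the scoping of next concrete, here is a plain-Ruby sketch with no database. fake_transaction is a hypothetical stand-in for ReqWf.transaction that merely yields, and process is an illustrative name. It suggests that next leaves only the innermost block, so any code placed after the transaction block in the same iteration still runs:

```ruby
# Hypothetical stand-in for ReqWf.transaction: it just yields to its block.
def fake_transaction
  yield
end

def process(items)
  log = []
  items.each do |item|
    fake_transaction do
      begin
        raise ArgumentError, "bad item" if item.odd?
      rescue ArgumentError
        log << "rescued #{item}"
        next # leaves the fake_transaction block, not the each iteration
      end
      log << "rest of transaction #{item}"
    end
    log << "after transaction #{item}" # still runs for rescued items
  end
  log
end

puts process([1, 2])
```

In the test above nothing follows the transaction block inside the loop, which is why next there looks as if it jumped straight to the next item. Note also that next is an ordinary exit from the block rather than a rollback request; in Rails, raising ActiveRecord::Rollback inside the transaction block is the documented way to roll back without propagating an error.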

Rails importing CSV fails due to mal-formation

I get a CSV::MalformedCSVError when I try to import a file using the following code:
def import_csv(filename, model)
  CSV.foreach(filename, :headers => true) do |row|
    item = {}
    row.to_hash.each_pair do |k,v|
      item.merge!({k.downcase => v})
    end
    model.create!(item)
  end
end
The CSV files are HUGE, so is there a way I can just log the badly formatted lines and CONTINUE EXECUTION with the remainder of the CSV file?
You could try handling the file reading yourself and let CSV work on one line at a time. Something like this:
File.foreach(filename) do |line|
  begin
    CSV.parse(line) do |row|
      # Do something with row...
    end
  rescue CSV::MalformedCSVError => e
    # complain about line
  end
end
You'd have to do something with the header line yourself of course. Also, this won't work if you have embedded newlines in your CSV.
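A minimal sketch of the line-at-a-time idea with the header handled by hand. The method name parse_skipping_bad_lines and the sample data are made up for illustration, and the sketch assumes no field contains an embedded newline:

```ruby
require 'csv'
require 'stringio'

# Parses `io` one physical line at a time. Returns [rows, bad_lines]:
# rows are hashes keyed by the header line, and bad_lines are
# [line_number, raw_text] pairs for lines CSV could not parse.
def parse_skipping_bad_lines(io)
  headers = nil
  rows = []
  bad_lines = []
  io.each_line.with_index(1) do |line, lineno|
    begin
      fields = CSV.parse_line(line)
    rescue CSV::MalformedCSVError
      bad_lines << [lineno, line.chomp]
      next # skip this line and keep reading
    end
    next if fields.nil? || fields.empty? # blank line
    if headers.nil?
      headers = fields # the first parseable line is treated as the header
    else
      rows << headers.zip(fields).to_h
    end
  end
  [rows, bad_lines]
end

sample = StringIO.new("name,size\noak,10\nbad \"quote,20\npine,30\n")
rows, bad_lines = parse_skipping_bad_lines(sample)
rows.each { |row| p row }
bad_lines.each { |lineno, text| warn "skipping line #{lineno}: #{text}" }
```

Because each line is parsed on its own, the raw text of a bad line is available for logging. For a real file, File.open(filename) { |f| parse_skipping_bad_lines(f) } would pass the file in place of the StringIO.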
One problem with using File to manually go through each line in the file is that CSV files can contain fields with \n (newline character) in them. File will take that to indicate a newline and you will end up trying to parse a partial row.
Here is another approach that might work for you:
# CSV.open reads from the file; CSV.new with a String would parse the path text itself
@csv = CSV.open('path/to/file.csv')
loop do
  begin
    row = @csv.shift
    break unless row
    # do stuff
  rescue CSV::MalformedCSVError => error
    # handle the error
    next
  end
end
The main downside that I see with this approach is that you don't have access to the CSV row string when handling the error, just the CSV::MalformedCSVError itself.

rescuing from Mysql2::Error

I have a simple question. I have a join table which has an index that ensures that (col 1, col 2) is unique.
I am adding to that table using mysql2 gem and am trying to catch the Mysql2::Error if the attempt results in a duplicate key error. While I am getting the duplicate key error, my rescue body is not being executed.
begin
  self.foo << bar
rescue Mysql2::Error
  logger.debug("#{$!}")
end
I am receiving the following error upon executing self.foo << bar
Mysql2::Error: Duplicate entry '35455-6628' for key 'index_foos_bars_on_foo_id_and_bar_id': INSERT INTO foos_bars (foo_id, bar_id) VALUES (35455, 6628)
BUT my rescue statement is not being hit! The exception is not being rescued. What am I doing wrong? If I remove Mysql2::Error and rescue everything, then it works. But that is bad practice; I just want to rescue Mysql2::Error, which is raised in the event of a duplicate entry.
Thanks,
Mysql2::Error is wrapped in another exception class now. Change your code to:
begin
  self.foo << bar
rescue Exception => e # only for debug purposes, don't rescue Exception in real code
  logger.debug "#{e.class}"
end
...and you'll see the real exception class that you need to rescue.
Edit: It seems in this case it turned out to be ActiveRecord::RecordNotUnique
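The wrapping effect can be reproduced in plain Ruby. In the sketch below, DriverError and WrappedError are hypothetical stand-ins for Mysql2::Error and the ActiveRecord class that wraps it. Whether ActiveRecord exposes the driver error through cause in your version is an assumption worth checking; the sketch only shows Ruby's general behavior.

```ruby
# Hypothetical stand-ins for a driver error and the wrapper raised above it.
class DriverError < StandardError; end
class WrappedError < StandardError; end

def insert_duplicate
  raise DriverError, "Duplicate entry"
rescue DriverError
  # Raising inside a rescue clause records the original error as #cause.
  raise WrappedError, "record not unique"
end

def classify
  insert_duplicate
  :no_error
rescue DriverError
  :rescued_driver_error # never reached: the wrapper is a different class
rescue WrappedError => e
  [:rescued_wrapper, e.cause.class]
end

p classify
```

The rescue clause has to name the class that actually reaches it; the original error, if it is kept at all, is reachable only through cause.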
