Remove non-numeric characters from parsed CSV file - ruby-on-rails

I am new to Ruby. I want to remove non-numeric characters from a phone number parsed from a CSV file.
Here is the code I am using:
require 'csv'
csv_text = File.read('file.csv')
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
  puts "First Name: #{row['Name']} - HomePhone: #{row['Phone']} - Zip Code: #{row['Zipcode']}"
end
The output prints as follows:
FirstName:Abiel HomePhone:6667-88-76
(In the CSV file, HomePhone contains non-numeric characters.)
I want the output to be FirstName:Abiel HomePhone:66678876

This should work:
row['Phone'].gsub(/[^0-9]/, "")

Yes, or just row['Phone'].gsub(/\D/, "")
where \d matches a digit and \D matches any non-digit character.
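Putting the answers together with the loop from the question, a minimal sketch might look like the following. The inline sample data and the helper name digits_only are assumptions made for illustration; the column names come from the question.

```ruby
require 'csv'

# Hypothetical helper: keep only the digits of a value (nil becomes "").
def digits_only(value)
  value.to_s.gsub(/\D/, '')
end

# Sample data standing in for file.csv (an assumption for this sketch).
csv_text = <<~CSV
  Name,Phone,Zipcode
  Abiel,6667-88-76,12345
CSV

CSV.parse(csv_text, headers: true).each do |row|
  puts "First Name: #{row['Name']} - HomePhone: #{digits_only(row['Phone'])} - Zip Code: #{row['Zipcode']}"
end
```

Calling to_s first means a row with an empty Phone field (which CSV returns as nil) does not raise an error.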

Related

CSV::MalformedCSVError: Unquoted fields do not allow \r or \n in Ruby

I have CSVs which I am trying to import into my Oracle database, but unfortunately I keep getting the same error:
> CSV::MalformedCSVError: Unquoted fields do not allow \r or \n (line 1).
I know tons of similar questions have been asked, but none relate specifically to my issue other than this one, and unfortunately it didn't help.
To explain my scenario:
I have CSVs in which the rows don't always end with a value but
rather with just a comma, because the last field is null and stays blank.
I would like to import the CSVs regardless of whether a row ends with a comma or not.
Here are the first 5 lines of my CSV, with values changed for privacy reasons:
id,customer_id,provider_id,name,username,password,salt,email,description,blocked,created_at,updated_at,deleted_at
1,,1,Default Administrator,admin,1,1," ",Initial default user.,f,2019-10-04 14:28:38.492000,2019-10-04 14:29:34.224000,
2,,2,Default Administrator,admin,2,1,,Initial default user.,,2019-10-04 14:28:38.633000,2019-10-04 14:28:38.633000,
3,1,,Default Administrator,admin,3,1," ",Initial default user.,f,2019-10-04 14:41:38.030000,2019-11-27 10:23:03.329000,
4,1,,admin,admin,4,1," ",,,2019-10-28 12:21:23.338000,2019-10-28 12:21:23.338000,
5,2,,Default Administrator,admin,5,1," ",Initial default user.,f,2019-11-12 09:00:49.430000,2020-02-04 08:20:06.601000,2020-02-04 08:20:06.601000
As you can see, a row sometimes ends with a comma and sometimes without, and this structure repeats quite often.
This is the code I have been playing around with:
def csv_replace_empty_string
  Dir.foreach(Rails.root.join('db', 'csv_export')) do |filename|
    next if filename == '.' or filename == '..' or filename == 'extract_db_into_csv.sh' or filename == 'import_csv.rb'
    read_file = File.read(Rails.root.join('db', 'csv_export', filename))
    replace_empty_string = read_file.gsub(/(?<![^,])""(?![^,])/, '" "')
    format_csv = replace_empty_string.gsub(/\r\r?\n?/, "\n")
    # format_csv = remove_empty_lines.sub!(/(?:\r?\n)+\z/, "")
    File.open(Rails.root.join('db', 'csv_export', filename), "w") { |file| file.puts format_csv }
  end
end
I have tried using many different kinds of gsubs found online in similar forums, but it didn't help.
Here is my function for importing the CSV in the db:
def import_csv_into_db
  Dir.foreach(Rails.root.join('db', 'csv_export')) do |filename|
    next if filename == '.' or filename == '..' or filename == 'extract_db_into_csv.sh' or filename == 'import_csv.rb'
    filename_renamed = File.basename(filename, File.extname(filename)).classify
    CSV.foreach(Rails.root.join('db', 'csv_export', filename), :headers => true, :skip_blanks => true) do |row|
      class_name = filename_renamed.constantize
      class_name.create!(row.to_hash)
      puts "Insert on table #{filename_renamed} complete"
    end
  end
end
I have also tried the options provided by CSV such as :row_sep => :"\n" or :row_sep => "\r" but keep on getting the same error.
I am pretty sure I have some sort of thinking error, but I can't seem to figure it out.
I fixed the issue by using the following:
format_csv = replace_empty_string.gsub(/\r\r?\n?/, "\n")
This was originally @mgrims's answer, but I also had to adjust my code by removing the :skip_blanks and :row_sep options.
It is importing successfully now!
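As a rough illustration of that fix, the sketch below normalizes mixed line endings with the same regular expression and then parses the result. The helper names and the three-column sample are assumptions for illustration only; note that a row ending in a comma simply yields nil for its last field.

```ruby
require 'csv'

# Hypothetical helper: collapse "\r\n", "\r\r\n" and bare "\r" into "\n".
def normalize_line_endings(text)
  text.gsub(/\r\r?\n?/, "\n")
end

# Hypothetical helper: normalize, then parse into an array of hashes.
def parse_normalized(text)
  CSV.parse(normalize_line_endings(text), headers: true).map(&:to_h)
end

# Made-up sample with mixed line endings and a trailing comma on one row.
raw = "id,name,deleted_at\r\n1,admin,\r2,guest,2020-02-04\r\r\n"

p parse_normalized(raw)
```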

Rails - Loop through csv separated by bars

I have a CSV document with one column and 1000 rows. Each row has a string of data which is separated by "|".
For example
BOB|MARLEY|306336|Friday| 9:00AM|02 DIS 2|HELE TP 1|PARRA|JULIA|20 Jul 2018|TOMPSON|TORI|21332|NA|AUS|4214|||0400 000 000|zzz11#bigpond.com|.0000|NULL|NULL|0|QLD|F|2016-06-22 00:00:00.000|
I need to loop through each row then split the string into another array. I then need to loop through each of those arrays.
Currently I have
csv_text = open('https://res.cloudinary.com/thypowerhouse/raw/upload/v1534642033/rackleyswimming/HVL_SCHOOL.csv')
csv = CSV.parse(csv_text, :headers=>true)
csv.each do |row|
  new_row = row.map(&:inspect).join
  new_row = new_row.delete! '[]'
  new_row = new_row.gsub('|', '", "')
  new_row = new_row.split(',')
  puts new_row
end
I don't know if I'm heading in the right direction.
You can use col_sep to separate the data of each row:
require "csv"
CSV.foreach("HVL_SCHOOL.csv", headers: true, col_sep: "|") do |row|
  # Your code here: process your data
end
Every row yielded by CSV.foreach (previous example) is a CSV::Row, which can be treated much like an array because it includes the Enumerable module.
I think with this you can do what you want with the data.
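To make the col_sep approach concrete, here is a small sketch that parses pipe-delimited text from an inline string and then loops over the fields of each row. The helper name and the shortened sample rows are assumptions for illustration, and this version assumes there is no header row.

```ruby
require 'csv'

# Hypothetical helper: parse pipe-delimited text into arrays of fields.
def parse_pipe_rows(text)
  CSV.parse(text, col_sep: '|')
end

# Shortened, made-up rows in the same shape as the question's data.
sample = "BOB|MARLEY|306336|Friday\nJULIA|PARRA|21332|NA\n"

parse_pipe_rows(sample).each do |fields|
  fields.each { |field| puts field }
end
```

Empty fields between two bars come back as nil, so code that processes each field may need to handle that case.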

Remove brackets and quotation marks from CSV generated output

I use CSV to save each line in a text file as a separate object in the database.
Each line is saved with added enclosing square brackets and double quotes:
["One line of text"]
Is there any option in CSV to exclude those, or else any other nifty way to remove them?
require 'csv'
def self.import_lines_of_text(filename)
  csv_file_path = "db/questions/#{filename}"
  CSV.foreach(csv_file_path) do |row|
    clean_sentence = row.join(",")
    self.create!(content: clean_sentence)
  end
end
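The brackets and quotes are what a Ruby Array looks like when it is printed with inspect, which suggests (as an assumption about the original code) that the row was being saved as an Array rather than a String. A minimal sketch of the difference, using a hypothetical helper name:

```ruby
require 'csv'

# Hypothetical helper: turn one parsed row (an Array) back into a plain String.
def row_to_sentence(row)
  row.join(',')
end

row = CSV.parse_line('One line of text')
puts row.inspect          # the Array form, with brackets and quotes
puts row_to_sentence(row) # the plain sentence
```

Joining with "," also restores sentences that CSV split at commas, although it does not restore any quoting that CSV removed while parsing.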

Ruby CSV#fetch case sensitivity

I have a simple CSV file that has the following header: 'NYC'.
I use the CSV::Row#fetch method:
http://ruby-doc.org/stdlib-2.0.0/libdoc/csv/rdoc/CSV/Row.html#method-i-fetch
The problem is that when I use fetch with 'NYC' it works properly, but when I use fetch with 'nyc' it returns:
KeyError: key not found: nyc
How can I solve this problem?
There's an option :header_converters. You can set it to:
:downcase - Calls downcase() on the header String.
:symbol - The header String is downcased, spaces are replaced with underscores, non-word characters are dropped, and finally to_sym() is called.
Example:
require 'csv'
CSV.parse("NYC\nfoo", headers: true, header_converters: :symbol) do |row|
  row[:nyc] #=> "foo"
end
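If string keys are preferred over symbols, the :downcase converter can be combined with fetch. The sketch below is illustrative only; the helper name is an assumption, and the input is the same one-column sample as above.

```ruby
require 'csv'

# Hypothetical helper: parse with downcased headers so lookups can use lowercase keys.
def parse_with_downcased_headers(text)
  CSV.parse(text, headers: true, header_converters: :downcase)
end

table = parse_with_downcased_headers("NYC\nfoo\n")
puts table.first.fetch('nyc')
```

With this converter the headers are stored in lowercase, so a lookup with 'NYC' no longer matches; downcasing the key before calling fetch keeps lookups consistent.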

How parse the data from TXT file with tab separator?

I am using Ruby 1.8.7 and Rails 2.3.8. I want to parse data from a tab-separated TXT dump file.
The TXT dump contains some CSS properties, which appear to introduce invalid data.
When I run my code using the FasterCSV gem:
FasterCSV.foreach(txt_file, :quote_char => '"',:col_sep =>'\t', :row_sep =>:auto, :headers => :first_row) do |row|
  col = row.to_s.split(/\t/)
  puts col[15]
end
the error written to the console is "Illegal quoting on line 38." Can anyone suggest how to skip the rows which have invalid data and continue loading the remaining rows?
Here's one way to do it. We go to a lower level, using shift to parse each row, and then silence the MalformedCSVError exception, continuing with the next iteration. The problem with this is that the loop doesn't look so nice. If anyone can improve this, you're welcome to edit the code.
FasterCSV.open(filename, :quote_char => '"', :col_sep => "\t", :headers => true) do |csv|
  row = true
  while row
    begin
      row = csv.shift
      break unless row
      # Do things with the row here...
    rescue FasterCSV::MalformedCSVError
      next
    end
  end
end
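On Ruby 1.9 and later, where FasterCSV became the standard CSV library, a similar idea can be sketched by parsing one line at a time and rescuing CSV::MalformedCSVError. This is a variation rather than the code above: the helper name and sample data are assumptions, headers are not handled, and it only works if no field contains an embedded newline.

```ruby
require 'csv'

# Hypothetical helper: parse tab-separated text line by line, skipping any
# line that CSV rejects. Assumes no field contains an embedded newline.
def parse_skipping_bad_lines(text)
  rows = []
  text.each_line do |line|
    begin
      fields = CSV.parse_line(line, col_sep: "\t")
      rows << fields if fields
    rescue CSV::MalformedCSVError
      next # ignore the malformed line and carry on
    end
  end
  rows
end

# Made-up sample: the second line has a stray double quote inside a field.
sample = "a\tb\nbad\"quote\tx\nc\td\n"
p parse_skipping_bad_lines(sample)
```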
Just read the file as a regular one (not with FasterCSV), split it by \t like you do now, and it should work.
So the problem is that TSV files don't have a quote character. The specification simply specifies that you aren't allowed to have tabs in the data.
The CSV library doesn't really support this use case. I've worked around it by specifying a quote character that I know won't appear in my data. For example:
CSV.parse(txt_file, :quote_char => '☎', :col_sep => "\t") do |row|
  puts row[15]
end
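A self-contained version of the same workaround is sketched below. It uses the NUL character as the quote character instead of '☎', on the assumption that NUL never occurs in the data; the helper name and sample rows are made up for illustration.

```ruby
require 'csv'

# Hypothetical helper: parse TSV text while treating double quotes as ordinary
# characters. Assumes the NUL character never appears in the data.
def parse_tsv_without_quoting(text)
  CSV.parse(text, col_sep: "\t", quote_char: "\x00")
end

# Made-up sample: the second row contains an unbalanced double quote.
sample = "name\tstyle\nbox\tfont-family: \"Arial\n"
p parse_tsv_without_quoting(sample)
```

Because the double quote is no longer special, the stray quote in the second row is kept as part of the field instead of triggering an "Illegal quoting" error.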
