I have a CSV document with one column and 1000 rows. Each row has a string of data which is separated by "|".
For example
BOB|MARLEY|306336|Friday| 9:00AM|02 DIS 2|HELE TP 1|PARRA|JULIA|20 Jul 2018|TOMPSON|TORI|21332|NA|AUS|4214|||0400 000 000|zzz11#bigpond.com|.0000|NULL|NULL|0|QLD|F|2016-06-22 00:00:00.000|
I need to loop through each row then split the string into another array. I then need to loop through each of those arrays.
Currently I have
csv_text = open('https://res.cloudinary.com/thypowerhouse/raw/upload/v1534642033/rackleyswimming/HVL_SCHOOL.csv')
csv = CSV.parse(csv_text, :headers=>true)
csv.each do |row|
new_row = row.map(&:inspect).join
new_row = new_row.delete! '[]'
new_row = new_row.gsub('|', '", "')
new_row = new_row.split(',')
puts new_row
end
Don't know if I'm heading in the right direction?
You can use col_sep to separate the data of each row:
require "csv"
CSV.foreach("HVL_SCHOOL.csv", headers: true, col_sep: "|") do |row|
# Your code here: process your data
end
Every row yielded by CSV.foreach (as in the previous example) is a CSV::Row, which can be treated much like an array because it includes the Enumerable module.
I think with this you can do what you want with this data.
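As a minimal sketch of the "loop through each row, then through each value" part: the snippet below parses an in-memory string instead of the real file, and the header names and single data row are made up for illustration. It assumes the first line of the data is a header row, as in the answer above.

```ruby
require "csv"

# Hypothetical sample: a header row plus one pipe-delimited data row.
data = "FIRST|LAST|ID\nBOB|MARLEY|306336\n"

fields_per_row = []
CSV.parse(data, headers: true, col_sep: "|") do |row|
  # row.fields returns the row's values as a plain Array
  fields = row.fields
  # inner loop over the individual values of this row
  fields.each { |value| puts value }
  fields_per_row << fields
end
```

Swapping CSV.parse(data, ...) for CSV.foreach("HVL_SCHOOL.csv", ...) should give the same per-row behaviour for a file on disk.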
I'm trying to import CSV files and the header row keeps moving around, sometimes it's in row 20, sometimes it's in 25, and so on, but the field 'SPIRIT Barcode' is always in the header and it's the only thing I'm interested in at this point. I'm saving it in "barcode".
How do I manipulate this to find the row 'SPIRIT Barcode' is in and use that as the header? (everything above the header can be ignored)
def self.import(file)
#b = []
CSV.foreach(file.path, headers: true) do |row|
entry = ZygReport.find_by(barcode: row['SPIRIT Barcode']) || new
entry.update({
:barcode => row['SPIRIT Barcode'],
})
entry.save!
#b << [entry.barcode]
end
end
Ignore the #b, that's for another function.
This code parses the file only once, reading it line by line and looking for SPIRIT Barcode.
Once the line is found, it removes the newline, and splits it to get an array of header names.
CSV.parse can be called on the file directly. Since it has already been read until the header, it will start at the correct line :
require 'csv'
sep = ';'
File.open('test.csv'){|file|
header = file.find{|line| line.include?('SPIRIT Barcode')}.chomp.split(sep)
CSV.parse(file, headers: header, col_sep: sep).each do |row|
p row
end
}
With test.csv as :
Ignore me
Ignore me
SPIRIT Barcode; B; C
1; 2; 3
4; 5; 6
It outputs :
#<CSV::Row "SPIRIT Barcode":"1" " B":" 2" " C":" 3">
#<CSV::Row "SPIRIT Barcode":"4" " B":" 5" " C":" 6">
It shouldn't be hard to adapt it to your data.
This is untested, but it's the general idea for what you're trying to do:
require 'csv'
header_found = false
File.foreach('path/to/file.csv') do |li|
next unless header_found || li['SPIRIT Barcode']
header_found = true
# start processing CSV...
CSV.parse(li) do |row|
# use row here...
end
end
You'll have to figure out what to do when the header is found if you want to keep track of the values.
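One possible way to keep track of the values, sketched under some assumptions: the lines are held in a hypothetical in-memory array (File.foreach would supply them for a real file), the separator is a comma, and no quoted field spans more than one line, since each line is parsed on its own.

```ruby
require 'csv'

# Hypothetical file content: junk lines above the real header.
lines = ["Ignore me\n", "Ignore me\n", "SPIRIT Barcode,B,C\n", "1,2,3\n", "4,5,6\n"]

headers = nil
barcodes = []
lines.each do |line|
  if headers.nil?
    # Skip everything until the header line, then remember its column names.
    headers = CSV.parse_line(line) if line.include?('SPIRIT Barcode')
    next
  end
  fields = CSV.parse_line(line)
  next if fields.nil? || fields.empty? # skip blank lines
  # Pair each header with the value in the same position.
  row = headers.zip(fields).to_h
  barcodes << row['SPIRIT Barcode']
end
```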
Is there any way to tell the CSV object that a line break between quotes is not a row delimiter?
My CSV file is:
"a","b","c"
1,"some
text with line break",21
2,"blah",4
My code is:
CSV.foreach(file_path, headers: true) do |row|
puts row
end
I want it to return only two rows, but it returns three.
You're (wrongly) judging the number of rows by the number of printed lines. It returns two. Go figure:
[4] pry(main)> CSV.foreach('example.csv', headers: true).to_a
=> [
#<CSV::Row "a":"1" "b":"some\ntext with line break" "c":"21">,
#<CSV::Row "a":"2" "b":"blah" "c":"4">
]
Your code outputs three lines because you're printing the rows out and line break is printed as-is. That makes it look as if one row became two. Thinking the same way, I'd say that your source CSV contains 4 (four!) rows. And that isn't really true.
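To check this without relying on printed output, a small sketch like the following counts parsed rows and physical lines separately. It uses an in-memory copy of the example CSV rather than a file; the variable names are only illustrative.

```ruby
require 'csv'

text = <<~CSV_TEXT
  "a","b","c"
  1,"some
  text with line break",21
  2,"blah",4
CSV_TEXT

# Parsed data rows (the header row is not included).
rows = CSV.parse(text, headers: true).map(&:to_h)
# Physical lines in the raw text.
line_count = text.lines.size
```

Here rows holds two hashes while the raw text has four lines, because the quoted line break stays inside the "b" field.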
You can set headers to true and then display each row with row.to_hash. Example:
CSV.foreach("/home/akbar/text.csv", headers: true) do |row|
puts row.to_hash
end
The result is:
1.9.3p194 :034 > CSV.foreach("/home/akbar/text.csv", headers: true) do |x|
1.9.3p194 :035 > puts x.to_hash
1.9.3p194 :036?> end
{"a"=>"1", "b"=>"some\ntext with line break", "c"=>"21"}
{"a"=>"2", "b"=>"blah", "c"=>"4"}
For more information see "ruby-on-rails-import-data-from-a-csv-file".
For those having trouble reading a CSV file that contains a line break inside a row, try reading it with row_sep: "\r\n" (this assumes the rows themselves end in "\r\n"):
data = CSV.read('your_file.csv', row_sep: "\r\n")
I'm exporting a CSV from many different sources which makes it very hard to sort before putting it into the CSV.
csv = CSV.generate col_sep: '#' do |csv|
... adding a few columns here
end
Now, it would be awesome if I was able to sort this CSV by the 2nd column. Is that in any way possible?
If you're trying to sort before writing, it depends on your data structure, and I'd need to see a bit more of your code. When reading a CSV, you can convert each row to a hash and even sort by header name:
rows = []
CSV.foreach('mycsvfile.csv', headers: true) do |row|
rows << row.to_h
end
# sort_by returns a new array; rows itself is left unchanged
sorted_rows = rows.sort_by { |row| row['last_name'] }
Edit to use sort_by, thanks to max williams.
Here is how you would sort by column number:
rows = []
CSV.foreach('mycsvfile.csv', headers: true) do |row|
# collect each row as an array of values only
rows << row.to_h.values
end
# sort in place by the 2nd column
rows.sort_by! { |row| row[1] }
rows.each do |row|
# do stuff with your now sorted rows
end
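For the original case of generating a CSV, one option is to collect the rows in a plain array first, sort that, and only then hand the rows to CSV.generate. This is a rough sketch with made-up rows; it assumes the data fits in memory and that the second column should be compared as strings.

```ruby
require 'csv'

# Hypothetical rows gathered from several sources, in arbitrary order.
rows = [
  ['widget', 'pear', '3'],
  ['gadget', 'apple', '7'],
  ['doohickey', 'mango', '1']
]

# Sort by the 2nd column (index 1) before any CSV text is produced.
sorted = rows.sort_by { |row| row[1] }

csv_text = CSV.generate(col_sep: '#') do |csv|
  sorted.each { |row| csv << row }
end
```

If the second column holds numbers, sorting on row[1].to_i instead would avoid "10" being ordered before "9".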
I already have a CSV file, with content like
a1 a2 a3
1 2 3
4 5 6
5 8 2
Now, when I read a row, I want to add a flag to the CSV file, like
a1 a2 a3 flag
1 2 3 1
4 5 6 1
5 8 2
The flag 1 above means that the record has been inserted into the table.
So how can I add the flag to the CSV file?
Thanks In Advance
I came up with two ways to append a column(s) to an existing CSV file.
Method 1 late merges the new column by reading the file into an array of hashes, then appending the columns to the end of each row. This method can exhibit anomalies if run multiple times.
require 'csv'
filename = 'test.csv'
# Load the original CSV file
rows = CSV.read(filename, headers: true).collect do |row|
row.to_hash
end
# Original CSV column headers
column_names = rows.first.keys
# Array of the new column headers
additional_column_names = ['flag']
# Append new column name(s)
column_names += additional_column_names
s = CSV.generate do |csv|
csv << column_names
rows.each do |row|
# Original CSV values
values = row.values
# Array of the new column(s) of data to be appended to row
additional_values_for_row = ['1']
values += additional_values_for_row
csv << values
end
end
# Overwrite csv file
File.open(filename, 'w') { |file| file.write(s) }
Method 2 early merges the new column(s) into the row hash. The nicety of this method is it is more compact and avoids duplicate column names if run more than once. This method can also be used to change any existing values in the CSV.
require 'csv'
filename = 'test.csv'
# Load the original CSV file
rows = CSV.read(filename, headers: true).collect do |row|
hash = row.to_hash
# Merge additional data as a hash.
# merge returns a new hash, so keep the result.
hash = hash.merge('flag' => '0')
# BONUS: Change any existing data here too!
hash.merge('a1' => hash['a1'].to_i + 1)
end
# Extract column names from first row of data
column_names = rows.first.keys
txt = CSV.generate do |csv|
csv << column_names
rows.each do |row|
# Extract values for row of data
csv << row.values
end
end
# Overwrite csv file
File.open(filename, 'w') { |file| file.write(txt) }
You need to write a new CSV file with the additional column, and then replace the original file with the new one.
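A rough sketch of that write-then-replace idea, run inside a temporary directory so it touches no real files. The file names, the sample content, and the constant flag value '1' are all assumptions for illustration.

```ruby
require 'csv'
require 'tmpdir'
require 'fileutils'

result = Dir.mktmpdir do |dir|
  original = File.join(dir, 'test.csv') # hypothetical file name
  File.write(original, "a1,a2,a3\n1,2,3\n4,5,6\n")

  temp = "#{original}.tmp"
  CSV.open(temp, 'w') do |out|
    CSV.foreach(original, headers: true).with_index do |row, i|
      # Write the extended header once, before the first data row.
      out << (row.headers + ['flag']) if i.zero?
      out << (row.fields + ['1'])
    end
  end

  # Replace the original file with the new one.
  FileUtils.mv(temp, original)
  File.read(original)
end
```

Reading row by row keeps memory use low, but note that a file with a header and no data rows would come out empty in this sketch.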
Not sure if you can append a new column in the same file, but you can append a new row into your csv:
# 'a' opens the file in append mode; 'w' would truncate it first
CSV.open('your_csv.csv', 'a') do |csv|
customers.array.each do |row|
csv << row
end
end
Hope this helps.
I'm using ruby 1.9.2. My CSV file is as follows:
NAME, Id, No, Dept
Tom, 1, 12, CS
Hendry, 2, 35, EC
Bahamas, 3, 21, IT
Frank, 4, 61, EE
I want to print a specific row, say 'Tom'. I tried many ways, but I didn't get the exact result. The most recommended option is "Fastercsv". But it is applicable for my version. Also, I noticed that csv prints the fields column-wise. How do I print an entire row using csv in Rails? My Ruby code is as follows:
require 'csv'
csv_text = File.read('sampler.csv')
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
puts "#{row[:NAME]},#{row[:Id]},#{row[:No]},#{row[:Dept]}"
end
Use .find
csv = CSV.read('sampler.csv', headers: true)
puts csv.find {|row| row['NAME'] == 'Tom'} #=> returns first `row` that satisfies the block.
Here's another approach that keeps the code within the CSV API.
csv_table is a CSV::Table
row is a CSV::Row
row_with_specified_name is a CSV::Row.
csv_table = CSV.table("./tables/example.csv", converters: :all)
row_with_specified_name = csv_table.find do |row|
row.field(:name) == 'Bahamas'
end
p row_with_specified_name.to_csv.chomp #=> "Bahamas,3,21,IT"
FYI, CSV.table is just a shortcut for:
CSV.read( path, { headers: true,
converters: :numeric,
header_converters: :symbol }.merge(options) )
As per the docs.
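To illustrate the header_converters: :symbol part, here is a small sketch that parses an in-memory string (a trimmed, space-free variant of the sample data, which is an assumption) and looks a row up by symbol key. With this converter the header "NAME" becomes :name, so a key such as :NAME would not match.

```ruby
require 'csv'

# Hypothetical sample without spaces after the commas.
text = "NAME,Id,No,Dept\nTom,1,12,CS\nHendry,2,35,EC\n"

table = CSV.parse(text, headers: true, header_converters: :symbol)
tom = table.find { |row| row[:name] == 'Tom' }
# Turn the whole row back into a single CSV line.
line = tom.to_csv.chomp
```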
If you have a large CSV file and want to find an exact row, it will be much faster and far less memory-intensive to read one line at a time.
require 'csv'
csv = CSV.open('sampler.csv', 'r', headers: true)
while row = csv.shift
if row['NAME'] == 'Bahamas' # headers are matched exactly as written in the file
break
end
end
pp row
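A more compact variant of the same streaming idea, as a sketch: CSV.foreach without a block returns an enumerator, and find stops reading as soon as a row matches. The temporary file and its contents are made up so the example is self-contained.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical stand-in for sampler.csv.
file = Tempfile.new(['sampler', '.csv'])
file.write("NAME,Id,No,Dept\nTom,1,12,CS\nBahamas,3,21,IT\nFrank,4,61,EE\n")
file.close

# Rows are read one at a time; reading stops at the first match.
row = CSV.foreach(file.path, headers: true).find { |r| r['NAME'] == 'Bahamas' }
found = row.to_h
file.unlink
```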