I'm trying to get FasterCSV set up so that, rather than handling each row as it is parsed, it loads the whole file into a multi-dimensional array.
CSV import file:
id, first name, last name, age
1, joe, smith, 11
2, jane, doe, 14
Save to array named people:
people[0][0] would equal id
people[2][1] would equal jane
This is what I currently have:
url = 'http://url.com/file.csv'
open(url) do |f|
  f.each_line do |line|
    FasterCSV.parse(line) do |row|
      row
    end
  end
end
Any help is appreciated.
Have you read the FasterCSV documentation?
If you did, you would know that the easiest way to do what you want is:
people = FasterCSV.read('http://url.com/file.csv')
Thanks EmFi, with your help I was able to come up with a solution.
This takes a remote url csv file and loads it into a multi-dimensional array, based on columns.
require 'rio'
require 'fastercsv'
url = 'http://remoteurl.com/file.csv'
people = FasterCSV.parse(rio(url).read)
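For anyone on a newer Ruby: FasterCSV became the standard library's CSV class in Ruby 1.9, so the same array-of-rows result can be sketched without extra gems. The inline csv_text below is only a stand-in for the downloaded file body (an assumption for illustration), and the spaces after the commas have been removed so the fields come back without leading whitespace.

```ruby
require 'csv'

# Stand-in for the body of the remote file (assumed sample, no spaces after commas).
csv_text = <<~DATA
  id,first name,last name,age
  1,joe,smith,11
  2,jane,doe,14
DATA

# CSV.parse without a block returns an array of rows, each row an array of strings.
people = CSV.parse(csv_text)

puts people[0][0]
puts people[2][1]
```

The first index selects the row (the header is row 0) and the second selects the column.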
You can use CsvMapper on top of FasterCSV
It's a lifesaver.
I'm going to preface this by saying that I'm still learning Ruby.
I'm writing a script to parse a .csv and identify possible duplicate records in the data-set.
I have a .csv file with headers, so I'm parsing the data so that I can access each row using a header title as such:
@contact_table = CSV.parse(File.read("app/data/file.csv"), headers: true)
# Prints all last names in table
puts @contact_table['last_name']
I'm trying to iterate over each row in the table and identify if the last name I'm currently iterating over is similar to the next last name, but I'm having trouble doing this. I guess the way I'm handling it is as if it's an array, but I checked the type and it's a CSV::Row.
example (this doesn't work):
@contact_table.each_with_index do |c, i|
  puts "first contact is #{c['last_name']}, second contact is #{c[i + 1]['last_name']}"
end
I realized this doesn't work like this because the table isn't an array, it's a CSV::Row like I previously mentioned. Is there any method that can achieve this? I'm really blanking right now.
My csv looks something like this:
id,first_name,last_name,company,email,address1,address2,zip,city,state_long,state,phone
1,Donalt,Canter,Gottlieb Group,dcanter0@nydailynews.com,9 Homewood Alley,,50335,Des Moines,Iowa,IA,515-601-4495
2,Daphene,McArthur,"West, Schimmel and Rath",dmcarthur1@twitter.com,43 Grover Parkway,,30311,Atlanta,Georgia,GA,770-271-7837
@contact_table should be a CSV::Table, which is a collection of CSV::Rows, so in this:
@contact_table.each_with_index do |c, i|
  ...
end
c is a CSV::Row. That's why c['last_name'] works. The problem is that here:
c[i + 1]['last_name']
you're looking at c (a single row) instead of @contact_table. If you said:
@contact_table[i + 1]['last_name']
then you'd get the next last name or, when c is the last row, an exception because @contact_table[i + 1] will be nil.
Also, inside the iteration, c is the current (or (i+1)th) row and won't always be the first.
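One way to sidestep the nil on the last row is to walk the table in overlapping pairs with each_cons(2), which works because CSV::Table is Enumerable. This is just a sketch: the inline csv_text is a trimmed stand-in for the file, and the third row is invented so that there is a repeated last name to look at.

```ruby
require 'csv'

# Trimmed sample; the third row is made up for illustration.
csv_text = <<~DATA
  id,first_name,last_name
  1,Donalt,Canter
  2,Daphene,McArthur
  3,Dan,McArthur
DATA

contact_table = CSV.parse(csv_text, headers: true)

# each_cons(2) yields every row together with the row that follows it,
# and stops before running off the end of the table.
pairs = contact_table.each_cons(2).map do |current, following|
  [current['last_name'], following['last_name']]
end

pairs.each do |current_name, next_name|
  puts "current contact is #{current_name}, next contact is #{next_name}"
end
```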
What is your use case for this? Seems like a school project?
I recommend foreach instead of parse (see this comparison). I would probably use a Set for this.
Create a Set outside of the scope of parsing the file (i.e., above the parsing code). Let's call it rows.
Call rows.include?(row) during each iteration while parsing the file
If true, then you know you have a duplicate
If false, then call rows.add(row) to add the new row to the set
You could also just fill your set with an individual value from a column that must be distinct (e.g., row.field(:some_column_name)), such as email or phone number, and do the same inclusion check for that.
(If this is for a real app, please don't do this. Use model validations instead.)
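The Set steps above could be sketched roughly like this, keyed on a single column that ought to be unique. The sample data is an assumption (the third row is invented to force a duplicate), and CSV.parse on an inline string is used only so the snippet runs without a file; CSV.foreach over a path would follow the same pattern.

```ruby
require 'csv'
require 'set'

# Assumed sample; row 3 deliberately repeats the email from row 1.
csv_text = <<~DATA
  id,last_name,email
  1,Canter,dcanter0@nydailynews.com
  2,McArthur,dmcarthur1@twitter.com
  3,Canter,dcanter0@nydailynews.com
DATA

seen = Set.new
duplicates = []

CSV.parse(csv_text, headers: true).each do |row|
  # Set#add? returns nil when the value is already in the set,
  # so it combines the include? check and the add in one call.
  duplicates << row['id'] unless seen.add?(row['email'])
end

puts "Duplicate row ids: #{duplicates.inspect}"
```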
I would use #read instead of #parse and do something like this:
require 'csv'
LASTNAME_INDEX = 2
data = CSV.read('data.csv')
data[1..-1].each_with_index do |row, index|
  puts "Contact number #{index + 1} has the following last name : #{row[LASTNAME_INDEX]}"
end
#~> Contact number 1 has the following last name : Canter
#~> Contact number 2 has the following last name : McArthur
I have one xlsx file with 3 sheets. I want to import the data from each sheet into a different table. Please help me with this.
Is it possible to convert it to csv?
If possible, convert this xlsx file to 3 separate csv files (e.g. field delimiter = , and text delimiter = ").
Then, open your rails console (or create a .rb script), read each file with the CSV class and save the data to the table. Skip the first line if you have a header (drop(1)).
Example:
require 'csv'
# Without a block, CSV.foreach returns an enumerator; drop(1) skips the header row.
CSV.foreach("sheet1.csv").drop(1).each do |row|
  YourTable.create!({
    field_a: row[0],
    field_b: row[1],
    field_c: row[2]
  })
end
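An alternative to skipping the header by hand is to pass headers: true, so each row can be read by column name instead of position. This is only a sketch: the column names and values are made up to match the example above, and the rows are collected into plain hashes here rather than passed to YourTable.create!, so that it runs outside Rails.

```ruby
require 'csv'

# Hypothetical sheet contents; field_a..field_c mirror the names used above.
csv_text = <<~DATA
  field_a,field_b,field_c
  1,2,3
  4,5,6
DATA

# With headers: true the first line becomes the column names and is not yielded as data.
records = CSV.parse(csv_text, headers: true).map do |row|
  { field_a: row['field_a'], field_b: row['field_b'], field_c: row['field_c'] }
end

records.each { |attributes| puts attributes.inspect }
```

In a Rails script, each hash could be handed to the model's create! call instead of being printed.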
PS: I don't know why people are downvoting this question. SO is supposed to be a place where programmers seek help. The downvoters could at least explain why they are downvoting.
I have to loop through an array and I'm not sure which row I'm going to start at since that varies:
def add_data
  @sheet = @workbook.create_worksheet
  responses.each do |response|
    # I want something like @sheet.rows << [response.id]
  end
end
I'm looking through the docs but all the examples given use a specified row index: http://spreadsheet.rubyforge.org/GUIDE_txt.html
How would I push an array of values to the next row in the file?
Just dug into the classes themselves and came up with a solution... the documentation kinda whomps:
@sheet.insert_row(@sheet.last_row_index + 1, ["hey", "cool"])
Arrays have always been my downfall in every language I've worked with, but I'm in a situation where I really need to create a dynamic array of multiple items in Rails (note - none of these are related to a model).
Briefly, each element of the array should hold 3 values: a word, its language, and a translation into English. For example, here's what I'd like to do:
myArray = Array.new
And then I'd like to push some values to the array (note - the actual content is taken from elsewhere - although not a model - and will need to be added via a loop, rather than hard coded as it is here):
myArray[0] = [["bonjour"], ["French"], ["hello"]]
myArray[1] = [["goddag"], ["Danish"], ["good day"]]
myArray[2] = [["Stuhl"], ["German"], ["chair"]]
I would like to create a loop to list each of the items on a single line, something like this:
<ul>
<li>bonjour is French for hello</li>
<li>goddag is Danish for good day</li>
<li>Stuhl is German for chair</li>
</ul>
However, I'm struggling to (a) work out how to push multiple values to a single array element and (b) how I would loop through and display the results.
Unfortunately, I'm not getting very far at all. I can't seem to work out how to push multiple values to a single array element (what normally happens is that the [] brackets get included in the output, which I obviously don't want - so it's possibly a notation error).
Should I be using a hash instead?
At the moment, I have three separate arrays, which is what I've always done but don't particularly like: one array to hold the original word, one to hold the language, and a final one to hold the translation. While it works, I'm sure there is a better approach, if only I could work it out!
Thanks!
Ok, let's say you have the words you'd like in a CSV file:
# words.csv
bonjour,French,hello
goddag,Danish,good day
stuhl,German,chair
Now in our program we can do the following:
words = []
File.foreach('words.csv') do |line|
  # chomp removes the newline at the end of the line
  # split(',') will split the line on commas and return an array of the values
  # We then push the array of values onto our words array
  words.push(line.chomp.split(','))
end
After this code is executed, the words array has three items in it, and each item is an array built from one line of our file.
words[0] # => ["bonjour", "French", "hello"]
words[1] # => ["goddag", "Danish", "good day"]
words[2] # => ["stuhl", "German", "chair"]
Now we want to display these items.
puts "<ul>"
words.each do |word|
  # word is an array, word[0], word[1] and word[2] are available
  puts "<li>#{word[0]} is #{word[1]} for #{word[2]}</li>"
end
puts "</ul>"
This gives the following output:
<ul>
<li>bonjour is French for hello</li>
<li>goddag is Danish for good day</li>
<li>stuhl is German for chair</li>
</ul>
Also, you didn't ask about it, but you can access part of a given array by using the following:
words[0][1] # => "French"
This is telling Ruby that you want to look at the first element of the words array (Ruby arrays are zero-based, so that is index 0). Ruby finds that element (["bonjour", "French", "hello"]) and sees that it's also an array. You then asked for the second item ([1]) of that array and Ruby returns the string "French".
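As a small variation on the indexing above, Ruby can also unpack each inner array straight into named block parameters, which saves remembering what word[0], word[1] and word[2] stand for. A sketch, with the words array written out literally instead of being read from words.csv:

```ruby
# Same data as the file above, written inline so the snippet is self-contained.
words = [
  ["bonjour", "French", "hello"],
  ["goddag", "Danish", "good day"],
  ["stuhl", "German", "chair"]
]

# Each three-element inner array is destructured into the three block parameters.
items = words.map do |word, language, translation|
  "<li>#{word} is #{language} for #{translation}</li>"
end

puts "<ul>", items, "</ul>"
```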
You mean something like this?
myArray.map{|s|"<li>#{[s[0],'is',s[1],'for',s[2]].join(" ")}</li>"}
Thanks for your help, guys! I managed to figure out a solution based on your advice.
For the benefit of anyone else who stumbles across this problem, here's my elided code. NB: I use three variables called text, language and translation, but I suppose you could replace these with a single array with three separate elements, as Jason suggests above.
In the Controller (content is being added via a loop):
#loop start
my_array.push(["#{text}", "#{language}", "#{translation}"])
#loop end
In the View:
<ul>
  <%# item[0] is the original text, item[1] is the language, item[2] is the translation %>
  <% my_array.each do |item| %>
    <li><%= item[0] %> is <%= item[1] %> for <%= item[2] %></li>
  <% end %>
</ul>
Thanks again!
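On the "should I be using a hash instead?" part of the question: an array of hashes is one possible shape for this, since the keys then document what each value is. A rough sketch, where the source list is a hard-coded stand-in for wherever the text, language and translation values really come from:

```ruby
# Stand-in for the real source of the three values (an assumption for illustration).
source = [
  ["bonjour", "French", "hello"],
  ["goddag", "Danish", "good day"],
  ["Stuhl", "German", "chair"]
]

my_array = []
source.each do |text, language, translation|
  # One hash per entry instead of a positional three-element array.
  my_array.push({ text: text, language: language, translation: translation })
end

lines = my_array.map do |item|
  "#{item[:text]} is #{item[:language]} for #{item[:translation]}"
end

puts lines
```

In the view, item[:text], item[:language] and item[:translation] would then replace item[0], item[1] and item[2].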
I have a simple 4-column Excel spreadsheet that matches universities to their ID codes for lookup purposes. The file is pretty big (300k).
I need to come up with a way to turn this data into a populated table in my Rails app. The catch is that this is a document that is updated now and then, so it can't just be a one-time solution. Ideally, it would be some sort of ruby script that would read the file and create the entries automatically so that when we get emailed a new version, we can just update it automatically. I'm on Heroku if that matters at all.
How can I accomplish something like this?
If you can, save the spreadsheet as CSV; there are much better gems for parsing CSV files than for parsing Excel spreadsheets. I found that an effective way of handling this kind of problem is to make a rake task that reads the CSV file and creates all the records as appropriate.
So for example, here's how to read all the lines from a file using the old, but still effective, FasterCSV gem:
require 'fastercsv'
data = FasterCSV.read('lib/tasks/data.csv')
columns = data.shift # Remove the header row and keep it as the list of column names
unique_column_index = -1 # The index of a column that's always unique per row in the spreadsheet
data.each do |row|
  r = Record.find_or_initialize_by_unique_column(row[unique_column_index])
  columns.each_with_index do |column_name, index|
    r[column_name] = row[index]
  end
  begin
    r.save!
  rescue => e
    Rails.logger.error("Failed to save #{r.inspect}: #{e.message}")
  end
end
It does kinda rely on you having a unique column in the original spreadsheet to go off though.
If you put that into a rake task, you can then wire it into your Capistrano deploy script, so it'll be run every time you deploy. The find_or_initialize call should ensure you don't get duplicate records.
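To show why keying on the unique column makes the import safe to re-run, here is a minimal sketch that can be tried outside Rails. It uses the standard library CSV class, a plain Hash standing in for the database table, and invented sample data (the code and name columns are hypothetical, since the real spreadsheet's columns aren't shown).

```ruby
require 'csv'

# Hypothetical sample data; the real file has four columns that are not shown.
csv_text = <<~DATA
  code,name
  U001,Example University
  U002,Sample College
DATA

# A Hash keyed by the unique column stands in for the database table.
store = {}

# Import twice to mimic receiving and loading the same file again.
2.times do
  CSV.parse(csv_text, headers: true).each do |row|
    record = (store[row['code']] ||= {}) # find or initialize by the unique column
    record.merge!(row.to_h)              # copy every column onto the record
  end
end

puts store.size
```

The second pass updates the existing entries rather than adding new ones, which is the same property find_or_initialize gives the rake task.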
Parsing newish Excel files isn't too much trouble using Hpricot. This will give you a two-dimensional array:
require 'hpricot'
doc = open("data.xlsx") { |f| Hpricot(f) }
rows = doc.search('row')
rows = rows[1..rows.length] # Skips the header row
rows = rows.map do |row|
  columns = []
  row.search('cell').each do |cell|
    # Excel stores cell indexes rather than blank cells
    next_index = (cell.attributes['ss:Index']) ? (cell.attributes['ss:Index'].to_i - 1) : columns.length
    columns[next_index] = cell.search('data').inner_html
  end
  columns
end