I am trying to read a form-uploaded .csv file. I am basing my attempt on several existing answers: In Ruby, how to read data column wise from a CSV file?, how to read a User uploaded file, without saving it to the database, and Rails - Can't import data from CSV file. But so far nothing has worked.
Here is my code:
def upload_file
file = Tempfile.new(params[:search_file])
csv_text = File.read(file)
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
puts row
end
render json: {success: true}
end
I am sure that the file is not nil. It contains 4 columns and 2 rows of simple text. However, my file value above comes out as an empty array, and the csv_text value is an empty string. I am very sure the file contains values.
I have also tried params[:search_field].read and that throws an error every time, saying "undefined method 'read'".
How can I simply read these values from the user-uploaded file? I am on Rails 5.1.6 and Ruby 2.3.
Edit:
I have tried some of the solutions below. However, the problem is that file.write doesn't write the contents of the file; it simply writes the name of the file (like myFileNameHere.csv) as a string to the temp file. The "ok testing row" message never prints to the terminal in the code below. Here is my code now:
file = Tempfile.new(['hello', '.csv'])
file.write(params[:search_file])
file.rewind
csv_text = file.read
csv = CSV.parse(csv_text, :headers => true)
csv.each do |row|
puts "ok testing row"
puts row
end
file.close
file.unlink # deletes the temp file
You are reading from an empty tempfile. When you pass params[:search_file] to Tempfile.new, its value becomes part of the new Tempfile's filename (like this "/tmp/#{params[:search_file]}.24722.0").
So when you do File.read(file), it tries to read a tempfile that has the params[:search_file] value in its filename but none of the form data inside it.
You should either skip the Tempfile part and load params[:search_file] with File.read(params[:search_file]), or fill the new Tempfile object with the content of params[:search_file]. (I would recommend the first.)
Tempfile.new('something') always returns an empty temporary file with 'something' in its basename.
First you create the tempfile (with the filename you want), then you can write the content from params[:search_file], rewind and read it.
Source : Class: Tempfile (Ruby 1.9.3)
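The create, write, rewind, read sequence described above can be sketched as follows. This is a minimal illustration rather than the asker's controller code: uploaded_content is a hypothetical stand-in for the raw text of the uploaded file, and parse_via_tempfile is a name made up for this example.

```ruby
require 'csv'
require 'tempfile'

# uploaded_content is a hypothetical stand-in for the raw text of the upload.
def parse_via_tempfile(uploaded_content)
  # The array form sets the basename prefix and the extension of the tempfile.
  file = Tempfile.new(['upload', '.csv'])
  begin
    file.write(uploaded_content) # the tempfile starts out empty
    file.rewind                  # move back to the start before reading
    CSV.parse(file.read, headers: true).map(&:to_h)
  ensure
    file.close
    file.unlink # deletes the temp file
  end
end

rows = parse_via_tempfile("name,age\nalice,30\nbob,25\n")
rows.each { |row| puts row.inspect }
```

If the content is already available as a string, the tempfile round trip is not needed and CSV.parse can be called on the string directly.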
I want to loop over a csv file using CSV.foreach, read the data, perform some operation with it, and write the result to the last column of that row, using the Row object.
So let's say I have a csv with data I need to save to a database using Rails ActiveRecord. I validate each record: if it is valid, I write true in the last column; if not, I write the errors.
Example csv:
id,title
1,some title
2,another title
3,yet another title
CSV.foreach(path, "r+", headers: true) do |row|
archive = Archive.new(
title: row["title"]
)
archive.save!
row["valid"] = true
rescue ActiveRecord::RecordInvalid => e
row["valid"] = archive.errors.full_messages.join(";")
end
When I run the code it reads the data, but it does not write anything to the csv. Is this possible?
Is it possible to write in the same csv file?
Using:
Ruby 3.0.4
The row variable in your iterator exists only in memory. You need to write the information back to the file like this:
new_csv = ["id,title,valid\n"]
CSV.foreach(path, 'r', headers: true) do |row|
  row["valid"] = 'foo'
  new_csv << row.to_s
end
File.open(path, 'w') do |f|
  # join the lines first; writing the array itself would dump its inspect output
  f.write new_csv.join
end
[EDIT] An earlier version passed 'r+' to foreach, which is not valid; the mode should be 'r'.
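The same read, modify, write-back idea can also be sketched with the whole table held in memory. This is only an illustration under assumptions: annotate_csv is a hypothetical helper name, the block stands in for whatever validation is needed (no ActiveRecord here), and the sample file is created in a temporary directory.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: loads the CSV, sets a "valid" column from the block's
# return value for every row, then overwrites the original file.
def annotate_csv(path)
  table = CSV.read(path, headers: true)
  table.each { |row| row['valid'] = yield(row) }
  File.write(path, table.to_csv)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'archives.csv')
  File.write(path, "id,title\n1,some title\n2,another title\n")
  # Stand-in validation: a row counts as valid when its title is present.
  annotate_csv(path) { |row| !row['title'].to_s.empty? }
  puts File.read(path)
end
```

Loading everything at once is fine for small files; for large files, streaming into a temporary file and replacing the original afterwards uses less memory.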
Maybe this is over-engineering things a bit. But I would do the following:
1. Read the original CSV file.
2. Create a temporary CSV file.
3. Insert the updated headers into the temporary CSV file.
4. Insert the updated records into the temporary CSV file.
5. Replace the original CSV file with the temporary CSV file.
csv_path = 'archives.csv'
input_csv = CSV.read(csv_path, headers: true)
input_headers = input_csv.headers
# using a UUID to prevent file conflicts
tmp_csv_path = "#{csv_path}.#{SecureRandom.uuid}.tmp"
output_headers = input_headers + %w[errors]
CSV.open(tmp_csv_path, 'w', write_headers: true, headers: output_headers) do |output_csv|
input_csv.each do |archive_data|
values = archive_data.values_at(*input_headers)
archive = Archive.new(archive_data.to_h)
archive.valid?
# error_messages is an empty string if there are no errors
error_messages = archive.errors.full_messages.join(';')
output_csv << values + [error_messages]
end
end
FileUtils.move(tmp_csv_path, csv_path)
I am new to RoR.
I want to dynamically add attributes from a csv file, so that my code can read any csv file and build the db (i.e. convert any CSV file into Ruby objects).
I was using the below code
csv_data = File.read('myData.csv')
csv = CSV.parse(csv_data, :headers => true, :header_converters => :symbol)
csv.each do |row|
MyModel.create!(row.to_hash)
end
However it will fail for the following example
myData.csv
Name,id
foo,1
bar,10
myData2.csv
Name,value
foo,1
bar,10
It results in an error for myData2 because value is not an attribute of MyModel:
unknown attribute 'value' for MyModel.
I have thought about using send(:attrAccessor, name), but I was not sure how to integrate it when reading from the csv. Any ideas?
You are doing it properly but you can also bulk upload the records
csv_data =
CSV.read("#{Rails.root}/myData.csv",
headers: true,
header_converters: :symbol
).map(&:to_hash)
MyModel.create(csv_data)
NOTE: If the data is going to be same you can use seeds.rb
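One way to sidestep the unknown attribute error from the question is to drop CSV columns the model does not know about before creating records. The sketch below is an assumption-laden illustration, not part of the answer above: KNOWN_ATTRIBUTES is a hypothetical hard-coded list and permitted_rows is a made-up helper name. In a Rails app the list could presumably be derived from MyModel.column_names instead.

```ruby
require 'csv'

# Hypothetical list standing in for the attributes the model accepts.
KNOWN_ATTRIBUTES = %i[name id].freeze

# Parses the CSV text and keeps only the keys listed in `known`.
def permitted_rows(csv_text, known = KNOWN_ATTRIBUTES)
  CSV.parse(csv_text, headers: true, header_converters: :symbol).map do |row|
    row.to_h.select { |key, _value| known.include?(key) }
  end
end

# The "value" column is not a known attribute, so it is dropped.
permitted_rows("Name,value\nfoo,1\nbar,10\n").each { |attrs| puts attrs.inspect }
```

Each resulting hash could then be passed to MyModel.create. Note that this silently ignores unknown columns rather than adding them as new attributes.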
I have a file, b.xls, exported from Excel, and I need to import it into my Rails app.
I have tried to open it
file = File.read(Rails.root.to_s+'/b.xls')
I have got this
file.encoding => #Encoding:UTF-8
I have few questions:
How can I open it without these symbols (in normal language)?
How can I convert this file to a hash?
The file is pretty large, about 5k lines.
You first need an array of all the rows; then you can convert it to a hash if you like.
I would recommend using the batch_factory gem.
The gem is very simple and relies on the roo gem under the hood.
Here is the code example
require 'batch_factory'
factory = BatchFactory.from_file(
Rails.root.join('b.xlsx'),
keys: [:column1, :column2, ..., :what_ever_column_name]
)
Then you can do
factory.each do |row|
puts row[:column1]
end
You can also omit the keys. Then batch_factory will automatically fetch the headers from the first row, but your keys will be in Russian. For example:
factory.each do |row|
puts row['Товар']
end
If you want a hash with the product name as the key, you can do
factory.inject({}) do |hash, row|
hash.merge(row['Товар'] => row)
end
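To make the last step concrete without the gem, here is a sketch of the same keyed-hash construction on a plain array of hashes. The sample rows and the 'price' key are invented purely for illustration; only the 'Товар' key comes from the answer above.

```ruby
# Hypothetical sample rows, standing in for what iterating the factory yields.
rows = [
  { 'Товар' => 'tea',    'price' => 100 },
  { 'Товар' => 'coffee', 'price' => 250 }
]

# Build a hash keyed by product name; each_with_object mutates a single hash
# instead of allocating a new merged hash per row, which matters for
# thousands of lines.
by_product = rows.each_with_object({}) do |row, hash|
  hash[row['Товар']] = row
end

puts by_product['tea'].inspect
```

If two rows share the same product name, the later row replaces the earlier one, which is also how the merge-based version behaves.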
I am trying to get data out of pdf files, convert them to CSV, then organize into one table.
A sample pdf can be found here
https://www.ttb.gov/statistics/2011/201101wine.pdf
It's data on US wine production. So far, I have been able to get the PDF files and convert them into CSV.
Here is the CSV file that has been converted from PDF:
https://github.com/jjangmes/wine_data/blob/master/csv_data/201101wine.txt
However, when I try to find data by row, it's not working.
require 'csv'
csv_in = CSV.read('201001wine.txt', :row_sep => :auto, :col_sep => ";")
csv_in.each do |line|
puts line
end
When I print line[0], all of the data is printed, so it looks like the entire file is being put into the first field of each row.
line will extract all the data.
line[0] will extract all the data with space in between lines.
line[1] gives the error "/bin/bash: shell_session_update: command not found"
How can I correctly divide up the data so I can parse them by row?
This is really messy data with no headings or IDs, so I think the best approach is to get the data into csv and find the values I want by looking up the right row number.
Though not all data have the same number of rows, most do. So I thought that'd be the best way for now.
Thanks!
Edit 1:
Here is the code that I have to scrape and get the data.
require 'mechanize'
require 'docsplit'
require 'byebug'
require 'csv'
def pdf_to_text(pdf_filename)
extracted_text = Docsplit.extract_text([pdf_filename], ocr: false, col_sep: ";", output: 'csv_data')
extracted_text
end
def save_by_year(starting, ending)
agent = Mechanize.new{|a| a.ssl_version, a.verify_mode = 'TLSv1', OpenSSL::SSL::VERIFY_NONE}
agent.get('https://www.ttb.gov')
(starting..ending).each do |year|
year_page = agent.get("/statistics/#{year}winestats.shtml")
(1..12).each do |month|
month_trail = '%02d' % month
link = year_page.links_with(:href => "20#{year}/20#{year}#{month_trail}wine.pdf").first
page = agent.click(link)
File.open(page.filename.gsub(" /","_"), 'w+b') do |file|
file << page.body.strip
end
pdf_to_text("20#{year}#{month_trail}wine.pdf")
end
end
end
After converting, I try to access the data by opening the text file and reading it row by row.
I am using Ruby 1.8.7 and Rails 2.3.8. I want to parse data from a tab-separated TXT dump file.
This TXT dump contains some CSS properties and appears to have some invalid data.
When I run my code using the FasterCSV gem:
FasterCSV.foreach(txt_file, :quote_char => '"',:col_sep =>'\t', :row_sep =>:auto, :headers => :first_row) do |row|
col= row.to_s.split(/\t/)
puts col[15]
end
the console shows the error "Illegal quoting on line 38." Can anyone suggest how to skip the rows that have invalid data and continue loading the remaining rows?
Here's one way to do it. We go to a lower level, using shift to parse each row, and silence the MalformedCSVError exception, continuing with the next iteration. The problem with this is that the loop doesn't look so nice. If anyone can improve this, you're welcome to edit the code.
FasterCSV.open(filename, :quote_char => '"', :col_sep => "\t", :headers => true) do |csv|
row = true
while row
begin
row = csv.shift
break unless row
# Do things with the row here...
rescue FasterCSV::MalformedCSVError
next
end
end
end
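On newer Rubies, where FasterCSV became the standard CSV library, a similar skip-the-bad-rows idea can be sketched by parsing one line at a time. This is an illustrative variant under assumptions, not the code above: parse_tsv_skipping_bad_rows is a hypothetical helper name, and parsing line by line assumes that no field contains an embedded newline.

```ruby
require 'csv'

# Hypothetical helper: returns the parsed rows plus the line numbers that
# were skipped because they could not be parsed.
def parse_tsv_skipping_bad_rows(text)
  rows = []
  skipped = []
  text.each_line.with_index(1) do |line, line_number|
    begin
      row = CSV.parse_line(line, col_sep: "\t")
      rows << row unless row.nil?
    rescue CSV::MalformedCSVError
      skipped << line_number # remember which lines were dropped
    end
  end
  [rows, skipped]
end

# The third line has a stray quote inside an unquoted field.
sample = "id\tname\n1\tfoo\n2\tba\"d\n3\tbaz\n"
rows, skipped = parse_tsv_skipping_bad_rows(sample)
puts rows.inspect
puts "skipped lines: #{skipped.inspect}"
```

Recording the skipped line numbers makes it possible to report or inspect the rejected rows afterwards instead of losing them silently.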
Just read the file as a regular one (not with FasterCSV), split each line by \t like you do now, and it should work.
So the problem is that TSV files don't have a quote character. The specification simply specifies that you aren't allowed to have tabs in the data.
The CSV library doesn't really support this use case. I've worked around it by specifying a quote character that I know won't appear in my data. For example
CSV.foreach(txt_file, :quote_char => '☎', :col_sep => "\t") do |row|
puts row[15]
end
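A small self-contained sketch of this workaround on an in-memory string is shown below. The sample data is invented for illustration, and parse_tsv_ignoring_quotes is a hypothetical helper name; the '☎' character is assumed never to occur in the data.

```ruby
require 'csv'

# Hypothetical helper: parses tab-separated text while treating the double
# quote as an ordinary character, by choosing a quote character that is
# assumed never to occur in the data.
def parse_tsv_ignoring_quotes(text)
  CSV.parse(text, quote_char: '☎', col_sep: "\t")
end

# Invented sample: the second row contains double quotes inside a field.
sample = "name\tnote\nfoo\tsay \"hi\"\n"
parse_tsv_ignoring_quotes(sample).each { |row| puts row.inspect }
```

The catch is that a field containing the chosen character would be misparsed, so it should be picked with the actual data in mind.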