Ruby - checking if file is a CSV - ruby-on-rails

I have just wrote a code where I get a csv file passed in argument and treat it line by line ; so far, everything is okay. Now, I would like to secure my code by making sure that what we receive in argument is a .csv file.
I saw in the Ruby doc that it exist a == "--file" option but using it generate an error : the way I understood it, it seems this option only work for the txt files.
Is there a method specific that allowed to check if my file is a csv ? Here some of my code :
if ARGV.empty?
puts "j'ai rien reçu"
# option to check, don't work
elsif ARGV[0].shift == "--file"
# my code so far, whithout checking
else CSV.foreach(ARGV.shift) do |row|
etc, etc...

I think it is unpossible to make a real safe test without additional information.
Just some notes what you can do:
You get a filename in a variable filename.
First, check if it is a file:
File.exist?
Then you could check, if the encoding is correct:
raise "Wrong encoding" unless content.valid_encoding?
Has your csv always the same number of columns? And do you have only one liner?
This can be a possibility to make the next check:
content.each_line{|line|
return false if line.count(sep) < columns - 1
}
This check can be modified for your case, e.g. if you have always an exact number of rows.
In total you can define something like:
require 'csv'
#columns defines the expected numer of columns per line
def csv?(filename, sep: ';', columns: 3)
return false unless File.exist?(filename) #"No file"
content = File.read(filename, :encoding => 'utf-8')
return false unless content.valid_encoding? #"Wrong encoding"
content.each_line{|line|
return false if line.count(sep) < columns - 1
}
CSV.parse(content, :col_sep => sep)
end
if csv = csv?('test.csv')
csv.each do |row|
p row
end
end

You can use ruby-filemagic gem
gem install ruby-filemagic
Usage:
$ irb
irb(main):001:0> require 'filemagic'
=> true
irb(main):002:0> fm = FileMagic.new
=> #<FileMagic:0x7fd4afb0>
irb(main):003:0> fm.file('foo.zip')
=> "Zip archive data, at least v2.0 to extract"
irb(main):004:0>
https://github.com/ricardochimal/ruby-filemagic

Use File.extname() to check the origin file
File.extname("test.rb") #=> ".rb"

Related

Rails console vs a rake task: returning File.size is not consistent

I'm having a strange issue where when I check the File.size of a particular file in Rails console, it returns the correct size. However when I run the same code in a rake task, it returns 0. Here is the code in question (I've tidied it up a bit to help with readability):
def sum_close
daily_closed_tickets = Fst.sum_retrieve_closed_tickets
daily_closed_tickets.each do |ticket|
CSV.open("FILE_NAME_HERE", "w+", {force_quotes: false}) do |csv|
if (FileCopyReceipt.exists?(path: "#{ticket.attributes['TroubleTicketNumber']}_sum.txt"))
csv << ["GENERATE CSV WITH ATTRIBUTES HERE"]
files = Dir.glob("/var/www/html/harmonize/public/close/CLOSED_#{ticket.attributes['TroubleTicketNumber']}_sum.txt")
files.each do |f|
Rails.logger.info "File size (should return non-0): #{File.size(f)}" #returns 0, but not in Rails Console
Rails.logger.info "File size true or false, should be true: #{File.size(f) != 0}" #returns false, should return true
Rails.logger.info "Rails Environment: #{Rails.env}" #returns production
if(!FileCopyReceipt.exists?(path: f) && (File.size(f) != 0))
Rails.logger.info("SUM CLOSE, GOOD => FileUtils.cp_r occurred and FileCopyReceipt object created")
else
Rails.logger.info("SUM CLOSE, WARNING: => no data transfer occurred")
end
end
else
Rails.logger.info("SUM CLOSE => DID NOT make it into initial if ClosedDate.present? if block")
end
end
end
close_tickets.rake
task :close_tickets => :environment do
tickets = FstController.new
tickets.sum_close
tickets.dais_close
end
It is beyond me why this File.size comes back as 0 when this is run as a rake task. I thought it may be a environment issue, but that does not seem to be the case.
Any insight on the matter is appreciated.
The CSV.open block and everything being wrapped in there was causing issues. So I just made CSV generation it's own snippet instead of wrapping everything in there.
daily_closed_tickets.each do |ticket|
CSV.open("generate csv here.txt") do |csv|
#enter ticket.attributes here for the csv
end
#continue on with the rest of the code and File.size() works properly
end

ruby net-sftp read file line by line

I am using ruby 2.0.0 and rails 4.0.0. I have something similar to this:
require 'net/sftp'
sftp = Net::SFTP.start('ftp.app.com','username', :password => 'password')
sftp.file.open("/path/to/remote/file.csv", "r") do |f|
puts f.gets
end
This opens the file on the FTP site, but it only puts the first line of the csv file. I need to read this file row by row, preferably ignoring the header.
How can I read the file row by row, without downloading the file locally?
I solved this by doing this:
data = sftp.download!("/path/to/remote/file.csv").split(/\r\n/)
data.each do |line|
puts line
end
The proper answer for this would actually be to use the file.eof? value.
The code would look like:
require 'net/sftp'
sftp = Net::SFTP.start('ftp.app.com','username', :password => 'password')
sftp.file.open("/path/to/remote/file.csv", "r") do |f|
while !f.eof?
puts f.gets
end
end
Documentation can be found here
In my case something like this worked:
data = sftp.download!("/path/to/remote/file.csv").split(/\n/).map{ |e| e.split(/,/).map{ |x| x.gsub(/"/, "")} }
data.each do |line|
puts line
end
Will also split each row of the .csv into different array columns and remove any excess of "". Note this is for mac where line breaks are \n.

Ruby CSV File Parsing, Headers won't format?

My rb file reads:
require "csv"
puts "Program1 initialized."
contents = CSV.open "data.csv", headers: true
contents.each do |row|
name = row[4]
puts name
end
...but when i run it in ruby it wont load the program. it gives me the error message about the headers:
syntax error, unexpected ':', expecting $end
contents = CSV.open "data.csv", headers: true
so I'm trying to figure out, why won't ruby let me parse this file? I've tried using other csv files I have and it won't load, and gives me an error message. I'm trying just to get the beginning of the program going! I feel like it has to do with the headers. I've updated as much as I can, mind you I'm using ruby 1.8.7. I read somewhere else that I could try to run the program in irb but it didn't seem like it needed it. so yeah... thank you in advance!!!!
Since you are using this with Ruby 1.8.7, :headers => true won't work in this way.
The simplest way to ignore the headers and get your data is to shift the first row in the data, which would be the headers:
require 'csv'
contents = CSV.open("data.csv", 'r')
contents.shift
contents.each do |row|
name = row[4]
puts name
end
If you do want to use the syntax with headers in ruby 1.8, you would need to use FasterCSV, something similar to this:
require 'fastercsv'
FasterCSV.foreach("data.csv", :headers => true) do |fcsv_obj|
puts fcsv_obj['name']
end
(Refer this question for further read: Parse CSV file with header fields as attributes for each row)

ruby net_dav sample puts

I'm trying to put a file on a site with WEB_DAV. (a ruby gem)
When I follow the example, I get a nil exception
#### GEMS
require 'rubygems'
begin
gem "net_dav"
rescue LoadError
system("gem install net_dav")
Gem.clear_paths
end
require 'net/dav'
uri = URI('https://staging.web.mysite');
user = "dave"
pasw = "correcthorsebatterystaple"
dav = Net::DAV.new(uri, :curl => false)
dav.verify_server = false
dav.credentials(user, pasw)
cargo = ("testing.txt")
File.open(cargo, "rb") { |stream|
dav.put(urI.path +'/'+ cargo, stream, File.size(cargo))
}
when I run this I get
`digest_auth': can't convert nil into String (TypeError)
this relates to line 197 in my nav.rb file.
request_digest << ':' << params['nonce']
So what I'm wondering is what step did I not add?
Is there a reasonable example of the correct use of this gem? Something that does something that works would be sweet :)
SIDE QUESTION: Is this the correct gem to use to do web_DAV? It seems an old unmaintained gem, perhaps there's something used by more to accomplish the task?
Try referencing the hash with a symbol rather than a string, i.e.
request_digest << ':' << params[:nonce]
In a simple test
baz = "baz"
params = {:foo => "bar"}
baz << ':' << params['foo']
results in the same error as you're getting.

Illegal quoting on line using FasterCSV in ruby 1.8.7

I am facing "Illegal quoting" error when parse the content from SQL dump and the dump file is in the format of TXT with tab (\t) separator.
require 'rubygems'
require 'faster_csv'
begin
FasterCSV.foreach(excel_file, :quote_char => '"',:col_sep =>'\t', :row_sep =>:auto, :headers => :first_row) do |row|
col= row.to_s.split(/\t/)
if col[3]!="" or !col[3].empty?
color_value=col[3].to_s.capitalize
#Inser Color
color=Color.find_or_create_by_name(:name=>color_value)
elsif col[3].empty?
color_id= nil
end
end
rescue Exception => e
puts e
end
The program executed and run successfully but there is an invalid data present like
below (#font-face ...) mean execution terminated with error of "Illegal quoting on line 3.
ID Name code comments
1 white 234 good
2 Black 222
3 red 343 #font-face { font-family: "Verdana"; .....}
Can any one suggest me how to skip when invalid data occurs in column ?
Thanks in advance.
I'm not sure if this will solve the error you are seeing, but you need to use double quotes around escaped characters, e.g.:
:col_sep => "\t"
FasterCSV isn't very kind to badly formatted data.
I don't know that there is a solution for this.
However - if your example file doesn't actually contain any quoting using "
then perhaps just use a different quot_char (eg ')
You can use the ASCII code for the NULL character -- \0x00 -- as such:
FasterCSV.foreach(excel_file, :quote_char => '\0x00',:col_sep =>'\t', :row_sep =>:auto, :headers => :first_row) do |row|
...
end
You can find a chart of some ASCII chars here: http://www.bluesock.org/~willg/dev/ascii.html

Resources