Ruby:copy two rows - ruby-on-rails

I am new to Ruby.
I have sample input text like this:
Message:
update attributes in file and commit version
----
Modified
I need to capture the row that follows the "Message:" tag. Note that the text can also start on the same line as "Message:", like
Message:update attributes in file and commit version
I've tried like this:
if line =~/Message/
But of course that doesn't look at the next row.
Can anyone help me capture the rows between the "Message" tag and the "---" line?
If you know of any examples, please post a link.
Update: the whole code
require 'csv'
data = []
File.foreach("new7.txt") do |line|
line.chomp!
if line =~ /Revision/
data.push [line]
elsif line =~ /Author/
if data.last and not data.last[1]
data.last[1] = line
else
data.push [nil, line]
end
elsif line=~/^Message:(.*)^-/m
if data.last and not data.last[2]
data.last[2] = line
else
data.push [nil, nil, line]
end
end
end
CSV.open('new1.csv', 'w') do |csv|
data.each do |record|
csv << record
end
end
Input file:
Revision: 37407
Author: imakarov
Date: 21 июня 2013 г. 10:23:28
Message:my infomation
dmitry name
Output csv file:

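You can use /^Message:(.*)^---/m as your regex. The /m modifier lets . match newlines, so the pattern can span several lines. This only works if the string being matched contains the whole block, not a single line read by File.foreach. See http://rubular.com/r/FhqiKx0XyI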
Update #1: Here's sample output from irb:
Peters-MacBook-Air-2:bot palfvin$ irb
1.9.3p194 :001 > line = "\nMessage:first-line\nsecond-line\n---\nthird-line"
=> "\nMessage:first-line\nsecond-line\n---\nthird-line"
1.9.3p194 :002 > line =~ /^Message:(.*)^-/m
=> 1
1.9.3p194 :003 > $1
=> "first-line\nsecond-line\n"
1.9.3p194 :004 >
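The same idea can be applied to a whole file by reading it into one string and scanning it. The following is a minimal sketch, not the original poster's code: the inline sample text and the lazy pattern /^Message:(.*?)^-+$/m are assumptions based on the input shown in the question, and in practice the text would come from something like File.read("new7.txt").

```ruby
# Assumed layout: each message starts at a "Message:" tag and runs until
# the next line made of dashes.
text = <<~LOG
  Revision: 37407
  Author: imakarov
  Message:
  update attributes in file and commit version
  ----
  Modified
  Revision: 37408
  Author: imakarov
  Message:my information
  dmitry name
  ----
LOG

# /m lets . match newlines; the lazy .*? stops at the first dashed line.
messages = text.scan(/^Message:(.*?)^-+$/m).flatten.map(&:strip)

messages.each { |message| puts message.inspect }
```

Because scan returns every match, this handles both the case where the text starts on the line after "Message:" and the case where it starts on the same line.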

Related

Unknown Attribute when importing from CSV

I am trying to do the following in IRB:
file = CSV.read('branches.csv', headers:true)
file.each do |branch|
Branch.create(attributes:branch.to_hash)
end
branches.csv contains one header entitled business_name which should map onto the attribute for Branch of the same name, but I see the error:
ActiveRecord::UnknownAttributeError: unknown attribute 'business_name' for Branch.
Strangely, doing Branch.create(business_name:'test') works just fine with no issues.
Update:
I think this has something to do with the encoding of the text in the UTF-8 CSV produced by Excel as suggested in the comments below. Not sure if this IRB gives any clues... but our header title business_name != "business_name"
2.3.3 :348 > file = CSV.read('x.csv', headers:true)
#<CSV::Table mode:col_or_row row_count:165>
2.3.3 :349 > puts file.first.to_hash.first.first
business_name
2.3.3 :350 > file = CSV.read('x.csv', headers:true)
#<CSV::Table mode:col_or_row row_count:165>
2.3.3 :351 > puts file.first.to_hash.first.first == "business_name"
false
Just skip the attributes: part. It is not needed at all, because branch.to_hash already returns exactly the format you describe in your last sentence.
file = CSV.read('branches.csv', headers:true)
file.each { |branch| Branch.create(branch.to_hash) }
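The update in the question points at encoding. One common cause, assumed here rather than confirmed in the thread, is a UTF-8 byte order mark (BOM) that Excel writes at the start of the file and that can end up glued to the first header name. A minimal sketch of reading such a file with the 'bom|utf-8' encoding, using a hypothetical temporary file in place of branches.csv:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample file: a UTF-8 CSV that starts with a BOM.
file = Tempfile.new(['branches', '.csv'])
file.write("\uFEFFbusiness_name\nAcme\n")
file.close

# 'bom|utf-8' tells Ruby to drop a leading BOM, if present, while reading.
table = CSV.read(file.path, headers: true, encoding: 'bom|utf-8')

clean_header = table.headers.first
first_name = table.first['business_name']

puts clean_header == 'business_name'
puts first_name
```

If the header compares equal to "business_name" after this change, the hashes from to_hash should have the keys that Branch.create expects.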

Ruby - checking if file is a CSV

I have just written code that takes a CSV file passed as an argument and processes it line by line; so far, everything works. Now I would like to make the code safer by checking that the argument really is a .csv file.
I saw in the Ruby docs that there is a == "--file" option, but using it raises an error; as far as I understand it, this option only works for text files.
Is there a specific method for checking whether my file is a CSV? Here is some of my code:
if ARGV.empty?
puts "I received nothing"
# option check, doesn't work
elsif ARGV[0].shift == "--file"
# my code so far, without checking
else CSV.foreach(ARGV.shift) do |row|
etc, etc...
I think it is impossible to make a truly safe test without additional information.
Here are some notes on what you can do:
You get a filename in a variable filename.
First, check if it is a file:
File.exist?
Then you could check whether the encoding is valid:
raise "Wrong encoding" unless content.valid_encoding?
Does your CSV always have the same number of columns? And does each record fit on a single line?
If so, that allows the next check:
content.each_line{|line|
return false if line.count(sep) < columns - 1
}
This check can be adapted to your case, e.g. if you always have an exact number of columns.
Putting it together, you can define something like:
require 'csv'
# columns defines the expected number of columns per line
def csv?(filename, sep: ';', columns: 3)
return false unless File.exist?(filename) #"No file"
content = File.read(filename, :encoding => 'utf-8')
return false unless content.valid_encoding? #"Wrong encoding"
content.each_line{|line|
return false if line.count(sep) < columns - 1
}
CSV.parse(content, :col_sep => sep)
end
if csv = csv?('test.csv')
csv.each do |row|
p row
end
end
You can use the ruby-filemagic gem:
gem install ruby-filemagic
Usage:
$ irb
irb(main):001:0> require 'filemagic'
=> true
irb(main):002:0> fm = FileMagic.new
=> #<FileMagic:0x7fd4afb0>
irb(main):003:0> fm.file('foo.zip')
=> "Zip archive data, at least v2.0 to extract"
irb(main):004:0>
https://github.com/ricardochimal/ruby-filemagic
Use File.extname() to check the file's extension:
File.extname("test.rb") #=> ".rb"
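As a sketch of how the extension check could guard the command-line argument: csv_extension? is a hypothetical helper name, not part of Ruby, and it only looks at the file name, not at the file's contents.

```ruby
# Hypothetical helper: true when the path ends in .csv (case-insensitive).
# This checks the name only; it says nothing about what is inside the file.
def csv_extension?(path)
  File.extname(path).downcase == '.csv'
end

path = ARGV.first
if path.nil?
  puts "no file given"
elsif !csv_extension?(path)
  puts "#{path} does not have a .csv extension"
else
  puts "#{path} looks like a CSV file by name"
end
```

A name check like this is cheap, so it fits well in front of the stricter content checks described above.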

In Ruby 1.9.3, check if the user input is a directory

#!/usr/bin/ruby
puts "Please enter the path-name of the directory:"
p = STDIN.gets
isdir = File.directory?(p)
puts "#{isdir} #{p}"
It always returns false, even though I know the user input is a directory. I think p is not working as a parameter, so it is saying that p is not a directory rather than checking the user input, for example "/usr/bin/". Any help?
With p = STDIN.gets, a "\n" gets appended to the input. Use gets.chomp instead. You also need File.expand_path if the path contains ~. Check the example below.
# My irb
1.9.3-p545 :002 > p = gets.chomp
~/.ssh
=> "~/.ssh"
1.9.3-p545 :003 > File.directory?(p)
=> false
1.9.3-p545 :004 > File.exists? File.expand_path(p)
=> true
The value of p is not exactly what you expect. It contains \n at the end:
# in my irb:
1.9.3p392 :010 > p = STDIN.gets
/home/
=> "/home/\n"
1.9.3p392 :011 > isdir = File.directory?(p)
=> false
1.9.3p392 :012 > isdir = File.directory?(p.strip)
=> true
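The strip method removes leading and trailing whitespace, including the trailing newline. The ActiveSupport multibyte version is described as:
Strips entire range of Unicode whitespace from the right and left of the string.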
Source: http://apidock.com/rails/ActiveSupport/Multibyte/Chars/strip
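Both answers can be combined into one small check. This is a minimal sketch; directory_input? is a hypothetical helper name, and it assumes the raw string comes straight from gets.

```ruby
# Hypothetical helper: clean up raw input from gets before testing it.
def directory_input?(raw)
  # chomp drops the trailing newline; expand_path resolves ~ and relative paths.
  path = File.expand_path(raw.chomp)
  File.directory?(path)
end

# Simulated user input, including the newline that gets would append.
puts directory_input?("#{Dir.pwd}\n")
```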

Ruby csv import trouble

I tried to import data from a CSV file into my Rails app, but something went wrong:
CSV::MalformedCSVError in ArticlesController#index
Unclosed quoted field on line 1.
My csv looks like this:
"Код";"№ по каталогу (артикул)";"Наименование товара";"Ед. изм.";"Цена опт.";"Доп.";"Остатки";"Класс";"Группа";"Бренд";"Блок."
2223;15-562-44;15-562-44 (27-B07-F) VW Polo 95-R ;шт ;37,430;;;Амортизаторы ;Амортизаторы BOGE ;;
10327;24-052-1;24-052-1(46-A27-0) LAND ROVER 84- F ;шт ;68,750;;;Амортизаторы ;Амортизаторы BOGE ;;
10328;24-053-1;24-053-1(46-A28-0) LAND ROVER 84- R ;шт ;68,750;;;Амортизаторы ;Амортизаторы BOGE ;;
Maybe this is because of the first line (which has no ;;)
My code looks like this:
def csv_import
require 'csv'
file = File.open("/#{Rails.public_path}/uploads/smallcsv.csv")
#csv = CSV.parse(file)
csv = CSV.open(file, "r:ISO-8859-15:UTF-8", {:col_sep => ";", :row_sep => ";;", :headers => :first_row})
file_path = "/#{Rails.public_path}/uploads/smallcsv.csv"
##parsed_file=CSV::Reader.parse(file_path)
csv.each do |row|
ename = row[2]
eprice = row[5]
eqnt = row[7]
esupp = row[10]
logger.warn(ename)
end
end
I'm running Ruby 1.9+ with the fastercsv gem.
I figured this out myself using "CSV - Unquoted fields do not allow \r or \n (line 2)".
The problem was with the first line, which does not end in ;; like the others, so setting :row_sep to :auto fixed it.
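A minimal sketch of that fix, assuming a semicolon-separated file with a quoted header line. The inline sample uses shortened, hypothetical English headers in place of the real file.

```ruby
require 'csv'

# Hypothetical sample in the same shape as the real file:
# quoted headers, unquoted data, semicolons between fields.
data = <<~CSV
  "Code";"Name";"Price"
  2223;15-562-44 VW Polo;37,430
  10327;24-052-1 LAND ROVER;68,750
CSV

# row_sep: :auto lets CSV detect the line ending instead of splitting on ";;".
rows = CSV.parse(data, col_sep: ';', row_sep: :auto, headers: :first_row)

names = rows.map { |row| row['Name'] }
puts names.inspect
```

With :row_sep set to ";;", the header line never ends, which would explain an error reported on line 1.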

How would I find similar lines in two CSV files?

Here is my code, but it takes forever on huge files:
require 'rubygems'
require "faster_csv"
fname1 =ARGV[0]
fname2 =ARGV[1]
if ARGV.size!=2
puts "Display common lines in the two files \n Usage : ruby user_in_both_files.rb <file1> <file2> "
exit 0
end
puts "loading the CSV files ..."
file1=FasterCSV.read(fname1, :headers => :first_row)
file2=FasterCSV.read(fname2, :headers => :first_row)
puts "CSV files loaded"
#puts file2[219808].to_s.strip.gsub(/\s+/,'')
lineN1=0
lineN2=0
# count how many common lines
similarLines=0
file1.each do |line1|
lineN1=lineN1+1
#compare line 1 to all line from file 2
lineN2=0
file2.each do |line2|
puts "file1:l#{lineN1}|file2:l#{lineN2}"
lineN2=lineN2+1
if ( line1.to_s.strip.gsub(/\s+/,'') == line2.to_s.strip.gsub(/\s+/,'') )
puts "file1:l#{line1}|file2:l#{line2}->#{line1}\n"
similarLines=similarLines+1
end
end
end
puts "#{similarLines} similar lines."
Ruby has set operations available with arrays:
a_ary = [1,2,3]
b_ary = [3,4,5]
a_ary & b_ary # => [3]
So, from that you should try:
puts "loading the CSV files ..."
file1 = FasterCSV.read(fname1, :headers => :first_row)
file2 = FasterCSV.read(fname2, :headers => :first_row)
puts "CSV files loaded"
common_lines = file1 & file2
puts common_lines.size
If you need to preprocess the arrays, do it as you load them:
file1 = FasterCSV.read(fname1, :headers => :first_row).map{ |l| l.to_s.strip.gsub(/\s+/, '') }
file2 = FasterCSV.read(fname2, :headers => :first_row).map{ |l| l.to_s.strip.gsub(/\s+/, '') }
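A self-contained sketch of this approach, with assumptions stated: it uses the standard csv library (which replaced FasterCSV from Ruby 1.9 on) and hypothetical inline data instead of the two files. Each row is turned into a normalized string first, because & works on arrays.

```ruby
require 'csv'

# Hypothetical inline data standing in for the two files.
csv1 = "id,name\n1, alice\n2,bob\n3,carol\n"
csv2 = "id,name\n2, bob\n3,carol\n4,dave\n"

# Normalize each row once: turn it back into a line and drop all whitespace.
def normalize(row)
  row.to_s.gsub(/\s+/, '')
end

lines1 = CSV.parse(csv1, headers: :first_row).map { |row| normalize(row) }
lines2 = CSV.parse(csv2, headers: :first_row).map { |row| normalize(row) }

# Array#& is hash-based, so this avoids comparing every pair of lines.
common = lines1 & lines2

puts "#{common.size} similar lines."
```

For real files, CSV.read(fname1, headers: :first_row) could replace the CSV.parse calls.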
You're gsubing File2 every time you loop through File1. I'd do that first, and then just compare the results of that.
Edit: something like this (untested):
file1lines = []
file1.each do |line1|
file1lines << line1.to_s.strip.gsub(/\s+/, '')
end
# Do the same for `file2lines`
file1lines.each do |line1|
lineN1=lineN1+1
#compare line 1 to all line from file 2
lineN2=0
file2lines.each do |line2|
puts "file1:l#{lineN1}|file2:l#{lineN2}"
lineN2=lineN2+1
if ( line1 == line2 )
puts "file1:l#{line1}|file2:l#{line2}->#{line1}\n"
similarLines=similarLines+1
end
end
end
I'd also get rid of all the putses in the loops unless you really need them. If the files are huge, that's probably slowing it down a noticeable amount.
