Parse CSV text with accents - ruby-on-rails

Does anyone know how to parse text like this:
informaci\u00f3n del aplicante
so that it ends up like this?
información del aplicante
This is my method:
def translations_to_csv(translations, headers, to_file_performance=false, file_name = nil)
csv_output = ""
CSV::Writer.generate(csv_output, fs = "\t") do |csv|
if to_file_performance
csv_string = ""
CSV::Writer.generate(csv_string) do |csv_str|
csv_str << headers.collect { |item| item.upcase }
end
File.open(file_name,'a') {|f| f.write(csv_string) }
else
csv << headers.collect { |item| item.upcase }
end
translations.each do |data|
row = headers.collect { |item| data.first[item.to_sym] }
csv << row
end
end
csv_output
end
The CSV exports most text correctly, but some text with accents will look like the first example.
I'm using Ruby 1.8.7 and Rails 2.3.5
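One possible approach, assuming the exported text really contains the literal six-character sequence \u00f3 (a backslash, the letter u, and four hex digits) rather than mis-encoded bytes, is to decode those escapes before each value is written. The helper name decode_unicode_escapes is hypothetical. The sketch uses only String#gsub, String#hex and Array#pack, and it does not handle surrogate pairs.

```ruby
# Hypothetical helper: turns literal "\uXXXX" escape sequences into UTF-8 characters.
def decode_unicode_escapes(text)
  text.gsub(/\\u([0-9a-fA-F]{4})/) do
    # $1 holds the four hex digits; pack('U') encodes that code point as UTF-8.
    [$1.hex].pack('U')
  end
end

# Single quotes keep the backslash literal, like the text in the export.
puts decode_unicode_escapes('informaci\u00f3n del aplicante')
```

In the method above, this could be applied to each cell while building a row, for example by wrapping data.first[item.to_sym].to_s in a call to the helper.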

Related

CSV in RUBY custom string

I have a field, delivery_time, whose values come from this array:
DELIVERY_TIME = [
I18n.t("activerecord.attributes.order.none_delivery_time"),
"09:00~12:00",
"12:00~14:00",
"14:00~16:00",
"16:00~18:00",
"18:00~20:00",
"19:00~21:00",
"20:00~21:00",
].freeze
When I download the CSV, the value appears in this form:
"09:00~12:00"
but I want the download to use this form instead:
"0912"
How can I customize it?
My code:
def perform
CSV.generate(headers: true) do |csv|
csv << attributes
orders.each do |order|
csv << create_row(order)
end
end
end
def create_row(order)
row << order.delivery_time
end
AFAIU, you need to modify DELIVERY_TIME to fit your format. CSV is absolutely out of scope here. So to transform values, one should split by ~ and take the hour from the result.
DELIVERY_TIME = [
"09:00~12:00",
"12:00~14:00",
"14:00~16:00",
"16:00~18:00",
"18:00~20:00",
"19:00~21:00",
"20:00~21:00",
].freeze
DELIVERY_TIME.map { |s| s.split('~').map { |s| s[0...2] }.join }
#⇒ ["0912", "1214", "1416", "1618", "1820", "1921", "2021"]
A safer method would be to use DateTime.parse for this:
require 'date'
DELIVERY_TIME.map do |s|
s.split('~').map { |s| DateTime.parse(s).strftime("%H") }.join
end
#⇒ ["0912", "1214", "1416", "1618", "1820", "1921", "2021"]
It's not real clear what you're asking, but I'd probably start with something like this:
"09:00~12:00".scan(/\d{2}/).values_at(0, 2).join # => "0912"
Using that in some code:
"09:00~12:00".scan(/\d{2}/).values_at(0, 2).join # => "0912"
DELIVERY_TIME = [
'blah',
"09:00~12:00",
"12:00~14:00",
"14:00~16:00",
"16:00~18:00",
"18:00~20:00",
"19:00~21:00",
"20:00~21:00",
].freeze
ary = [] << DELIVERY_TIME.first
ary += DELIVERY_TIME[1..-1].map { |i|
i.scan(/\d{2}/).values_at(0, 2).join
}
# => ["blah", "0912", "1214", "1416", "1618", "1820", "1921", "2021"]
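To use this when building each CSV row, the conversion can be wrapped in a small helper that leaves anything that is not a time range (such as the translated "none" entry) untouched. This is a minimal sketch; the name compact_delivery_time is hypothetical, and it assumes values always look like "HH:MM~HH:MM" when they are time ranges.

```ruby
# Hypothetical helper: "09:00~12:00" becomes "0912"; any other value is returned unchanged.
def compact_delivery_time(value)
  match = /\A(\d{2}):\d{2}~(\d{2}):\d{2}\z/.match(value.to_s)
  match ? match[1] + match[2] : value
end

puts compact_delivery_time("09:00~12:00")
puts compact_delivery_time("blah")
```

create_row could then put compact_delivery_time(order.delivery_time) into the row instead of the raw value.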

Export arrays into xlsx using axlsx gem

I'm attempting to copy all the contents of a CSV file over to an excel workbook using the AXLSX gem. On a second sheet, I only want 2 of the columns copied over. Below is an example.
I tried the '.map' method but that didn't work.
require 'csv'
require 'Axlsx'
p = Axlsx::Package.new
wb = p.workbook
animals = CSV.read('test.csv', headers:true)
column = ['Animals', 'Name']
headers = Array.new
headers << "Animal"
headers << "Name"
headers << "Age"
headers << "State"
wb.add_worksheet(:name => 'Copy') do |sheet|
animals.each do |row|
headers.map { |col| sheet.add_row [row[col]] }
end
end
wb.add_worksheet(:name => 'Names') do |sheet|
animals.each do |row|
column.map { |col| sheet.add_row [row[col]] }
end
end
p.serialize 'Animals.xlsx'
CSV - But also desired output on XLSX
Output from my code
First read the file using IO.readlines, then split every line into cells (using Array#map and String#split).
This produces a nested array, something like:
[["Animal", "Name", "Age", "State"], ["Dog", "Rufus", "7", "CA"], ["Bird", "Magnus", "3", "FL"]]
Each subarray is a row of your table, and each item of a subarray is a cell.
If you need just the "Name" column on the second sheet, build a nested array like this:
[["Name"], ["Rufus"], ["Magnus"]]
That's all you need to generate XLSX file.
For example, say you have animals.csv with data separated by , and want to generate animals.xlsx:
require 'axlsx'
copy = File.readlines('animals.csv', chomp: true).map { |s| s.split(',') }
names = copy.map { |l| [l[1]] }
Axlsx::Package.new do |p|
p.workbook.add_worksheet(name: 'Animals') { |sheet| copy.each { |line| sheet.add_row(line) }}
p.workbook.add_worksheet(name: 'Names') { |sheet| names.each { |line| sheet.add_row(line) }}
p.serialize('animals.xlsx')
end
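The nested arrays can also be built with Ruby's csv library instead of String#split, which lets columns be picked by header name and copes with quoted fields. The sketch below only prepares the rows (it does not call axlsx); the inline sample data stands in for animals.csv, and the choice of the Animal and Name columns is an assumption based on the question.

```ruby
require 'csv'

# Sample data standing in for animals.csv (assumed contents).
csv_text = <<~CSV
  Animal,Name,Age,State
  Dog,Rufus,7,CA
  Bird,Magnus,3,FL
CSV

table = CSV.parse(csv_text, headers: true)

# Every column, header row first, for the full copy sheet.
copy = [table.headers] + table.map(&:fields)

# Only the selected columns, looked up by header name.
wanted = %w[Animal Name]
names = [wanted] + table.map { |row| row.values_at(*wanted) }

p copy
p names
```

Each inner array of copy or names can then be passed to sheet.add_row, one call per row.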

How to add new column in CSV string in rails

I want to add a new column and update existing values in a CSV response. Is there a simpler and better way to do the transformations below?
Input
id,name,country
1,John,US
2,Jack,UK
3,Sam,UK
I am using the following method to parse the CSV string and add a new column:
# Parse original CSV
rows = CSV.parse(csv_string, headers: true).collect do |row|
hash = row.to_hash
# Merge additional data as a hash.
hash.merge('email' => 'sample#gmail.com')
end
# Extract column names from first row of data
column_names = rows.first.keys
# Generate CSV after transformation of csv
csv_response = CSV.generate do |csv|
csv << column_names
rows.each do |row|
# Extract values for row of data
csv << row.values_at(*column_names)
end
end
I am using the following method to parse the CSV and update existing values:
name_hash = {"John" => "Johnny", "Jack" => "Jackie"}
# Parse original CSV
rows = CSV.parse(csv_string, headers: true).collect do |row|
hash = row.to_hash
hash['name'] = name_hash[hash['name']] if name_hash[hash['name']] != nil
hash
end
# Extract column names from first row of data
column_names = rows.first.keys
# Generate CSV after transformation of csv
csv_response = CSV.generate do |csv|
csv << column_names
rows.each do |row|
# Extract values for row of data
csv << row.values_at(*column_names)
end
end
One possible option given the following reference data to be used for modifying the table:
name_hash = {"John" => "Johnny", "Jack" => "Jackie"}
sample_email = {'email' => 'sample#gmail.com'}
Just store in rows the table converted to hash:
rows = CSV.parse(csv_string, headers: true).map(&:to_h)
#=> [{"id"=>"1", "name"=>"John", "country"=>"US"}, {"id"=>"2", "name"=>"Jack", "country"=>"UK"}, {"id"=>"3", "name"=>"Sam", "country"=>"UK"}]
Then modify each hash based on the reference data (I used Object#then, available since Ruby 2.6 as an alias of Object#yield_self, which was added in Ruby 2.5):
rows.each { |h| h.merge!(sample_email).then {|h| h['name'] = name_hash[h['name']] if name_hash[h['name']] } }
#=> [{"id"=>"1", "name"=>"Johnny", "country"=>"US", "email"=>"sample#gmail.com"}, {"id"=>"2", "name"=>"Jackie", "country"=>"UK", "email"=>"sample#gmail.com"}, {"id"=>"3", "name"=>"Sam", "country"=>"UK", "email"=>"sample#gmail.com"}]
Finally restore the table:
csv_response = CSV.generate(headers: rows.first.keys, write_headers: true) { |csv| rows.map(&:values).each { |v| csv << v } }
So you now have:
puts csv_response
# id,name,country,email
# 1,Johnny,US,sample#gmail.com
# 2,Jackie,UK,sample#gmail.com
# 3,Sam,UK,sample#gmail.com
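As an alternative sketch, both transformations can be done on the parsed CSV::Table itself, without converting rows to hashes and back. This relies on CSV::Row#[]= appending a field when the header does not exist yet, and on CSV::Table#to_csv taking its header line from the first row. The sample values are the ones from the question.

```ruby
require 'csv'

csv_string = "id,name,country\n1,John,US\n2,Jack,UK\n3,Sam,UK\n"
name_hash = { 'John' => 'Johnny', 'Jack' => 'Jackie' }

table = CSV.parse(csv_string, headers: true)
table.each do |row|
  # Replace the name when a mapping exists, otherwise keep the current one.
  row['name'] = name_hash.fetch(row['name'], row['name'])
  # Assigning to a header the row does not have yet appends a new field.
  row['email'] = 'sample#gmail.com'
end

csv_response = table.to_csv
puts csv_response
```

This keeps the column order of the input and puts the new column last.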

ruby nokogiri export csv columns

I want to export a CSV with 3 columns, one per attribute, but the result is not what I want: all my data ends up in a single column. What should I do?
require 'nokogiri'
require 'csv'
page = Nokogiri::HTML(open("index.html"))
fullName = page.css('li._5i_q').css("a[data-gt]").children.map {|name| name.text }
shortURL = page.css('li._5i_q').css("._5j0e a[data-hovercard]")
myID = shortURL.map {|element|
element["data-hovercard"][/id=([^&]*)/].gsub('id=', '')
}
messenger = shortURL.map {|element|
element["data-hovercard"][/id=([^&]*)/].gsub('id=', '') + "#gmail.com"
}
attributes = %w{ID FullName Messenger}
CSV.open('chatId.csv', 'w') do |csv|
csv << attributes
myID.each do |x|
csv << [x]
end
fullName.each do |y|
csv << [y]
end
messenger.each do |z|
csv << [z]
end
end
It's all my code
You have to write row by row when exporting data to CSV. Therefore, build an array of [x, y, z] rows with zip and append each row to the CSV. For example:
data = myID.zip(fullName, messenger)
CSV.open('chatId.csv', 'w') do |csv|
csv << attributes
data.each do |row|
csv << row
end
end
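The row-by-row idea can be tried without any scraping. The sketch below uses made-up sample arrays in place of the scraped ones and CSV.generate instead of a file, so the result can be inspected as a string.

```ruby
require 'csv'

# Made-up sample values standing in for the three scraped arrays.
ids = %w[101 102]
full_names = ['Ann Lee', 'Bob Ray']
messengers = %w[ann.example bob.example]

csv_text = CSV.generate do |csv|
  csv << %w[ID FullName Messenger]
  # zip pairs up the n-th element of each array, giving one three-cell row per person.
  ids.zip(full_names, messengers).each { |row| csv << row }
end

puts csv_text
```

Note that zip pads with nil when the later arrays are shorter than the first, which shows up as empty cells.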

Web Scraping using Ruby - If statement

I have built a web scraper. I need it to scrape the prices and bedrooms of a given neighborhood. Sometimes the span.first_detail_cell will return Furnished and the rest of the time it will return the price. I need to write something that can overlook the span.first_detail_cell if it is furnished and look in the next cell for the price. I think I need to write an if statement, but not sure of the parameters. Any help would be great!
require 'open-uri'
require 'nokogiri'
require 'csv'
url = "https://streeteasy.com/for-rent/bushwick"
page = Nokogiri::HTML(open(url))
page_numbers = []
page.css("nav.pagination span.page a").each do |line|
page_numbers << line.text
end
max_page = page_numbers.max
beds = []
price = []
max_page.to_i.times do |i|
url = "https://streeteasy.com/for-rent/bushwick?page=#{i+1}"
page = Nokogiri::HTML(open(url))
page.css('span.first_detail_cell').each do |line|
beds << line.text
end
page.css('span.price').each do |line|
price << line.text
end
end
CSV.open("bushwick_rentals.csv", "w") do |file|
file << ["Beds", "Price"]
beds.length.times do |i|
file << [beds[i], price[i]]
end
end
page.css('span.first_detail_cell').each do |line|
if line.text.include?("Furnished")
# do something here
else
beds << line.text
end
end
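One way to fill in the "Furnished" branch is to look at all detail cells of a listing and take the first one that is not the furnished label. The sketch below works on plain strings so it stays independent of the page markup; the helper name first_unfurnished_cell is hypothetical, and how the cell texts of one listing are collected from the page is left as an assumption.

```ruby
# Hypothetical helper: cells holds the detail-cell texts of one listing, in page order.
# Returns the first text that is not the "Furnished" label, or nil if there is none.
def first_unfurnished_cell(cells)
  cells.map(&:strip).find { |text| !text.include?('Furnished') }
end

p first_unfurnished_cell(['Furnished', '2 beds'])
p first_unfurnished_cell(['1 bed', '1 bath'])
```

Because the helper returns nil when every cell is a furnished label, the caller can still append a placeholder and keep the beds and price arrays the same length.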
