ruby nokogiri export csv columns - ruby-on-rails

I want to export a CSV with three columns (ID, FullName, Messenger), but the result is not what I want: all of my data ends up in a single column. What should I do?
require 'nokogiri'
require 'csv'
page = Nokogiri::HTML(open("index.html"))
fullName = page.css('li._5i_q').css("a[data-gt]").children.map {|name| name.text }
shortURL = page.css('li._5i_q').css("._5j0e a[data-hovercard]")
myID = shortURL.map {|element|
element["data-hovercard"][/id=([^&]*)/].gsub('id=', '')
}
messenger = shortURL.map {|element|
element["data-hovercard"][/id=([^&]*)/].gsub('id=', '') + "#gmail.com"
}
attributes = %w{ID FullName Messenger}
CSV.open('chatId.csv', 'w') do |csv|
csv << attributes
myID.each do |x|
csv << [x]
end
fullName.each do |y|
csv << [y]
end
messenger.each do |z|
csv << [z]
end
end
That is all of my code.

When exporting data to CSV you have to write row by row. So build one [x, y, z] array per row and append each array with <<, which formats the row for you (there is no need to call to_csv or to add "\n" yourself). For example:
data = myID.zip(fullName, messenger)
CSV.open('chatId.csv', 'w') do |csv|
csv << attributes
data.each do |d|
csv << d
end
end
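The row-by-row idea above can be tried without any HTML input. This is a minimal sketch with made-up sample values (the ids, names and example.com addresses are placeholders, not data from the question), using CSV.generate so the result is a string instead of a file:

```ruby
require 'csv'

# Placeholder data standing in for the three scraped arrays.
ids        = %w[101 102 103]
names      = ['Ann Lee', 'Bob Ray', 'Cy Poe']
messengers = ids.map { |id| "#{id}@example.com" }

csv_text = CSV.generate do |csv|
  csv << %w[ID FullName Messenger]
  # zip pairs up the elements at the same index, giving one array per row.
  ids.zip(names, messengers).each { |row| csv << row }
end

puts csv_text
```

If the arrays can differ in length, zip pads the shorter ones with nil, which CSV writes as an empty field.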

Related

CSV in RUBY custom string

I have a field, delivery_time, whose possible values are defined in an array:
DELIVERY_TIME = [
I18n.t("activerecord.attributes.order.none_delivery_time"),
"09:00~12:00",
"12:00~14:00",
"14:00~16:00",
"16:00~18:00",
"18:00~20:00",
"19:00~21:00",
"20:00~21:00",
].freeze
When I download the CSV, the value appears in this form:
"09:00~12:00"
but I now want the download to contain this form instead:
"0912"
How can I customize it?
My code:
def perform
CSV.generate(headers: true) do |csv|
csv << attributes
orders.each do |order|
csv << create_row(order)
end
end
end
def create_row(order)
row << order.delivery_time
end
AFAIU, you need to transform the DELIVERY_TIME values to fit your format; CSV itself is out of scope here. To transform a value, split it on ~ and take the hour from each part.
DELIVERY_TIME = [
"09:00~12:00",
"12:00~14:00",
"14:00~16:00",
"16:00~18:00",
"18:00~20:00",
"19:00~21:00",
"20:00~21:00",
].freeze
DELIVERY_TIME.map { |s| s.split('~').map { |s| s[0...2] }.join }
#⇒ ["0912", "1214", "1416", "1618", "1820", "1921", "2021"]
A safer method would be to use DateTime.parse for this:
require 'time'
DELIVERY_TIME.map do |s|
s.split('~').map { |s| DateTime.parse(s).strftime("%H") }.join
end
#⇒ ["0912", "1214", "1416", "1618", "1820", "1921", "2021"]
It's not really clear what you're asking, but I'd probably start with something like this:
"09:00~12:00".scan(/\d{2}/).values_at(0, 2).join # => "0912"
Using that in some code:
"09:00~12:00".scan(/\d{2}/).values_at(0, 2).join # => "0912"
DELIVERY_TIME = [
'blah',
"09:00~12:00",
"12:00~14:00",
"14:00~16:00",
"16:00~18:00",
"18:00~20:00",
"19:00~21:00",
"20:00~21:00",
].freeze
ary = [] << DELIVERY_TIME.first
ary += DELIVERY_TIME[1..-1].map { |i|
i.scan(/\d{2}/).values_at(0, 2).join
}
# => ["blah", "0912", "1214", "1416", "1618", "1820", "1921", "2021"]
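To connect the transformations above back to the CSV export, here is a minimal sketch. The helper name compact_delivery_time, the Order struct and the "none" label are hypothetical stand-ins for the real model and the I18n label; only the "HH:MM~HH:MM" format comes from the question.

```ruby
require 'csv'

# Hypothetical helper: turns "09:00~12:00" into "0912" and leaves any
# value that does not contain exactly two HH:MM times unchanged.
def compact_delivery_time(value)
  hours = value.to_s.scan(/(\d{2}):\d{2}/).flatten
  hours.length == 2 ? hours.join : value
end

# Stand-in for the real order model.
Order = Struct.new(:id, :delivery_time)
orders = [Order.new(1, '09:00~12:00'), Order.new(2, 'none')]

csv_text = CSV.generate do |csv|
  csv << %w[id delivery_time]
  orders.each do |order|
    csv << [order.id, compact_delivery_time(order.delivery_time)]
  end
end

puts csv_text
```

Formatting at export time keeps DELIVERY_TIME itself untouched, so other code that relies on the original strings keeps working.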

How to add new column in CSV string in rails

I want to add a new column and update existing values in a CSV response. Is there a simpler and better way to do the transformations below?
Input
id,name,country
1,John,US
2,Jack,UK
3,Sam,UK
I am using the following code to parse the CSV string and add a new column:
# Parse original CSV
rows = CSV.parse(csv_string, headers: true).collect do |row|
hash = row.to_hash
# Merge additional data as a hash.
hash.merge('email' => 'sample#gmail.com')
end
# Extract column names from first row of data
column_names = rows.first.keys
# Generate CSV after transformation of csv
csv_response = CSV.generate do |csv|
csv << column_names
rows.each do |row|
# Extract values for row of data
csv << row.values_at(*column_names)
end
end
I am using the following code to parse the CSV and update existing values:
name_hash = {"John" => "Johnny", "Jack" => "Jackie"}
# Parse original CSV
rows = CSV.parse(csv_string, headers: true).collect do |row|
hash = row.to_hash
hash['name'] = name_hash[hash['name']] if name_hash[hash['name']] != nil
hash
end
# Extract column names from first row of data
column_names = rows.first.keys
# Generate CSV after transformation of csv
csv_response = CSV.generate do |csv|
csv << column_names
rows.each do |row|
# Extract values for row of data
csv << row.values_at(*column_names)
end
end
One possible option given the following reference data to be used for modifying the table:
name_hash = {"John" => "Johnny", "Jack" => "Jackie"}
sample_email = {'email' => 'sample#gmail.com'}
Just store in rows the table converted to hash:
rows = CSV.parse(csv_string, headers: true).map(&:to_h)
#=> [{"id"=>"1", "name"=>"John", "country"=>"US"}, {"id"=>"2", "name"=>"Jack", "country"=>"UK"}, {"id"=>"3", "name"=>"Sam", "country"=>"UK"}]
Then modify the hashes based on the reference data (I used Object#then, which Ruby 2.6 added as an alias of Object#yield_self from Ruby 2.5):
rows.each { |h| h.merge!(sample_email).then {|h| h['name'] = name_hash[h['name']] if name_hash[h['name']] } }
#=> [{"id"=>"1", "name"=>"Johnny", "country"=>"US", "email"=>"sample#gmail.com"}, {"id"=>"2", "name"=>"Jackie", "country"=>"UK", "email"=>"sample#gmail.com"}, {"id"=>"3", "name"=>"Sam", "country"=>"UK", "email"=>"sample#gmail.com"}]
Finally restore the table:
csv_response = CSV.generate(headers: rows.first.keys, write_headers: true) { |csv| rows.map(&:values).each { |v| csv << v } }
So you now have:
puts csv_response
# id,name,country,email
# 1,Johnny,US,sample#gmail.com
# 2,Jackie,UK,sample#gmail.com
# 3,Sam,UK,sample#gmail.com
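As an alternative sketch (not the approach from the answer above), the parsed CSV::Table can be edited in place and serialized again with to_csv, which avoids the round trip through hashes. It relies on CSV::Row#[]= appending a new field when the header does not exist yet; the input and reference data are the ones from the question.

```ruby
require 'csv'

csv_string = <<~INPUT
  id,name,country
  1,John,US
  2,Jack,UK
  3,Sam,UK
INPUT

name_hash = { 'John' => 'Johnny', 'Jack' => 'Jackie' }

table = CSV.parse(csv_string, headers: true)
table.each do |row|
  # Replace the name only when a mapping exists; otherwise keep it.
  row['name'] = name_hash.fetch(row['name'], row['name'])
  # Assigning to an unknown header appends a new column to this row.
  row['email'] = 'sample#gmail.com'
end

csv_response = table.to_csv
puts csv_response
```

Because the header row is taken from the first data row, the new column has to be added to every row, as the loop above does.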

Web Scraping using Ruby - If statement

I have built a web scraper. I need it to scrape the prices and bedrooms of a given neighborhood. Sometimes the span.first_detail_cell will return Furnished and the rest of the time it will return the price. I need to write something that can overlook the span.first_detail_cell if it is furnished and look in the next cell for the price. I think I need to write an if statement, but not sure of the parameters. Any help would be great!
require 'open-uri'
require 'nokogiri'
require 'csv'
url = "https://streeteasy.com/for-rent/bushwick"
page = Nokogiri::HTML(open(url))
page_numbers = []
page.css("nav.pagination span.page a").each do |line|
page_numbers << line.text
end
max_page = page_numbers.max
beds = []
price = []
max_page.to_i.times do |i|
url = "https://streeteasy.com/for-rent/bushwick?page=#{i+1}"
page = Nokogiri::HTML(open(url))
page.css('span.first_detail_cell').each do |line|
beds << line.text
end
page.css('span.price').each do |line|
price << line.text
end
end
CSV.open("bushwick_rentals.csv", "w") do |file|
file << ["Beds", "Price"]
beds.length.times do |i|
file << [beds[i], price[i]]
end
end
page.css('span.first_detail_cell').each do |line|
if line.text.include?("Furnished")
# do something here
else
beds << line.text
end
end
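One way to fill in the branch above is to look at all detail cells of a listing and take the first one that is not the "Furnished" marker. This is only a sketch of that selection logic on plain strings; the helper name first_relevant_cell is hypothetical, and the selector mentioned in the comment is an assumption because the page markup is not shown in the question.

```ruby
# Hypothetical helper: given the texts of one listing's detail cells in
# document order, return the first one that is not the "Furnished" marker.
def first_relevant_cell(cell_texts)
  cell_texts.map(&:strip).find do |text|
    !text.empty? && !text.include?('Furnished')
  end
end

# With Nokogiri the texts might come from something like
#   listing.css('span.detail_cell').map(&:text)
# where 'span.detail_cell' is an assumed selector, not taken from the page.
puts first_relevant_cell(['Furnished', '$2,500'])
puts first_relevant_cell(['$1,900', '2 beds'])
```

The helper returns nil when every cell is empty or marked "Furnished", so the caller can skip that listing or write an empty field.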

Parse CSV text with accents

Does anyone know how to parse text like this:
informaci\u00f3n del aplicante
so that it ends up like this?
información del aplicante
This is my method:
def translations_to_csv(translations, headers, to_file_performance=false, file_name = nil)
csv_output = ""
CSV::Writer.generate(csv_output, fs = "\t") do |csv|
if to_file_performance
csv_string = ""
CSV::Writer.generate(csv_string) do |csv_str|
csv_str << headers.collect { |item| item.upcase }
end
File.open(file_name,'a') {|f| f.write(csv_string) }
else
csv << headers.collect { |item| item.upcase }
end
translations.each do |data|
row = headers.collect { |item| data.first[item.to_sym] }
csv << row
end
end
csv_output
end
The CSV exports most text correctly, but some text with accents will look like the first example.
I'm using Ruby 1.8.7 and Rails 2.3.5
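If the strings really contain a literal backslash followed by u and four hex digits, one possible approach is to replace each escape with the character it encodes before writing the row. This is a sketch written for a current Ruby, not tested on 1.8.7; the helper name decode_unicode_escapes is hypothetical, and it does not handle surrogate pairs.

```ruby
# Hypothetical helper: replaces literal \uXXXX sequences with the
# corresponding UTF-8 character.
def decode_unicode_escapes(text)
  text.gsub(/\\u([0-9a-fA-F]{4})/) do
    [Regexp.last_match(1).hex].pack('U')
  end
end

# Single quotes keep the backslash literal, as in the exported text.
puts decode_unicode_escapes('informaci\u00f3n del aplicante')
```

Text without such escapes passes through unchanged, so the helper could be applied to every cell before it is appended to the CSV.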

Rails write a csv file column wise

I would like to export a dataset from my Rails application as a CSV file using Ruby's built-in csv library. Usually a CSV file is written row by row, as in the example below from my datasets_controller.rb:
require 'csv'
dataset = Dataset.find(6)
dataset_headers = dataset.datacolumns.collect { |dc| dc.columnheader }
csv_file = CSV.generate do |csv|
csv << dataset_headers
end
And now my question is if I could also write my csv files column wise like this?
require 'csv'
dataset_columns = Datacolumn.all(:conditions => ["dataset_id = ?", 6], :order => "columnnr ASC").uniq
csv_file = CSV.generate do |csv|
csv << "here put one after another all my data columns"
end
EDIT:
Based on Douglas's suggestion, I came up with the code below.
data_columns=Datacolumn.all(:conditions => ["dataset_id = ?", dataset.id], :order => "columnnr ASC").uniq
CSV.generate do |csv|
value=Array.new
data_columns.each do |dc|
value << dc.columnheader
dc.sheetcells.each do |sc|
if sc.datatype && sc.datatype.is_category? && sc.category
value << sc.category.short
elsif sc.datatype && sc.datatype.name.match(/^date/) && sc.accepted_value
value << sc.accepted_value.to_date.to_s
elsif sc.accepted_value
value << sc.accepted_value
else
value << sc.import_value
end
end
csv << value
value = Array.new
end
end
The output is not transposed for this case and looks like this:
height,10,2,<1,na
fullauthor,Fortune,(Siebold & Zucc.) Kuntze,Fortune,(Siebold & Zucc.) Kuntze
Year,1850,1891,1850,1891
fullname,Mahonia bealei,Toxicodendron sylvestre,Mahonia bealei,Toxicodendron sylvestre
But when I change the line which writes the csv to
csv << value.transpose
I get an error saying that a String could not be converted to an Array.
Does anybody have an idea how to fix this?
Any help with this would be appreciated.
Best, Claas
You could use Array#transpose, which will flip your rows to columns. A simple example:
> a = [['name', 'charles', 'dave'],['age', 24, 36],['height', 165, 193]]
=> [["name", "charles", "dave"], ["age", 24, 36], ["height", 165, 193]]
> a.transpose
=> [["name", "age", "height"], ["charles", 24, 165], ["dave", 36, 193]]
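Thus, assuming dataset_columns is an array of equal-length arrays (one inner array per column), transpose it and append each resulting row separately:
require 'csv'
dataset_columns = Datacolumn.all(:conditions => ["dataset_id = ?", 6], :order => "columnnr ASC").uniq
csv_file = CSV.generate do |csv|
dataset_columns.transpose.each { |row| csv << row }
end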
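The "could not convert a string to array" error in the question comes from calling transpose on a flat array of strings. A minimal sketch of the fix, using the sample output from the question as plain arrays (no ActiveRecord involved): collect every column into one array of arrays, pad shorter columns with nil because transpose needs equal lengths, then write each transposed row.

```ruby
require 'csv'

# Each inner array is one column: header first, then its values.
columns = [
  ['height', '10', '2', '<1', 'na'],
  ['fullauthor', 'Fortune', '(Siebold & Zucc.) Kuntze', 'Fortune', '(Siebold & Zucc.) Kuntze'],
  ['Year', '1850', '1891', '1850', '1891'],
  ['fullname', 'Mahonia bealei', 'Toxicodendron sylvestre', 'Mahonia bealei', 'Toxicodendron sylvestre']
]

# Array#transpose raises IndexError on ragged input, so pad with nil first.
max_len = columns.map(&:length).max
padded  = columns.map { |col| col + [nil] * (max_len - col.length) }

csv_text = CSV.generate do |csv|
  padded.transpose.each { |row| csv << row }
end

puts csv_text
```

In the code from the question this would mean pushing each finished value array into an outer array inside the loop and writing the CSV only after the loop has ended.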
