How do I display and extract data from a feed URL?
I only want to import/display the records that have a category_id of 10.
This is the feed url:
http://www.euroads.dk/system/api.php?username=x&password=x&function=campaign_feed&eatrackid=13614&version=5
The format of the feed is:
campaignid;;advertid;;title;;startdate;;enddate;;amountleft;;price;;percent;;campaigntype;;targetage;;targetsex;;category;;category_id;;cpc;;advert_type;;advert_title;;bannerwidth;;bannerheight;;textlink_length;;textlink_text;;advert_url;;advert_image;;advert_code;;campaign_teaser;;reward/cashback;;SEM;;SEM restrictions
Here is a sample of the feed:
campaignid;;advertid;;title;;startdate;;enddate;;amountleft;;price;;percent;;campaigntype;;targetage;;targetsex;;category;;category_id;;cpc;;advert_type;;advert_title;;bannerwidth;;bannerheight;;textlink_length;;textlink_text;;advert_url;;advert_image;;advert_code;;campaign_teaser;;reward/cashback;;SEM;;SEM restrictions

2603;;377553;;MP3 afspiller;;2010-07-21;;2011-12-31;;-1;;67,00;;;;Lead kampagne;;Over 18;;Alle;;Elektronik;Musik, film & spil;;7,13;;0,97;;Banner;;;;930;;180;;0;;;;http://tracking.euroads.dk/system/tracking.php?sid=1&cpid=2603&adid=377553&acid=4123&eatrackid=13614;;http://banner.euroads.dk/banner/1/2603/banner_21153.gif;;;;http://banner.euroads.dk/banner/1/2603/teaserbanner_1617.gif;;Allowed;;
The data format looks like a variation on CSV, if ';;' is used as a column separator. Based on that:
require 'csv'
CSV.parse(data, :col_sep => ';;') do |csv|
  # do something with each record
end
data will be the content you receive from the feed.
Inside the loop, csv will be an array containing one record's fields. The first pass through the loop yields the headers; subsequent passes yield the data records.
Sometimes you'll see ';;;;', which means there's an empty field. For instance, field;;;;field converts to ['field', nil, 'field']. You'll need to decide what to do with nil fields; I'd suggest mapping them to empty strings ('').
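Putting that together, here's a minimal sketch of filtering for category_id 10. The shortened header and the two sample rows are hypothetical; in practice data would be the full response body fetched from the feed URL.

```ruby
require 'csv'

# Hypothetical, shortened sample in the feed's ';;' format; in practice
# `data` would be the response body fetched from the feed URL.
data = <<~FEED
  campaignid;;title;;category_id
  2603;;MP3 afspiller;;7
  2604;;Some campaign;;10
FEED

header = nil
wanted = []
CSV.parse(data, :col_sep => ';;') do |row|
  # Map nil fields (produced by ';;;;') to empty strings.
  row = row.map { |field| field || '' }
  if header.nil?
    header = row
  else
    record = header.zip(row).to_h   # e.g. {"campaignid" => "2604", ...}
    wanted << record if record['category_id'] == '10'
  end
end

puts wanted.map { |r| r['title'] }.inspect
```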
Related
I have a piece of code in Ruby which essentially adds multiple lines into a csv through the use of
csv_out << listX
I have both a header that I supply in the **options and regular data.
And I'm having a problem when I try to view the CSV: all the values end up in one row, and it looks like no software recognizes '\n' as a line separator.
Input example:
Make, Year, Mileage\n,Ford,2019,10000\nAudi, 2000, 100000
Output dimensions:
8x1 table
Desired dimensions:
3x3 table
Any idea how to get around that? Either by replacing '\n' with something else, or by using something other than CSV.generate.
csv = CSV.generate(encoding: 'UTF-8') do |csv_out|
  csv_out << headers
  data.each do |row|
    csv_out << row.values
  end
end
The problem seems to be the data.each part. Assuming that data holds the string you have posted, this loop is executed only once, and the string is written into a single row.
You have to loop over the individual pieces of data, for instance with
data.split("\n").each
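A minimal sketch of that fix, assuming data is the single string from the question (stray spaces around values are cleaned with strip):

```ruby
require 'csv'

# Assumes `data` is the one-line string from the question.
data = "Make, Year, Mileage\nFord,2019,10000\nAudi, 2000, 100000"

csv = CSV.generate(encoding: 'UTF-8') do |csv_out|
  data.split("\n").each do |line|
    # Split each line into fields and strip stray spaces.
    csv_out << line.split(',').map(&:strip)
  end
end

puts csv
```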
I'm going to preface that I'm still learning ruby.
I'm writing a script to parse a .csv and identify possible duplicate records in the data-set.
I have a .csv file with headers, so I'm parsing the data so that I can access each row using a header title as such:
#contact_table = CSV.parse(File.read("app/data/file.csv"), headers: true)
# Prints all last names in table
puts contact_table['last_name']
I'm trying to iterate over each row in the table and check whether the last name I'm currently on is similar to the next row's last name, but I'm having trouble doing this. I guess I'm handling it as if it were an array, but when I checked the type, it's a CSV::Row.
example (this doesn't work):
#contact_table.each_with_index do |c, i|
  puts "first contact is #{c['last_name']}, second contact is #{c[i + 1]['last_name']}"
end
I realized this doesn't work because each element isn't an array; it's a CSV::Row, like I previously mentioned. Is there any method that can achieve this? I'm really blanking right now.
My csv looks something like this:
id,first_name,last_name,company,email,address1,address2,zip,city,state_long,state,phone
1,Donalt,Canter,Gottlieb Group,dcanter0#nydailynews.com,9 Homewood Alley,,50335,Des Moines,Iowa,IA,515-601-4495
2,Daphene,McArthur,"West, Schimmel and Rath",dmcarthur1#twitter.com,43 Grover Parkway,,30311,Atlanta,Georgia,GA,770-271-7837
#contact_table should be a CSV::Table, which is a collection of CSV::Rows, so in this:
#contact_table.each_with_index do |c, i|
...
end
c is a CSV::Row. That's why c['last_name'] works. The problem is that here:
c[i + 1]['last_name']
you're looking at c (a single row) instead of #contact_table, if you said:
#contact_table[i + 1]['last_name']
then you'd get the next last name or, when c is the last row, an exception because #contact_table[i+1] will be nil.
Also, inside the iteration, c is the current (or (i+1)th) row and won't always be the first.
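If you only ever need the current row together with the next one, each_cons(2) sidesteps the off-by-one and the nil access on the last row entirely. A sketch with inline sample data (hypothetical values in the question's header layout):

```ruby
require 'csv'

# Hypothetical sample mirroring the question's header layout.
data = <<~CSV
  id,first_name,last_name
  1,Donalt,Canter
  2,Daphene,McArthur
  3,Dora,McArthur
CSV

contact_table = CSV.parse(data, headers: true)

# each_cons(2) yields every row paired with the one after it,
# so the loop never reads past the end of the table.
duplicates = []
contact_table.each_cons(2) do |current, following|
  duplicates << current['last_name'] if current['last_name'] == following['last_name']
end

puts duplicates.inspect
```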
What is your use case for this? Seems like a school project?
I recommend CSV.foreach instead of CSV.parse; foreach streams the file row by row instead of loading it all into memory at once. I would probably use a Set for this.
Create a Set outside of the scope of parsing the file (i.e., above the parsing code). Let's call it rows.
Call rows.include?(row) during each iteration while parsing the file
If true, then you know you have a duplicate
If false, then call rows.add(row) to add the new row to the set
You could also just fill your set with an individual value from a column that must be distinct (e.g., row.field(:some_column_name)), such as email or phone number, and do the same inclusion check for that.
(If this is for a real app, please don't do this. Use model validations instead.)
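The steps above can be sketched like this. The email column as the distinct value and the temp file are assumptions for illustration; in the real script you'd point CSV.foreach at app/data/file.csv.

```ruby
require 'csv'
require 'set'
require 'tempfile'

# Hypothetical sample file standing in for app/data/file.csv.
file = Tempfile.new(['contacts', '.csv'])
file.write(<<~CSV)
  id,last_name,email
  1,Canter,dcanter0@nydailynews.com
  2,McArthur,dmcarthur1@twitter.com
  3,Other,dcanter0@nydailynews.com
CSV
file.close

seen = Set.new
duplicates = []
CSV.foreach(file.path, headers: true) do |row|
  key = row['email']
  # Set#add? returns nil when the key is already present.
  duplicates << key unless seen.add?(key)
end

puts duplicates.inspect
```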
I would use #read instead of #parse and do something like this:
require 'csv'
LASTNAME_INDEX = 2
data = CSV.read('data.csv')
data[1..-1].each_with_index do |row, index|
  puts "Contact number #{index + 1} has the following last name: #{row[LASTNAME_INDEX]}"
end
#~> Contact number 1 has the following last name : Canter
#~> Contact number 2 has the following last name : McArthur
I have an array like this:
a1 = [["http://ww.amazon.com"], ["failed"]]
When I write it to a CSV file, it is written as:
["http://ww.amazon.com"]
["failed"]
But I want it written as:
http://ww.amazon.com failed
First you need to flatten the array a1
b1 = a1.flatten # => ["http://ww.amazon.com", "failed"]
Then you need to generate the CSV by passing each row (an array) to the csv variable:
require 'csv'
csv_string = CSV.generate({:col_sep => "\t"}) do |csv|
  csv << b1
end
:col_sep => "\t" inserts a tab separator between the fields of each row.
Change it to :col_sep => "," to use commas instead.
Finally, csv_string contains the CSV in its correct form.
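Putting the two steps together into one runnable snippet:

```ruby
require 'csv'

a1 = [["http://ww.amazon.com"], ["failed"]]
b1 = a1.flatten

# One row, tab-separated.
csv_string = CSV.generate(:col_sep => "\t") do |csv|
  csv << b1
end

puts csv_string
```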
Ruby's built-in CSV class is your starting point. From the documentation for writing to a CSV file:
CSV.open("path/to/file.csv", "wb") do |csv|
  csv << ["row", "of", "CSV", "data"]
  csv << ["another", "row"]
  # ...
end
For your code, simply flatten your array:
[['a'], ['b']].flatten # => ["a", "b"]
Then you can assign it to the parameter of the block (csv) which will cause the array to be written to the file:
require 'csv'
CSV.open('file.csv', 'wb') do |csv|
  csv << [["row"], ["of"], ["CSV"], ["data"]].flatten
end
Saving and running that creates "file.csv", which contains:
row,of,CSV,data
Your question is written as though you're trying to generate the CSV file by hand rather than relying on a class designed for that task. On the surface, creating a CSV seems easy, but it has nasty corner cases, such as fields that contain spaces or the quoting character used to delimit strings. A well-tested, pre-written class can save you a lot of time writing and debugging code, and save you from having to explain to a customer or manager why your data won't load correctly into a database.
But that leaves the question, why does your array contain sub-arrays? Usually that happens because you're doing something wrong as you gather the elements, and makes me think your question should really be about how do you avoid doing that. (It's called an XY problem.)
I have the following code:
def csv_to_array(file)
  csv = CSV::parse(file)
  fields = csv.shift
  csv.collect { |record| Hash[*fields.zip(record).flatten] }
end
This creates an array of hashes, and works fine with comma separated values. I am trying to replicate this code for a tab delimited file. Currently, when I run the above code on my tab delimited file, I get something like this:
array[0] = {"First Name\tLast Name\tCode\t"=>"Luigi\tSmith\t1406\t"}
So, each array object is a hash as intended, but it has one key value pair - The entire tab delimited header row being the key, and the individual row of data being the value.
How can I alter this code to return an array of hashes with individual key value pairs, with the header of each column mapping to the row value for that column?
It seems that the options you pass to parse are listed in ::new
>> CSV.parse("qwe\tq\twe", col_sep: "\t"){|a| p a}
["qwe", "q", "we"]
Use the col_sep option, this post has the code: Changing field separator/delimiter in exported CSV using Ruby CSV
also checkout the docs: http://ruby-doc.org/stdlib-2.1.0/libdoc/csv/rdoc/CSV.html
lots of good stuff in the DEFAULT_OPTIONS section
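Applying that to the question's helper, here's a sketch with the separator exposed as a parameter (the sample string passed in at the end is hypothetical):

```ruby
require 'csv'

# col_sep is forwarded to CSV.parse; "\t" handles tab-delimited input.
def csv_to_array(content, col_sep = "\t")
  csv = CSV.parse(content, :col_sep => col_sep)
  fields = csv.shift
  csv.collect { |record| Hash[*fields.zip(record).flatten] }
end

rows = csv_to_array("First Name\tLast Name\tCode\nLuigi\tSmith\t1406\n")
puts rows.inspect
```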
I'm trying to limit the number of times I do a mysql query, as this could end up being 2k+ queries just to accomplish a fairly small result.
I'm going through a CSV file, and I need to check that the format of the content in the csv matches the format the db expects, and sometimes I try to accomplish some basic clean-up (for example, I have one field that is a string, but is sometimes in the csv as jb2003-343, and I need to strip out the -343).
The first thing I do is get from the database the list of fields (by name) that I need to retrieve from the CSV. Then I get the index of those columns in the CSV, and finally I go through each line and collect each of the indexed columns:
get_fields = BaseField.find_by_group(:all, :conditions => ['group IN (?)', params[:group_ids]])

csv = CSV.read(csv.path)
first_line = csv.first

col_indexes = []
csv_data = []

csv.each_with_index do |row, i|
  if i == 0
    get_fields.each do |col|
      col_indexes << row.index(col.name)
    end
  else
    csv_row = []
    col_indexes.each do |col|
      # possibly check the value here against another mysql query, but that's ugly
      csv_row << row[col]
    end
    csv_data << csv_row
  end
end
The problem is that when I'm adding the content of the csv_data for output, I no longer have any connection to the original get_fields query. Therefore, I can't seem to say 'does this match the type of data expected from the db'.
I could work my way back through the same process that got me down to that level, and make another query like this
get_cleanup = BaseField.find_by_csv_col_name(first_line[col])
if get_cleanup.format == row[col].class
  csv_row << row[col]
else
  # do some data clean-up
end
but as I mentioned, that could mean the get_cleanup is run 2000+ times.
Instead of doing this, is there a way to search within the original get_fields result for the name and then get the associated field?
I tried searching for 'search rails object', but kept getting back results about building search, not searching within an already existing object.
I know I can do array.search, but don't see anything in the object api about search.
Note: The code above may not be perfect, because I'm not running it yet, just wrote that off the top of my head, but hopefully it gives you the idea of what I'm going for.
When you populate your col_indexes array, rather than storing a single value, you can store a hash which includes index and the datatype.
get_fields.each do |col|
  col_info = { :row_index => row.index(col.name), :name => col.name, :format => col.format }
  col_indexes << col_info
end
You can then access all of your data inside the loop.
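A sketch of using those hashes inside the row loop; matches_format? and the clean-up rule are hypothetical stand-ins for whatever validation the BaseField records describe:

```ruby
# matches_format? is a hypothetical stand-in for the real format check.
def matches_format?(value, format)
  format == :integer ? value.to_s.match?(/\A\d+\z/) : true
end

# Built once from get_fields, one hash per column of interest.
col_indexes = [
  { :row_index => 0, :name => 'part_no', :format => :string },
  { :row_index => 2, :name => 'qty',     :format => :integer }
]

row = ['jb2003-343', 'widget', '12']
csv_row = []
col_indexes.each do |col_info|
  value = row[col_info[:row_index]]
  if matches_format?(value, col_info[:format])
    csv_row << value
  else
    # do some data clean-up here
    csv_row << value.to_s.strip
  end
end

puts csv_row.inspect
```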