I have a multi-dimensional array that I'd like to use for building XML output. The array stores a CSV import, where people[0][...] holds the column names that will become the XML tags, and people[...>0][...] holds the values.
For instance, array contains:
people[0][0] => first-name
people[0][1] => last-name
people[1][0] => Bob
people[1][1] => Dylan
people[2][0] => Sam
people[2][1] => Shepard
XML needs to be:
<person>
<first-name>Bob</first-name>
<last-name>Dylan</last-name>
</person>
<person>
<first-name>Sam</first-name>
<last-name>Shepard</last-name>
</person>
Any help is appreciated.
I suggest using FasterCSV to import your data and convert it into an array of hashes. That way to_xml should give you what you want:
people = []
FasterCSV.foreach("yourfile.csv", :headers => true) do |row|
  people << row.to_hash
end
people.to_xml
There are two main ways I can think of to achieve this: one uses an XML serializer; the second pushes out the raw string.
Here's an example of the second:
xml = ''
1.upto(people.size - 1) do |row_idx|
  xml << "<person>\n"
  people[0].each_with_index do |column, col_idx|
    xml << "  <#{column}>#{people[row_idx][col_idx]}</#{column}>\n"
  end
  xml << "</person>\n"
end
Another way:
hash = {}
hash['person'] = []
1.upto(people.size - 1) do |row_idx|
  row = {}
  people[0].each_with_index do |column, col_idx|
    row[column] = people[row_idx][col_idx]
  end
  hash['person'] << row
end
hash.to_xml
Leaving this answer here in case someone needs to convert an array like this that didn't come from a CSV file (or if they can't use FasterCSV).
Using Hash#to_xml is a good idea, since it is supported in core Rails. It's probably the simplest way to export hash-like data to simple XML, and it covers most simple cases; more complex cases require more complex tools.
Thanks to everyone who posted. Below is the solution that seems to work best for my needs. Hopefully others may find it useful.
This solution grabs a remote url csv file, stores it in a multi-dimensional array, then exports it as xml:
require 'rio'
require 'fastercsv'
url = 'http://remote-url.com/file.csv'
people = FasterCSV.parse(rio(url).read)
xml = ''
1.upto(people.size - 1) do |row_idx|
  xml << "  <record>\n"
  people[0].each_with_index do |column, col_idx|
    xml << "    <#{column.parameterize}>#{people[row_idx][col_idx]}</#{column.parameterize}>\n"
  end
  xml << "  </record>\n"
end
There are better solutions out there; using hash.to_xml would have been great, except I needed to run the CSV header line through parameterize so the values could be used as XML tags. But this code works, so I'm happy.
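For reference, parameterize comes from ActiveSupport, so it is only available inside Rails. Outside Rails, a rough plain-Ruby stand-in can be sketched like this (parameterize_ish is a hypothetical name, and this is an approximation, not the real implementation):

```ruby
# Roughly what String#parameterize does: lowercase the string and collapse
# runs of non-alphanumeric characters into a separator, trimming the ends.
def parameterize_ish(str, separator = "-")
  str.downcase
     .gsub(/[^a-z0-9]+/, separator)
     .sub(/\A#{Regexp.escape(separator)}+/, "")
     .sub(/#{Regexp.escape(separator)}+\z/, "")
end

parameterize_ish("First Name")   # => "first-name"
parameterize_ish("  Zip/Code ")  # => "zip-code"
```

This is enough to turn CSV header cells into safe XML tag names for the loop above.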
class GenericFormatter < Formatter
  attr_accessor :tag_name, :objects

  def generate_xml
    builder = Nokogiri::XML::Builder.new do |xml|
      xml.send(tag_name.pluralize) {
        objects.each do |obj|
          xml.send(tag_name.singularize) {
            self.generate_obj_row obj, xml
          }
        end
      }
    end
    builder.to_xml
  end

  def initialize(tag_name, objects)
    self.tag_name = tag_name
    self.objects = objects
  end

  def generate_obj_row(obj, xml)
    obj.attributes.except("updated_at").map do |key, value|
      xml.send(key, value)
    end
    xml.updated_at obj.updated_at.try(:strftime, "%m/%d/%Y %H:%M:%S") if obj.attributes.key?('updated_at')
  end
end
In the above code I have implemented a formatter that uses the Nokogiri XML Builder to generate XML from the objects passed in. It generates the XML quickly when the data set is not too large, but with larger data (more than 10,000 records) it slows down and takes at least 50-60 seconds.
Problem: Is there any way to generate the XML faster? I have tried XML builders in the view as well, but that didn't work. How can I generate the XML faster? The solution should work on a Rails 3 application; suggestions for optimizing the above code are welcome.
Your main problem is processing everything in one go instead of splitting the data into batches. That requires a lot of memory: first to build all those ActiveRecord models, and then to build an in-memory representation of the whole XML document. Metaprogramming (those send calls) is also quite expensive.
Take a look at this code:
class XmlGenerator
  attr_accessor :tag_name, :ar_relation

  def initialize(tag_name, ar_relation)
    @ar_relation = ar_relation
    @tag_name = tag_name
  end

  def generate_xml
    singular_tag_name = tag_name.singularize
    plural_tag_name = tag_name.pluralize
    xml = ""
    xml << "<#{plural_tag_name}>"
    ar_relation.find_in_batches(batch_size: 1000) do |batch|
      batch.each do |obj|
        xml << "<#{singular_tag_name}>"
        obj.attributes.except("updated_at").each do |key, value|
          xml << "<#{key}>#{value}</#{key}>"
        end
        if obj.attributes.key?("updated_at")
          xml << "<updated_at>#{obj.updated_at.strftime('%m/%d/%Y %H:%M:%S')}</updated_at>"
        end
        xml << "</#{singular_tag_name}>"
      end
    end
    xml << "</#{plural_tag_name}>"
    xml
  end
end
# example usage
XmlGenerator.new("user", User.where("age < 21")).generate_xml
Major improvements are:
fetching data from the database in batches; you need to pass an ActiveRecord relation instead of an array of ActiveRecord models
generating XML by concatenating strings; this carries a risk of producing invalid XML (values are not escaped), but it is much faster than using a builder
I tested it on over 60k records. It took around 40 seconds to generate such an XML document.
There is much more that can be done to improve this even further, but it all depends on your application.
Here are some ideas:
do not use ActiveRecord to fetch data; instead use a lighter library or a plain database driver
fetch only data that you need
tweak batch size
write generated xml directly to a file (if that is your use case) to save memory
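The last idea, writing the XML straight to an IO instead of accumulating one big string, can be sketched without ActiveRecord. Here batches is a hypothetical stand-in for what find_in_batches would yield, stream_xml is an illustrative name, and pluralization is naively done by appending "s":

```ruby
require "stringio"

# Write the XML to any IO as it is generated, so the whole document
# never has to sit in memory at once. Each "record" is a plain hash
# standing in for an ActiveRecord attributes hash.
def stream_xml(io, tag, batches)
  io << "<#{tag}s>"    # naive pluralization for the sketch
  batches.each do |batch|
    batch.each do |attrs|
      io << "<#{tag}>"
      attrs.each { |key, value| io << "<#{key}>#{value}</#{key}>" }
      io << "</#{tag}>"
    end
  end
  io << "</#{tag}s>"
end

out = StringIO.new
stream_xml(out, "user", [[{ "name" => "Bob" }], [{ "name" => "Sam" }]])
out.string
# => "<users><user><name>Bob</name></user><user><name>Sam</name></user></users>"
```

Passing a File instead of a StringIO writes the document to disk with the same code.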
The Nokogiri gem has a nice interface for creating XML from scratch. Nokogiri is a wrapper around libxml2.
Add it to your Gemfile:
gem 'nokogiri'
To generate XML, simply use the Nokogiri XML Builder like this:
xml = Nokogiri::XML::Builder.new { |xml|
  xml.body do
    xml.test1 "some string"
    xml.test2 890
    xml.test3 do
      xml.test3_1 "some string"
    end
    xml.test4 "with attributes", :attribute => "some attribute"
    xml.closing
  end
}.to_xml
Output:
<?xml version="1.0"?>
<body>
  <test1>some string</test1>
  <test2>890</test2>
  <test3>
    <test3_1>some string</test3_1>
  </test3>
  <test4 attribute="some attribute">with attributes</test4>
  <closing/>
</body>
Demo: http://www.jakobbeyer.de/xml-with-nokogiri
I have a file b.xls from Excel that I need to import into my Rails app.
I have tried to open it:
file = File.read(Rails.root.to_s + '/b.xls')
and got this:
file.encoding # => #<Encoding:UTF-8>
I have a few questions:
How do I open it without the garbage symbols (i.e. get readable text)?
How do I convert this file to a hash?
The file is pretty large, about 5k lines.
You must first get an array of all the rows; then you can convert it to a hash if you like.
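That rows-to-hashes conversion needs no gem at all; a plain-Ruby sketch (the rows below are made-up sample data standing in for parsed spreadsheet rows):

```ruby
# First row holds the headers; zip each data row with the headers
# to get an array of hashes keyed by column name.
rows = [
  ["name", "price"],
  ["Widget", "100"],
  ["Gadget", "200"]
]

headers = rows.first
records = rows.drop(1).map { |row| headers.zip(row).to_h }

records.first  # => {"name"=>"Widget", "price"=>"100"}
```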
I would recommend using the batch_factory gem.
The gem is very simple and relies on the roo gem under the hood.
Here is the code example
require 'batch_factory'
factory = BatchFactory.from_file(
Rails.root.join('b.xlsx'),
keys: [:column1, :column2, ..., :what_ever_column_name]
)
Then you can do
factory.each do |row|
  puts row[:column1]
end
You can also omit specifying keys; then batch_factory will automatically fetch the headers from the first row, but your keys would be in Russian, like:
factory.each do |row|
  puts row['Товар']
end
If you want a hash with the product name as the key, you can do:
factory.inject({}) do |hash, row|
  hash.merge(row['Товар'] => row)
end
I want to open an external XML file, parse it, and store the data in my database. I can do this with Nokogiri quite easily:
file = '...external.xml'
xml = Nokogiri::XML(open(file))

xml.xpath('//Element').each do |element|
  # process elements and save to the database, e.g.:
  @data = Model.new(:attr => element.at('foo').text)
  @data.save
end
Now I want to try the (maybe faster) Ox gem (https://github.com/ohler55/ox), but I cannot work out from the documentation how to open and process a file.
Any equivalent code examples for the above code would be awesome! Thank you!
You can't use XPath to locate nodes in Ox, but Ox does provide a locate method. You can use it like so:
xml = Ox.parse(%Q{
  <root>
    <Element>
      <foo>ex1</foo>
    </Element>
    <Element>
      <foo>ex2</foo>
    </Element>
  </root>
}.strip)
xml.locate('Element/foo/^Text').each do |t|
  @data = Model.new(:attr => t)
  @data.save
end

# or if you need to do other stuff with the element first
xml.locate('Element').each do |elem|
  # do stuff
  @data = Model.new(:attr => elem.locate('foo/^Text').first)
  @data.save
end
If your query doesn't find any matches, it will return an empty array. For a brief description of the locate query parameter, see the source code at element.rb.
From the documentation:
doc2 = Ox.parse(xml)
To read the contents of a file in Ruby you can use xml = IO.read('filename.xml') (among others). So:
doc = Ox.parse(IO.read(filename))
If your XML file is UTF-8 encoded, then alternatively:
doc = Ox.parse( File.open(filename,"r:UTF-8",&:read) )
I just started learning Rails and have managed to import a CSV file into a database, but the price field in the CSV has quotes and a comma, like this: "560,000".
If I make the price field t.integer in the migration file and then add the data, the price gets imported as 560. So how do I remove the quotes and the comma before importing? Thanks, Adam
Edit: here's the rake file:
require 'csv'

task :csv_to_properties => [:environment] do
  CSV.foreach("lib/assets/cbmb_sale.csv", :headers => true) do |row|
    Property.create!(row.to_hash)
  end
end
Try something like:
csvvalue = csvvalue.gsub(/,/, '').to_i
Note gsub rather than gsub!: gsub! returns nil when no substitution is made, which would make the chained to_i call blow up.
Cheers!
Thanks for posting your code. I don't do a ton of converting CSVs to hashes, but something like this will probably work:
Property.create!(row.to_hash.each_pair { |k, v| row.store(k, v.gsub(/,/, '').to_i) })
Pretty ugly, but probably pretty close to what you want.
In your code example, assuming the price field is in row element 4:
CSV.foreach("lib/assets/cbmb_sale.csv", :headers => true) do |row|
  row[price = 4].gsub!(/,/, '')
  Property.create!(row.to_hash)
end
The price=4 is just a handy way to document the index value of the price element: it creates a variable called price, assigns the value 4 to it, then immediately uses it as the array index.
Since Property.create! is already taking care of the string to integer conversion, we can perform an in-place substitution for the regular expression that contains a comma /,/ for an empty string ''.
Try:
"220,000".scan(/\d+/).join().to_i
I know I've done this before and found a simple set of code, but I cannot remember or find it :(.
I have a text file of records I want to import into my Rails 3 application.
Each line represents a record. Potentially it may be tab delimited for the attributes, but am fine with just a single value as well.
How do I do this?
File.open("my/file/path", "r").each_line do |line|
  # name: "Angela" job: "Writer" ...
  data = line.split(/\t/)
  name, job = data.map { |d| d.split(": ")[1] }.flatten
end
You want IO.foreach:
IO.foreach('foo.txt') do |line|
  # process the line of text here
end
Alternatively, if it really is tab-delimited, you might want to use the CSV library, which takes a file path directly so there is no need to open the file yourself:
require 'csv'

CSV.foreach('foo.txt', col_sep: "\t") do |csv_row|
  # All parsed for you
end
IO.foreach("input.txt") do |line|
  puts line
  # You might be able to use split or something to get attributes
  atts = line.split
end
Have you tried using OpenURI (http://ruby-doc.org/stdlib-2.1.2/libdoc/open-uri/rdoc/OpenURI.html)? You would have to make your files accessible from S3.
Or try using the aws-sdk gem (http://aws.amazon.com/sdk-for-ruby).
You can use OpenURI to read remote or local files.
Assuming that your model has an attachment named file:
require 'open-uri'

# If the object is stored in Amazon S3, access it through its URL
file_path = record.file.respond_to?(:s3_object) ? record.file.url : record.file.path

open(file_path) do |file|
  file.each_line do |line|
    # In your case, you can split items using tabs
    line.split("\t").each do |item|
      # Process item
    end
  end
end