How to convert XML to Hash in ruby? - ruby-on-rails

I have some XML that I want to convert into a Hash:
<meta_description><language id="1"></language><language id="2"></language></meta_description>
<meta_keywords><language id="1"></language><language id="2"></language></meta_keywords>
<meta_title><language id="1"></language><language id="2" ></language></meta_title>
<link_rewrite><language id="1" >konsult-500-krtim</language><language id="2" >konsult-500-krtim</language></link_rewrite>
<name><language id="1" >Konsult 500 kr/tim</language><language id="2" >Konsult 500 kr/tim</language></name>
<description><language id="1" ></language><language id="2" ></language></description>
<description_short><language id="1" ></language><language id="2" ></language></description_short>
<available_now><language id="1" ></language><language id="2" ></language></available_now>
<available_later><language id="1" ></language><language id="2" ></language></available_later>
<associations>
<categories nodeType="category" api="categories">
<category>
<id>2</id>
</category>
</categories>
<images nodeType="image" api="images"/>
<combinations nodeType="combination" api="combinations"/>
<product_option_values nodeType="product_option_value" api="product_option_values"/>
<product_features nodeType="product_feature" api="product_features"/>
<tags nodeType="tag" api="tags"/>
<stock_availables nodeType="stock_available" api="stock_availables">
<stock_available>
<id>111</id>
<id_product_attribute>0</id_product_attribute>
</stock_available>
</stock_availables>
<accessories nodeType="product" api="products"/>
<product_bundle nodeType="product" api="products"/>
</associations>
I want to convert this XML into a Hash.
I have looked for a function that converts XML like this into a Hash (h = Hash.new) but could not find one.
How do I do this?

ActiveSupport provides the class method Hash.from_xml, which you can use:
xml = File.read("data.xml") # if your XML is in the 'data.xml' file
Hash.from_xml(xml)

If you are using Rails you can use the answer above; otherwise you can require the relevant part of the ActiveSupport gem:
require 'active_support/core_ext/hash'
xml = '<foo>bar</foo>'
hash = Hash.from_xml(xml)
=>{"foo"=>"bar"}
Note that this only works with valid XML (see the comments on the original post). Also note that element attributes like id="1" are not preserved as attributes on a round trip. For example:
xml = %q(
<root>
<foo id="1"></foo>
<bar id="2"></bar>
</root>).strip
hash = Hash.from_xml(xml)
=>{"root"=>{"foo"=>{"id"=>"1"}, "bar"=>{"id"=>"2"}}}
puts hash.to_xml
# will output
<?xml version="1.0" encoding="UTF-8"?>
<hash>
<root>
<foo>
<id>1</id>
</foo>
<bar>
<id>2</id>
</bar>
</root>
</hash>
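If attributes need to stay distinguishable from child elements, one option is to build the Hash by hand. The following is only a minimal sketch, not what Hash.from_xml does: it uses REXML (which ships with Ruby) instead of ActiveSupport, and the "@" and "#text" key prefixes, the element_to_hash helper, and the wrapping <product> root are all assumptions made for this illustration. The wrapper is there because the fragment in the question has several top-level elements and is therefore not a well-formed document on its own.

```ruby
require 'rexml/document'

# Recursively turn an REXML element into a Hash.
# Attributes go under "@name" keys and text under "#text"; both prefixes are
# conventions invented for this sketch, not part of any library.
def element_to_hash(element)
  hash = {}
  element.attributes.each { |name, value| hash["@#{name}"] = value }

  element.elements.each do |child|
    value = element_to_hash(child)
    if hash.key?(child.name)
      # A repeated tag name becomes an Array of Hashes.
      hash[child.name] = [hash[child.name]] unless hash[child.name].is_a?(Array)
      hash[child.name] << value
    else
      hash[child.name] = value
    end
  end

  text = element.text.to_s.strip
  hash['#text'] = text unless text.empty?
  hash
end

# Two top-level elements from the question, wrapped in a single
# (hypothetical) <product> root so the document is well-formed.
fragment = '<link_rewrite><language id="1">konsult-500-krtim</language></link_rewrite>' \
           '<name><language id="1">Konsult 500 kr/tim</language>' \
           '<language id="2">Konsult 500 kr/tim</language></name>'
root = REXML::Document.new("<product>#{fragment}</product>").root
result = { root.name => element_to_hash(root) }
```

With this convention, result["product"]["name"]["language"] is an Array of two Hashes, each carrying its own "@id" and "#text" entries.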

Use Nokogiri to parse the XML response, then convert it to a Ruby hash. It's pretty fast.
require 'active_support/core_ext/hash' #from_xml
require 'nokogiri'
doc = Nokogiri::XML(response_body)
Hash.from_xml(doc.to_s)

Related

Nokogiri : NoMethodError (undefined method `inner_html' for nil:NilClass)

I'm trying to parse some simple XML data with Nokogiri.
This is my XML:
POST /.... HTTP/1.1
Host: ....
Content-Type: text/xml; charset=utf-8
Content-Length: length
SOAPAction: "http://...."
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="...." xmlns:xsd="...." xmlns:soap="....">
<soap:Body>
<WS_QueryOnSec xmlns="......">
<type>string</type>
<ID>string</ID>
</WS_QueryOnSec>
</soap:Body>
</soap:Envelope>
and this is my simple request handler:
require "nokogiri"
@doc = Nokogiri::XML(request.body.read)
@something = @doc.at('type').inner_html
But Nokogiri cannot find the type or ID node.
When I change the data to this, everything works fine:
<soap:Body>
<type>string</type>
<ID>string</ID>
</soap:Body>
It seems the problem is the raw text above the data and the nodes with xmlns or other attributes!
What do you recommend to resolve this?
The first "XML" isn't XML. It's text that contains XML. Remove the header information down to the blank line and try it again.
I think it'd help you to read the XML spec or to read some tutorials about creating XML which will help you understand how it's defined. XML is a tight specification and doesn't allow any deviation. The syntax is pretty flexible, but you have to play by its rules.
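As a rough illustration of "remove the header information down to the blank line": if the string really does contain the HTTP headers, it can be split at the first empty line before parsing. This is a sketch under assumptions: the raw string, the host name, and the simplified body are made up, and REXML is used only so the snippet runs without extra gems. In a Rails controller, request.body.read normally returns the body alone, so this step may not be needed there.

```ruby
require 'rexml/document'

# A made-up raw HTTP message: headers, one empty line, then the XML body.
raw = "POST /service HTTP/1.1\r\n" \
      "Host: example.test\r\n" \
      "Content-Type: text/xml; charset=utf-8\r\n" \
      "\r\n" \
      "<root><type>string</type><ID>string</ID></root>"

# HTTP separates headers from the body with one empty line.
# The limit of 2 keeps any later blank lines inside the body.
_headers, body = raw.split(/\r?\n\r?\n/, 2)

doc = REXML::Document.new(body)
type = doc.root.elements['type'].text
```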
Consider these examples:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
foo
<root>
<node />
</root>
EOT
doc.errors # => [#<Nokogiri::XML::SyntaxError: Start tag expected, '<' not found>]
Removing the text, which is outside the root tag results in a proper parse:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<root>
<node />
</root>
EOT
doc.errors # => []
<root> isn't necessarily the name of the "root" node; the root is simply the outermost tag:
doc = Nokogiri::XML(<<EOT)
<foo>
<node />
</foo>
EOT
doc.errors # => []
and still results in a valid DOM/internal representation of the document:
puts doc.to_html
# >> <foo>
# >> <node></node>
# >> </foo>
Your XML sample is using namespaces, which complicate matters somewhat. The Nokogiri documentation talks about how to deal with them, so you'll want to understand that part of parsing XML because you'll encounter it again. Here's the easy way of working with them:
require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="utf-8"?>
<Envelope xmlns:xsi="...." xmlns:xsd="...." xmlns:soap="....">
<Body>
<WS_QueryOnSec xmlns="......">
<type>string</type>
<ID>string</ID>
</WS_QueryOnSec>
</Body>
</Envelope>
EOT
namespaces = doc.collect_namespaces
doc.at('type', namespaces).text # => "string"
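An alternative to collecting every namespace is to bind a prefix of your own choosing to the namespace URI and use it in an XPath query. The sketch below shows the idea with REXML so that it runs with the standard library alone; the "svc" prefix and the urn:example:service URI are placeholders for the values elided in the question. Nokogiri's at_xpath accepts a similar prefix-to-URI Hash as its second argument.

```ruby
require 'rexml/document'

xml = <<~XML
  <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
      <WS_QueryOnSec xmlns="urn:example:service">
        <type>string</type>
        <ID>string</ID>
      </WS_QueryOnSec>
    </soap:Body>
  </soap:Envelope>
XML

doc = REXML::Document.new(xml)

# Map a prefix we pick ("svc") to the default namespace declared on
# WS_QueryOnSec. The prefix does not have to appear in the document.
ns = { 'svc' => 'urn:example:service' }

type_node = REXML::XPath.first(doc, '//svc:type', ns)
id_node   = REXML::XPath.first(doc, '//svc:ID', ns)
```

The point is that type and ID inherit the default namespace from WS_QueryOnSec, so a query has to name that namespace in some way to match them reliably.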

ruby/rails strip out xml code based on regex expression

I receive the following xml in a database field.
<Request type="Final">
<Field name="Grade">94.5</Field>
<Field name="EmployeeName">2398;;;Mike5</Field>
<Field name="Date">051215</Field>
</Request>
Currently, I just receive it and display them as it is:
def request_xml
(request.blank? ? "" : request.message)
end
Now I want to return the XML with the EmployeeName value (2398;;;Mike5) stripped out, based on certain logic.
I am OK with either of two solutions:
> 1. If the EmployeeName value matches a regex, return an empty value; otherwise return the value as it is.
>    Return <Field name="EmployeeName"></Field>
> 2. If the EmployeeName value matches a regex, strip the whole EmployeeName element, <Field name="EmployeeName">2398;;;Mike5</Field>, from the result.
Is either of these solutions possible in Ruby/Rails code?
As @asiniy already answered, you can use Nokogiri to change your XML.
UPDATE: you can filter elements by their content with the following code:
require 'nokogiri'
xml = Nokogiri::XML('<Request type="Final">
<Field name="Grade">94.5</Field>
<Field name="EmployeeName">2398;;;Mike5</Field>
<Field name="Date">051215</Field>
</Request>')
# remove all Employee elements, containing text 2398
xml.css('Field[name=EmployeeName]').select { |node| node.text =~ /2398/ }.map(&:remove)
puts xml
will output
<?xml version="1.0"?>
<Request type="Final">
<Field name="Grade">94.5</Field>
<Field name="Date">051215</Field>
</Request>
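The code above covers the second option (removing the element). For the first option, keeping the element but emptying its value, a sketch along the following lines might work. It uses REXML so it runs without extra gems, and the regular expression is a hypothetical stand-in for the "certain logic" mentioned in the question.

```ruby
require 'rexml/document'

xml = <<~XML
  <Request type="Final">
    <Field name="Grade">94.5</Field>
    <Field name="EmployeeName">2398;;;Mike5</Field>
    <Field name="Date">051215</Field>
  </Request>
XML

# Hypothetical rule: values shaped like "<digits>;;;<name>" get blanked.
sensitive = /\A\d+;;;/

doc = REXML::Document.new(xml)
REXML::XPath.each(doc, '//Field[@name="EmployeeName"]') do |field|
  # Assigning nil removes the text child but keeps the element itself.
  field.text = nil if field.text.to_s.match?(sensitive)
end

employee = REXML::XPath.first(doc, '//Field[@name="EmployeeName"]')
grade    = REXML::XPath.first(doc, '//Field[@name="Grade"]')
```

After this runs, the EmployeeName field is still present in doc but has no text, while the other fields are untouched; doc.to_s gives the modified XML.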
You can do it with the nokogiri gem; see its official tutorial.
require 'nokogiri'
xml = Nokogiri::XML('<Request type="Final">
<Field name="Grade">94.5</Field>
<Field name="EmployeeName">2398;;;Mike5</Field>
<Field name="Date">051215</Field>
</Request>')
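# xpath always returns a NodeSet (truthy even when empty), so use at_xpath,
# which returns nil when nothing matches. EmployeeName is an attribute value
# here, not an element name.
if xml.at_xpath('//Field[@name="EmployeeName"]')
  do_whatever_you_want # placeholder for your own logic
end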

Rails XML Feed: ID as node attribute

I set up a simple XML feed for a vendor we're using (who refuses to read JSON).
<recipes type="array">
<recipe>
<id type="integer">1</id>
<name>
Hamburgers
</name>
<producturl>
http://test.com
</producturl>
...
</recipe>
...
</recipes>
However, the vendor requests that, instead of having an id node, the id be an attribute of the parent node, e.g.
<recipes type="array">
<recipe id="1">
<name>
Hamburgers
</name>
<producturl>
http://test.com
</producturl>
...
</recipe>
...
</recipes>
I'm building this with (basically)
xml_feed = []
recipes.each do |recipe|
xml_feed << { id: recipe.id, name: recipe.name, ... }
end
...
render xml: xml_feed.to_xml(root: 'recipes')
But I'm unsure how to include the id (or any field) as an attribute of the parent node like that. I googled around and couldn't find anything, nor were the http://api.rubyonrails.org/classes/ActiveRecord/Serialization.html docs very helpful.
Thanks!
I would suggest you use the nokogiri gem. It provides everything you could possibly need for handling XML.
builder = Nokogiri::XML::Builder.new do |xml|
xml.root {
xml.objects {
xml.object.classy.thing!
}
}
end
puts builder.to_xml
<?xml version="1.0"?>
<root>
<objects>
<object class="classy" id="thing"/>
</objects>
</root>
The suggestion to use Nokogiri is fine. The syntax just needs to be a little different to achieve what you have requested:
builder = Nokogiri::XML::Builder.new do |xml|
xml.root {
xml.object('type' => 'Client') {
xml.name 'John'
}
}
end
puts builder.to_xml
<?xml version="1.0"?>
<root>
<object type="Client">
<name>John</name>
</object>
</root>
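Applied to the recipes feed from the question, the same idea (pass the attributes as a Hash when creating the element) could look like the sketch below. It uses REXML rather than Nokogiri so that it runs with the standard library alone, and the recipes array is stand-in data for the ActiveRecord objects in the question.

```ruby
require 'rexml/document'

# Stand-in data; in the question these values come from model objects.
recipes = [
  { id: 1, name: 'Hamburgers', producturl: 'http://test.com' },
  { id: 2, name: 'Hot dogs',   producturl: 'http://test.com' }
]

doc = REXML::Document.new
root = doc.add_element('recipes', 'type' => 'array')

recipes.each do |recipe|
  # The second argument to add_element is a Hash of attributes,
  # so the id ends up on <recipe> instead of in a child node.
  node = root.add_element('recipe', 'id' => recipe[:id].to_s)
  node.add_element('name').text = recipe[:name]
  node.add_element('producturl').text = recipe[:producturl]
end

xml_string = doc.to_s
```

In a controller, xml_string could then be passed to render xml: in place of the to_xml result.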

How to parse an XML file with metadata key names?

I've recently started using Nokogiri to parse data into a Rails 3 application. The problem is that I don't fully understand how to do it, because the XML I am parsing appears to be 'non-standard'. Take a look at the snippet below:
<?xml version="1.0" encoding="utf-8"?>
<dataset xmlns="http://.com/schemas/xmldata/1/" xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
<!--
<dataset
xmlns="http://.com/schemas/xmldata/1/"
xmlns:xs="http://www.w3.org/2001/XMLSchema-instance"
xs:schemaLocation="http://.com/schemas/xmldata/1/ xmldata.xsd"
>
-->
<metadata>
<item name="Problem ID" type="xs:string" length="32"/>
<item name="Account Title" type="xs:string" length="162"/>
<item name="Account Name" type="xs:string" length="162"/>
<item name="Reassignment" type="xs:int" precision="1"/>
<item name="Initial Severity" type="xs:int" precision="1"/>
<item name="Resolution Desc" type="xs:string" length="510"/>
<item name="Resolver Name" type="xs:string" length="82"/>
<item name="Problem Code" type="xs:string" length="32"/>
<item name="Status" type="xs:string" length="32"/>
</metadata>
<data>
<row>
<value>AP-06684768 </value>
<value>ESA</value>
<value>1</value>
<value>8</value>
<value>8</value>
<value xs:nil="true" />
<value xs:nil="true" />
<value>ADDITION TO EXISTING FIREWALL</value>
<value></value>
<value>ESA BRIDGE </value>
<value>CLOSED </value>
<value>CLOSED </value>
</row>
<row>
<value>AP-06720564 </value>
<value>ESA</value>
<value>2011-01-19T12:02:47</value>
<value>2011-01-19T12:02:49</value>
<value>0</value>
<value>776</value>
<value>SCP UESCADADEV -> UESCADAPW/BW</value>
<value>NETAU_NETMGTS </value>
<value>N/A</value>
<value>ESA BRIDGE </value>
<value>CLOSED </value>
<value>CLOSED </value>
</row>
</data>
</dataset>
Instead of having named nodes and attributes it seems to be a 'metadata' section and then rows, much like a table really. How would I parse all this data?
require 'rubygems'
require 'nokogiri'
require 'pp'
doc = Nokogiri::XML(DATA)
column_names = doc.css('dataset > metadata > item').map {|a| a['name']}
result = doc.css('dataset > data > row').map do |row|
values = row.css('value').map { |value| value[:nil] == 'true' ? nil : value.content }
Hash[column_names.zip(values)]
end
pp result
results in
[{"Problem Code"=>"ADDITION TO EXISTING FIREWALL",
"Resolution Desc"=>nil,
"Reassignment"=>"8",
"Resolver Name"=>nil,
"Status"=>"",
"Problem ID"=>"AP-06684768 ",
"Account Name"=>"1",
"Initial Severity"=>"8",
"Account Title"=>"ESA"},
{"Problem Code"=>"NETAU_NETMGTS ",
"Resolution Desc"=>"776",
"Reassignment"=>"2011-01-19T12:02:49",
"Resolver Name"=>"SCP UESCADADEV -> UESCADAPW/BW",
"Status"=>"N/A",
"Problem ID"=>"AP-06720564 ",
"Account Name"=>"2011-01-19T12:02:47",
"Initial Severity"=>"0",
"Account Title"=>"ESA"}]
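Two things are worth noticing in that output: values keep their trailing padding, and zip silently drops extra values (the sample rows contain twelve values but the metadata lists only nine items). The following sketch, which is an assumption-laden variation and not the answer's method, checks the counts, strips padding, and casts xs:int columns. It uses REXML so it runs without gems, and the trimmed-down dataset is made up for the example.

```ruby
require 'rexml/document'

xml = <<~XML
  <dataset xmlns:xs="http://www.w3.org/2001/XMLSchema-instance">
    <metadata>
      <item name="Problem ID" type="xs:string" length="32"/>
      <item name="Reassignment" type="xs:int" precision="1"/>
      <item name="Resolver Name" type="xs:string" length="82"/>
    </metadata>
    <data>
      <row>
        <value>AP-06684768 </value>
        <value>8</value>
        <value xs:nil="true"/>
      </row>
    </data>
  </dataset>
XML

doc = REXML::Document.new(xml)
items = REXML::XPath.match(doc, '/dataset/metadata/item')

rows = REXML::XPath.match(doc, '/dataset/data/row').map do |row|
  values = row.elements.to_a('value')
  # Fail loudly instead of letting zip drop or pad values.
  unless values.size == items.size
    raise ArgumentError, "expected #{items.size} values, got #{values.size}"
  end

  items.zip(values).to_h do |item, value|
    cell =
      if value.attributes['xs:nil'] == 'true'
        nil
      elsif item.attributes['type'] == 'xs:int'
        Integer(value.text)
      else
        value.text.to_s.rstrip
      end
    [item.attributes['name'], cell]
  end
end
```

Whether a count mismatch should raise or be tolerated depends on the data source; raising is just the choice made for this sketch.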
Here's working code that I hacked out and tested:
require 'rubygems'
require 'nokogiri'
class Item
attr_accessor :name
def initialize(name)
@name = name
end
end
file = File.open("data.xml")
document = Nokogiri::XML(file)
file.close
metadata = document.root.children[3]
items = metadata.children.reject{|child| child.attribute('name').nil?}.map do |child|
Item.new(child.attribute('name').value)
end
puts "#{items.size} items"
puts items.inspect
Results:
[~/stackoverflow/graphML] ruby parse.rb
9 items
[#<Item:0x007fc01c0fbd90 @name="Problem ID">, #<Item:0x007fc01c0fbca0 @name="Account Title">, #<Item:0x007fc01c0fbc28 @name="Account Name">, #<Item:0x007fc01c0fbbb0 @name="Reassignment">, #<Item:0x007fc01c0fbb38 @name="Initial Severity">, #<Item:0x007fc01c0fbac0 @name="Resolution Desc">, #<Item:0x007fc01c0fba48 @name="Resolver Name">, #<Item:0x007fc01c0fb9d0 @name="Problem Code">, #<Item:0x007fc01c0fb868 @name="Status">]
Here's the full project on GitHub: https://github.com/endymion/GraphML-parsing-exercise/tree/metadata-key-names
(It's a branch of a GraphML parsing exercise that I hacked out earlier tonight for somebody else on Stack Overflow.)

rails inverting to_xml and getting the original model

I did this:
[User.first, User.last].to_xml
and got this:
<users type="array">
<user>
<created-at type="datetime">2010-03-16T06:40:51Z</created-at>
<id type="integer">3</id>
<password-hash></password-hash>
<salt></salt>
<updated-at type="datetime">2010-03-16T06:40:51Z</updated-at>
<username nil="true"></username>
</user>
<user>
<created-at type="datetime">2010-03-23T03:58:15Z</created-at>
<id type="integer">7</id>
<password-hash></password-hash>
<salt></salt>
<tutorial-state nil="true"></tutorial-state>
<updated-at type="datetime">2010-03-23T03:58:15Z</updated-at>
<username nil="true"></username>
</user>
</users>
How can I take that string of xml and invert it to get the original activerecord objects back?
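Try this.
For the XML of a single model object (Hash.from_xml keeps the root element as the top-level key, so pick out "user" first):
xml = User.first.to_xml
User.new(Hash.from_xml(xml)["user"])
For the XML of an array of models: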
xml = User.all.to_xml
users = (Hash.from_xml(xml)["users"] || []).collect{|attr| User.new(attr)}
I do know that you can do this on individual users; doing it on an array will require your own bit of XML parsing.
user = User.new
user.from_xml '<user><id type="integer">1</id></user>'
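For the array case, "your own bit of XML parsing" could look roughly like the sketch below. It reads the type and nil attributes that to_xml emitted and produces one attribute Hash per user, which could then be handed to User.new. This is a hand-rolled approximation using REXML, not the ActiveSupport implementation; only the integer and datetime types from the sample are handled, and the cast lambda is a helper invented for this example.

```ruby
require 'rexml/document'
require 'time'

xml = <<~XML
  <users type="array">
    <user>
      <created-at type="datetime">2010-03-16T06:40:51Z</created-at>
      <id type="integer">3</id>
      <username nil="true"></username>
    </user>
    <user>
      <created-at type="datetime">2010-03-23T03:58:15Z</created-at>
      <id type="integer">7</id>
      <username nil="true"></username>
    </user>
  </users>
XML

# Cast one element using the type/nil attributes found in the XML.
cast = lambda do |element|
  if element.attributes['nil'] == 'true'
    nil
  else
    case element.attributes['type']
    when 'integer'  then Integer(element.text)
    when 'datetime' then Time.iso8601(element.text)
    else element.text.to_s
    end
  end
end

doc = REXML::Document.new(xml)
attribute_hashes = REXML::XPath.match(doc, '/users/user').map do |user|
  # Dashed element names become underscored attribute names.
  user.elements.to_a.to_h { |el| [el.name.tr('-', '_'), cast.call(el)] }
end
```

Each entry of attribute_hashes then has "created_at", "id", and "username" keys with Ruby values rather than strings.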
