Nokogiri: create xml from string with `?` in field name - ruby-on-rails

Controller response includes "spec?" field:
r = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<hash type=\"array\">\n <item><spec? type=\"boolean\">false</spec?>\n </item>\n <hash>\n"
When I try to create XML from it with Nokogiri.xml(r), I get literally:
<?xml version="1.0" encoding="UTF-8"?>
<hash type="array">
<item><spec type=" type="boolean">false/spec">
</spec>item>
<hash>
</hash></item></hash>
which is strange.
My question is:
Is it possible to create XML from a string using Nokogiri, handling or removing ? and other characters the XML standard does not allow, at the Nokogiri.XML() stage?
Desired result:
Nokogiri.xml(r) do |config|
config.maybe_some_configs?
end #=>
<?xml version="1.0" encoding="UTF-8"?>
<hash type="array">
<item><spec type="boolean">false</spec></item>
</hash>

The proper way to parse a string into an XML DOM is Nokogiri::XML, Nokogiri.XML or Nokogiri::XML.parse, not the lowercase Nokogiri.xml.
Also, XML tag names can't contain ?. See the spec for more information. You'll have to dig through the "Names and Tokens" section and decode hexadecimal character descriptions to figure out the ranges of characters allowed, but a hint is that ? is character code 0x3f.
That means the XML in r is invalid:
<?xml version="1.0" encoding="UTF-8"?>
<hash type="array">
<item><spec? type="boolean">false</spec?>
</item>
<hash>
Parsing it results in:
irb(main):012:0> doc = Nokogiri::XML(r)
#<Nokogiri::XML::Document:0x80c8014c name="document" children=[#<Nokogiri::XML::Element:0x80c7399c name="hash" attributes=[#<Nokogiri::XML::Attr:0x80c733e8 name="type" value="array">] children=[#<Nokogiri::XML::Text:0x80c6e26c "\n ">, #<Nokogiri::XML::Element:0x80c6df60 name="item" children=[#<Nokogiri::XML::Element:0x80c6d970 name="spec">, #<Nokogiri::XML::Text:0x80c6d09c "? type=\"boolean\">false">]>, #<Nokogiri::XML::Text:0x80c6ca34 "?>\n ">]>]>
irb(main):013:0> doc.errors
[
[0] #<Nokogiri::XML::SyntaxError: error parsing attribute name>,
[1] #<Nokogiri::XML::SyntaxError: attributes construct error>,
[2] #<Nokogiri::XML::SyntaxError: Couldn't find end of Start Tag spec line 3>,
[3] #<Nokogiri::XML::SyntaxError: expected '>'>,
[4] #<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: item line 3 and spec>,
[5] #<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: hash line 2 and item>,
[6] #<Nokogiri::XML::SyntaxError: Extra content at the end of the document>
]
As a result, Nokogiri is having to do some fixup in the DOM to try to make sense of it. The resulting XML looks like:
irb(main):014:0> puts doc.to_xml
<?xml version="1.0" encoding="UTF-8"?>
<hash type="array">
<item><spec/>? type="boolean">false</item>?>
</hash>
The way to fix it is to give Nokogiri valid XML. Either fix the source of the XML, if you control it, or fix the problems in the string before passing it to Nokogiri.
By definition, XML is a strict format, and Nokogiri honors that. Trying to be friendly, it collects the problems it finds so you can check whether doc.errors is empty?. If it's not, odds are good you shouldn't continue using the source until you've determined the problems and fixed whatever causes them. Sometimes the problem is fairly benign and you can ignore it, but either way you should at least be aware of it.
Pre-massaging the data before Nokogiri sees it isn't hard:
doc = Nokogiri::XML(r.gsub('spec?', 'spec'))
irb(main):024:0> puts doc.to_xml
<?xml version="1.0" encoding="UTF-8"?>
<hash type="array">
<item><spec type="boolean">false</spec>
</item>
<hash>
</hash></hash>
nil
irb(main):025:0> doc.errors
[
[0] #<Nokogiri::XML::SyntaxError: Premature end of data in tag hash line 5>,
[1] #<Nokogiri::XML::SyntaxError: Premature end of data in tag hash line 2>
]
That's a start, but not an attempt to fix it for you completely. I'm teaching you to fish, not handing out fish.
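If more than one field can end in ?, the same pre-massaging idea can be sketched with a regular expression instead of a fixed string. This is only an illustration: the helper name is hypothetical, and the sample string assumes the final <hash> in r was meant to be a closing </hash>.

```ruby
# Hypothetical helper: drop a trailing "?" from element names, so that
# "<spec? ...>" and "</spec?>" become "<spec ...>" and "</spec>".
# The "<?xml ... ?>" declaration is left alone because the pattern needs
# a name character directly after "<" or "</".
def strip_question_marks_from_tags(xml)
  xml.gsub(%r{<(/?)([A-Za-z_][\w.\-]*)\?(?=[\s/>])}, '<\1\2')
end

# Sample input; the closing </hash> is an assumption about what was intended.
r = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" \
    "<hash type=\"array\">\n" \
    " <item><spec? type=\"boolean\">false</spec?>\n" \
    " </item>\n" \
    "</hash>\n"

puts strip_question_marks_from_tags(r)
```

A regex like this is a blunt tool: it would also rewrite text inside comments or CDATA sections that happens to look like a tag, so fixing the producer of the XML is still the better option when you control it.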

Related

nokogiri nodeset shows all nodes from document

I'm selecting a nodeset according to some condition. The resulting nodeset is correct. But if I do an xpath on it I get all the nodes from the document. I must be missing something here. Explanation and solution would be appreciated.
require 'nokogiri'
doc = Nokogiri::XML(DATA)
selection = doc.xpath("//listing[code[contains(text(), '34')]]")
p selection.length ## 2
p selection.xpath("//id").inner_text ##34567 (ids of all nodes), I'm trying to get 35 instead
__END__
<?xml version="1.0" encoding="UTF-8"?>
<listings>
<listing>
<id>3</id>
<code>3,4,55,34</code>
</listing>
<listing>
<id>4</id>
<code>3,4,55,33</code>
</listing>
<listing>
<id>5</id>
<code>3,4,55,34</code>
</listing>
<listing>
<id>6</id>
<code>3,4,55</code>
</listing>
<listing>
<id>7</id>
<code>3,14</code>
</listing>
</listings>

open xml file with nokogiri update node and save

I'm trying to figure out how to open an xml file, search by an id, replace a value in the node and then resave the document.
my xml
<?xml version="1.0"?>
<data>
<user id="1370018670618">
<email>1@1.com</email>
<sent>false</sent>
</user>
<user id="1370018701357">
<email>2@2.com</email>
<sent>false</sent>
</user>
<user id="1370018769724">
<email>3@3.com</email>
<sent>false</sent>
</user>
<user id="1370028546850">
<email>4@4.com</email>
<sent>false</sent>
</user>
<user id="1370028588345">
<email>5@5.com</email>
<sent>false</sent>
</user>
</data>
My code to open and find a node
xml_content = File.read("/home/mike/app/users.xml")
doc = Nokogiri::XML(xml_content)
node_update = doc.search("//user[@id='1370028588345']//sent")
node_update.inner_html ##returns value of "sent"
The part where I'm stuck is actually updating the node: node_update.inner_html = "true" raises a method error on inner_html. After that, I need to save the updated file.
First of all, your node_update is actually a NodeSet, not the Node that you probably think it is. You need a Node if you want to call inner_html= on it:
node_update[0].inner_html = 'true'
Then writing out the updated XML is just a bit of standard file manipulating combined with a to_xml call:
File.open('whatever.xml', 'w') { |f| f.print(doc.to_xml) }
As an aside, your input isn't valid XML. You have a </details> but no <details>.

Hash xml parse to json shows does not have valid root

I am converting an XML file to JSON, and it throws this error:
The document "some xml data" does not have a valid root.
I am using the json gem to convert; my code is:
require 'json'
scheduleDoc = "xmlfile"
scheduleData = Hash.from_xml(scheduleDoc).to_json
puts "schedule json #{scheduleData}"
How do I convert XML to JSON in Rails?
Can we see the xml file?
First of all, make sure it begins with the right XML declaration.
Example:
<?xml version="1.0" encoding="utf-8"?>
Then try wrapping the entire document in a single root element:
<?xml version="1.0" encoding="utf-8"?>
<root>
<sometag></sometag>
<sometag></sometag>
<someothertag>
<othercontent></othercontent>
...
</someothertag>
</root>

CDATA not working on rails

I have the XML below in my code.
XML Parsing Error: not well-formed
Location: http://localhost:3000/api/client?client=test1
Line Number 1, Column 1111:
<?xml version="1.0" encoding="UTF-8"?>
<application>
<name><![CDATA[TESTapp2]]></name>
<application-identifier>wac-8c28afa4-0f6e-11e1-8885-7071bc62c7bc</application-identifier>
<clients>
<pricepoint id="1" name=<![CDATA[TEST-price]]> currency="dollar" locale="la" country="india" price="50" text="this is a TEST" receipt="oi120934" operator-reference="1213w" operator-id="1"></pricepoint></pricepoints><product-image></product-image>
</clients>
</application>
<name><![CDATA[TESTapp2]]></name> works.
<name=\"[CDATA[TESTapp2]]\"> does not work; it throws an encoding error.
AFAIK, using CDATA as an attribute value is forbidden. CDATA sections can only be used in element content (text nodes).
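For attribute values the usual alternative is to escape the special characters instead. A minimal sketch using Ruby's built-in String#encode with the xml: :attr option, which escapes &, <, > and " and wraps the result in double quotes; the helper name and the sample values are made up for illustration.

```ruby
# Hypothetical helper: build a <pricepoint> element with escaped attributes.
def pricepoint_tag(id, name)
  # encode(xml: :attr) returns the value already escaped and double-quoted.
  "<pricepoint id=#{id.to_s.encode(xml: :attr)} name=#{name.encode(xml: :attr)}/>"
end

puts pricepoint_tag(1, 'TEST-price')
puts pricepoint_tag(2, 'Tom & "Jerry" <deluxe>')
```

If the XML is produced by a builder or a Rails view helper, that layer normally does this escaping for you, so hand-escaping is mostly needed when strings are concatenated manually.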

Nokogiri::XML not creating xml document

Alright, so the ultimate goal here is to parse the data inside of an XML response. The response comes in the format of a Ruby string. The problem is that I'm getting an error when creating the xml file from that string (I know for a fact that response.body.to_s is a valid string of XML):
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<CardTxn>
<authcode>123</authcode>
<card_scheme>Mastercard</card_scheme>
<country>United Kingdom</country>
</CardTxn>
<datacash_reference>XXXX</datacash_reference>
<merchantreference>XX0001</merchantreference>
<mode>TEST</mode>
<reason>ACCEPTED</reason>
<status>1</status>
<time>1286477267</time>
</Response>
Inside the ruby method I try to generate an xml file:
doc = Nokogiri::XML(response.body.to_s)
the output of doc.to_s after the above code executes is:
<?xml version="1.0"?>
Any ideas why the file is not getting generated correctly?
This works for me on 1.9.2. Notice it's Nokogiri::XML.parse().
require 'nokogiri'
asdf = %q{<?xml version="1.0" encoding="UTF-8"?>
<Response>
<CardTxn>
<authcode>123</authcode>
<card_scheme>Mastercard</card_scheme>
<country>United Kingdom</country>
</CardTxn>
<datacash_reference>XXXX</datacash_reference>
<merchantreference>XX0001</merchantreference>
<mode>TEST</mode>
<reason>ACCEPTED</reason>
<status>1</status>
<time>1286477267</time>
</Response>
}
doc = Nokogiri::XML.parse(asdf)
print doc.to_s
This parses the XML into a Nokogiri XML document, but doesn't create a file. doc.to_s only shows you what it would be like if you printed it.
To create a file, replace "print doc.to_s" with:
File.open('xml.out', 'w') do |fo|
fo.print doc.to_s
end
