XML parsing in Rails - ruby-on-rails

I have this XML data:
<?xml version="1.0" encoding="UTF-8"?>
<responseParam>
<RESULT>-1</RESULT>
<ERROR_CODE>509</ERROR_CODE>
</responseParam>
How can I fetch the value of error code only?
I have tried this:
result = Net::HTTP.get(URI.parse(otpUrl))
data = Hash.from_xml(result)
puts "#{data['ERROR_CODE']}"
puts data[:ERROR_CODE]
Printing only data gives me the whole hash, but I am not able to get just the value of ERROR_CODE.
Any help?

You can use Nokogiri here.
Suppose this is your error.xml:
<?xml version="1.0" encoding="UTF-8"?>
<responseParam>
<RESULT>-1</RESULT>
<ERROR_CODE>509</ERROR_CODE>
</responseParam>
You can do something like:
@doc = Nokogiri::XML(File.open("error.xml"))
@doc.xpath("//ERROR_CODE")
which will give you something like:
# => ["<ERROR_CODE>509</ERROR_CODE>"]
The Node methods xpath and css actually return a NodeSet, which acts very much like an array and contains the matching nodes from the document. Call text on the result to get only the value, here "509".
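To make the "value only" step concrete, here is a minimal sketch. It uses REXML instead of Nokogiri purely so that it runs without extra gems; that swap is my assumption, not part of the answer above. With Nokogiri the same idea should be `@doc.at_xpath("//ERROR_CODE").text`.

```ruby
require 'rexml/document'

# The response body from the question, inlined so the sketch is self-contained.
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <responseParam>
    <RESULT>-1</RESULT>
    <ERROR_CODE>509</ERROR_CODE>
  </responseParam>
XML

doc = REXML::Document.new(xml)

# XPath.first returns the first matching element, or nil if nothing matches.
node = REXML::XPath.first(doc, '//ERROR_CODE')

# Guard against a missing element before asking for its text.
error_code = node ? node.text : nil
puts error_code
```

As for the original Hash.from_xml attempt: the root element becomes the top-level key of the hash, so the value most likely sits at data['responseParam']['ERROR_CODE'] rather than data['ERROR_CODE'].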

Related

XSLT: how to pass different prefix values for a node and its fields

I have a requirement where nodes have one prefix whereas the fields under them have a different prefix. How can I achieve this using XSLT? I have attached a sample input and the expected output. Can you please advise?
I expect nodes to have the prefix "cac" and their fields the prefix "cbc", and the namespace prefix ns2 to be replaced with rl.
Input:
<?xml version="1.0" encoding="UTF-8"?>
<ns0:StandardBusinessDoc xmlns:ns0="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader">
<ns1:Invoice xmlns:ns1="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
<ns1:CustomizationID>urn:cen.eu:en131:2017#compliant#urn:fdc:peppol.eu:2017:pocc:billing:3.0</ns1:CustomizationID>
<ns1:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</ns1:ProfileID>
<ns1:ID>80160238</ns1:ID>
<ns1:BuyerReference>202208_604</ns1:BuyerReference>
<ns1:BillingReference>
<ns1:InvoiceDocumentReference>
<ns1:ID>test</ns1:ID>
<ns1:IssueDate>2022-09-28</ns1:IssueDate>
</ns1:InvoiceDocumentReference>
</ns1:BillingReference>
<ns1:AdditionalDocumentReference>
<ns1:ID>06AB87FD6E1E1EED96F1653A13ADC23</ns1:ID>
<ns1:DocumentDescription>SupplierUID</ns1:DocumentDescription>
</ns1:AdditionalDocumentReference>
<ns1:AdditionalDocumentReference>
<ns1:ID>2M</ns1:ID>
<ns1:DocumentDescription>Series</ns1:DocumentDescription>
</ns1:AdditionalDocumentReference>
<ns2:Classification xmlns:ns2="rl:rl-einvoicing">
<ns2:Line>
<ns2:ID>000010</ns2:ID>
<ns2:VatCategory>
<ns2:VatRate>24</ns2:VatRate>
<ns2:IncomeClassification>
<ns2:Category>category1_2</ns2:Category>
<ns2:Type>E3_561_005</ns2:Type>
<ns2:Amount>112.33</ns2:Amount>
</ns2:IncomeClassification>
</ns2:VatCategory>
</ns2:Line>
</ns2:Classification>
</ns0:StandardBusinessDoc>
Expected Output:
<?xml-model href="http://www.unece.org/fileadmin/DAM/cefact/namespaces/StandardBusinessDocumentHeader/StandardBusinessDocumentHeader.xsd" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<StandardBusinessDoc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader http://www.unece.org/fileadmin/DAM/cefact/namespaces/StandardBusinessDocumentHeader/StandardBusinessDocumentHeader.xsd" xmlns:rl="rl:rl-einvoicing" xmlns="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader">
<Invoice xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
<cbc:CustomizationID>urn:cen.eu:en131:2017#compliant#urn:fdc:peppol.eu:2017:pocc:billing:3.0</cbc:CustomizationID>
<cbc:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</cbc:ProfileID>
<cbc:ID>80160238</cbc:ID>
<cbc:BuyerReference>202208_604</cbc:BuyerReference>
<cac:BillingReference>
<cac:InvoiceDocumentReference>
<cbc:ID>test</cbc:ID>
<cbc:IssueDate>2022-09-28</cbc:IssueDate>
</cac:InvoiceDocumentReference>
</cac:BillingReference>
<cac:AdditionalDocumentReference>
<cbc:ID>06AB87FD6E1E1EED96F1653A13ADC23</cbc:ID>
<cbc:DocumentDescription>SupplierUID</cbc:DocumentDescription>
</cac:AdditionalDocumentReference>
<cac:AdditionalDocumentReference>
<cbc:ID>2Μ</cbc:ID>
<cbc:DocumentDescription>Series</cbc:DocumentDescription>
</cac:AdditionalDocumentReference>
<rl:Classification>
<rl:Line>
<rl:ID>000010</rl:ID>
<rl:VatCategory>
<rl:VatRate>24</rl:VatRate>
<rl:IncomeClassification>
<rl:Category>category1_2</rl:Category>
<rl:Type>E3_561_005</rl:Type>
<rl:Amount>112.33</rl:Amount>
</rl:IncomeClassification>
</rl:VatCategory>
</rl:Line>
</rl:Classification>
</StandardBusinessDoc>

Removing elements with XPath

So let's say I have an XML file and I want to remove some nodes from it using their XPath. How would I do that and is it possible in the first place with xmerl or erlsom or maybe something else?
And if there is not a simple way with XPath, what is the correct way to remove elements from XML in general?
As stated by W3C,
XPath is a language for addressing parts of an XML document
The above literally means XPath is for querying XML, not for modifying it. The common approach to modifying an XML document would be one of these:
using an XSLT transformation;
reading the content into memory, modifying it, and saving it back to the file.
AFAIU, the former is out of the scope of this question. For the latter, one might use the Exsom library, which is “an Elixir wrapper around the erlsom XML parsing library.”
Assuming we have the xml and xsd taken from the Exsom examples:
<?xml version="1.0" encoding="UTF-8"?>
<foo attr="yo">
<bar>1</bar>
<bar>2</bar>
</foo>
One might do something like this to delete the second bar node (most of the code is taken as is from the Exsom tests):
{ :ok, model } = Exsom.XSD.File.parse("complex.xsd")
{ :ok, instance, _ } = Exsom.File.parse("complex.xml", model)
#⇒ {:ok, {:foo_type, [], 'yo', ['1', '2']}}
# Modify it according to what you want, e.g. remove the bar element with value 2
instance = {:foo_type, [], 'yo', ['1']}
{ :ok, binary_xml } = Exsom.compose(instance, model, [{ :output, :binary }])
#⇒ {:ok, "<foo attr=\"yo\"><bar>1</bar></foo>"}
Now you might write the binary_xml to a file.

Get xml-stylesheet when using xml type provider?

How can I get the xml-stylesheet href using the XML type provider?
let xml = XmlProvider<"""<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type='text/xsl' href='/stylesheets/application_internet.xsl'?>
<application>......</application>""">.GetSample()
let stylesheetHref = xml.....?
Expected result: the string '/stylesheets/application_internet.xsl'.
There is no easy way I know of to get processing instructions and associated data using TypeProviders (or Linq to XML).
It can be done like this, though:
For your example XML, GetSample returns just the root element content, i.e. ....... Changing the sample a bit lets us access the root XElement. Knowing that the processing instruction is its preceding sibling, we can cast PreviousNode to an XProcessingInstruction and extract the URL from its Data.
#I "../packages/FSharp.Data.2.2.5/lib/net40"
#r "System.Xml.Linq"
#r "FSharp.Data.dll"
open FSharp.Data
open System.Text.RegularExpressions
open System.Xml.Linq
let href s = Regex.Match(s, "href='(?<url>.*?)'").Groups.["url"].Value
type Xml = XmlProvider<"""<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type='text/xsl' href='/stylesheets/application_internet.xsl'?>
<application><other></other></application>""">
let doc = Xml.GetSample()
let stylesheetProcessing = (doc.XElement.PreviousNode :?> XProcessingInstruction)
// /stylesheets/application_internet.xsl
let url = href stylesheetProcessing.Data
Obviously this code expects the XML to always have a valid instruction in the same place. Adding error handling is left as an exercise :-)

Rails nokogiri parse XML file

I'm a little bit confused: I could not find good examples on the web of parsing XML with Nokogiri.
An example of my data:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<rows SessionGUID="6448680D1">
<row>
<AnalogueCode>0451103079</AnalogueCode>
<AnalogueCodeAsIs>0451103079</AnalogueCodeAsIs>
<AnalogueManufacturerName>BOSCH</AnalogueManufacturerName>
<AnalogueWeight>0.000</AnalogueWeight>
<CodeAsIs>OC90</CodeAsIs>
<DeliveryVariantPriceAKiloForClientDescription />
<DeliveryVariantPriceAKiloForClientPrice>0.00</DeliveryVariantPriceAKiloForClientPrice>
<DeliveryVariantPriceNote />
<PriceListItemDescription />
<PriceListItemNote />
<IsAvailability>1</IsAvailability>
<IsCross>1</IsCross>
<LotBase>1</LotBase>
<LotType>1</LotType>
<ManufacturerName>KNECHT/MAHLE</ManufacturerName>
<OfferName>MSC-STC-58</OfferName>
<PeriodMin>2</PeriodMin>
<PeriodMax>4</PeriodMax>
<PriceListDiscountCode>31087</PriceListDiscountCode>
<ProductName>Фильтр масляный</ProductName>
<Quantity>41</Quantity>
<SupplierID>30</SupplierID>
<GroupTitle>Замена</GroupTitle>
<Price>203.35</Price>
</row>
<row>
<AnalogueCode>0451103079</AnalogueCode>
<AnalogueCodeAsIs>0451103079</AnalogueCodeAsIs>
<AnalogueManufacturerName>BOSCH</AnalogueManufacturerName>
<AnalogueWeight>0.000</AnalogueWeight>
<CodeAsIs>OC90</CodeAsIs>
<DeliveryVariantPriceAKiloForClientDescription />
<DeliveryVariantPriceAKiloForClientPrice>0.00</DeliveryVariantPriceAKiloForClientPrice>
<DeliveryVariantPriceNote />
<PriceListItemDescription />
<PriceListItemNote>[0451103079] Bosch,MTGC#0451103079</PriceListItemNote>
<IsAvailability>1</IsAvailability>
<IsCross>1</IsCross>
<LotBase>1</LotBase>
<LotType>0</LotType>
<ManufacturerName>KNECHT/MAHLE</ManufacturerName>
<OfferName>MSC-STC-1303</OfferName>
<PeriodMin>3</PeriodMin>
<PeriodMax>5</PeriodMax>
<PriceListDiscountCode>102134</PriceListDiscountCode>
<ProductName>Фильтр масляный</ProductName>
<Quantity>5</Quantity>
<SupplierID>666</SupplierID>
<GroupTitle>Замена</GroupTitle>
<Price>172.99</Price>
</row>
</rows>
</root>
and the Ruby code:
...
xml_doc = Nokogiri::XML(response.body)
parts = xml_doc.xpath('/root/rows/row')
Can I do this with XPath? Also, how do I get at the individual parts (row) in this parts object?
You're on the right track. parts = xml_doc.xpath('/root/rows/row') gives you back a NodeSet i.e. a list of the <row> elements.
You can loop through these using each or use row indexes like parts[0], parts[1] to access specific rows. You can then get the values of child nodes using xpath on the individual rows.
e.g. you could build a list of the AnalogueCode for each part with:
codes = []
parts.each do |row|
codes << row.xpath('AnalogueCode').text
end
Looking at the full example of the XML you're processing, there are two issues preventing your XPath from matching:
the <root> tag isn't actually the root element of the XML, so /root/.. doesn't match
the XML uses namespaces, so you need to include these in your XPaths
So there are a few possible solutions:
1. use CSS selectors rather than XPaths (i.e. use search), as suggested by the Tin Man
2. after xml_doc = Nokogiri::XML(response.body), call xml_doc.remove_namespaces! and then use parts = xml_doc.xpath('//root/rows/row'), where the leading double slash is XPath syntax for matching the <root> element anywhere in the document
3. specify the namespaces, e.g.:
xml_doc = Nokogiri::XML(response.body)
ns = xml_doc.collect_namespaces
parts = xml_doc.xpath('//xmlns:rows/xmlns:row', ns)
codes = []
parts.each do |row|
  # query relative to the current row, not the whole document
  codes << row.xpath('xmlns:AnalogueCode', ns).text
end
I would go with 1. or 2. :-)
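For anyone who does need the explicit-namespace route, a small self-contained sketch follows. The wrapper element, the namespace URI http://example.com/parts and the prefix p are all made up for illustration, since the full response is not shown in the question. It uses REXML rather than Nokogiri so it runs without extra gems; the idea of passing a prefix-to-URI hash alongside the XPath is the same.

```ruby
require 'rexml/document'

# Hypothetical response: <root> is nested inside another element and
# declares a default namespace (the URI is an assumption).
xml = <<~XML
  <envelope>
    <root xmlns="http://example.com/parts">
      <rows>
        <row><AnalogueCode>0451103079</AnalogueCode><Price>203.35</Price></row>
        <row><AnalogueCode>0451103079</AnalogueCode><Price>172.99</Price></row>
      </rows>
    </root>
  </envelope>
XML

doc = REXML::Document.new(xml)

# Map a prefix of our choosing to the namespace URI used in the document.
ns = { 'p' => 'http://example.com/parts' }

# Every step of the path carries the prefix, including the relative lookup per row.
rows = REXML::XPath.match(doc, '//p:rows/p:row', ns)
prices = rows.map { |row| REXML::XPath.first(row, 'p:Price', ns).text }

puts prices.inspect
```

The prefix in the XPath does not have to match whatever prefix the document uses; only the URI has to agree.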
First, Nokogiri supports XPath AND CSS. I recommend using CSS because it's more easily read:
doc.search('row')
will return a NodeSet of every <row> in the document.
The equivalent XPath is:
doc.search('//row')
...how to get this parts object (row)?
I'm not sure what that means, but if you want to access individual elements inside a <row>, it's easily done several ways.
If you only want one node inside each of the row nodes:
doc.search('row Price').map(&:to_xml)
# => ["<Price>203.35</Price>", "<Price>172.99</Price>"]
doc.search('//row/Price').map(&:to_xml)
# => ["<Price>203.35</Price>", "<Price>172.99</Price>"]
If you only want the first such occurrence, use at, which is the equivalent of search(...).first:
doc.at('row Price').to_xml
# => "<Price>203.35</Price>"
Typically we want to iterate over a number of blocks and return an array of hashes of the data found:
row_hash = doc.search('row').map{ |row|
{
AnalogueCode: row.at('AnalogueCode').text,
Price: row.at('Price').text,
}
}
row_hash
# => [{:AnalogueCode=>"0451103079", :Price=>"203.35"},
# {:AnalogueCode=>"0451103079", :Price=>"172.99"}]
These are ALL covered in Nokogiri's tutorials and are answered many times here on Stack Overflow, so take the time to read and search.

Read XML file with Nokogiri

I currently have an XML file that is read correctly except for one part. It is an item list, and sometimes one item has multiple barcodes. My code only pulls out the first one. How can I iterate over multiple barcodes? Please see the code below:
def self.pos_import(xml)
  Plu.transaction do
    Plu.delete_all
    xml.xpath('//Item').each do |xml|
      plu_import = Plu.new
      plu_import.update_pointer = xml.at('Update_Type').content
      plu_import.plu = xml.at('item_no').content
      plu_import.dept = xml.at('department').content
      plu_import.item_description = xml.at('item_description').content
      plu_import.price = xml.at('item_price').content
      plu_import.barcodes = xml.at('UPC_Code').content
      plu_import.sync_date = Time.now
      plu_import.save!
    end
  end
end
My test XML file looks like this:
<?xml version="1.0" encoding="UTF-16" standalone="no"?>
<items>
<Item>
<Update_Type>2</Update_Type>
<item_no>0000005110</item_no>
<department>2</department>
<item_description>DISC-ALCOHOL PAD STERIL 200CT</item_description>
<item_price>7.99</item_price>
<taxable>No</taxable>
<Barcode>
<UPC_Code>0000005110</UPC_Code>
<UPC_Code>1234567890</UPC_Code>
</Barcode>
</Item>
</items>
Any ideas how to pull both UPC_Code fields out and write them to my database?
.at will always return a single element. To get an array of elements use xpath like you do to get the list of Item elements.
plu_import.barcodes = xml.xpath('//UPC_Code').map(&:content)
Thanks for all the great tips. They definitely led me in the right direction. The way I got it to work was by adding a period before the double slash, which makes the XPath relative to the current Item instead of the whole document:
plu_import.barcodes = xml.xpath('.//UPC_Code').map(&:content)
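Here is a rough sketch of why the leading period matters. The second Item is invented so the difference becomes visible, and REXML is used instead of Nokogiri only so the snippet runs without extra gems (both are my assumptions). In XPath, a path starting with // is evaluated from the document root even when called on a single Item, while .// only looks below the current node.

```ruby
require 'rexml/document'

# Test data based on the question, plus a hypothetical second Item.
xml = <<~XML
  <Items>
    <Item>
      <item_no>0000005110</item_no>
      <Barcode>
        <UPC_Code>0000005110</UPC_Code>
        <UPC_Code>1234567890</UPC_Code>
      </Barcode>
    </Item>
    <Item>
      <item_no>0000005111</item_no>
      <Barcode>
        <UPC_Code>5555555555</UPC_Code>
      </Barcode>
    </Item>
  </Items>
XML

doc = REXML::Document.new(xml)

# Collect the barcodes per item using a path relative to each Item.
barcodes_by_item = {}
REXML::XPath.each(doc, '//Item') do |item|
  item_no = REXML::XPath.first(item, 'item_no').text
  barcodes_by_item[item_no] = REXML::XPath.match(item, './/UPC_Code').map(&:text)
end

# For contrast: an absolute path evaluated on the first Item is not
# restricted to that Item.
first_item = REXML::XPath.first(doc, '//Item')
absolute_codes = REXML::XPath.match(first_item, '//UPC_Code').map(&:text)

puts barcodes_by_item.inspect
puts absolute_codes.inspect
```

With the relative path each item only gets its own barcodes, which is what you want before writing them to the database.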

Resources