How to create and compare variables in XSLT? - xslt-2.0

How do I create aff/@id and xref/@rid values for the xref element using XSLT, and compare the xref/sup value with the aff/sub value?
Input XML:
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Mochizuki</surname>
<given-names>Taku</given-names>
</name>
<xref ref-type="aff" rid="aff_ "><sup> </sup></xref>
<xref ref-type="aff" rid="aff_ "><sup> </sup></xref>
</contrib>
</contrib-group>
<aff id="aff_ "><sub> </sub>Departments of ...... Yokohama, Japan</aff>
<aff id="aff_ "><sub> </sub>Department of ...... Yokohama, Japan</aff>

Related

XSLT: how to pass different prefix values for a node and its fields

I have a requirement where nodes have one prefix and the fields under them have a different prefix. How can I achieve this using XSLT? I have attached a sample input and its expected output. Can you please advise?
I expect nodes to have the prefix "cac" and their fields the prefix "cbc", and the ns2 prefix to be replaced with rl.
Input:
<?xml version="1.0" encoding="UTF-8"?>
<ns0:StandardBusinessDoc xmlns:ns0="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader">
<ns1:Invoice xmlns:ns1="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
<ns1:CustomizationID>urn:cen.eu:en131:2017#compliant#urn:fdc:peppol.eu:2017:pocc:billing:3.0</ns1:CustomizationID>
<ns1:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</ns1:ProfileID>
<ns1:ID>80160238</ns1:ID>
<ns1:BuyerReference>202208_604</ns1:BuyerReference>
<ns1:BillingReference>
<ns1:InvoiceDocumentReference>
<ns1:ID>test</ns1:ID>
<ns1:IssueDate>2022-09-28</ns1:IssueDate>
</ns1:InvoiceDocumentReference>
</ns1:BillingReference>
<ns1:AdditionalDocumentReference>
<ns1:ID>06AB87FD6E1E1EED96F1653A13ADC23</ns1:ID>
<ns1:DocumentDescription>SupplierUID</ns1:DocumentDescription>
</ns1:AdditionalDocumentReference>
<ns1:AdditionalDocumentReference>
<ns1:ID>2M</ns1:ID>
<ns1:DocumentDescription>Series</ns1:DocumentDescription>
</ns1:AdditionalDocumentReference>
<ns2:Classification xmlns:ns2="rl:rl-einvoicing">
<ns2:Line>
<ns2:ID>000010</ns2:ID>
<ns2:VatCategory>
<ns2:VatRate>24</ns2:VatRate>
<ns2:IncomeClassification>
<ns2:Category>category1_2</ns2:Category>
<ns2:Type>E3_561_005</ns2:Type>
<ns2:Amount>112.33</ns2:Amount>
</ns2:IncomeClassification>
</ns2:VatCategory>
</ns2:Line>
</ns2:Classification>
</ns0:StandardBusinessDoc>
Expected Output:
<?xml-model href="http://www.unece.org/fileadmin/DAM/cefact/namespaces/StandardBusinessDocumentHeader/StandardBusinessDocumentHeader.xsd" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<StandardBusinessDoc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xs="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader http://www.unece.org/fileadmin/DAM/cefact/namespaces/StandardBusinessDocumentHeader/StandardBusinessDocumentHeader.xsd" xmlns:rl="rl:rl-einvoicing" xmlns="http://www.unece.org/cefact/namespaces/StandardBusinessDocumentHeader">
<Invoice xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2">
<cbc:CustomizationID>urn:cen.eu:en131:2017#compliant#urn:fdc:peppol.eu:2017:pocc:billing:3.0</cbc:CustomizationID>
<cbc:ProfileID>urn:fdc:peppol.eu:2017:poacc:billing:01:1.0</cbc:ProfileID>
<cbc:ID>80160238</cbc:ID>
<cbc:BuyerReference>202208_604</cbc:BuyerReference>
<cac:BillingReference>
<cac:InvoiceDocumentReference>
<cbc:ID>test</cbc:ID>
<cbc:IssueDate>2022-09-28</cbc:IssueDate>
</cac:InvoiceDocumentReference>
</cac:BillingReference>
<cac:AdditionalDocumentReference>
<cbc:ID>06AB87FD6E1E1EED96F1653A13ADC23</cbc:ID>
<cbc:DocumentDescription>SupplierUID</cbc:DocumentDescription>
</cac:AdditionalDocumentReference>
<cac:AdditionalDocumentReference>
<cbc:ID>2Μ</cbc:ID>
<cbc:DocumentDescription>Series</cbc:DocumentDescription>
</cac:AdditionalDocumentReference>
<rl:Classification>
<rl:Line>
<rl:ID>000010</rl:ID>
<rl:VatCategory>
<rl:VatRate>24</rl:VatRate>
<rl:IncomeClassification>
<rl:Category>category1_2</rl:Category>
<rl:Type>E3_561_005</rl:Type>
<rl:Amount>112.33</rl:Amount>
</rl:IncomeClassification>
</rl:VatCategory>
</rl:Line>
</rl:Classification>
</StandardBusinessDoc>

Ant script to extract an XML tag whose attribute value matches a specific string

I have an XML file as below:
<sca:composite xmlns:sca="http://www.osoa.org/xmlns/sca/1.0" xmlns:atleastonce="http://www.tibco.com/wrm/policy/atleastonce" xmlns:common="http://xsd.tns.tibco.com/n2/models/common" xmlns:compositeext="http://schemas.tibco.com/amx/3.0/compositeext" xmlns:jdbc="http://xsd.tns.tibco.com/amf/models/sharedresource/jdbc" xmlns:pbu="http://www.tibco.com/wrm/policy/pbu" xmlns:pfe="http://xsd.tns.tibco.com/n2/models/pfe/1.0" xmlns:scact="http://xsd.tns.tibco.com/amf/models/sca/componentType" xmlns:scaext="http://xsd.tns.tibco.com/amf/models/sca/extensions" xmlns:service="http://xsd.tns.tibco.com/bx/amx/model" xmlns:smtp="http://xsd.tns.tibco.com/amf/models/sharedresource/smtp" xmlns:soapbt="http://xsd.tns.tibco.com/amf/models/sca/binding/soap" xmlns:startservicefirst="http://www.tibco.com/wrm/policy/startservicefirst" xmlns:threading="http://www.tibco.com/wrm/policy/threading" xmlns:transactedoneway="http://www.tibco.com/wrm/policy/transactedoneway" xmlns:webapp="http://xsd.tns.tibco.com/amf/models/sca/implementationtype/webapp" xmlns:wrm="http://www.tibco.com/wrm" xmlns:xmi="http://www.omg.org/XMI" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" compositeext:formatVersion="2" compositeext:version="1.0.0.20180112132229840" name="za.co.rmb.dealamendmentsmaintenance" targetNamespace="http://www.example.com/za.co.rmb.dealamendmentsmaintenance" xmi:id="_4EfRQfeKEeeZRvktH3XIjg" xmi:version="2.0">
<sca:reference multiplicity="0..1" name="WorkListService_Consumer1" promote="dealAmendmentsMaintenanceProcessFlow/WorkListService_Consumer" wiredByImpl="false" xmi:id="_AR2UQPeLEeeZRvktH3XIjg">
<sca:interface.wsdl interface="http://services.brm.n2.tibco.com#wsdl.interface(WorkListService)" scaext:wsdlLocation=".processOut/process/dealAmendmentsMaintenance.xpdl/brm.wsdl" xmi:id="_AR2UQfeLEeeZRvktH3XIjg"/>
</sca:reference>
<sca:reference multiplicity="0..1" name="CreateDailyTasks_Consumer1" promote="dealAmendmentsMaintenanceProcessFlow/CreateDailyTasks_Consumer" wiredByImpl="false" xmi:id="_ATRQkPeLEeeZRvktH3XIjg">
<sca:interface.wsdl interface="http://www.tibco.com/bs3.0/_8uwIINbzEeWTpucOvGErRg#wsdl.interface(CreateDailyTasks)" scaext:wsdlLocation=".processOut/process/dealAmendmentsMaintenance.xpdl/dealAmendments_segregation.wsdl" xmi:id="_ATRQkfeLEeeZRvktH3XIjg"/>
</sca:reference>
</sca:composite>
With an Ant script I want to extract the value of the "interface" attribute of sca:interface.wsdl by matching an input value against the "name" attribute of sca:reference.
So let's say
if the input is: WorkListService_Consumer1
Expected output: http://services.brm.n2.tibco.com#wsdl.interface(WorkListService)
Similarly, if
the input is: CreateDailyTasks_Consumer1
Expected output: http://www.tibco.com/bs3.0/_8uwIINbzEeWTpucOvGErRg#wsdl.interface(CreateDailyTasks)
I tried various xmltask commands but have not been successful.
Thanks
Shrijeet Sinha
You almost had the solution; however, text() references the inner text of an XML element, such as <element>This text here</element>. An attribute's value is referenced with @ instead:
<xmltask source="xmlfile.xml">
<copy path="sca:composite/sca:reference[@name='${input}']/sca:interface.wsdl/@interface" property="testproperty"/>
</xmltask>
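The same path can be tried outside Ant before wiring it into a build. A minimal sketch in Ruby, assuming REXML (shipped with standard Ruby installs) rather than xmltask, and a trimmed copy of the sample document; interface_for and SCA_NS are hypothetical names introduced only for illustration:

```ruby
require 'rexml/document'

# Trimmed copy of the sample; only the attributes needed for the lookup remain.
xml = <<~'XML'
  <sca:composite xmlns:sca="http://www.osoa.org/xmlns/sca/1.0">
    <sca:reference name="WorkListService_Consumer1">
      <sca:interface.wsdl interface="http://services.brm.n2.tibco.com#wsdl.interface(WorkListService)"/>
    </sca:reference>
    <sca:reference name="CreateDailyTasks_Consumer1">
      <sca:interface.wsdl interface="http://www.tibco.com/bs3.0/_8uwIINbzEeWTpucOvGErRg#wsdl.interface(CreateDailyTasks)"/>
    </sca:reference>
  </sca:composite>
XML

# Prefix-to-URI mapping used to resolve "sca:" inside the XPath expression.
SCA_NS = { 'sca' => 'http://www.osoa.org/xmlns/sca/1.0' }

# Hypothetical helper: returns the interface attribute for the named
# reference, or nil when no reference has that name.
def interface_for(doc, name)
  path = "/sca:composite/sca:reference[@name='#{name}']/sca:interface.wsdl/@interface"
  attribute = REXML::XPath.first(doc, path, SCA_NS)
  attribute && attribute.value
end

doc = REXML::Document.new(xml)
puts interface_for(doc, 'WorkListService_Consumer1')
```

The predicate [@name='...'] picks the reference, and the trailing /@interface step selects the attribute node itself, which is why text() is not needed.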

Rails nokogiri parse XML file

I'm a little confused: I could not find good examples on the web of parsing XML with Nokogiri.
example of my data:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<rows SessionGUID="6448680D1">
<row>
<AnalogueCode>0451103079</AnalogueCode>
<AnalogueCodeAsIs>0451103079</AnalogueCodeAsIs>
<AnalogueManufacturerName>BOSCH</AnalogueManufacturerName>
<AnalogueWeight>0.000</AnalogueWeight>
<CodeAsIs>OC90</CodeAsIs>
<DeliveryVariantPriceAKiloForClientDescription />
<DeliveryVariantPriceAKiloForClientPrice>0.00</DeliveryVariantPriceAKiloForClientPrice>
<DeliveryVariantPriceNote />
<PriceListItemDescription />
<PriceListItemNote />
<IsAvailability>1</IsAvailability>
<IsCross>1</IsCross>
<LotBase>1</LotBase>
<LotType>1</LotType>
<ManufacturerName>KNECHT/MAHLE</ManufacturerName>
<OfferName>MSC-STC-58</OfferName>
<PeriodMin>2</PeriodMin>
<PeriodMax>4</PeriodMax>
<PriceListDiscountCode>31087</PriceListDiscountCode>
<ProductName>Фильтр масляный</ProductName>
<Quantity>41</Quantity>
<SupplierID>30</SupplierID>
<GroupTitle>Замена</GroupTitle>
<Price>203.35</Price>
</row>
<row>
<AnalogueCode>0451103079</AnalogueCode>
<AnalogueCodeAsIs>0451103079</AnalogueCodeAsIs>
<AnalogueManufacturerName>BOSCH</AnalogueManufacturerName>
<AnalogueWeight>0.000</AnalogueWeight>
<CodeAsIs>OC90</CodeAsIs>
<DeliveryVariantPriceAKiloForClientDescription />
<DeliveryVariantPriceAKiloForClientPrice>0.00</DeliveryVariantPriceAKiloForClientPrice>
<DeliveryVariantPriceNote />
<PriceListItemDescription />
<PriceListItemNote>[0451103079] Bosch,MTGC#0451103079</PriceListItemNote>
<IsAvailability>1</IsAvailability>
<IsCross>1</IsCross>
<LotBase>1</LotBase>
<LotType>0</LotType>
<ManufacturerName>KNECHT/MAHLE</ManufacturerName>
<OfferName>MSC-STC-1303</OfferName>
<PeriodMin>3</PeriodMin>
<PeriodMax>5</PeriodMax>
<PriceListDiscountCode>102134</PriceListDiscountCode>
<ProductName>Фильтр масляный</ProductName>
<Quantity>5</Quantity>
<SupplierID>666</SupplierID>
<GroupTitle>Замена</GroupTitle>
<Price>172.99</Price>
</row>
</rows>
</root>
and ruby code:
...
xml_doc = Nokogiri::XML(response.body)
parts = xml_doc.xpath('/root/rows/row')
Can I do this with the help of XPath? Also, how to get this parts object (row)?
You're on the right track. parts = xml_doc.xpath('/root/rows/row') gives you back a NodeSet i.e. a list of the <row> elements.
You can loop through these using each or use row indexes like parts[0], parts[1] to access specific rows. You can then get the values of child nodes using xpath on the individual rows.
e.g. you could build a list of the AnalogueCode for each part with:
codes = []
parts.each do |row|
codes << row.xpath('AnalogueCode').text
end
Looking at the full example of the XML you're processing, there are 2 issues preventing your XPath from matching:
The <root> tag isn't actually the root element of the XML, so /root/.. doesn't match.
The XML is using namespaces, so you need to include these in your XPaths.
So there are a few possible solutions:
1. Use CSS selectors rather than XPaths (i.e. use search), as suggested by the Tin Man.
2. After xml_doc = Nokogiri::XML(response.body), call xml_doc.remove_namespaces! and then use parts = xml_doc.xpath('//root/rows/row'), where the leading double slash is XPath syntax that matches a root element at any depth in the document.
3. Specify the namespaces, e.g.:
xml_doc = Nokogiri::XML(response.body)
ns = xml_doc.collect_namespaces
parts = xml_doc.xpath('//xmlns:rows/xmlns:row', ns)
codes = []
parts.each do |row|
codes << row.xpath('xmlns:AnalogueCode', ns).text
end
I would go with 1. or 2. :-)
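The namespace-mapping idea behind the third option is not specific to Nokogiri. A minimal sketch using REXML from the Ruby standard distribution (swapped in here so it runs without extra gems); the <envelope> wrapper, the PARTS_NS URI, and the p prefix are all hypothetical, since the full response is not shown in the question:

```ruby
require 'rexml/document'

# Hypothetical namespace URI; the real one comes from the actual response.
PARTS_NS = 'http://example.com/parts'

# The default namespace on the wrapper applies to every element below it.
xml = <<~XML
  <envelope xmlns="#{PARTS_NS}">
    <root>
      <rows>
        <row><AnalogueCode>0451103079</AnalogueCode><Price>203.35</Price></row>
        <row><AnalogueCode>0451103079</AnalogueCode><Price>172.99</Price></row>
      </rows>
    </root>
  </envelope>
XML

doc = REXML::Document.new(xml)

# The prefix is chosen by the caller; it only has to map to the right URI.
ns = { 'p' => PARTS_NS }

rows = REXML::XPath.match(doc, '//p:rows/p:row', ns)
prices = rows.map { |row| REXML::XPath.first(row, 'p:Price', ns).text }
p prices
```

Every step of the path carries the prefix, including the relative lookup inside each row, because the child elements inherit the default namespace as well.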
First, Nokogiri supports XPath AND CSS. I recommend using CSS because it's more easily read:
doc.search('row')
will return a NodeSet of every <row> in the document.
The equivalent XPath is:
doc.search('//row')
...how to get this parts object (row)?
I'm not sure what that means, but if you want to access individual elements inside a <row>, it's easily done several ways.
If you only want one node inside each of the row nodes:
doc.search('row Price').map(&:to_xml)
# => ["<Price>203.35</Price>", "<Price>172.99</Price>"]
doc.search('//row/Price').map(&:to_xml)
# => ["<Price>203.35</Price>", "<Price>172.99</Price>"]
If you only want the first such occurrence, use at, which is the equivalent of search(...).first:
doc.at('row Price').to_xml
# => "<Price>203.35</Price>"
Typically we want to iterate over a number of blocks and return an array of hashes of the data found:
row_hash = doc.search('row').map{ |row|
{
AnalogueCode: row.at('AnalogueCode').text,
Price: row.at('Price').text,
}
}
row_hash
# => [{:AnalogueCode=>"0451103079", :Price=>"203.35"},
# {:AnalogueCode=>"0451103079", :Price=>"172.99"}]
These are ALL covered in Nokogiri's tutorials and are answered many times here on Stack Overflow, so take the time to read and search.

Find child of child which attribute code is equal to the parameter passed on the url - XSL

On this dynamic website, the URL looks something like this: departments/CHEM.html
CHEM is a parameter:
<xsl:param name="dep" select="'CHEM'" />
A piece of the XML is below:
<course acad_year="2012" cat_num="5085" offered="Y">
<term term_pattern_code="1" fall_term="Y" spring_term="N">fall term</term>
<department code="CHEM">
<dept_long_name>Department of Chemistry and Chemical Biology</dept_long_name>
<dept_short_name>Chemistry and Chemical Biology</dept_short_name>
</department>
</course> ....
I am trying to get the dept_short_name to use on my H1 tag, but I have not been successful. So far I tried
<h2><xsl:value-of select="course/department/[code={@$dep}]"/></h2>
Any suggestions??? Thanks!
Just use:
<xsl:value-of select="course/department[@code eq $dep]/dept_short_name"/>
Remember:
In XPath 2.0 (XSLT 2.0), use the eq operator for value comparisons. It is more efficient than the general comparison operator =, which is only really needed when at least one of its operands may be a sequence of more than one item.
I would try this:
<xsl:value-of select="course/department[@code=$dep]/dept_short_name/text()"/>
That says: find the department element (inside a course element) whose code attribute is the value of parameter "dep", then find the dept_short_name child element, then get the text inside that element.
You have to use the @ to say that "code" is an attribute, but "dep" should not have it. I think the {} notation is for use inside attributes of the non-XSLT elements of your stylesheet, so I wouldn't use it inside a value-of expression.
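To make the predicate's logic concrete, here is a rough sketch in Ruby that spells out each step of course/department[@code = $dep]/dept_short_name by hand. It assumes REXML from the Ruby standard distribution; the <courses> wrapper, the PHYS course, and the short_name_for helper are hypothetical additions for illustration, not part of the original data:

```ruby
require 'rexml/document'

# Trimmed sample; the wrapper and the second course are hypothetical.
xml = <<~'XML'
  <courses>
    <course cat_num="5085">
      <department code="CHEM">
        <dept_short_name>Chemistry and Chemical Biology</dept_short_name>
      </department>
    </course>
    <course cat_num="1234">
      <department code="PHYS">
        <dept_short_name>Physics</dept_short_name>
      </department>
    </course>
  </courses>
XML

# Hypothetical helper mirroring the XPath step by step.
def short_name_for(doc, dep)
  REXML::XPath.match(doc, '//course/department').each do |department|
    # [@code = $dep]: keep only the department whose code attribute
    # equals the parameter value.
    next unless department.attributes['code'] == dep

    # /dept_short_name: step to the child element and read its text.
    child = department.elements['dept_short_name']
    return child.text if child
  end
  nil
end

doc = REXML::Document.new(xml)
puts short_name_for(doc, 'CHEM')
```

The attribute is read from the element (the @code side), while dep is an ordinary value passed in from outside, which is why only the attribute carries the @.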

Parsing encoded tags in Ruby XML document using Nokogiri and regex

I am trying to parse XML with tags embedded in tags, like this one, using Nokogiri and Ruby:
<seg>Trennmesser <ph>&lt;I.FIGREF ITEM="3" FORMAT="PARENTHESIS"&gt;</ph><bpt i="1">&lt;I.FIGTARGET TARGET="CIADDAJA"&gt;</bpt><ept i="1">&lt;/I.FIGREF&gt;</ept></seg>
In this case I would only need the word "Trennmesser" not within the embedded tags.
In this second example:
<seg>Hilfsmittel <ph>&lt;F34#Z7#Lge&gt;</ph>X <ph>&lt;F0&gt;</ph>= 0,5mm zwischen Beschleunigerwalze <ph>&lt;F34#Z7#Lge&gt;</ph>D<ph>&lt;F0&gt;</ph> und Trennmesser schieben.</seg>
The words between a closing </ph> tag and the next opening <ph> tag are also of interest, so the regex would need to extract the string "Hilfsmittel 0,5mm zwischen Beschleunigerwalze und Trennmesser schieben." and discard everything else.
I have also uploaded a part of the document here:
http://pastebin.com/Q8CdnASz
Try this in irb
require 'nokogiri'
x = Nokogiri::XML.parse('<seg>Hilfsmittel <ph>&lt;F34#Z7#Lge&gt;</ph>X <ph>&lt;F0&gt;</ph>= 0,5mm zwischen Beschleunigerwalze <ph>&lt;F34#Z7#Lge&gt;</ph>D<ph>&lt;F0&gt;</ph> und Trennmesser schieben.</seg>')
# Drop the element children (<ph>), then join the content of the remaining text nodes.
x.xpath('//seg').children.reject { |node| node.element? }.map(&:content).join
for me this outputs
=> "Hilfsmittel X = 0,5mm zwischen Beschleunigerwalze D und Trennmesser schieben."
The idea here is that we iterate over the children of the <seg> tag, rejecting the ones that are elements themselves (<ph>), which leaves only the text nodes. Take the resulting array, and join the content of those text nodes together as one string.
Note that the output is slightly different than you described, because there's an additional D and X in between two of the tags.
The content inside the <ph> tags has been encoded to preserve the reserved characters < and >.
A clean way to deal with this is to let Nokogiri reparse those chunks back into XML:
require 'nokogiri'
doc = Nokogiri::XML('<seg>Trennmesser <ph>&lt;I.FIGREF ITEM="3" FORMAT="PARENTHESIS"&gt;</ph><bpt i="1">&lt;I.FIGTARGET TARGET="CIADDAJA"&gt;</bpt><ept i="1">&lt;/I.FIGREF&gt;</ept></seg>')
ph = Nokogiri::XML::DocumentFragment.parse(doc.at('seg ph').content)
puts ph.to_xml
Which outputs the following node, showing Nokogiri recreated that fragment correctly:
<I.FIGREF ITEM="3" FORMAT="PARENTHESIS"/>
For extracting the text inside the <seg> tag:
doc.at('//seg/text()').text
=> "Trennmesser "
When dealing with HTML or XML, it's never good to presuppose that regex will be the best path to extracting something. Both HTML and XML are too irregular and "flexible" (where flexible means it's often irritatingly malformed or defined in totally unique and unexpected ways).
To get the full content inside the <seg> tag in the second question:
require 'nokogiri'
doc = Nokogiri::XML('<seg>Hilfsmittel <ph>&lt;F34#Z7#Lge&gt;</ph>X <ph>&lt;F0&gt;</ph>= 0,5mm zwischen Beschleunigerwalze <ph>&lt;F34#Z7#Lge&gt;</ph>D<ph>&lt;F0&gt;</ph> und Trennmesser schieben.</seg>')
seg = Nokogiri::XML::DocumentFragment.parse(doc.at('seg').content)
puts seg.content
Which outputs:
Hilfsmittel #Z7#Lge>X = 0,5mm zwischen Beschleunigerwalze #Z7#Lge>D und Trennmesser schieben.
