Parse Xml tags with attributes - parsing

I have this xml :
<document-display>
<name>
<entry lang="nl">nl Text</entry>
<entry lang="fr">fr Text</entry>
<entry lang="en">en Text</entry>
</name>
</document-display>
I would like to get the text according to the langage.
I'm using XmlSlurper.
With my current code :
def parsedD = new XmlSlurper().parse(xml)
parsedD."document-display".name.entry.each {it.#lang == 'fr'}
I have as bad result which is the concatenation of the 3 text content :
nl Textfr Texten Text
Thanks for helping.

Try
parsedD.name.entry.find { it.#lang == 'fr' }?.text()

Related

how to parse a xml in erlang?

I have this string with xml extract in a tuple list:
MessageResponse = [{"code",0},{"description","description"},{"respuestaServicioSoap",{{"executeWebServiceSolutionResult",{{"CEDULARUCSpecified", false},{"AUTORIZACION", "00000012431781"},{"AUTORIZACIONSpecified",true},{"RESULTADO","000"},{"CODIGO_RESULTADOSpecified",true},{"COD_PAGO","00000012431781"},{"COD_PAGOSpecified",true},{"COMISION",{{"string","0"}}},{"COMISIONSpecified", true},{"DIRECCIONSpecified", false},{"FECHA_COMPENSACIONSpecified", false},{"FECHA_TRANSACCION","20170116"},{"FECHA_TRANSACCIONSpecified",true},{"FECHORA_SW","20170116123951"},{"FECHORA_SWSpecified",true},{"HORA_TRANSACCION","123951"},{"HORA_TRANSACCIONSpecified",true},{"MENSAJE","TRANSACCION OK"},{"MENSAJESpecified",true},{"NOMBRESpecified",false},{"PRODUCTO","0010761005"},{"PRODUCTOSpecified",true},{"SECUENCIA_ADQ","2833"},{"SECUENCIA_ADQSpecified",true},{"SECUENCIA_SW","576167"},{"SECUENCIA_SWSpecified",true},{"TERMINAL","0696069603000001"},{"TERMINALSpecified",true},{"TYPE_TRNSpecified",false},{"VALOR_TOTAL", { { "string", "0" }}},{"VALOR_TOTALSpecified",true},{"XML_ADDSpecified",false},{"XML_DATASpecified",false},{"XML_FACT","<XML_FACT>\r\n <DATOS_FACT>\r\n <LINEA_1>REPRESENTACIONES ORMAN S.A.</LINEA_1>\r\n <LINEA_2>RUC: 0987654321</LINEA_2>\r\n <LINEA_3 />\r\n <LINEA_4 />\r\n <LINEA_5>FACTURA: 001-627-0000048745</LINEA_5>\r\n <LINEA_6>CLAVE: </LINEA_6>\r\n <LINEA_7>COMISION POR SERVICIO</LINEA_7>\r\n <LINEA_8>RECAUDACION EEAAPP - CUENTA: 11223344</LINEA_8>\r\n <LINEA_12>FACTURA: 001-627-0000048745 - CONSULTE SU DOCUMENTO EN WWW.LITO.COM/DOCUMENTOSELECTRONICOS</LINEA_12>\r\n <MSGCOMP />\r\n <MSGFACT />\r\n </DATOS_FACT>\r\n</XML_FACT>"},{"XML_FACTSpecified",true},{"XML_REPLY_CONSULTASpecified",false},{"XML_REPLY_PAGOSSpecified",false}}},{"executeWebServiceSolutionResultSpecified", true}}},{"result", "ok"}]
and need to get the text in LINEA_5 tag, any idea how to do it?
with this code:
{Xml, _Rest} = xmerl_scan:string(XmlFactura).
[#xmlText{value=Linea5}] = xmerl_xpath:string("//LINEA_5/text()", Xml).
the OTP library xmerl provides all the functions to manipulate XML files or string. It provides a set of record that help to handle different elements.
documentation is available here
The records are defined in erlXX/lib/xmerl-YYY/include/xmerl.hrl:
#xmlText{}
#xmlElement{}
#xmlPI{}
#xmlComment{}
#xmlDecl{}
[edit]
The xml data that you provide in your example is already modified, so I take an example from my own. Consider an xml file with the content:
<?xml version="1.0" encoding="UTF-8"?> <package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="uuid_id">
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:opf="http://www.idpf.org/2007/opf" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:calibre="http://calibre.kovidgoyal.net/2009/metadata" xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:creator opf:role="aut" opf:file-as="Ahern, Cecelia">Cecelia Ahern</dc:creator>
<dc:publisher>J'ai Lu</dc:publisher>
<meta name="calibre:title_sort" content="Si tu me voyais maintenant"/>
<dc:description>description blah blah</dc:description>
<meta name="calibre:timestamp" content="2012-03-18T18:04:20+00:00"/>
<dc:title>Si tu me voyais maintenant</dc:title>
<meta name="cover" content="cover"/>
<dc:date>2012-03-18T18:04:23+00:00</dc:date>
<dc:contributor opf:role="bkp">calibre (0.8.42) [http://calibre-ebook.com]</dc:contributor>
<dc:identifier opf:scheme="ISBN">9782290006504</dc:identifier>
<dc:identifier id="uuid_id" opf:scheme="uuid">7d062b17-258e-4268-9d46-a753c063c969</dc:identifier>
<dc:subject>Chick-lit</dc:subject>
<meta name="calibre:user_categories" content="{}"/>
<meta name="calibre:author_link_map" content="{"Cecelia Ahern": ""}"/>
<dc:language>fr</dc:language>
</metadata>
<manifest>
<item href="cover.jpeg" id="cover" media-type="image/jpeg"/>
</manifest>
<spine toc="ncx">
<itemref idref="titlepage"/>
</spine>
<guide>
<reference href="titlepage.xhtml" type="cover" title="Cover"/>
</guide> </package>
It is extract from an epub book, and stored in a file "content.opf". If I want to get the author name (line 4) I can do:
1> rr("C:\\My programs\\erl8.2\\lib\\xmerl-1.3.12\\include\\xmerl.hrl").
2> {Xml,_} = xmerl_scan:file("../doc/content.opf"),
2> Content = Xml#xmlElement.content,
2> [MetaRec] = [X || X <- Content, X#xmlElement.name == metadata],
2> Meta = MetaRec#xmlElement.content,
2> [CreatRec] = [X || X <- Meta, X#xmlElement.name == 'dc:creator'],
2> Creat = CreatRec#xmlElement.content,
2> [CreatText] = [X || X <- Creat, is_record(X,xmlText)],
2> Aut = CreatText#xmlText.value.
"Cecelia Ahern"

Tag search in HTML using js in ant

I am using the below mentioned code to get the content of a specific tag, but when I am trying to execute it I am getting some extra data along with it, I don't understand why is it happening. Lets say if I search for title tag then I am getting " [echo] Title : <title>Unit Test Results</title>,Unit Test Results" this as result, but the problem is title only contains "<title>Unit Test Results</title>" why this extra ",Unit Test Results" thing is coming.
<project name="extractElement" default="test">
<!--Extract element from html file-->
<scriptdef name="findelement" language="javascript">
<attribute name="tag" />
<attribute name="file" />
<attribute name="property" />
<![CDATA[
var tag = attributes.get("tag");
var file = attributes.get("file");
var regex = "<" + tag + "[^>]*>(.*?)</" + tag + ">";
var patt = new RegExp(regex,"g");
project.setProperty(attributes.get("property"), patt.exec(file));
]]>
</scriptdef>
<!--Only available target...-->
<target name="test">
<loadfile srcFile="E:\backup\latest report\Report-20160523_2036.html" property="html.file"/>
<findelement tag="title" file="${html.file}" property="element"/>
<echo message="Title : ${element}"/>
</target>
The return value of RegExp.exec() is an array. From the Mozilla documentation on RegExp.prototype.exec():
The returned array has the matched text as the first item, and then
one item for each capturing parenthesis that matched containing the
text that was captured.
If you add the following code to your JavaScript...
var patt = new RegExp(regex,"g");
var execResult = patt.exec(file);
print("execResult: " + execResult);
print("execResult.length: " + execResult.length);
print("execResult[0]: " + execResult[0]);
print("execResult[1]: " + execResult[1]);
...you'll get the following output...
[findelement] execResult: <title>Unit Test Results</title>,Unit Test Results
[findelement] execResult.length: 2
[findelement] execResult[0]: <title>Unit Test Results</title>
[findelement] execResult[1]: Unit Test Results

hyperlink in openerp tree view

I want to add a link with ftp url in my tree view. i tested with adding widget="url" in my xml ,but its not working.
Please help
my code is
<tree string="File Names" >
<field name="time_created" string="Time Created"/>
<field name="size" string="Size"/>
<field name="file_name"/>
<field name="file_path" widget="url"/>
</tree>
class filedata(osv.osv):
_name = 'filedata'
_log_access = False
_columns = {
'file_name' : fields.char('Name'),
'file_path' : fields.char('File Path'),
'time_created' : fields.datetime('Date Time'),
'size' : fields.char('Size')
}
download this module and you can get clear idea of putting link in tree view. Hope this will help you.
https://www.openerp.com/apps/6.1/web_url/

Nokogiri get xpath from Nokogiri::XML::Element

How to get xpath for rc an element returned by search
f=File.open('/media/cc.xml')
doc = Nokogiri::XML f
rc = doc.search('realmCode')
[#<Nokogiri::XML::Element:0x15a4d714 name="realmCode" namespace=#<Nokogiri::XML::Namespace:0x15a4dafc href="urn:hl7-org:v3"> attributes=[#<Nokogiri::XML::Attr:0x15a4d5c0 name="code" value="US">]>]
Is there a way to extract xpath from rc ?
updated with xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/ccda.xsl"?>
<ClinicalDocument xmlns:sdtc="urn:hl7-org:sdtc" xmlns:vocdo="urn:hl7-org:v3/voc" xmlns="urn:hl7-org:v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:mif="urn:hl7-org:v3/mif">
<realmCode code="US"/>
<typeId root="2.16.840.1.113883.1.3" extension="POCD_HD000040"/>
<templateId root="2.16.840.1.113883.10.20.22.1.1"/>
<templateId root="2.16.840.1.113883.10.20.22.1.2"/>
<id root="2.16.840.1.113883.19.5.99999.1" extension="Test CCDA"/>
<code codeSystem="2.16.840.1.113883.6.1" codeSystemName="LOINC" code="34133-9" displayName="Summarization of Episode Note"/>
<title>Continuity of Care Document (C-CDA)</title>
<effectiveTime value="20130809043133+0000"/>
<confidentialityCode codeSystem="2.16.840.1.113883.5.25" codeSystemName="HL7 Confidentiality" code="N" displayName="Normal"/>
<languageCode code="en-US"/>
<recordTarget>
<patientRole>
<id root="2.16.840.1.113883.4.6" extension="1"/>
<id root="2.16.840.1.113883.4.1" extension="123-101-5230"/>
<addr use="HP">
<streetAddressLine>1357 Amber Drive</streetAddressLine>
<city nullFlavor="UNK"/>
<state nullFlavor="UNK"/>
<postalCode>97006</postalCode>
<country nullFlavor="UNK"/>
</addr>
<telecom value="3545345" use="HP"/>
<patient>
<name use="L">
<given qualifier="BR">test</given>
<family qualifier="CL">overall</family>
<prefix qualifier="IN">Mr</prefix>
</name>
<administrativeGenderCode codeSystem="2.16.840.1.113883.5.1" codeSystemName="HL7 AdministrativeGender" code="M" displayName="Male"/>
<birthTime value="19770429"/>
<raceCode codeSystem="2.16.840.1.113883.6.238" codeSystemName="Race and Ethnicity - CDC" code="2028-9" displayName="Asian"/>
<ethnicGroupCode codeSystem="2.16.840.1.113883.6.238" codeSystemName="Race and Ethnicity - CDC" code="2135-2" displayName="Hispanic or Latino"/>
<languageCommunication>
<languageCode code="eng"/>
<preferenceInd value="true"/>
</languageCommunication>
</patient>
</patientRole>
</recordTarget>
</ClinicalDocument>
rc is not an element-it's an array of matching elements:
results = rc.map do |node|
Nokogiri::CSS.xpath_for node.css_path
end
p results
Or, if you know there is only one matching element:
xpath = Nokogiri::CSS.xpath_for rc[0].css_path
Note that xpath_for returns an array, so you will need to extract the first element of the array:
xpath.first

xml.parse return null google app script

I am trying parse the xml but result return null.
Here is the xml:
<feed>
<title type="text">neymar</title>
<subtitle type="text">Bing Image Search</subtitle>
<id>https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query='neymar'&$top=2</id>
<rights type="text"/>
<updated>2013-05-13T08:45:02Z</updated>
<link rel="next" href="https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query='neymar'&$skip=2&$top=2"/>
<entry>
<id>https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query='neymar'&$skip=0&$top=1</id>
<title type="text">ImageResult</title>
<updated>2013-05-13T08:45:02Z</updated>
<content type="application/xml">
<m:properties>
<d:ID m:type="Edm.Guid">99cb00e9-c9bb-45ca-9776-1f51e30be398</d:ID>
<d:Title m:type="Edm.String">neymaer wallpaper neymar brazil wonder kid neymar wallpaper hd</d:Title>
<d:MediaUrl m:type="Edm.String">http://3.bp.blogspot.com/-uzJS8HW4j24/Tz3g6bNII_I/AAAAAAAAB1o/ExYxctnybUo/s1600/neymar-wallpaper-5.jpg</d:MediaUrl>
<d:SourceUrl m:type="Edm.String">http://insidefootballworld.blogspot.com/2012/02/neymar-wallpapers.html</d:SourceUrl>
<d:DisplayUrl m:type="Edm.String">insidefootballworld.blogspot.com/2012/02/neymar-wallpapers.html</d:DisplayUrl>
<d:Width m:type="Edm.Int32">1280</d:Width>
<d:Height m:type="Edm.Int32">800</d:Height>
<d:FileSize m:type="Edm.Int64">354173</d:FileSize>
<d:ContentType m:type="Edm.String">image/jpeg</d:ContentType>
<d:Thumbnail m:type="Bing.Thumbnail">
<d:MediaUrl m:type="Edm.String">http://ts3.mm.bing.net/th?id=H.5042206689331494&pid=15.1</d:MediaUrl>
<d:ContentType m:type="Edm.String">image/jpg</d:ContentType>
<d:Width m:type="Edm.Int32">300</d:Width>
<d:Height m:type="Edm.Int32">187</d:Height>
<d:FileSize m:type="Edm.Int64">12990</d:FileSize>
</d:Thumbnail>
</m:properties>
</content>
</entry>
<entry>
<id>https://api.datamarket.azure.com/Data.ashx/Bing/Search/Image?Query='neymar'&$skip=1&$top=1</id>
<title type="text">ImageResult</title>
<updated>2013-05-13T08:45:02Z</updated>
<content type="application/xml">
<m:properties>
<d:ID m:type="Edm.Guid">9a6b7476-643e-4844-a8da-a4b640a78339</d:ID>
<d:Title m:type="Edm.String">neymar jr 485x272 Neymar Show 2012 Hd</d:Title>
<d:MediaUrl m:type="Edm.String">http://www.sontransferler.com/wp-content/uploads/2012/07/neymar_jr.jpg</d:MediaUrl>
<d:SourceUrl m:type="Edm.String">http://www.sontransferler.com/neymar-show-2012-hd</d:SourceUrl>
<d:DisplayUrl m:type="Edm.String">www.sontransferler.com/neymar-show-2012-hd</d:DisplayUrl>
<d:Width m:type="Edm.Int32">1366</d:Width>
<d:Height m:type="Edm.Int32">768</d:Height>
<d:FileSize m:type="Edm.Int64">59707</d:FileSize>
<d:ContentType m:type="Edm.String">image/jpeg</d:ContentType>
<d:Thumbnail m:type="Bing.Thumbnail">
<d:MediaUrl m:type="Edm.String">http://ts1.mm.bing.net/th?id=H.4796985557255960&pid=15.1</d:MediaUrl>
<d:ContentType m:type="Edm.String">image/jpg</d:ContentType>
<d:Width m:type="Edm.Int32">300</d:Width>
<d:Height m:type="Edm.Int32">168</d:Height>
<d:FileSize m:type="Edm.Int64">4718</d:FileSize>
</d:Thumbnail>
</m:properties>
</content>
</entry>
</feed>
and here is the code:
var response = UrlFetchApp.fetch('https://api.datamarket.azure.com/Bing/Search/Image?Query=%27neymar%27&$top=2',options)
var resp = response.getContentText();
var ggg = Xml.parse(resp,false).getElement().getElement('entry').getElement('content').getElement('m:properties');
Logger.log(ggg);
How do I get element <d:MediaUrl m:type="Edm.String">?
update: but still not work
var response = UrlFetchApp.fetch('https://api.datamarket.azure.com/Bing/Search/Image?Query=%27neymar%27&$top=2',options)
var text = response.getContentText();
var eleCont = Xml.parse(text,true).getElement().getElement('entry').getElement('content');
var eleProp = eleCont.getElement('hxxp://schemas.microsoft.com/ado/2007/08/dataservices/metadata','properties')
var medUrl= eleProp.getElement('hxxp://schemas.microsoft.com/ado/2007/08/dataservices','MediaUrl').getText()
Logger.log(medUrl)
While the provider is using multiple namespaces (signified by m: and d: in front of element names), you can ignore them for retrieving the data you're interested in.
Once you've called getElement() to get the root of the XML doc, you can navigate through the rest using attribute names. (Stop after var feed = ... in the debugger, and explore feed, you'll find you have the entire XML document there
Try this:
var text = Xml.parse(resp,true);
var feed = text.getElement();
var urls = [];
for (var i in feed.entry) {
urls.push(feed.entry[0].content.properties.MediaUrl.Text);
}
Logger.log(urls);
This also works. Note that you have multiple entries in your response, and this example is going after the second of them:
var ggg = Xml.parse(resp,true)
.getElement()
.getElements('entry')[1]
.getElement('content')
.getElement('properties')
.getElement('MediaUrl')
.getText();
References
Namespaces in XML 1.0
XmlElement methods referencing namespace, such as getElement(namespaceName, localName)
Other relevant StackOverflow questions. xml element name with colon, lots about XML namespaces

Resources