How to access nested XML with Nokogiri [closed]


I am using Nokogiri to parse XML. I was told to use a CSS selector to search through the XML, but I can't chain it to get through the nested objects.
How do I access the inner elements?
2.6.3 :039 > pp a.css("interface").to_s
"<interface>\n" +
" <status>\n" +
" <__XML__OPT_Cmd_show_interface_status_down>\n" +
" <__XML__OPT_Cmd_show_interface_status___readonly__>\n" +
" <__readonly__>\n" +
" <TABLE_interface>\n" +
" <ROW_interface>\n" +
" <interface>mgmt0</interface>\n" +
" <state>connected</state>\n" +
" <vlan>routed</vlan>\n" +
" <duplex>full</duplex>\n" +
" <speed>a-1000</speed>\n" +
" <type>--</type>\n" +
" </ROW_interface>\n" +
" <ROW_interface>\n" +
" <interface>Vlan1</interface>\n" +
" <state>down</state>\n" +
" <vlan>routed</vlan>\n" +
" <duplex>auto</duplex>\n" +
" <speed>auto</speed>\n" +
" </ROW_interface>\n" +
" <ROW_interface>\n" +
" <interface>Vlan6</interface>\n" +
" <state>down</state>\n" +
" <vlan>routed</vlan>\n" +
" <duplex>auto</duplex>\n" +
" <speed>auto</speed>\n" +
" </ROW_interface>\n" +
" <ROW_interface>\n" +
" <interface>Vlan486</interface>\n" +
" <state>down</state>\n" +
" <vlan>routed</vlan>\n" +
" <duplex>auto</duplex>\n" +
" <speed>auto</speed>\n" +
" </ROW_interface>\n" +
" </TABLE_interface>\n" +
" </__readonly__>\n" +
" </__XML__OPT_Cmd_show_interface_status___readonly__>\n" +
" </__XML__OPT_Cmd_show_interface_status_down>\n" +
" </status>\n" +
" </interface><interface>mgmt0</interface><interface>Vlan1</interface><interface>Vlan6</interface><interface>Vlan486</interface>"
I end up with this tree. What is my XPath here? This is only part of the parsed XML:
2.6.3 :043 > pp parsed
#(DocumentFragment:0x3fce080cd300 {
name = "#document-fragment",
children = [
#(ProcessingInstruction:0x3fce080cce14 { name = "xml" }),
#(Text "\n"),
#(Element:0x3fce080cc7d4 {
name = "rpc-reply",
namespace = #(Namespace:0x3fce080cffb0 {
prefix = "nf",
href = "urn:ietf:params:xml:ns:netconf:base:1.0"
}),
children = [
#(Text "\n" + " "),
#(Element:0x3fce080cf22c {
name = "data",
namespace = #(Namespace:0x3fce080cffb0 {
prefix = "nf",
href = "urn:ietf:params:xml:ns:netconf:base:1.0"
}),
children = [
#(Text "\n" + " "),
#(Element:0x1903f98 {
name = "show",
namespace = #(Namespace:0x1903f20 {
href = "http://www.cisco.com/nxos:1.0:if_manager"
}),
children = [
#(Text "\n" + " "),
#(Element:0x1903700 {
name = "interface",
namespace = #(Namespace:0x1903f20 {
href = "http://www.cisco.com/nxos:1.0:if_manager"
}),
children = [
#(Text "\n" + " "),
#(Element:0x19030fc {
name = "status",
namespace = #(Namespace:0x1903f20 {
href = "http://www.cisco.com/nxos:1.0:if_manager"
}),
children = [
#(Text "\n" + " "),
#(Element:0x1902a1c {
name = "__XML__OPT_Cmd_show_interface_status_down",
namespace = #(Namespace:0x1903f20 {
href = "http://www.cisco.com/nxos:1.0:if_manager"
}),

Your question is really generic and poorly asked so answering a specific question is not possible, but it looks like you need to understand how to access tags in a document using a CSS accessor, which Nokogiri makes very easy.
Meditate on this:
require 'nokogiri'
foo =<<EOT
<tag1>
<tag2>some text</tag2>
<tag3>some more text</tag3>
<tags>something</tags>
<tags>or</tags>
<tags>other</tags>
</tag1>
EOT
xml = Nokogiri::XML.parse(foo)
at finds the first matching occurrence in the document:
xml.at('tag2').content # => "some text"
at is pretty smart, in that it tries to determine whether the accessor is CSS or XPath, so it's a good first tool when you want the first match. If that doesn't work, try at_css, which states explicitly that the accessor is CSS. This matters because some accessors are valid as both CSS and XPath but return different results:
xml.at_css('tag3').content # => "some more text"
xml.at_css('tag3').text # => "some more text"
Similar to at is search, which also tries to determine whether it's CSS or XPath, but finds all matching nodes throughout the document rather than just the first matching one. Because it returns all matching nodes, it returns a NodeSet, unlike at which returns a Node, so you have to be aware that NodeSets behave differently than Nodes when accessing their content or text:
xml.search('tags').text # => "somethingorother"
That's almost never what you want, but you'd be surprised how many people then ask how to split that resulting string into the desired three words. It's usually impossible to do accurately, so a different tactic is needed:
xml.search('tags').map { |t| t.content } # => ["something", "or", "other"]
xml.search('tags').map { |t| t.text } # => ["something", "or", "other"]
xml.search('tags').map(&:text) # => ["something", "or", "other"]
Both at and search have ..._css and ..._xpath variations to help you fine-tune your code's behavior, but I always recommend starting with the generic at and search until you're forced to define what the accessor is.
I also recommend starting with CSS accessors over XPath because they tend to be more readable, and more easily learned if you're working inside HTML with CSS. XPath is very powerful, probably still more so than CSS, but learning it takes longer and often results in less readable code, which affects maintainability.
This is all in the tutorials, cheat sheets and documentation. Nokogiri is extremely powerful, but learning it takes time spent reading and trying things. You can also search on SO for other things I've written about searching XML and HTML documents; in particular, "What are some examples of using Nokogiri?" helps give an idea of how to scrape a page. There's a lot of information covering many different topics related to this. I find it an interesting exercise to parse documents like this, as it was part of my professional life for years.

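You could use XPath:
parsed = Nokogiri::XML::DocumentFragment.parse(xml)
# The XPath expression must be passed as a string
siamese_cat = parsed.xpath('.//interface/status/state')
Or just iterate through the XML:
parsed = Nokogiri::XML::DocumentFragment.parse(xml)
# traverse visits every node in the fragment; Node#each only walks a node's attributes
parsed.traverse do |element|
# Some instructions
end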

Related

Cypher: How to optimize a search in a browser (maybe using parameters)

I need to run the following query (Neo4j Community Edition 3.0.12 on Docker).
The caveat is that the calendar name can be in either of two formats:
1) firstname + " " + lastname + "-" + specialization
2) lastname + " " + firstname + "-" + specialization
:PARAM name: "Di Pietro Chiara - Gynecologist"
MERGE (_200:`person` {`lastname`: "Di Pietro", `firstname`: "Chiara", `birthdate`: "1984/03/25"})
MERGE (_cal_445:`calendar` { :`X-VR-CALNAME` = $name })-[:`belongs_to a`]-(_per_445:`person`)
WHERE $name = _per_445.firstname + " " + _per_445.lastname
OR $name = (_per_445.nome + " " + _per_445.cognome)
RETURN _cal_445, _per_445
The query, and several variants of it, does not run. Sometimes it returns an error, and sometimes it breaks the browser layout on screen.
Something is surely wrong, but I was unable to find and correct it.
How could the comparison against the two inverted formats be optimized?
Why does the PARAM declaration generate an error?
Any help will be greatly appreciated.

This part of your query is not valid:
MERGE (_cal_445:`calendar` { :`X-VR-CALNAME` = $name })
You should replace it with this:
MERGE (_cal_445:`calendar` { `:X-VR-CALNAME`:$name })
Moreover, you are doing a MERGE with the value $name that is also used in the WHERE clause. That is not allowed.
If you replace the MERGE with a MATCH, your query will work:
MERGE (_200:`person` {`lastname`: "Di Pietro", `firstname`: "Chiara", `birthdate`: "1984/03/25"})
WITH _200
MATCH (_cal_445:`calendar` { `:X-VR-CALNAME`: $name })-[:`belongs_to a`]-(_per_445:`person`)
WHERE $name = _per_445.firstname + " " + _per_445.lastname
OR $name = (_per_445.nome + " " + _per_445.cognome)
RETURN _cal_445, _per_445

Checking variables exist before building an array

I am generating a string from a number of components (title, authors, journal, year, journal volume, journal pages). The idea is that the string will be a citation as so:
@citation = article_title + " " + authors + ". " + journal + " " + year + ";" + journal_volume + ":" + journal_pages
I am guessing that some components occasionally do not exist. I am getting this error:
no implicit conversion of nil into String
Is this indicating that it is trying to build the string and one of the components is nil? If so, is there a neat way to build a string from an array while checking that each element exists to circumvent this issue?

It's easier to use interpolation:
@citation = "#{article_title} #{authors}. #{journal} #{year}; #{journal_volume}:#{journal_pages}"
nil values are interpolated as empty strings.

array = [
article_title, authors, journal,
year, journal_volume, journal_pages
]
@citation = "%s %s. %s %s; %s:%s" % array
Use the String#% format string method; %s renders nil as an empty string.
Demo
>> "%s: %s" % [ 'fo', nil ]
=> "fo: "

Considering that you presumably are doing this for more than one article, you might consider doing it like so:
SEPARATORS = [" ", ". ", " ", ";", ":", ""]
article = ["Ruby for Fun and Profit", "Matz", "Cool Tools for Coders",
2004, 417, nil]
article.map(&:to_s).zip(SEPARATORS).map(&:join).join
# => "Ruby for Fun and Profit Matz. Cool Tools for Coders 2004;417:"
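The approaches above leave a dangling separator when a component is missing (note the trailing ":" in the last result). If that matters, one possible variation is to group the components and drop empty groups before joining. This is only a sketch: build_citation and its keyword names are hypothetical, not taken from the question's code.

```ruby
# Hypothetical helper: builds "Title Authors. Journal Year;Volume:Pages"
# and skips separators whose components are missing (nil).
def build_citation(title: nil, authors: nil, journal: nil, year: nil,
                   volume: nil, pages: nil)
  lead    = [title, authors].compact.join(" ")  # "Title Authors"
  source  = [journal, year].compact.join(" ")   # "Journal Year"
  locator = [volume, pages].compact.join(":")   # "Volume:Pages"

  # Join the journal part and the locator only when both are present.
  tail = [source, locator].reject(&:empty?).join(";")

  [lead, tail].reject(&:empty?).join(". ")
end

citation = build_citation(
  title: "Ruby for Fun and Profit", authors: "Matz",
  journal: "Cool Tools for Coders", year: 2004, volume: 417, pages: nil
)
# => "Ruby for Fun and Profit Matz. Cool Tools for Coders 2004;417"
```

Array#join calls to_s on each element, so numeric values such as the year need no explicit conversion.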

Getting incremental changes from Neo4j DB

Is there any way in Neo4j 1.9, to get all the nodes/relationships that were modified(created/updated/deleted) within certain span of time - like we do in SOLR delta import?
One crude way I can think of is maintain a timestamp property for each node/relationship and index them to fetch those nodes/relationship.
START a=node:custom_index("timestamp:[{start_time} TO {end_time}]")
RETURN a;
But then the problem is that if I modify the node via Cypher, the index will not be updated.

There's no built-in functionality like that in Neo4j, unfortunately.
To address the issues one by one: maintaining a timestamp is not enough, because you have nowhere to put it in the case of deleted nodes/relationships. You can't put a timestamp on a property either, so you would know a node has been changed, but wouldn't know how.
One possible solution is to log the changes somewhere as they happen, using TransactionEventHandlers. Then you can a) choose exactly what to record, and b) stop worrying about Cypher: changes are logged no matter which method you used to update the database.
I've put together a small demo. It just logs every change to std out. It uses some GraphAware classes (disclaimer: I'm the author) for simplicity, but could be written without them, if you feel so inclined.
Here's the important part of the code, in case the link gets eventually broken or something:
@Test
public void demonstrateLoggingEveryChange() {
GraphDatabaseService database = new TestGraphDatabaseFactory().newImpermanentDatabase();
database.registerTransactionEventHandler(new ChangeLogger());
//perform mutations here
}
private class ChangeLogger extends TransactionEventHandler.Adapter<Void> {
@Override
public void afterCommit(TransactionData data, Void state) {
ImprovedTransactionData improvedData = new LazyTransactionData(data);
for (Node createdNode : improvedData.getAllCreatedNodes()) {
System.out.println("Created node " + createdNode.getId()
+ " with properties: " + new SerializablePropertiesImpl(createdNode).toString());
}
for (Node deletedNode : improvedData.getAllDeletedNodes()) {
System.out.println("Deleted node " + deletedNode.getId()
+ " with properties: " + new SerializablePropertiesImpl(deletedNode).toString());
}
for (Change<Node> changedNode : improvedData.getAllChangedNodes()) {
System.out.println("Changed node " + changedNode.getCurrent().getId()
+ " from properties: " + new SerializablePropertiesImpl(changedNode.getPrevious()).toString()
+ " to properties: " + new SerializablePropertiesImpl(changedNode.getCurrent()).toString());
}
for (Relationship createdRelationship : improvedData.getAllCreatedRelationships()) {
System.out.println("Created relationship " + createdRelationship.getId()
+ " between nodes " + createdRelationship.getStartNode().getId()
+ " and " + createdRelationship.getEndNode().getId()
+ " with properties: " + new SerializablePropertiesImpl(createdRelationship).toString());
}
for (Relationship deletedRelationship : improvedData.getAllDeletedRelationships()) {
System.out.println("Deleted relationship " + deletedRelationship.getId()
+ " between nodes " + deletedRelationship.getStartNode().getId()
+ " and " + deletedRelationship.getEndNode().getId()
+ " with properties: " + new SerializablePropertiesImpl(deletedRelationship).toString());
}
for (Change<Relationship> changedRelationship : improvedData.getAllChangedRelationships()) {
System.out.println("Changed relationship " + changedRelationship.getCurrent().getId()
+ " between nodes " + changedRelationship.getCurrent().getStartNode().getId()
+ " and " + changedRelationship.getCurrent().getEndNode().getId()
+ " from properties: " + new SerializablePropertiesImpl(changedRelationship.getPrevious()).toString()
+ " to properties: " + new SerializablePropertiesImpl(changedRelationship.getCurrent()).toString());
}
}
}

Grails How to make address show in two line?

I have a student form that includes a location. When I run the app and show the form, it looks like this:
Location : Jl Excel Road Ring No.36 SINGAPORE, 10110
But I want the location on two lines, like this:
Location : Jl Excel Road Ring No.36
SINGAPORE, 10110
Here's the GSP:
<td><g:message code="location.label"/></td>
<td>${studentInstance.location}</td>
And this is the service code in def show:
def loc = Location.findByTidAndDeleteFlag(params.tid, "N")
if(loc != null){
studentInstance.location = loc.address1 + " " + loc.city + ", " + loc.zipCode
}
else{
studentInstance.location = ""
}

Use the br tag:
studentInstance.location = loc.address1 + "<br/> " + loc.city + ", " + loc.zipCode
Then you can render the HTML directly, unescaped, like this:
<%=studentInstance.location%>
The default codec is probably HTML in your configuration.
Check the value of grails.views.default.codec
For more information read this:
http://grails.org/doc/2.2.1/ref/Plug-ins/codecs.html
I believe that starting from Grails 2.3.x the default views codec is HTML with XML escaping in order to prevent XSS attacks.

This is a bad approach, but you can try:
studentInstance.location = loc.address1 + "<br> " + loc.city + ", " + loc.zipCode
Generally, I would make each element of the address available in the view, so that the styling stays flexible in the view rather than in the controller. Something raw would look like:
<td><g:message code="location.label"/></td>
<td>${model.address1} <br> ${model.city}, ${model.zipCode}</td>

Add plus before first symbol in word in string and * after last

I have a string like this:
Тормозные диски
How can I transform it into
+Тормозн* +дис*
With help from SO I currently use gsub, but some people say it could be done via map. How?
Note: the main trouble is that I have Cyrillic symbols.
Current code:
art_group_search = art_group.gsub(/\b(\w+?)\w{0,2}\b/, '+\1*').mb_chars.upcase.to_s

"Тормозные диски".split.map {|word| "+" + word + "*"}.join(" ")
To break that snippet up:
"Your string".split
=> ["Your", "string"]
["Your", "string"].map {|word| "+" + word + "*"}
=> ["+Your*", "+string*"]
["+Your*", "+string*"].join(" ")
=> "+Your* +string*"
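The snippet above keeps each word whole, whereas the desired output in the question also drops the last two characters of every word. A minimal sketch of that variant follows; wildcard_terms and its trim parameter are hypothetical names, and the fixed two-character trim is an assumption read off the example in the question:

```ruby
# Hypothetical helper: prefix each word with "+", cut `trim` characters
# off its end, and append "*". Ruby strings are indexed by character,
# so Cyrillic text needs no special handling here.
def wildcard_terms(phrase, trim: 2)
  phrase.split.map do |word|
    # Leave very short words untouched instead of emptying them.
    stem = word.length > trim ? word[0...-trim] : word
    "+#{stem}*"
  end.join(" ")
end

result = wildcard_terms("Тормозные диски")
# => "+Тормозн* +дис*"
```

Splitting on whitespace sidesteps the regular expression entirely, which avoids relying on \w matching non-ASCII letters.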
