I'm running an each loop to iterate through an XML file in Ruby, assigning values to a hash. There are 3 items in the XML file, but for some reason it only iterates through the first one. Any idea why?
require "nokogiri"
f = File.open("untitled.xml")
doc = Nokogiri::XML(f)
f.close
doc.xpath('//item').each do |node|
  children = node.children
  item = {
    "name" => node['name'],
    "buyItNowPrice" => children.css('buytItNowPrice').inner_text,
    "description" => children.css('description').inner_text,
    "startingBidPrice" => children.css('startingBidPrice').inner_text,
    "closing_time" => children.css('closing_time').inner_text,
    "closing_date" => children.css('closing_date').inner_text
  }
  puts item
end
XML:
<item name = "Test Thing">
<description>Something Coolest.</description>
<buytItNowPrice>154.99</buytItNowPrice>
<startingBidPrice>9999.99</startingBidPrice>
<closing_date>2014-12-25</closing_date>
<closing_time>12:32:PM</closing_time>
</item>
<item name = "Lazer">
<description>Something Cool.</description>
<buytItNowPrice>149.99</buytItNowPrice>
<startingBidPrice>9.99</startingBidPrice>
<closing_date>2014-12-25</closing_date>
<closing_time>12:32:PM</closing_time>
</item>
<item name = "Pokemon">
<description>Something even cooler.</description>
<buytItNowPrice>33.99</buytItNowPrice>
<startingBidPrice>9.99</startingBidPrice>
<closing_date>2014-12-25</closing_date>
<closing_time>12:32:PM</closing_time>
</item>
The output is only the first item.
The given sample XML isn't valid.
A valid XML document requires a single root node; right now you have 3.
You could fix this by wrapping all the <item> nodes in an <items> root node and then iterating through its children.
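A minimal sketch of the wrapping idea, assuming the file content is available as a string. It uses REXML (which ships with Ruby) rather than Nokogiri so that it runs without extra gems; the `fragment` variable and the trimmed item content are illustrative only.

```ruby
require 'rexml/document'

# Fragment with several top-level <item> elements, as in the question (trimmed).
fragment = <<~XML
  <item name="Test Thing">
    <buytItNowPrice>154.99</buytItNowPrice>
  </item>
  <item name="Lazer">
    <buytItNowPrice>149.99</buytItNowPrice>
  </item>
  <item name="Pokemon">
    <buytItNowPrice>33.99</buytItNowPrice>
  </item>
XML

# Wrap the fragment in a single root element so the document is well-formed.
doc = REXML::Document.new("<items>#{fragment}</items>")

# Collect one hash per <item>, mirroring the hash built in the question.
items = REXML::XPath.match(doc, '//item').map do |node|
  {
    "name" => node.attributes['name'],
    "buyItNowPrice" => node.elements['buytItNowPrice'].text
  }
end

items.each { |item| puts item }
```

With Nokogiri the same wrapping applies: once the document has a single root, doc.xpath('//item') should yield all three nodes.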
All the examples I have seen online use XML files structured like this:
<open_tag>data that I want</close_tag>
but my XML file is different:
<Report xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="_x0034_00_x0020_-_x0020_Nomenklatury" xsi:schemaLocation="_x0034_00_x0020_-_x0020_Nomenklatury http://pcisrs/ReportServer?%2FTARIC%20Reporty%20Ciselnikov%2F400%20-%20Nomenklatury&rs%3AFormat=XML&rc%3ASchema=True" Name="400 - Nomenklatury">
<table1>
<Detail_Collection>
<Detail goods_nomenclature_item_id="0100000000" product_line="80" date_start="31.12.1971" quantity_indents="0" declarable_import="0" declarable_export="0" goods_nomenclature_item_description="ŽIVÉ ZVIERATÁ"/>
<Detail goods_nomenclature_item_id="0101000000" product_line="80" date_start="01.01.1972" quantity_indents="1" statistical_unit="NAR" declarable_import="0" declarable_export="0" goods_nomenclature_item_description="Živé kone, somáre, muly a mulice" parent_goods_nomenclature_item_id="0100000000" parent_product_line="80"/>
.....ETC....
</Detail_Collection>
</table1>
</Report>
If I understand the tutorials, this should work:
subor = Nokogiri::XML(File.open('vendor/financnasprava/nomenklatury/recent.xml'))
dataset = subor.xpath('//Detail')
but it didn't.
You can work with this data as in the example below. I removed the source path because I don't have this data locally.
If I'm right that you are trying to access the Detail attributes:
require 'nokogiri'
require 'open-uri'
data_xml = <<-EOT
<Report xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="400 - Nomenklatury">
<table1>
<Detail_Collection>
<Detail goods_nomenclature_item_id="0100000000" product_line="80" date_start="31.12.1971" quantity_indents="0" declarable_import="0" declarable_export="0" goods_nomenclature_item_description="ŽIVÉ ZVIERATÁ"/>
<Detail goods_nomenclature_item_id="0101000000" product_line="80" date_start="01.01.1972" quantity_indents="1" statistical_unit="NAR" declarable_import="0" declarable_export="0" goods_nomenclature_item_description="Živé kone, somáre, muly a mulice" parent_goods_nomenclature_item_id="0100000000" parent_product_line="80"/>
</Detail_Collection>
</table1>
</Report>
EOT
subor = Nokogiri::XML(data_xml)
dataset = subor.xpath('//Detail_Collection/*')
details = dataset.map do |row|
  {
    product_line: row.attributes['product_line'].value,
    goods_nomenclature_item_id: row.attributes['goods_nomenclature_item_id'].value
  }
end
puts details
#=> {:product_line=>"80", :goods_nomenclature_item_id=>"0100000000"}
#=> {:product_line=>"80", :goods_nomenclature_item_id=>"0101000000"}
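Note that the sample above drops the default xmlns declaration from the question's <Report> element. That declaration is most likely why //Detail returned nothing: an unprefixed name in a namespace-strict XPath query does not match elements that live in a default namespace. In Nokogiri the usual routes are subor.xpath('//xmlns:Detail') or calling subor.remove_namespaces! first. The sketch below shows the same idea of binding a prefix to the namespace URI, using REXML (bundled with Ruby) so it runs without gems; the prefix `n` is an arbitrary choice and the attributes are trimmed.

```ruby
require 'rexml/document'

xml = <<~XML
  <Report xmlns="_x0034_00_x0020_-_x0020_Nomenklatury" Name="400 - Nomenklatury">
    <table1>
      <Detail_Collection>
        <Detail goods_nomenclature_item_id="0100000000" product_line="80"/>
        <Detail goods_nomenclature_item_id="0101000000" product_line="80"/>
      </Detail_Collection>
    </table1>
  </Report>
XML

doc = REXML::Document.new(xml)

# "n" is an arbitrary prefix; it only has to map to the document's default namespace URI.
namespaces = { 'n' => '_x0034_00_x0020_-_x0020_Nomenklatury' }

# Query with the prefix bound to the namespace, then read the attributes.
details = REXML::XPath.match(doc, '//n:Detail', namespaces).map do |row|
  {
    product_line: row.attributes['product_line'],
    goods_nomenclature_item_id: row.attributes['goods_nomenclature_item_id']
  }
end

details.each { |detail| puts detail }
```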
I am using the Hpricot gem to parse XML. I am able to get the title and pubdate, but it does not work for the link. Here is the code snippet:
items = doc.search("//item").first(6)
items.each do |item|
  feed = {}
  feed[:title] = item.search("//title").text
  feed[:link] = item.search("//link").text
  feed[:published_date] = item.search("//pubdate").text
  feeds << feed
end
The resultant hpricot elements are as follows:
#<Hpricot::Elements[{elem <item> "\n\t\t" {elem <title> "openagent.com.au" </title>} "\n\t\t" {emptyelem <link>} "http://blog.iproperty.com.au/2016/03/22/openagent-com-au/" {bogusetag </link>} "\n\t\t" {elem <comments> "http://blog.iproperty.com.au/2016/03/22/openagent-com-au/#comments" </comments>} "\n\t\t" {elem <pubdate> "Mon, 21 Mar 2016 22:43:28 +0000" </pubDate>} "\n\t\t"
I have pasted only the initial part, as it is the only part that matters. Can anyone tell me what the solution is?
items = doc.search("//item").first(6)
items.each do |item|
  feed = {}
  feed[:title] = item.search("//title").text
  feed[:link] = item.search("//link").innerHTML
  feed[:published_date] = item.search("//pubdate").text
  feeds << feed
end
To get the link, we can use innerHTML.
I'm trying to parse some text and perform actions based on its tags.
The text is:
<window>
<caption>My window
</window>
<panel>
<label>
<caption>
<position>50,50
<color>255,255,255
</label>
</panel>
Code:
function parse_tag(chunck)
  for start_tag,tag_name in string.gfind(chunck,"(<(.-)>)") do
    if (child_obj[tag_name]) then
      print(start_tag)
      for data,end_tag in string.gfind(chunck,"<" .. tag_name ..">(.-)(</" .. tag_name ..">)") do
        for object_prop,value in string.gfind(data,"<(.-)>(.-)") do
          print("setting property = \"" .. object_prop .. "\", value of" .. value);
        end
      end
      print("</" .. tag_name ..">");
    elseif(findInArray(main_obj,tag_name)) then
      print("Invalid data");
      stop();
    end
  end
end
for key,tag in ipairs(main_obj) do
  for start_tag,tag_name,chunck,end_tag in string.gfind(data,"(<(" .. tag.name .. ")>)(.-)(</" .. tag.name .. ">)") do --> searching for window/panel start and end tags
    if (findInArray(main_obj,tag_name)) then
      print(start_tag)
      parse_tag(chunck); --> parses the tag with child tag
      print(end_tag)
    end
  end
end
It seems to fail getting the value, as I get the following output:
<window>
</window>
<panel>
<label>
setting property = "caption", value of
setting property = "position", value of
setting property = "color", value of
</label>
</panel>
How can I match the string after the first <%tag%> up to the next <%tag%> or the end of the chunk?
string.gfind(data,"<(.-)>(.-)")
Here, you try to match the value with .-. However, - is lazy, i.e. .- matches as little as possible. Since nothing follows it in the pattern, the shortest possible match is the empty string.
Try telling it to match until the next <:
string.gfind(data,"<(.-)>(.-)<")
I tried different types of captures.
This
string.gfind(data,"<(.-)>([^%<+.-%>+]+)")
seems to work.
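The lazy-quantifier behaviour described above is not specific to Lua. As an analogy only, here is a small sketch in Ruby, where `.*?` plays the role of Lua's `.-`. It shows why a trailing lazy capture is always empty, and how capturing "everything that is not <" also handles the last property, which has no following tag. The sample string is a shortened version of the question's data; the corresponding Lua pattern would presumably be "<(.-)>([^<]*)", which has not been tested against the original code.

```ruby
data = "<caption>My window\n<position>50,50\n<color>255,255,255\n"

# A lazy quantifier at the very end of the pattern has no reason to grow,
# so the second capture is always the empty string.
lazy = data.scan(/<(.*?)>(.*?)/)

# Capturing every character up to the next "<" (or the end of the chunk)
# returns the actual values; strip removes the trailing newline.
values = data.scan(/<(.*?)>([^<]*)/).map { |name, value| [name, value.strip] }

values.each { |name, value| puts "#{name} = #{value}" }
```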
When I convert an XML structure to a hash with Hash.from_xml(@xml) in Rails, the parser does not distinguish between empty arrays and nil values. In my XML, self-closed nodes such as <audio_languages/> are meant to be empty arrays, whereas nodes with the attribute nil="true" are meant to be nil values.
The XML structure (whose generation I control) looks like this:
<response>
<medias>
<media>
<id>1</id>
<name>Media-1</name>
<audio_languages/>
<avg_rating nil="true"></avg_rating>
</media>
<media>
<id>2</id>
<name>Media-2</name>
<audio_languages/>
<avg_rating nil="true"></avg_rating>
</media>
</medias>
</response>
The expected output from Hash.from_xml(@xml) would be:
{"response"=>{"medias"=>{"media"=>[{"id"=>"1", "name"=>"Media-1", "audio_languages"=>[], "avg_rating"=>nil}, {"id"=>"2", "name"=>"Media-2", "audio_languages"=>[], "avg_rating"=>nil}]}}}
Instead, I get nil values for both audio_languages and avg_rating:
{"response"=>{"medias"=>{"media"=>[{"id"=>"1", "name"=>"Media-1", "audio_languages"=>nil, "avg_rating"=>nil}, {"id"=>"2", "name"=>"Media-2", "audio_languages"=>nil, "avg_rating"=>nil}]}}}
I ended up parsing the nodes with libxml and checking whether each node has the signature I am looking for, in order to decide whether to convert it to an empty array or a nil value.
# Usage: Hash.from_xml_with_libxml(xml)
require 'xml/libxml'
# adapted from
# http://movesonrails.com/articles/2008/02/25/libxml-for-active-resource-2-0
class Hash
  class << self
    def from_xml_with_libxml(xml, strict=true)
      LibXML::XML.default_load_external_dtd = false
      LibXML::XML.default_pedantic_parser = strict
      result = LibXML::XML::Parser.string(xml).parse
      return { result.root.name.to_s => xml_node_to_hash_with_libxml(result.root)}
    end
    def xml_node_to_hash_with_libxml(node)
      # If we are at the root of the document, start the hash
      if node.element?
        if node.children?
          result_hash = {}
          node.each_child do |child|
            result = xml_node_to_hash_with_libxml(child)
            if child.name == "text"
              if !child.next? and !child.prev?
                return result
              end
            elsif result_hash[child.name]
              if result_hash[child.name].is_a?(Object::Array)
                result_hash[child.name] << result
              else
                result_hash[child.name] = [result_hash[child.name]] << result
              end
            else
              result_hash[child.name] = result
            end
          end
          return result_hash
        else
          # Self-closed nodes like <audio_languages/> are empty arrays,
          # and nodes like <avg_rating nil="true"/> are nil values.
          if node.to_s.match(/^\<(.+)\/\>$/) && nil == node.attributes["nil"]
            return []
          end
          return nil
        end
      else
        return node.content.to_s
      end
    end
  end
end
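For comparison, here is a minimal sketch of the same rule written against REXML (bundled with Ruby) instead of libxml, so it runs without extra gems. The helper name `element_to_value` is hypothetical, and the rule is slightly looser than the code above: any element with no child elements and no text becomes an empty array, whether or not it was written in self-closed form.

```ruby
require 'rexml/document'

# Hypothetical helper: turns an element into nil, [], a String, or a Hash.
def element_to_value(element)
  # An explicit nil="true" attribute wins over everything else.
  return nil if element.attributes['nil'] == 'true'

  children = element.elements.to_a
  if children.empty?
    text = element.text.to_s.strip
    # No children and no text: treat the node as an empty array.
    return text.empty? ? [] : text
  end

  # Group repeated child names so that siblings like <media> become an array.
  children.group_by(&:name).each_with_object({}) do |(name, nodes), hash|
    values = nodes.map { |node| element_to_value(node) }
    hash[name] = values.size == 1 ? values.first : values
  end
end

xml = <<~XML
  <response>
    <medias>
      <media>
        <id>1</id>
        <name>Media-1</name>
        <audio_languages/>
        <avg_rating nil="true"></avg_rating>
      </media>
      <media>
        <id>2</id>
        <name>Media-2</name>
        <audio_languages/>
        <avg_rating nil="true"></avg_rating>
      </media>
    </medias>
  </response>
XML

root = REXML::Document.new(xml).root
result = { root.name => element_to_value(root) }
puts result
```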
I have this code:
Model:
class WeatherLookup
  attr_accessor :temperature, :icon, :condition, :zip, :fcttext
  def fetch_weather(city)
    HTTParty.get("http://api.wunderground.com/api/api_key/forecast/lang:NL/q/IT/#{city.slug}.xml")
  end
  def initialize
    weather_hash = fetch_weather
  end
  def assign_values(weather_hash)
    hourly_forecast_response = weather_hash.parsed_response['response']['forecast']['txt_forecast']['forecastdays']['forecastday'].first
    self.fcttext = hourly_forecast_response['fcttext']
    self.icon = hourly_forecast_response['icon_url']
  end
  def initialize(city)
    @city = city
    weather_hash = fetch_weather(city)
    assign_values(weather_hash)
  end
end
city_controller:
@weather_lookup = WeatherLookup.new(@city)
city_view:
= @weather_lookup.fcttext
= image_tag @weather_lookup.icon
This works fine: I get the first dataset of the forecastdays container. The XML from the API looks like this:
<response>
<version>0.1</version>
<termsofService>
http://www.wunderground.com/weather/api/d/terms.html
</termsofService>
<features>
<feature>forecast</feature>
</features>
<forecast>
<txt_forecast>
<date>2:00 AM CEST</date>
<forecastdays>
<forecastday>
<period>0</period>
<icon>clear</icon>
<icon_url>http://icons-ak.wxug.com/i/c/k/clear.gif</icon_url>
<title>zondag</title>
<fcttext>
<![CDATA[ Helder. Hoog: 86F. Light Wind. ]]>
</fcttext>
<fcttext_metric>
<![CDATA[ Helder. Hoog: 30C. Light Wind. ]]>
</fcttext_metric>
<pop>0</pop>
</forecastday>
<forecastday>
<period>1</period>
<icon>clear</icon>
<icon_url>http://icons-ak.wxug.com/i/c/k/clear.gif</icon_url>
<title>zondagnacht</title>
<fcttext>
<![CDATA[ Helder. Laag: 61F. Light Wind. ]]>
</fcttext>
<fcttext_metric>
<![CDATA[ Helder. Laag: 16C. Light Wind. ]]>
</fcttext_metric>
<pop>0</pop>
</forecastday>
<forecastday>
<period>2</period>
<icon>partlycloudy</icon>
<icon_url>http://icons-ak.wxug.com/i/c/k/partlycloudy.gif</icon_url>
<title>maandag</title>
<fcttext>
<![CDATA[ Gedeeltelijk bewolkt. Hoog: 84F. Light Wind. ]]>
</fcttext>
<fcttext_metric>
<![CDATA[ Gedeeltelijk bewolkt. Hoog: 29C. Light Wind. ]]>
</fcttext_metric>
<pop>20</pop>
</forecastday>
<forecastday>
<period>3</period>
<icon>clear</icon>
<icon_url>http://icons-ak.wxug.com/i/c/k/clear.gif</icon_url>
<title>maandagnacht</title>
<fcttext>
<![CDATA[ Gedeeltelijk bewolkt. Laag: 63F. Light Wind. ]]>
</fcttext>
<fcttext_metric>
<![CDATA[ Gedeeltelijk bewolkt. Laag: 17C. Light Wind. ]]>
</fcttext_metric>
<pop>0</pop>
</forecastday>
I want to access all the forecasts in the forecastdays container with a loop, but when I change the .first on the hourly_forecast_response variable to .all, or remove it, I get the error message "can't convert String into Integer".
Does anyone have ideas on how to fix this?
Once you remove .first from hourly_forecast, you are now returning a collection as opposed to a single item. That means your view won't work correctly so you'll need to update your view to render a collection. This is best done with a partial.
city_view will become:
= render partial: "weather_lookup", collection: @weather_lookup
Then create a partial named _weather_lookup.html.haml containing the following:
= weather_lookup.fcttext
= image_tag weather_lookup.icon
Inside the partial, each item of the collection is available as the local variable weather_lookup, named after the partial. The underscore at the start of the partial's filename is a Rails convention, so don't leave it out. The partial is rendered once for every item in the collection.
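On the model side, the error in the question comes from indexing the forecastday Array with a String key once .first is removed. A minimal sketch of collecting every forecast instead, assuming parsed_response has the nested shape used in assign_values; the literal hash below is a stand-in trimmed to two days, and the symbol keys of the resulting hashes are illustrative.

```ruby
# Stand-in for weather_hash.parsed_response, trimmed to two days (assumed shape).
parsed_response = {
  'response' => {
    'forecast' => {
      'txt_forecast' => {
        'forecastdays' => {
          'forecastday' => [
            { 'title' => 'zondag',
              'fcttext' => ' Helder. Hoog: 86F. Light Wind. ',
              'icon_url' => 'http://icons-ak.wxug.com/i/c/k/clear.gif' },
            { 'title' => 'zondagnacht',
              'fcttext' => ' Helder. Laag: 61F. Light Wind. ',
              'icon_url' => 'http://icons-ak.wxug.com/i/c/k/clear.gif' }
          ]
        }
      }
    }
  }
}

days = parsed_response.dig('response', 'forecast', 'txt_forecast', 'forecastdays', 'forecastday')

# Indexing the Array itself with a String key is what raises the TypeError.
error_message =
  begin
    days['fcttext']
  rescue TypeError => e
    e.message
  end

# Map every day to a small hash instead of reading only the first one.
forecasts = days.map do |day|
  { title: day['title'], fcttext: day['fcttext'].strip, icon: day['icon_url'] }
end

forecasts.each { |forecast| puts "#{forecast[:title]}: #{forecast[:fcttext]}" }
```

A collection like forecasts, exposed by the model, is the kind of value the collection: option of render expects.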