Understand xml parsing in Apple sample code - ios

I'm studying Apple's LazyTableImages sample code. I'd like to understand how the app is pulling data from the RSS feed included in the app:
http://phobos.apple.com/WebObjects/MZStoreServices.woa/ws/RSS/toppaidapplications/limit=75/xml
How are the contents at the above url parsed? Viewing the page source reveals HTML with no apparent xml section. While looking through the sample parsing code I found a few symbols like im:name. However these symbols are not in the contents of the above url.
I tried hosting the contents of the above URL locally (with limit=1). However, pointing the sample code to @"~/Desktop/a.xml" causes the application to throw an "unsupported URL" error.
More info: While reading http://en.wikipedia.org/wiki/Rss I came across what I expected to see at phobos link above. Something like this:
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>RSS Title</title>
<description>This is an example of an RSS feed</description>
<link>http://www.someexamplerssdomain.com/main.html</link>
<pubDate>Mon, 06 Sep 2009 16:45:00 +0000 </pubDate>
<item>
<title>Example entry</title>
<description>an interesting description</description>
<link>http://www.wikipedia.org/</link>
<guid>unique string per item</guid>
<pubDate>Mon, 06 Sep 2009 16:45:00 +0000 </pubDate>
</item>
</channel>
</rss>
Is there an equivalent "human-readable" xml document corresponding to the above phobos link somewhere?

You're right, the feed you're looking at technically isn't an RSS Feed. It's an Atom 1.0 Feed, but both are popular XML-based feed formats.
If you view the source of the feed you will see the XML elements you're looking for, like:
<entry>
<updated>2011-12-09T16:15:32-07:00</updated>
<id>http://itunes.apple.com/us/app/tetris/id479943969?mt=8&uo=2</id>
<title>TETRIS® - Electronic Arts</title>
<summary>Long summary here</summary>
<im:name>TETRIS®</im:name>
...
</entry>
Some browser versions parse RSS/Atom feeds into user-friendly HTML pages and present those instead of the actual feed. It sounds like that's the type of HTML page you're viewing.
On OS X, you could use a command such as curl to download the feed in a Terminal:
curl -o feed.xml http://phobos.apple.com/WebObjects/MZStoreServices.woa/ws/RSS/toppaidapplications/limit=75/xml
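Once the raw feed is on disk, names such as im:name are matched through their XML namespace. The following is a minimal Python sketch of that idea, not the parser the sample app actually uses. It works on a trimmed entry modeled on the one above; the namespace URI bound to the im prefix is an assumption, so check the xmlns:im declaration on the root element of feed.xml. SAMPLE_FEED and app_names are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the downloaded feed. The xmlns:im URI is an assumption;
# copy the real one from the <feed> element of feed.xml.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xmlns:im="http://itunes.apple.com/rss">
  <entry>
    <updated>2011-12-09T16:15:32-07:00</updated>
    <title>TETRIS - Electronic Arts</title>
    <summary>Long summary here</summary>
    <im:name>TETRIS</im:name>
  </entry>
</feed>"""

# Prefix-to-URI map used only by the lookups below.
NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "im": "http://itunes.apple.com/rss",
}


def app_names(feed_xml):
    """Return the text of every im:name element found inside an Atom entry."""
    root = ET.fromstring(feed_xml)
    return [
        name.text
        for entry in root.findall("atom:entry", NAMESPACES)
        for name in entry.findall("im:name", NAMESPACES)
    ]


print(app_names(SAMPLE_FEED))
```

The prefix itself (im) carries no meaning to the parser; only the URI it is bound to does, which is why the lookup passes a prefix-to-URI map.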

Related

What is the structure of wikipedia dumps?

I need the list of Hungarian words for a project and the only possible source I found is wikipedia XML dumps. They are really big, I guess I could parse them with a read stream and a SAX parser, but it would be nice to know more about the structure so I could test the code on a small example before running it on the big files. Is there a description somewhere about what structure they use and what the different XML gzip files contain? https://dumps.wikimedia.org/enwiki/latest/ https://dumps.wikimedia.org/huwiki/latest/
The format is documented here: https://www.mediawiki.org/wiki/Help:Export It looks like this:
<mediawiki xml:lang="en">
<page>
<title>Page title</title>
<restrictions>edit=sysop:move=sysop</restrictions>
<revision>
<timestamp>2001-01-15T13:15:00Z</timestamp>
<contributor><username>Foobar</username></contributor>
<comment>I have just one thing to say!</comment>
<text>A bunch of [[Special:MyLanguage/text|text]] here.</text>
<minor />
</revision>
<revision>
<timestamp>2001-01-15T13:10:27Z</timestamp>
<contributor><ip>10.0.0.2</ip></contributor>
<comment>new!</comment>
<text>An earlier [[Special:MyLanguage/revision|revision]].</text>
</revision>
</page>
<page>
<title>Talk:Page title</title>
<revision>
<timestamp>2001-01-15T14:03:00Z</timestamp>
<contributor><ip>10.0.0.2</ip></contributor>
<comment>hey</comment>
<text>WHYD YOU LOCK PAGE??!!! i was editing that jerk</text>
</revision>
</page>
</mediawiki>
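A small file in this shape is enough to test a streaming parser before pointing it at a full dump. The sketch below uses Python's xml.etree.ElementTree.iterparse on an embedded sample; SAMPLE_DUMP, local_name and iter_pages are hypothetical names, and the tag handling assumes the structure shown above. Real dumps declare a default namespace on the root element, so the code compares local tag names only.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a dump file, shaped like the documented export format.
SAMPLE_DUMP = b"""<mediawiki xml:lang="en">
  <page>
    <title>Page title</title>
    <revision>
      <timestamp>2001-01-15T13:15:00Z</timestamp>
      <text>A bunch of text here.</text>
    </revision>
  </page>
  <page>
    <title>Talk:Page title</title>
    <revision>
      <timestamp>2001-01-15T14:03:00Z</timestamp>
      <text>Some talk text.</text>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip a '{namespace}' prefix so namespaced and plain tags compare equal."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, text) per page; text comes from the last <text> in the page."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, None
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # release the finished page's children to keep memory low


for title, text in iter_pages(io.BytesIO(SAMPLE_DUMP)):
    print(title, len(text))
```

For a real dump, the same function can be given an opened compressed file (for example via gzip.open or bz2.open, depending on how the file is compressed) instead of the in-memory sample.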

When I use send_keys, Chrome doesn't receive all tags (XML)

I have a textbox, and I'm using find_element and send_keys to send XML to it. The problem is that when I send the XML, Chrome doesn't receive all the tags, as if the input were arriving too fast and some letters were dropped.
@browser.find_element(:xpath,'/html/body/form/textarea').send_keys(@xml)
@browser.find_element(:xpath,'//*[@id="BotaoEnviar"]').click
Example XML (taken from comment):
<Xml>
<Id>123456</Id>
<Card>1234567890</Card>
<Value>15000</Value>
</Xml>
What Chrome receives (with errors):
<Xml>
<Id>123456</Id>
<Card>1234567890</crtao>
<Value>15000</vaor>
</Xml>

Podcast from my server works on iTunes desktop, but not on mobile

I have an issue where I'm trying to host a podcast via RSS feed myself, and I'm running into a strange issue. I validated my RSS feed and all of that, got everything working on iTunes on the desktop just fine. However, when I go to play that same podcast on my iPhone, I either get a "podcast is temporarily unavailable" error, or nothing at all. All of the information shows up fine and everything, but nothing will play. I made sure to check with my hosting service that Byte Range Requests are supported, and enabled. I don't know what the problem could be. The RSS looks like this (I edited some personal information out):
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Striate Cortex Testing...</title>
<description>This is a test of the Striate Cortex Podcasting System...</description>
<link>http://www.striatecortex.com</link>
<language>en-us</language>
<copyright>Copyright 2016</copyright>
<lastBuildDate>Tue, 16 Feb 2016 08:55:00 -0500</lastBuildDate>
<pubDate>Tue, 16 Feb 2016 08:55:00 -0500</pubDate>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
<webMaster>EMAIL (NAME)</webMaster>
<itunes:author>Striate Cortex</itunes:author>
<itunes:subtitle>PODCASTNAME is a show about interviewing other musicians, and talking about what life is like on the road.</itunes:subtitle>
<itunes:summary>PODCASTNAME is a show about interviewing other musicians, and talking about what life is like on the road. Something else to make this longer.</itunes:summary>
<itunes:owner>
<itunes:name>Striate Cortex</itunes:name>
<itunes:email>EMAIL</itunes:email>
</itunes:owner>
<itunes:explicit>Yes</itunes:explicit>
<itunes:image href="http://www.striatecortex.com/podcast.png"/>
<itunes:category text="Music">
</itunes:category>
<atom:link href="http://www.striatecortex.com/test.xml" rel="self" type="application/rss+xml" />
<item>
<title>Striate Cortex - PodcastName Test # 1</title>
<link>http://www.striatecortex.com/podcast/test</link>
<guid>http://www.striatecortex.com/test.mp3</guid>
<description>This is a description.</description>
<enclosure url="http://www.striatecortex.com/test.mp3" length="29184000" type="audio/mpeg"/>
<category>Podcasts</category>
<pubDate>Tue, 16 Feb 2016 08:55:00 -0500</pubDate>
</item>
</channel>
</rss>

Extracting data from xml file using xmllint

I have a small xml document from which I need to extract some values using xmllint. I am able to navigate through the xml hierarchy using xmllint --shell xmlfilename command.
But I am unable to extract the values. I don't want to use grep or any other pattern-matching command, as I have already done that successfully.
I would appreciate any help with xmllint.
Here is my document. I want to extract the values 300$ and 500$.
<?xml version="1.0" encoding="ISO-8859-1"?>
<adi>
<asset>
<electronics item="Mobile" name="Nokia" value="300$" />
<electronics item="Mobile" name="Sony" value="500$" />
</asset>
</adi>
Another question: are these two snippets different representations of the same XML?
<?xml version="1.0" encoding="ISO-8859-1"?>
<adi>
<asset>
<electronics>
<item> Mobile </item>
<name>Nokia</name>
<value>300$</value>
</electronics>
<electronics>
<item> Mobile </item>
<name>Sony</name>
<value>500$</value>
</electronics>
</asset>
</adi>
With regards to your second question, those two snippets do not represent the same XML content. Attributes and child elements are not equivalent. A child element can be the root element of some arbitrary XML tree, but attributes are atomic.
E.g., I could modify the second snippet like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<adi>
<asset>
<electronics>
<item>
Mobile
<sub-item>Phone</sub-item>
</item>
<name>Nokia</name>
<value>300$</value>
</electronics>
<electronics>
<item> Mobile </item>
<name>Sony</name>
<value>500$</value>
</electronics>
</asset>
</adi>
where I have added <sub-item>Phone</sub-item> to the first <item> element.
However, there's no equivalent if item is an attribute instead, as in the first snippet.
This is late, but since searches for the xmllint tag land on this page, here is an answer anyway ;)
Use --xpath instead of --shell, like below:
xmllint --xpath '//electronics/value/text()' second-xml_file.xml
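The XPath above targets the element form (the second snippet). In the attribute form (the first snippet) the values live in attributes, so the expression would have to select an attribute instead, along the lines of //electronics/@value. The following Python sketch, which uses xml.etree.ElementTree rather than xmllint, shows both lookups side by side; the constant and function names are hypothetical and the documents are the two snippets from the question with the XML declaration omitted.

```python
import xml.etree.ElementTree as ET

# First snippet: the data is stored in attributes.
ATTRIBUTE_FORM = """<adi>
  <asset>
    <electronics item="Mobile" name="Nokia" value="300$" />
    <electronics item="Mobile" name="Sony" value="500$" />
  </asset>
</adi>"""

# Second snippet: the data is stored in child elements.
ELEMENT_FORM = """<adi>
  <asset>
    <electronics><item> Mobile </item><name>Nokia</name><value>300$</value></electronics>
    <electronics><item> Mobile </item><name>Sony</name><value>500$</value></electronics>
  </asset>
</adi>"""


def values_from_attributes(xml_text):
    """Read the 'value' attribute of every <electronics> element."""
    root = ET.fromstring(xml_text)
    return [elem.get("value") for elem in root.iter("electronics")]


def values_from_elements(xml_text):
    """Read the text of every <value> child of an <electronics> element."""
    root = ET.fromstring(xml_text)
    return [value.text for value in root.findall(".//electronics/value")]


print(values_from_attributes(ATTRIBUTE_FORM))
print(values_from_elements(ELEMENT_FORM))
```

The two functions are not interchangeable: each one only finds values in the representation it was written for, which mirrors the point that attributes and child elements are not equivalent.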

Finding a text node in xml using xpath problem

I'm using rails and the Nokogiri parser. My xml is as below and I'm trying to get the 'Biology: 08:00' text into my view.
<rss version="2.0">
<channel>
<item>
<title>Biology: 08:00</title>
<description>Start time of Biology</description>
<pubDate>Tue, 13 Oct 2009 UT</pubDate>
</item>
</channel>
</rss>
I can find the node with the text 'biology' using the code below
@content = doc.xpath('//title[contains(text(),"Biology")]')
When I move it into my view it strangely ends up as the title of my .html.erb page. I can't seem to get it into the body with
<body>
<%= @content %>
</body>
Does anyone know what's going on?
You're getting the whole node, and the node is a <title> tag, so the browser treats it as the page title.
You want:
@content = doc.xpath('//title[contains(text(),"Biology")]/text()')
to get the text content of the node.
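The difference between the node and its text can be seen outside Rails as well. This is a minimal Python sketch using xml.etree.ElementTree, not Nokogiri; the variable names are hypothetical and the document is the feed from the question. Serializing the node yields markup that still contains the title tag, while the text property yields only the string.

```python
import xml.etree.ElementTree as ET

RSS = """<rss version="2.0">
  <channel>
    <item>
      <title>Biology: 08:00</title>
      <description>Start time of Biology</description>
      <pubDate>Tue, 13 Oct 2009 UT</pubDate>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(RSS)

# Find the first <title> element whose text mentions "Biology".
title_node = next(t for t in root.iter("title") if "Biology" in (t.text or ""))

# Serializing the node keeps the surrounding <title> tag.
node_markup = ET.tostring(title_node, encoding="unicode").strip()

# Taking only the text drops the tag.
text_only = title_node.text

print(node_markup)
print(text_only)
```

Rendering the first value into an HTML page inserts a title element into the body, which is the behavior described in the question; rendering the second inserts plain text.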

Resources