Problems getting data from XML using Nokogiri and Rails - ruby-on-rails

I'm trying to get information from a XML file with Nokogiri. I can retrieve file using
f = File.open("/my/path/file.xml")
cac=Nokogiri::XML(f)
And what a get is a fancy noko:file. My row tags are defined like
<z:row ...info..../>
like
<Nokogiri::XML::Element:0x217e7b8 name="z:row" attributes=[#<Nokogiri::XML::Attr:0x217e754 name="ID_Poblacio" value="3">
and I cannot retrieve the rows using either:
s=cac.at_xpath("/*/z:row") or
s=cac.at_xpath("//z:row") or
s=cac.at_xpath("//row") or
s=cac.at_xpath("z:row")...
Probably I'm really fool but I cannot figure out which can be the issue.
Does anyone face this problem?
Thanks in advance.
P:S I tried to paste my cac file directly from bash but something wierd happens with format so I remove it from question. If anyone can explain how to do it I will appreciate it.

Your XML element name contains a colon, but it is not in a namespace (otherwise the prefix and uri would show up in the dump of the node). Using element names with colons without using namespaces is valid, but can cause problems (like this case) so generally should be avoided. Your best solution, if possible, would be to either rename the elements in your xml to avoid the : character, or to properly use namespaces in your documents.
If you can’t do that, then you’ll need to be able to select such element names using XPath. A colon in the element name part of an XPath node test is always taken to indicate a namespace. This means you can’t directly specify a name with a colon that isn’t in a namespace. A way around this is to select all nodes and use an XPath function in a predicate to refine the selection to only those nodes you’re after. You can use a colon in an argument to name() and it won’t be interpreted as a namespace separator:
s=cac.at_xpath("//*[name()='z:row']")

Related

List of System.IO.Packaging ContentType strings?

I'm updating some older code that uses System.IO.Packaging to programmatically create Excel files. It ultimately calls CreatePackage on various bits of internal state to build out the documents. CreatePackage takes a ContentType parameter, and the existing code contains a list of constants like:
Const cxl07WorksheetContentType = "application/vnd...."
I'm trying to add support for PivotTables, which require a PivotCache. I cannot find the appropriate ContentType. I thought I might be able to discern it from within the file, looking in _Rels for instance, but these are always in URI form and bear no obvious relationship to these constants.
So...
is this even required? I passed Nothing and "" but that did not work.
does anyone know where these might be defined? I looked on the mime database, and the MS website, but nothing came up on either for "pivot"
where are these even used? they do not appear in the resulting Package as far as I can see.

Why use "?" instead of ":" in URL?

We can use
'PATCH /companies/:id' : 'CompanyController.find'
to update data.
One suggested me that I can use the alternative way:
'PATCH /companies/find?key=Value'
But I do not know what it works. Please explain me why we prefer ? mark than : mark in search path.
You can use either or. The biggest reason most people chose one or the other is just how they want to present the URL to the user.
Using a path variable (:) can symbolize you're accessing a defined resource, like a user ID, where as an argument (?) can symbolize you're are dynamically changing/searching something within a defined resource, like a token or search term.
From what I can tell that's the general practice I see:
example.com/user/:username
versus
example.com/user/?search="foo"
http://en.wikipedia.org/wiki/URL
If we are firing GET request, ? symbol is used to let the server know the url parameter variables starts from there. And this is commonly used. I didn't used : symbol instead of ?
You are probably messing the things up:
According to your example, :id indicates a variable that must me replaced by an actual value in some frameworks such as Express. See the documentation for details.
And ? indicates the beginning of the query string component according to the RFC 3986.
It's a rule to design rest api
you can find 'how to design a rest api'
Assuming below code is Sails.js
'PATCH /companies/:id' : 'CompanyController.find'
It will makes REST API that be mapped onto 'CompanyController.find' by using PathParam. Like this
www.example.com/companies/100
Second one will makes REST API by using QueryParam.
It also be mapped onto 'CompanyController.find'
/companies/find?key=Value
But the API format is different. Like this
www.example.com/companies/find?key=100
PathParam or QueryParam is fine to make REST API.
If the Key is primary for company entity,
I think PathParam is more proper than QueryParam.

Any problems with using a period in URLs to delimiter data?

I have some easy to read URLs for finding data that belongs to a collection of record IDs that are using a comma as a delimiter.
Example:
http://www.example.com/find:1%2C2%2C3%2C4%2C5
I want to know if I change the delimiter from a comma to a period. Since periods are not a special character in a URL. That means it won't have to be encoded.
Example:
http://www.example.com/find:1.2.3.4.5
Are there any browsers (Firefox, Chrome, IE, etc) that will have a problem with that URL?
There are some related questions here on SO, but none that specific say it's a good or bad practice.
To me, that looks like a resource with an odd query string format.
If I understand correctly this would be equal to something like:
http://www.example.com/find?id=1&id=2&id=3&id=4&id=5
Since your filter is acting like a multi-select (IDs instead of search fields), that would be my guess at a standard equivalent.
Browsers should not have any issues with it, as long as the application's route mechanism handles it properly. And as long as you are not building that query-like thing with an HTML form (in which case you would need JS or some rewrites, ew!).
May I ask why not use a more standard URL and querystring? Perhaps something that includes element class (/reports/search?name=...), just to know what is being queried by find. Just curious, I knows sometimes standards don't apply.

Resolve prefix programmatically, Jena

I have to parse through xml which contain URI links to dbpedia.org. I have to extract rdf triples from those URI based on a given Ontology using Jena library. How do I resolve the Prefix programmatically in Java based on the ontology given.
The given ontology says that triples can be extracted by querying dbpedia.org. For all such triples the corresponding dbpedia resource is available to start writing the query. But the problem is how do I write the query with only its resource available. I have the properties to query. But I don't have the PREFIX for those properties
Although this may not answer the question directly, I had a whole load of URIs, some prefixed some not and I wanted them all unprefixed (i.e. written out in full with their prefixes resolved). Searching Google the most useful thing I came across was this question (first) and the JavaDoc so I thought I'd add my experience to this question to help anyone else who might be searching for the same thing.
Jena's PrefixMap interface (which Model implements) has expandPrefix and qnameFor methods. The expandPrefix method is the one which helped me (qnameFor does the reverse i.e. it applies a prefix from a PrefixMap to a string and returns null if no such mapping can be found).
Hence for any resource, to ensure that you have a fully expanded URI you can do
myRes.getModel().expandPrefix(myRes.getURI());
Hope this helps someone.
Your question is not very clear, so if this answer doesn't address your issue please edit your question to say more clearly what your problem is. However, based on what you've asked, once you've read an RDF file into a Jena model, in XML or any other encoding, the prefixes used are available through the methods in the interface com.hp.hpl.jena.shared.PrefixMapping (see javadoc). A Model object implements that interface. To automatically expand prefix "foo", use the method getNsPrefixURI().
Edit
OK, given your revised question, there's a number of things you can do to turn a simple property name into a property URI that you can use in a SPARQL query:
use the prefix.cc service to look at possible expansions of prefixes and prefix names (e.g. if you are given dbpedia:elevation, you can look it up on prefix.cc (i.e: http://prefix.cc/dbpedia:elevation) to see that one of the possible expansions is http://dbpedia.org/ontology/elevation
issue a SPARQL describe query on the resource URI to see which properties are returned in the RDF description, then match those to the un-prefixed property names you've been given
ask your data provider to give you full property names, or otherwise provide the prefix expansions, in order to save you from having to reverse engineer which properties he or she meant in the first place.
Personally I'd advocate the third option if that's at all possible.

is it bad form to have the colon in db and in the URL?

I am designing my namespace such that the id i am storing in the DB is
id -> "e:t:222"
where "e" represents the Event class, "t" represents the type
i am also expecting to use this id in my urls
url -> /events/t:222
Is there anything wrong with doing this?
Is there anything wrong with doing this?
Yes: The colon is a reserved character in URLs that has a special meaning, namely specifying the server port, in a URL.
Using it in other places in the URL is a bad idea.
You would need to URLEncode the colon in order to use it.
There is nothing wrong with doing this, you'll simply need to encode the URL properly. Most libraries with do this automatically for you.
In general though, if you care about your data you shouldn't let the application drive the data or database design. Exceptions to this are application centric databases that have no life outside of a single application nor do you expect to use the data anywhere else. In this case, you may want to stick with schemas and idioms that work best with your application.

Resources