This is for the case of calling Saxon from a Java application. I understand that Saxon can use XPath 3.1 to run queries against JSON files. A couple of questions on this:
Where is there an example of how to do this? I've done searches and found lots of answers on the details of doing this, but nothing on how to read in the file and perform queries. Is it the same as XML?
Is it possible to have a schema file for the JSON so returned values are correctly typed? If so, how?
Is XQuery also able to perform queries on JSON?
What version of Saxon supports this? (We are using 9.9.1.1 and want to know if I need to upgrade.)
Technically, you don't run queries against JSON files; you run them against the data structure that results from parsing a JSON file, which is a structure of maps and arrays. You can parse the JSON file using the parse-json() or json-doc() functions, and then query the result using operators that work on maps and arrays. Some of these (and examples of their use) are shown in the spec at
https://www.w3.org/TR/xpath-31/#id-maps-and-arrays
Googling for "query maps arrays JSON XPath 3.1" finds quite a lot of useful material. Or get Priscilla Walmsley's book: http://www.datypic.com/books/xquery/chapter24.html
Data types: the data types of string, number, and boolean that are intrinsic to JSON are automatically recognized by their form. There's no capability to do further typing using a schema.
XQuery is a superset of XPath, but as far as JSON/Maps/Arrays are concerned, I think the facilities in XPath and those in XQuery are exactly the same.
Saxon has added a bit of extra conformance and performance in each successive release. 9.9 is pretty complete in its coverage; 10.0 adds some optimizations (like a new internal data structure for maps whose keys are all strings, such as you get when you parse JSON). Details of changes in successive Saxon releases are described in copious detail at http://www.saxonica.com/documentation/index.html#!changes
Related
We are using Saxon purely to query data. We're about to update to XPath 3.1. For reading queries (no insert/update/delete) is there any difference between XPath 3.1 and XQuery (latest version)?
If so, what? I'm asking to determine if we should implement an XQuery API in our system along with the XPath 3.1?
The main differences are:
XQuery has node constructors (e.g. <out>{/x/y}</out>).
XQuery has full FLWOR expressions with order by, group by, window clauses etc.
So XQuery is a bit more powerful for complex queries, but more importantly, it allows construction of a new XML document to represent the result.
Is there an existing tool to convert JSON-LD to Markdown? I want to generate documentation from my JSON-LD file. If such a thing doesn't exist, how should I go about writing one? Or perhaps I could use a JSON-to-Markdown converter? Any suggestions on how I could do this?
I was just googling for such a program, and found your question.
The closest things I could find are: ocxmd, which is an extension to Markdown; and md-ld, which does not even use proper Markdown - instead, it apparently creates an incompatible version of the format which can be parsed to JSON-LD.
If I were writing such a converter in Python, I would use:
pyld to parse JSON-LD files and expand them using the @context;
And a template engine, likely Jinja2, to generate Markdown representation of every node of the JSON-LD document.
The program would be based on recursion. You might have separate functions to display:
URIs,
Numbers,
Images,
...
The program will recurse over the JSON-LD document and convert each of its sections into Markdown format.
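The recursion described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses plain Python dictionaries instead of pyld and Jinja2, it does not expand terms against the @context, and the function names (is_uri, render_scalar, render, to_markdown) and the sample document are made up for the example.

```python
def is_uri(value):
    """Treat http(s) strings as URIs; a real tool would handle more schemes."""
    return isinstance(value, str) and value.startswith(("http://", "https://"))


def render_scalar(value):
    """Render a leaf value: URIs become autolinks, everything else plain text."""
    return f"<{value}>" if is_uri(value) else str(value)


def render(node, depth=0):
    """Recursively turn a JSON-LD value into a list of Markdown bullet lines."""
    indent = "  " * depth
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":  # context is metadata, not documentation content
                continue
            if isinstance(value, (dict, list)):
                lines.append(f"{indent}- **{key}**:")
                lines.extend(render(value, depth + 1))
            else:
                lines.append(f"{indent}- **{key}**: {render_scalar(value)}")
    elif isinstance(node, list):
        for item in node:
            lines.extend(render(item, depth))
    else:
        lines.append(f"{indent}- {render_scalar(node)}")
    return lines


def to_markdown(document):
    """Join the rendered lines into one Markdown string."""
    return "\n".join(render(document))


# Hypothetical sample document, not taken from any real vocabulary file.
example = {
    "@context": "https://schema.org",
    "name": "Ada",
    "sameAs": ["https://example.org/ada", "Countess"],
    "knows": {"name": "Bob"},
}
print(to_markdown(example))
```

Separate branches for images, numbers and so on would slot into render_scalar; swapping the f-strings for Jinja2 templates is a natural next step once the structure is settled.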
I want to use the from_xml method provided by the Rails Active Support core extensions,
An example :
hash = Hash.from_xml(File.read("my_xml.xml"))  # from_xml expects the XML text, not a file name
for converting XML into a Hash.
I want to use this because working with a Hash is a lot easier than working with XML in Ruby.
However, I would like to know the pros and cons of this approach. Is there a better approach for converting XML into a Hash?
Thanks
Unlike JSON, XML is a document format and not just a data exchange format, so it does not always map cleanly to programming language constructs like hashes. XML is actually ridiculously complex if you look at all of its features, such as namespaces.
Hash.from_xml really only handles the simplest of cases and doesn't have a clue how to deal with stuff like attributes. It really only knows how to parse the XML generated by Hash#to_xml.
Advantages:
it's so naive that it's cute
Disadvantages:
see advantages
For non-trivial examples you'll need an actual XML parser like Nokogiri.
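The mapping problem is not specific to Ruby. As a rough illustration (this is not how Hash.from_xml is implemented; naive_to_dict is a hypothetical converter written for this example, in Python), a simple element-to-dictionary conversion silently drops attributes and repeated elements:

```python
import xml.etree.ElementTree as ET


def naive_to_dict(elem):
    """Hypothetical naive converter: child tags become keys, text becomes the value."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        # Attributes are ignored, and a repeated tag overwrites the earlier one.
        result[child.tag] = naive_to_dict(child)
    return result


xml_text = '<order id="7"><item sku="a1">pen</item><item sku="b2">ink</item></order>'
root = ET.fromstring(xml_text)
converted = {root.tag: naive_to_dict(root)}
print(converted)
```

The id and sku attributes and the first item are all missing from the result, which is the kind of loss a real XML parser lets you avoid.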
I am using Graph() in RDFLib, and I am correctly getting results from the graph using SPARQL. Is it possible to get the results directly in HTML table format?
rdflib is a library for working with RDF in Python, not an HTML rendering engine. Usually, if you run a graph.query() SPARQL query, you want to access the result in Python itself.
That said, there is a fork focusing on hosting RDF called rdflib-web. In it you can find a htmlresults.py which does pretty much what I think you want.
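If pulling in another package is not an option, rendering the rows yourself is short. The sketch below is not taken from rdflib-web; rows_to_html is a hypothetical helper that works on plain sequences, and the assumption is that you would feed it the variable names and the rows you get from iterating over a SELECT result.

```python
from html import escape


def rows_to_html(headers, rows):
    """Build a minimal HTML table, escaping every header and cell value."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


# Stand-in data; with rdflib the rows would come from iterating the query result
# (assumption: each row behaves like a tuple of terms that str() can render).
sample_rows = [("http://example.org/a", "Alice"), ("http://example.org/b", "B & C")]
print(rows_to_html(["subject", "label"], sample_rows))
```

Escaping matters here because literals in a graph can contain characters such as < and & that would otherwise break the markup.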
I need a data format that keeps parsing time to a minimum. In other words, I'm looking for a format with as little overhead as possible that can be parsed in the shortest amount of time.
I am building an application which will pull a lot of data from an API, parse it and display it to the user. So the format should be as small as possible so that the transmission will be fast and also should be very efficient for parsing. What are my options?
Here are a few formats that pop into my head:
XML (a lot of overhead and slow parsing IMO)
JSON (still too cumbersome)
MessagePack (looks interesting)
CSV (with a custom parser written in C)
Plist (fast parsing, a lot of overhead)
... any others?
So currently I'm looking at CSV the most. Any other suggestions?
As stated by Apple in the Property List Programming Guide, the binary plist representation should be the fastest:
Property List Representations
A property list can be stored in one of three different ways: in an
XML representation, in a binary format, or in an “old-style” ASCII
format inherited from OpenStep. You can serialize property lists in
the XML and binary formats. The serialization API with the old-style
format is read-only.
XML property lists are more portable than the binary alternative and
can be manually edited, but binary property lists are much more
compact; as a result, they require less memory and can be read and
written much faster than XML property lists. In general, if your
property list is relatively small, the benefits of XML property lists
outweigh the I/O speed and compactness that comes with binary property
lists. If you have a large data set, binary property lists, keyed
archives, or custom data formats are a better solution.
You just need to pass the correct format flag, NSPropertyListBinaryFormat_v1_0, when creating or reading the property list. Just be sure that the data you want to store can be represented in this format.
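To get a feel for the size difference without writing any Cocoa code, the same two representations can be produced with Python's standard plistlib module. This is a stand-in for the Apple API rather than an example of it, and the records data is invented for the comparison.

```python
import plistlib

# Invented sample data: only plist-compatible types (dict, list, str, int).
records = {"items": [{"id": i, "name": f"item-{i}"} for i in range(100)]}

# Serialize the same structure in both representations.
xml_bytes = plistlib.dumps(records, fmt=plistlib.FMT_XML)
binary_bytes = plistlib.dumps(records, fmt=plistlib.FMT_BINARY)

# loads() detects the format on its own, so reading needs no flag here.
restored = plistlib.loads(binary_bytes)

print("XML size:   ", len(xml_bytes))
print("binary size:", len(binary_bytes))
```

Timing the two loads() calls on your own payload (for example with timeit) is the only reliable way to decide whether the difference matters for your application.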