What is the difference between XML and RDF [duplicate] - ontology

I'm unclear about the differences between JSON, XML and RDF.
I read on the internet:
JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate.
The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web.
Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.
So if I'm not mistaken, JSON is used for the exchange of data, and XML is used for the exchange of data too.
What are the main differences between these two?
RDF is used to describe resources on the Web and has an XML-based syntax. So can XML be used both for the exchange of information and for the definition of new languages?
Can you give me some clarification?
EDIT:
What I understand is:
"Resource Description Framework" suggests that provides a framework for describing resources. In a university exam I have used RDF to describe the ontology of a company that I have described the main components of a company and the relationships between them.
RDF is important for the semantic web because "describe resources" allows us to associate a semantic meaning to resources.
XML is a markup language. A markup language is a set of rules that describe the mechanisms of representation (structural, semantic or presentational) of a text (Wikipedia). For this reason it can be used to define the structure of languages such as RDF/XML or SOAP.
You also say that it is used to serialize data.
JSON is only for serializing data. For serializing data, JSON and XML are similar, but with XML and XML Schema I can associate semantic meaning with the data, or am I wrong?
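As a rough illustration of that similarity, here is a minimal Python sketch (the record and its field names are invented for the example) that serializes the same data to both JSON and XML; note that neither format on its own attaches machine-interpretable semantics to the fields:

```python
import json
import xml.etree.ElementTree as ET

# Invented record used only for illustration.
company = {"name": "Acme", "employees": 42}

# JSON serialization: a lightweight key/value rendering of the data.
json_text = json.dumps(company, indent=2)

# XML serialization: the same data wrapped in markup elements.
root = ET.Element("company")
ET.SubElement(root, "name").text = company["name"]
ET.SubElement(root, "employees").text = str(company["employees"])
xml_text = ET.tostring(root, encoding="unicode")

print(json_text)
print(xml_text)
```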

XML started life as a document markup language. It has additionally been widely used to store (serialize) data structures in various programming languages and is the basis of SOAP-based web services.
JSON and YAML are designed to record data structures. YAML has been described as a superset of JSON. In practice I have found there is little practical difference, apart from the fact that YAML is easier for humans to read and write. JSON is now more widely favoured by REST-based web services, due to its simplicity.
RDF is less a data format and more accurately described as a metadata data model. It is used to record information on the internet and is one of the building-block standards of the Semantic Web. RDF can be expressed in various different formats, for example XML and JSON. I can recommend the following link as an introduction:
https://github.com/JoshData/rdfabout/blob/gh-pages/intro-to-rdf.md
For some RDF examples and more discussion on this topic:
JSON to XML conversion
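To make the "expressed in various different formats" point concrete, here is a minimal sketch using Python's rdflib library (assuming a recent rdflib that bundles the JSON-LD serializer; the example.org URIs are invented for illustration), showing one and the same triple rendered as RDF/XML, Turtle and JSON-LD:

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF

EX = Namespace("http://example.org/")

g = Graph()
# One triple: the resource ex:alice has the foaf:name "Alice".
g.add((EX.alice, FOAF.name, Literal("Alice")))

# The same RDF graph serialized in three different concrete syntaxes.
print(g.serialize(format="xml"))      # RDF/XML
print(g.serialize(format="turtle"))   # Turtle
print(g.serialize(format="json-ld"))  # JSON-LD (rdflib >= 6.0)
```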

Related

Does it make sense to interrogate structured data using NLP?

I know that this question may not be suitable for SO, but please let it stay here for a while. Last time one of my questions was moved to Cross Validated, it froze; no more views or feedback.
I came across a question that does not make much sense to me: how can IFC models be interrogated via NLP? Consider IFC models as semantically rich structured data. IFC defines an EXPRESS-based entity-relationship model consisting of entities organized into an object-based inheritance hierarchy. Examples of entities include building elements, geometry, and basic constructs.
How could NLP be used for such type of data? I don't see NLP relevant at all.
In general, I would suggest that using NLP techniques to "interrogate" already (quite formally) structured data like EXPRESS would be overkill at best and a time / maintenance sinkhole at worst. In general, the strengths of NLP (human language ambiguity resolution, coreference resolution, text summarization, textual entailment, etc.) are wholly unnecessary when you already have such an unambiguous encoding as this. If anything, you could imagine translating this schema directly into a Prolog application for direct logic queries, etc. (which is quite a different direction than NLP).
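For contrast, a direct structured query against an IFC model is typically only a few lines of code. The sketch below uses the open-source ifcopenshell Python library as one concrete alternative to an NLP pipeline (the file name is a placeholder, and the exact attributes available depend on the IFC schema version):

```python
import ifcopenshell  # open-source IFC toolkit

# Placeholder path to an IFC model file.
model = ifcopenshell.open("building.ifc")

# A structured "query": fetch every wall entity defined by the schema.
for wall in model.by_type("IfcWall"):
    # GlobalId and Name are standard IfcRoot attributes.
    print(wall.GlobalId, wall.Name)
```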
I did some searches to try to find the references you may have been referring to. The only item I found was Extending Building Information Models Semiautomatically Using Semantic Natural Language Processing Techniques:
... the authors propose a new method for extending the IFC schema to incorporate CC-related information, in an objective and semiautomated manner. The method utilizes semantic natural language processing techniques and machine learning techniques to extract concepts from documents that are related to CC [compliance checking] (e.g., building codes) and match the extracted concepts to concepts in the IFC class hierarchy.
So in this example, at least, the authors are not "interrogating" the IFC schema with NLP, but rather using it to augment existing schemas with additional information extracted from human-readable text. This makes much more sense. If you want to post the actual URL or reference that contains the "NLP interrogation" phrase, I should be able to comment more specifically.
Edit:
The project grant abstract you referenced does not contain much in the way of details, but they have this sentence:
... The information embedded in the parametric 3D model is intended for facility or workplace management using appropriate software. However, this information also has the potential, when combined with IoT sensors and cognitive computing, to be utilised by healthcare professionals in Ambient Assisted Living (AAL) environments. This project will examine how as-constructed BIM models of healthcare facilities can be interrogated via natural language processing to support AAL. ...
I can only speculate on the following reason for possibly using an NLP framework for this purpose:
While BIM models include Industry Foundation Classes (IFCs) and aecXML, there are many dozens of other formats, many of them proprietary. Some are CAD-integrated and others are standalone. Rather than pay for many proprietary licenses (some of these enterprise products are quite expensive), and/or spend the time to develop proper structured query behavior for the various diverse file format specifications (which may not be publicly available in proprietary cases), the authors have chosen a more automated, general solution to extract the content they are looking for (which I assume must be textual or textual tags in nearly all cases). This would almost be akin to a search engine "scraping" websites and looking for key words or phrases and synonyms to them, etc. The upside is they don't have to explicitly code against all the different possible BIM file formats to get good coverage, nor pay out large sums of money. The downside is they open up new issues and considerations that come with NLP, including training, validation, supervision, etc. And NLP will never have the same level of accuracy you could obtain from a true structured query against a known schema.

Should I use an ontology or a database?

I have a set of football-related keywords, plus data sets of positive sentiment words and negative sentiment words. My requirement is to combine these, search social media to get some real-time discussions and posts, do some statistical analysis, and reach some conclusions. These keywords and data sets are updated dynamically. Now my questions are:
What is the best practice for handling the three sets of data: an ontology structure or a well-structured database?
Is the data in an ontology accessible from any programming language? Can I update or retrieve the data in the ontology using .NET or R or any other programming language?
Thank you
Representing the related keywords as an ontology is a better idea than storing them in a database.
SPARQL can be used to access and search the ontology to get related information.
Your system will be semantically rich if it is an ontology.
If it is a database, the access time may be better, but it will not be semantically rich.
You may use Apache Jena, which is a free Java API for creating and working with ontologies.
Python also has many plugins for ontology generation.
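As a rough sketch of the "accessible from any programming language" point, here is how you might load such a keyword ontology and run a SPARQL query from Python with rdflib (the file name, namespace and property names are invented for illustration; Apache Jena offers the equivalent in Java):

```python
from rdflib import Graph

g = Graph()
# Placeholder ontology file containing the football keywords.
g.parse("football_keywords.ttl", format="turtle")

# Hypothetical query: fetch every keyword tagged with a positive sentiment.
query = """
PREFIX ex: <http://example.org/football#>
SELECT ?keyword WHERE {
    ?keyword a ex:Keyword ;
             ex:sentiment "positive" .
}
"""
for row in g.query(query):
    print(row.keyword)
```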

Comparison of OData and Semantic Web/Linked Data

I'm trying to get my head around two very different approaches to data sharing: OData and Semantic Web/Linked Data. Is there a good comparison of the two?
As I understand it, OData combines syndication/CRUD (AtomPub), serialisation formats (XML, JSON), a data model, a query language, and some semantics/conventions governing use of those existing technologies. It's primarily intended for exposing data from one system so that others can consume it.
Linked Data is a data model, a rigorous commitment to URIs, an (optional?) serialisation format (RDF/XML), but (correct me if I'm wrong) doesn't say anything about transport, CRUD, etc. It seems intended to allow inferencing across lots of little chunks of data drawn from a wide variety of sources. (Not something of major importance to us right now - we would be synchronising large slabs of data between a small number of sources, and wanting to preserve provenance information).
I'm interested in technologies for sharing data between certain data management platforms, some of which I work on directly. OData seems more appealing as it's very straightforward to explain to developers: implement this API, follow that Atom standard, serialise the data like this. We're already doing something very similar for one platform: sharing XML-serialised data on an Atom feed, with URL parameters used to filter.
By contrast, my past experiences working with RDF have given me an impression of brittle, opaque (massive slabs of RDF/XML), inaccessible (using SPARQL vs SQL) technology - but perhaps I'm confusing the experience of working with a triplestore like Jena with simply exposing an existing database via a linked data API.
Any pointers, comments etc on the differences and similarities between these two approaches in terms of scope, technologies, ease, future potential etc would be great.
I think discussing this in depth is not really what Stack Overflow is meant for, but just to give you some pointers to interesting discussions about the differences and overlap:
Oh - it is data on the Web
Microsoft, OData and RDF
One of the key differences seems to be that OData has no means to link data from different sources to each other. Essentially, you're still stuck in a silo.
It might also be interesting to check out various attempts to convert data between the two approaches. See, among others, http://answers.semanticweb.com/questions/1298/has-anyone-written-a-mapping-from-odata-to-rdf .
OData may be easier, but it's not better by any means. SPARQL and RDF (forget RDF/XML; better to look at Turtle) satisfy everything in OData while providing many more cutting-edge features (see the query sketch below the list), such as:
Federation Extensions
Linked Data
Reasoning and Inference (for the more brave)
Equally, the software supporting the standards is actually quite sophisticated. Most people interested in OData generally come from a Microsoft background, so take a look at dotNetRDF.
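As mentioned above, here is a minimal sketch of querying a public SPARQL endpoint over HTTP from Python using the SPARQLWrapper package (DBpedia is used only as a convenient public endpoint; dotNetRDF offers the equivalent for .NET developers):

```python
from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

# Any public SPARQL endpoint works; DBpedia is just a handy example.
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    SELECT ?label WHERE {
        <http://dbpedia.org/resource/OData> rdfs:label ?label .
    } LIMIT 5
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```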
Here's a comparison matrix:
http://uoccou.wordpress.com/2011/02/17/linked-data-odata-gdata-datarss-comparison-matrix/
Unfortunately the table formatting is pretty horrible, but the content is useful.

Framework for building structured binary data parsers?

I have some experience with Pragmatic-Programmer-type code generation: specifying a data structure in a platform-neutral format and writing templates for a code generator that consume these data structure files and produce code that pulls raw bytes into language-specific data structures, does scaling on the numeric data, prints out the data, etc. The nice pragmatic(TM) ideas are that (a) I can change data structures by modifying my specification file and regenerating the source (which is DRY and all that) and (b) I can add additional functions that can be generated for all of my structures just by modifying my templates.
What I had used was a Perl script called Jeeves which worked, but it's general purpose, and any functions I wanted to write to manipulate my data I was writing from the ground up.
Are there any frameworks that are well suited to creating parsers for structured binary data? What I've read of ANTLR suggests that it's overkill. My current target languages of interest are C#, C++, and Java, if it matters.
Thanks as always.
Edit: I'll put a bounty on this question. If there are any areas that I should be looking it (keywords to search on) or other ways of attacking this problem that you've developed yourself, I'd love to hear about them.
You may also look at a relatively new project, Kaitai Struct, which provides a language for that purpose and also has a good IDE:
Kaitai.io
You might find ASN.1 interesting, as it provides an abstract way to describe the data you might be processing. If you use ASN.1 to describe the data abstractly, you need a way to map that abstract data to concrete binary streams, for which ECN (Encoding Control Notation) is likely the right choice.
The New Jersey Machine Toolkit is actually focused on binary data streams corresponding to instruction sets, but I think that's a superset of just binary streams. It has very nice facilities for defining fields in terms of bit strings, and automatically generating accessors and generators of such. This might be particularly useful if your binary data structures contain pointers to other parts of the data stream.
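If a full framework turns out to be overkill, the baseline these tools automate can also be sketched by hand. Below is a minimal Python example using the standard struct module (the record layout and scaling factor are invented for illustration) that unpacks a fixed binary header into named fields, the kind of accessor code a generator would otherwise emit:

```python
import struct
from typing import NamedTuple

class SensorRecord(NamedTuple):
    """Invented example layout: 2-byte id, 2-byte flags, 4-byte raw reading."""
    sensor_id: int
    flags: int
    reading: float

# ">HHi" = big-endian: unsigned short, unsigned short, signed int (8 bytes).
_LAYOUT = struct.Struct(">HHi")

def parse_record(raw: bytes) -> SensorRecord:
    sensor_id, flags, raw_reading = _LAYOUT.unpack(raw[:_LAYOUT.size])
    # Apply a (made-up) scaling factor so callers see engineering units.
    return SensorRecord(sensor_id, flags, raw_reading * 0.01)

# Example usage with a hand-built 8-byte buffer.
sample = struct.pack(">HHi", 7, 0x0001, 2500)
print(parse_record(sample))  # SensorRecord(sensor_id=7, flags=1, reading=25.0)
```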

Why is XML store not available in iOS?

From the Core Data Programming Guide:
iOS: The XML store is not available on iOS.
Why isn't this available? Is it because of the lack of certain XML classes or does it require too much processing power or RAM?
Apple would be the authoritative source for this, so we can only guess.
It’s probably because of two factors: XML stores are slower (as stated in the official documentation, mainly because of the need to parse XML and lack of efficient algorithm/data structures for common database operations) and potentially use more disk space than SQLite stores (since data must be enclosed in tags and XML stores use human-readable representation of data).
Edit: libxml2 is available on iOS so XML parsing functionality (or lack thereof) is certainly not the reason.
