How to load TDB storage with inference via tdbloader.bat (Windows, Jena 2.7.3)?
I used this assembler file:
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
<#dataset> rdf:type ja:RDFDataset ;
ja:defaultGraph <#infModel> .
<#infModel> a ja:InfModel ;
ja:baseModel <#tdbGraph>;
ja:reasoner
[ ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner> ].
<#tdbGraph> rdf:type tdb:GraphTDB ;
tdb:location "DB";
.
My command:
c:\apache-jena-2.7.3\bat>tdbloader --tdb=test.ttl C:\apache-jena-2.7.3\Lubm10\*
I got an exception:
java.lang.ClassCastException: com.hp.hpl.jena.reasoner.rulesys.FBRuleInfGraph cannot be cast to com.hp.hpl.jena.tdb.store.GraphTDB
What is wrong?
(removing semicolon after "DB" - does not help)
It's not clear what you are trying to achieve. tdbloader is a tool for loading triples into a TDB store, prior to processing those triples via your app or SPARQL end-point. Separately, from your app code, you can construct a Jena model which uses the inference engine over a base model from a TDB graph. But I can't see why you are using an inference model at load time. If you look at the exception you are getting:
FBRuleInfGraph cannot be cast to com.hp.hpl.jena.tdb.store.GraphTDB
it confirms that you can't use an inference graph at that stage of the process, and I'm not sure why you would. Unless, of course, you are trying to statically compute the inference closure over the base model and store that in TDB, saving the need for inference computation at runtime. However, if you are trying to do that, I don't believe that can currently be done via the Jena assembler. You'll have to write custom code to do that at the moment.
Bottom line: separate the concerns. Use a plain graph description for tdbloader, use the inference graph at run time.
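A plain dataset description for the load step could look like the sketch below. It reuses the prefixes and the "DB" location from the question and simply drops the ja:InfModel layer; treat it as a starting point under those assumptions, not as a configuration verified against Jena 2.7.3.

```turtle
@prefix tdb:  <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja:   <http://jena.hpl.hp.com/2005/11/Assembler#> .

# Same class loading and type declarations as in the question.
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB   rdfs:subClassOf ja:Model .

# A plain TDB dataset: no ja:InfModel in between, so the loader
# is handed a storage-backed dataset rather than an inference graph.
<#dataset> rdf:type tdb:DatasetTDB ;
    tdb:location "DB" .
```

The inference assembler from the question can then stay in the application or endpoint configuration, pointing at the same location.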
I want to use Jena Fuseki to construct a SPARQL endpoint for some ontology file.
and my fuseki config as follow:
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
<#service1> rdf:type fuseki:Service ;
fuseki:name "ds" ; # http://host:port/ds
fuseki:serviceQuery "sparql" ; # SPARQL query service
fuseki:serviceQuery "query" ; # SPARQL query service (alt name)
fuseki:serviceUpdate "update" ; # SPARQL update service
fuseki:serviceUpload "upload" ; # Non-SPARQL upload service
fuseki:serviceReadWriteGraphStore "data" ; # SPARQL Graph store protocol (read and write)
# A separate read-only graph store endpoint:
fuseki:serviceReadGraphStore "get" ; # SPARQL Graph store protocol (read only)
fuseki:dataset <#dataset> ;
.
<#dataset> rdf:type ja:RDFDataset ;
ja:defaultGraph <#inf_model> ;
.
<#mv_data_model> a ja:MemoryModel;
ja:content[ja:externalContent <file://D:/Program%20Files/d2rq-0.8.1/movie.nt>] ;
ja:content[ja:externalContent <file://D:/Program%20Files/apache-jena-fuseki-3.13.1/run/ontology.ttl>]
.
<#inf_model> a ja:InfModel ;
ja:baseModel <#mv_data_model>;
ja:reasoner [ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>] ;
#ja:reasoner [
# ja:reasonerURL <http://jena.hpl.hp.com/2003/GenericRuleReasoner> ;
# ja:rulesFrom <file://D:/Program%20Files/apache-jena-fuseki-3.13.1/run/rule.ttl>; ]
.
I run Fuseki as a standalone server. With the OWL reasoner disabled it works well. But once the OWL reasoner is enabled, the server stops responding to queries. Even a query like
SELECT ?s ?p ?o
WHERE {
?s ?p ?o
}
limit 10
gets no response, and then the server throws an exception: java.lang.OutOfMemoryError. However, the rule reasoner works well.
My TTL file has about 1,500,000 triples. Is the data too large for the OWL reasoner to run inference over?
All the work is done on my PC. Can anyone help? Thanks.
In Fuseki, when a reasoner runs over a dataset that is too large, the inferences are applied to the whole graph at query execution time. In addition, if the reasoner applies forward reasoning, all inferences are materialized in Fuseki. This burdens the system, because the graph becomes too big to materialize and reason over in RAM.
We have already crashed a machine with 1 TB of RAM dedicated to Fuseki.
A possible solution is to split your dataset into independent parts so that the queries can be tuned.
For more information, look at the Hadoop and AllegroGraph solution for high performance with clusters:
https://allegrograph.com/hadoop-and-allegrograph-the-semantic-data-lake-for-analytics/
It depends on your needs. At unlimited scale, a cluster solution seems to be the best option, but increasing the RAM allocated to the JVM locally may be enough to solve your problem.
I have the following RDF data in my Fuseki triplestore.
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix schema: <http://schema.org/> .
@prefix ex: <http://localhost:3030/eb/> .
@prefix wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
ex:School rdf:type owl:Class .
<http://localhost:3030/eb/School/1> rdf:type ex:School ;
schema:name "Escola 1" .
ex:NewSchool rdf:type owl:Class .
<http://localhost:3030/eb/NewSchool/1> rdf:type ex:NewSchool ;
wgs84:lat "23.085980" ;
wgs84:long "-5.692" .
<http://localhost:3030/eb/School/1> owl:sameAs <http://localhost:3030/eb/NewSchool/1> .
I query like this:
SELECT ?predicate ?object
WHERE {
<http://localhost:3030/eb/School/1> ?predicate ?object
}
with the following result:
predicate object
<http://www.w3.org/2002/07/owl#sameAs> <http://localhost:3030/eb/NewSchool/1>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://localhost:3030/eb/Escola>
<http://schema.org/name> "Escola 1"
I would like to know what should I do to make the query return the wgs84:lat / wgs84:long values from the owl:sameAs instance? Is it possible using a SPARQL query?
What is needed here is to edit the dataset's configuration file (/run/configuration/datasetname.ttl), add the following, and restart the Fuseki server.
:service1 a fuseki:Service ;
fuseki:dataset :inferred_dataset .
:inferred_dataset a ja:RDFDataset ;
ja:defaultGraph :inference_model .
:inference_model a ja:InfModel ;
ja:baseModel :tdb_graph ;
ja:reasoner [
ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>
] .
:tdb_graph a tdb:GraphTDB ;
tdb:dataset :tdb_dataset_readwrite .
:tdb_dataset_readwrite
a tdb:DatasetTDB ;
tdb:location "[MyDatasetLocationOnDisk]" .
Some links on how to do that:
https://christinemdraper.wordpress.com/2017/04/09/getting-started-with-rdf-sparql-jena-fuseki/
https://github.com/jfmunozf/Jena-Fuseki-Reasoner-Inference/wiki/Configuring-Apache-Jena-Fuseki-2.4.1-inference-and-reasoning-support-using-SPARQL-1.1:-Jena-inference-rules,-RDFS-Entailment-Regimes-and-OWL-reasoning
https://gist.github.com/ruebot/fb7b1da82042860138d2d609756e07dc
configure fuseki with TDB2 and OWL Reasoner
Then it behaves just like intended in the question.
Just a reminder: if you want to link to a third party's vocabulary, you must download the file and load it into Fuseki for the inferences to work.
May I know if Apache Jena supports OWL 2 syntax in Java? The documentation (https://jena.apache.org/documentation/ontology/) mentions that it only provides limited cardinality restrictions. I would like to confirm this with the experts.
Apache Jena does not support OWL 2, only OWL 1.1, through the org.apache.jena.ontology.OntModel interface. See also the documentation.
But you can still work with OWL 2 in Jena using external Jena-based APIs and tools, e.g. ONT-API, which is an implementation of OWL-API (v5) on top of Jena.
In ONT-API there are two main OWL 2 views of the data, which encapsulate the same RDF graph: com.github.owlcs.ontapi.jena.model.OntModel and com.github.owlcs.ontapi.Ontology (in older versions (ONT-API:v1.x.x) these classes are named ru.avicomp.ontapi.jena.model.OntGraphModel and ru.avicomp.ontapi.OntologyModel respectively).
The com.github.owlcs.ontapi.jena.model.OntModel view is a full analogue of Jena's org.apache.jena.ontology.OntModel; it is the facility for working with triples.
The com.github.owlcs.ontapi.Ontology view is an extended org.semanticweb.owlapi.model.OWLOntology, the facility for working with axiomatic data; it is backed by the com.github.owlcs.ontapi.jena.model.OntModel view and vice versa.
For example, the following snippet:
String uri = "https://stackoverflow.com/questions/54049750";
String ns = uri + "#";
OntModel m = OntModelFactory.createModel()
.setNsPrefixes(OntModelFactory.STANDARD).setNsPrefix("q", ns);
m.setID(uri);
OntClass c = m.createOntClass(ns + "c");
OntObjectProperty p = m.createObjectProperty(ns + "p");
OntIndividual i = c.createIndividual(ns + "i");
m.createObjectComplementOf(m.createObjectUnionOf(c, m.getOWLThing(),
m.createObjectSomeValuesFrom(p, m.createObjectOneOf(i))));
m.write(System.out, "ttl");
will produce the following ontology:
@prefix q: <https://stackoverflow.com/questions/54049750#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
<https://stackoverflow.com/questions/54049750>
a owl:Ontology .
q:c a owl:Class .
q:p a owl:ObjectProperty .
q:i a owl:NamedIndividual , q:c .
[ a owl:Class ;
owl:complementOf [ a owl:Class ;
owl:unionOf ( q:c owl:Thing
[ a owl:Restriction ;
owl:onProperty q:p ;
owl:someValuesFrom [ a owl:Class ;
owl:oneOf ( q:i )
]
]
)
]
] .
I'm creating an ontology using Apache Jena. However, I can't find a way of creating custom datatypes as in the following example:
'has value' some xsd:float[>= 0.0f , <= 15.0f].
Do you have any ideas?
It seems that what you need is a DatatypeRestriction with two facet restrictions: xsd:minInclusive and xsd:maxInclusive.
These are OWL 2 constructs.
org.apache.jena.ontology.OntModel does not support OWL 2, only OWL 1.1, and only partially (see the documentation). Therefore, there are no built-in methods for creating such data ranges (there is only the DataOneOf data range expression, see OntModel#createDataRange(RDFList)).
So you have to create the desired datatype manually, triple by triple, using the general org.apache.jena.rdf.model.Model interface.
In RDF, it would look like this:
_:x rdf:type rdfs:Datatype.
_:x owl:onDatatype DN.
_:x owl:withRestrictions (_:x1 ... _:xn).
See also owl2-quick-guide.
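Applied to the expression in the question, that pattern might be written out as follows. This is a sketch of the triples to create, not output from Jena: the ex: namespace and the property name ex:hasValue are hypothetical stand-ins for the asker's 'has value' property.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

ex:hasValue a owl:DatatypeProperty .       # hypothetical property name

# 'has value' some xsd:float[>= 0.0f , <= 15.0f]
[ a owl:Restriction ;
  owl:onProperty ex:hasValue ;
  owl:someValuesFrom [
      a rdfs:Datatype ;
      owl:onDatatype xsd:float ;
      # One blank node per facet, collected in an RDF list.
      owl:withRestrictions (
          [ xsd:minInclusive "0.0"^^xsd:float ]
          [ xsd:maxInclusive "15.0"^^xsd:float ]
      )
  ]
] .
```

Each blank node and list cell here corresponds to a resource or statement you would add through the Model interface.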
Or, to build such an ontology, you can use some external utilities or APIs.
For example, in ONT-API (v. 2.x.x) the following snippet
String ns = "https://stackoverflow.com/questions/54131709#";
OntModel m = OntModelFactory.createModel()
.setNsPrefixes(OntModelFactory.STANDARD).setNsPrefix("q", ns);
OntDataRange.Named floatDT = m.getDatatype(XSD.xfloat);
OntFacetRestriction min = m.createFacetRestriction(OntFacetRestriction.MinInclusive.class,
floatDT.createLiteral("0.0"));
OntFacetRestriction max = m.createFacetRestriction(OntFacetRestriction.MaxInclusive.class,
floatDT.createLiteral("15.0"));
OntDataRange.Named myDT = m.createDatatype(ns + "MyDatatype");
myDT.addEquivalentClass(m.createDataRestriction(floatDT, min, max));
m.createResource().addProperty(m.createDataProperty(ns + "someProperty"),
myDT.createLiteral("2.2"));
m.write(System.out, "ttl");
will produce the following ontology:
@prefix q: <https://stackoverflow.com/questions/54131709#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
[ q:someProperty "2.2"^^q:MyDatatype ] .
q:MyDatatype a rdfs:Datatype ;
owl:equivalentClass [ a rdfs:Datatype ;
owl:onDatatype xsd:float ;
owl:withRestrictions ( [ xsd:minInclusive "0.0"^^xsd:float ]
[ xsd:maxInclusive "15.0"^^xsd:float ]
)
] .
q:someProperty a owl:DatatypeProperty .
I'm serving a dataset containing 10-20 named graphs from a TDB dataset in Fuseki 2.
I'd like to use a reasoner to do inference on my data. The behaviour I'd like to see is that triples inferred within each graph should appear within those graphs (although it would be fine if the triples appear in the default graph too).
Is there a simple way of configuring this? I haven't found any configuration examples that match what I am trying to do.
The configuration I've tried is very similar to the following standard example.
DatasetTDB -> GraphTDB -> InfModel -> RDFDataset
The final view of the data I see is only a very tiny subset of the data (it appears that all the named graphs are dropped somewhere along this pipeline, and only the tiny default graph is left).
Using tdb:unionDefaultGraph seems to have no effect on this.
@prefix : <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
# Example of a data service with SPARQL query and update on an
# inference model. Data is taken from TDB.
## ---------------------------------------------------------------
## Service with only SPARQL query on an inference model.
## Inference model base data is in TDB.
<#service2> rdf:type fuseki:Service ;
fuseki:name "inf" ; # http://host/inf
fuseki:serviceQuery "sparql" ; # SPARQL query service
fuseki:serviceUpdate "update" ;
fuseki:dataset <#dataset> ;
.
<#dataset> rdf:type ja:RDFDataset ;
ja:defaultGraph <#model_inf> ;
.
<#model_inf> a ja:InfModel ;
ja:baseModel <#tdbGraph> ;
ja:reasoner [
ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>
] .
## Base data in TDB.
<#tdbDataset> rdf:type tdb:DatasetTDB ;
tdb:location "DB" ;
# If the unionDefaultGraph is used, then the "update" service should be removed.
# tdb:unionDefaultGraph true ;
.
<#tdbGraph> rdf:type tdb:GraphTDB ;
tdb:dataset <#tdbDataset> .
Does anyone have any thoughts on this?
Also, bonus points if there's a way to make the dataset writable. (On some level, what I'm trying to do is approach the default behaviour of Owlim/GraphDB, which keeps persistent named graphs, does inferencing, and also allows for updates.)
Thanks in advance.
I'm facing (or faced) the same problems on my code, but I have a partial solution. Unfortunately the link provided in the comments did not really help the issues I'm still facing, but this answers part of the problem.
The final view of the data I see is only a very tiny subset of the
data (it appears that all the named graphs are dropped somewhere along
this pipeline, and only the tiny default graph is left). Using
tdb:unionDefaultGraph seems to have no effect on this.
The workaround I found for this is to explicitly 'register' your named graphs in the configuration file. I don't really know if it is the best way (and I did not find any documentation or example for this exact context). A working example from my setup (Fuseki 2.4):
[usual configuration start]
# TDB Dataset
:tdb_dataset_readwrite
a tdb:DatasetTDB ;
tdb:unionDefaultGraph true ;
#if you want all data to available in the default graph
#without 'FROM-NAMing them' in the SPARQL query
tdb:location "your/dataset/path" .
# Underlying RDF Dataset
<#dataset>
rdf:type ja:RDFDataset ;
ja:defaultGraph <#model> ;
ja:namedGraph [
ja:graphName <your/graph/URI> ;
ja:graph <#graphVar>
] ;
[repeat for other named graphs]
.
######
# Default Model : Inference rules (OWL, here)
<#model> a ja:InfModel;
ja:baseModel <#tdbGraph>;
ja:reasoner
[ ja:reasonerURL
<http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>
]
.
# Graph for the default Model
<#tdbGraph> rdf:type tdb:GraphTDB;
tdb:dataset :tdb_dataset_readwrite .
######
# Named Graph
<#graphVar> rdf:type tdb:GraphTDB ;
tdb:dataset :tdb_dataset_readwrite ;
tdb:graphName <your/graph/URI>
.
Then, you can run a query like this one
[prefixes]
SELECT ?graph ?predicate ?object
WHERE {
GRAPH ?graph {[a specific entity identifier] ?predicate ?object}
}
LIMIT 50
And it will display (in this case) properties and values, and the source graph where they were found.
BUT: in this example, even if the default graph supposedly imported inference rules (that should be applied globally, especially since the unionDefaultGraph parameter is enabled), they are not applied in a "cross-graph" manner, and that is the problem I am still facing.
Normally, if you add the inference engine to every graph, this should work, according to Andy Seaborne's post here, but it doesn't work in my case.
Hope this helps nevertheless.
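For reference, wrapping a named graph in its own inference model, as described above, might be configured along these lines. This is only a sketch: <#graphInf> is a hypothetical name, <#model>, <#graphVar> and <your/graph/URI> are the placeholders from the example above, and, as noted, this arrangement is not confirmed to give cross-graph inference.

```turtle
# Fragment: extends the configuration above and relies on its prefixes,
# <#model> and <#graphVar>.
<#dataset>
    rdf:type ja:RDFDataset ;
    ja:defaultGraph <#model> ;
    ja:namedGraph [
        ja:graphName <your/graph/URI> ;
        ja:graph <#graphInf>      # inference view instead of the raw TDB graph
    ] .

# Hypothetical name: an inference model layered over the named TDB graph.
<#graphInf> a ja:InfModel ;
    ja:baseModel <#graphVar> ;
    ja:reasoner [
        ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner>
    ] .
```

Each additional named graph would need its own base graph and inference model pair.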
I've come across this issue many times myself, but I had never seen a solution. However, I managed to figure it out after reading this in the documentation about "special graph names" in TDB datasets. From what I understand, setting the union default graph for a TDB dataset in the assembler file only changes what is returned when that particular dataset is queried. However, there is a special graph name that can be used to reference the union graph: <urn:x-arq:UnionGraph>. So, simply create a GraphTDB, reference the TDB dataset, and point it to this special graph.
The config file below does what is requested in the question: reasoning is performed over the default union graph, and the TDB dataset is exposed as a writable service. (Note that the reasoning service will not see any changes in the dataset until it is reloaded, since reasoning is all done in memory.)
@prefix : <http://base/#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
# TDB
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
# Service 1: Dataset endpoint (no reasoning)
:dataService a fuseki:Service ;
fuseki:name "tdbEnpoint" ;
fuseki:serviceQuery "sparql", "query" ;
fuseki:serviceUpdate "update" ;
fuseki:dataset :tdbDataset ;
.
# Service 2: Reasoning endpoint
:reasoningService a fuseki:Service ;
fuseki:dataset :infDataset ;
fuseki:name "reasoningEndpoint" ;
fuseki:serviceQuery "query", "sparql" ;
fuseki:serviceReadGraphStore "get" ;
.
# Inference dataset
:infDataset rdf:type ja:RDFDataset ;
ja:defaultGraph :infModel ;
.
# Inference model
:infModel a ja:InfModel ;
ja:baseModel :g ;
ja:reasoner [
ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner> ;
] ;
.
# Intermediate graph referencing the default union graph
:g rdf:type tdb:GraphTDB ;
tdb:dataset :tdbDataset ;
tdb:graphName <urn:x-arq:UnionGraph> ;
.
# The location of the TDB dataset
:tdbDataset rdf:type tdb:DatasetTDB ;
tdb:location "/fuseki/databases/db" ;
tdb:unionDefaultGraph true ;
.