Apache Jena: adding GraphNode (N3 formula) to Model (ARQInternalErrorException) - jena

I'm attempting to add a graph node (an N3 formula) to a Model.
The output should look roughly like this:
<http://localhost:8080/patches/#cf0ba48fa8b0421c8b025c3ea6b41a4f>
    a <http://www.w3.org/ns/solid/terms#Patch> ;
    <http://www.w3.org/ns/solid/terms#patches> <http://example.com/#me> ;
    <http://www.w3.org/ns/solid/terms#inserts> { <http://example.com/#me> <http://example.com/#property> <http://example.com/#resource> } .
Note that the object of the #inserts property here is a "GraphNode" (an N3 formula): I'm modelling the graph of triples to write when a Task is complete.
I tried the code below:
Model out = ModelFactory.createDefaultModel();
out.add(endState, SolidTerms.inserts, out.asRDFNode(NodeFactory.createGraphNode(inserts.getGraph())));
but this throws [org.apache.jena.sparql.ARQInternalErrorException: Unknown node type: {http://example.com/#me @http://example.com/#property http://example.com/#resource}]
I've tried many other ways of putting the Graph in the model with no success; I'm posting this one because it was my first attempt and, I think, the most rational. Maybe it's a bug, or maybe Jena just doesn't support the GraphNode type in RDFNode?
The @ in the output puzzled me a bit, but I checked that it's not included in my property definition; maybe it's added by createGraphNode or by the error serialization.

Related

Blank nodes generating when adding object properties to the ontology

I have an ontology in Protege.
When I add an object property like X worksFor Y and then load the RDF into GraphDB, it generates 3 triples whose subject is a blank node, with properties owl:someValuesFrom, owl:onProperty and rdf:type, and then it adds a triple that states X rdfs:subClassOf Y.
Is this correct?
What is the logic behind this?
Here is an example of what I'm doing:
This is the ontology in Protege. I made a small version that addresses this specific issue. I save it as rdf and then load it in GraphDb
And here is what I get in GraphDb after loading the rdf from the ontology.
I hope this helps to better understand the question.
The query output that you obtain is perfectly meaningful.
By stating that personaCliente (subject) is a SubClass Of (predicate) worksFor some empresaCliente (object), you're saying that if p is a client person then it must work for some client company.
Note that the object is not a simple super-class, but a complex class expressed by a property restriction.
In other words, you're stating that every client person p works for some blank node _, such that _ is a client company. If you know description logics, read this as personaCliente ⊑ ∃worksFor.empresaCliente.
Now, by querying ?s ?p ?o, you're searching for all the possible triples of your ontology.
Let's focus on the following subset of results:
row  s                p                   o
1    _:node31         owl:someValuesFrom  :empresaCliente
2    _:node31         owl:onProperty      :worksFor
3    _:node31         rdf:type            owl:Restriction
9    :personaCliente  rdfs:subClassOf     _:node31
These triples say the same thing as above: personaCliente is a subclass of a certain blank node [9], and that blank node is of type owl:Restriction (which is a particular OWL class) [3]. The restriction is on the property worksFor [2] and requires at least one worksFor value to be an empresaCliente [1].
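To make the reading of rows 1, 2, 3 and 9 concrete, here is a minimal Python sketch that folds the blank-node triples back into the single axiom they encode. It uses plain tuples rather than an RDF library, and the helper name restriction_axioms is an assumption made for illustration; it is not part of GraphDB or Protege, and it only handles the someValuesFrom case discussed here.

```python
# Triples copied from the query result above (rows 1, 2, 3 and 9).
TRIPLES = [
    ("_:node31", "owl:someValuesFrom", ":empresaCliente"),
    ("_:node31", "owl:onProperty", ":worksFor"),
    ("_:node31", "rdf:type", "owl:Restriction"),
    (":personaCliente", "rdfs:subClassOf", "_:node31"),
]


def restriction_axioms(triples):
    """Rebuild 'C SubClassOf (P some D)' axioms from their RDF encoding.

    Hypothetical helper for illustration only.
    """
    # Index predicate -> object for every subject.
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p] = o

    axioms = []
    for s, p, o in triples:
        if p != "rdfs:subClassOf":
            continue
        node = by_subject.get(o, {})
        # Keep only superclasses that are existential restrictions.
        is_restriction = node.get("rdf:type") == "owl:Restriction"
        if is_restriction and "owl:onProperty" in node and "owl:someValuesFrom" in node:
            axioms.append(
                f"{s} SubClassOf ({node['owl:onProperty']} some {node['owl:someValuesFrom']})"
            )
    return axioms


print(restriction_axioms(TRIPLES))
```

The blank node never appears in the rebuilt axiom: it only exists to hold the three pieces of the restriction together in the RDF serialization.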
Further reading:
https://www.w3.org/TR/owl2-syntax/#Object_Property_Restrictions
https://www.cs.vu.nl/~guus/public/owl-restrictions/

How to express multiple property set criteria for node selection using gremlin query

Here is my simplified graph schema,
package:
property:
- name: str (indexed)
- version: str (indexed)
I want to query the version using multiple sets of property criteria within a single query. I can use within for a list of values of a single property, but how do I do it for multiple properties?
Say I have 10 package nodes: (p1,v1), (p2,v2), (p3,v3), ... (p10,v10).
I want to select only the nodes that have (p1 with v1, p8 with v8, p10 with v10).
Is there a way to do this with a single Gremlin query?
Something equivalent to SELECT * from package WHERE (name, version) in ((p1,v1),(p8,v8),(p10,v10)).
It's always best to provide some sample data when asking questions about Gremlin. I assume that this is an approximation of what your model is:
g.addV('package').property('name','gremlin').property('version', '1.0').
addV('package').property('name','gremlin').property('version', '2.0').
addV('package').property('name','gremlin').property('version', '3.0').
addV('package').property('name','blueprints').property('version', '1.0').
addV('package').property('name','blueprints').property('version', '2.0').
addV('package').property('name','rexster').property('version', '1.0').
addV('package').property('name','rexster').property('version', '2.0').iterate()
I don't think that there is a way that you can compare pairs of inputs and expect an index hit. You therefore have to do what you normally do in graphs and choose the index to best narrow your results before you filter in memory. I would assume that in your case this would be the "name" property, therefore grab those first then filter the pairs:
gremlin> g.V().has('package','name', within('gremlin','blueprints')).
......1> elementMap().
......2> where(select('name','version').is(within([name:'gremlin',version:'2.0'], [name:'blueprints',version:'2.0'])))
==>[id:3,label:package,name:gremlin,version:2.0]
==>[id:12,label:package,name:blueprints,version:2.0]
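The same two-step idea (narrow on one property first, then filter the exact pairs in memory) can be sketched outside Gremlin. The following Python is only an illustration of the logic: the list of dicts stands in for the package vertices created above, the function name select_pairs is hypothetical, and a plain scan takes the place of the index lookup.

```python
# Sample data mirroring the addV() calls above.
PACKAGES = [
    {"name": "gremlin", "version": "1.0"},
    {"name": "gremlin", "version": "2.0"},
    {"name": "gremlin", "version": "3.0"},
    {"name": "blueprints", "version": "1.0"},
    {"name": "blueprints", "version": "2.0"},
    {"name": "rexster", "version": "1.0"},
    {"name": "rexster", "version": "2.0"},
]


def select_pairs(packages, wanted_pairs):
    """Return packages whose (name, version) is one of wanted_pairs."""
    wanted = set(wanted_pairs)
    names = {name for name, _ in wanted}
    # Step 1: narrow on the single property an index could serve ("name").
    candidates = [p for p in packages if p["name"] in names]
    # Step 2: filter the surviving candidates on the exact pair, in memory.
    return [p for p in candidates if (p["name"], p["version"]) in wanted]


matches = select_pairs(PACKAGES, [("gremlin", "2.0"), ("blueprints", "2.0")])
print(matches)
```

Step 1 keeps five of the seven packages (all gremlin and blueprints versions); step 2 reduces them to the two requested pairs, which matches the result of the traversal above.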
This might not be the most "creative" way of doing it, but I think the easiest way would be to use or:
g.V().or(
hasLabel('v1').has('prop', 'p1'),
hasLabel('v8').has('prop', 'p8'),
hasLabel('v10').has('prop', 'p10')
)
example: https://gremlify.com/6s

Creating relationship queries in py2neo.ogm

I am using the py2neo.ogm api to construct queries of my IssueOGM class based on its relationship to another class.
I can see why this fails:
>>> list(IssueOGM.select(graph).where(
... "_ -[:HAS_TAG]- (t:TagOGM {tag: 'critical'})"))
Traceback (most recent call last):
...
py2neo.database.status.CypherSyntaxError: Variable `t` not defined (line 1, column 42 (offset: 41))
"MATCH (_:IssueOGM) WHERE _ -[:HAS_TAG]- (t:TagOGM {tag: 'critical'}) RETURN _"
Is there a way using the OGM api to create a filter that is interpreted as this?
"MATCH (_:IssueOGM) -[:HAS_TAG]- (t:TagOGM {tag: 'critical'}) RETURN _"
Like an ORM, the OGM seems to be really good for quickly storing and/or retrieving nodes from your graph, and saving special methods and so forth to make each node 'work' nicely in your application. In this instance, you could use the RelatedFrom class on TagOGM to list all the issues tagged with a particular tag. However, this approach can sometimes lead to making lots of inadvertent db calls without realising (especially in a big application).
Often for cases like this (where you're looking for a pattern rather than a specific node), I'd recommend just writing a cypher query to get the job done. py2neo.ogm actually makes this remarkably simple, by allowing you to store it as a class method of the GraphObject. In your example, something like the following should work. Writing similar queries in the future will also allow you to search based on much more complex criteria and leverage the functionality of neo4j and cypher to make really complex queries quickly in a single transaction (rather than going back and forth to the db as you manipulate an OGM object).
from py2neo.ogm import GraphObject, Property


class TagOGM(GraphObject):
    name = Property()


class IssueOGM(GraphObject):
    name = Property()
    time = Property()
    description = Property()

    @classmethod
    def select_by_tag(cls, graph, tag_name):
        '''
        Returns an IssueOGM instance for every issue tagged with tag_name.
        graph is the py2neo Graph to query, passed in as with IssueOGM.select(graph).
        '''
        # $tag_name is a Cypher parameter, filled in by graph.run() below
        q = 'MATCH (t:TagOGM { name: $tag_name })<-[:HAS_TAG]-(i:IssueOGM) RETURN i'
        return [
            cls.wrap(row['i'])
            for row in graph.run(q, tag_name=tag_name).data()
        ]

Find path in Neo4j with directed edges

This is my first attempt at Neo4j, please excuse me if I am missing something very trivial.
Here is my problem:
Consider the graph as created in the following Neo4j console example:
http://console.neo4j.org/?id=y13kbv
We have following nodes in this example:
(Person {memberId, memberName, membershipDate})
(Email {value, badFlag})
(AccountNumber {value, badFlag})
We could potentially have more nodes capturing features related to a Person like creditCard, billAddress, shipAddress, etc.
All of these nodes will be the same as Email and AccountNumber nodes:
(creditCard {value, badFlag}), (billAddress {value, badFlag}),etc.
With the graph populated as seen in the Neo4j console example, assume that we add one more Person to the graph as follows:
(p7:Person {memberId:'18' , memberName:'John', membershipDate:'12/2/2015'}),
(email6:Email {value: 'john@gmail.com', badFlag:'false'}),
(a2)-[b13:BELONGS_TO]->(p7),
(email6)-[b14:BELONGS_TO]->(p7)
When we add this new person to the system, the use case is that we have to check if there exists a path from features of the new Person ("email6" and "a2" nodes) to any other node in the system where the "badFlag=true", in this case node (a1 {value:1234, badFlag:true}).
Here, the resultant path would be (email6)-[BELONGS_TO]->(p7)<-[BELONGS_TO]-(a2)-[BELONGS_TO]->(p6)<-[BELONGS_TO]-(email5)-[BELONGS_TO]->(p5)<-[BELONGS_TO]-(a1:{badFlag:true})
I tried something like this:
MATCH (newEmail:Email{value:'john@gmail.com'})-[:BELONGS_TO]->(p7)-[*]-(badPerson)<-[:BELONGS_TO]-(badFeature{badFlag:'true'}) RETURN badPerson, badFeature;
which seems to work when there is only one level of chaining, but it doesn't work when the path could be longer like in the case of Neo4j console example.
I need help with the Cypher query that will help me solve this problem.
I will eventually be doing this operation using Neo4j's Java API using my application. What could be the right way to go about doing this using Java API?
You had a typo in your query. PART_OF should be BELONGS_TO. This should work for you:
MATCH (newEmail:Email {value:'john@gmail.com'})-[:BELONGS_TO]->(p7)-[*]-(badPerson)<-[:BELONGS_TO]-(badFeature {badFlag:'true'})
RETURN badPerson, badFeature;
Aside: You seem to use string values for all properties. I'd replace the string values 'true' and 'false' with the boolean values true and false. Likewise, values that are always numeric should just use integer or float values.
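For intuition about what the undirected variable-length step -[*]- does, here is a minimal Python sketch of the same search as a breadth-first traversal that ignores edge direction. It is not the Neo4j Java API; the edge list is reconstructed only from the path quoted in the question, the function name path_to_bad_feature is hypothetical, and badFlag is stored as a boolean as suggested in the aside.

```python
from collections import deque

# Each pair means (feature)-[:BELONGS_TO]->(person), taken from the
# path quoted in the question.
BELONGS_TO = [
    ("email6", "p7"), ("a2", "p7"), ("a2", "p6"),
    ("email5", "p6"), ("email5", "p5"), ("a1", "p5"),
]
# badFlag as a real boolean rather than the string 'true'.
BAD_FLAG = {"a1": True}


def path_to_bad_feature(start, edges, bad_flag):
    """Breadth-first search, ignoring direction, to the nearest bad node."""
    # Build an undirected adjacency list from the directed edges.
    neighbours = {}
    for feature, person in edges:
        neighbours.setdefault(feature, []).append(person)
        neighbours.setdefault(person, []).append(feature)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node != start and bad_flag.get(node, False):
            return path
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no bad feature is reachable


print(path_to_bad_feature("email6", BELONGS_TO, BAD_FLAG))
```

Starting from email6, the search returns the chain email6, p7, a2, p6, email5, p5, a1, which is the path described in the question.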

Cypher query with literal map syntax & dynamic keys

I'd like to make a cypher query that generates a specific json output. Part of this output includes an object with a dynamic amount of keys relative to the children of a parent node:
{
...
"parent_keystring" : {
child_node_one.name : child_node_one.foo
child_node_two.name : child_node_two.foo
child_node_three.name : child_node_three.foo
child_node_four.name : child_node_four.foo
child_node_five.name : child_node_five.foo
}
}
I've tried to create a cypher query but I do not believe I am close to achieving the desired output mentioned above:
MATCH (n)-[relone:SPECIFIC_RELATIONSHIP]->(child_node)
WHERE n.id='839930493049039430'
RETURN n.id AS id,
n.name AS name,
labels(n)[0] AS type,
{
COLLECT({
child.name : children.foo
}) AS rel_two_representation
} AS parent_keystring
I had planned for children.foo to be a count of how many times each particular relationship/child occurs under the parent. Is there a way to make use of the reduce function, so that a report is generated by analyzing the array proposed below? That is, the report would be a JSON object where each key is a distinct RELATIONSHIP and each value is the number of times that relationship stems from the parent node.
Thank you greatly in advance for guidance you can offer.
I'm not sure that Cypher will let you use a variable to determine an object's key. Would using an Array work for you?
COLLECT([child.name, children.foo]) AS rel_two_representation
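If the array form is acceptable, the dynamic-key object can be built on the client side after the query returns. The Python below is a rough sketch under that assumption: the sample row and its values are made up for illustration, and the helper names pairs_to_object and count_relationship_types are hypothetical.

```python
from collections import Counter
import json

# Hypothetical row, shaped like the result of
# COLLECT([child.name, children.foo]); the values are invented.
row = {
    "id": "839930493049039430",
    "rel_two_representation": [["alpha", 3], ["beta", 1]],
}


def pairs_to_object(pairs):
    """Turn [[key, value], ...] into an object with dynamic keys."""
    return {key: value for key, value in pairs}


def count_relationship_types(rel_types):
    """Count how many times each relationship type occurs."""
    return dict(Counter(rel_types))


report = {
    "id": row["id"],
    "parent_keystring": pairs_to_object(row["rel_two_representation"]),
}
print(json.dumps(report))
print(count_relationship_types(["HAS_TAG", "HAS_TAG", "OWNS"]))
```

The second helper covers the counting part of the question: given the relationship types collected for a parent node, it produces one key per distinct type with the number of occurrences as the value.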
I think the Neo4j Server API output should be treated like any other database output (MySQL, for example). Even if the desired output can be achieved with the default functionality, it is not a natural thing for a database to do.
Probably you should look into creating your own server plugin. This allows you to implement any custom logic, with desired output.
