Apache Jena Split Graph based on Parent - jena

I am learning Apache Jena. I am able to get the statements, subjects, and objects of a graph.
I want to split the graph based on parent nodes. For example, B and C are children of A, and D is a child of B. In the same graph, A1 is the parent of B1, C1, and D1. I want to get the details of the graph as two separate sections:
A -> [B -> [D], C] and A1 -> [B1, C1, D1]
I am using the Apache Jena API and reading the data from a Turtle file.
Let me know if there is any other API available for this kind of processing.
Thanks
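The grouping described above can be sketched independently of Jena. The following is a minimal Python illustration, not Jena code: it assumes the (parent, child) pairs have already been extracted from the statements of whichever property encodes the parent relation, and that the hierarchy contains no cycles. The function name split_by_root is hypothetical.

```python
from collections import defaultdict

def split_by_root(edges):
    """Group (parent, child) pairs into one nested tree per root node."""
    children = defaultdict(list)
    has_parent = set()
    for parent, child in edges:
        children[parent].append(child)
        has_parent.add(child)

    # A root is a node that appears as a parent but never as a child.
    roots = [p for p in children if p not in has_parent]

    def subtree(node):
        # Assumes an acyclic hierarchy; a cycle would recurse without end.
        return {child: subtree(child) for child in children.get(node, [])}

    return {root: subtree(root) for root in roots}

# The example from the question: two separate sections, rooted at A and A1.
edges = [("A", "B"), ("A", "C"), ("B", "D"),
         ("A1", "B1"), ("A1", "C1"), ("A1", "D1")]
sections = split_by_root(edges)
```

Each key of sections is one root, and its value is the nested structure below that root, which corresponds to the two sections asked for.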

Related

Clickstream flow analysis using neo4j cypher

I am trying to analyze user behaviour in clickstream data using Neo4j. My use case is to create a flow chart (Sankey chart) of the user's journey from a specific node where the user starts browsing through the next N clicks. My data model and query pattern are similar to this article.
MATCH p=((s:Page{type:"Home"})-[:NEXT*..10]->(d:Page))
where length(p)>3
RETURN p
I am unable to figure out an approach for the following:
1. How to extract all pairs of nodes within p.
2. How to count each pair from step 1 to create the flow chart.
I tried various functions and was able to retrieve the pairs above, but there are a few anomalies (<1%): circular references where user1 and user2 reach the same destination with exactly reversed flows. These break the flow chart, because flow charts are directional. Any suggestions on how to filter out these anomalies?
Any suggestions on how to implement the above would be appreciated. It need not be an exact answer; pseudocode or reference articles would help. Thanks
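As a rough sketch of the two steps and the filtering, here is a Python illustration that works on paths already fetched from the database, each represented as an ordered list of page names. The rule used to remove reversed flows (keep only the more frequent direction of each pair) is an assumption chosen for illustration, not something established in the question; the function names and sample data are hypothetical.

```python
from collections import Counter

def count_pairs(paths):
    """Count each consecutive (source, target) pair across all paths."""
    counts = Counter()
    for path in paths:
        counts.update(zip(path, path[1:]))
    return counts

def drop_minority_direction(counts):
    """Where both (a, b) and (b, a) occur, keep only the more frequent one.

    Ties are broken by node name, which is arbitrary; self-loops are dropped.
    """
    kept = {}
    for (a, b), n in counts.items():
        reverse = counts.get((b, a), 0)
        if n > reverse or (n == reverse and a < b):
            kept[(a, b)] = n
    return kept

# Hypothetical journeys; the third one contains a reversed Product/Search step.
paths = [
    ["Home", "Search", "Product", "Cart"],
    ["Home", "Search", "Product", "Cart"],
    ["Home", "Product", "Search", "Cart"],
]
pair_counts = count_pairs(paths)
flow = drop_minority_direction(pair_counts)
```

The resulting flow dictionary maps each directed pair to its count, which is the shape most Sankey chart libraries expect as link data.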

How to send out a collection of nodes via neo4j-stream to kafka

We want to use Neo4j to build a hierarchy (tree structure) of (product) categories. Our data enters from Kafka (via the Sink Connector). We plan to use Neo4j Streams Source to live-stream the updates to our category tree back onto Kafka, using neo4j-4.0.3.
Normally, the Streams Source way to go is to specify a pattern and link it to a Kafka topic, as explained here: https://neo4j.com/docs/labs/neo4j-streams/current/#neo4j_streams_source
To leverage the power of Neo4j, we'd like to send a whole collection of nodes at once. This collection consists of all the nodes on a path returned by a query: nodes(path). More specifically, such a collection of nodes represents a path from leaf to root.
Two non-working alternatives we could think of:
Use a pattern definition. From what I understand, a pattern seems to be limited to matching a single node (possibly a node with multiple labels) - https://neo4j.com/docs/labs/neo4j-streams/current/#source-patterns - and hence we can only stream out one node at a time.
Stream the collection of nodes (on the path) back into a node with a different label, and use that output node label to pattern match {*}. Because Neo4j is a property graph database, it does not allow me to write 'a collection of nodes' into one node.
To summarize, what we want is to stream out a collection of nodes in each Kafka record.
Any suggestions on how we can achieve this?
The streams.publish procedure will send any arbitrary data that you can format with Cypher to the topic of your choosing. It's just up to you to format the data as you wish.
I'm imagining something like this:
MATCH (a:MyLabel { id: 'startingPoint' })
WITH a
MATCH p=shortestPath((a)-[:REL*]->(b:MyLabel { id: 'EndingPoint' }))
UNWIND nodes(p) as node
WITH collect({
my: 'custom-object',
prop: node.prop
}) as recordsToSendToKafka
CALL streams.publish('my-topic', recordsToSendToKafka)
This would send an array of JSON records formatted as you choose, from the original matched path.
Note that using APOC triggers, you can do these kinds of things in response to other transactions within Neo4j, and so this doesn't have to be a one-and-done manual execute query pattern.
tl;dr if you can match anything out of the database, you can use cypher to reformat it into JSON objects and dispatch any data to any topic on Kafka.
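To make the shape of the published data concrete, here is a small Python sketch that mirrors the collect({...}) step outside the database. It only illustrates the kind of JSON array described above; the sample path and the helper name records_for_path are hypothetical, and the exact serialization used on the topic is not shown in the answer.

```python
import json

def records_for_path(path_nodes):
    """Mirror the collect({...}) step: one custom object per node on the path."""
    return [{"my": "custom-object", "prop": node.get("prop")} for node in path_nodes]

# Hypothetical leaf-to-root path, each node represented by its property map.
path_nodes = [{"prop": "leaf"}, {"prop": "mid"}, {"prop": "root"}]
payload = json.dumps(records_for_path(path_nodes))
```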

Protege Ontology - creating individuals

I'm creating an ontology in Protege for the first time and have never worked with it before.
I have a manufacturing process with two robots, a machine tool, two storages (S1 and S2), a working table, a computer vision system, a conveyor, and 6 types of pieces (A, B, C, D, E, F). I have some goals set (e.g. storage S2 must have a piece of type A in position (row, column) (1,4) with orientation orientation1). I thought of creating a class for Robot with the following properties: hasState (the robot can be free or can hold a piece), hasPosition (the robot can be in one of four predefined positions), and hasPiece.
The question is the following: when I create the individuals for the two robots, what should I set in the hasPiece properties? I need to create the ontology in Protege and then write a CLIPS program that will solve the problem (it will move the pieces from storage S1 to storage S2 into the desired positions). Will the individuals be the initial facts? I have only seen example ontologies for pizza and countries, and these did not have properties that are modified while the CLIPS program is running.
Will the individuals be the initial facts?
I would assume so from your description.
Individuals and properties are created the same way no matter how they will be subsequently modified. I would assume that all you need to change from the pizza example is the name of properties, classes and individuals required.

Retrieving "when" a property was added to an ontology

I was wondering if it was possible to get the exact time stamp in a dateTime format for when a particular property, object or data, was added to the ontology. For example if I have three owl individuals A, B and C and through my code in either OWL API or Apache Jena I add a property relatedTo to the ontology and create the assertions A relatedTo B and A relatedTo C, is there some function I can call on A to see that A relatedTo B was asserted at some hh:mm:ss dd:mm:yyyy?
Thanks in advance for any help.
Not in any API that I'm familiar with - by default, OWL does not record any such information. You could build a pattern to add a timestamp to each axiom (adding an axiom annotation to each axiom) but it would only be available for data you produce.
Some OWL storage systems might have that information internally, but I'm not aware of any of them exposing this for SPARQL interrogation.
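The timestamp-per-axiom pattern mentioned above can be sketched in plain Python. This is not OWL API or Jena code: TimestampedAssertions is a hypothetical wrapper that records the time whenever your own code adds an assertion, which is why, as noted, the information only exists for data you produce yourself.

```python
from datetime import datetime, timezone

class TimestampedAssertions:
    """Hypothetical store that records when each assertion was added."""

    def __init__(self):
        self._added_at = {}

    def add(self, subject, prop, obj):
        # Keep the first timestamp if the same assertion is added again.
        self._added_at.setdefault((subject, prop, obj), datetime.now(timezone.utc))

    def added_at(self, subject, prop, obj):
        """Return the UTC time the assertion was added, or None if unknown."""
        return self._added_at.get((subject, prop, obj))

store = TimestampedAssertions()
store.add("A", "relatedTo", "B")
store.add("A", "relatedTo", "C")
stamp = store.added_at("A", "relatedTo", "B")
```

In an actual ontology, the same idea would mean attaching the timestamp to each axiom as an annotation at the moment it is added.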

neo4j spatial contain search

I'm trying to develop a web service that returns the name of the administrative area containing a given GPS position.
I have already developed a Java application that inserts some polygons (administrative areas of my country) into Neo4j using the spatial plugin and the Java API. Then, given a GPS position, I'm able to get the name of the polygon that contains it.
Now I'm trying to do the same using the REST API of Neo4j (instead of the Java API), but I'm not able to find any example.
So my questions are:
1) Is it possible to insert polygons into Neo4j using the REST API (if I understood correctly, it is possible using the WKT format)?
2) Is it possible to execute a spatial query that finds all polygons that contain a given GPS position?
Thanks, Enrico
The answer to both of your questions is yes. Here are example steps that use REST and Cypher.
1) Create your spatial layer and index (REST). In this example, my index is named 'test' (a layer of the same name and base spatial nodes will be created), and the name of the property on my nodes that will contain the wkt geometry information is 'wkt'.
POST http://localhost:7474/db/data/index/node {"name":"test", "config":{"provider":"spatial", "wkt":"wkt"}}
2) Create a node (Cypher). You can have labels and various properties. The only part that Neo4j Spatial cares about is the 'wkt' property. (You could do this step with REST.)
CREATE (n { name : "Fooville", wkt : "POLYGON((11.0 11.0, 11.0 12.0, 12.0 12.0, 12.0 11.0, 11.0 11.0))" })
3) Add the node to the layer. You can do this by adding the node to the index or to the layer, but there is an important difference. If you add it to the index, a copy node containing only the geometry data will be created, and that will be added to the layer. Querying via Cypher will return your original node, but querying via REST or Java will return the copy node. If you add the node directly to the layer, then you must take an extra step if you want to be able to query with Cypher later. In both cases you will need the URI of the node, the last element of which is the Neo4j node number. In the example below, I assume the node number is 4 (which it will be if you do this example on a fresh, empty database).
Method 1:
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addNodeToLayer { "layer":"test", "node":"http://localhost:7474/db/data/node/4" }
To make this node searchable via Cypher, add the node number to the node as a user 'id' property. (You could do this with REST.)
START n = node(4) SET n.id = id(n)
Method 2: Using this method will double your node count, double your WKT storage, and produce differing results when querying via REST vs Cypher.
POST http://localhost:7474/db/data/index/node/test {"value":"dummy","key":"dummy","uri":"http://localhost:7474/db/data/node/4"}
4) Run your query. You can do a query in REST or Cypher (assuming you prepared the nodes as described above). The Cypher queries available are: 'withinDistance', 'withinWKTGeometry', and 'bbox'. The REST queries available are: 'findGeometriesWithinDistance', 'findClosestGeometries', and 'findGeometriesInBBox'. It's interesting to note that only Cypher allows you to query for nodes within a WKT geometry. There's also a difference in REST between findClosestGeometries and findGeometriesWithinDistance that I don't yet understand, even though the arguments are the same. To see how to make the REST calls, you can issue these commands:
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findGeometriesWithinDistance
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findClosestGeometries
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findGeometriesInBBox
The Cypher queries are: (replace text between '<>', including the '<>', with actual values)
START n = node:<layer>("withinDistance:[<y>, <x>, <max distance in km>]")
START n = node:<layer>("withinWKTGeometry:POLYGON((<x1> <y1>, ..., <xN> <yN>, <x1> <y1>))")
START n = node:<layer>("bbox:[<min x>, <max x>, <min y>, <max y>]")
I have assumed in all of this that you are using a longitude/latitude coordinate reference system (CRS), so x is longitude and y is latitude. (This preserves a right-handed coordinate system in which z is up.)
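The three query templates can be filled in programmatically. The Python sketch below only builds the START clause strings shown above, using the x = longitude, y = latitude convention; it does not talk to Neo4j, the helper names are hypothetical, and a RETURN clause still has to be appended before the query is run.

```python
def within_distance(layer, lat, lon, max_km):
    """START clause for withinDistance; the template order is [y, x, distance]."""
    return f'START n = node:{layer}("withinDistance:[{lat}, {lon}, {max_km}]")'

def bbox(layer, min_lon, max_lon, min_lat, max_lat):
    """START clause for bbox; the template order is [min x, max x, min y, max y]."""
    return f'START n = node:{layer}("bbox:[{min_lon}, {max_lon}, {min_lat}, {max_lat}]")'

def within_wkt_polygon(layer, ring):
    """START clause for withinWKTGeometry from a list of (lon, lat) points."""
    if ring[0] != ring[-1]:
        # A WKT polygon ring must end on its first point.
        ring = ring + [ring[0]]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f'START n = node:{layer}("withinWKTGeometry:POLYGON(({coords}))")'

# Example using the 'test' layer and the Fooville square from the steps above.
q_distance = within_distance("test", 11.5, 11.5, 1.0)
q_bbox = bbox("test", 11.0, 12.0, 11.0, 12.0)
q_polygon = within_wkt_polygon(
    "test", [(11.0, 11.0), (11.0, 12.0), (12.0, 12.0), (12.0, 11.0)]
)
```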
