I'm trying to develop a web service that returns the name of the administrative area containing a given GPS position.
I have already developed a Java application that inserts polygons (the administrative areas of my country) into Neo4j using the Spatial plugin and the Java API. Then, given a GPS position, I can get the name of the polygon that contains it.
Now I'm trying to do the same using the Neo4j REST API (instead of the Java API), but I can't find any examples.
So my questions are:
1) Is it possible to insert polygons into Neo4j using the REST API (if I understood correctly, this is possible using the WKT format)?
2) Is it possible to execute a spatial query that finds all polygons that contain a given GPS position?
thanks, Enrico
The answer to both of your questions is yes. Here are example steps that use REST and Cypher.
1) Create your spatial layer and index (REST). In this example, my index is named 'test' (a layer of the same name and base spatial nodes will be created), and the name of the property on my nodes that will contain the wkt geometry information is 'wkt'.
POST http://localhost:7474/db/data/index/node {"name":"test", "config":{"provider":"spatial", "wkt":"wkt"}}
2) Create a node (Cypher). You can have labels and various properties. The only part that Neo4j Spatial cares about is the 'wkt' property. (You could do this step with REST.)
CREATE (n { name : "Fooville", wkt : "POLYGON((11.0 11.0, 11.0 12.0, 12.0 12.0, 12.0 11.0, 11.0 11.0))" })
3) Add the node to the layer. You can do this by adding the node to the index or to the layer, but there is an important difference. If you add it to the index, a copy node containing only the geometry data will be created, and that will be added to the layer. Querying via Cypher will return your original node, but querying via REST or Java will return the copy node. If you add the node directly to the layer, then you must take an extra step if you want to be able to query with Cypher later. In both cases you will need the URI of the node, the last element of which is the Neo4j node number. In the example below, I assume the node number is 4 (which it will be if you do this example on a fresh, empty database).
Method 1:
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addNodeToLayer { "layer":"test", "node":"http://localhost:7474/db/data/node/4" }
To make this node searchable via Cypher, add the node number to the node as a user 'id' property. (You could do this with REST.)
START n = node(4) SET n.id = id(n)
Method 2: Using this method will double your node count, double your WKT storage, and produce differing results when querying via REST vs Cypher.
POST http://localhost:7474/db/data/index/node/test {"value":"dummy","key":"dummy","uri":"http://localhost:7474/db/data/node/4"}
4) Run your query. You can do a query in REST or Cypher (assuming you conditioned the nodes as described above). The Cypher queries available are: 'withinDistance', 'withinWKTGeometry', and 'bbox'. The REST queries available are: 'findGeometriesWithinDistance', 'findClosestGeometries', and 'findGeometriesInBBox'. It's interesting to note that only Cypher allows you to query for nodes within a WKT geometry. There's also a difference in REST between findClosestGeometries and findGeometriesWithinDistance that I don't yet understand, even though the arguments are the same. To see how to make the REST calls, you can issue these commands:
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findGeometriesWithinDistance
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findClosestGeometries
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findGeometriesInBBox
The Cypher queries are: (replace text between '<>', including the '<>', with actual values)
START n = node:<layer>("withinDistance:[<y>, <x>, <max distance in km>]")
START n = node:<layer>("withinWKTGeometry:POLYGON((<x1> <y1>, ..., <xN> <yN>, <x1> <y1>))")
START n = node:<layer>("bbox:[<min x>, <max x>, <min y>, <max y>]")
I have assumed in all of this that you are using a longitude/latitude coordinate reference system (CRS), so x is longitude and y is latitude. (This preserves a right-handed coordinate system in which z is up.)
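The REST calls from step 1 and Method 1, together with the 'withinDistance' query, can be sketched in Python as below. This is a minimal sketch that only builds the requests and the query text; it does not send anything. The helper names and the BASE constant are illustrative assumptions, as is the trailing RETURN n; the URLs and JSON bodies are copied from the steps above.

```python
import json

# Assumed server root, taken from the example URLs above.
BASE = "http://localhost:7474/db/data"


def create_index_request(layer, wkt_property="wkt"):
    """Step 1: request that creates a spatial index (and layer) backed by a WKT property."""
    body = {"name": layer, "config": {"provider": "spatial", "wkt": wkt_property}}
    return "POST", BASE + "/index/node", json.dumps(body)


def add_node_to_layer_request(layer, node_id):
    """Method 1: request that links an existing node directly into the layer."""
    body = {"layer": layer, "node": "%s/node/%d" % (BASE, node_id)}
    return "POST", BASE + "/ext/SpatialPlugin/graphdb/addNodeToLayer", json.dumps(body)


def within_distance_query(layer, lon, lat, km):
    """Cypher index query; the template above puts y (latitude) before x (longitude)."""
    return 'START n = node:%s("withinDistance:[%s, %s, %s]") RETURN n' % (layer, lat, lon, km)


if __name__ == "__main__":
    for method, url, body in (create_index_request("test"),
                              add_node_to_layer_request("test", 4)):
        print(method, url, body)
    print(within_distance_query("test", 11.5, 11.5, 10.0))
```

Taking longitude and latitude as named arguments and swapping them inside the helper keeps the y-before-x ordering in one place.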
The answer to this question shows how to get a list of all nodes connected to a particular node via a path of known relationship types.
As a follow up to that question, I'm trying to determine if traversing the graph like this is the most efficient way to get all nodes connected to a particular node via any path.
My scenario: I have a tree of groups (group can have any number of children). This I model with IS_PARENT_OF relationships. Groups can also relate to any other groups via a special relationship called role playing. This I model with PLAYS_ROLE_IN relationships.
The most common question I want to ask is MATCH (n {name: "xxx"})-[*]->(o) RETURN o.name, but this is extremely slow even on a small graph (4000 nodes; it takes 5s to return an answer). Note that the graph may contain cycles (n-IS_PARENT_OF->o, n<-PLAYS_ROLE_IN-o).
Is connectedness via any path not something that can be indexed?
First, because you are not using a label and an indexed property for your starting node, Neo4j has to scan ALL the nodes in the graph and open each one's PropertyContainer to check whether it has a name property with the value "xxx".
Second, if you know an approximate maximum depth of parentship, you may want to limit the depth of the search.
I would suggest you add a label of your choice to your nodes and index the name property.
Use label, e.g. :Group for your starting point and an index for :Group(name)
Then Neo4j can quickly find your starting point without scanning the whole graph.
You can easily see where the time is spent by prefixing your query with PROFILE.
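A minimal sketch of the statements these suggestions imply, written as Python strings so they can be handed to whichever client you use. It assumes a Group label, Neo4j 2.x syntax (CREATE INDEX ON and {name} parameters), and a depth cap of 5 that is purely illustrative; RETURN DISTINCT is also an addition, so that each name is reported once.

```python
# Assumed label; the question does not say which label the nodes carry.
LABEL = "Group"

# Schema index so the start node is found by lookup instead of a full scan.
CREATE_INDEX = "CREATE INDEX ON :%s(name)" % LABEL


def reachable_names_query(max_depth=None, profile=False):
    """Build the reachability query, optionally depth-limited and profiled."""
    depth = "*" if max_depth is None else "*..%d" % max_depth
    query = ("MATCH (n:%s {name: {name}})-[%s]->(o) "
             "RETURN DISTINCT o.name" % (LABEL, depth))
    return "PROFILE " + query if profile else query


if __name__ == "__main__":
    print(CREATE_INDEX)
    print(reachable_names_query(max_depth=5, profile=True))
```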
Do you really want all arbitrarily long paths from the starting point? Or just all pairs of connected nodes?
If the latter then this query would be more efficient.
MATCH (n:Group)-[:IS_PARENT_OF|:PLAYS_ROLE_IN]->(m:Group)
RETURN n,m
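To illustrate the distinction between paths and connected nodes: a variable-length pattern enumerates paths, and the number of paths can grow very quickly in a graph with cycles, whereas the set of reachable nodes only requires visiting each node once. The sketch below shows that idea on a plain adjacency dictionary; it is not how Neo4j evaluates the query, and the sample graph is made up.

```python
from collections import deque


def reachable(adjacency, start):
    """Return every node reachable from start via outgoing edges, excluding start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:  # the visited set is what makes cycles harmless
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


if __name__ == "__main__":
    # Hypothetical groups; "a" and "b" form a cycle.
    graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["d"], "d": []}
    print(sorted(reachable(graph, "a")))
```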
I'm using Neo4j to find Users who are in a 50 km radius and who are available on specific days.
This question is similar to this other question but Indexes have changed since Neo4J 2.0 so the solution does not work.
I use Neo4j 2.2.1, Neo4j-spatial 0.14 and py2neo / py2neo-spatial to interact with the graph.
To add user geometries to the graph I use:
def create_geo_node(graph, node, layer_name, name, latitude, longitude):
    spatial = Spatial(graph)
    layer = spatial.create_layer(layer_name)
    node_id = node._id
    shape = parse_lat_long(latitude, longitude)
    spatial.create_geometry(geometry_name=name, wkt_string=shape.wkt, layer_name="Users", node_id=node_id)
..which creates the spatial nodes as wanted.
I then would like to query the graph by doing:
START user=node:Users('withinDistance:[4.8,45.8,100]')
MATCH period=(start_date:Date)-[:NEXT_DAY*]->(end_date:Date)
WHERE start_date.date="2014-03-03" AND end_date.date="2014-03-04"
UNWIND nodes(period) as nodes_in_period
OPTIONAL MATCH (nodes_in_period)<-[a:AVAILABLE]-(user:User)
RETURN user.uuid, count(a)/count(nodes(period))
but the query returns:
Index `Users` does not exist
It seems that the py2neo spatial.create_layer(..) creates the layer but not the Index (but should it? ..as Indexes are now a "Legacy" of Neo4j 1.*).
Using py2neo spatial find_within_distance works, but since it uses the REST API I cannot make mixed requests that take other parameters into account.
From what I understand, START is deprecated since Neo4j 2.0 but I am unable to find the correct Cypher query for withinDistance in Neo4j 2.2
Thank you in advance,
Benjamin
create_layer creates a "spatial" index which is different to Neo's Indexes. It actually creates a graph which models some bounding boxes for you so that you can then carry out spatial queries over your data. You don't need to reference this index directly. Think of it more like a GIS layer.
You can inspect your graph and dig out the Node attributes you need to write your own cypher query.
But you could also use the py2neo spatial API find_within_distance
http://py2neo.org/2.0/ext/spatial.html#py2neo.ext.spatial.plugin.Spatial.find_within_distance
Hope this helps.
I think these links may be useful:
* Neo4j Spatial 'WithinDistance' Cypher query returns empty while REST call returns data
* https://github.com/neo4j-contrib/spatial/issues/106
But your problem is not that you get no results; it is an error about the index, which is why I'm perplexed.
Can you test Neo4j Spatial directly with some REST queries (to create the layer and a spatial node) to see what happens?
Otherwise, for your question about the Cypher START condition, you just have to move this condition into the MATCH clause like this:
MATCH
user=node:Users('withinDistance:[4.8,45.8,100]'),
period=(start_date:Date {date:'2014-03-03'})-[:NEXT_DAY*]->(end_date:Date {date:'2014-03-04'})
UNWIND nodes(period) as nodes_in_period
OPTIONAL MATCH (nodes_in_period)<-[a:AVAILABLE]-(user:User)
RETURN user.uuid, count(a)/count(nodes(period))
I created a spatial index in neo4j but when searching for nearby places I only get one result.
My query is:
START n=node:geom('withinDistance:[63.36, 10.35, 50.0]') RETURN n
And I have 3 nodes in the spatial index with this coords:
Node 1 lat,lon: 63.3654, 10.3578
Node 2 lat,lon: 63.3654, 10.3577
Node 3 lat,lon: 63.3654, 10.3578 (same as node 1)
Theoretically the three nodes are in the same area.
Any idea?
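As a sanity check on that expectation, the great-circle distance from the query point to each node can be computed with the haversine formula. This is a minimal sketch assuming a spherical Earth with a mean radius of 6371 km; it is not necessarily the distance calculation Neo4j Spatial uses internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Query point and node coordinates as (lat, lon), copied from the question.
QUERY = (63.36, 10.35)
NODES = {1: (63.3654, 10.3578), 2: (63.3654, 10.3577), 3: (63.3654, 10.3578)}

distances = {
    node: haversine_km(QUERY[0], QUERY[1], lat, lon)
    for node, (lat, lon) in NODES.items()
}

if __name__ == "__main__":
    for node, km in sorted(distances.items()):
        print("node %d: %.3f km" % (node, km))
```

All three distances come out under one kilometre, so a 50 km radius should cover every node.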
UPDATE
I performed these steps to use spatial (all executed from neo4j browser -> rest api)
1) Index creation
:POST /db/data/index/node/
{
"name" : "geom",
"config" : {
"provider" : "spatial",
"geometry_type" : "point",
"lat" : "lat",
"lon" : "lon"
}
}
2) Nodes creation (all in the same way)
:POST /db/data/node
{
"name":"Franciscatos Pizza",
"lat": 63.3654,
"lon": 10.3578
}
3) Node to spatial index
:POST /db/data/index/node/geom
{
"value":"dummy",
"key":"dummy",
"uri":"http://localhost:7474/db/data/node/8"
}
4) Node to layer
:POST /db/data/ext/SpatialPlugin/graphdb/addNodeToLayer
{
"layer":"geom",
"node":"http://localhost:7474/db/data/node/8"
}
All API responses are OK and all indexed nodes have the :RTREE_REFERENCE relationship.
Depending on the distance parameter in the query, I get different nodes back, but always only one...
Darios,
First thing, don't do step 3). Steps 3) and 4) are somewhat redundant, but step 3) makes a copy of the geometry information in the node and creates a second node that is stored into the layer. Instead, do this new step 3).
START n = NODE(8)
SET n.id = ID(n)
This Cypher code adds an 'id' property on the node that contains the Neo4j node number. Once you do this, you can use the Cypher spatial index query. Note that the first line will have a different node number each time. This 'id' property is self-referential.
Alternatively, do your step 3), but don't do step 4). But then you won't get what you expect if you do a REST geometry query.
See if your results improve.
Grace and peace,
Jim
PS.
Michael,
There are actually two competing approaches in play with spatial right now. If you use addNodeToLayer to add your node to a layer (as in step 4), the node is linked into the RTree graph directly and Cypher queries won't find the node. This is also true if you are using Java. You can query via REST using findGeometriesWithinDistance and findGeometriesInBBox.
If you use the 'add the node to the spatial index' method to add your node to a layer (as in step 3), it doesn't actually add your node to the layer. A new node is made that contains a copy of the geometry properties on the original node and an 'id' property that contains the Neo4j node number of the original node, and this copy node is added to the RTree graph. The 'spatial index' does not actually contain a list of nodes. It is an access point for the spatial extension code. When you do a Cypher spatial query, the spatial extension finds the copy nodes that satisfy the query, then dereferences the 'id' properties on each to build a return list of original nodes.
It's the lack of the 'id' property to dereference that causes Cypher spatial index queries to fail if you add a node to a layer using step 4) alone. By adding the 'id' property, the dereference succeeds, and you get results from your query.
The shapefile importer links nodes directly into the RTree, and if you want to be able to do Cypher spatial index queries, you need to add the 'id' property to each node as I described. The OSM importer builds related 'domain' and geometry nodes, but I don't think it makes them accessible to Cypher-based queries. If you add the 'id' property to each geometry node, then they will be.
I may have missed it, but I haven't seen anyone point out that if you use the 'add the node to the spatial index' method, that you just doubled the number of nodes you have, as well as doubled the number of geometry properties stored in your database. Since there is no relationship built between the original nodes and the copy nodes, there is no way to access the geometry properties in the copy nodes, so you can't really delete the geometry properties from the original nodes.
As a result, I find it more desirable to add my nodes to the RTree graph directly and make them queryable (queriable?) through the Cypher spatial index by adding self-referential 'id' properties.
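The dereferencing behaviour described above can be pictured with a toy model. The class below is purely illustrative: it is not Neo4j Spatial code, it does no geometry filtering at all, and every name in it is made up. It only mimics which node numbers each kind of query hands back under the two ways of adding a node.

```python
class ToySpatialLayer:
    """Toy stand-in for a layer's RTree; it tracks node numbers and nothing else."""

    def __init__(self, graph):
        self.graph = graph    # node number -> property dict
        self.members = []     # node numbers linked into the "RTree"

    def add_node_to_layer(self, node_id):
        """Like addNodeToLayer: link the original node itself."""
        self.members.append(node_id)

    def add_via_index(self, node_id):
        """Like adding to the spatial index: link a copy that carries an 'id' property."""
        original = self.graph[node_id]
        copy_id = max(self.graph) + 1
        self.graph[copy_id] = {"wkt": original["wkt"], "id": node_id}
        self.members.append(copy_id)
        return copy_id

    def cypher_style_query(self):
        """Dereference 'id' on each member; members without it yield nothing."""
        return [self.graph[m]["id"] for m in self.members if "id" in self.graph[m]]

    def rest_style_query(self):
        """Return the members themselves, copy nodes included."""
        return list(self.members)


if __name__ == "__main__":
    graph = {4: {"name": "Fooville", "wkt": "POINT(11.5 11.5)"}}
    layer = ToySpatialLayer(graph)
    layer.add_node_to_layer(4)
    print(layer.cypher_style_query())   # nothing to dereference yet
    graph[4]["id"] = 4                  # the self-referential 'id' property
    print(layer.cypher_style_query())
```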
As for deleting nodes, there is no REST SpatialPlugin method for removing a node from a layer. If you add the node to the RTree graph using the REST spatial index method, then the REST call
:DELETE /db/data/index/node/geom/{ID}
will remove the node from the RTree, but there is a catch. You must get the Neo4j node number of the copy node in order for this to work! Which you can't in any straightforward way. If you manage to obtain the node number of the copy node, it will remove it from the RTree, but the copy node is not deleted.
Somewhat ironically, if you add the node to the RTree using addNodeToLayer and don't add the 'id' property, the call to remove the node from the index removes the node from the RTree. If you add the self-referential 'id' property and then remove the node from the index, the node is deleted. So every approach is flawed.
I am using Neo4j 2.3 and found that step 3) is unnecessary but step 4) is not. Also, if you do not copy the node id into an 'id' property, the Cypher query no longer works (it returns no results).
I am using Neo4j 1.8.2 with Neo4j Spatial 0.9 for 1.8.2 (http://m2.neo4j.org/content/repositories/releases/org/neo4j/neo4j-spatial/0.9-neo4j-1.8.2/)
Followed the example code from here http://architects.dzone.com/articles/neo4jcypher-finding-football with one change- instead of SpatialIndexProvider.SIMPLE_WKT_CONFIG, I used SpatialIndexProvider.SIMPLE_POINT_CONFIG_WKT
Everything works fine until you execute the following query:
START n=node:stadiumsLocation('withinDistance:[53.489271,-2.246704, 5.0]')
RETURN n.name, n.wkt;
n.name is null. When I explored the graph, I found this data:
Node[80]{lon:-2.20024,lat:53.483,id:79,gtype:1,bbox:[-2.20024,53.483,-2.20024,53.483]}
Node[168]{lon:-2.29139,lat:53.4631,id:167,gtype:1,bbox:[-2.29139,53.4631,-2.29139,53.4631]}
For Node 80 returned, it looks like this is the node created for the spatial record, which contains a property id:79. Node 79 is the actual stadium record from the example.
As per the source of IndexProviderTest, the comments
//We not longer need this as the node we get back already a 'Real' node
// Node node = db.getNodeById( (Long) spatialRecord.getProperty( "id" ) );
seem to indicate that this feature isn't available in the version I am using.
My question is, what is the recommended way to use withinDistance with other match conditions? There are a couple of other conditions to be fulfilled but I can't seem to get a handle on the actual node to actually match them.
Should I explicitly create relations? Not use Cypher and use the core API to do a traversal? Split the queries?
Two options:
a) Use GeoPipeline.startNearestNeighborLatLonSearch to get a starting set of nodes, then supply them to a subsequent Cypher query that does the matching/filtering on other properties.
b) Since my lat/longs are common across many entities [using centroid of an area], I can create a relation from the spatial node to all entities that are located in that area and then use one Cypher query such as:
START n=node:stadiumsLocation('withinDistance:[53.489271,-2.246704, 5.0]')
MATCH (n)<-[:LOCATED_IN]-(something)
WHERE something.someProp=5
RETURN something
As advised by Peter, went with option b.
Note though, there is no way to get the spatially indexed node back so that you can create relations from it. Had to do a withinDistance query for 0.0 distance.
Can you execute the enhanced test case I wrote at https://github.com/neo4j/spatial/blob/2803093d544f56d7dfe8f1d122e049fa73489d8a/src/test/java/org/neo4j/gis/spatial/IndexProviderTest.java#L199 ? It shows how to find a location and traverse with Cypher to the next node.
I have a large network stored in Neo4j. Based on a particular root node, I want to extract a subgraph around that node and store it somewhere else. So, what I need is the set of nodes and edges that match my filter criteria.
Afaik there is no out-of-the-box solution available. There is a graph matching component available, but it works only for perfect matches. The Neo4j API itself defines only graph traversal which I can use to define which nodes/edges should be visited:
Traverser exp = Traversal
.description()
.breadthFirst()
.evaluator(Evaluators.toDepth(2))
.traverse(root);
Now, I can add all nodes/edges to sets for all paths, but this is very inefficient. How would you do it? Thanks!
EDIT Would it make sense to add the last node and the last relationship of each traversal to the subgraph?
As for graph matching, that has been superseded by http://docs.neo4j.org/chunked/snapshot/cypher-query-lang.html which would fit nicely, and supports fuzzy matching with optional relationships.
For subgraph representation, I would use the Cypher output to maybe construct new Cypher statements for recreating the graph, much like a SQL export, something like
start n=node:node_auto_index(name='Neo')
match n-[r:KNOWS*]-m
return "create ({name:'"+m.name+"'});"
http://console.neo4j.org/r/pqf1rp for an example
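The same export idea can also be done on the client side. The sketch below is a hypothetical helper, not part of any Neo4j tooling: it renders a dictionary of node properties as a create statement and escapes quotes in string values, which the concatenation in the query above does not do. It covers nodes only, not relationships.

```python
def cypher_literal(value):
    """Render a Python bool, number, or string as a Cypher literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    # Escape backslashes first, then single quotes, before wrapping in quotes.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def create_statement(properties):
    """Build a create statement for one node; keys are sorted for stable output."""
    body = ", ".join(
        "%s: %s" % (key, cypher_literal(value))
        for key, value in sorted(properties.items())
    )
    return "create ({%s});" % body


if __name__ == "__main__":
    print(create_statement({"name": "Neo"}))
    print(create_statement({"name": "O'Brien", "age": 3}))
```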
I solved it by constructing the induced subgraph based on all traversal endpoints.
Building the subgraph from the set of last nodes and edges of every traversal does not work, because edges that are not part of any shortest paths would not be included.
The code snippet looks like this:
Set<Node> nodes = new HashSet<Node>();
Set<Relationship> edges = new HashSet<Relationship>();
for (Node n : traverser.nodes())
{
    nodes.add(n);
}
for (Node node : nodes)
{
    for (Relationship rel : node.getRelationships())
    {
        if (nodes.contains(rel.getOtherNode(node)))
            edges.add(rel);
    }
}
Every edge is added twice. One time for the outgoing node and one time for the incoming node. Using a Set, I can ensure that it's in the collection only once.
It is possible to iterate over incoming or outgoing edges only, but it is unclear how loops (an edge from a node to itself) are handled. To which category do they belong? This snippet does not have this issue.
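For comparison, the same induced-subgraph idea on plain data, as a minimal Python sketch. It assumes edges are available as (edge id, source, target) tuples, which is an illustrative representation rather than the Neo4j API. Because it walks the edge list once instead of walking each node's relationships, every edge is seen a single time, and a self-loop is kept like any other edge whose endpoints are both in the node set.

```python
def induced_edges(nodes, edges):
    """Return the edges whose source and target both lie in the given node set."""
    node_set = set(nodes)
    return {
        (edge_id, source, target)
        for edge_id, source, target in edges
        if source in node_set and target in node_set
    }


if __name__ == "__main__":
    # Hypothetical edge list; edge 3 is a self-loop and edge 4 leaves the node set.
    edges = [(1, "a", "b"), (2, "b", "c"), (3, "a", "a"), (4, "c", "d")]
    print(sorted(induced_edges({"a", "b", "c"}, edges)))
```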
See dumping the database to cypher statements
dump START n=node({self}) MATCH p=(n)-[r:KNOWS*]->(m) RETURN n,r,m;
There's also an example for importing the subgraph of first database (db1) into a second (db2).