Neo4j modelling - sorting nodes ordered by distance - neo4j

I am using Neo4j with PHP.
In my project, I have restaurant nodes. Each node has latitude, longitude and taxonomy properties.
I need to return the restaurant nodes matching the user's given taxonomy, with results ordered by distance from the user's location (that is, nearest restaurant first).
What is the easiest solution?
I have worked with MongoDB and Elasticsearch, where this is very easy to achieve using special indexes. But I could not find a straightforward way in Neo4j.

There are a couple of solutions:
Using the Neo4j Spatial plugin: https://github.com/neo4j-contrib/spatial
Computing the distance yourself with haversin in Cypher: http://neo4j.com/docs/stable/query-functions-mathematical.html#functions-spherical-distance-using-the-haversin-function
In 3.0Mx there should be basic Cypher functions for point and distance: https://github.com/neo4j/neo4j/pull/5397/files (I haven't tested it though)
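To make the "compute the distance yourself" option concrete, here is a minimal client-side sketch in Python of filtering by taxonomy and sorting by great-circle distance. The property names (taxonomy, latitude, longitude) come from the question; the function names and the sample data are made up for illustration, and in practice you would more likely do the ordering inside the Cypher query itself.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, same constant commonly used with haversin


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_first(restaurants, taxonomy, user_lat, user_lon):
    """Keep restaurants with the given taxonomy, nearest to the user first."""
    matching = [r for r in restaurants if r["taxonomy"] == taxonomy]
    return sorted(
        matching,
        key=lambda r: haversine_km(user_lat, user_lon, r["latitude"], r["longitude"]),
    )


# Hypothetical sample data standing in for restaurant nodes.
restaurants = [
    {"name": "far", "taxonomy": "pizza", "latitude": 48.8566, "longitude": 2.3522},
    {"name": "near", "taxonomy": "pizza", "latitude": 51.5074, "longitude": -0.1278},
    {"name": "other", "taxonomy": "sushi", "latitude": 51.5000, "longitude": -0.1200},
]
ordered = nearest_first(restaurants, "pizza", 51.5, -0.1)
```

Sorting on the client only makes sense for small result sets; for larger ones an index-backed approach such as the spatial plugin is the better fit.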

Besides the aforementioned Neo4j-Spatial, Neo4j 3.0 also has a built-in distance() function.
See this GraphGist:
http://jexp.github.io/graphgist/idx?dropbox-14493611%2Fcypher_spatial.adoc
So if you find and match your restaurants some way you can order them by distance:
MATCH (a:Location), (b:Restaurant)
WHERE ... filtering ...
RETURN b
ORDER BY distance(point(a),point(b))

Neo4j Spatial features distance queries (among lots of other things) and also cares about ordering.

Related

How to search all nodes within particular radius using latitude and longitude in neo4j

I have two types of nodes, Idea and Location. Idea contains some general information and the Location node has 3 properties: its id, latitude and longitude. The relationships between these nodes are of the following types:
(i:Idea)-[:DEVELOPED_AT]->(l:Location)
(i:Idea)-[:DEPLOYED_AT]->(l:Location)
Now, when a user searches for an idea by geography using Google Places autocomplete, I receive the lat, long of the searched location. I then have to return all the related ideas either developed or deployed within a particular radius of that searched location. While searching I came across Neo4j Spatial, but I don't know how to use it.
You have a few options here.
neo4j-spatial
As you mentioned, the Neo4j Spatial extension can be used for efficient geospatial indexing. One type of query that the spatial extension provides is withinDistance which will query for indexed nodes within a given radius. There are a few tutorials online that explain how to get started with neo4j spatial, but once you have it installed and added the nodes to the spatial index you can use a Cypher query like this to filter for nodes within 50km of a specified latitude, longitude:
// Find all Location nodes within 50km of specified lat/lon
START l=node:geom('withinDistance:[46.9163, -114.0905, 50.0]')
// Find all Idea nodes developed or deployed at these locations
MATCH (l)<-[:DEVELOPED_AT|:DEPLOYED_AT]-(i:Idea)
RETURN i
The spatial query is backed by an RTree index and is therefore efficient.
Compute distance
Another option is to use the Haversine formula to compute distance and use this as a filter. Note that this method will not be as efficient as the index-backed neo4j-spatial approach, as the distance will be computed for each Location node.
Since Cypher includes a haversin function this can be done using Cypher:
// Find all Location nodes within 50km of specified lat/lon (distance computed per node)
WITH 46.9163 AS lat, -114.0905 AS lon
MATCH (l:Location)
WHERE 2 * 6371 * asin(sqrt(haversin(radians(lat - l.lat))+ cos(radians(lat))* cos(radians(l.lat))* haversin(radians(lon - l.lon)))) < 50.0
MATCH (l)<-[:DEVELOPED_AT|:DEPLOYED_AT]-(i:Idea)
RETURN i
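For checking the WHERE expression outside the database, the same arithmetic can be mirrored in Python. This is only a sketch: it assumes haversin(x) is the usual half-versine, (1 - cos(x)) / 2, and the two sample locations are invented for illustration.

```python
import math


def haversin(theta):
    """Half-versine of an angle in radians: (1 - cos(theta)) / 2."""
    return (1 - math.cos(theta)) / 2


def distance_km(lat, lon, l_lat, l_lon):
    """Same shape as the Cypher WHERE expression: 2 * 6371 * asin(sqrt(...))."""
    return 2 * 6371 * math.asin(math.sqrt(
        haversin(math.radians(lat - l_lat))
        + math.cos(math.radians(lat)) * math.cos(math.radians(l_lat))
        * haversin(math.radians(lon - l_lon))
    ))


# Hypothetical Location rows: one a few km from the search point, one far away.
locations = [
    {"id": 1, "lat": 46.8721, "lon": -113.9940},
    {"id": 2, "lat": 47.6062, "lon": -122.3321},
]
within_50km = [
    loc["id"] for loc in locations
    if distance_km(46.9163, -114.0905, loc["lat"], loc["lon"]) < 50.0
]
```

Because 6371 is the Earth's radius in kilometres, the result and the 50.0 threshold are both in kilometres here.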
UPDATE
With Neo4j 3.4 and later you can use the built-in spatial index to achieve this:
MATCH (i:Idea)-[:DEVELOPED_AT|:DEPLOYED_AT]->(l:Location)
WHERE distance(l.coord, point({ latitude: 46.9163, longitude: -114.0905 })) < 50.0
RETURN i
Here 50.0 is a value in meters, because distance() returns meters for geographic points; use 50000 for a 50km radius.
Note that for the query to benefit from the index, you must create an index on the point property (l.coord here) when you create the location data.
References:-
https://neo4j.com/docs/cypher-manual/current/functions/spatial/
https://neo4j.com/docs/cypher-manual/current/syntax/spatial/

Neo4j 2.2: Cypher Spatial request with other parameters returns Index does not exist

I'm using Neo4j to find Users who are in a 50 km radius and who are available on specific days.
This question is similar to this other question, but indexes have changed since Neo4j 2.0, so the solution does not work.
I use Neo4j 2.2.1, Neo4j-spatial 0.14 and py2neo / py2neo-spatial to interact with the graph.
To add user geometries to the graph I use:
def create_geo_node(graph, node, layer_name, name, latitude, longitude):
    spatial = Spatial(graph)
    layer = spatial.create_layer(layer_name)
    node_id = node._id
    shape = parse_lat_long(latitude, longitude)
    spatial.create_geometry(geometry_name=name, wkt_string=shape.wkt, layer_name="Users", node_id=node_id)
..which creates the spatial nodes as wanted.
I then would like to query the graph by doing:
START user=node:Users('withinDistance:[4.8,45.8,100]')
MATCH period=(start_date:Date)-[:NEXT_DAY*]->(end_date:Date)
WHERE start_date.date="2014-03-03" AND end_date.date="2014-03-04"
UNWIND nodes(period) as nodes_in_period
OPTIONAL MATCH (nodes_in_period)<-[a:AVAILABLE]-(user:User)
RETURN user.uuid, count(a)/count(nodes(period))
but the query returns:
Index `Users` does not exist
It seems that the py2neo spatial.create_layer(..) creates the layer but not the index (but should it? Indexes are now a "Legacy" feature of Neo4j 1.*).
Using the py2neo spatial find_within_distance works, but since it uses the REST API I cannot make mixed requests which take other parameters into account.
From what I understand, START is deprecated since Neo4j 2.0, but I am unable to find the correct Cypher query for withinDistance in Neo4j 2.2.
Thank you in advance,
Benjamin
create_layer creates a "spatial" index which is different to Neo's Indexes. It actually creates a graph which models some bounding boxes for you so that you can then carry out spatial queries over your data. You don't need to reference this index directly. Think of it more like a GIS layer.
You can inspect your graph and dig out the Node attributes you need to write your own cypher query.
But you could also use the py2neo spatial API find_within_distance
http://py2neo.org/2.0/ext/spatial.html#py2neo.ext.spatial.plugin.Spatial.find_within_distance
Hope this helps.
I think these links may be useful:
* Neo4j Spatial 'WithinDistance' Cypher query returns empty while REST call returns data
* https://github.com/neo4j-contrib/spatial/issues/106
But the problem is not that you get no results, it is an error on the index, which is why I'm perplexed.
Can you test Neo4j Spatial directly with some REST queries (to create the layer and the spatial node) to see what happens?
Otherwise, for your question about the Cypher START condition, you just have to put this condition into the MATCH one like this:
MATCH
user=node:Users('withinDistance:[4.8,45.8,100]'),
period=(start_date:Date {date:'2014-03-03'})-[:NEXT_DAY*]->(end_date:Date {date:'2014-03-04'})
UNWIND nodes(period) as nodes_in_period
OPTIONAL MATCH (nodes_in_period)<-[a:AVAILABLE]-(user:User)
RETURN user.uuid, count(a)/count(nodes(period))

How to implement Dijkstra's algorithm in Neo4j using Cypher

My question is: is it possible to implement Dijkstra's algorithm using Cypher? The explanation on the Neo4j website only talks about the REST API, and it is very difficult to understand for a beginner like me.
Please note that I want to find the shortest path with the shortest distance between two nodes, and not the shortest path (involving least number of relationships) between two nodes. I am aware of the shortestPath algorithm that is very easy to implement using Cypher, but it does not serve my purpose.
Kindly guide me on how to proceed if I have a graph database with nodes, and relationships between the nodes having the property 'distance'. All I want is to write code that finds the shortest distance between two nodes in the database. Or any tips if I need to change my approach and use some other program for this?
In this case you can use allShortestPaths, order the paths in ascending order based on the distance property of the relationships, and return only one. Based on your last post it would be something like this:
MATCH (from: Location {LocationName:"x"}), (to: Location {LocationName:"y"}) ,
paths = allShortestPaths((from)-[:CONNECTED_TO*]->(to))
WITH REDUCE(dist = 0, rel in rels(paths) | dist + rel.distance) AS distance, paths
RETURN paths, distance
ORDER BY distance
LIMIT 1
No, it's not possible in a reasonable way unless you use transactions and basically rewrite the algorithm.
The previous answer is wrong, as longer but less expensive paths will not be returned by the allShortestPaths subset. You would be filtering a subset of paths that has been chosen without considering relationship cost.
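A small sketch in Python may make the objection concrete. It is a plain Dijkstra over an in-memory adjacency list, not anything Neo4j provides, and the node names and distances are invented for illustration. The direct x to y edge is the only path with the fewest hops, so an allShortestPaths-based query would only ever see its cost of 10, while the three-hop route costs 3.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (total_distance, path) for the cheapest path, or (inf, []) if unreachable."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == target:
            return dist, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, weight in graph.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (dist + weight, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical graph: each entry maps a node to (neighbour, distance) pairs.
graph = {
    "x": [("y", 10), ("a", 1)],
    "a": [("b", 1)],
    "b": [("y", 1)],
}
cost, path = dijkstra(graph, "x", "y")
```

Here cost is 3 and path is x, a, b, y, a route that a fewest-hops search discards before any distance is summed.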

Nearest nodes to a given node, assigning dynamically weight to relationship types

I need to find the N nodes "nearest" to a given node in a graph, meaning the ones with least combined weight of relationships along the path from given node.
Is it possible to do so with a pure Cypher solution? I looked at the path functions but couldn't find a viable way to express my query.
Moreover, is it possible to assign a default weight to a relationship at query time, according to its type/label (or somehow else map the relationship type to the weight)? The idea is to experiment with different weights without having to change a property for every relationship.
Otherwise I would have to change the weight property's value on each relationship and redo that before each query, which is very time-consuming (my graph has around 10M relationships).
Again, a pure Cypher solution would be the best, or please point me in the right direction.
You can use variable-length Cypher patterns to find the nearest nodes from a single node.
MATCH (n:Start { id: 0 }),
(n)-[:CONNECTED*0..2]-(x)
RETURN x
Note that the syntax [:CONNECTED*0..2] specifies the minimum and maximum number of relationship hops from the given node, for relationships of type CONNECTED.
You can swap this relationship type for other types.
In the case you wanted to traverse variably from the start node to surrounding nodes but constrain via a stop criteria to a threshold, that is a bit more difficult. For these kinds of things it is useful to get acquainted with Neo4j's spatial plugin. A good starting point to learn more about Neo4j spatial can be found in this blog post: http://neo4j.com/blog/neo4j-spatial-part1-finding-things-close-to-other-things
The post is a little outdated but if you do some Google searching you can find more updated materials.
GitHub repository: https://github.com/neo4j-contrib/spatial
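The variable-length pattern above ranks by hop count rather than by combined weight. As a sketch of what the question describes (weights looked up per relationship type at query time, N nearest nodes by total weight), here is a Dijkstra-style search in Python over an exported edge list. Everything in it is an assumption for illustration: the function name, the (start, type, end) edge format, the relationship types and the weights. Relationships are treated as undirected, as in the pattern above.

```python
import heapq


def nearest_nodes(edges, type_weights, start, n, default_weight=1.0):
    """Return up to n (node, cost) pairs with the lowest combined weight from start."""
    # Build an undirected adjacency list, mapping each relationship type to a weight.
    adjacency = {}
    for a, rel_type, b in edges:
        weight = type_weights.get(rel_type, default_weight)
        adjacency.setdefault(a, []).append((b, weight))
        adjacency.setdefault(b, []).append((a, weight))

    queue = [(0.0, start)]
    settled = {}  # insertion order equals ascending cost order
    while queue and len(settled) <= n:
        cost, node = heapq.heappop(queue)
        if node in settled:
            continue
        settled[node] = cost
        for neighbour, weight in adjacency.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (cost + weight, neighbour))
    return [(node, cost) for node, cost in settled.items() if node != start][:n]


# Hypothetical edges as (start, relationship type, end) triples.
edges = [
    ("s", "FRIEND", "a"),
    ("s", "COLLEAGUE", "b"),
    ("a", "FRIEND", "c"),
    ("b", "COLLEAGUE", "d"),
]
friends_first = nearest_nodes(edges, {"FRIEND": 1.0, "COLLEAGUE": 5.0}, "s", 2)
colleagues_first = nearest_nodes(edges, {"FRIEND": 5.0, "COLLEAGUE": 1.0}, "s", 2)
```

Changing the type_weights dictionary is enough to experiment with different weightings, so no relationship property has to be rewritten between runs.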

Neo4j Spatial- two nodes created for every spatially indexed node

I am using Neo4j 1.8.2 with Neo4j Spatial 0.9 for 1.8.2 (http://m2.neo4j.org/content/repositories/releases/org/neo4j/neo4j-spatial/0.9-neo4j-1.8.2/)
I followed the example code from http://architects.dzone.com/articles/neo4jcypher-finding-football with one change: instead of SpatialIndexProvider.SIMPLE_WKT_CONFIG, I used SpatialIndexProvider.SIMPLE_POINT_CONFIG_WKT.
Everything works fine until you execute the following query:
START n=node:stadiumsLocation('withinDistance:[53.489271,-2.246704, 5.0]')
RETURN n.name, n.wkt;
n.name is null. When I explored the graph, I found this data:
Node[80]{lon:-2.20024,lat:53.483,id:79,gtype:1,bbox:[-2.20024,53.483,-2.20024,53.483]}
Node[168]{lon:-2.29139,lat:53.4631,id:167,gtype:1,bbox:[-2.29139,53.4631,-2.29139,53.4631]}
For Node 80 returned, it looks like this is the node created for the spatial record, which contains a property id:79. Node 79 is the actual stadium record from the example.
As per the source of IndexProviderTest, the comments
//We not longer need this as the node we get back already a 'Real' node
// Node node = db.getNodeById( (Long) spatialRecord.getProperty( "id" ) );
seem to indicate that this feature isn't available in the version I am using.
My question is, what is the recommended way to use withinDistance with other match conditions? There are a couple of other conditions to be fulfilled but I can't seem to get a handle on the actual node to actually match them.
Should I explicitly create relations? Not use Cypher and use the core API to do a traversal? Split the queries?
Two options:
a) Use GeoPipeline.startNearestNeighborLatLonSearch to get a starting set of nodes, then supply them to a subsequent Cypher query to do matching/filtering on other properties
b) Since my lat/longs are common across many entities [using centroid of an area], I can create a relation from the spatial node to all entities that are located in that area and then use one Cypher query such as:
START n=node:stadiumsLocation('withinDistance:[53.489271,-2.246704, 5.0]')
MATCH (n)<-[:LOCATED_IN]-(something)
WHERE something.someProp=5
RETURN something
As advised by Peter, I went with option b.
Note though, there is no way to get the spatially indexed node back so that you can create relations from it. I had to do a withinDistance query for 0.0 distance.
Can you execute the enhanced test case I wrote at https://github.com/neo4j/spatial/blob/2803093d544f56d7dfe8f1d122e049fa73489d8a/src/test/java/org/neo4j/gis/spatial/IndexProviderTest.java#L199 ? It shows how to find a location and traverse with Cypher to the next node.

Resources