I'm pretty new to Neo4j and its spatial plugin, so bear with me.
I've used the OSM importer to import the whole of Ireland into the db, and I'm now able to query it with the REST API to find nodes within X km of a point. (Side note: I can't get the Cypher query to return any results. Does the OSMImporter add the data to an index for querying, or must I loop through it all and add it to an index myself?)
What I actually want is a rudimentary reverse-geocoder-style query. I want to query the graph for the geometry that contains a geo coordinate and check whether it is a town/city/village etc.; if not, I want to check its ancestors until the query can tell me which town/county/state I am inside.
Unfortunately I'm quite lost, and I've tried, unsuccessfully, looking through the neo4j-spatial code and examples for a starting point.
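To make the intended lookup concrete, here is a minimal sketch in plain Python rather than neo4j-spatial. Everything in it is an assumption made for illustration: the Area class, the PLACE_KINDS set and the sample squares are hypothetical, and the containment test is a simple ray-casting check instead of whatever the spatial index does internally.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (lon, lat)

# Hypothetical set of area kinds that count as an answer.
PLACE_KINDS = {"village", "town", "city", "county", "state"}


@dataclass
class Area:
    name: str
    kind: str
    ring: List[Point]              # polygon outline, open or closed
    parent: Optional["Area"] = None


def contains(ring: List[Point], lon: float, lat: float) -> bool:
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge straddles the ray's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def depth(area: Area) -> int:
    """Number of ancestors above this area."""
    d = 0
    while area.parent is not None:
        area = area.parent
        d += 1
    return d


def reverse_geocode(areas: List[Area], lon: float, lat: float) -> Optional[Area]:
    """Find the innermost containing area, then climb until it is a named place."""
    hits = [a for a in areas if contains(a.ring, lon, lat)]
    if not hits:
        return None
    area: Optional[Area] = max(hits, key=depth)
    while area is not None and area.kind not in PLACE_KINDS:
        area = area.parent
    return area


def square(x0: float, y0: float, x1: float, y1: float) -> List[Point]:
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


county = Area("Example County", "county", square(0, 0, 10, 10))
town = Area("Example Town", "town", square(2, 2, 6, 6), parent=county)
park = Area("Example Park", "park", square(3, 3, 4, 4), parent=town)
AREAS = [county, town, park]
```

With this data, a point inside the park resolves to the town, because "park" is not in PLACE_KINDS and the lookup climbs one level. In a real graph the parent pointer would be a relationship between area nodes.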
Related
I am trying to analyze user behaviour in clickstream data using Neo4j. My use case is to create a flow chart/Sankey chart of a user's journey, from a specific node where the user starts browsing through the next N clicks; my data model and query pattern are similar to this article.
MATCH p=((s:Page{type:"Home"})-[:NEXT*..10]->(d:Page))
where length(p)>3
RETURN p
I am unable to figure out an approach for the following:
1) How to extract all pairs of nodes within p.
2) How to count each pair from step 1 in order to create the flow chart.
I tried various functions and was able to retrieve the pairs above, but there are a few (<1%) anomalous circular references, where user1 and user2 reach the same destination through exactly reversed flows. These break the flow chart, since flow charts are directional. Any suggestions on how to filter out these anomalies?
Any suggestions on how to implement the above would be appreciated. It need not be an exact answer; pseudocode or reference articles would help. Thanks.
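One possible approach, sketched here in Python on the client side rather than in Cypher: treat each returned path as an ordered list of page names, count consecutive pairs, and for every pair that also appears in the reverse direction keep only the dominant direction. The function names and the sample paths are hypothetical, and the tie-breaking rule is an arbitrary assumption.

```python
from collections import Counter


def count_pairs(paths):
    """Count every consecutive (source, target) pair over all paths."""
    counts = Counter()
    for path in paths:
        counts.update(zip(path, path[1:]))
    return counts


def drop_reverse_flows(counts):
    """Keep only the dominant direction of each pair so the chart stays directional.

    Self-loops are dropped as well. Ties are broken by keeping the pair whose
    source sorts first (an arbitrary choice made for this sketch).
    """
    kept = {}
    for (src, dst), n in counts.items():
        if src == dst:
            continue
        reverse = counts.get((dst, src), 0)
        if n > reverse or (n == reverse and src < dst):
            kept[(src, dst)] = n
    return kept


# Hypothetical journeys, e.g. the page names extracted from nodes(p).
paths = [
    ["Home", "Search", "Product", "Cart"],
    ["Home", "Search", "Product", "Cart"],
    ["Home", "Product", "Search", "Cart"],
]
flows = drop_reverse_flows(count_pairs(paths))
```

Here the single Product -> Search click is discarded because Search -> Product occurs twice. Note that this only removes two-node reversals; longer cycles (A -> B -> C -> A) would need a separate check.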
I am using Neo4j with PHP.
In my project, I have restaurant nodes. Each node has latitude, longitude and taxonomy properties.
I need to return the restaurant nodes matching the user's given taxonomy, with results ordered by distance from the user's location (that is, nearest restaurant first).
What is the easiest solution?
I have worked with MongoDB and Elasticsearch, where this is very easy to achieve using special indexing, but I could not find a straightforward way in Neo4j.
There are a couple of solutions:
Using the Neo4j Spatial plugin: https://github.com/neo4j-contrib/spatial
Computing the distance yourself with haversin in Cypher: http://neo4j.com/docs/stable/query-functions-mathematical.html#functions-spherical-distance-using-the-haversin-function
In 3.0Mx there should be basic Cypher functions for point and distance: https://github.com/neo4j/neo4j/pull/5397/files (I haven't tested it though)
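For reference, the "compute it yourself" option boils down to the haversine formula. The sketch below shows it in Python, applied to restaurants that have already been fetched. The dictionary keys mirror the properties mentioned in the question, while the function names and the Earth radius constant are assumptions made for this example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_first(restaurants, taxonomy, lat, lon):
    """Filter by taxonomy, then sort so the nearest restaurant comes first."""
    matches = [r for r in restaurants if r["taxonomy"] == taxonomy]
    return sorted(
        matches,
        key=lambda r: haversine_km(lat, lon, r["latitude"], r["longitude"]),
    )
```

Sorting on the client like this is only reasonable for small result sets; for anything larger the ordering is better done inside the database.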
Besides the aforementioned Neo4j-Spatial, in Neo4j 3.0 there is also a built-in distance() function.
See this GraphGist:
http://jexp.github.io/graphgist/idx?dropbox-14493611%2Fcypher_spatial.adoc
So if you find and match your restaurants some way you can order them by distance:
MATCH (a:Location), (b:Restaurant)
WHERE ... filtering ...
RETURN b
ORDER BY distance(point(a),point(b))
Neo4j Spatial features distance queries (among lots of other things) and also takes care of ordering.
I'm using Neo4j to find Users who are in a 50 km radius and who are available on specific days.
This question is similar to this other question, but indexes have changed since Neo4j 2.0, so that solution does not work.
I use Neo4j 2.2.1, Neo4j-spatial 0.14 and py2neo / py2neo-spatial to interact with the graph.
To add user geometries to the graph I use:
def create_geo_node(graph, node, layer_name, name, latitude, longitude):
    spatial = Spatial(graph)
    layer = spatial.create_layer(layer_name)
    node_id = node._id
    shape = parse_lat_long(latitude, longitude)
    spatial.create_geometry(geometry_name=name, wkt_string=shape.wkt, layer_name="Users", node_id=node_id)
which creates the spatial nodes as expected.
I then would like to query the graph by doing:
START user=node:Users('withinDistance:[4.8,45.8,100]')
MATCH period=(start_date:Date)-[:NEXT_DAY*]->(end_date:Date)
WHERE start_date.date="2014-03-03" AND end_date.date="2014-03-04"
UNWIND nodes(period) as nodes_in_period
OPTIONAL MATCH (nodes_in_period)<-[a:AVAILABLE]-(user:User)
RETURN user.uuid, count(a)/count(nodes(period))
but the query returns:
Index `Users` does not exist
It seems that the py2neo spatial.create_layer(..) creates the layer but not the index (but should it, given that indexes are now a "legacy" of Neo4j 1.*?).
Using py2neo spatial find_within_distance works, but since it uses the REST API I cannot make mixed requests that take other parameters into account.
From what I understand, START is deprecated since Neo4j 2.0, but I am unable to find the correct Cypher query for withinDistance in Neo4j 2.2.
Thank you in advance,
Benjamin
create_layer creates a "spatial" index, which is different from Neo4j's indexes. It actually creates a graph that models some bounding boxes for you, so that you can then carry out spatial queries over your data. You don't need to reference this index directly. Think of it more like a GIS layer.
You can inspect your graph and dig out the node attributes you need to write your own Cypher query.
But you could also use the py2neo spatial API find_within_distance:
http://py2neo.org/2.0/ext/spatial.html#py2neo.ext.spatial.plugin.Spatial.find_within_distance
Hope this helps.
I think these links may be useful:
* Neo4j Spatial 'WithinDistance' Cypher query returns empty while REST call returns data
* https://github.com/neo4j-contrib/spatial/issues/106
But your problem is not that you get no results; it is an error on the index, which is why I'm perplexed.
Can you test Neo4j Spatial directly with some REST queries (to create the layer and a spatial node) to see what happens?
Otherwise, regarding your question about the Cypher START condition: as far as I know, a legacy index lookup such as node:Users(...) is only accepted in a START clause, so keep it there and move the date conditions into the MATCH pattern, like this:
START user=node:Users('withinDistance:[4.8,45.8,100]')
MATCH period=(start_date:Date {date:'2014-03-03'})-[:NEXT_DAY*]->(end_date:Date {date:'2014-03-04'})
UNWIND nodes(period) as nodes_in_period
OPTIONAL MATCH (nodes_in_period)<-[a:AVAILABLE]-(user:User)
RETURN user.uuid, count(a)/count(nodes(period))
I'm trying to develop a web service that returns the name of the administrative area containing a given GPS position.
I have already developed a Java application that inserts polygons (the administrative areas of my country) into Neo4j using the spatial plugin and the Java API. Given a GPS position, I'm then able to get the name of the polygon that contains it.
Now I'm trying to do the same using the Neo4j REST API (instead of the Java API), but I can't find any examples.
So my questions are:
1) Is it possible to insert polygons into Neo4j using the REST API (if I understood correctly, this is possible using the WKT format)?
2) Is it possible to execute a spatial query that finds all polygons containing a given GPS position?
Thanks, Enrico
The answer to both of your questions is yes. Here are example steps that use REST and Cypher.
1) Create your spatial layer and index (REST). In this example, my index is named 'test' (a layer of the same name and base spatial nodes will be created), and the name of the property on my nodes that will contain the wkt geometry information is 'wkt'.
POST http://localhost:7474/db/data/index/node {"name":"test", "config":{"provider":"spatial", "wkt":"wkt"}}
2) Create a node (Cypher). You can have labels and various properties. The only part that Neo4j Spatial cares about is the 'wkt' property. (You could do this step with REST.)
CREATE (n { name : "Fooville", wkt : "POLYGON((11.0 11.0, 11.0 12.0, 12.0 12.0, 12.0 11.0, 11.0 11.0))" })
3) Add the node to the layer. You can do this by adding the node to the index or to the layer, but there is an important difference. If you add it to the index, a copy node containing only the geometry data will be created, and that will be added to the layer. Querying via Cypher will return your original node, but querying via REST or Java will return the copy node. If you add the node directly to the layer, then you must take an extra step if you want to be able to query with Cypher later. In both cases you will need the URI of the node, the last element of which is the Neo4j node number. In the example below, I assume the node number is 4 (which it will be if you do this example on a fresh, empty database).
Method 1:
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addNodeToLayer { "layer":"test", "node":"http://localhost:7474/db/data/node/4" }
To make this node searchable via Cypher, add the node number to the node as a user 'id' property. (You could do this with REST.)
START n = node(4) SET n.id = id(n)
Method 2: Using this method will double your node count, double your WKT storage, and produce differing results when querying via REST vs Cypher.
POST http://localhost:7474/db/data/index/node/test {"value":"dummy","key":"dummy","uri":"http://localhost:7474/db/data/node/4"}
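If you want to script the REST calls from steps 1 and 3, the sketch below builds them with Python's standard library. The URLs and JSON bodies are copied from the steps above; the helper names, the headers and the send function are assumptions made for this example, and nothing is sent until you call send against a running server.

```python
import json
from urllib.request import Request, urlopen

BASE = "http://localhost:7474/db/data"


def post(path, payload):
    """Build (but do not send) a JSON POST request against the Neo4j REST API."""
    return Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def create_spatial_index(name, wkt_property="wkt"):
    # Step 1: spatial index (and layer) keyed on the node's WKT property.
    return post("/index/node",
                {"name": name, "config": {"provider": "spatial", "wkt": wkt_property}})


def add_node_to_layer(layer, node_id):
    # Step 3, method 1: add the node directly to the layer.
    return post("/ext/SpatialPlugin/graphdb/addNodeToLayer",
                {"layer": layer, "node": f"{BASE}/node/{node_id}"})


def add_node_to_index(index, node_id):
    # Step 3, method 2: add the node through the index (creates a copy node).
    return post(f"/index/node/{index}",
                {"value": "dummy", "key": "dummy", "uri": f"{BASE}/node/{node_id}"})


def send(request):
    """Send a prepared request and decode the JSON response (needs a live server)."""
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, send(create_spatial_index("test")) followed by send(add_node_to_layer("test", 4)) would reproduce step 1 and method 1, assuming the node created in step 2 really has id 4.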
4) Run your query. You can do a query in REST or Cypher (assuming you conditioned the nodes as described above). The Cypher queries available are: 'withinDistance', 'withinWKTGeometry', and 'bbox'. The REST queries available are: 'findGeometriesWithinDistance', 'findClosestGeometries', and 'findGeometriesInBBox'. It's interesting to note that only Cypher allows you to query for nodes within a WKT geometry. There's also a difference in REST between the findClosestGeometries and findGeometriesWithinDistance that I don't yet understand, even though the arguments are the same. To see how to make the REST calls, you can issue these commands:
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findGeometriesWithinDistance
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findClosestGeometries
POST http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findGeometriesInBBox
The Cypher queries are: (replace text between '<>', including the '<>', with actual values)
START n = node:<layer>("withinDistance:[<y>, <x>, <max distance in km>]")
START n = node:<layer>("withinWKTGeometry:POLYGON((<x1> <y1>, ..., <xN> <yN>, <x1> <y1>))")
START n = node:<layer>("bbox:[<min x>, <max x>, <min y>, <max y>]")
I have assumed in all of this that you are using a longitude/latitude coordinate reference system (CRS), so x is longitude and y is latitude. (This preserves a right-handed coordinate system in which z is up.)
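Since the argument order differs between the three query types, it can help to generate the query text from named arguments. The helpers below are a hypothetical convenience sketch: the START clauses follow the templates above, while the trailing RETURN n and the function names are my own additions.

```python
def within_distance(layer, lat, lon, max_km):
    """withinDistance takes y (latitude) first, then x (longitude)."""
    return f'START n = node:{layer}("withinDistance:[{lat}, {lon}, {max_km}]") RETURN n'


def within_bbox(layer, min_lon, max_lon, min_lat, max_lat):
    """bbox takes both x bounds first, then both y bounds."""
    return (f'START n = node:{layer}'
            f'("bbox:[{min_lon}, {max_lon}, {min_lat}, {max_lat}]") RETURN n')


def within_polygon(layer, ring):
    """ring is a list of (lon, lat) pairs; it is closed automatically if needed."""
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])  # WKT polygons must repeat the first point
    wkt = ", ".join(f"{x} {y}" for x, y in points)
    return f'START n = node:{layer}("withinWKTGeometry:POLYGON(({wkt}))") RETURN n'
```

For the Fooville example, within_distance("test", 11.5, 11.5, 10.0) should match the polygon created in step 2. The layer name is interpolated as-is, so only pass trusted values.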
I'd like to be able to order my search results by score and location. Each user in the DB has lat/lon and I am currently indexing:
location :coordinates do
Sunspot::Util::Coordinates.new latlon[0], latlon[1]
end
The model I would be searching against is also indexed in the same manner. Essentially, what I am trying to achieve is for the results to be ordered by score and then by location. So if I search for Walmart, I would like to see all Walmarts ordered by their geographic proximity to my location.
I remember reading something about Solr's new geo-sort, but I'm not sure if it is out of alpha and/or if Sunspot has implemented a wrapper.
What would you recommend?
Because of the way that Sunspot calculates location types you'll need to do some extra leg work to have it sort by distance from your target as well. The way it works is that it creates a geo-hash for each point and then searches using regular fulltext search on that geo-hash. The result is that you probably won't be able to determine if a point 10km away is further than a point that is 5km away, but you'll be able to tell if a point 50km away is further than a point 1-2km away. The exact distances are arbitrary but the result is that you probably won't have as fine-grained of a result as you would like and the search acts more as a way to filter points that are within an acceptable proximity. After you have filtered your points using the built-in location search, there are three ways to accomplish what you want:
Upgrade to Solr 3.1 or later and upgrade your schema.xml to use the new spatial search columns. You'll then need to make custom modifications to Sunspot to create fields and orderings that work with these new data types. As far as I know these aren't available in Sunspot yet, so you'll have to make those connections on your own and you'll have to dig around in Solr to do some manual configurations.
Leverage the Spatial Solr Plugin. You'll have to install a new JAR into your Solr directory and you'll have to make some modifications to Sunspot, but they are relatively painless and the full instructions can be found here.
Leverage your DB. If your DB is also indexed on the location columns, you can use the Sunspot built-in location search to filter your results down to a reasonably sized set. You can then query the DB for those results and order them by proximity to your location using your own distance function.
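To illustrate the third option, here is a rough sketch of the idea in Python rather than Ruby/Sunspot: a geohash prefix acts as the coarse filter, and the survivors are then ordered by score and by an exact distance. This is not Sunspot's implementation; the record layout, the prefix length and the equirectangular distance approximation are all assumptions chosen for the example.

```python
from math import cos, radians, sqrt

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, precision=6):
    """Standard geohash: interleave longitude/latitude bisection bits, 5 bits per char."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | int(bit)
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def approx_km(lat1, lon1, lat2, lon2):
    """Equirectangular approximation, adequate for ordering nearby points."""
    x = radians(lon2 - lon1) * cos(radians((lat1 + lat2) / 2))
    y = radians(lat2 - lat1)
    return 6371.0 * sqrt(x * x + y * y)


def search(records, lat, lon, prefix_len=3):
    """Coarse filter on a shared geohash cell, then order by score and distance."""
    cell = geohash(lat, lon, prefix_len)
    nearby = [r for r in records if geohash(r["lat"], r["lon"], prefix_len) == cell]
    return sorted(
        nearby,
        key=lambda r: (-r["score"], approx_km(lat, lon, r["lat"], r["lon"])),
    )
```

The cell filter shows why the built-in search is coarse: every point in the same cell looks equally close, and a point just across a cell boundary is missed entirely unless neighbouring cells are searched as well. The final sort is where the fine-grained ordering comes from.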