searching for closest number in a rails app - ruby-on-rails

Given two parameters that correspond to two attributes on an object, how can one find the 20 records in a database that are closest to those two numbers?
The parameters you have are x and y. The object also has those attributes. For example, x = 1 and y = 9999. You need to find the records that are closest to x and y.

That depends on how you define the distance between two points. If you are using a two-dimensional cartesian coordinate system, this SQL statement will work:
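SELECT id, x, y FROM points ORDER BY SQRT(POWER((:input_x - x), 2) + POWER((:input_y - y), 2)) ASC LIMIT 20;
Where :input_x and :input_y are the inputs, passed as bound parameters. They need names distinct from the columns, because unquoted SQL identifiers are case-insensitive and X - x would always evaluate to 0.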
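As a rough illustration of the same ordering outside the database, here is a minimal Ruby sketch that ranks in-memory records by squared Euclidean distance. The Point struct, the closest_points helper, and the sample data are all made up for this example; with a real table, the SQL statement lets the database do this work.

```ruby
# Hypothetical in-memory stand-in for rows of the `points` table.
Point = Struct.new(:id, :x, :y)

# Return the `limit` points closest to (input_x, input_y).
# Squared distance yields the same ordering as the SQRT form,
# so the square root is skipped.
def closest_points(points, input_x, input_y, limit = 20)
  points.min_by(limit) do |p|
    (input_x - p.x)**2 + (input_y - p.y)**2
  end
end

points = [
  Point.new(1, 0, 10_000),
  Point.new(2, 5, 9_990),
  Point.new(3, 500, 500)
]

nearest = closest_points(points, 1, 9_999, 2)
puts nearest.map(&:id).inspect
```

Dropping the square root is safe here only because the values are used for ordering; if the actual distance is needed, take the square root of the result.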

It sounds like you're using geolocated data. If your database backend is Postgres, check to see if you have or can install the PostGIS extensions. This gives you very fast tools which give you searches like 'search for the nearest thing to this point', 'search for everything within this circle', 'search for everything within this square', and so on.
http://postgis.refractions.net/
You would do something like this:
CREATE INDEX [indexname] ON [tablename] USING GIST ( [geometrycolumn] gist_geometry_ops);
Then you can do something like this - find everything within 100 metres of a point:
SELECT * FROM GEOTABLE WHERE
GEOM && GeometryFromText('BOX3D(900 900,1100 1100)',-1) AND
Distance(GeometryFromText('POINT(1000 1000)',-1),GEOM) < 100;
Examples from the manual.

Related

Neo4j generate heatmap data from X and Y coordinates

I need to generate heatmap data in CSV format, something like this:
X,Y,OCCURRENCES
269,697,41
199,493,8
125,318,2
205,526,24
261,572,2
My neo4j database has an entity called "Point" that contains a date, an X and a Y coordinate and it looks like this:
Point: {
"at": "2018-06-26T06:54:42.671141000+12:00"
"locationPlanX": 367,
"locationPlanY": 716
}
I have a query that gives the desired output. It works well with a few thousand points, but it starts to struggle with millions.
Query:
MATCH (point:Point)
WHERE datetime("2018-06-22T15:00:00.000000+12:00") <= point.at < datetime("2018-06-23T16:00:00.000000+12:00")
AND point.locationPlanX >= 0
AND point.locationPlanY >= 0
WITH point.locationPlanX as x, point.locationPlanY as y, COUNT(point) AS occurrences
RETURN x, y, occurrences
As I said before, the query works well for an hour of data, but it starts to struggle with days/weeks.
Is there any other thing I can do to improve my query? Or any other way to do it?
UPDATE: The 3 properties in the node are indexed.
You should create an index on :Point(at):
CREATE INDEX ON :Point(at);
That would allow your query to avoid scanning through every Point node to find the ones with acceptable at values. This should greatly speed up your query.
Also, if it is not necessary to test locationPlanX and locationPlanY for non-negativity, eliminate those tests.
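To make the expected output shape concrete, here is a small Ruby sketch of the same group-and-count step done in memory. The samples array and the heatmap_rows helper are hypothetical and only illustrate what the query computes; this is not a replacement for running the aggregation in Neo4j.

```ruby
# Hypothetical sample of Point nodes, reduced to the two coordinates.
samples = [
  { x: 269, y: 697 },
  { x: 199, y: 493 },
  { x: 269, y: 697 },
  { x: -1,  y: 10 } # dropped by the non-negativity filter
]

# Group by (x, y) and count, mirroring the WITH ... COUNT(point) step.
def heatmap_rows(samples)
  samples
    .select { |s| s[:x] >= 0 && s[:y] >= 0 }
    .map { |s| [s[:x], s[:y]] }
    .tally
    .map { |(x, y), occurrences| [x, y, occurrences] }
end

rows = heatmap_rows(samples)
csv = (["X,Y,OCCURRENCES"] + rows.map { |r| r.join(",") }).join("\n")
puts csv
```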

neo4j - shortest path with conditions include plugin functions

I have a problem with an implementation in Cypher. I have a database model, which is shown here as an overview: https://www.instpic.de/QTIhBbPgVHBHg5pKwVdk.PNG
A short explanation: the red nodes represent star systems and the yellow ones jump points. Each jump point has a certain size, which determines which bodies can pass through it. The size is stored as a property on the relationship between the yellow nodes. Beneath the red nodes are other nodes that represent the orbiting celestial bodies of a star system (planets, moons, stations, etc.). Now, from any point within a solar system (planet, station, moon), I would like to find the shortest path to another point in the same solar system or in a different one. In addition, I can calculate the distance between two celestial bodies within a system using a plugin that I have written. This value should be taken into account when finding the path, so that I get the shortest path in the database and also the smallest distance between the celestial bodies within a solar system. I already have a query, but unfortunately it fails partly because of its performance. I also think that the paths here are very variable, so I am open to changing the database model.
Here is part of the actual query I am using:
MATCH (origin:Marketplace)
WHERE origin.eid = 'c816c4fa501244a48292f5d881103d7f'
OPTIONAL MATCH (marketplace:Marketplace)-[:Sell]->(currentPrice:Price)-[:Content]->(product:Product)
OPTIONAL MATCH p = shortestPath((origin)-[:HasMoon|:HasStation|:HasLandingZone|:HasPlanet|:HasJumpPoint|:CanTravel*]-(marketplace))
WHERE SIZE([rel in relationships(p) WHERE EXISTS(rel.size)]) <= 3 AND ALL(rel IN [rel in relationships(p) WHERE EXISTS(rel.size)] WHERE rel.size IN ['small', 'medium', 'large'])
WITH origin, marketplace, p, currentPrice, product
CALL srt.getRoutes(origin, marketplace, p) YIELD node, jump_sizes, jump_gates, jump_distance, hops, distance
OPTIONAL MATCH (currentPrice)-[:CompletedVotes]->(:Wrapper)-[:CompletedVote]->(voteHistory:CompletedVote)
OPTIONAL MATCH (currentPrice)-[:CurrentVote]->(vote:Vote)-[:VotedPrices]->(currentVotings)
WITH node, currentPrice, product, jump_sizes, jump_gates, jump_distance, hops, distance, voteHistory, currentVotings, vote, origin
WITH {eid: product.eid, displayName: product.displayName, name: product.name, currentPrice: {eid: currentPrice.eid, price: currentPrice.price}, currentVoting: {approved: vote.approved, count: Count(currentVotings), declined: vote.declined, users: Collect(currentVotings.userId), votes: Collect(currentVotings.price), voteAvg: round(100 * avg(currentVotings.price)) / 100}, voteHistory: Collect({votings: voteHistory.votings, users: voteHistory.users, completed: voteHistory.completed,
vote: voteHistory.votes}), marketplace: {eid: node.eid, name: node.name, type: node.type, designation: node.designation}, travel: {jumpSizes: jump_sizes, jumpGate: jump_gates, jumpDistance: jump_distance, jumps: hops, totalDistance: distance}} as sellOptions, currentPrice ORDER BY currentPrice.price
WITH Collect(sellOptions) as sellOptions
For the moment, this query works pretty well, but now I want to filter (after the "... 'medium', 'large'])" predicate on line 5) by the minimum total distance you need to travel to reach your destination. I would like to do this with my plugin, which calculates the total distance of a path (getTotalDistance(path AS PATH)).
Additional note 1: when I cut off 'big' from the possible jump sizes, I get no result, but there is still a path in my graph that leads me to the goal.
Additional note 2: I am working on Neo4j 3.3.1 and I have set this config option:
cypher.forbid_shortestpath_common_nodes=false
which does not work in 3.3.3.
EDIT 1: (More detailed explanation)
I have a place where I am. Then I search for marketplaces that sell some product. For this I can specify further filters. For example, I can say that I can travel only through jump points of the size "large". Also, I only want marketplaces that are 4 systems away.
Now, looking in the database with the above restrictions, I search for the shortest path to the marketplaces I found.
It may well be that several paths meet the conditions. If this is the case, I would like to pick, out of all the shortest paths, the one with the smallest distance to cover within each solar system.
Is that accurate enough? If not, please let me know.
The latest APOC releases may be able to help here, though the APOC path expanders work best with labels and relationship types, so a slight change to your model may be needed.
In particular, the size of your jump points. Right now this is a property on the relationships between them, but for APOC to work optimally, these might be better modeled with the size as a label on the :JumpPoint nodes themselves, so you might have :JumpPoint:Small, :JumpPoint:Medium, and :JumpPoint:Large (you can add this in addition to the rel properties if you like).
Keep in mind this approach will be more complex than shortestPath(), as the idea is we're trying to find systems within a certain number of jumps, then find :Marketplaces reachable at those star systems, then filter based on whether they sell the product we want, and we'll stitch the path together as we find the pieces.
MATCH localSystemPath = (origin:Marketplace)-[*]-(s:Solarsystem)
WHERE origin.eid = $originId
WITH origin, localSystemPath, s
LIMIT 1
WITH origin, localSystemPath, s,
CASE WHEN coalesce($maxJumps, -1) = -1
THEN -1
ELSE 3*$maxJumps
END as maxJumps,
CASE $shipSize
WHEN 'small' THEN ''
WHEN 'medium' THEN '|-Small'
ELSE '|-Small|-Medium'
END as sizeBlacklist
CALL apoc.path.spanningTree(s,
{relationshipFilter:'HasJumpPoint|CanTravel>', maxLevel:maxJumps,
labelFilter:'>Solarsystem' + sizeBlacklist, filterStartNode:true}) YIELD path as jumpSystemPath
WITH origin, localSystemPath, jumpSystemPath, length(jumpSystemPath) / 3 as jumps, last(nodes(jumpSystemPath)) as destSystem
MATCH destSystemPath = (destSystem)-[*]-(marketplace:Marketplace)
WHERE none(rel in relationships(destSystemPath) WHERE type(rel) = 'HasJumpPoint')
AND <insert predicate for filtering which :Marketplaces you want>
WITH origin, apoc.path.combine(apoc.path.combine(localSystemPath, jumpSystemPath), destSystemPath) as fullPath, jumps, destSystem, marketplace
CALL srt.getRoutes(origin, marketplace, fullPath) YIELD node, jump_sizes, jump_gates, jump_distance, hops, distance
...
This assumes parameter inputs of $shipSize for the minimum size of all jump gates to pass through, $originId as the id of the origin :Marketplace (plus you DEFINITELY need an index or unique constraint on :Marketplace(eid) for fast lookups here), and $maxJumps, for the maximum number of jumps to reach a destination system.
Keep in mind the expansion procedure used, spanningTree(), will only find the single shortest path to another system. If you need all possible paths, including multiple paths to the same system, then change the procedure to expandConfig() instead.

How to find nodes being contained in a node's properties interval?

I'm currently developing a kind of configurator using neo4j as a backend. I've now run into a problem that I don't know how best to solve.
I've got nodes created like this:
(A:Product {name:'ProductA', minWidth:20, maxWidth:200, minHeight:10, maxHeight:400})
(B:Product {name:'ProductB', minWidth:40, maxWidth:100, minHeight:20, maxHeight:300})
...
There is an interface where the user can input a desired width and height, e.g. Width=30, Height=250. Now I'd like to check which products match the input criteria. As the input might be any long value, the approach used in http://neo4j.com/blog/modeling-a-multilevel-index-in-neoj4/ with dates doesn't seem to be suitable for me. How can I run a cypher query giving me all the nodes matching the input criteria?
I'm not sure I fully understand what you are asking for, but if I do, here is a simple query that does it.
Assuming the user wants width = 30 and height = 50 (with the min and max bounds treated as inclusive):
MATCH (p:Product)
WHERE
p.minWidth <= 30 AND p.maxWidth >= 30 AND
p.minHeight <= 50 AND p.maxHeight >= 50
RETURN
p
If this is not what you are looking for, feel free to say so in a comment.
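For comparison, the same interval check can be sketched in plain Ruby. The Product struct and matching_products helper are hypothetical names for this example, and the bounds are treated as inclusive, which is an assumption about the intended semantics of the min and max properties.

```ruby
# Hypothetical in-memory version of the Product nodes from the question.
Product = Struct.new(:name, :min_width, :max_width, :min_height, :max_height)

# Keep the products whose width and height intervals both contain the input.
# Comparable#between? is inclusive on both ends.
def matching_products(products, width, height)
  products.select do |p|
    width.between?(p.min_width, p.max_width) &&
      height.between?(p.min_height, p.max_height)
  end
end

products = [
  Product.new("ProductA", 20, 200, 10, 400),
  Product.new("ProductB", 40, 100, 20, 300)
]

matches = matching_products(products, 30, 250)
puts matches.map(&:name).inspect
```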

PostGIS: How to find N closest sets of points to a given set?

I am using PostGIS/Rails and have sets of points with geolocations.
class DataSet < ActiveRecord::Base # these are the sets containing the points
  has_many :raw_data
  # attributes: id, name
end
class RawData < ActiveRecord::Base # these are the data points
  belongs_to :data_set
  # attributes: id, location which is "Point(lon,lat)"
end
For a given set of points I need to find the N closest sets and their distance;
or alternatively:
For a given max distance and set of points I need to find the N closest sets.
What is the best way to do this with PostGIS?
My versions are PostgreSQL 9.3.4 with PostGIS 2.1.2
The answer on how to find the N closest neighbours in PostGIS is given here:
Postgis SQL for nearest neighbors
To summarize the answer there:
You need to create a geometry object for your points. If you are using latitude, longitude, you need to use 4326.
UPDATE season SET geom = ST_PointFromText ('POINT(' || longitude || ' ' || latitude || ')' , 4326 ) ;
Then you create an index on the geom field
CREATE INDEX [indexname] ON [tablename] USING GIST ( [geometryfield] );
Then you get the kNN neighbors:
SELECT *,ST_Distance(geom,'SRID=4326;POINT(newLon newLat)'::geometry)
FROM yourDbTable
ORDER BY
yourDbTable.geom <->'SRID=4326;POINT(newLon newLat)'::geometry
LIMIT 10;
Where newLon newLat are the query point's coordinates.
This query will take advantage of kNN functionality of the gist index (http://workshops.boundlessgeo.com/postgis-intro/knn.html).
Still the distance returned will be in degrees, not meters (projection 4326 uses degrees).
To fix this:
SELECT *,ST_Distance(geography(geom),ST_GeographyFromText('POINT(newLon newLat)'))
FROM yourDbTable
ORDER BY
yourDbTable.geom <->'SRID=4326;POINT(newLon newLat)'::geometry
LIMIT 10;
When you calculate the ST_distance use the geography type. There the distance is always in meters:
http://workshops.boundlessgeo.com/postgis-intro/geography.html
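To see why a distance in degrees is hard to interpret, the following Ruby sketch computes great-circle distances in meters with the haversine formula. This is only an approximation on a sphere with an assumed mean radius; it is not the calculation PostGIS performs for the geography type, and the haversine_m helper is a name made up for this example.

```ruby
EARTH_RADIUS_M = 6_371_000.0 # mean Earth radius; an assumption of this sketch

def to_rad(deg)
  deg * Math::PI / 180.0
end

# Great-circle distance in meters between two lon/lat points given in degrees.
def haversine_m(lon1, lat1, lon2, lat2)
  dlat = to_rad(lat2 - lat1)
  dlon = to_rad(lon2 - lon1)
  a = Math.sin(dlat / 2)**2 +
      Math.cos(to_rad(lat1)) * Math.cos(to_rad(lat2)) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a))
end

# One degree of longitude at the equator versus at 60 degrees north.
equator = haversine_m(0, 0, 1, 0)
north   = haversine_m(0, 60, 1, 60)
puts equator.round, north.round
```

The two point pairs are both exactly one degree apart, yet the second pair is roughly half as far apart in meters, which is why a metric distance is usually what you want to report.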
All this functionality will probably need a recent Postgis version (2.0+). I am not sure though.
Check this for reference https://gis.stackexchange.com/questions/91765/improve-speed-of-postgis-nearest-neighbor-query/
EDIT. This covers the necessary steps for one point. For a set of points:
SELECT n1.*,n2.*, ST_Distance(n1.geom,n2.geom)
FROM yourDbTable n1, yourDbTable n2
WHERE n1.setId=1 AND n2.setId=2 -- your condition here for the separate sets
AND n1.id<>n2.id -- in case the same object belongs to 2 sets
ORDER BY n1.geom <->n2.geom
LIMIT 20;
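The question asks for the N closest sets rather than the closest point pairs. One possible reading, sketched below in Ruby, defines the distance between two sets as the minimum pairwise distance between their points and ranks the other sets by it. That definition, the planar coordinates, and the helper names set_distance and closest_sets are all assumptions of this sketch, not something the answer specifies.

```ruby
# Minimum pairwise Euclidean distance between two lists of [x, y] points.
def set_distance(a, b)
  a.product(b).map { |(x1, y1), (x2, y2)| Math.hypot(x1 - x2, y1 - y2) }.min
end

# Rank every other set by its distance to the target set and keep the n nearest.
# `sets` is a hypothetical hash of the form { set_id => [[x, y], ...] }.
def closest_sets(sets, target_id, n)
  target = sets.fetch(target_id)
  sets
    .reject { |id, _| id == target_id }
    .map { |id, pts| [id, set_distance(target, pts)] }
    .min_by(n) { |_, dist| dist }
end

sets = {
  1 => [[0, 0], [1, 1]],
  2 => [[4, 5], [10, 10]],
  3 => [[1, 2], [50, 50]]
}

ranked = closest_sets(sets, 1, 2)
puts ranked.inspect
```

This brute-force version compares every pair of points, so it is only meant to pin down the semantics; on real data the index-backed query is the practical route.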

Order by nearest - PostGIS, GeoRuby, spatial_adapter

I'm trying to do an order query that finds records nearest to the current_user.
I know the distance between the two points is: current_location.euclidean_distance(@record.position)
How can I work this into a PostGIS (or active_record/spatial_adapter) query?
To get the 5 closest:
SELECT * FROM your_table
ORDER BY ST_Distance(your_table.geom, ST_Geomfromtext(your point as wkt))
limit 5;
If you have a big dataset and know that you don't want to search further than, say, 1 km, the query will be more efficient if you do:
SELECT * FROM your_table
WHERE ST_DWithin(your_table.geom, ST_Geomfromtext(your point as wkt), 1000)
ORDER BY ST_Distance(your_table.geom, ST_Geomfromtext(your point as wkt))
limit 5;
/Nicklas
Just in case somebody stumbles upon this issue in rails 4.
I am using rgeo gem and this works for me
scope :closest, ->(point) { order("ST_Distance(lonlat, ST_GeomFromText('#{point.as_text}', #{SRID}))").limit(5) }
To wrap this up, with everyone's help I've got it working how I wanted:
order("ST_Distance(items.position, ST_GeomFromText('POINT (#{current_location.y} #{current_location.x})', #{SRID}))")
If you really want to find literally the 5 records nearest to the current_user, consider neighborhood search, which is supported by KNN index in PostGIS 2.0 (see '<->' operator):
SELECT * FROM your_table ORDER BY your_table.geom <-> ST_Geomfromtext(your point as wkt) LIMIT 5
Look at the ST_Distance documentation in PostGIS.
If you don't NEED to use PostGIS, geo-kit does this perfectly using Google or Yahoo (I've only used Google), and in your queries you can sort by distance. It's awesome.

Resources