In my database I have these records:
path :[ { location : "some gps coords" , time : "some time"}, etc ]
Each "path" record represents a real path.
How can I check whether any two paths are equal?
Thanks!
How to get the distance or radians between two points on the earth with lng and lat?
You probably don't want mapReduce in this case, but rather the aggregation framework. Besides a general first-stage query, the pipeline can start with $geoNear, which is more efficient for your purpose:
db.places.aggregate([
    { "$geoNear": {
        "near": {
            "type": "Point",
            "coordinates": [ -88 , 30 ]
        },
        "distanceField": "dist"
    }},
    { "$match": {
        "loc": {
            "$geoWithin": {
                "$centerSphere": [ [ -88 , 30 ] , 0.1 ]
            }
        }
    }}
])
Or, frankly, because the initial $geoNear stage "projects" an additional field into each document containing the "distance" from the queried "point of origin", you can just "filter" on that field in a subsequent stage:
db.places.aggregate([
    { "$geoNear": {
        "near": {
            "type": "Point",
            "coordinates": [ -88 , 30 ]
        },
        "distanceField": "dist"
    }},
    { "$match": {
        // "dist" is in meters here, because "near" is a GeoJSON point
        "dist": { "$lte": 0.1 }
    }}
])
Since this option "produces/projects" a value representing the distance in the result, it satisfies your first criterion. The "chaining" nature of the aggregation framework allows "additional filtering", or any other operation you need, after the initial query.
$geoWithin works just as well in a $match stage of the aggregation framework as it does in any standard query, since it does not "depend" on a geospatial "index" being present. It performs better in an initial query with one, but it does not need it.
Since your requirement is the "distance" from the point of origin, the most logical thing to do is to perform an operation that returns that information, as this one does.
I would love to include all of the relevant links in this response, but as a new responder, two links is all I am allowed for now.
One more relevant note:
The measurement of "distance" or "radius" in any operation depends on how your data is stored. If it is in a "legacy" format ("key/pair" or "plain array"), the value is expressed in "radians"; where the data is expressed in GeoJSON format on the "location", the "distance data" is expressed in "meters" instead.
That is an important consideration given the libraries implemented by the MongoDB service and how this interacts with the data as you have it stored. There is of course documentation on this in the official resources should you care to look at that properly. And again, I cannot add those links at this time, unless this response sees some much needed love.
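To make the unit difference concrete, here is a minimal Python sketch of converting between the two. The radius constant is an assumption (about 6378.1 km, the equatorial figure the MongoDB documentation uses for this kind of conversion), and the function names are made up for illustration.

```python
# Assumed Earth radius in meters (equatorial, approx. 6378.1 km).
EARTH_RADIUS_M = 6_378_100.0


def radians_to_meters(angle_rad: float) -> float:
    """Convert a distance expressed as an angle (legacy format) to meters."""
    return angle_rad * EARTH_RADIUS_M


def meters_to_radians(distance_m: float) -> float:
    """Convert a distance in meters (GeoJSON format) to an angle in radians."""
    return distance_m / EARTH_RADIUS_M


# A radius of 0.1 radians is a very large circle: roughly 637.8 km.
radius_m = radians_to_meters(0.1)
print(round(radius_m / 1000, 1), "km")
```

So the same literal 0.1 means very different things depending on which format the operation is using.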
https://github.com/mongodb/mongo/blob/master/src/third_party/s2/s2latlng.cc#L37
GetDistance() returns an S1Angle, and S1Angle::radians() returns the angle in radians.
This belongs to the s2-geometry-library (Google Code is closing down, so I exported the Java version to my GitHub).
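For illustration, the angle between two lat/lng points can also be computed by hand with the haversine formula. This is a small Python sketch, not a port of the S2 code; the function name and the Earth radius constant are assumptions.

```python
import math

EARTH_RADIUS_KM = 6378.1  # assumed radius, only needed to turn radians into km


def central_angle(lat1, lng1, lat2, lng2):
    """Great-circle angle in radians between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above 1 for antipodal points.
    return 2 * math.asin(min(1.0, math.sqrt(h)))


# A quarter of the way around the equator is an angle of pi / 2.
angle = central_angle(0.0, 0.0, 0.0, 90.0)
distance_km = angle * EARTH_RADIUS_KM
```

Multiplying the angle by a radius gives the distance in whatever unit the radius is in.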
I'm trying to do an exact location search, meaning that each term in the location should exactly match at least one location field. For example, if I search for "Sudbury, Middlesex, Massachusetts" then I want to only get results that have an exact match for each of those three terms. A result with location.city.name = Sudbury, location.county.name = Middlesex, and location.region.name = Massachusetts would match.
{
  "multi_match": {
    "fields": [
      "location.city.name",
      "location.region.name",
      "location.county.name",
      "location.country.name"
    ],
    "query": "Sudbury, Middlesex, Massachusetts",
    "type": "cross_fields",
    "operator": "and"
  }
}
This is very close, however I also get results for "East Sudbury." I don't want East Sudbury, I only want results that match the field exactly. How can I do this? I know that "type":"phrase" is wrong because then it would be searching for the entire phrase "Sudbury, Middlesex, Massachusetts" in each field and would get no results.
Sounds like the field location.city.name is being analysed, which splits 'East Sudbury' into 'East' and 'Sudbury', so that document is returned for a search for 'Sudbury'.
Try setting the field to not_analyzed if you are always searching for specific terms.
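To see why analysis causes the extra hit, here is a rough Python simulation. The analyze function is a crude stand-in (lowercase, split on whitespace), not Elasticsearch's real analyzer, and all names here are made up for illustration.

```python
def analyze(text):
    """Crude stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def matches_analyzed(field_value, term):
    """An analyzed field matches if the term equals any of its tokens."""
    return term.lower() in analyze(field_value)


def matches_not_analyzed(field_value, term):
    """A not_analyzed field is kept as one value and must match as a whole."""
    return field_value == term


cities = ["Sudbury", "East Sudbury"]
analyzed_hits = [c for c in cities if matches_analyzed(c, "Sudbury")]
exact_hits = [c for c in cities if matches_not_analyzed(c, "Sudbury")]
# analyzed_hits -> ["Sudbury", "East Sudbury"]
# exact_hits    -> ["Sudbury"]
```

Note that with a whole-value match, case and whitespace differences also stop counting as matches.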
My Rails app, which uses elasticsearch (with searchkick), works just fine with the _geo_distance ordering function; however, I need a more complex ordering that includes location AND an attempt to promote an exact string match on the business name.
For example, if I make a query and 10 results are returned in ascending order of distance, but the #5 result is also an exact string match on the business name in the record, I would like to promote/elevate that to the #1 position (basically overriding the distance sorting for that record).
There are two ways I can see to try to solve this issue, but I am running into issues with both.
First, would be to do this on the initial query, so that elasticsearch handles the work.
Second, would be to do some type of post-process re-sort on the result returned by elasticsearch to look for an exact match and re-order if needed.
The issue with the first method is that the built in scoring mechanisms seem to shift completely to distance when invoking _geo_distance, leaving me to wonder how to mix custom scoring functions with location.
And the issue with the second method is that the search results returned are a custom type of SearchKick object that does not seem to like normal array or hash sorting mechanisms for a post-process.
Is there a way to do something pre- or post- query to promote a document in the results in this manner?
Thanks.
In fact, there are many ways to "control" the scoring. If you already know before indexing that some document is meant to get a high score/boost, you can give that document a boost at index time; please see the reference here.
If you cannot determine the boost before indexing, you can boost in the query itself. There are many options for boosting at query time, and they depend on what kind of query you use.
For query string query:
You can boost some fields, such as "fields" : ["content", "name.*^5"], or boost some query terms, such as quick^2 fox (this might work for you; just give the name an extra boost).
For others:
You can give a boost to a term query, such as boosting the "ivan" case:
"term" : {"name" : {"value" : "ivan","boost" : 10.0}}
You can wrap it in a bool query and boost the desired case, e.g. find all 'ivan' and boost 'ji' on the name field:
{ "query" : { "bool" : { "must": [{"match":{"name":"ivan"}}],
"should" : [ { "term" : { "name": { "value" : "ji", "boost" : 10 }}}]}}}
Besides the term query, many other queries support boost, such as the prefix query and the match query. You can use whichever fits your situation. Here are some official examples: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_boosting_query_clauses.html
Boosting might not be an easy way to control the score, because it involves normalization. You can use the function_score query to specify the score directly; it's a really useful query if you need more direct control.
In short, you can wrap your query in a bool query and add a boost for the name match, as follows:
{ "query" : {
    "bool" : {
      "must": [
        {"filtered" : {
          "filter" : {
            "geo_distance" : {
              "distance" : "2000km",
              "loc" : {
                "lat" : 10,
                "lon" : 10
              }
            }
          }
        }}],
      "should" : [ { "term" : { "name": { "value" : "ivan", "boost" : 10 }}}]}},
  "sort" : [
    "_score",
    {
      "_geo_distance" : {
        "loc" : [10, 10],
        "order" : "asc",
        "unit" : "km",
        "mode" : "min",
        "distance_type" : "sloppy_arc"
      }
    }
  ]
}
For more detail, you can check my gist https://gist.github.com/hxuanji/e5acd9a5174ea10c08b8. I boosted the "ivan" name; in the result, the "ivan" document comes first rather than the (10,10) document.
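If you would rather go the post-processing route from the question, the same "exact match first, then distance" ordering can be sketched on plain data. This is Python for illustration only (in the Rails app it would be Ruby, after converting the Searchkick results to plain hashes); the field names name and distance_km and the function name are hypothetical.

```python
def promote_exact_matches(results, query):
    """Return results with exact name matches first, then by ascending distance."""
    wanted = query.strip().lower()
    return sorted(
        results,
        # False sorts before True, so exact matches come first;
        # distance breaks ties within each group.
        key=lambda r: (r["name"].strip().lower() != wanted, r["distance_km"]),
    )


results = [
    {"name": "Joe's Pizza Palace", "distance_km": 0.4},
    {"name": "Pizza Hut", "distance_km": 1.1},
    {"name": "Joe's Pizza", "distance_km": 2.5},
]
reordered = promote_exact_matches(results, "Joe's Pizza")
# "Joe's Pizza" now comes first; the others keep their distance order.
```

The comparison here is case-insensitive and ignores surrounding whitespace, which is a design choice you may or may not want.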
My application is trying to match an incoming string against documents in my Mongo Database where a field has a list of keywords. The goal is to see if the keywords are present in the string.
Here's an example:
Incoming string:
"John Doe is from Florida and is a fan of American Express"
The field in the MongoDB documents has a value such as:
in_words: "georgia,american express"
So, the database record has inwords or keywords separated by commas, and some of them are two words or more.
Currently, my RoR application pulls the documents, calls split(',') on the inwords of each one, then loops through the resulting keywords and checks whether each is present in the string.
I really want to find a way to push this type of search into the actual database query in order to speed up the processing. I could change the in_words in the database to an array such as follows:
in_words: ["georgia", "american express"]
but I'm still not sure how to query this?
To sum up, my goal is to find the person that matches an incoming string by comparing a list of inwords/keywords for that person against the incoming string, and to do this query entirely in the database layer.
Thanks in advance for your suggestions
You should definitely split the in_words into an array as a first step.
Your query is still a tricky one.
Next consider using a $regex query against that array field.
Constructing the regex will be a bit hard, since you want to match any single word from your input string or, it appears, any pair of words (how many words?). You may get some further ideas for how to construct a suitable regex from my blog entry here, where I match a substring of the input string against the database (the inverse of a normal LIKE operation).
You can solve this by splitting the long string into separate tokens and putting them into an array, then using an $all query to find the matching keywords.
Check out the sample:
> db.splitter.insert({tags:'John Doe is from Florida and is a fan of American Express'.split(' ')})
> db.splitter.insert({tags:'John Doe is a super man'.split(' ')})
> db.splitter.insert({tags:'John cena is a dummy'.split(' ')})
> db.splitter.insert({tags:'the rock rocks'.split(' ')})
and when you query
> db.splitter.find({tags:{$all:['John','Doe']}})
it would return
{ "_id" : ObjectId("4f9435fa3dd9f18b05e6e330"), "tags" : [ "John", "Doe", "is", "from", "Florida", "and", "is", "a", "fan", "of", "American", "Express" ] }
{ "_id" : ObjectId("4f9436083dd9f18b05e6e331"), "tags" : [ "John", "Doe", "is", "a", "super", "man" ] }
And remember, this operation is case-sensitive.
If you are looking for a partial match, use $in instead of $all.
Also, you probably need to remove the noise words ('a', 'the', 'is', ...) before inserting, for accurate results.
I hope it is clear
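For the original direction of the problem (keywords stored per person, matched against an incoming string), one possible variation is to generate every one-word and two-word phrase from the incoming string and pass them to $in on the in_words array. This is a Python sketch under assumptions: keywords are stored lowercased, phrases are at most two words long, and the matches function only simulates locally what the query would do.

```python
def candidate_phrases(text, max_words=2):
    """All runs of 1..max_words consecutive words from the text, lowercased."""
    words = text.lower().split()
    return [
        " ".join(words[i:i + n])
        for n in range(1, max_words + 1)
        for i in range(len(words) - n + 1)
    ]


incoming = "John Doe is from Florida and is a fan of American Express"
phrases = candidate_phrases(incoming)

# Query document that could be passed to find(): $in on an array field
# matches when any element of the array equals any of the listed values.
query = {"in_words": {"$in": phrases}}


def matches(doc, query):
    """Local simulation of the $in query above, for illustration only."""
    wanted = set(query["in_words"]["$in"])
    return any(word in wanted for word in doc["in_words"])


people = [
    {"name": "A", "in_words": ["georgia", "american express"]},
    {"name": "B", "in_words": ["texas", "visa"]},
]
hits = [p["name"] for p in people if matches(p, query)]
# hits -> ["A"], because "american express" is one of the generated phrases
```

Raising max_words covers longer keywords at the cost of a longer $in list.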
I need to get all the buildings with the "church" function that are 100km from a specified point (lat, lng). I did it this way:
[{
  "id": null,
  "name": null,
  "type": "/architecture/building",
  "building_function" : [{"name" : 'church'}],
  "/location/location/geolocation" : {"latitude" : 45.1603653, "longitude" : 10.7976976},
  "/location/location/area" : 100
}]
but I always get an empty response:
code: "/api/status/ok"
result: []
status: "200 OK"
transaction_id: "cache;cache03.p01.sjc1:8101;2011-04-16T12:32:45Z;0035"
What am I missing?
Thanks
An area isn't a distance and you probably don't want an exact match to the value "100" anyway. You've asked for things which are precisely at that long/lat and have exactly that area.
Are you looking for churches which are less than a certain distance, more than a certain distance, or exactly the given distance? You probably want to look at the Geosearch API http://api.freebase.com/api/service/geosearch?help (although it's not a long term solution since it's been deprecated)
The /location/location/area property is used to query locations which cover a certain amount of area. So your query looks for buildings centered at (45.1603653, 10.7976976) which cover an area of 100km. Naturally there are no results that match.
Searching for topics within 100km of those coordinates takes a little more work. You'll need to use the Geosearch service, which is still in alpha. The following query should give you the results that you're looking for:
http://www.freebase.com/api/service/geosearch?location={%22type%22:%22Point%22,%22coordinates%22:[10.7976976,45.1603653]}&type=/architecture/building&within=100&indent=1
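If you build that URL in code rather than by hand, the JSON in the location parameter needs to be percent-encoded. Here is a minimal Python sketch that only constructs the string and makes no request; the endpoint and parameter names are taken from the URL above, the function name is made up, and treating within as kilometres is an assumption.

```python
import json
from urllib.parse import urlencode, parse_qs, urlsplit

BASE_URL = "http://www.freebase.com/api/service/geosearch"  # endpoint quoted above


def geosearch_url(lng, lat, type_id, within_km):
    """Build the query URL; note that GeoJSON coordinates are [lng, lat]."""
    location = {"type": "Point", "coordinates": [lng, lat]}
    params = {
        "location": json.dumps(location, separators=(",", ":")),
        "type": type_id,
        "within": within_km,  # assumed to be kilometres
        "indent": 1,
    }
    return BASE_URL + "?" + urlencode(params)


url = geosearch_url(10.7976976, 45.1603653, "/architecture/building", 100)

# Decode the query string again to check that the encoding round-trips.
decoded = parse_qs(urlsplit(url).query)
```

The longitude comes first in the coordinates array, which is the opposite order from the latitude/longitude pair in the MQL query.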
Once you have that list of buildings, you can query the MQL Read API to find out which ones are churches, like this:
[{
"id": null,
"name": null,
"type": "/architecture/building",
"building_function" : [{"name" : 'church'}],
"filter:id|=":[
"/en/verona_arena",
"/en/basilica_palladiana",
"/en/teatro_olimpico",
"/en/palazzo_del_te",
"/en/villa_capra_la_rotonda",
"/en/villa_badoer",
"/en/san_petronio_basilica",
"/en/palazzo_schifanoia",
"/en/palazzo_chiericati",
"/en/basilica_di_santandrea_di_mantova",
"/en/basilica_of_san_domenico",
"/en/castello_estense",
"/en/palazzo_dei_diamanti",
"/en/villa_verdi",
"/en/cathedral_of_cremona",
"/en/monte_berico",
"/en/villa_pojana",
"/en/san_sebastiano",
"/en/cremona_baptistery",
"/en/palazzo_della_pilotta"
]
}]
Right now it's only matching 2 results, so you'll probably need to edit some of those topics to mark them as churches.