I have this relatively complex search query that's already being built and working with perfect sorting.
But I think here searching is slow just because of script so all I want to remove script and write query accordingly.
current code :-
"sort": [
{
"_script": {
"type": "number",
"script": {
"lang": "painless",
"source": "double pscore = 0;for(id in params.boost_ids){if(params._source.midoffice_master_id == id){pscore = -999999999;}}return pscore;",
"params": {
"boost_ids": [
3,
4,
5
]
}
}
}
}]
Above code explaination:-
For example, if a match query would give a result like:
[{m_id: 1, name: A}, {m_id: 2, name: B}, {m_id: 3, name: C}, {m_id: 4, name: D}, ...]
So I want to boost document with m_id array [3, 4, 5] which would then transform the result into:
[{m_id: 3, name: C}, {m_id: 4, name: D}, {m_id: 1, name: A}, {m_id: 2, name: B}, ...]
You can make use of the below query using Function Score Query(for boosting) and Terms Query (used to query array of values)
Note that the logic I've mentioned is in the should clause of the bool query.
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"match_all": {} //just a sample must clause to retrieve all docs
}
],
"should": [
{
"function_score": { <---- Function Score Query
"query": {
"terms": { <---- Terms Query
"m_id": [
3,4,5
]
}
},
"boost": 100 <---- Boosting value
}
}
]
}
}
}
So basically, you can remove the sort logic completely and add the above function query in your should clause, which would give you the results in the order you are looking for.
Note that you'd have to find a way to add the logic correctly in case if you have much complex query, and if you are struggling with anything, do let me know. I'd be happy to help!!
Hope this helps!
Related
I have a problem with WHERE clause in the URL query. Shortly, this works:
http://localhost:8086/query?pretty=true&db=boatdata&q=SELECT time,lat FROM "navigation.position" WHERE time='2021-05-19T11:21:11.448Z'
this doesn’t:
http://localhost:8086/query?pretty=true&db=boatdata&q=SELECT time,lat FROM "navigation.position" WHERE lon='23.53815'
Difference: in first statement I use ‘time’ in the WHERE clause, and in second one I use ‘lon’ instead:
WHERE time='2021-05-19T11:21:11.448Z' vs. WHERE lon='23.53815'
It doesn’t make sense to me why the second one doesn’t work. Any help would be much appreciated. Thanks.
P.S. Here’s an output of these two:
#1:
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "navigation.position",
"columns": [
"time",
"lat"
],
"values": [
[
"2021-05-19T11:21:11.448Z",
60.084066666666665
]
]
}
]
}
]
}
#2
{
"results": [
{
"statement_id": 0
}
]
}
It makes sense - lat (lon) is not a string type.
Filtering, where lat is a string type:
lat='60.084066666666665'
vs. filtering, where lat is a float type:
lat=60.084066666666665
I have a compound index as follows.
index({ account_id: 1, is_private: 1, visible_in_list: 1, sent_at: -1, user_id: 1, status: 1, type: 1, 'tracking.last_opened_at' => -1 }, {name: 'email_page_index'})
Then I have a query with these exact fields,
selector:
{"account_id"=>BSON::ObjectId('id'), "is_private"=>false, "visible_in_list"=>{:$in=>[true, false]}, "status"=>{:$in=>["ok", "queued", "processing", "failed"]}, "sent_at"=>{"$lte"=>2021-03-22 15:29:18 UTC}, "tracking.last_opened_at"=>{"$gt"=>1921-03-22 15:29:18 UTC}, "user_id"=>BSON::ObjectId('id')}
options: {:sort=>{"tracking.last_opened_at"=>-1}}
The winningPlan is the following
"inputStage": {
"stage": "SORT_KEY_GENERATOR",
"inputStage": {
"stage": "FETCH",
"filter": {
"$and": [
{
"account_id": {
"$eq": {
"$oid": "objectid"
}
}
},
{
"is_private": {
"$eq": false
}
},
{
"sent_at": {
"$lte": "2021-03-22T14:06:10.000Z"
}
},
{
"tracking.last_opened_at": {
"$gt": "1921-03-22T14:06:10.716Z"
}
},
{
"status": {
"$in": [
"failed",
"ok",
"processing",
"queued"
]
}
},
{
"visible_in_list": {
"$in": [
false,
true
]
}
}
]
},
"inputStage": {
"stage": "IXSCAN",
"keyPattern": {
"user_id": 1
},
"indexName": "user_id_1",
"isMultiKey": false,
"multiKeyPaths": {
"user_id": []
},.....
And the rejected plan has the compound index and forms as follows
"rejectedPlans": [
{
"stage": "FETCH",
"inputStage": {
"stage": "SORT",
"sortPattern": {
"tracking.last_opened_at": -1
},
"inputStage": {
"stage": "SORT_KEY_GENERATOR",
"inputStage": {
"stage": "IXSCAN",
"keyPattern": {
"account_id": 1,
"is_private": 1,
"visible_in_list": 1,
"sent_at": -1,
"user_id": 1,
"status": 1,
"type": 1,
"tracking.last_opened_at": -1
},
"indexName": "email_page_index",
"isMultiKey": false,
"multiKeyPaths": {
"account_id": [],
"is_private": [],
"visible_in_list": [],
"sent_at": [],
"user_id": [],
"status": [],
"type": [],
"tracking.last_opened_at": []
},
"isUnique": false,
The problem is that the winningPlan is slow, wouldn't be better if mongoid choose the compound index? Is there a way to force it?
Also, how can I see the execution time for each separate STAGE?
I am posting some information that can help resolve the issue of performance and use an appropriate index. Please note this may not be the solution (and the issue is open to discussion).
...Also, how can I see the execution time for each separate STAGE?
For this, generate the query plan using the explain with the executionStats verbosity mode.
The problem is that the winningPlan is slow, wouldn't be better if
mongoid choose the compound index? Is there a way to force it?
As posted the plans show a "stage": "SORT_KEY_GENERATOR", implying that the sort operation is being performed in the memory (that is not using an index for the sort). That would be one (or main) of the reasons for the slow performance. So, how to make the query and the sort use the index?
A single compound index can be used for a query with a filter+sort operations. That would be an efficient index and query. But, it requires that the compound index be defined in a certain way - some rules need to be followed. See this topic on Sort and Non-prefix Subset of an Index - as is the case in this post. I quote the example from the documentation for illustration:
Suppose there is a compound index: { a: 1, b: 1, c: 1, d: 1 }
And, all the fields are used in a query with filter+sort. The ideal query is, to have a filter+sort as follows:
db.test.find( { a: "val1", b: "val2", c: 1949 } ).sort( { d: 1 })
Note the query filter has three fields with equality condition (there are no $gt, $lt, etc.). Then the query's sort has the last field d of the index. This is the ideal situation where the index will be used for the query''s filter as well as sort operations.
In your case, this cannot be applied from the posted query. So, to work towards a solution you may have to define a new index so as to take advantage of the rule Sort and Non-prefix Subset of an Index.
Is it possible? It depends upon your application and the use case. I have an idea like this and it may help. Create a compound index like the follows and see how it works:
account_id: 1,
is_private: 1
visible_in_list: 1,
status: 1,
user_id: 1,
'tracking.last_opened_at': -1
I think having a condition "tracking.last_opened_at"=>{"$gt"=>1921-03-22 15:29:18 UTC}, in the query''s filter may not help for the usage of the index.
Also, include some details like the version of the MongoDB server, the size of collection and some platform details. In general, query performance depends upon many factors, including, indexes, RAM memory, size and type of data, and the kind of operations on the data.
The ESR Rule:
When using compound index for a query with multiple filter conditions and sort, sometimes the Equality Sort Range rule is useful for optimizing the query. See the following post with such a scenario: MongoDB - Index not being used when sorting and limiting on ranged query
I am trying to use cypher to perform the query in full text index. It returns results, but they are not ranked. Is there a way to get the match score?
start recordEmployee=node:fidx_RecordEmployee("F01:Leela* OR F01:Ph*") return recordEmployee.F01
Returns this, and I cannot find match score:
{
"results": [
{
"columns": [
"recordEmployee.F01"
],
"data": [
{
"row": [
"Philip"
],
"graph": {
"nodes": [],
"relationships": []
}
},
{
"row": [
"Leela"
],
"graph": {
"nodes": [],
"relationships": []
}
}
],
"stats": {
"contains_updates": false,
"nodes_created": 0,
"nodes_deleted": 0,
"properties_set": 0,
"relationships_created": 0,
"relationship_deleted": 0,
"labels_added": 0,
"labels_removed": 0,
"indexes_added": 0,
"indexes_removed": 0,
"constraints_added": 0,
"constraints_removed": 0
}
}
],
"errors": []
}
It's not possible in Cypher yet, but with stored procedures in Neo4j 3.0 it will be again.
Until then if you really need the score you can use the REST endpoint.
http://neo4j.com/docs/stable/rest-api-indexes.html#rest-api-find-node-by-query
Getting the results with a predefined ordering requires adding the
request parameter
?order=<ordering>
where <ordering> is one of index, relevance or score. In this case an
additional field will be added to each result, named score, that holds
the float value that is the score reported by the query result.
I'm experimenting with using Falcor to front the Guild Wars 2 API and want to use it to show game item details. I'm especially interested in building a router that can use multiple datasources to combine the results of different APIs.
The catch is, Item IDs in Guild Wars 2 aren't contiguous. Here's an example:
[
1,
2,
6,
11,
24,
56,
...
]
So I can't just write paths on the client like items[100..120].name because there's almost certainly going to be a bunch of holes in that list.
I've tried adding a route to my router so I can just request items, but that sends it into an infinite loop on the client. You can see that attempt on GitHub.
Any pointers on the correct way to structure this? As I think about it more maybe I want item.id instead?
You shouldn't find your self asking for ids from a Falcor JSON Graph object.
It seems like you want to build an array of game ids:
{
games: [
{ $type: "ref", value: ["gamesById", 352] },
{ $type: "ref", value: ["gamesById", 428] }
// ...
],
gamesById: {
352: {
gameProp1: ...,
},
428: {
gameProp2: ...
}
}
}
[games, {from: 5, to: 17 }, "gameProp1"]
Does that work?
You can use 'get' API of Falcor, It retrieves multiple values.. You can pass any number of required properties as shown below
var model=new falcor.Model({
cache:{
genereList:[
{name:"Recently Watched",
titles:[
{id:123,
name: "Ignatius",
rating: 4}
]
},
{name:"New Release",
titles:[
{id:124,
name: "Jessy",
rating: 3}
]
}
]
}
});
Getting single value
model.getValue('genereList[0].titles[0].name').
then(function(value){
console.log(value);
});
Getting multiple values
model.get('genereList[0..1].titles[0].name', 'genereList[0..1].titles[0].rating').
then(function(json){
console.log(JSON.stringify(json, null, 4));
})
I am using the cypher rest api manually for this.
I would like to return the data in a way that is easy to parse.
Here's an example of some data with the same sort of relationships I'm dealing with:
(me:Person:Normal),
(dad:Person:Geezer),
(brother:Person:Punk),
(niece:Person:Tolerable),
(daughter:Person:Awesome),
(candy:Rule),
(tv:Rule),
(dad)-[:HAS_CHILD {Num:1}]->(brother),
(dad)-[:HAS_CHILD {Num:2}]->(me),
(me)-[:HAS_CHILD {Num:1}]->(daughter),
(brother)-[:HAS_CHILD {Num:1}]->(niece),
(me)-[:ALLOWS]->(candy),
(me)-[:ALLOWS]->(tv)
I want to get all of the HAS_CHILD relationships and if any of those child nodes have :ALLOWS relationships I want the ids of those too.
so if I were to do something like...
START n=node({idofdad}) MATCH n-[r?:HAS_CHILD]->h-[?:ALLOWS]->allowed
WITH n, r, h, collect(ID(allowed)) as allowedIds
WITH n, r,
CASE
WHEN h IS NOT NULL THEN [ID(h), LABELS(h), r.Num?, allowedIds]
ELSE NULL
END as has
RETURN
LABELS(n) as labels,
ID(n) as id,
n as node,
COLLECT(has) as children;
The :HAS_CHILD may not exist, so I have to do this wierd case thing.
The data that comes back is 'ok' but the JSON mapper that I have (Newtonsoft) doesn't make it easy to map an array to an object (meaning I know that array index[0] is ID of the (children) collections.
The results of the above look like this:
{
"columns": ["labels", "id", "node", "children"],
"data": [
["Person", "Geezer"],
6,
{},
[
[7, ["Person", "Normal"], 2, [2, 1]],
[5, ["Person", "Punk"], 1, []]
]
]
}
Since this is more/less a document and it'd be easier to map the 'children' column I'd like to get something that looks like this:
{
"columns": ["labels", "id", "node", "children"],
"data": [
["Person", "Geezer"],
6,
{},
[
[
"id": 7,
"labels": ["Person", "Normal"],
"ChildNumber": 2,
"AllowedIds": [2, 1]
},
{
"id": 5,
"labels": ["Person", "Punk"],
"ChildNumber": 1,
"AllowedIds": []
}
]
]
}
I would expect the query to look something like:
START n=node({idofdad}) MATCH n-[r?:HAS_CHILD]->h-[?:ALLOWS]->allowed
WITH n, r, h, collect(ID(allowed)) as allowedIds
WITH n, r,
CASE
WHEN h IS NOT NULL THEN
{ id: ID(h), labels: LABELS(h), ChildNumber: r.Num?, AllowedIds: allowedIds }
ELSE NULL
END as has
RETURN
LABELS(n) as labels,
ID(n) as id,
n as node,
COLLECT(has) as children;
Is this even remotely possible?