not equal conditions in elasticsearch aggregation - ruby-on-rails

I have a Rails app that connects to Elasticsearch, and the user can define conditions such as "at least" and "at most" to compare counts and find specific results. Everything works, but now I have to add "not equal" to my list of comparison operators and also support negating each of the conditions described above. For example, if the user searches for "not at least", I want to calculate the result for the user's query and then invert it. I know that inverting a query response is not possible in Elasticsearch, but is it possible for Elasticsearch to calculate the negation itself? Assume the user wants to know whether the count of a specific event was not "at most 1 time" in the last 30 days. I know Elasticsearch supports negation, but only in the bool query and the query string query; in my case I have to use an aggregation with a terms aggregation in it. The aggregation looks like this:
body_data = {
  group_by_profile_id: {
    terms: {
      field: "profile_id.keyword",
      min_doc_count: @count_first,
      size: 10000
    },
    aggs: {
      filter: {
        bucket_selector: {
          buckets_path: {
            docCount: "_count"
          },
          script: "params.docCount #{@operator} #{@counts.last}"
        }
      }
    }
  }
}
Does anyone know how I can handle a not-equal condition in an aggregation with a terms query?
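One possible approach, offered as an assumption rather than a confirmed answer: the bucket_selector script is a boolean expression, so the comparison can be wrapped in `!( ... )` to negate it. The sketch below uses a hypothetical helper, `build_aggregation`, whose name and parameters are not from the original post. It also sets `min_doc_count` to 1, because `min_doc_count` drops buckets before the selector runs and would otherwise hide the profiles a negated "at least" is supposed to return.

```ruby
# Hypothetical helper: builds the terms + bucket_selector aggregation
# from the question, optionally negating the comparison.
def build_aggregation(operator:, count:, negate: false)
  comparison = "params.docCount #{operator} #{count}"
  # Wrapping the expression in !( ... ) inverts the bucket filter.
  script = negate ? "!(#{comparison})" : comparison

  {
    group_by_profile_id: {
      terms: {
        field: "profile_id.keyword",
        min_doc_count: 1, # assumption: keep every bucket so the script decides
        size: 10000
      },
      aggs: {
        filter: {
          bucket_selector: {
            buckets_path: { docCount: "_count" },
            script: script
          }
        }
      }
    }
  }
end

body_data = build_aggregation(operator: "<=", count: 1, negate: true)
puts body_data.dig(:group_by_profile_id, :aggs, :filter, :bucket_selector, :script)
# prints !(params.docCount <= 1)
```

Profiles with no matching documents at all never produce a bucket, so they cannot be returned by this aggregation and would have to be found another way.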

Related

How to only return subset of larger query in GraphQL without returning the whole thing?

I am new to GraphQL and trying to understand how to only return a subsection of the resulting query. For example, here is my cypher query in Neo4j:
MATCH (u:User)-[:HAS_FRIEND]-(f:User)
WHERE u.name = "Joe"
RETURN f.name
Notice how the above query only returns the friend names.
GraphQL
Here's a sample of GraphQL version of the above cypher query.
Input
{
  "where": {
    "name": "Joe"
  }
}
Query
query Query($where: NameWhere) {
  names(where: $where) {
    name
    friends {
      name
    }
  }
}
Expected Output
Obviously, I expect the output to include the array of friends' names and the name of the user who has those friends. But what if I only want the friends array, like the Cypher query gives me?
While this is a simple example, I am using a graph database that has a series of connections and my nesting is looking pretty deep, but I only really need a subquery from the result. A great example is with localization - I don't need the country and language (locale), etc. in every query; I only want the subquery result (e.g. the spelling in UK/ENG is colour and in US/ENG is color).
Neo4j offers a JavaScript GraphQL library that is able to handle your use case (and many more)!
Please have a look at the documentation for information on how to get started. See also the example queries; one of them is fairly close to what you are looking for.
Anyhow, the GraphQL query provided above, using the @neo4j/graphql library, could look like so:
query Query {
  users (where: { name: "Joe" }) {
    friends {
      name
    }
  }
}
(Note that the where parameter ("Joe") is provided in-line in this case)
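A GraphQL response mirrors the shape of the query, so the friends list still arrives nested under users. If only a flat list of names is needed, it can be extracted on the client. A minimal Ruby sketch, assuming a response shaped like the query above; the sample data and the `friend_names` helper are made up for illustration:

```ruby
require "json"

# Hypothetical helper: flattens users -> friends -> name into a list of names.
def friend_names(response)
  response.fetch("data").fetch("users").flat_map do |user|
    user.fetch("friends").map { |friend| friend.fetch("name") }
  end
end

# Made-up response with the same nesting as the query.
response = JSON.parse(<<~JSON)
  { "data": { "users": [ { "friends": [ { "name": "Ann" }, { "name": "Bob" } ] } ] } }
JSON

p friend_names(response) # => ["Ann", "Bob"]
```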

controlling search results position SearchKick

I'm using SearchKick, which is great. I'm migrating from a really bad search implementation, and the product team didn't trust me much, so before migrating to SearchKick and overhauling our search they made me add hardcoded query results, so that they can say "for this search input I want this product to come up first". Right now I take the hardcoded results that match the requested query from the DB and add them at the top (I don't honor the exact position: if you want a result at position 48 and there are 4 hardcoded results, it will be the 4th), although it would be nice to be able to put them in the middle if possible.
What is the cleanest way to do this with SearchKick, so that the querying happens inside Elasticsearch (indexing the hardcoded results in the product to do so)?
I have 2 models, Product and QueryResult. A QueryResult contains a product, a query string and a wanted_rank.
In my Product model I have a method to get search results that looks something like this:
def get_search_results(query_string)
  # get search results from elastic using searchkick
  search_results = Product.search(query_string)
  # get hardcoded results matching this query
  hardcoded_results = QueryResult.where(query: query_string).order(:wanted_rank).map(&:product)
  # remove hardcoded results from search_results (to_a so that Array#- applies)
  search_results = search_results.to_a - hardcoded_results
  # return results where hardcoded results are first
  hardcoded_results + search_results
end
In the end I want all the search logic to happen in Elasticsearch, including inserting the hardcoded search results.
So, after some very helpful comments and more searching, I found a solution.
My first mistake was trying to fix it using boosts instead of order. To use order, we index all QueryResults of a specific product under a field called query_results.
For a product x with query results:
{
  query: 'foo',
  wanted_rank: 1,
},
{
  query: 'bar',
  wanted_rank: 2,
}
I will index:
{
  name: 'x',
  query_results: {
    'foo': 1,
    'bar': 2
  }
}
Then, when searching, given an attribute named query, I do the following:
Product.search(
  query,
  order: [
    { "query_results.#{query}": { unmapped_type: :long } },
    { _score: :desc }
  ]
)
2 important things to notice:
Use unmapped_type: this tells Elasticsearch which mapping to use when the field has no mapping. Any query that has no query result (which is most of them) will have no mapping for "query_results.#{query}" because it was never indexed, so we add unmapped_type to tell Elasticsearch to treat the missing field as a long.
Both when searching and when saving to the DB, I downcase and strip the query so it matches properly.
I also index the queries under another field and search over it with a low weight to make sure the product comes up for that query.
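The indexing and ordering steps above can be sketched in plain Ruby. This is a minimal sketch under assumptions: `QueryResult` is modeled here as a Struct stand-in for the real model, and the helper names `normalize_query`, `query_results_for_index` and `order_for` are hypothetical. In a Rails app the hash would presumably be returned from the model's `search_data` method.

```ruby
# Stand-in for the QueryResult model (hypothetical plain Struct, not ActiveRecord).
QueryResult = Struct.new(:query, :wanted_rank)

# Same normalization on the indexing side and on the search side.
def normalize_query(query)
  query.to_s.strip.downcase
end

# Builds the query_results field: { "foo" => 1, "bar" => 2 }.
def query_results_for_index(query_results)
  query_results.each_with_object({}) do |qr, indexed|
    indexed[normalize_query(qr.query)] = qr.wanted_rank
  end
end

# Builds the order option: hardcoded rank first, then relevance.
def order_for(query)
  [
    { "query_results.#{normalize_query(query)}" => { unmapped_type: :long } },
    { _score: :desc }
  ]
end

results = [QueryResult.new(" Foo ", 1), QueryResult.new("bar", 2)]
p query_results_for_index(results)
p order_for("FOO ")
```

Keeping the normalization in one helper makes it harder for the indexed key and the sort field to drift apart.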

Dynamically fire different activerecord queries using procs in Rails?

Say I have many different classes that inherit from Tree and each of them implements a method called grow! but with a slightly different ActiveRecord implementation. Say each method begins with an ActiveRecord query to find the right trees to grow with something like:
trees = Tree
  .joins(:fruits)
  .where(land_id: land.id)
  .where(fruits: { sweet: true })
  .where(fruits: { season_id: season.id })
Say the part we want to swap out from query to query is this part:
.where(fruits: { sweet: true })
Say we then want to build a WinterTree class with its own grow! method, but it only grows non-sweet fruits, so we want to return only the trees that grow non-sweet fruits. Is there any way to avoid rewriting the rest of the query, swap out only that one piece, and maybe write the rest of the query in the parent Tree class? Is there any way to call segments of AR queries dynamically?
I found it easy to build dynamic queries using where statements in sql such as: Tree.joins(:fruits).where("land_id = ?", land.id ) etc. Below is what I did yesterday to give you some idea of what I'm talking about but you'll need to extrapolate it to fit your needs:
query = ''
counter = 1
sets_of_data_ill_query.each do |set|
  if counter == 1
    query += "district = '#{set[0]}' AND second_district = '#{set[1]}'"
  else
    query += " OR district = '#{set[0]}' AND second_district = '#{set[1]}'"
  end
  counter += 1 # without this, the else branch (and its OR) is never reached
end
voters = Voter.where(query)
NOTE: I knew the data I was querying was safe, so I just interpolated the raw values, but if the data is entered by users you'll want to do it as I showed in the first paragraph, with ? placeholders that escape the values. Also, since you're chaining where statements, you would want to use "AND" where I used "OR" if you need to loop through sets, etc.
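For user-supplied data, the same loop can build a clause with ? placeholders and a matching list of values instead of interpolating strings. A minimal sketch; the helper name `district_conditions` is hypothetical, and the column names are taken from the snippet above:

```ruby
# Hypothetical helper: returns [sql_clause, value1, value2, ...] so the
# values are bound by the database adapter rather than interpolated.
def district_conditions(sets)
  clause = sets.map { "(district = ? AND second_district = ?)" }.join(" OR ")
  [clause, *sets.flatten]
end

conditions = district_conditions([["North", "East"], ["South", "West"]])
puts conditions.first
# prints (district = ? AND second_district = ?) OR (district = ? AND second_district = ?)

# With ActiveRecord this could then be passed as: Voter.where(*conditions)
```

The parentheses make the AND/OR grouping explicit instead of relying on SQL operator precedence.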

How to get total number of db-hits from Cypher query within a Java code?

I am trying to get total number of db-hits from my Cypher query. For some reason I always get 0 when calling this:
String query = "PROFILE MATCH (a)-[r]-(b)-[p]-(c)-[q]-(a) RETURN a,b,c";
Result result = database.execute(query);
while (result.hasNext()) {
    result.next();
}
System.out.println(result.getExecutionPlanDescription().getProfilerStatistics().getDbHits());
The database seems to be ok. Is there something wrong about the way of reaching such value?
ExecutionPlanDescription is a tree-like structure. Most likely the top element, e.g. a projection, does not hit the database directly by itself.
So you need to write a recursive function using ExecutionPlanDescription.getChildren() to drill to the individual parts of the query plan. E.g. if one of the children (or sub*-children) is a plan of type Expand you can use plan.getProfilerStatistics().getDbHits().
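The recursion described above can be illustrated with a small model. The sketch below is in Ruby and uses a hypothetical `PlanNode` struct as a stand-in for the plan tree; it is not the Neo4j API, and the operator names and numbers are made up. It only shows the shape of the traversal: add a node's own db hits to the totals of all its children.

```ruby
# Stand-in for a profiled plan node (hypothetical; not the Neo4j API).
PlanNode = Struct.new(:name, :db_hits, :children)

# Sums the db hits of a node and, recursively, of all its descendants.
def total_db_hits(node)
  node.db_hits + node.children.sum { |child| total_db_hits(child) }
end

# Made-up plan: the root reports 0 hits, the work happens further down.
plan = PlanNode.new("ProduceResults", 0, [
  PlanNode.new("Expand", 12, [
    PlanNode.new("AllNodesScan", 5, [])
  ])
])

puts total_db_hits(plan) # prints 17
```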

Mongoid MapReduce giving irregular results for recursive reduce function

I have an Item model which has an attribute category. I want the items count grouped by category. I wrote a map reduce for this functionality. It was working fine. I recently wrote a script to create 5000 items. Now I realize my map reduce only gives the result for the last 80 records. The following is the code for the mapreduce function.
map = %Q{
  function(){
    emit({},{category: this.category});
  }
}
reduce = %Q{
  function(key, values){
    var category_count = {};
    values.forEach(function(value){
      if(category_count.hasOwnProperty(value.category))
        category_count[value.category]++;
      else
        category_count[value.category] = 1
    })
    return category_count;
  }
}
Item.map_reduce(map,reduce).out(inline: true).first.try(:[],"value")
After researching a bit, I discovered that MongoDB invokes the reduce function multiple times. How can I achieve the functionality I intended?
There is a rule you must follow when writing map-reduce code in MongoDB (a few rules, actually). One is that the emit (which emits key/value pairs) must have the same format for the value that your reduce function will return.
If you emit(this.key, this.value) then reduce must return the exact same type that this.value has. If you emit({},1) then reduce must return a number. If you emit({},{category: this.category}) then reduce must return the document of format {category:"string"} (assuming category is a string).
So that clearly can't be what you want, since you want totals, so let's look at what reduce is returning and work out from that what you should be emitting.
It looks like at the end you want to accumulate a document where there is a keyname for each category and its value is a number representing the number of its occurrences. Something like:
{category_name1:total, category_name2:total}
If that's the case, then the correct map function would emit a document with the category name as its key and 1 as its value, i.e. emit({}, {<category>: 1}), in which case your reduce will need to add up the numbers for each key corresponding to a category.
Here is what the map should look like:
map=function (){
  var category = { };
  category[this.category]=1;
  emit({},category);
}
And here is the correct corresponding reduce:
reduce=function (key,values) {
  var category_count = {};
  values.forEach(function(value){
    for (var cat in value) {
      if( !category_count.hasOwnProperty(cat) ) category_count[cat]=0;
      category_count[cat] += value[cat];
    }
  });
  return category_count;
}
Note that it satisfies two other requirements for MapReduce - it works correctly if the reduce function is never called (which will be the case if there is only one document in your collection) and it will work correctly if the reduce function gets called multiple times (which is what's happening when you have more than 100 documents).
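The re-reduce requirement can be checked outside MongoDB. The following is a Ruby port of the map and reduce logic above, written only to simulate what happens when reduce is applied to partial results. The helper names `map_item` and `reduce_counts` and the sample categories are made up for this sketch.

```ruby
# Ruby port of the map above: one document per item, keyed by category.
def map_item(category)
  { category => 1 }
end

# Ruby port of the reduce above: adds up the counts per category.
# Output has the same shape as each input value, so it can be re-reduced.
def reduce_counts(values)
  values.each_with_object(Hash.new(0)) do |value, totals|
    value.each { |category, count| totals[category] += count }
  end
end

categories = %w[a b a c a b]
emitted = categories.map { |category| map_item(category) }

# Reduce everything in one pass.
one_pass = reduce_counts(emitted)

# Reduce in chunks of 2, then reduce the partial results again.
partials = emitted.each_slice(2).map { |chunk| reduce_counts(chunk) }
re_reduced = reduce_counts(partials)

puts one_pass == re_reduced # prints true
puts re_reduced["a"]        # prints 3
```

The original reduce from the question fails this check because its output, a hash of counts, does not have the shape of its input, a document with a category field.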
A more conventional way to do that would be to emit category name as key and the number as value. This simplifies map and reduce:
map=function() {
  emit(this.category, 1);
}
reduce=function(key,values) {
  var count=0;
  values.forEach(function(val) {
    count+=val;
  });
  return count;
}
This will sum the number of times each category appears. This also satisfies requirements for MapReduce - it works correctly if the reduce function is never called (which will be the case for any category that only appears once) and it will work correctly if the reduce function gets called multiple times (which will happen if any category appears more than 100 times).
As others pointed out, aggregation framework makes the same exercise much simpler with:
db.collection.aggregate({$group:{_id:"$category",count:{$sum:1}}})
although that matches the format of the second mapReduce I showed, and not the original format that you had which is outputting category names as keys. However aggregation framework will always be significantly faster than MapReduce.
I agree with Neil Lunn's comment.
From the information provided, if you are on MongoDB 2.2 or later you can use the aggregation framework instead of map-reduce:
db.items.aggregate([
  { $group: { _id: '$category', category_count: { $sum: 1 } } }
])
This is a lot simpler and more performant (see Map/Reduce vs. Aggregation Framework).
