Average of group by date records in Elasticsearch

I need the average number of records per day, grouped by date. I am using Elasticsearch and Searchkick with Ruby on Rails.
The following code groups the records by date and is working:
group_by_date: {
  date_histogram: {
    field: :created_at,
    interval: 'day'
  }
},
It produces the following output:
"group_by_date"=>
{"buckets"=>
[{"key_as_string"=>"2020-01-07T00:00:00.000Z",
"key"=>1578355200000,
"doc_count"=>14},
{"key_as_string"=>"2020-01-08T00:00:00.000Z",
"key"=>1578441600000,
"doc_count"=>3}
]
}
I want the average of these counts. In this case there are two dates, so the average should be (14 + 3) / 2 = 8.5.
Thanks

I got the average with the following avg_bucket aggregation, placed next to group_by_date as a sibling aggregation:
average_properties_by_date: {
  avg_bucket: {
    buckets_path: 'group_by_date>_count',
    gap_policy: "skip",
    format: "#,##0.00;(#,##0.00)"
  }
}
Be careful here: buckets_path must use _count, not doc_count.
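Putting the two pieces together, a minimal sketch in plain Ruby could look like the following. The constant and method names are illustrative, and how the hash is handed to Searchkick or the Elasticsearch client is deliberately left out. The helper recomputes the same average on the client side from the buckets shown in the question, which can be handy for checking the value Elasticsearch returns.

```ruby
# Aggregations hash: a date histogram plus a sibling avg_bucket that
# averages the per-day document counts. AGGS is an illustrative name.
AGGS = {
  group_by_date: {
    date_histogram: {
      field: :created_at,
      interval: 'day'
    }
  },
  average_properties_by_date: {
    avg_bucket: {
      # "_count" is the buckets_path name for each bucket's doc_count
      buckets_path: 'group_by_date>_count',
      gap_policy: 'skip'
    }
  }
}.freeze

# Client-side check: average doc_count over the returned buckets.
# Returns nil when there are no buckets.
def average_doc_count(buckets)
  return nil if buckets.empty?

  buckets.sum { |bucket| bucket['doc_count'] } / buckets.size.to_f
end

# The buckets from the response shown in the question.
SAMPLE_BUCKETS = [
  { 'key_as_string' => '2020-01-07T00:00:00.000Z', 'key' => 1578355200000, 'doc_count' => 14 },
  { 'key_as_string' => '2020-01-08T00:00:00.000Z', 'key' => 1578441600000, 'doc_count' => 3 }
].freeze

puts average_doc_count(SAMPLE_BUCKETS) # (14 + 3) / 2 = 8.5
```

The client-side helper only sees the buckets that were actually returned, so it is a sanity check rather than a replacement for avg_bucket.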

Related

How should I write the query for ElasticSearch in Rails?

I have to write a query in my SortBuilder.rb that counts the occurrences of a word (passed into this method in the variable value) in the results and sorts the results by that count.
I also want to display the count later, so I want to store it in a variable.
My current logic is:
sort: [
  query: value,
  aggs: {
    my_terms: {
      filters: {
        value: { term: { "title" => "#{value}" }}
      }
    }
  }
]

How to split an array of objects into subarrays depending on the field value in Rails and Mongodb

I want to get several arrays of objects, grouped by the month (and year) of one of their properties.
I have a Request class like this:
class Request
  include Mongoid::Document
  include MongoidDocument::Updated
  field :name, type: String
  field :start_date, type: DateTime
  #...
end
I want the result to be an array of hashes, each of the form
{month: m_value, year: y_value, request: requests_with_m_value_as_month_and_y_value_as_year_in_start_date_field}
Can someone help me with this?

You can use the aggregation pipeline to get the data back from MongoDB in the right shape:
db.requests.aggregate([
  {
    $group: {
      _id: {
        year: {
          $year: "$start_date"
        },
        month: {
          $month: "$start_date"
        }
      },
      requests: {
        $push: "$$ROOT"
      }
    }
  },
  {
    $project: {
      _id: 0,
      year: "$_id.year",
      month: "$_id.month",
      requests: "$requests"
    }
  }
])
This is written for the mongo shell, so you will have to translate it to the DSL provided by Mongoid. Based on what I could find, it should be possible to get the underlying collection and call aggregate on it:
Request.collection.aggregate([...])
Now you just need to take the query and convert it into something that Mongoid will accept. I think you only need to add quotes around the object keys, but I don't have the environment set up to try that myself.
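As a rough sketch of that translation, the pipeline could be written with Ruby hashes as below. This has not been run against a database; PIPELINE, group_by_month and DemoRequest are illustrative names, and the in-memory helper is only a plain-Ruby picture of the same grouping for records that are already loaded.

```ruby
# The shell pipeline written as Ruby hashes. String keys keep the "$"
# operators readable. PIPELINE is an illustrative name.
PIPELINE = [
  { '$group' => {
    '_id' => {
      'year' => { '$year' => '$start_date' },
      'month' => { '$month' => '$start_date' }
    },
    'requests' => { '$push' => '$$ROOT' }
  } },
  { '$project' => {
    '_id' => 0,
    'year' => '$_id.year',
    'month' => '$_id.month',
    'requests' => '$requests'
  } }
].freeze

# With Mongoid loaded this would be run as (not executed here):
#   Request.collection.aggregate(PIPELINE)

# In-memory equivalent for objects that are already loaded. Each element
# only needs to respond to #start_date.
def group_by_month(requests)
  requests
    .group_by { |r| [r.start_date.year, r.start_date.month] }
    .map { |(year, month), group| { year: year, month: month, requests: group } }
end

# Stand-in for the Mongoid model, for demonstration only.
DemoRequest = Struct.new(:name, :start_date)

DEMO_REQUESTS = [
  DemoRequest.new('a', Time.utc(2020, 1, 7)),
  DemoRequest.new('b', Time.utc(2020, 1, 20)),
  DemoRequest.new('c', Time.utc(2020, 2, 3))
].freeze

group_by_month(DEMO_REQUESTS).each do |row|
  puts "#{row[:year]}-#{row[:month]}: #{row[:requests].map(&:name).join(', ')}"
end
```

For large collections the database-side pipeline is the better fit, since the in-memory version has to load every document first.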

Twitter typeahead only showing some items returned by bloodhound

I'm using Bloodhound to fetch data from the database, then twitter typeahead to display the options below a search box.
Currently, the bloodhound part is finding the objects required, but the typeahead is not displaying them.
var artist_retriever = new Bloodhound({
  // turns input query into string of tokens to send to database.
  queryTokenizer: Bloodhound.tokenizers.whitespace,
  remote: {
    // URL to fetch information from
    url: "/artists?query=%QUERY",
    wildcard: '%QUERY',
    // Manipulate the array of artists returned, for display to user.
    transform: function(array_of_artists){
      // array of artists is returned from DB.
      // Put each artist into a readable string
      array_of_artists = create_artist_descriptions(array_of_artists)
      console.log(array_of_artists)
      // Returns correctly:
      // [
      //   { artist: "Joe" },
      //   { artist: "Bob" },
      //   { artist: "Smith" },
      //   { artist: "Tom" },
      // ]
      return array_of_artists
    }
  },
  // turns return value into a string of results, with this 'key' before each result.
  datumTokenizer: Bloodhound.tokenizers.obj.whitespace('artist'),
});
// display:
// instantiate the typeahead UI
// https://github.com/twitter/typeahead.js/blob/master/doc/jquery_typeahead.md
searcher = $('.typeahead').typeahead(
  // options:
  {
    hint: false
  },
  // datasets:
  {
    // Where to get data: Use the bloodhound suggestion engine:
    source: artist_retriever.ttAdapter(),
    // Which attribute of each result from the database should be shown:
    displayKey: 'artist',
    templates: {
      notFound: new_artist_option_template(),
      footer: new_artist_option_template()
    }
  }
)
Update
It turns out that there's a weird bug in typeahead. It only seems to work with the "limit" attribute set to a maximum of 4. If you set "limit" to 5, the typeahead gives you nothing.
searcher = $('.typeahead').typeahead(
  // options:
  {
    hint: false
  },
  // datasets:
  {
    // Where to get data: Use the bloodhound suggestion engine:
    source: artist_retriever.ttAdapter(),
    // Which attribute of each result from the database should be shown:
    displayKey: 'signature',
    limit: 4, // This can do a max of 4! Odd.
    templates: {
      notFound: new_artist_option_template(),
      footer: new_artist_option_template()
    }
  }
)

This issue has been solved; see Update 2 below.
I have reproduced this issue in this JSFIDDLE.
As you said, it's a bug. You also reported that the bug goes away if you set limit: 4.
On my end, in the FIDDLE, the issue appears when the number of results returned equals the value of limit.
To test this in the FIDDLE, do the following:
Note: Searching for 1947 returns exactly 5 rows.
When limit is set to 4:
Searching for 1947 returns 4 results.
When limit is set to 5:
Searching for 1947 returns nothing.
When limit is set to 6:
Searching for 1947 returns 1 result, the first one.
Hence, if you keep the limit set to 1 less than the actual number of results returned, this will keep working.
I have also submitted this issue on their GitHub page. I will keep track of it and update this answer as needed.
Update 1:
Found a similar question on SO here. "Luciano García Bes" seems to have figured out the solution. Please direct all upvotes there.
Basically he says:
It's counting the number of rendered hints before appending them, so
if the number of hints equals the limit it'll append an empty array.
To prevent this I just switched lines 1723 and 1724 so it looks like this:
that._append(query, suggestions.slice(0, that.limit - rendered));
rendered += suggestions.length;
Update 2:
This issue has been fixed on pull 1212. Closing our own issue 1312. The bug was corrected the same way discussed in update 1.
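To see why the order of those two lines matters, here is a simplified Ruby model of the behaviour described in Update 1. It is not typeahead.js code; the method names are illustrative, and it assumes nothing had been rendered before the remote results arrived (rendered starts at 0).

```ruby
# Simplified Ruby model of the async callback described above; this is not
# typeahead.js code, and the method names are illustrative.

# Mimics JavaScript's array.slice(0, stop), including negative stop values.
def js_slice(array, stop)
  array[0...stop] || []
end

# Buggy order: the counter is increased before the slice is taken, so
# "limit - rendered" is already too small when the suggestions are appended.
def append_buggy(suggestions, limit, rendered = 0)
  rendered += suggestions.length
  js_slice(suggestions, limit - rendered)
end

# Fixed order: slice first, count afterwards.
def append_fixed(suggestions, limit, rendered = 0)
  appended = js_slice(suggestions, limit - rendered)
  rendered += suggestions.length
  appended
end

# Five results, like the "1947" search in the fiddle.
FIVE_RESULTS = %w[a b c d e].freeze

[4, 5, 6].each do |limit|
  buggy = append_buggy(FIVE_RESULTS, limit).length
  fixed = append_fixed(FIVE_RESULTS, limit).length
  puts "limit=#{limit}: buggy shows #{buggy}, fixed shows #{fixed}"
end
```

With five results the buggy order shows 4, 0 and 1 items for limits 4, 5 and 6, which matches what was observed in the fiddle; the fixed order shows 4, 5 and 5.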

Mongoid max and embeded collections

I have a collection Report that embeds submissions:
class Report
  embeds_many :submissions
end

class Submission
  embedded_in :report
  field :date_submitted, type: TimeWithZone
  field :mistakes, type: Integer
end
I am trying to create a scope on Report.
The scope query has two parts:
get the latest submission (the one with the max date_submitted) and check that it also has zero mistakes.
I can create a scope for the mistakes part, but cannot work out how to get the latest submission:
scope :my_scope, where("submissions.mistakes" => 0)
So this report would be returned, as its last entry in submissions has zero mistakes:
Report
"submissions" : [
  {
    "date_submitted" : ISODate("2014-01-28T13:00:00Z"),
    "mistakes" : 11
  },
  {
    "date_submitted" : ISODate("2014-03-08T13:00:00Z"),
    "mistakes" : 0
  }
]
whereas this one would not be returned:
Report
"submissions" : [
  {
    "date_submitted" : ISODate("2014-01-28T13:00:00Z"),
    "mistakes" : 0
  },
  {
    "date_submitted" : ISODate("2014-03-08T13:00:00Z"),
    "mistakes" : 11
  }
]

This is because you are not filtering the elements of the embedded array but the documents that contain those elements.
An $elemMatch clause would let you combine the conditions on a single element, but find has no operation for getting the max value. This is not to be confused with the $max query modifier, which clips the index in use so that the search does not go beyond those bounds.
So here you use aggregate:
db.collection.aggregate([
  // Optionally query to match and filter your documents.
  // { "$match": { /* Same conditions as find */ } },
  // Unwind the array
  { "$unwind": "$submissions" },
  // Filter all but 0 mistakes
  { "$match": { "submissions.mistakes": 0 } },
  // Group the results, taking the max entry and presuming by document `_id`
  { "$group": {
    "_id": "$_id",
    "date_submitted": { "$max": "$submissions.date_submitted" }
  }}
])
That is the general process for filtering the elements of an array. Check how your driver implements aggregate, but the pipeline is always represented as an array of documents (hashes) in this form. With Mongoid you can possibly go through moped to get the collection and call the method on it, so something like:
Report.collection.aggregate([ /* stages */ ])
If you need the original document form returned, see here for more information.
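For illustration, the stages above could be written as Ruby hashes like this. It is a sketch that has not been run against a database; STAGES and the helper names are illustrative. The in-memory methods walk plain hashes shaped like the sample documents, so the effect of each stage can be checked without MongoDB.

```ruby
require 'time'

# The stages from the answer as Ruby hashes. STAGES is an illustrative name.
STAGES = [
  { '$unwind' => '$submissions' },
  { '$match' => { 'submissions.mistakes' => 0 } },
  { '$group' => {
    '_id' => '$_id',
    'date_submitted' => { '$max' => '$submissions.date_submitted' }
  } }
].freeze

# With Mongoid loaded this would be run as (not executed here):
#   Report.collection.aggregate(STAGES)

# In-memory walk through the same three stages for plain hashes.
def latest_clean_submission_dates(reports)
  reports
    .flat_map { |r| r['submissions'].map { |s| [r['_id'], s] } } # $unwind
    .select { |_id, s| s['mistakes'].zero? }                      # $match
    .group_by { |id, _s| id }                                     # $group by _id
    .map do |id, pairs|
      { '_id' => id, 'date_submitted' => pairs.map { |_id, s| s['date_submitted'] }.max }
    end
end

# Plain-Ruby check (not a Mongoid scope): is the newest submission clean?
def latest_submission_clean?(report)
  report['submissions'].max_by { |s| s['date_submitted'] }['mistakes'].zero?
end

# The two sample reports from the question; the _id values are made up.
DEMO_REPORTS = [
  { '_id' => 1, 'submissions' => [
    { 'date_submitted' => Time.iso8601('2014-01-28T13:00:00Z'), 'mistakes' => 11 },
    { 'date_submitted' => Time.iso8601('2014-03-08T13:00:00Z'), 'mistakes' => 0 }
  ] },
  { '_id' => 2, 'submissions' => [
    { 'date_submitted' => Time.iso8601('2014-01-28T13:00:00Z'), 'mistakes' => 0 },
    { 'date_submitted' => Time.iso8601('2014-03-08T13:00:00Z'), 'mistakes' => 11 }
  ] }
].freeze

latest_clean_submission_dates(DEMO_REPORTS).each do |row|
  puts "#{row['_id']}: #{row['date_submitted'].iso8601}"
end
```

Both sample reports come out of the three stages, each with the date of its newest zero-mistake submission. Deciding whether that submission is also the newest one overall takes a further comparison, which latest_submission_clean? does in memory.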

MongoDB - Mongoid map reduce basic operation

I have just started with MongoDB and Mongoid.
The biggest problem I'm having is understanding the map/reduce functionality well enough to do some very basic grouping and such.
Let's say I have a model like this:
class Person
  include Mongoid::Document
  field :age, type: Integer
  field :name
  field :sdate
end
That model would produce objects like these:
#<Person _id: 9xzy0, age: 22, name: "Lucas", sdate: "2013-10-07">
#<Person _id: 9xzy2, age: 32, name: "Paul", sdate: "2013-10-07">
#<Person _id: 9xzy3, age: 23, name: "Tom", sdate: "2013-10-08">
#<Person _id: 9xzy4, age: 11, name: "Joe", sdate: "2013-10-08">
Could someone show how to use mongoid map reduce to get a collection of those objects grouped by the sdate field? And to get the sum of ages of those that share the same sdate field?
I'm aware of this: http://mongoid.org/en/mongoid/docs/querying.html#map_reduce
But it would help to see that applied to a real example. Where does that code go (in the model, I guess)? Is a scope needed? And so on.
I can run a simple search with Mongoid, get the array and manually construct anything I need, but I guess map/reduce is the way to go here. I imagine the JS functions mentioned on the Mongoid page are fed to the DB, which performs those operations internally. Coming from Active Record, these new concepts are a bit strange.
I'm on Rails 4.0, Ruby 1.9.3, Mongoid 4.0.0, MongoDB 2.4.6 on Heroku (mongolab) though I have locally 2.0 that I should update.
Thanks.

Taking the examples from http://mongoid.org/en/mongoid/docs/querying.html#map_reduce, adapting them to your situation, and adding comments to explain:
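map = %Q{
  function() {
    // here "this" is the record that map
    // is going to be executed on
    emit(this.sdate, { sum: this.age, count: 1 });
  }
}
reduce = %Q{
  function(key, values) {
    // this will be executed for every group that has the same sdate value;
    // it must return the same shape that map emits, because MongoDB
    // may call reduce more than once for the same key
    var result = { sum: 0, count: 0 };
    values.forEach(function(value) {
      result.sum += value.sum;     // sum of all ages
      result.count += value.count; // total number of people
    });
    return result;
  }
}
finalize = %Q{
  function(key, value) {
    value.avg_of_ages = value.sum / value.count; // finding the average
    return value;
  }
}
# out(inline: 1) returns the results directly instead of writing them to a collection
results = Person.map_reduce(map, reduce).finalize(finalize).out(inline: 1)
# each result is a hash such as { "_id" => "2013-10-07", "value" => { ... } }
first_average = results.first["value"]["avg_of_ages"]
results.each do |result|
  # do whatever you want with result
end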
That said, I would suggest using the aggregation framework rather than map/reduce for such a simple operation. The way to do this is as follows:
results = Person.collection.aggregate([{"$group" => { "_id" => {"sdate" => "$sdate"},
                                                      "avg_of_ages" => {"$avg" => "$age"}}}])
The result will be almost identical to the map/reduce one, and you will have written a lot less code.
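If it helps to picture what either approach computes, here is a plain-Ruby sketch of the same grouping on the four sample people from the question. DemoPerson and age_stats_by_sdate are illustrative names, not part of Mongoid, and the code works on objects already in memory rather than querying the database.

```ruby
# Plain-Ruby picture of what the grouping computes. DemoPerson stands in
# for the Mongoid model and is for demonstration only.
DemoPerson = Struct.new(:age, :name, :sdate)

DEMO_PEOPLE = [
  DemoPerson.new(22, 'Lucas', '2013-10-07'),
  DemoPerson.new(32, 'Paul',  '2013-10-07'),
  DemoPerson.new(23, 'Tom',   '2013-10-08'),
  DemoPerson.new(11, 'Joe',   '2013-10-08')
].freeze

# Returns a hash of the form { sdate => { sum_of_ages: ..., avg_of_ages: ... } }.
def age_stats_by_sdate(people)
  people.group_by(&:sdate).transform_values do |group|
    sum = group.sum(&:age)
    { sum_of_ages: sum, avg_of_ages: sum / group.size.to_f }
  end
end

age_stats_by_sdate(DEMO_PEOPLE).each do |sdate, stats|
  puts "#{sdate}: sum=#{stats[:sum_of_ages]} avg=#{stats[:avg_of_ages]}"
end
```

For the sample data this gives a sum of 54 and an average of 27.0 for 2013-10-07, and a sum of 34 and an average of 17.0 for 2013-10-08.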
