Ruby on Rails field average? - ruby-on-rails

Is there an easy way to obtain the average of an attribute in a collection?
For instance, each user has a score.
Given a collection of users (@users), how can you get the average score for the group?
Is there anything like @users.average(:score)? I think I came across something like this for database fields, but I need it to work for a collection...

For your question, you could do:
@users.collect(&:score).sum.to_f / @users.length if @users.length > 0
Earlier I thought @users.collect(&:score).average would work, but Array has no average method. For database fields, User.average(:score) will work. You can also add conditions as with other ActiveRecord queries.
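For a plain in-memory collection, a minimal sketch of the same idea follows. It assumes each element responds to score; the Player struct and the average_score helper are hypothetical names used only for illustration. It returns nil for an empty collection rather than dividing by zero.

```ruby
# Hypothetical stand-in for a model with a score attribute.
Player = Struct.new(:score)

# Average of the score attribute; nil when the collection is empty.
def average_score(records)
  return nil if records.empty?
  # fdiv forces floating-point division even when the scores are integers
  records.sum(&:score).fdiv(records.size)
end

players = [Player.new(10), Player.new(20), Player.new(30), Player.new(40)]
puts average_score(players) # prints 25.0
```

Passing the block to sum avoids building an intermediate array of scores.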

I usually extend our friend Array with this method:
class Array
  # Calculates average of anything that responds to :"+" and :to_f
  def avg
    empty? ? 0.0 : sum.to_f / size
  end
end

Here's a little snippet to not only get the average but also the standard deviation.
class User
  attr_accessor :score
  def initialize(score)
    @score = score
  end
end
@users = [User.new(10), User.new(20), User.new(30), User.new(40)]
mean = @users.inject(0) { |acc, user| acc + user.score } / @users.length.to_f
stddev = Math.sqrt(@users.inject(0) { |sum, u| sum + (u.score - mean) ** 2 } / @users.length.to_f)
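The same calculation can be wrapped in a small helper. This is a sketch only: mean_and_stddev is a hypothetical name, and like the snippet above it divides by n, so it gives the population standard deviation rather than the sample one (which would divide by n - 1).

```ruby
# Mean and population standard deviation of a non-empty list of numbers.
def mean_and_stddev(values)
  raise ArgumentError, "values must not be empty" if values.empty?
  mean = values.sum.fdiv(values.size)
  # Average squared distance from the mean
  variance = values.sum { |v| (v - mean)**2 }.fdiv(values.size)
  [mean, Math.sqrt(variance)]
end

mean, stddev = mean_and_stddev([10, 20, 30, 40])
puts mean # prints 25.0
```

With user objects you would call it as mean_and_stddev(@users.map(&:score)).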

You can use ActiveRecord's average calculation, documented here:
http://api.rubyonrails.org/classes/ActiveRecord/Calculations.html#method-i-average

Related

How do I set and increment a ticket number before save in Rails?

I'm making tickets for a small (<150 person) event and would like to auto increment ticket numbers and save those numbers to the database. Do I use a "hidden_field"? My database is set up with ticket.number as an array, because a person may buy several tickets. So what's the proper syntax? Thanks!
Is your database PostgreSQL? That supports storing arrays natively, so you can do
"select max(x) from people, unnest(ticket_array) x"
I haven't tested it so I'm not positive about the phrasing but it's something like that.
However your database is small enough that you can do it in Rails which should work for any type of database if you're storing the array as a serialised string.
last_number = Person.all.flat_map { |person| person.ticket_array || [] }.max
You'd use this in a before_save callback, and I assume you have an integer column number_of_tickets, so you could do...
class Person < ActiveRecord::Base
  serialize :ticket_array
  before_save :determine_ticket_numbers
  def determine_ticket_numbers
    # Only assign numbers when the record is first saved
    return if persisted?
    self.ticket_array ||= []
    # Highest number issued so far; 0 when no tickets exist yet
    last_number = self.class.all.flat_map { |person| person.ticket_array || [] }.max
    last_number ||= 0
    number_of_tickets.times { self.ticket_array << (last_number += 1) }
  end
end
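The numbering step can be tried on its own without a database. A minimal sketch, assuming ticket numbers are positive integers; next_ticket_numbers is a hypothetical helper, not part of Rails.

```ruby
# Returns the next `count` ticket numbers after the highest one already issued.
# `existing` is an array of per-person ticket arrays; nil or empty entries are allowed.
def next_ticket_numbers(existing, count)
  last_number = existing.compact.flatten.max || 0
  ((last_number + 1)..(last_number + count)).to_a
end

p next_ticket_numbers([[1, 2], [], nil, [3]], 2) # prints [4, 5]
p next_ticket_numbers([], 3)                     # prints [1, 2, 3]
```

Two people saving at the same moment could still read the same maximum, so a unique constraint or a database sequence is the safer choice if simultaneous purchases are possible.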

Generate array of daily avg values from db table (Rails)

Context:
Trying to generate an array with 1 element for each created_at day in the db table. Each element is the average of the points (integer) column from records with that created_at day.
This will later be graphed to display the avg number of points on each day.
Result:
I've been successful in doing this, but it feels like an unnecessary amount of code to generate the desired result.
Code:
def daily_avg
  # get all data for current user
  records = current_user.rounds
  # make array of long dates
  long_date_array = records.pluck(:created_at)
  # create array to store short dates
  short_date_array = []
  # remove time of day
  long_date_array.each do |date|
    short_date_array << date.strftime('%Y%m%d')
  end
  # remove duplicate dates
  short_date_array.uniq!
  # array of avg by date
  array_of_avg_values = []
  # iterate through each day
  short_date_array.each do |date|
    temp_array = []
    # make array of records with this day
    records.each do |record|
      if date === record.created_at.strftime('%Y%m%d')
        temp_array << record.audio_points
      end
    end
    # calc avg by day and append to array_of_avg_values
    array_of_avg_values << temp_array.inject(0.0) { |sum, el| sum + el } / temp_array.size
  end
  render json: array_of_avg_values
end
Question:
I think this is a common extraction problem needing to be solved by lots of applications, so I'm wondering if there's a known repeatable pattern for solving something like this?
Or a more optimal way to solve this?
(I'm barely a junior developer so any advice you can share would be appreciated!)
Yes, that's a lot of unnecessary stuff when you can just go down to SQL to do it (I'm assuming you have a class called Round in your app):
class Round
  DAILY_AVERAGE_SELECT = "SELECT
      DATE(rounds.created_at) AS day_date,
      AVG(rounds.audio_points) AS audio_points
    FROM rounds
    WHERE rounds.user_id = ?
    GROUP BY DATE(rounds.created_at)
  "
  def self.daily_average(user_id)
    connection.select_all(sanitize_sql_array([DAILY_AVERAGE_SELECT, user_id]), "daily-average")
  end
end
Doing this directly in the database will be faster (and take less code) than doing it in Ruby as you are now.
I advise you to do something like this:
grouped =
  records.order(:created_at).group_by do |r|
    r.created_at.strftime('%Y%m%d')
  end
This first runs a single ordered SQL query, then groups the resulting records by their created_at value reduced to just a date.
points =
  grouped.map do |(date, values)|
    [ date, values.sum(&:audio_points).to_f / values.size ]
  end.to_h
# => { "19700101" => 155.0, ... }
Then you map over the grouped hash to calculate the average audio_points for each date.
You can use group and calculations methods built in AR: http://guides.rubyonrails.org/active_record_querying.html#group
http://guides.rubyonrails.org/active_record_querying.html#calculations
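To see the group-then-average pattern in isolation, here is a small sketch in plain Ruby. RoundRecord and daily_averages are hypothetical names standing in for the rounds table; only created_at and audio_points come from the question.

```ruby
require "date"

# Hypothetical stand-in for a row of the rounds table.
RoundRecord = Struct.new(:created_at, :audio_points)

# Maps each calendar day (as a Date) to the mean audio_points for that day.
def daily_averages(rounds)
  rounds
    .group_by { |r| r.created_at.to_date }
    .transform_values { |rs| rs.sum(&:audio_points).fdiv(rs.size) }
end

rounds = [
  RoundRecord.new(Time.new(2024, 1, 1, 9), 10),
  RoundRecord.new(Time.new(2024, 1, 1, 17), 20),
  RoundRecord.new(Time.new(2024, 1, 2, 8), 5)
]
averages = daily_averages(rounds)
puts averages[Date.new(2024, 1, 1)] # prints 15.0
```

With ActiveRecord, the built-in equivalent should be along the lines of current_user.rounds.group("DATE(created_at)").average(:audio_points), which returns a hash keyed by day and does the work in the database.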

Equivalent of find_each for foo_ids?

Given this model:
class User < ActiveRecord::Base
has_many :things
end
Then we can do this:
@user = User.find(123)
@user.things.find_each{ |t| print t.name }
@user.thing_ids.each{ |id| print id }
There are a large number of @user.things and I want to iterate through only their ids in batches, like with find_each. Is there a handy way to do this?
The goal is to:
not load the entire thing_ids array into memory at once
still only load arrays of thing_ids, and not instantiate a Thing for each id
Rails 5 introduced in_batches method, which yields a relation and uses pluck(primary_key) internally. And we can make use of the where_values_hash method of the relation in order to retrieve already-plucked ids:
@user.things.in_batches { |batch_rel| p batch_rel.where_values_hash['id'] }
Note that in_batches has order and limit restrictions similar to find_each.
This approach is a bit hacky since it depends on the internal implementation of in_batches and will fail if in_batches stops plucking ids in the future. A non-hacky method would be batch_rel.pluck(:id), but this runs the same pluck query twice.
You can try something like the code below: each_slice takes 4 elements at a time, and you can then loop over those 4. Note that thing_ids still loads the whole array of ids into memory first.
@user.thing_ids.each_slice(4) do |batch|
  batch.each do |id|
    puts id
  end
end
Unfortunately, there is no one-liner or helper that does this, so instead:
limit = 1000
offset = 0
loop do
  # order(:id) keeps pages stable between queries
  batch = @user.things.order(:id).limit(limit).offset(offset).pluck(:id)
  batch.each { |id| puts id }
  break if batch.count < limit
  offset += limit
end
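An alternative to a growing offset is to continue from the last id seen, which is roughly what find_each does. The sketch below shows only the loop; each_id_batch is a hypothetical helper, the lambda is an in-memory stand-in for the database query, and ids are assumed to be positive integers.

```ruby
# Yields batches of ids, asking `fetch` for ids greater than the last one seen.
# `fetch` takes (last_id, limit) and must return ids in ascending order.
def each_id_batch(batch_size, fetch)
  last_id = 0
  loop do
    batch = fetch.call(last_id, batch_size)
    break if batch.empty?
    yield batch
    break if batch.size < batch_size
    last_id = batch.last
  end
end

# In-memory stand-in for the table; a Rails version might instead be
# ->(last_id, limit) { @user.things.where("id > ?", last_id).order(:id).limit(limit).pluck(:id) }
all_ids = (1..10).to_a
fetch = ->(last_id, limit) { all_ids.select { |id| id > last_id }.first(limit) }

batches = []
each_id_batch(4, fetch) { |ids| batches << ids }
p batches # prints [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```

Only one batch of ids is held at a time, and no Thing objects are instantiated.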
UPDATE Final EDIT:
I have updated my answer after reviewing your updated question (not sure why you would downvote after I backed up my answer with source code to prove it...but I don't hold grudges :)
Here is my solution, tested and working, so you can accept this as the answer if it pleases you.
Below, I have extended ActiveRecord::Relation, overriding the find_in_batches method to accept one additional option, :relation. When set to true, it will return the activerecord relation to your block, so you can then use your desired method 'pluck' to get only the ids of the target query.
#put this file in your lib directory:
#active_record_extension.rb
module ARAExtension
  extend ActiveSupport::Concern
  def find_in_batches(options = {})
    options.assert_valid_keys(:start, :batch_size, :relation)
    relation = self
    start = options[:start]
    batch_size = options[:batch_size] || 1000
    unless block_given?
      return to_enum(:find_in_batches, options) do
        total = start ? where(table[primary_key].gteq(start)).size : size
        (total - 1).div(batch_size) + 1
      end
    end
    if logger && (arel.orders.present? || arel.taken.present?)
      logger.warn("Scoped order and limit are ignored, it's forced to be batch order and batch size")
    end
    relation = relation.reorder(batch_order).limit(batch_size)
    records = start ? relation.where(table[primary_key].gteq(start)) : relation
    records = records.to_a unless options[:relation]
    while records.any?
      records_size = records.size
      primary_key_offset = records.last.id
      raise "Primary key not included in the custom select clause" unless primary_key_offset
      yield records
      break if records_size < batch_size
      records = relation.where(table[primary_key].gt(primary_key_offset))
      records = records.to_a unless options[:relation]
    end
  end
end
ActiveRecord::Relation.send(:include, ARAExtension)
here is the initializer
#put this file in config/initializers directory:
#extensions.rb
require "active_record_extension"
Originally, this method forced a conversion of the relation to an array of ActiveRecord objects and returned it to you. Now, I optionally allow you to return the query before the conversion to the array happens. Here is an example of how to use it:
@user.things.find_in_batches(:batch_size=>10, :relation=>true).each do |batch_query|
  # do any kind of further querying/filtering/mapping that you want
  # show that this is actually an activerecord relation, not an array of AR objects
  puts batch_query.to_sql
  # add more conditions to this query, this is just an example
  batch_query = batch_query.where(:color=>"blue")
  # pluck just the ids
  puts batch_query.pluck(:id)
end
Ultimately, if you don't like any of the answers given on an SO post, you can roll-your-own solution. Consider only downvoting when an answer is either way off topic or not helpful in any way. We are all just trying to help. Downvoting an answer that has source code to prove it will only deter others from trying to help you.
Previous EDIT
In response to your comment (because my comment would not fit):
Calling thing_ids internally uses pluck, and pluck internally uses select_all, which instantiates an ActiveRecord Result.
Previous 2nd EDIT:
This line of code within pluck returns an activerecord Result:
....
result = klass.connection.select_all(relation.arel, nil, bound_attributes)
...
I just stepped through the source code for you. Using select_all will save you some memory, but in the end, an activerecord Result was still created and mapped over even when you are using the pluck method.
I would use something like this:
@user.things.find_each(batch_size: 1000).map(&:id)
This will give you an array of the ids, although it still instantiates a Thing for each record.

Performing row operations on ActiveRecord objects prior to an aggregate function

I am trying to calculate a weighted average of a variable in my model based on a second variable in my model and I'm having trouble finding a way to do it through ActiveRecord.
class Employer < ActiveRecord::Base
  attr_accessible :name, :number_of_employees, :average_age
  def self.wt_avg_age
    # return sum(number_of_employees * average_age) / sum(number_of_employees)
  end
end
In straight SQL, I would use:
SELECT id, SUM(number_of_employees*average_age)/SUM(number_of_employees)
FROM employer
GROUP BY name
Can I execute something like this on an ActiveRecord relation in an eloquent way (i.e., without pulling down separate arrays and iterating through every record to get my numerator)? I have tried different combinations using .select(), .pluck(), and sum() without any luck. I'm having trouble getting the ActiveRecord object to perform the sumproduct.
You should be able to do something like:
Employer.select("name, (SUM(number_of_employees*average_age)/SUM(number_of_employees)) as sum").group(:name)
That will return Employer instances to you, but they will only have the .name and .sum attributes on them. This will run the exact SQL query that you wanted.
It looks like ActiveRecord::Calculations#sum takes a block:
# File activerecord/lib/active_record/relation/calculations.rb, line 92
def sum(*args)
  if block_given?
    self.to_a.sum(*args) {|*block_args| yield(*block_args)}
  else
    calculate(:sum, *args)
  end
end
(also see http://api.rubyonrails.org/classes/Enumerable.html#method-i-sum)
So you might try:
def self.wt_avg_age
  # The block form loads every record into memory
  numerator = all.to_a.sum { |e| e.number_of_employees * e.average_age }
  denominator = sum(:number_of_employees)
  # to_f avoids integer division when both columns are integers
  numerator.to_f / denominator
end
Give this a try; it may work:
def self.wt_avg_age
  a = Employer.sum("number_of_employees * average_age")
  b = Employer.sum("number_of_employees")
  a.to_f / b
end
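The arithmetic itself is easy to check outside the database. A minimal sketch, with EmployerRow and weighted_average_age as hypothetical names and made-up figures:

```ruby
# Hypothetical stand-in for an employers row.
EmployerRow = Struct.new(:name, :number_of_employees, :average_age)

# Average age weighted by head count; nil when there are no employees at all.
def weighted_average_age(rows)
  total = rows.sum(&:number_of_employees)
  return nil if total.zero?
  rows.sum { |r| r.number_of_employees * r.average_age }.fdiv(total)
end

rows = [EmployerRow.new("A", 10, 30), EmployerRow.new("B", 30, 40)]
puts weighted_average_age(rows) # prints 37.5
```

Here (10 * 30 + 30 * 40) / 40 = 37.5, whereas the unweighted mean of 30 and 40 would be 35.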

Mongoid random document

Let's say I have a Collection of users. Is there a way of using mongoid to find n random users in the collection where it does not return the same user twice? For now let's say the user collection looks like this:
class User
  include Mongoid::Document
  field :name
end
Simple huh?
Thanks
If you just want one document, and don't want to define a new criteria method, you could just do this:
random_model = Model.skip(rand(Model.count)).first
If you want to find a random model based on some criteria:
criteria = Model.scoped_whatever.where(conditions) # query example
random_model = criteria.skip(rand(criteria.count)).first
The best solution is going to depend on the expected size of the collection.
For tiny collections, just get all of them and .shuffle.slice!
For small sizes of n, you can get away with something like this:
result = (0..User.count-1).sort_by { rand }.slice(0, n).map { |i| User.skip(i).first }
For large sizes of n, I would recommend creating a "random" column to sort by. See here for details: http://cookbook.mongodb.org/patterns/random-attribute/ https://github.com/mongodb/cookbook/blob/master/content/patterns/random-attribute.txt
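For the small-n case, the distinct offsets can also be picked with Array#sample, which never repeats a position. This is only a sketch of the index selection; random_offsets is a hypothetical helper and the Mongoid lookup is left as a comment.

```ruby
# Picks up to n distinct positions from 0...count, in random order.
def random_offsets(count, n, random: Random.new)
  (0...count).to_a.sample(n, random: random)
end

offsets = random_offsets(10, 3)
# Each offset could then be used as in the line above: User.skip(offset).first
puts offsets.size # prints 3
```

If n is larger than the count, sample simply returns every position once.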
MongoDB 3.2 comes to the rescue with $sample.
EDIT: The most recent version of Mongoid has implemented $sample, so you can call YourCollection.all.sample(5)
Previous versions of Mongoid
Mongoid doesn't support sample until Mongoid 6, so you have to run this aggregate query with the Mongo driver:
samples = User.collection.aggregate([ { '$sample': { size: 3 } } ])
# call samples.to_a if you want to get the objects in memory
What you can do with that
I believe the functionality should make its way to Mongoid soon, but in the meantime:
module Utility
  module_function
  def sample(model, count)
    ids = model.collection.aggregate([
      { '$sample': { size: count } }, # Sample from the collection
      { '$project': { _id: 1} } # Keep only ID fields
    ]).to_a.map(&:values).flatten # Some Ruby magic
    model.find(ids)
  end
end
Utility.sample(User, 50)
If you really want simplicity you could use this instead:
class Mongoid::Criteria
  def random(n = 1)
    # Shuffle all offsets and keep the first n
    indexes = (0..self.count-1).sort_by{rand}.slice(0,n)
    if n == 1
      return self.skip(indexes.first).first
    else
      return indexes.map{ |index| self.skip(index).first }
    end
  end
end
module Mongoid
  module Finders
    def random(n = 1)
      criteria.random(n)
    end
  end
end
You just have to call User.random(5) and you'll get 5 random users.
It'll also work with filtering, so if you want only registered users you can do User.where(:registered => true).random(5).
This will take a while for large collections so I recommend using an alternate method where you would take a random division of the count (e.g.: 25 000 to 30 000) and randomize that range.
You can do this by:
1. Generate a random offset that still leaves room to pick the next n elements (without exceeding the limit).
2. Assume count is 10 and n is 5.
3. Check whether the given n is less than the total count.
4. If not, set the offset to 0 and go to step 8.
5. If so, subtract n from the total count, which gives 5.
6. Use this to pick a random number, which will be between 0 and 5 (assume 2).
7. Use the random number 2 as the offset.
8. Now you can take the random 5 users by passing this offset and n (5) as the limit.
9. You now get users 3 to 7.
Code:
>> cnt = User.count
=> 10
>> n = 5
=> 5
>> offset = 0
=> 0
>> if n<cnt
>> offset = rand(cnt-n+1)
>> end
=> 2
>> User.skip(offset).limit(n)
and you can put this in a method
def get_random_users(n)
  offset = 0
  cnt = User.count
  if n < cnt
    # rand(k) returns 0...k, so add 1 to allow the last valid offset
    offset = rand(cnt - n + 1)
  end
  User.skip(offset).limit(n)
end
and call it like
rand_users = get_random_users(5)
hope this helps
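The offset calculation from the steps above can be isolated and checked without Mongoid. A minimal sketch, with random_window_offset as a hypothetical name:

```ruby
# Picks a random starting offset so that n consecutive items still fit
# inside a collection of `count` items. Returns 0 when n covers everything.
def random_window_offset(count, n, random: Random.new)
  return 0 if n >= count
  # Valid offsets are 0..(count - n), hence the + 1
  random.rand(count - n + 1)
end

offset = random_window_offset(10, 5)
# Would then be used as User.skip(offset).limit(5)
puts offset.between?(0, 5) # prints true
```

Keep in mind that this returns a contiguous run of users starting at a random position, so neighbouring users always appear together; it is cheaper than n separate random picks, but not equivalent to them.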
Since I want to keep a criteria, I do:
scope :random, ->{
  random_field_for_ordering = fields.keys.sample
  random_direction_to_order = %w(asc desc).sample
  order_by([[random_field_for_ordering, random_direction_to_order]])
}
Just encountered such a problem. Tried
Model.all.sample
and it works for me
The approach from @moox is really interesting, but I doubt that monkeypatching the whole of Mongoid is a good idea here. So my approach is to write a concern, Randomizable, that can be included in each model where you use this feature. This goes in app/models/concerns/randomizable.rb:
module Randomizable
  extend ActiveSupport::Concern
  module ClassMethods
    def random(n = 1)
      # Shuffle all offsets and keep the first n
      indexes = (0..count - 1).sort_by { rand }.slice(0, n)
      return skip(indexes.first).first if n == 1
      indexes.map { |index| skip(index).first }
    end
  end
end
Then your User model would look like this:
class User
  include Mongoid::Document
  include Randomizable
  field :name
end
And the tests:
require 'spec_helper'
class RandomizableCollection
  include Mongoid::Document
  include Randomizable
  field :name
end
describe RandomizableCollection do
  before do
    RandomizableCollection.create name: 'Hans Bratwurst'
    RandomizableCollection.create name: 'Werner Salami'
    RandomizableCollection.create name: 'Susi Wienerli'
  end
  it 'returns a random document' do
    srand(2)
    expect(RandomizableCollection.random(1).name).to eq 'Werner Salami'
  end
  it 'returns an array of random documents' do
    srand(1)
    expect(RandomizableCollection.random(2).map(&:name)).to eq ['Susi Wienerli', 'Hans Bratwurst']
  end
end
I think it is better to focus on randomizing the returned result set so I tried:
Model.all.to_a.shuffle
Hope this helps.
