Sum on multiple columns with Activerecord - ruby-on-rails

I am new to ActiveRecord. I want to sum multiple columns of a model, Student. My Student model looks like this:
class Student < ActiveRecord::Base
attr_accessible :class, :roll_num, :total_mark, :marks_obtained, :section
end
I want something like that:
total_marks, total_marks_obtained = Student.where(:id=>student_id).sum(:total_mark, :marks_obtained)
But it gives the following error:
NoMethodError: undefined method `except' for :marks_obtained:Symbol
So I am asking whether I have to query the model two times for the above, i.e. one to find total marks and another to find marks obtained.

You can use pluck to directly obtain the sum:
Student.where(id: student_id).pluck('SUM(total_mark)', 'SUM(marks_obtained)')
# SELECT SUM(total_mark), SUM(marks_obtained) FROM students WHERE id = ?
You can add the desired columns or calculated fields to pluck method, and it will return an array with the values.
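Because pluck returns one inner array per result row, the two sums come back nested inside a single row. A minimal sketch of unpacking them, using a hard-coded array in place of the query result (the helper name unpack_sums and the figures are made up for illustration):

```ruby
# Stand-in for the value returned by
#   Student.where(id: student_id).pluck('SUM(total_mark)', 'SUM(marks_obtained)')
# The helper name and the numbers are hypothetical.
def unpack_sums(rows)
  # An aggregate query without GROUP BY yields exactly one row.
  total_marks, marks_obtained = rows.first
  # SUM over zero matching rows is NULL in SQL, which arrives as nil; fall back to 0.
  [total_marks || 0, marks_obtained || 0]
end

total_marks, total_marks_obtained = unpack_sums([[500, 420]])
puts "#{total_marks_obtained} / #{total_marks}"
```

The nil fallback is a design choice of this sketch; drop it if you would rather see nil when no rows match.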

If you just want the combined sum of the columns total_mark and marks_obtained as a single value, try this:
Student.where(:id=>student_id).sum('total_mark + marks_obtained')

You can use raw SQL if you need to. Something like this returns an object from which you'll have to extract the values (I know you asked for Active Record):
Student.select("SUM(students.total_mark) AS total_mark, SUM(students.marks_obtained) AS marks_obtained").where(:id=>student_id)
For Rails 4.2 (earlier versions unchecked):
Student.select("SUM(students.total_mark) AS total_mark, SUM(students.marks_obtained) AS marks_obtained").where(:id=>student_id)[0]
NB the brackets following the statement. Without them the statement returns a Student::ActiveRecord_Relation, not the AR instance. What's significant about this is that you CANNOT use first on the relation:
....where(:id=>student_id).first #=> PG::GroupingError: ERROR: column "students.id" must appear in the GROUP BY clause or be used in an aggregate function

Another method is to use ActiveRecord::Calculations#pluck, then Enumerable#sum on each inner array pair and again on the outer array:
Student
.where(id: student_id)
.pluck(:total_mark, :marks_obtained)
.map(&:sum)
.sum
The resulting SQL query is simple:
SELECT "students"."total_mark",
"students"."marks_obtained"
FROM "students"
WHERE "students"."id" = $1
The initial result of pluck will be an array of array pairs, e.g.:
[[10, 5], [9, 2]]
.map(&:sum) will run sum on each pair, totalling the pair and flattening the array:
[15, 11]
Finally .sum on the flattened array will result in a single value.
Edit:
Note that while there is only a single query, your database will return a result row for each record matched by the where. This method does the totalling in Ruby, so if there are many records (i.e. thousands), it may be slower than having SQL do the calculation itself, as noted in the accepted answer.
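The two summing steps can be tried in plain Ruby on the sample rows shown above. The question asked for a separate total per column, so the sketch also includes a per-column variant using Array#transpose; that variant and both helper names are additions for illustration, not part of the answer.

```ruby
# rows stands in for the result of pluck(:total_mark, :marks_obtained).
def grand_total(rows)
  # Sum each pair, then sum the per-row totals.
  rows.map(&:sum).sum
end

def column_totals(rows)
  # transpose turns rows into columns, so each column can be summed on its own.
  rows.transpose.map(&:sum)
end

rows = [[10, 5], [9, 2]]
p grand_total(rows)    # 26
p column_totals(rows)  # [19, 7]
```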

Similar to the accepted answer, but I'd suggest using Arel as follows to avoid string literals (apart from renaming columns, if needed):
Student
.where(id: student_id)
.select(Student.arel_table[:total_mark].sum, Student.arel_table[:marks_obtained].sum)
This gives you an ActiveRecord::Relation over which you can iterate or, as you'll only get one row, on which you can call .first (at least on MySQL).

Recently, I also had the requirement to sum up multiple columns of an ActiveRecord relation. I ended up with the following (reusable) scope:
scope :values_sum, ->(*keys) {
summands = keys.collect { |k| arel_table[k].sum.as(k.to_s) }
select(*summands)
}
So, given a model such as Order with columns net_amount and gross_amount, you could use it as follows (the scope returns a relation, so take fetches its single row):
o = Order.today.values_sum(:net_amount, :gross_amount).take
o.net_amount # -> sum of net amount
o.gross_amount # -> sum of gross amount

Related

Rails: How to get enum value for joined table?

I have two models with enum fields:
class TempAsset < ApplicationRecord
enum state: { running: 0, stopped: 1, terminated: 2 }
end
class AssetCredential < ApplicationRecord
enum map_status: { pending: 0, inprogress: 1, passed: 2, failed: 3 }
end
When I select column from the first table, it gives proper values from enum:
TempAsset
.joins('INNER JOIN asset_credentials
ON temp_assets.instance_id = asset_credentials.instance_id')
.pluck(:state)
.uniq
# ["stopped", "running"]
But, it gives numbers when I select column from the joined table:
TempAsset
.joins('INNER JOIN asset_credentials
ON temp_assets.instance_id = asset_credentials.instance_id')
.pluck(:map_status)
.uniq
# [0, 3, 2, 1]
So, should I do something like this:
AssetCredential.map_statuses.key(0) => "pending"
AssetCredential.map_statuses.key(1) => "inprogress"
Or is there any better way to do the same?
ActiveRecord::Enum is not designed to work across joined tables, so you cannot select or pluck another table's enum column and expect the mapped value in return. As you said, what you can do is pluck the integer value and do the mapping yourself.
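Doing the mapping yourself can be sketched in plain Ruby. The hash below is typed out by hand to mirror what AssetCredential.map_statuses returns, so that the snippet runs without Rails; in an application you would presumably pass the model's own hash instead. The helper name status_names is hypothetical.

```ruby
# Hand-written copy of the enum definition from the question.
MAP_STATUSES = { "pending" => 0, "inprogress" => 1, "passed" => 2, "failed" => 3 }.freeze

def status_names(plucked_integers, mapping = MAP_STATUSES)
  # invert gives { 0 => "pending", 1 => "inprogress", ... } for one-step lookups.
  names_by_value = mapping.invert
  # fetch raises KeyError on an unknown integer instead of silently returning nil.
  plucked_integers.map { |value| names_by_value.fetch(value) }
end

p status_names([0, 3, 2, 1])
```

Inverting the hash once avoids calling key for every plucked value.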
Or, as in my case, use the enumerize gem, which stores the values themselves in the database and gives you more options and customization, such as validation and I18n. With this gem you can use your code above to pluck the expected values (because it stores the exact value rather than an integer mapping).
The problem here is that you are using Active Record enums. These are stored as integers in the database, and the mapping to text is done at the application level. pluck works at the db level.
What's happening is that when you do an inner join, it collects columns from both tables. When you pluck from the table you are joining from, Active Record knows how to map the value to the enum. In the second case you are plucking a column from another table, and Active Record does not have the context to map it.
You can try using Postgres enums for proper results.
TempAsset
.joins('
JOIN asset_credentials
ON temp_assets.instance_id = asset_credentials.instance_id
')
.where(
state: TempAsset.states[:terminated],
asset_credentials: { map_status: AssetCredential.map_statuses[:passed] }
)
You can change the keys being passed in to whatever you need. I'm not sure which model you want returned, but you could switch things around to suit your needs.
I tried to simulate your problem. The interesting thing which I found here was this issue does not arise when we use 'joins' on 'one-to-many' relationship.
i.e., if one 'TempAsset' has many 'AssetCredentials', the query
TempAsset.joins(:asset_credentials)
gives out enum values as you expect. Whereas
AssetCredential.joins(:temp_asset) does produce the issue which you have mentioned.
So, if you can establish a 'has_many' relationship between 'TempAsset' and 'AssetCredentials' through a 'has_one' relationship between 'TempAsset' and 'Instance', the issue could get fixed.
Given that the Rails associations are set up, you can also try something like:
TempAsset
.joins('join asset_credentials on temp_assets.instance_id = asset_credentials.instance_id')
.map { |temp_asset| temp_asset.asset_credential.map_status }
.uniq
Try this:
AssetCredential.joins('INNER JOIN temp_assets ON asset_credentials.instance_id = temp_assets.instance_id').pluck(:map_status).uniq

How to query to return the most common foreign key in a join table with rails

I have 3 models. Project, ProjectMaterial, and Material
A Project has_many ProjectMaterials and many Materials through ProjectMaterials.
This is bidirectional, with ProjectMaterial acting as a join table with user-submittable attributes.
I'd like to query the ProjectMaterial model to find the most frequent value of material_id. This way I can use it to find the most frequently used material.
Any help with a query would be greatly appreciated. I'm stuck. Thanks in advance!
You can chain group, count and sort methods on your ActiveRecord query like this:
ProjectMaterial.group(:material_id).count.values.sort.last
The first part, ProjectMaterial.group(:material_id).count, gives you a hash of {material_id0 => rows_count0, material_id1 => rows_count1, ...}. You can then take the values of the hash as an array, sort it and take the last item. Note that this yields the highest row count rather than the material_id it belongs to.
One way could be to pluck the ids into an array, then find the most frequent one.
ids = ProjectMaterial.pluck(:material_id)
For example: Ruby: How to find item in array which has the most occurrences?
Or better, by query to get a hash with counts:
counts = ProjectMaterial.group(:material_id).count
Once you know that you get a hash, you can sort by any ruby method, picking the most frequent or the n most frequent. Example of sorting:
counts.sort_by { |_, v| v }
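Picking the most frequent id, or the n most frequent, out of such a counts hash could look like the sketch below. The hash is hard-coded with made-up numbers in place of ProjectMaterial.group(:material_id).count, and both helper names are hypothetical.

```ruby
# counts maps material_id => number of rows, as group(:material_id).count would.
def most_common_id(counts)
  pair = counts.max_by { |_id, count| count }
  # max_by returns nil for an empty hash, so guard before taking the key.
  pair && pair.first
end

def top_ids(counts, n)
  # max_by(n) returns the n largest [id, count] pairs, largest first.
  counts.max_by(n) { |_id, count| count }.map(&:first)
end

counts = { 7 => 2, 3 => 5, 9 => 4 }
p most_common_id(counts)  # 3
p top_ids(counts, 2)      # [3, 9]
```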

How to make ActiveRecord query unique by a column

I have a Company model that has many Disclosures. The Disclosure has columns named title, pdf and pdf_sha256.
class Company < ActiveRecord::Base
has_many :disclosures
end
class Disclosure < ActiveRecord::Base
belongs_to :company
end
I want to make it unique by pdf_sha256 and if pdf_sha256 is nil that should be treated as unique.
If it is an Array, I'll write like this.
companies_with_sha256 = company.disclosures.where.not(pdf_sha256: nil).group_by(&:pdf_sha256).map do |key,values|
values.max_by{|v| v.title.length}
end
companies_without_sha256 = company.disclosures.where(pdf_sha256: nil)
companies = companies_with_sha256 + companies_without_sha256
How can I get the same result by using ActiveRecord query?
It is possible to do it in one query by first getting a different id for each different pdf_sha256 as a subquery, then in the query getting the elements within that set of ids by passing the subquery as follows:
def unique_disclosures_by_pdf_sha256(company)
subquery = company.disclosures.select('MIN(id) as id').group(:pdf_sha256)
company.disclosures.where(id: subquery)
.or(company.disclosures.where(pdf_sha256: nil))
end
The great thing about this is that ActiveRecord relations are evaluated lazily, so the subquery is not run on its own; it is merged into the main query to create a single query in the database. It will then retrieve all the disclosures unique by pdf_sha256 plus all the ones that have pdf_sha256 set to nil.
In case you are curious, given a company, the resulting query will be something like:
SELECT "disclosures".* FROM "disclosures"
WHERE (
"disclosures"."company_id" = $1 AND "disclosures"."id" IN (
SELECT MIN(id) as id FROM "disclosures" WHERE "disclosures"."company_id" = $2 GROUP BY "disclosures"."pdf_sha256"
)
OR "disclosures"."company_id" = $3 AND "disclosures"."pdf_sha256" IS NULL
)
Another advantage of this solution is that the returned value is an ActiveRecord relation, so it won't be loaded until you actually need it. You can also keep chaining queries onto it. For example, you can select only the id instead of the whole model and limit the number of results returned by the database:
unique_disclosures_by_pdf_sha256(company).select(:id).limit(10).each { |d| puts d }
You can achieve this by using the uniq method:
Company.first.disclosures.to_a.uniq(&:pdf_sha256)
This will return the disclosure records unique by the column "pdf_sha256".
Hope this helps you! Cheers
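One caveat worth checking: Array#uniq treats every nil key as the same key, so all records without a pdf_sha256 collapse into one, whereas the question wants each of them kept. A minimal plain-Ruby sketch of a variant that keeps them, using a Struct as a stand-in for the model (the Struct, the helper name and the sample data are assumptions for illustration):

```ruby
# Stand-in for Disclosure records; only the two relevant attributes are modelled.
Disclosure = Struct.new(:title, :pdf_sha256)

def unique_keeping_nils(disclosures)
  # Split into records with a hash and records without one.
  with_hash, without_hash = disclosures.partition(&:pdf_sha256)
  # Deduplicate only the records that have a hash; keep every nil record.
  with_hash.uniq(&:pdf_sha256) + without_hash
end

sample = [
  Disclosure.new("Q1", "abc"),
  Disclosure.new("Q1 copy", "abc"),
  Disclosure.new("Notice A", nil),
  Disclosure.new("Notice B", nil)
]
p sample.uniq(&:pdf_sha256).map(&:title)      # nil records collapse into one
p unique_keeping_nils(sample).map(&:title)    # both nil records survive
```

Like the answer above, this keeps the first record seen for each hash, not the one with the longest title.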
Assuming you are using Rails 5 you could chain a .or command to merge both your queries.
pdf_sha256_unique_disclosures = company.disclosures.where(pdf_sha256: nil).or(company.disclosures.where.not(pdf_sha256: nil))
Then you can proceed with your group_by logic.
However, I'm not exactly sure what the objective is in the example above, and I am curious to better understand how you would use the resulting companies variable.
If you wanted to have a hash of unique pdf_sha256 keys including nil, and its resultant unique disclosure document you could try the following:
sorted_disclosures = company.disclosures.group_by(&:pdf_sha256).each_with_object({}) do |entries, hash|
hash[entries[0]] = entries[1].max_by{|v| v.title.length}
end
This should give you a resultant hash like structure similar to the group_by where your keys are all your unique pdf_sha256 and the value would be the longest named disclosure that match that pdf_sha256.
Why not:
ids = Disclosure.select(:id, :pdf_sha256).distinct.map(&:id)
Disclosure.find(ids)
The id will be distinct either way since it's the primary key, so all you have to do is map the ids and find the Disclosures by id.
If you need a relation with distinct pdf_sha256, where you require no explicit conditions, you can use group for that -
scope :unique_pdf_sha256, -> { where.not(pdf_sha256: nil).group(:pdf_sha256) }
scope :nil_pdf_sha256, -> { where(pdf_sha256: nil) }
You could have used or, but the relation passed to it must be structurally compatible. So even if you get same type of relations in these two scopes, you cannot use it with or.
Edit: To make them structurally compatible with each other, see @AlexSantos's answer.

Active Record - Chain Queries with OR

Rails: 4.1.2
Database: PostgreSQL
For one of my queries, I am using methods from both the textacular gem and Active Record. How can I chain some of the following queries with an "OR" instead of an "AND":
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
I want to chain the last two scopes (fuzzy_search and the where after it) together with an "OR" instead of an "AND." So I want to retrieve all People who are approved AND (whose first name is similar to "Test" OR whose last name contains "Test"). I've been struggling with this for quite a while, so any help would be greatly appreciated!
I dug into fuzzy_search and saw that it will be translated to something like:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rankxxx"
FROM "people"
WHERE (("people"."first_name" % 'abc'))
ORDER BY "rankxxx" DESC
That means that if you don't care about preserving the order, it just filters the result by WHERE (("people"."first_name" % 'abc'))
Knowing that, you can simply write a query with similar functionality:
People.where(status: status_approved)
.where('(first_name % :key) OR (last_name LIKE :key)', key: 'Test')
If you do want an order, please specify what the order should be after combining the two conditions.
After a few days, I came up with the solution! Here's what I did:
This is the query I wanted to chain together with an OR:
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
As Hoang Phan suggested, when you look in the console, this produces the following SQL:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rank69146689305952314"
FROM "people"
WHERE "people"."status" = 1 AND (("people"."first_name" % 'Test')) AND (last_name LIKE 'Test') ORDER BY "rank69146689305952314" DESC
I then dug into the textacular gem and found out how the rank is generated. I found it in the textacular.rb file and then crafted the SQL query using it. I also replaced the "AND" that connected the last two conditions with an "OR":
# Generate a random number for the ordering
rank = rand(100000000000000000).to_s
# Create the SQL query
sql_query = "SELECT people.*, COALESCE(similarity(people.first_name, :query), 0)" +
" AS rank#{rank} FROM people" +
" WHERE (people.status = :status AND" +
" ((people.first_name % :query) OR (last_name LIKE :query_like)))" +
" ORDER BY rank#{rank} DESC"
I took out all of the quotation marks around table and field names in the SQL query because I got error messages when I kept them, even when I used single quotes.
Then, I used the find_by_sql method to retrieve the People object IDs in an array. The symbols (:status, :query, :query_like) are used to protect against SQL injections, so I set their values accordingly:
# Retrieve all the IDs of People who are approved and whose first name and last name match the search query.
# The IDs are sorted in order of most relevant to the search query.
people_ids = People.find_by_sql([sql_query, query: "Test", query_like: "%Test%", status: 1]).map(&:id)
I get the IDs and not the People objects in an array because find_by_sql returns an Array object and not a CollectionProxy object, as would normally be returned, so I cannot use ActiveRecord query methods such as where on this array. Using the IDs, we can execute another query to get a CollectionProxy object. However, there's one problem: If we were to simply run People.where(id: people_ids), the order of the IDs would not be preserved, so all the relevance ranking we did was for nothing.
Fortunately, there's a nice gem called order_as_specified that will allow us to retrieve all People objects in the specific order of the IDs. Although the gem would work, I didn't use it and instead wrote a short line of code to craft conditions that would preserve the order.
order_by = people_ids.map { |id| "people.id='#{id}' DESC" }.join(", ")
If our people_ids array is [1, 12, 3], it would create the following ORDER statement:
"people.id='1' DESC, people.id='12' DESC, people.id='3' DESC"
I learned from this comment that writing an ORDER statement in this way would preserve the order.
Now, all that's left is to retrieve the People objects from ActiveRecord, making sure to specify the order.
people = People.where(id: people_ids).order(order_by)
And that did it! I didn't worry about removing any duplicate IDs because ActiveRecord does that automatically when you run the where command.
I understand that this code is not very portable and would require some changes if any of the people table's columns are modified, but it works perfectly and seems to execute only one query according to the console.
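As an alternative to building the ORDER clause, the ranked order could also be restored in Ruby after loading the records. This is a different technique from the one in the answer, and it returns an Array rather than something you can keep chaining queries on. The sketch below uses a Struct in place of the People model; the Struct and the helper name are hypothetical.

```ruby
# Stand-in for loaded People records.
Person = Struct.new(:id, :first_name)

def in_given_order(records, ids)
  # Index the records by id for constant-time lookup.
  by_id = records.map { |record| [record.id, record] }.to_h
  # uniq mirrors the fact that WHERE id IN (...) returns each row once;
  # compact drops ids that matched no record.
  ids.uniq.map { |id| by_id[id] }.compact
end

# Records as the database might return them, in arbitrary order.
loaded = [Person.new(1, "Tess"), Person.new(3, "Tessa"), Person.new(12, "Test")]
p in_given_order(loaded, [1, 12, 3]).map(&:id)  # [1, 12, 3]
```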

Rails: select unique values from a column

I already have a working solution, but I would really like to know why this doesn't work:
ratings = Model.select(:rating).uniq
ratings.each { |r| puts r.rating }
It selects, but it doesn't print unique values; it prints all values, including the duplicates. And it's in the documentation: http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Model.select(:rating)
The result of this is a collection of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']
If you're going to use Model.select, then you might as well just use DISTINCT, as it will return only the unique values. This is better because it returns fewer rows and should be slightly faster than returning all the rows and then telling Rails to pick the unique values.
Model.select('DISTINCT rating')
Of course, this is provided your database understands the DISTINCT keyword, and most should.
This works too.
Model.pluck("DISTINCT rating")
If you want to also select extra fields:
Model.select('DISTINCT ON (models.ratings) models.ratings, models.id').map { |m| [m.id, m.ratings] }
Model.uniq.pluck(:rating)
# SELECT DISTINCT "models"."rating" FROM "models"
This has the advantages of not using sql strings and not instantiating models
Model.select(:rating).uniq
This code works as 'DISTINCT' (not as Array#uniq) since rails 3.2
Model.select(:rating).distinct
Another way to collect uniq columns with sql:
Model.group(:rating).pluck(:rating)
If I understand correctly:
Current query
Model.select(:rating)
returns an array of objects, and you have written the query
Model.select(:rating).uniq
uniq is applied to the array of objects, and each object is a distinct instance. uniq is doing its job correctly because every object in the array is unique.
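The difference between deduplicating objects and deduplicating values can be seen in plain Ruby. The class below is a hypothetical stand-in for model instances that carry only a rating; like any plain Ruby object, two instances are only equal if they are the same object.

```ruby
# Minimal stand-in for a model instance loaded with select(:rating).
class RatingRow
  attr_reader :rating

  def initialize(rating)
    @rating = rating
  end
end

def distinct_ratings(rows)
  # Extract the plain values first, then deduplicate those.
  rows.map(&:rating).uniq
end

rows = [RatingRow.new(5), RatingRow.new(5), RatingRow.new(3)]
p rows.uniq.size          # 3, because every object is distinct
p distinct_ratings(rows)  # [5, 3]
```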
There are many ways to select distinct ratings:
Model.select('distinct rating').map(&:rating)
or
Model.select('distinct rating').collect(&:rating)
or
Model.select(:rating).map(&:rating).uniq
or
Model.select(:rating).collect(&:rating).uniq
One more thing. The first and second queries find the distinct data in the SQL query itself. At least in the database shown, these queries consider "london" and "london " to be the same, i.e. the trailing space is ignored, so 'london' appears only once in the result.
The third and fourth queries fetch the data with SQL and then apply Ruby's uniq method to deduplicate. These queries consider "london" and "london " to be different, so both 'london' and 'london ' appear in the result.
Please refer to the attached image for more detail and have a look at "Toured / Awaiting RFP".
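The Ruby side of that difference is easy to reproduce: Array#uniq compares strings exactly, so trailing whitespace matters. If you want the Ruby-side result to ignore padding, one option is to strip before deduplicating, as in this sketch (the helper name and sample values are made up):

```ruby
def distinct_ignoring_padding(values)
  # strip removes leading and trailing whitespace before comparing.
  values.map(&:strip).uniq
end

cities = ["london", "london ", "paris"]
p cities.uniq                        # ["london", "london ", "paris"]
p distinct_ignoring_padding(cities)  # ["london", "paris"]
```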
If anyone is looking for the same with Mongoid, that is
Model.distinct(:rating)
Some answers don't take into account that the OP wants an array of values.
Other answers don't work well if your Model has thousands of records.
That said, I think a good answer is:
Model.uniq.select(:ratings).map(&:ratings)
=> "SELECT DISTINCT ratings FROM `models` "
This works because you first generate an array of Model objects (reduced in size because of the select), then extract the only attribute those selected models have (ratings).
You can use the following Gem: active_record_distinct_on
Model.distinct_on(:rating)
Yields the following query:
SELECT DISTINCT ON ( "models"."rating" ) "models".* FROM "models"
In my scenario, I wanted a list of distinct names after ordering them by their creation date and applying offset and limit. Basically a combination of ORDER BY and DISTINCT ON.
All you need to do is put DISTINCT ON inside the pluck method, as follows:
Model.order("name, created_at DESC").offset(0).limit(10).pluck("DISTINCT ON (name) name")
This returns an array of distinct names.
Model.pluck("DISTINCT column_name")
