Currently I have two models
class Author
# gender
# name
end
class Book
# status -> ['published', 'in_progress']
has_one :author
end
I decided to use group_by to group the dataset
def group_by_gender_by_status
books.group_by { |book| [book.author.gender, book.status] }
end
What I get instead is this:
{["male", "published"] => [{BooksRecord}]
["female", "published"] => [{BooksRecord}]
["male", "in_progress"] => [{BooksRecord}]
["female", "in_progress"] => [{BooksRecord}]}
My goal is to get this result
{
female: {
published: 10,
in_progress: 7
},
male: {
published: 6,
in_progress: 9
}
}
so that I can access it via data[:male][:published], which makes the data easier to present.
I think you can do something like this:
books.group_by { |book| book.author.gender }
.transform_values { |books| books.map(&:status).tally }
In particular, this leverages Enumerable#tally, which was added in Ruby 2.7.
You didn't specify which Ruby version you're actually using, though, so if you're stuck on an older one, you could replace the last line with:
.transform_values { |books| books.group_by(&:status).transform_values(&:count) }
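To see what both variants return without a database, here is a minimal sketch that uses plain Structs in place of the ActiveRecord models. The Struct definitions, the sample data, and the helper names are assumptions made for illustration only. Note that both variants produce string keys, so a symbolizing step is needed before data[:male][:published] works.

```ruby
# Stand-ins for the ActiveRecord models; illustration only.
Author = Struct.new(:name, :gender)
Book   = Struct.new(:status, :author)

# Ruby 2.7+: group by gender, then tally the statuses inside each group.
def counts_with_tally(books)
  books.group_by { |book| book.author.gender }
       .transform_values { |group| group.map(&:status).tally }
end

# Older Rubies (2.4+): same result without Enumerable#tally.
def counts_without_tally(books)
  books.group_by { |book| book.author.gender }
       .transform_values { |group| group.group_by(&:status).transform_values(&:count) }
end

# Symbolize both levels so that data[:male][:published] works.
def symbolize_nested(counts)
  counts.map { |gender, by_status| [gender.to_sym, by_status.transform_keys(&:to_sym)] }.to_h
end

alice = Author.new("Alice", "female")
bob   = Author.new("Bob", "male")
books = [
  Book.new("published", alice),
  Book.new("in_progress", alice),
  Book.new("published", bob)
]

data = symbolize_nested(counts_with_tally(books))
puts data[:female][:published]
```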
Enumerable#group_by just creates keys for grouping, so you cannot use it on its own to produce your desired result. Additionally, as books grows, iterating in this fashion will become less and less performant.
You will be better off putting the grouping and counting on the database so that the return value is closer to your desired end result, like so:
def group_by_gender_by_status
books.joins(:author)
.group(Author.arel_attribute(:gender),Book.arel_attribute(:status))
.count
end
This will produce a Hash similar to your current group_by implementation; however, the counting and grouping will be performed on the database side before returning:
{["male", "published"] => 6,
["female", "published"] => 10,
["male", "in_progress"] => 9,
["female", "in_progress"] => 7}
To transition this into your desired nesting we will need to post-process this data:
def group_by_gender_by_status
books.joins(:author)
.group(Author.arel_attribute(:gender),Book.arel_attribute(:status))
.count
.each_with_object(Hash.new {|h,k| h[k] = {}}) do |((gender,status),counter),obj|
obj[gender.to_sym][status.to_sym] = counter
end
end
The end result will be equivalent to your desired result and by moving the grouping and the counting to the database level it should degrade at a much slower rate.
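The post-processing step can be tried in isolation. The sketch below starts from a hash shaped like the one count returns above (the literal values are copied from the example output, and the method name nest_counts is made up for illustration).

```ruby
# Turns {[gender, status] => count} into {gender: {status: count}}.
def nest_counts(flat_counts)
  flat_counts.each_with_object(Hash.new { |h, k| h[k] = {} }) do |((gender, status), count), nested|
    nested[gender.to_sym][status.to_sym] = count
  end
end

flat = {
  ["male", "published"]     => 6,
  ["female", "published"]   => 10,
  ["male", "in_progress"]   => 9,
  ["female", "in_progress"] => 7
}

data = nest_counts(flat)
puts data[:male][:published]
```

One thing to be aware of: the returned hash keeps its default block, so looking up a gender that is not present returns (and stores) an empty hash rather than nil.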
Note: I have no idea where books came from or where this method currently exists. The implementation could potentially be further reduced by this understanding.
Related
I have an ActiveRecord model like this:
class Person < ActiveRecord::Base
attr_accessible :name
end
and need to get a hash mapping Person's ids to their names:
{1 => "Paul", 2 => "Aliyah", 3 => ... }
Now, the obvious way would be
Person.all.collect { |p| [p.id, p.name] }.to_h
However, I don't need to instantiate every Person; I just need the hash. In Rails 4, I can use .pluck(:id, :name) instead of collect, but in 3.x pluck takes only one argument. I did find this workaround to get what I want without loading the models:
Person.all.group(:id).minimum(:name)
Question: will I burn in hell? Also, is there a more elegant way to do this, and are there any drawbacks of this hacky approach that I may not be aware of? Thanks!
Here's a pretty good write up of this situation and various tactics for handling it: Plucking Multiple Columns in Rails 3
My preference of suggested solutions there is to make and include a module:
# multi_pluck.rb
require 'active_support/concern'
module MultiPluck
extend ActiveSupport::Concern
included do
def self.pluck_all(relation, *args)
connection.select_all(relation.select(args))
end
end
end
class Person < ActiveRecord::Base
attr_accessible :name
def self.pluck_id_and_name
result = connection.select_all(select(:id, :name))
if result.any?
# if you are using Ruby 2.1+
result.to_h
# Works in 1.9.3+
Hash[result]
end
end
end
Since the result should be an array of arrays, we can use a nifty trick to get a hash with the first element as keys and the second as values:
Hash[ [ [1, "Joe"], [2, "Jill"] ] ]
# => { 1 => "Joe", 2 => "Jill"}
See:
Convert array of 2-element arrays into a hash, where duplicate keys append additional values
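If the rows come back as hashes keyed by column name rather than as two-element arrays, they need to be turned into pairs first. A minimal sketch, assuming that row shape (the helper name id_name_hash and the sample rows are illustrative only):

```ruby
# Builds {id => name} from rows shaped like an array of hashes
# keyed by column name (this row shape is an assumption).
def id_name_hash(rows)
  Hash[rows.map { |row| [row["id"], row["name"]] }]
end

rows = [
  { "id" => 1, "name" => "Paul" },
  { "id" => 2, "name" => "Aliyah" }
]

p id_name_hash(rows)
```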
To avoid loading all of the objects you could do this:
hash = Hash.new
ActiveRecord::Base.connection.execute("SELECT id, name FROM people").each {|person| hash[person['id'].to_s] = person['name'].to_s}
My company used one of the tactics from Plucking Multiple Columns in Rails 3 before.
But we had trouble upgrading from Rails 3 to Rails 4 because it didn't work in Rails 4.
I suggest using the pluck_all gem, which has high test coverage in Rails 3, 4, and 5, so you will not have to worry about future upgrades.
I have a model Event that is connected to MongoDB using Mongoid:
class Event
include Mongoid::Document
include Mongoid::Timestamps
field :user_name, type: String
field :action, type: String
field :ip_address, type: String
scope :recent, -> { where(:created_at.gte => 1.month.ago) }
end
Usually when I use ActiveRecord, I can do something like this to group results:
@action_counts = Event.group('action').where(:user_name =>"my_name").recent.count
And I get results with the following format:
{"action_1"=>46, "action_2"=>36, "action_3"=>41, "action_4"=>40, "action_5"=>37}
What is the best way to do the same thing with Mongoid?
Thanks in advance
I think you'll have to use map/reduce to do that. Look at this SO question for more details:
Mongoid Group By or MongoDb group by in rails
Otherwise, you can simply use the group_by method from Enumerable. It is less efficient, but it should do the trick unless you have hundreds of thousands of documents.
EDIT: Example of using map/reduce in this case
I'm not really familiar with it, and from reading the docs and playing around I couldn't reproduce the exact hash you want, but try this:
def self.count_and_group_by_action
  map = %Q{
    function() {
      var key = this.action;
      var value = {count: 1};
      emit(key, value);
      // emit a new document {"_id" => "action", "value" => {count: 1}}
      // for each input document our scope is applied to
    }
  }
  # the idea now is to "flatten" the emitted documents that
  # have the same key. Good, but we need to do something with the values
  reduce = %Q{
    function(key, values) {
      var reducedValue = {count: 0};
      // we prepare a reducedValue
      // we then loop through the values associated to the same key,
      // in this case, the 'action' name
      values.forEach(function(value) {
        reducedValue.count += value.count; // we increment the reducedValue - thx captain obvious
      });
      // and return the 'reduced' value for that key,
      // an 'aggregate' of all the values associated to the same key
      return reducedValue;
    }
  }
  self.map_reduce(map, reduce).out(inline: true)
  # we apply the map_reduce functions
  # inline: true is because we don't need to store the results in a collection
  # we just need a hash
end
So when you call:
Event.where(:user_name =>"my_name").recent.count_and_group_by_action
It should return something like:
[{ "_id" => "action1", "value" => { "count" => 20 }}, { "_id" => "action2" , "value" => { "count" => 10 }}]
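If the flat {action => count} hash from the question is needed, the array above can be reshaped in Ruby afterwards. A minimal sketch, assuming the result is an enumerable of documents shaped exactly as shown (the helper name to_action_counts is hypothetical):

```ruby
# Converts map/reduce output into {action => count}.
def to_action_counts(results)
  results.each_with_object({}) do |doc, counts|
    # Counts may come back as floats from the JavaScript side, so cast.
    counts[doc["_id"]] = doc["value"]["count"].to_i
  end
end

results = [
  { "_id" => "action1", "value" => { "count" => 20 } },
  { "_id" => "action2", "value" => { "count" => 10 } }
]

p to_action_counts(results)
```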
Disclaimer: I'm neither a MongoDB nor a Mongoid specialist. I've based my example on what I could find in the referenced SO question and the MongoDB/Mongoid documentation online; any suggestion to make this better would be appreciated.
Resources:
http://docs.mongodb.org/manual/core/map-reduce/
http://mongoid.org/en/mongoid/docs/querying.html#map_reduce
Mongoid Group By or MongoDb group by in rails
This is how I query for one specific element.
results << read_db.collection("users").find(:created_at => {:$gt => initial_date}).to_a
Now, I am trying to query by more than one.
db.inventory.find({ $and: [ { price: 1.99 }, { qty: { $lt: 20 } }, { sale: true } ] } )
Now how do I build up my query? Essentially, I will have a bunch of if statements, and whenever one is true I want to extend my query. I heard there is an .extend command in another language; is there something similar in Ruby?
Essentially i want to do this:
if price
query = "{ price: 1.99 }"
end
if qty
query = query + "{ qty: { $lt: 20 } }"
end
and then just have
db.inventory.find({ $and: [query]})
This syntax is wrong, what is the best way to go about doing this?
You want to end up with something like this:
db.inventory.find({ :$and => some_array_of_mongodb_queries})
Note that I've switched to the hashrocket syntax; you can't use the JavaScript-style notation with symbols that aren't valid labels. The value for :$and should be an array of individual queries, not an array of strings, so you should build an array:
parts = [ ]
parts.push(:price => 1.99) if(price)
parts.push(:qty => { :$lt => 20 }) if(qty)
#...
db.inventory.find(:$and => parts)
BTW, you might run into some floating-point problems with :price => 1.99; you should probably use an integer for that and work in cents instead of dollars. Some sort of check that parts isn't empty might be a good idea too.
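Putting those pieces together, here is a minimal sketch of a helper that builds the selector from optional filters. The helper name, its keyword parameters, and the price_cents field are all hypothetical; it only builds the hash and does not talk to MongoDB.

```ruby
# Builds a MongoDB selector from optional filters (hypothetical helper).
# price_cents, max_qty and on_sale are illustrative parameter names.
def build_selector(price_cents: nil, max_qty: nil, on_sale: nil)
  parts = []
  parts.push(:price_cents => price_cents) if price_cents
  parts.push(:qty => { :$lt => max_qty }) if max_qty
  parts.push(:sale => on_sale) unless on_sale.nil?

  # $and needs a non-empty array, so fall back to "match everything".
  parts.empty? ? {} : { :$and => parts }
end

p build_selector(price_cents: 199, max_qty: 20, on_sale: true)
p build_selector
```

The resulting hash can then be passed to find in place of the hand-built :$and hash.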
Is there a shorter way to do the following?
@user.employees.map { |e| { id: e.id, name: e.name } }
# => [{ id: 1, name: 'Pete' }, { id: 2, name: 'Fred' }]
User has_many employees. Both classes inherit from ActiveRecord::Base.
Two things I don't like about the above
It loads employees into memory before mapping,
It's verbose (subjective I guess).
Is there a better way?
UPDATE:
See @jamesharker's solution: from ActiveRecord >= 4, pluck accepts multiple arguments:
@user.employees.pluck(:id, :name)
PREVIOUS ANSWER:
For a single column in Rails >= 3.2, you can do:
@user.employees.pluck(:name)
... but as you have to pluck two attributes, you can do:
@user.employees.select([:id, :name]).map {|e| {id: e.id, name: e.name} }
# or map &:attributes, maybe
If you really need a lower-level operation, just look at the source of #pluck, which uses select_all.
In ActiveRecord >= 4 pluck accepts multiple arguments so this example would become:
@user.employees.pluck(:id, :name)
If you are stuck with Rails 3 you can add this .pluck_all extension : http://meltingice.net/2013/06/11/pluck-multiple-columns-rails/
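Note that a multi-column pluck returns an array of arrays rather than the array of hashes asked for in the question. A minimal sketch of converting one into the other in plain Ruby (the helper name rows_to_hashes is made up, and the literal rows stand in for real pluck output):

```ruby
# Zips each plucked row with its column names to rebuild the hashes.
def rows_to_hashes(columns, rows)
  rows.map { |row| columns.zip(row).to_h }
end

rows = [[1, "Pete"], [2, "Fred"]] # shape of pluck(:id, :name) output
p rows_to_hashes([:id, :name], rows)
```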
Another option is to:
@user.employees.select(:id, :name).as_json
#=> [{"id" => 1, "name" => "Pete"}, {"id" => 2, "name" => "Fred"}]
I can imagine that you'd rather have symbolized keys.
If that's the case use the #symbolize_keys method.
@user.employees.select(:id, :name).as_json.map(&:symbolize_keys)
#=> [{id: 1, name: "Pete"}, {id: 2, name: "Fred"}]
See: http://api.rubyonrails.org/classes/ActiveModel/Serializers/JSON.html#method-i-as_json
Add this monkey patch, which provides multi-column pluck functionality in Rails 3.
# config/initializers/pluck_all.rb
if Rails.version[0] == '3'
ActiveRecord::Relation.class_eval do
def pluck(*args)
args.map! do |column_name|
if column_name.is_a?(Symbol) && column_names.include?(column_name.to_s)
"#{connection.quote_table_name(table_name)}.#{connection.quote_column_name(column_name)}"
else
column_name.to_s
end
end
relation = clone
relation.select_values = args
klass.connection.select_all(relation.arel).map! do |attributes|
initialized_attributes = klass.initialize_attributes(attributes)
attributes.map do |key, attr|
klass.type_cast_attribute(key, initialized_attributes)
end
end
end
end
end
Rename the method from pluck to pluck_all if you don't want to override the original pluck functionality.
Here is a Rails 3 method that behaves the same as the Rails 4 pluck with multiple columns. It outputs a similar array (rather than a hashed key-value collection), which should save a bit of pain if you ever upgrade and want to clean up the code.
module ActiveRecord
class Relation
def pluck_all(*args)
args.map! do |column_name|
if column_name.is_a?(Symbol) && column_names.include?(column_name.to_s)
"#{connection.quote_table_name(table_name)}.#{connection.quote_column_name(column_name)}"
else
column_name.to_s
end
end
relation = clone
relation.select_values = args
klass.connection.select_all(relation.arel).map! do |attributes|
initialized_attributes = klass.initialize_attributes(attributes)
attributes.map do |key, attribute|
klass.type_cast_attribute(key, initialized_attributes)
end
end
end
end
end
Standing on the shoulders of giants and all
The pluck_all method worked well until I tried to upgrade from Rails 3.2 to Rails 4.
The pluck_all gem solves this by supporting the pluck_all method not only in Rails 3 but also in Rails 4 and Rails 5. I hope this helps those who are about to upgrade their Rails version.
Is there any way of overriding a model's id value on create? Something like:
Post.create(:id => 10, :title => 'Test')
would be ideal, but obviously won't work.
id is just attr_protected, which is why you can't use mass-assignment to set it. However, when setting it manually, it just works:
o = SomeObject.new
o.id = 8888
o.save!
o.reload.id # => 8888
I'm not sure what the original motivation was, but I do this when converting ActiveHash models to ActiveRecord. ActiveHash allows you to use the same belongs_to semantics in ActiveRecord, but instead of having a migration and creating a table, and incurring the overhead of the database on every call, you just store your data in yml files. The foreign keys in the database reference the in-memory ids in the yml.
ActiveHash is great for picklists and small tables that change infrequently and only change by developers. So when going from ActiveHash to ActiveRecord, it's easiest to just keep all of the foreign key references the same.
You could also use something like this:
Post.create({:id => 10, :title => 'Test'}, :without_protection => true)
Although as stated in the docs, this will bypass mass-assignment security.
Try
a_post = Post.new do |p|
p.id = 10
p.title = 'Test'
p.save
end
that should give you what you're looking for.
For Rails 4:
Post.create(:title => 'Test').update_column(:id, 10)
Other Rails 4 answers did not work for me. Many of them appeared to change when checking using the Rails Console, but when I checked the values in MySQL database, they remained unchanged. Other answers only worked sometimes.
For MySQL at least, assigning an id below the auto increment id number does not work unless you use update_column. For example,
p = Post.create(:title => 'Test')
p.id
=> 20 # 20 was the id the auto increment gave it
p2 = Post.create(:id => 40, :title => 'Test')
p2.id
=> 40 # 40 > the next auto increment id (21) so allow it
p3 = Post.create(:id => 10, :title => 'Test')
p3.id
=> 10 # Go check your database, it may say 41.
# Assigning an id to a number below the next auto generated id will not update the db
If you change create to use new + save you will still have this problem. Manually changing the id like p.id = 10 also produces this problem.
In general, I would use update_column to change the id even though it costs an extra database query because it will work all the time. This is an error that might not show up in your development environment, but can quietly corrupt your production database all the while saying it is working.
We can override attributes_protected_by_default:
class Example < ActiveRecord::Base
def self.attributes_protected_by_default
# default is ["id", "type"]
["type"]
end
end
e = Example.new(:id => 10000)
Actually, it turns out that doing the following works:
p = Post.new(:id => 10, :title => 'Test')
p.save(false)
As Jeff points out, id behaves as if it is attr_protected. To prevent that, you need to override the list of default protected attributes. Be careful doing this anywhere that attribute information can come from the outside. The id field is protected by default for a reason.
class Post < ActiveRecord::Base
private
def attributes_protected_by_default
[]
end
end
(Tested with ActiveRecord 2.3.5)
Post.create!(:title => "Test") { |t| t.id = 10 }
This doesn't strike me as the sort of thing that you would normally want to do, but it works quite well if you need to populate a table with a fixed set of ids (for example when creating defaults using a rake task) and you want to override auto-incrementing (so that each time you run the task the table is populated with the same ids):
post_types.each_with_index do |post_type, i|
  PostType.create!(:name => post_type) { |t| t.id = i + 1 }
end
Put this create_with_id function at the top of your seeds.rb and then use it to do your object creation where explicit ids are desired.
def create_with_id(clazz, params)
obj = clazz.send(:new, params)
obj.id = params[:id]
obj.save!
obj
end
and use it like this
create_with_id( Foo, {id:1,name:"My Foo",prop:"My other property"})
instead of using
Foo.create({id:1,name:"My Foo",prop:"My other property"})
This is a similar case, where it was necessary to overwrite the id with a kind of custom date:
# in app/models/calendar_block_group.rb
class CalendarBlockGroup < ActiveRecord::Base
...
before_validation :parse_id
def parse_id
self.id = self.date.strftime('%d%m%Y')
end
...
end
And then :
CalendarBlockGroup.create!(:date => Date.today)
# => #<CalendarBlockGroup id: 27072014, date: "2014-07-27", created_at: "2014-07-27 20:41:49", updated_at: "2014-07-27 20:41:49">
Callbacks work fine.
Good luck!
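One property of the '%d%m%Y' format is worth checking before relying on it: once the string becomes an integer id, a leading zero in the day is dropped and the ids do not sort chronologically. The sketch below illustrates this in plain Ruby; it assumes the string is converted with to_i (how the integer column actually casts it is an assumption here), and the year-first variant is only a suggested alternative, not part of the answer above.

```ruby
require "date"

# Day-first format used above, converted to an integer id.
def day_first_id(date)
  date.strftime("%d%m%Y").to_i
end

# Year-first alternative: ids sort chronologically.
def year_first_id(date)
  date.strftime("%Y%m%d").to_i
end

puts day_first_id(Date.new(2014, 7, 27))
puts day_first_id(Date.new(2014, 7, 5))
puts year_first_id(Date.new(2014, 7, 5))
```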
For Rails 3, the simplest way to do this is to use new with the without_protection option, and then save:
Post.new({:id => 10, :title => 'Test'}, :without_protection => true).save
For seed data, it may make sense to bypass validation which you can do like this:
Post.new({:id => 10, :title => 'Test'}, :without_protection => true).save(validate: false)
We've actually added a helper method to ActiveRecord::Base that is declared immediately prior to executing seed files:
class ActiveRecord::Base
def self.seed_create(attributes)
new(attributes, without_protection: true).save(validate: false)
end
end
And now:
Post.seed_create(:id => 10, :title => 'Test')
For Rails 4, you should be using StrongParams instead of protected attributes. If this is the case, you'll simply be able to assign and save without passing any flags to new:
Post.new(id: 10, title: 'Test').save # optionally pass `{validate: false}`
In Rails 4.2.1 with Postgresql 9.5.3, Post.create(:id => 10, :title => 'Test') works as long as there isn't a row with id = 10 already.
You can insert the id with raw SQL (quote each value so the statement stays valid and safe):
arr = record_line.strip.split(",")
conn = ActiveRecord::Base.connection
values = arr.first(7).map { |value| conn.quote(value) }.join(", ")
sql = "insert into records(id, created_at, updated_at, count, type_id, cycle, date) values(#{values})"
conn.execute sql