In my Rails 4 app I need to find all plans that have either an interval of month OR an amount of 0.
This doesn't work:
class Plan < ActiveRecord::Base
def self.by_interval(interval)
where("interval = ? OR amount = ?", interval, 0)
end
end
I am getting this error:
Mysql2::Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '= 'month' OR amount = 0) ORDER BY amount DESC' at line 1: SELECT `plans`.* FROM `plans` WHERE (interval = 'month' OR amount = 0) ORDER BY amount DESC
What else might work?
Thanks for any help.
'interval' is a reserved word in MySQL (http://dev.mysql.com/doc/refman/5.6/en/reserved-words.html).
Try it like this:
def self.by_interval(interval)
where("`interval` = ? OR amount = ?", interval, 0)
end
Note the backticks around "interval"; they are not quotes.
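As a rough illustration of why the backticks matter, here is a minimal plain-Ruby sketch (no Rails needed) of quoting a MySQL identifier. The helper name quote_mysql_identifier is made up for this example; inside a Rails app the connection's quote_column_name does this job for you.

```ruby
# Hypothetical helper (not part of Rails): wraps a column name in backticks
# so MySQL treats it as an identifier even when it is a reserved word.
# Embedded backticks are doubled, which is MySQL's escaping rule.
def quote_mysql_identifier(name)
  "`#{name.to_s.gsub('`', '``')}`"
end

# Build the WHERE fragment with the quoted column; the values still go
# through "?" placeholders, as in the answer above.
condition = "#{quote_mysql_identifier(:interval)} = ? OR amount = ?"
puts condition
```

The resulting string can be passed to where together with the bind values, so only the column name is quoted by hand.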
Since you want an inclusive or rather than an exclusive one, I would do it in two requests:
class Plan < ActiveRecord::Base
  def self.by_interval(interval)
    # Array union: a plan matching both conditions appears only once
    where(interval: interval).to_a | where(amount: 0).to_a
  end
end
Both queries return arrays of results, and the union merges the second set into the first without duplicates. I do realize this is two separate requests, so it might not be as optimized as you'd like.
Rails' ActiveRecord query caching may be a way to reduce the performance hit. I don't know whether it applies automatically in this case, or whether you would need to load the full table before the queries are performed.
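Just pass the arguments directly into the string, making sure the value is quoted and the column name is in backticks:
def self.by_interval(interval)
  # connection.quote escapes the value so it is safe to interpolate
  where("`interval` = #{connection.quote(interval)} OR amount = 0")
end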
I have a collection of users with various statuses: active, disabled, or deleted (as an enum). I want a count of users with each status as well as a count of the total number of users. What is the most efficient way for me to do that?
I've read the questions on size vs. length vs. count in Ruby and that makes me think I should load all of the user records and then iterate over the collection multiple times to get the length of each status array.
This is what my code looks like currently:
# pagination code omitted...
all_users = User.all
total_count = all_users.length
active_count = all_users.select {|u| u.status == User.statuses['active']}.length
disabled_count = all_users.select {|u| u.status == User.statuses['disabled']}.length
deleted_count = all_users.select {|u| u.status == User.statuses['deleted']}.length
The requests from the client take about 1.25-1.5 seconds as written for 1,000 users.
I've also tried making multiple DB queries with code like this:
# pagination code omitted...
total_count = User.count
active_count = User.where(status: User.statuses['active']).count
disabled_count = User.where(status: User.statuses['disabled']).count
deleted_count = User.where(status: User.statuses['deleted']).count
That might be marginally faster by ~100ms. Is there a faster way to do this?
I'm not sure if it is relevant, but for background info: I am using Rails as an API in this context to an AngularJS frontend. I am using Kaminari to paginate the collection, but I still need counts of each status. I am in a B2B environment so it is unlikely that any instance will have more than 1,000 users. I don't need to scale higher than that.
Thanks in advance!
Do it all at once, in the database by grouping your count query.
User.group(:status).count
Then to get the total number of users just sum the result. Here's an example from one of my tables. Here I'm grouping on a boolean field, but you can group on whatever you want.
> Course.group(:is_enabled).count
=> {false=>46, true=>26524}
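To make the "just sum the result" step concrete, here is a small plain-Ruby sketch. The counts hash is a stand-in for what User.group(:status).count would return, and its numbers are invented. Depending on the Rails version, the keys may be the integer enum values rather than the status names, so treat the string keys as an assumption.

```ruby
# Stand-in for the hash returned by User.group(:status).count;
# the numbers are made up for illustration.
counts = { 'active' => 700, 'disabled' => 250 }

# A status with no rows is simply absent from the hash, so default to 0.
active_count   = counts.fetch('active', 0)
disabled_count = counts.fetch('disabled', 0)
deleted_count  = counts.fetch('deleted', 0)

# The total is the sum of the per-status counts.
total_count = counts.values.inject(0, :+)

puts "total=#{total_count} active=#{active_count} " \
     "disabled=#{disabled_count} deleted=#{deleted_count}"
```

This way a single grouped query supplies all four numbers.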
Regarding "That might be marginally faster by ~100ms": you can speed up those count queries by creating an index on the 'status' column in your database:
# in your terminal
rails g migration AddIndexOnStatusOfUsers
# in db/migrate/xxxxx_add_index_on_status_of_users.rb
def change
add_index :users, :status
end
You should benchmark them all and let us know. Would be interesting. Pure SQL answers are always more scalable of course...
users = User.select(:status)
active_count = 0
disabled_count = 0
deleted_count = 0
users.each do |u|
  # compare with ==; a single = would assign the status instead
  if u.status == 'active'
    active_count += 1
  elsif u.status == 'deleted'
    deleted_count += 1
  else
    disabled_count += 1
  end
end
In Ruby on Rails, I'm trying to order the matches of a player by whether the current user is the winner.
The sort order would be:
Sort by whether the current user is the winner
Then sort by created_at, etc.
I can't figure out how to do the equivalent of :
Match.all.order('winner_id == ?', @current_user.id)
I know this line is not syntactically correct but hopefully it expresses that the order must be:
1) The matches where the current user is the winner
2) the other matches
You can use a CASE expression in an SQL ORDER BY clause. However, AR doesn't believe in using placeholders in an ORDER BY so you have to do nasty things like this:
by_owner = Match.send(:sanitize_sql_array, [ 'case when winner_id = %d then 0 else 1 end', @current_user.id ])
Match.order(by_owner).order(:created_at)
That should work the same in any SQL database (assuming that your @current_user.id is an integer of course).
You can make it less unpleasant by using a class method as a scope:
class Match < ActiveRecord::Base
def self.this_person_first(id)
by_owner = sanitize_sql_array([ 'case when winner_id = %d then 0 else 1 end', id])
order(by_owner)
end
end
# and later...
Match.this_person_first(@current_user.id).order(:created_at)
to hide the nastiness.
This can be achieved using Arel without writing any raw SQL!
matches = Match.arel_table
Match
.order(matches[:winner_id].eq(@current_user.id).desc)
.order(created_at: :desc)
Works for me with Postgres 12 / Rails 6.0.3 without any security warning
If you want to do sorting on the ruby side of things (instead of the SQL side), then you can use the Array#sort_by method:
query.sort_by { |a| a.winner_id == @current_user.id ? 0 : 1 }
If you're dealing with bigger queries, then you should probably stick to the SQL side of things.
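For the Ruby-side approach, a composite sort key can cover both criteria at once. The sketch below is plain Ruby with invented data: MatchRow and current_user_id are stand-ins for the real Match records and @current_user.id.

```ruby
# Stand-in for Match records; the field names mirror the question.
MatchRow = Struct.new(:id, :winner_id, :created_at)

current_user_id = 7 # hypothetical id of the current user

matches = [
  MatchRow.new(1, 3, Time.at(100)),
  MatchRow.new(2, 7, Time.at(200)),
  MatchRow.new(3, 7, Time.at(50)),
  MatchRow.new(4, 9, Time.at(10))
]

# Arrays compare element by element: first the 0/1 "is winner" flag
# (0 sorts first), then created_at as the tie-breaker.
sorted = matches.sort_by do |m|
  [m.winner_id == current_user_id ? 0 : 1, m.created_at]
end

puts sorted.map(&:id).inspect # => [3, 2, 4, 1]
```

Booleans are not comparable in Ruby, which is why the flag is mapped to 0 or 1 before sorting.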
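I would build the query step by step and execute it only once it is complete (mostly because you may not have @current_user). So, something like this:
query = Match.scoped
if @current_user.present?
  # order does not accept "?" placeholders, so sanitize a CASE expression first
  winner_first = Match.send(:sanitize_sql_array, ["case when winner_id = %d then 0 else 1 end", @current_user.id])
  query = query.order(winner_first)
end
query = query.order("created_at")
@results = query.all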
I need to limit and order batches of records and am using find_each. I've seen a lot of people asking for this and no really good solution. If I've missed it, please post a link!
I have 30M records and want to process the 10M with the highest values in the weight column.
I tried using this method someone wrote: find_each_with_order but can't get it to work.
The code from that site doesn't take order as an option. Seems strange given that the name is find_each_with_order. I added it as follows:
class ActiveRecord::Base
# normal find_each does not use given order but uses id asc
def self.find_each_with_order(options={})
raise "offset is not yet supported" if options[:offset]
page = 1
limit = options[:limit] || 1000
order = options[:order] || 'id asc'
loop do
offset = (page-1) * limit
batch = find(:all, options.merge(:limit=>limit, :offset=>offset, :order=>order))
page += 1
batch.each{|x| yield x }
break if batch.size < limit
end
end
end
and I'm trying to use it as follows:
class GetStuff
def self.grab_em
file = File.open("1000 things.txt", "w")
rels = Thing.find_each_with_order({:limit=>100, :order=>"weight desc"})
binding.pry
things.each do |t|
binding.pry
file.write("#{t.name} #{t.id} #{t.weight}\n" )
if t.id % 20 == 0
puts t.id.to_s
end
end
file.close
end
end
BTW, I have the data in Postgres and am going to grab a subset and move it to Neo4j, so I'm tagging this with neo4j in case any of you Neo4j people know how to do this. Thanks.
Not exactly sure if this is what you're looking for, but you can do something like this:
# weight of the 10,000,000th heaviest record, fetched without loading 10M rows
weight = Thing.order("weight DESC").offset(9_999_999).limit(1).pluck(:weight).first
Thing.where("weight >= ?", weight).find_each do |t|
...your code...
end
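The idea behind this answer is to turn "top N by weight" into a plain filter, since find_each walks records in primary-key order and ignores any custom order. Here is a plain-Ruby simulation of that idea on made-up data; ThingRow, top_n and batch_size are illustrative names, and each_slice stands in for the batching that find_each does.

```ruby
# Stand-in for Thing records with invented, distinct weights.
ThingRow = Struct.new(:id, :weight)
things = (1..20).map { |i| ThingRow.new(i, (i * 37) % 101) }

top_n = 5       # how many of the heaviest records we want
batch_size = 2  # how many records to handle per batch

# Step 1: find the weight of the n-th heaviest record (the cutoff).
cutoff = things.map(&:weight).sort.reverse[top_n - 1]

# Step 2: filter by the cutoff and walk the matches in batches, in id order.
selected = []
things.select { |t| t.weight >= cutoff }.each_slice(batch_size) do |batch|
  batch.each { |t| selected << t }
end

puts selected.map(&:id).inspect # => [5, 8, 13, 16, 19]
```

If several records share the cutoff weight, the filter returns more than top_n rows, so the count is only exact when weights at the boundary are unique.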
Let's say you have an association in one of your models like this:
class User
has_many :articles
end
Now assume you need to get 3 arrays: one for the articles written yesterday, one for the articles written in the last 7 days, and one for the articles written in the last 30 days.
Of course you might do this:
articles_yesterday = user.articles.where("posted_at >= ?", Date.yesterday)
articles_last7d = user.articles.where("posted_at >= ?", 7.days.ago.to_date)
articles_last30d = user.articles.where("posted_at >= ?", 30.days.ago.to_date)
However, this will run 3 separate database queries. More efficiently, you could do this:
articles_last30d = user.articles.where("posted_at >= ?", 30.days.ago.to_date)
articles_yesterday = articles_last30d.select { |article|
article.posted_at >= Date.yesterday
}
articles_last7d = articles_last30d.select { |article|
article.posted_at >= 7.days.ago.to_date
}
Now of course this is a contrived example and there is no guarantee that the array select will actually be faster than a database query, but let's just assume that it is.
My question is: Is there any way (e.g. some gem) to write this code in a way which eliminates this problem by making sure that you simply specify the association conditions, and the application itself will decide whether it needs to perform another database query or not?
ActiveRecord itself does not seem to cover this problem appropriately. You are forced to decide between querying the database every time or treating the association as an array.
There are a couple of ways to handle this:
You can create separate associations for each window that you want by specifying a condition on the association definition. Then you can simply eager load these associations for your User query, and you will run three association queries for the entire operation instead of three for each user.
class User
  # Rails 4+ scope lambda: evaluated at query time, so Date.yesterday stays current
  has_many :articles_yesterday, -> { where('posted_at >= ?', Date.yesterday) }, class_name: 'Article'
  # other associations the same way
end
User.where(...).includes(:articles_yesterday, :articles_7days, :articles_30days)
You could do a group by.
What it comes down to is you need to profile your code and determine what's going to be fastest for your app (or if you should even bother with it at all)
You can get rid of the necessity of checking the query with something like the code below.
class User
  has_many :articles
  def articles_last30d
    @articles_last30d ||= articles.where("posted_at >= ?", 30.days.ago.to_date)
  end
  def articles_last7d
    @articles_last7d ||= articles_last30d.select { |article| article.posted_at >= 7.days.ago.to_date }
  end
  def articles_yesterday
    @articles_yesterday ||= articles_last30d.select { |article| article.posted_at >= Date.yesterday }
  end
end
What it does:
Makes at most one query, and only if one of the three methods is used
Computes only the arrays you actually use, plus the 30d one they are derived from, and each only once
However, it always runs the full 30d query, even if you only need the 7d or yesterday results. Is that enough, or do you need something more?
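The memoization pattern can be tried outside Rails. The following plain-Ruby sketch uses an invented ArticleWindow class and an in-memory array in place of the database; the loads counter exists only to show that the expensive 30-day step runs once.

```ruby
require 'date'

# Stand-in for Article records.
ArticleRow = Struct.new(:title, :posted_at)

# Hypothetical wrapper: derives the 7-day and 1-day lists from a single
# memoized 30-day load.
class ArticleWindow
  attr_reader :loads

  def initialize(all_articles, today)
    @all_articles = all_articles
    @today = today
    @loads = 0
  end

  # The expensive step (a database query in the real app); runs at most once.
  def last30d
    @last30d ||= begin
      @loads += 1
      @all_articles.select { |a| a.posted_at >= @today - 30 }
    end
  end

  def last7d
    @last7d ||= last30d.select { |a| a.posted_at >= @today - 7 }
  end

  def yesterday
    @yesterday ||= last30d.select { |a| a.posted_at >= @today - 1 }
  end
end

today = Date.new(2024, 1, 31)
articles = [
  ArticleRow.new('a', today - 1),
  ArticleRow.new('b', today - 5),
  ArticleRow.new('c', today - 20),
  ArticleRow.new('d', today - 60)
]

window = ArticleWindow.new(articles, today)
puts window.last7d.size    # => 2
puts window.yesterday.size # => 1
puts window.loads          # => 1
```

An empty array is truthy in Ruby, so ||= also caches a window that happens to contain no articles.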
So I have a Vendor model, and a Sale model. An entry is made in my Sale model whenever an order is placed via a vendor.
On my vendor model, I have 3 cache columns. sales_today, sales_this_week, and sales_lifetime.
For the first two, I calculated it something like this:
def update_sales_today
today = Date.today.beginning_of_day
sales_today = Sale.where("created_at >= ?", today).find_all_by_vendor_id(self.id)
self.sales_today = 0
sales_today.each do |s|
self.sales_today = self.sales_today + s.amount
end
self.save
end
So that resets the value every time it is accessed and recalculates it based on the most current records.
The weekly one is similar but I use a range of dates instead of today.
But...I am not quite sure how to do Lifetime data.
I don't want to clear out my value and have to sum all the Sale.amount for all the sales records for my vendor, every single time I update this record. That's why I am even implementing a cache in the first place.
What's the best way to approach this, from a performance perspective?
I might use ActiveRecord's sum method in this case (docs). All in one:
today = Date.today
vendor_sales = Sale.where(:vendor_id => self.id)
self.sales_today = vendor_sales.
where("created_at >= ?", today.beginning_of_day).
sum("amount")
self.sales_this_week = vendor_sales.
where("created_at >= ?", today.beginning_of_week).
sum("amount")
self.sales_lifetime = vendor_sales.sum("amount")
This would mean you wouldn't have to load lots of sales objects in memory to add the amounts.
You can use callbacks on the create and destroy events for your Sale model:
class Sale < ActiveRecord::Base
  belongs_to :vendor # assumed association; the callbacks below rely on it
  after_create :increment_vendor_lifetime_sales
  before_destroy :decrement_vendor_lifetime_sales

  def increment_vendor_lifetime_sales
    vendor.update_attribute :sales_lifetime, vendor.sales_lifetime + amount
  end

  def decrement_vendor_lifetime_sales
    vendor.update_attribute :sales_lifetime, vendor.sales_lifetime - amount
  end
end
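The bookkeeping behind these callbacks is just a running total that is adjusted on each create and destroy instead of being recomputed. A minimal plain-Ruby sketch of that idea follows; VendorTotals and its method names are invented for illustration, and amounts are integers (for example cents) to avoid floating-point drift.

```ruby
# Hypothetical stand-in for the vendor's cached sales_lifetime column.
class VendorTotals
  attr_reader :sales_lifetime

  def initialize
    @sales_lifetime = 0
  end

  # What the create callback does: add the new sale's amount.
  def sale_created(amount)
    @sales_lifetime += amount
  end

  # What the destroy callback does: take the removed sale's amount back out.
  def sale_destroyed(amount)
    @sales_lifetime -= amount
  end
end

totals = VendorTotals.new
totals.sale_created(1000)
totals.sale_created(2500)
totals.sale_destroyed(1000)
puts totals.sales_lifetime # => 2500
```

Each change costs one addition or subtraction, so the lifetime figure never requires summing every sale again.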