Rails Active Record .where statement optimization - ruby-on-rails

I'm building an ecommerce site and want to use an active record where statement to find shipments that are scoped to a certain supplier and certain shipment states. Here's what I have now:
Spree::Shipment.where("stock_location_id = ? and state = ?", spree_current_user.supplier.stock_locations.first.stock_location_id, 'shipped' || 'ready')
I've found that this returns only 'shipped' shipments. I'd like it to return both shipped and ready shipments. So far I can only get it to show one or the other, depending on whether I put 'shipped' or 'ready' first in the query.
I'm guessing I have put the OR operator (||) in the wrong place, even though there are no errors. Can someone tell me a proper way to place OR operators in a condition in the where statement?
Thanks,
Brandon

id = spree_current_user.supplier.stock_locations
.first.stock_location_id
Spree::Shipment.where(stock_location_id: id, state: %w[shipped ready])
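For anyone wondering why the original returned only one state: Ruby evaluates `'shipped' || 'ready'` before `where` ever sees it, and since every string is truthy the expression is always its left operand. A minimal plain-Ruby sketch (no Rails needed; the variable names are only illustrative):

```ruby
# `||` returns its left operand when that operand is truthy.
# Every String is truthy in Ruby, so the right side is never used.
bound_value = 'shipped' || 'ready'
puts bound_value # prints "shipped"

# Swapping the operands just swaps which single state gets bound.
swapped_value = 'ready' || 'shipped'
puts swapped_value # prints "ready"

# An array keeps both values. Passed in hash form, as in
# where(state: states), Active Record builds an SQL IN (...) condition.
states = %w[shipped ready]
puts states.inspect # prints ["shipped", "ready"]
```

That is why the hash form with an array matches both states, while the string form with `||` only ever binds one of them.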

I figured out my answer as I was writing out the question a bit more. I wanted to share the answer in case anyone else comes across this. This seemed to do the trick:
Spree::Shipment.where("stock_location_id = ? and (state = ? or state = ?)", spree_current_user.supplier.stock_locations.first.id, 'shipped', 'ready')
I tested it in the console and it returned this SQL Output with shipments in both 'ready' and 'shipped' states:
Spree::Shipment Load (0.6ms) SELECT "spree_shipments".* FROM "spree_shipments" WHERE (stock_location_id = 8 and (state = 'shipped' or state = 'ready')) LIMIT 15 OFFSET 0
Hope this can help others. Also, if you notice that this statement seems weird or inefficient I would really like to know.
Thanks!

Related

I need advice in speeding up this rails method that involves many queries

I'm trying to display a table that counts webhooks and arranges the various counts into cells by date_sent, sending_ip, and esp (email service provider). Within each cell, the controller needs to count the webhooks that are labelled with the "opened" event, and the "sent" event. Our database currently includes several million webhooks, and adds at least 100k per day. Already this process takes so long that running this index method is practically useless.
I was hoping that Rails could break down the enormous model into smaller lists using a line like this:
@today_hooks = @m_webhooks.where(:date_sent => this_date)
I thought that the queries after this line would only look at the partial list, instead of the full model. Unfortunately, running this index method generates hundreds of SQL statements, and they all look like this:
SELECT COUNT(*) FROM "m_webhooks" WHERE "m_webhooks"."date_sent" = $1 AND "m_webhooks"."sending_ip" = $2 AND (m_webhooks.esp LIKE 'hotmail') AND (m_webhooks.event LIKE 'sent')
It appears that the "date_sent" attribute is included in all of the queries, which implies that the SQL is searching through all 1M records with every single query.
I've read over a dozen articles about increasing performance in Rails queries, but none of the tips that I've found there have reduced the time it takes to complete this method. Thank you in advance for any insight.
m_webhooks_controller.rb
def index
  def set_sub_count_hash(thip)
    {
      gmail_hooks: {opened: a = thip.gmail.send(@event).size, total_sent: b = thip.gmail.sent.size, perc_opened: find_perc(a, b)},
      hotmail_hooks: {opened: a = thip.hotmail.send(@event).size, total_sent: b = thip.hotmail.sent.size, perc_opened: find_perc(a, b)},
      yahoo_hooks: {opened: a = thip.yahoo.send(@event).size, total_sent: b = thip.yahoo.sent.size, perc_opened: find_perc(a, b)},
      other_hooks: {opened: a = thip.other.send(@event).size, total_sent: b = thip.other.sent.size, perc_opened: find_perc(a, b)},
    }
  end
  @m_webhooks = MWebhook.select("date_sent", "sending_ip", "esp", "event", "email").all
  @event = params[:event] || "unique_opened"
  @m_list_of_ips = [#List of three ip addresses]
  end_date = Date.today
  start_date = Date.today - 10.days
  date_range = (end_date - start_date).to_i
  @count_array = []
  date_range.times do |n|
    this_date = end_date - n.days
    @today_hooks = @m_webhooks.where(:date_sent => this_date)
    @count_array[n] = {:this_date => this_date}
    @m_list_of_ips.each_with_index do |ip, index|
      thip = @today_hooks.where(:sending_ip => ip) # Stands for "Today Hooks ip"
      @count_array[n][index] = set_sub_count_hash(thip)
    end
  end
end
Well, your problem is actually very simple. You have to remember that when you use where(condition), the query is not executed against the DB right away.
Rails is smart enough to detect when you need a concrete result (a list, an object, or a count or #size like in your case) and chain your queries while you don't need one. In your code, you keep chaining conditions to the main query inside a loop (date_range). And it gets worse, you start another loop inside this one adding conditions to each query created in the first loop.
Then you pass the query (not concrete yet, it was not yet executed and does not have results!) to the method set_sub_count_hash which goes on to call the same query many times.
Therefore you have something like:
10(date_range) * 3(ip list) * 8 # (times the query is materialized in the #set_sub_count method)
and then you have a problem.
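To make that arithmetic concrete, here is a rough plain-Ruby simulation. `FakeRelation` is a made-up stand-in, not a Rails class: its `where` only records a condition, and `size` is the point where a real relation would hit the database. The IP values are placeholders.

```ruby
# Hypothetical stand-in for a lazy relation (not part of Rails).
class FakeRelation
  @query_count = 0

  class << self
    attr_accessor :query_count
  end

  def initialize(conditions = [])
    @conditions = conditions
  end

  # Chaining returns a new lazy relation and touches no database.
  def where(condition)
    FakeRelation.new(@conditions + [condition])
  end

  # Asking for a concrete result is what triggers a query.
  def size
    FakeRelation.query_count += 1
    0 # placeholder; a real relation would return the COUNT(*) result
  end
end

all_hooks = FakeRelation.new
ips = %w[ip_a ip_b ip_c] # placeholder values
esps = %w[gmail hotmail yahoo other]

10.times do |day|
  today_hooks = all_hooks.where(date_sent: day)
  ips.each do |ip|
    thip = today_hooks.where(sending_ip: ip)
    esps.each do |esp|
      scoped = thip.where(esp: esp)
      scoped.where(event: 'opened').size # one COUNT query
      scoped.where(event: 'sent').size   # another COUNT query
    end
  end
end

puts FakeRelation.query_count # prints 240
```

Narrowing a relation with `where` never shrinks a list in memory; it only adds conditions to whatever SQL is eventually sent, once per `size` call.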
What you want to do is to do the whole query at once and group it by date, ip and email. You should have a hash structure after that, which you would pass to the #set_sub_count method and do some ruby gymnastics to get the counts you're looking for.
I imagine the query something like:
main_query = @m_webhooks.where('date_sent > ?', 10.days.ago.to_date)
  .where(sending_ip: @m_list_of_ips)
OK, now you have one query, which is nice, but I think you should separate it into 4 (gmail, hotmail, yahoo and other), which gives you 4 queries (the first one, the main_query, will not be executed until you ask for materialized results, don't forget that). Still, that should be something like 100 times faster.
I think this is the result that should be grouped, mapped and passed to #set_sub_count instead of passing the raw query and calling methods on it every time and many times. It will be a little work to do the grouping, mapping and counting for sure, but hey, it's faster. =)
In case this helps anybody else, I learned how to fill a hash with counts in a much simpler way. More importantly, this approach runs a single query (as opposed to the 240 queries that I was running before).
@count_array[esp_index][j] = MWebhook.where('date_sent > ?', start_date.to_date)
.group('date_sent', 'sending_ip', 'event', 'esp').count
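A grouped `count` on several columns comes back as a hash whose keys are arrays of the grouped values. The sketch below shows one way to reshape such a hash into per-cell counts in plain Ruby. The sample data, the IP address, and the `percent_opened` helper are invented for illustration.

```ruby
require 'date'

# Hypothetical sample of what the grouped count returns:
# keys are [date_sent, sending_ip, event, esp], values are counts.
grouped_counts = {
  [Date.new(2024, 1, 1), '10.0.0.1', 'sent',   'gmail'] => 200,
  [Date.new(2024, 1, 1), '10.0.0.1', 'opened', 'gmail'] => 50,
  [Date.new(2024, 1, 1), '10.0.0.1', 'sent',   'yahoo'] => 80,
}

# Build table[date][ip][esp] = { 'sent' => n, 'opened' => m } in memory.
table = Hash.new do |by_date, date|
  by_date[date] = Hash.new do |by_ip, ip|
    by_ip[ip] = Hash.new { |by_esp, esp| by_esp[esp] = Hash.new(0) }
  end
end

grouped_counts.each do |(date, ip, event, esp), count|
  table[date][ip][esp][event] += count
end

# Percentage opened, guarding against division by zero.
def percent_opened(cell)
  sent = cell['sent']
  return 0.0 if sent.zero?

  (cell['opened'] * 100.0 / sent).round(1)
end

gmail_cell = table[Date.new(2024, 1, 1)]['10.0.0.1']['gmail']
puts percent_opened(gmail_cell) # prints 25.0
```

All the per-cell arithmetic happens in Ruby on one result set, so the database is asked only once.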

Rails left join with conditions

I have users, problems, and attempts which is a join table between users and problems. I'm looking to show an index of all the problems along with the current user's most recent attempt for each, if they have one.
I've tried four things to get a left join with conditions and none of them have worked.
The naive approach is something like...
@problems = Problem.enabled
@problems.each do |prob|
  prob.last_attempt = prob.attempts
    .where(user_id: current_user.id)
    .last
end
This gets all the problems and the attempts I want but is N+1 queries. So...
@problems = Problem.enabled
  .includes(:attempts)
This does the left join (or the equivalent two queries) getting all the problems but also all the attempts, not just those for the current user. So...
@problems = Problem.enabled
  .includes(:attempts)
  .where(attempts: {user_id: current_user.id})
This gets only those problems that the current user has already attempted.
So...
# problem.rb
has_many :user_attempts,
  -> (user) { where(user_id: user.id) },
  class_name: 'Attempt'
# problem_controller.index
@problems = Problem.enabled
  .includes(:user_attempts, current_user)
And this gives an error message from Rails saying joins with instance arguments are not supported.
So I'm stuck. What is the best way to do this? Is Arel the right tool? Can I skip active record and just get back a JSON blob? Am I just being dumb?
This question is quite similar to this one, but I'd need an argument to the joined scope, which isn't supported. And I'm hoping Rails has added something in the last couple of years.
Thanks so much for your help.
The way I solved this was to use raw sql. It's ugly and a security risk but I didn't find better.
results = Problem.connection.exec_query(%(
SELECT *
FROM problems
LEFT JOIN (
SELECT *
//etc.
)
))
I then manipulate the results array in memory.
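An alternative that avoids both the N+1 and the raw SQL, offered as a sketch rather than as the approach used above: load the problems, then load the current user's attempts for those problems in a single second query (in Rails, something along the lines of `Attempt.where(user_id: current_user.id, problem_id: problem_ids)`), and pair them up in memory. The plain-Ruby code below uses Struct stand-ins and made-up data to show only the pairing step.

```ruby
# Struct stand-ins for the models; fields and data are assumptions.
Problem = Struct.new(:id, :title)
Attempt = Struct.new(:id, :problem_id, :user_id, :created_at)

problems = [Problem.new(1, 'Sorting'), Problem.new(2, 'Graphs')]

# Pretend this is the result of the second query, already limited to
# the current user's attempts. Integers stand in for timestamps.
user_attempts = [
  Attempt.new(10, 1, 7, 100),
  Attempt.new(11, 1, 7, 200),
]

# Keep only the most recent attempt per problem.
latest_by_problem = user_attempts
  .group_by(&:problem_id)
  .transform_values { |attempts| attempts.max_by(&:created_at) }

# Left-join semantics: every problem appears, with nil when unattempted.
rows = problems.map { |problem| [problem, latest_by_problem[problem.id]] }

rows.each do |problem, attempt|
  puts "#{problem.title}: #{attempt ? attempt.id : 'no attempt'}"
end
```

This costs two queries regardless of how many problems there are, and the user id is passed as a bound value instead of being interpolated into SQL.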

Get first entry from an associated table Ruby on Rails

I have a one-to-many relationship: User has many Payments. I am trying to find a query that gets the first payment of each user (using created_at from the payments table).
I have found a similar question with an SQL response, but I have no idea how to write it with Active Record.
how do I query sql for a latest record date for each user
Quoting the answer:
select t.username, t.date, t.value
from MyTable t
inner join (
select username, max(date) as MaxDate
from MyTable
group by username
) tm on t.username = tm.username and t.date = tm.MaxDate
For me, it would be min instead of max.
Thank you :)
Try this one for PostgreSQL:
Payment.select("DISTINCT ON(user_id) *").order("user_id, created_at ASC")
And For SQL
Payment.group(:user_id).having('created_at = MAX(created_at)')
If I were to answer the question above based on this description (rather than on the given raw SQL):
User has many Payments. I am trying to find a query that gets the first payment of each user(using created_at from the payments table).
Let say:
# Assumed to have a Single User, as reference
user = User.first
# Now, get the first payment (from the Payment model)
user.payments.order(:created_at).first
# Ordering by created_at and taking .first returns the earliest-created row.
That is, if I fully understand what you're trying to do. I don't know why you need the max or min date.
What about this?
If you want first payment of each user
dates = Payment.group(:user_id).minimum(:created_at).values
payments = Payment.where(created_at: dates)
From payment you can find user too.
I think you have username as foreign key, you can change accordingly. :)
Let me know if you face any issue; I tested it and it works.
I know this answer is not the best, but it will work even for transactions with millisecond differences, as Rails saves dates (created_at and updated_at) with millisecond precision.
I am sorry for not replying to everything, but after multiple tests, this is the quickest answer (in run time) I came with:
Payment.where(:id => Payment.group(:user_id).pluck(:id))
I am saying it might not be the quickest way because I am using a sub query. I am getting the unique values and getting the ID's:
Payment.group(:user_id).pluck(:id)
Then I am matching those ID's.
The downside of this is that it won't work reversed, for getting the last payment.
There was also a possibility to use group_by and map but, since map is coming from ruby, it is taking much more time.
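For reference, the in-memory `group_by` variant mentioned above might look like the following. It is a plain-Ruby sketch with a Struct standing in for the model and invented data. Because it pulls every row into Ruby, it is expected to be slower than letting the database do the work.

```ruby
# Struct stand-in for the Payment model; the rows are made up.
Payment = Struct.new(:id, :user_id, :amount, :created_at)

payments = [
  Payment.new(1, 1, 30, Time.utc(2024, 1, 5)),
  Payment.new(2, 1, 10, Time.utc(2024, 1, 2)),
  Payment.new(3, 2, 20, Time.utc(2024, 1, 3)),
]

by_user = payments.group_by(&:user_id)

# Earliest payment per user, judged by created_at rather than by id.
first_payments = by_user.map { |_user_id, list| list.min_by(&:created_at) }

# The same shape works in reverse for the latest payment.
last_payments = by_user.map { |_user_id, list| list.max_by(&:created_at) }

puts first_payments.map(&:id).inspect # prints [2, 3]
puts last_payments.map(&:id).inspect  # prints [1, 3]
```

Note that in this sample the payment with the lowest id is not the earliest one for user 1, which is why the comparison is made on `created_at`.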
I'm not sure but try this :
In your controller :
def Page
  @payments = Payment.first
end
in your html.erb :
<% @payments.each do |payment| %>
<p> <%= payment.amount %> </p>
<% end %>
Hope this helps!
Record.association.order(:created_at).first

Which is a quicker way to fetch current_user's post

I'm using Rails, I'm doing something like this - which is more efficient?
post = current_user.posts.find(29)
OR
post = Post.where("user_id = ? AND id = ?", user.id, 29).first
I'm guessing the first statement would do something like SELECT * FROM posts WHERE user_id = x (current_user is a preset User instance) then find post #29 amongst the returned array/rows; however, the second one might do something like SELECT * FROM posts WHERE user_id = x AND id = 29 LIMIT 0,1 .. is it quicker to fetch all, without any criteria, then let ruby search within the returned array/rows; OR, is criteria and a limit a quicker way to do it; OR, does it depend on the length/width of the table and countless other things? Thanks
SQL query in both cases will be the same. So there's no difference in time of execution - but the first statement is more idiomatic, hence should be preferred.

Please help me think of a better algorithm in Ruby on Rails

I have some Events, People. There is a many-to-many relationship between them so there is a PersonEvent connecting Events to People.
Event has a date and type
PersonEvent has an event_id and a person_id
Person has a name
I'm trying to build a search form that allows the user to search by the type of an Event, and then returns a list of People who attended a Event of that type in the past and the last date they attended such an Event. This should be in the controller.
The only solution I can think of involves nested loops and will probably run very slowly. I'm definitely looping through a lot of things I don't need to be.
For each person in Person.all
For each personevent in PersonEvent.all
Add the personevent to an array if the person_event.event.type is correct
Now, loop through the array and find the event with the latest date. That's the date of the last Event attendance.
Can anyone suggest a better algorithm?
In RoR, it would be:
Person.joins(:events).where(events: { type: params[:type] })
Rails joins will create an INNER JOIN, which will discard people who don't have an associated event that meets the criteria in where.
You don't explain how you're keeping the date of attendance information, so I'll leave that bit up to you.
As you have the associations already set up you should be able to do something like:
f = Person.joins(:events)
f = f.where(:events => { :type => "the_type_you_are_searching_for" })
f = f.group('people.id')
f = f.select('people.*, max(events.date) as last_event_date')
people = f.all # though you probably want to paginate really
I've done it line by line to make it easier to read in here but often you'd see the where, group and select chained together one after the other on the same line.
You need the group otherwise you'll get people returned multiple times if they have been to multiple events.
The custom select is to include the last_event_date in the results.
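To see what the join, group, and max are computing, here is the same logic as a single pass in plain Ruby instead of nested loops. The Structs and the sample rows are invented stand-ins for the three tables. In a real app the database should do this work, as in the query above.

```ruby
require 'date'

# Struct stand-ins for the three tables; the data is made up.
Person = Struct.new(:id, :name)
Event = Struct.new(:id, :type, :date)
PersonEvent = Struct.new(:person_id, :event_id)

people = [Person.new(1, 'Ann'), Person.new(2, 'Bob')]
events = [
  Event.new(1, 'workshop', Date.new(2024, 1, 10)),
  Event.new(2, 'workshop', Date.new(2024, 3, 5)),
  Event.new(3, 'lecture',  Date.new(2024, 4, 1)),
]
person_events = [
  PersonEvent.new(1, 1),
  PersonEvent.new(1, 2),
  PersonEvent.new(2, 3),
]

wanted_type = 'workshop'
events_by_id = events.to_h { |event| [event.id, event] }

# One pass over the join rows, keeping the latest matching date per person.
last_dates = {}
person_events.each do |pe|
  event = events_by_id[pe.event_id]
  next unless event.type == wanted_type

  current = last_dates[pe.person_id]
  last_dates[pe.person_id] = event.date if current.nil? || event.date > current
end

# Inner-join semantics: people without a matching event are dropped.
results = people
  .select { |person| last_dates.key?(person.id) }
  .map { |person| [person.name, last_dates[person.id]] }

results.each { |name, date| puts "#{name}: #{date}" } # prints "Ann: 2024-03-05"
```

Bob is left out because his only event is of a different type, which mirrors what the INNER JOIN plus the where condition does.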
Why not just write a custom SQL query? It would look something like this:
SELECT * FROM person_events
INNER JOIN people ON people.id = person_events.person_id
INNER JOIN events ON events.id = person_events.event_id AND events.type = 'EventType'
