Rails 3.2 Query - .exists? - ruby-on-rails

I have about 500 outlets. Each outlet will be monitored a minimum of one time per day. I am trying to get a list of outlets that have been monitored each day.
I am having a problem with the query at the moment, any help is appreciated:
<% for outlet in @outlets %>
<% if Monitoring.exists?( :outlet_id => outlet.id, 'DATE(created_at) = ?', Date.today ) %>
@outlets is an instance variable containing Outlet.all.
This query leaves me with a syntax error. What would be the correct way to do this? I'm trying to check that the Monitoring belongs to the Outlet, and that the Monitoring record was created today.
Also, I'm not entirely sure of the speed implications of this query. There will be a max of 2000 outlets on a page at one time (it's a dashboard, so they appear as either red or green dots).
Any help greatly appreciated.

You're getting a syntax error because you're trying to mix implicit-Hash and implicit-Array arguments:
Monitoring.exists?(:outlet_id => outlet.id, 'DATE(created_at) = ?', Date.today)
The exists? method takes a single conditions argument, such as a Hash. You want to use an SQL function in the query though, which means you have to use the Model.where(...).exists? form:
Monitoring.where(:outlet_id => outlet.id).where('date(created_at) = ?', Date.today).exists?
That still leaves you hitting the database over and over again to light up your lights. You could precompute the whole mess with something like this:
counts = Monitoring.where('date(created_at) = ?', Date.today).count(:group => :outlet_id)
And then use counts.has_key?(outlet.id) in your loop. Adding a where(:outlet_id => outlet_ids) (where outlet_ids are the IDs you're interested in) might make sense as well. You might be able to combine the count query with the query that is generating @outlets too.
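A minimal sketch of that lookup, using plain Ruby stand-ins so it runs without a database. The Outlet struct, the sample outlets, and the contents of counts are hypothetical; in the real app counts would be the hash returned by the grouped count query above.

```ruby
# Hypothetical stand-ins: in the real app, `counts` would come from the grouped
# count query and `outlets` from Outlet.all.
Outlet = Struct.new(:id, :name)

outlets = [Outlet.new(1, "North"), Outlet.new(2, "South"), Outlet.new(3, "East")]

# Shape of a grouped count result: { outlet_id => number_of_monitorings_today }
counts = { 1 => 2, 3 => 1 }

# One hash lookup per outlet instead of one query per outlet.
statuses = outlets.map do |outlet|
  [outlet.name, counts.has_key?(outlet.id) ? :green : :red]
end

statuses.each { |name, colour| puts "#{name}: #{colour}" }
```

Outlets with no row in counts simply have no key, so they fall through to red without any extra query.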

Related

I need advice in speeding up this rails method that involves many queries

I'm trying to display a table that counts webhooks and arranges the various counts into cells by date_sent, sending_ip, and esp (email service provider). Within each cell, the controller needs to count the webhooks that are labelled with the "opened" event, and the "sent" event. Our database currently includes several million webhooks, and adds at least 100k per day. Already this process takes so long that running this index method is practically useless.
I was hoping that Rails could break down the enormous model into smaller lists using a line like this:
@today_hooks = @m_webhooks.where(:date_sent => this_date)
I thought that the queries after this line would only look at the partial list, instead of the full model. Unfortunately, running this index method generates hundreds of SQL statements, and they all look like this:
SELECT COUNT(*) FROM "m_webhooks" WHERE "m_webhooks"."date_sent" = $1 AND "m_webhooks"."sending_ip" = $2 AND (m_webhooks.esp LIKE 'hotmail') AND (m_webhooks.event LIKE 'sent')
It appears that the "date_sent" attribute is included in all of the queries, which implies that the SQL is searching through all 1M records with every single query.
I've read over a dozen articles about increasing performance in Rails queries, but none of the tips that I've found there have reduced the time it takes to complete this method. Thank you in advance for any insight.
m_webhooks.controller.rb
def index
  def set_sub_count_hash(thip) {
    gmail_hooks: {opened: a = thip.gmail.send(@event).size, total_sent: b = thip.gmail.sent.size, perc_opened: find_perc(a, b)},
    hotmail_hooks: {opened: a = thip.hotmail.send(@event).size, total_sent: b = thip.hotmail.sent.size, perc_opened: find_perc(a, b)},
    yahoo_hooks: {opened: a = thip.yahoo.send(@event).size, total_sent: b = thip.yahoo.sent.size, perc_opened: find_perc(a, b)},
    other_hooks: {opened: a = thip.other.send(@event).size, total_sent: b = thip.other.sent.size, perc_opened: find_perc(a, b)},
  }
  end
  @m_webhooks = MWebhook.select("date_sent", "sending_ip", "esp", "event", "email").all
  @event = params[:event] || "unique_opened"
  @m_list_of_ips = [#List of three ip addresses]
  end_date = Date.today
  start_date = Date.today - 10.days
  date_range = (end_date - start_date).to_i
  @count_array = []
  date_range.times do |n|
    this_date = end_date - n.days
    @today_hooks = @m_webhooks.where(:date_sent => this_date)
    @count_array[n] = {:this_date => this_date}
    @m_list_of_ips.each_with_index do |ip, index|
      thip = @today_hooks.where(:sending_ip => ip) #Stands for "Today Hooks ip"
      @count_array[n][index] = set_sub_count_hash(thip)
    end
  end
end
Well, your problem is very simple, actually. You have to remember that when you use where(condition), the query is not executed in the DB right away.
Rails is smart enough to detect when you need a concrete result (a list, an object, or a count or #size like in your case) and keeps chaining your queries until you do. In your code, you keep chaining conditions to the main query inside a loop (date_range). And it gets worse: you start another loop inside this one, adding conditions to each query created in the first loop.
Then you pass the query (not concrete yet, it was not yet executed and does not have results!) to the method set_sub_count_hash which goes on to call the same query many times.
Therefore you have something like:
10(date_range) * 3(ip list) * 8 # (times the query is materialized in the #set_sub_count method)
and then you have a problem.
What you want to do is to do the whole query at once and group it by date, ip and email. You should have a hash structure after that, which you would pass to the #set_sub_count method and do some ruby gymnastics to get the counts you're looking for.
I imagine the query something like:
main_query = @m_webhooks.where('date_sent > ?', 10.days.ago.to_date)
  .where(sending_ip: @m_list_of_ips)
OK, now you have one query, which is nice, but I think you should separate the query into 4 (gmail, hotmail, yahoo and other), which gives you 4 queries (the first one, the main_query, will not be executed until you ask for materialized results, don't forget it). Still, like 100 times faster.
I think this is the result that should be grouped, mapped and passed to #set_sub_count instead of passing the raw query and calling methods on it every time and many times. It will be a little work to do the grouping, mapping and counting for sure, but hey, it's faster. =)
In case this helps anybody else, I learned how to fill a hash with counts in a much simpler way. More importantly, this approach runs a single query (as opposed to the 240 queries that I was running before).
@count_array[esp_index][j] = MWebhook.where('date_sent > ?', start_date.to_date)
  .group('date_sent', 'sending_ip', 'event', 'esp').count
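A grouped count like this returns a hash whose keys are arrays of the grouped column values. Below is a rough sketch of the "ruby gymnastics" needed to turn that into per-cell opened/sent figures. The sample data is made up, and find_perc is only a guess at what the helper in the question does (its definition is not shown there).

```ruby
require "date"

# Hypothetical sample of what a grouped count returns: keys are arrays of the
# grouped column values, values are row counts.
grouped = {
  [Date.new(2015, 1, 2), "10.0.0.1", "sent",   "gmail"]   => 200,
  [Date.new(2015, 1, 2), "10.0.0.1", "opened", "gmail"]   => 50,
  [Date.new(2015, 1, 2), "10.0.0.1", "sent",   "hotmail"] => 80,
}

# Percentage helper; returning 0.0 when nothing was sent is an assumption.
def find_perc(opened, sent)
  sent.zero? ? 0.0 : (opened.to_f / sent * 100).round(1)
end

# Build { [date, ip, esp] => { event => count } } in one pass.
cells = Hash.new { |hash, key| hash[key] = Hash.new(0) }
grouped.each do |(date, ip, event, esp), count|
  cells[[date, ip, esp]][event] += count
end

# Turn each cell into the opened / total_sent / perc_opened triple.
table = cells.each_with_object({}) do |(key, events), out|
  opened = events["opened"]
  sent = events["sent"]
  out[key] = { opened: opened, total_sent: sent, perc_opened: find_perc(opened, sent) }
end

table.each { |key, cell| puts "#{key.inspect} => #{cell.inspect}" }
```

All the counting happens in the database in one query; the Ruby side only regroups a few hundred small rows.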

Rails left join with conditions

I have users, problems, and attempts which is a join table between users and problems. I'm looking to show an index of all the problems along with the current user's most recent attempt for each, if they have one.
I've tried four things to get a left join with conditions and none of them have worked.
The naive approach is something like...
@problems = Problem.enabled
@problems.each do |prob|
  prob.last_attempt = prob.attempts
    .where(user_id: current_user.id)
    .last
end
This gets all the problems and the attempts I want but is N+1 queries. So...
@problems = Problem.enabled
.includes(:attempts)
This does the left join (or the equivalent two queries) getting all the problems but also all the attempts, not just those for the current user. So...
@problems = Problem.enabled
.includes(:attempts)
.where(attempts: {user_id: current_user.id})
This gets only those problems that the current user has already attempted.
So...
# problem.rb
has_many :user_attempts,
  -> (user) { where(user_id: user.id) },
  class_name: 'Attempt'
# problem_controller.index
@problems = Problem.enabled
  .includes(:user_attempts, current_user)
And this gives an error message from Rails saying joins with instance arguments are not supported.
So I'm stuck. What is the best way to do this? Is Arel the right tool? Can I skip active record and just get back a JSON blob? Am I just being dumb?
This question is quite similar to this one but I'd need an argument to the joined scope, which isn't supported. And I'm hoping Rails added something in the last couple of years.
Thanks so much for your help.
The way I solved this was to use raw SQL. It's ugly and a security risk but I didn't find anything better.
results = Problem.connection.exec_query(%(
SELECT *
FROM problems
LEFT JOIN (
SELECT *
//etc.
)
))
And then manipulating the results array in memory.
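For comparison, here is a sketch of a different in-memory approach that avoids raw SQL: fetch the problems, fetch only the current user's attempts in a second query, and pair them up in Ruby. This is not the method used above. The structs and sample rows are hypothetical stand-ins for the two query results, and it assumes attempt ids grow over time so that the highest id is the most recent attempt.

```ruby
Problem = Struct.new(:id, :title)
Attempt = Struct.new(:id, :problem_id, :user_id)

# Stand-in for Problem.enabled
problems = [Problem.new(1, "Sorting"), Problem.new(2, "Graphs")]
# Stand-in for the current user's attempts (second query)
attempts = [Attempt.new(10, 1, 7), Attempt.new(11, 1, 7)]

# Keep the attempt with the highest id per problem (assumes ids grow over time).
last_attempt_by_problem = attempts
  .group_by(&:problem_id)
  .transform_values { |list| list.max_by(&:id) }

# Every problem appears once; problems without an attempt are paired with nil.
rows = problems.map { |problem| [problem, last_attempt_by_problem[problem.id]] }

rows.each do |problem, attempt|
  puts "#{problem.title}: #{attempt ? "attempt #{attempt.id}" : "not attempted"}"
end
```

This costs two queries regardless of how many problems there are, which gives the same shape of result as the left join.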

Which is faster "count" or "length"?

Assuming there are 2 models called User and Post
Which will perform better (faster), "Plan A" or "Plan B"?
"Plan A"
controller
@users = User.find_all_by_country(params[:country])
@posts = Post.find_all_by_category(params[:category])
view
<%= @users.count.to_s %>
<%= @posts.count.to_s %>
"Plan B"
controller
@users = User.find_all_by_country(params[:country])
@posts = Post.find_all_by_category(params[:category])
view
<%= @users.length.to_s %>
<%= @posts.length.to_s %>
In ruby, count, length and size all do pretty much the same thing regarding arrays. See here for more info.
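A quick illustration on a plain array (the sample values are arbitrary): the three methods agree when called without arguments, and count additionally accepts a value or a block.

```ruby
numbers = [1, 2, 2, 3]

by_length = numbers.length        # number of elements
by_size   = numbers.size          # alias of length
by_count  = numbers.count         # same result when called with no arguments
twos      = numbers.count(2)      # count can also match a value
odds      = numbers.count(&:odd?) # or count elements for which a block is true

puts [by_length, by_size, by_count, twos, odds].inspect
```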
When using ActiveRecord objects, however, count is better than length, and size is even better.
find_all_by_country always returns a plain array, so you shouldn't use that method. Instead, use where(country: params[:country]).
I'll let Code School's Rails Best Practices slide nº 93 speak for itself (and hope they don't get mad at me for reproducing it here).
Just in case the image gets taken down, basically:
length always pulls all the records and then calls .length on the array - bad
count always does a count query - good
size looks at the counter cache if you have one, otherwise does a count query - best
Both will be the same: count with no arguments and length are identical here, because you are invoking them on a Ruby array (returned by the magic find_* method) and not on an ActiveRecord object.
That said, both methods are the worst way to do this, if you're simply interested in the number of matching records.
Instead of instantiating the entire result set just to find its length, use .count on an actual ActiveRecord relation:
@num_users = User.where(country: params[:country]).count
@num_posts = Post.where(category: params[:category]).count
This will actually execute as select count(*) from instead of a full select * from, which will be much faster depending on the number of results.

Building an ILIKE clause from an array

I'm experimenting with a few concepts (actually playing and learning by building a RoR version of the 1978 database WHATSIT?).
It basically is a has_many :through structure with Subject -> Tags <- Value. I've tried to replicate a little of the command line structure by using a query text field to enter the commands. Basically things like: What's steve's phone.
Anyhow, with that interface most of the searches use ILIKE. I thought about enhancing it by allowing OR conditions using some form of an array. Something like What's steve's [son,daughter]. I got it working by creating the ILIKE clause directly, but not with string replacement.
def bracket_to_ilike(arrel,name,bracket)
bracket_array = bracket.match(/\[([^\]]+)\]/)[1].split(',')
like_clause = bracket_array.map {|i| "#{name} ILiKE '#{i}' "}.join(" OR ")
arrel.where(like_clause)
end
bracket_to_ilike(tags,'tags.name','[son,daughter]') produces the like clause tags.name ILiKE 'son' OR tags.name ILiKE 'daughter'
And it gets the relations, but with all the talk about using the form ("tags.name ILIKE ? OR tags.name ILIKE ?", v1, v2, vN...), I thought I'd ask if anyone has any ideas on how to do that.
Creating variables on the fly is doable from what I've searched, but not favored. I just wondered if anyone has tried creating a method that can add a where clause with a variable number of parameters. I tried sending the where clause to the relation, but it didn't like that.
Steve
Couple of things to watch out for in your code...
What will happen when one of the elements of bracket_array contains a single quote?
What will happen if I take it step farther and set an element to say "'; drop tables..."?
My first stab at refactoring your code would be to see if Arel can do it. Or Squeel, or whatever they call the "metawhere" gem these days. My second stab would be something like this:
arrel.where( [ bracket_array.size.times.map{"#{name} ILIKE ?"}.join(' OR '), *bracket_array ])
I didn't test it, but the idea is to use the size of bracket_array to generate a string of OR'd conditions, then use the splat operator to pass in all the values.
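The placeholder-building part can be tried on its own without a database. The sketch below uses a hypothetical helper name, ilike_conditions, and only builds the array; it does not call where.

```ruby
# Hypothetical helper: returns [sql_fragment, *values], the array form that
# can be handed to where. One "?" placeholder is generated per value.
def ilike_conditions(column, values)
  raise ArgumentError, "values must not be empty" if values.empty?

  fragment = values.map { "#{column} ILIKE ?" }.join(" OR ")
  [fragment, *values]
end

conditions = ilike_conditions("tags.name", ["son", "o'brien"])
puts conditions.inspect
```

The values never touch the SQL string, so a stray single quote is left for the database adapter to escape. The column name is still interpolated directly, so it has to come from your own code and never from user input.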
Thanks to Phillip for pointing me in the right direction.
I didn't know you could pass an array to a where clause - that opened up some options
I had used the splat operator a few times, but it didn't hit me that it actually creates an object(variable)
The [son,daughter] stuff was just a console exercise to see what I could do, but not sure what I was going to do with it. I ended up taking the model association and creating the array out of the picture and implemented OR searches.
def array_to_ilike(col_name,keys)
ilike = [keys.map {|i| "#{col_name} ILiKE ? "}.join(" OR "), *keys ]
#ilike = [keys.size.times.map{"#{col_name} ILIKE ?"}.join(' OR '), *keys ]
#both work, guess it's just what you are used to.
end
I then allowed a pipe(|) character in my subject,tag,values searches, so a WHATSIT style question
What's Steve's Phone Home|Work => displays home and work phone
steve phone home|work The 's stuff is just for show
steve son|daughter => displays children
phone james%|lori% => displays phone number for anyone whose name starts with james or lori
james%|lori% => dumps all information on anyone whose name starts with james or lori
The query then parses the command and if it encounters a | in any of the words, it will do things like:
t_ilike = array_to_ilike('tags.name',name.split("|"))
# or I actually stored it off on the initial parse
t_ilike = @tuple[:tag][:ilike] ||= ['tags.name ilike ?',tag]
Again this is just a learning exercise in creating a non-CRUD class to deal with the parsing and searching.
Steve

Getting the last document of limited Mongoid query result and .count()

I'm using Mongoid to work with MongoDB. Everything is fine, I like it very much and so on. In my blog application (posts controller, index action) I have this code:
@posts = Post.without(:comments)
@posts = @posts.my_search(params[:s]) if params[:s]
@posts = @posts.order_by([:created_at, :desc])
@posts = @posts.where(:pid.lt => params[:p].to_i+1) if params[:p]
@posts = @posts.limit(items_per_page+1)
The part with "where" is the implementation of my own pagination method (it allows paging through results in one direction only, but without skip(), which I consider a plus). Now, there are a few small problems that make me feel uncomfortable:
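To make the scheme concrete, here is an in-memory sketch of the same idea: fetch one more item than the page size and use the extra item's pid as the cursor for the next page. The Post struct, fetch_page, and the sample data are hypothetical stand-ins for the Mongoid query, and it assumes pid decreases along with created_at.

```ruby
Post = Struct.new(:pid, :title)

# In-memory stand-in for the query; posts are assumed sorted newest first,
# with pid decreasing along with created_at.
def fetch_page(posts, cursor, per_page)
  # Mirrors where(:pid.lt => cursor + 1), i.e. pid <= cursor
  candidates = cursor ? posts.select { |post| post.pid <= cursor } : posts
  # Mirrors limit(items_per_page + 1)
  fetched = candidates.first(per_page + 1)
  page = fetched.first(per_page)
  # The extra item, if present, is the first post of the next page.
  next_cursor = fetched.size > per_page ? fetched.last.pid : nil
  [page, next_cursor]
end

posts = 5.downto(1).map { |pid| Post.new(pid, "Post #{pid}") }
first_page, cursor = fetch_page(posts, nil, 2)
second_page, next_cursor = fetch_page(posts, cursor, 2)
last_page, end_cursor = fetch_page(posts, next_cursor, 2)

puts first_page.map(&:pid).inspect
puts second_page.map(&:pid).inspect
puts last_page.map(&:pid).inspect
```

A nil cursor at the end signals that there is no further page, so no separate count is needed just to decide whether to show a "next" link.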
For my pagination to work I need to get the last post within that limit. But when I do @posts.last I get the last document of the whole query, ignoring the limit. OK, this is strange, but not a big problem. Other than that, query results act like an almost-ordinary array, so at the moment I'm getting the last element with @posts.pop (funny, but it doesn't remove any documents) or @posts.fetch(-1).
I have a feeling that this isn't the "right way" and there must be something more elegant.
Also, @posts.count generates a second query, exactly the same as the first one (without the limit) but with "count" only, and I don't like it.
If I make the last line look like
@posts = @posts.limit(items_per_page+1).to_ary
to convert the query results into an array, everything generates only one query (good), but now @posts.count stops reporting what I need (the total number of documents without the limit applied) and behaves exactly like @posts.size - it returns items_per_page+1 or less (bad).
So, here are my questions:
1) What is a "correct" way to get the last document of query results within given limit?
2) How to get total amount of documents with given conditions applied without generating additional query?
UPD:
3) @posts.first generates an additional query; how can I prevent it and just get the first document before I iterate over all documents?
Getting the last document:
Post.last
Getting last document with some other queries:
Post.order_by([:created_at, :desc]).last
Getting the total number of documents:
Post.order_by([:created_at, :desc]).count
Recommendation: Just use the built in pagination
@posts = Post.limit(10).paginate(:page=>params[:page])
later:
<%= will_paginate @posts %>
Regarding the additional queries -- mongoid lazy loads everything:
@posts = Post.all #no query has been run yet
@posts.first #Ok, a query has finally been run because you are accessing the objects

Resources