The answer to this question should be simple, but I haven't found one through the active record querying guide, other questions here on SO, or through messing around in the Rails console.
I simply want to query the database through the active record querying interface and return the value of a single column of the first or last entry, without having to traverse through the entire table (will explain in a moment).
There is a way to do this with pluck; my current queries are structured as follows:
initial = Message.where("id = ?", some_id).pluck(:value).first
final = Message.where("id = ?", some_id).pluck(:value).last
Unfortunately, this is an extremely inefficient operation as it plucks the value attribute out of every record where there is a match on the provided id, before returning either just the first or last entry. I would like to basically reorder the statements to be something along the lines of:
initial = Message.where("id = ?", some_id).first.pluck(:value)
final = Message.where("id = ?", some_id).last.pluck(:value)
However, I get a NoMethodError explaining that there is no method pluck for Message. I've tried to do this in various ways:
initial = Message.where("id = ?", some_id).first(:value)
initial = Message.where("id = ?", some_id).first.select(:value)
...
But all return some sort of error. I know returning the
Oops
Somehow part of my question got cut off (including the answer I had at the end). I'll see if I can find it. In the meantime: I explored using the select() method, and found that select only takes one argument, meaning a query string must be built because it cannot take placeholder arguments like id = ?, some_id. I then found that just appending .value (where value is the column attribute you are trying to get) works, so I switched back to the where method as shown in the answer below.
Answer is in the question, but if you're trying to do something like this:
initial = Message.where("id = ?", some_id).pluck(:value).first
final = Message.where("id = ?", some_id).pluck(:value).last
Change it to this (just reference the column name, in this example it is value, but it could be amount or something):
initial = Message.where("id = ?", some_id).first.value
final = Message.where("id = ?", some_id).last.value
I'm trying to display a table that counts webhooks and arranges the various counts into cells by date_sent, sending_ip, and esp (email service provider). Within each cell, the controller needs to count the webhooks that are labelled with the "opened" event, and the "sent" event. Our database currently includes several million webhooks, and adds at least 100k per day. Already this process takes so long that running this index method is practically useless.
I was hoping that Rails could break down the enormous model into smaller lists using a line like this:
@today_hooks = @m_webhooks.where(:date_sent => this_date)
I thought that the queries after this line would only look at the partial list, instead of the full model. Unfortunately, running this index method generates hundreds of SQL statements, and they all look like this:
SELECT COUNT(*) FROM "m_webhooks" WHERE "m_webhooks"."date_sent" = $1 AND "m_webhooks"."sending_ip" = $2 AND (m_webhooks.esp LIKE 'hotmail') AND (m_webhooks.event LIKE 'sent')
It appears that the "date_sent" condition is included in every one of the queries, which implies that the SQL is searching through all 1M records with every single query.
I've read over a dozen articles about increasing performance in Rails queries, but none of the tips that I've found there have reduced the time it takes to complete this method. Thank you in advance for any insight.
m_webhooks.controller.rb
def index
  def set_sub_count_hash(thip)
    {
      gmail_hooks: {opened: a = thip.gmail.send(@event).size, total_sent: b = thip.gmail.sent.size, perc_opened: find_perc(a, b)},
      hotmail_hooks: {opened: a = thip.hotmail.send(@event).size, total_sent: b = thip.hotmail.sent.size, perc_opened: find_perc(a, b)},
      yahoo_hooks: {opened: a = thip.yahoo.send(@event).size, total_sent: b = thip.yahoo.sent.size, perc_opened: find_perc(a, b)},
      other_hooks: {opened: a = thip.other.send(@event).size, total_sent: b = thip.other.sent.size, perc_opened: find_perc(a, b)},
    }
  end
  @m_webhooks = MWebhook.select("date_sent", "sending_ip", "esp", "event", "email").all
  @event = params[:event] || "unique_opened"
  @m_list_of_ips = [] # placeholder: list of three IP addresses
  end_date = Date.today
  start_date = Date.today - 10.days
  date_range = (end_date - start_date).to_i
  @count_array = []
  date_range.times do |n|
    this_date = end_date - n.days
    @today_hooks = @m_webhooks.where(:date_sent => this_date)
    @count_array[n] = {:this_date => this_date}
    @m_list_of_ips.each_with_index do |ip, index|
      thip = @today_hooks.where(:sending_ip => ip) # Stands for "Today Hooks ip"
      @count_array[n][index] = set_sub_count_hash(thip)
    end
  end
end
Well, your problem is actually very simple. You have to remember that when you call where(condition), the query is not executed in the database right away.
Rails is smart enough to detect when you need a concrete result (a list, an object, or a count or #size, as in your case), and until then it just keeps chaining your conditions. In your code, you keep chaining conditions onto the main query inside a loop (date_range). It gets worse: inside that loop you start another loop that adds further conditions to each query created in the first loop.
Then you pass the query (still not concrete: it has not been executed and has no results yet!) to the method set_sub_count_hash, which goes on to execute that query many times.
Therefore you have something like:
10 (date_range) * 3 (ip list) * 8 (times the query is materialized in #set_sub_count_hash) = 240 queries
and then you have a problem.
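To make that arithmetic concrete, here is a minimal plain-Ruby sketch. FakeRelation is a hypothetical stand-in I made up for illustration (it is not ActiveRecord): chaining where costs nothing, and only size counts as a query. The loop sizes mirror the question (10 dates, 3 IPs, 4 providers, 2 counts each).

```ruby
# Hypothetical stand-in for a lazy relation, NOT ActiveRecord:
# chaining is free, only `size` "hits the database".
class FakeRelation
  @queries_run = 0

  class << self
    attr_accessor :queries_run
  end

  def initialize(conditions = [])
    @conditions = conditions
  end

  # Chaining returns a new relation and runs nothing.
  def where(condition)
    FakeRelation.new(@conditions + [condition])
  end

  # Asking for a concrete result is what triggers a query.
  def size
    FakeRelation.queries_run += 1
    0
  end
end

base = FakeRelation.new
esps = %w[gmail hotmail yahoo other]

10.times do |day|
  today = base.where(date_sent: day)
  3.times do |ip|
    thip = today.where(sending_ip: ip)
    esps.each do |esp|
      thip.where(esp: esp).where(event: "opened").size
      thip.where(esp: esp).where(event: "sent").size
    end
  end
end

puts FakeRelation.queries_run # 240
```

Narrowing the relation in the outer loops saves nothing, because every size call still sends the full set of conditions to the database.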
What you want is to run the whole query at once and group it by date, ip and email. That should give you a hash structure, which you would pass to #set_sub_count_hash and do some Ruby gymnastics on to get the counts you're looking for.
I imagine the query something like:
main_query = @m_webhooks.where('date_sent > ?', 10.days.ago.to_date)
  .where(sending_ip: @m_list_of_ips)
OK, now you have one query, which is nice, but I think you should split it into 4 (gmail, hotmail, yahoo and other), which gives you 4 queries (the first one, main_query, will not be executed until you ask for materialized results, don't forget that). Still, that is probably something like 100 times faster.
I think this is the result that should be grouped, mapped and passed to #set_sub_count_hash, instead of passing the raw query and calling methods on it over and over. It will be a little work to do the grouping, mapping and counting for sure, but hey, it's faster. =)
In case this helps anybody else, I learned how to fill a hash with counts in a much simpler way. More importantly, this approach runs a single query (as opposed to the 240 queries that I was running before).
@count_array[esp_index][j] = MWebhook.where('date_sent > ?', start_date.to_date)
  .group('date_sent', 'sending_ip', 'event', 'esp').count
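For what it's worth, a grouped count over several columns comes back as a hash whose keys are arrays of the grouped values. Below is a rough plain-Ruby sketch of turning such a hash into per-cell counts. The sample data (dates, IPs, numbers) is made up, the event name "opened" is an assumption, and percent_opened is a hypothetical helper rather than the find_perc from the question.

```ruby
require "date"

# Hypothetical sample shaped like the result of
# .group('date_sent', 'sending_ip', 'event', 'esp').count:
# each key is [date_sent, sending_ip, event, esp], each value is a count.
grouped = {
  [Date.new(2015, 1, 1), "10.0.0.1", "sent", "gmail"] => 200,
  [Date.new(2015, 1, 1), "10.0.0.1", "opened", "gmail"] => 50,
  [Date.new(2015, 1, 1), "10.0.0.1", "sent", "hotmail"] => 80,
  [Date.new(2015, 1, 2), "10.0.0.2", "sent", "gmail"] => 10
}

# Nested hash: table[date][ip][esp] => { "sent" => n, "opened" => n }
table = Hash.new do |by_date, date|
  by_date[date] = Hash.new do |by_ip, ip|
    by_ip[ip] = Hash.new { |by_esp, esp| by_esp[esp] = Hash.new(0) }
  end
end

grouped.each do |(date, ip, event, esp), count|
  table[date][ip][esp][event] += count
end

# Hypothetical helper: percentage of sent webhooks that were opened.
def percent_opened(counts)
  sent = counts["sent"]
  return 0.0 if sent.zero?
  (100.0 * counts["opened"] / sent).round(1)
end

cell = table[Date.new(2015, 1, 1)]["10.0.0.1"]["gmail"]
puts percent_opened(cell) # 25.0
```

All the counting happens in one database query; the Ruby side only rearranges a few hundred numbers.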
I've got two attributes I'm trying to average, but it's only averaging the second field here. Is there a way to do this?
e = TiEntry.where('ext_trlid = ? AND mat_pidtc = ?', a.trlid, a.pidtc).average(:mat_mppss_rprcp && :mat_fppss_rprcp)
e = TiEntry.where('ext_trlid = ? AND mat_pidtc = ?', a.trlid, a.pidtc).select("AVG(mat_mppss_rprcp) AS avg1, AVG(mat_fppss_rprcp) AS avg2").map { |i| [i.avg1, i.avg2] }
Is this working for you? It works the way the average method does, but you can support as many values as you want.
The advantage of this over the other queries here is that it uses only one simple SQL query. Approaches that fetch everything in your table with SQL (which can take some time if the table is big) and then compute the average in Ruby are slower.
I am sure that you have already looked at http://api.rubyonrails.org/classes/ActiveRecord/Calculations.html#method-i-average
But you can't get the average of two things in a single average call.
What you can do to avoid repeating your query conditions is:
entries = TiEntry.where('ext_trlid = ? AND mat_pidtc = ?', a.trlid, a.pidtc)
average_mppss = entries.average(:mat_mppss_rprcp)
average_fppss = entries.average(:mat_fppss_rprcp)
This way the conditions are written only once and the relation is reused. Note that each average call still runs its own SQL query.
I hope that this works for you.
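As for why only the second field was averaged in the original attempt: that is plain Ruby, not Rails. Both symbols are truthy, so && evaluates to its right-hand operand before average ever sees it. A quick check in irb:

```ruby
# Both symbols are truthy, so && evaluates to its right-hand operand.
column = :mat_mppss_rprcp && :mat_fppss_rprcp
puts column.inspect # :mat_fppss_rprcp
```

So average(:mat_mppss_rprcp && :mat_fppss_rprcp) is exactly the same call as average(:mat_fppss_rprcp).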
I need to find a way to display all Vacancies from my Vacancy model except the ones that a user already applied for.
I keep the IDs of the vacancies a certain user applied for in a separate model, AppliedVacancies.
I was thinking something along the lines of:
@applied = AppliedVacancies.where(employee_id: current_employee)
@appliedvacancies_id = []
@applied.each do |appliedvacancy|
  @appliedvacancies_id << appliedvacancy.id
end
@notyetappliedvacancies = Vacancy.where("id != ?", @appliedvacancies_id)
But it does not seem to like getting an array of IDs. How would I go about fixing this?
I get following error:
PG::DatatypeMismatch: ERROR: argument of WHERE must be type boolean, not type record
LINE 1: SELECT "vacancies".* FROM "vacancies" WHERE (id != 13,14)
^
: SELECT "vacancies".* FROM "vacancies" WHERE (id != 13,14)
This is purely an SQL problem.
You cannot use != to compare a value to a set of values. You need to use the IN operator.
@notyetappliedvacancies = Vacancy.where("id NOT IN (?)", @appliedvacancies_id)
As an aside, you can drastically improve the code you've written so far. You are needlessly instantiating complete ActiveRecord models for every record found in your applied_vacancies table, when all you need are the IDs.
A first pass at improvement would be to use pluck to skip the entire process and go straight to the list of IDs:
ids = AppliedVacancies.where(employee_id: current_employee).pluck(:id)
@notyetappliedvacancies = Vacancy.where("id NOT IN (?)", ids)
Next, you can go a step further and eliminate the first query altogether (or rather, combine it with the last query as a sub-query) by leaving it as an AREL projection which can be subbed into the second query directly:
ids = AppliedVacancies.select(:id).where(employee_id: current_employee)
@notyetappliedvacancies = Vacancy.where("id NOT IN (?)", ids)
This will generate a single query:
select * from vacancies where id not in (select id from applied_vacancies where employee_id = <value>)
Same answer as @meagar's, but the Rails 4 way:
@notyetappliedvacancies = Vacancy.where.not(id: @appliedvacancies_id)
I would like to understand why in Rails 4 (4.2.0) I see the following behaviour when manipulating data in a join table:
student.student_courses
returns all associated records of courses for a given user;
but the following will save changes
student.student_courses[0].status = "attending"
student.student_courses[0].save
while this will not
student.student_courses.find(1).status = "attending"
student.student_courses.find(1).save
Why is that? Why do those two work differently, and is the first one the correct way to do it?
student.student_courses[0] and student.student_courses.find(1) are subtly different things.
When you say student.student_courses, you're just building a query in an ActiveRecord::Relation. Once you do something to that query that requires a trip to the database, the data is retrieved. In your case, that something is calling [] or find. When you call []:
student.student_courses[0]
your student will execute the underlying query and stash all the student_courses somewhere. You can see this by looking at:
> student.student_courses[0].object_id
# and again...
> student.student_courses[0].object_id
# same number is printed twice
But if you call find, only one object is retrieved and a new one is retrieved each time:
> student.student_courses.find(1).object_id
# and again...
> student.student_courses.find(1).object_id
# two different numbers are seen
That means that this:
student.student_courses[0].status = "attending"
student.student_courses[0].save
is the same as saying:
c = student.student_courses[0]
c.status = "attending"
c.save
whereas this:
student.student_courses.find(1).status = "attending"
student.student_courses.find(1).save
is like this:
c1 = student.student_courses.find(1)
c1.status = "attending"
c2 = student.student_courses.find(1)
c2.save
When you use the find version, you're calling status= and save on entirely different objects and since nothing was actually changed in the one that you save, the save doesn't do anything useful.
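If it helps, here is the same behaviour modelled in plain Ruby. Row and FakeAssociation are hypothetical stand-ins I made up (this is not how ActiveRecord is implemented); the only point is that [] hands back cached objects while find builds a new object on every call.

```ruby
# Hypothetical stand-ins, NOT ActiveRecord: Row plays a loaded record and
# FakeAssociation plays student.student_courses.
Row = Struct.new(:id, :status)

class FakeAssociation
  def initialize(stored_rows)
    @stored_rows = stored_rows # what is "in the database"
    @loaded = nil
  end

  # Like []: load everything once, then hand back the cached objects.
  def [](index)
    @loaded ||= @stored_rows.map(&:dup)
    @loaded[index]
  end

  # Like find: build a brand-new object from the stored data on each call.
  def find(id)
    @stored_rows.find { |row| row.id == id }.dup
  end
end

courses = FakeAssociation.new([Row.new(1, "enrolled")])

puts courses[0].equal?(courses[0])           # true
puts courses.find(1).equal?(courses.find(1)) # false

courses[0].status = "attending"
puts courses[0].status                       # attending

courses.find(1).status = "attending"
puts courses.find(1).status                  # enrolled
```

The last two lines are the find case from the question: the assignment lands on one throwaway object and the next find returns a different, unchanged one.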
student_courses is an ActiveRecord::Relation, basically a key => value store. The find method would only work on a model
I'm using Rails, I'm doing something like this - which is more efficient?
post = current_user.posts.find(29)
OR
post = Post.where("user_id = ? AND id = ?", user.id, 29).first
I'm guessing the first statement would do something like SELECT * FROM posts WHERE user_id = x (current_user is a preset User instance) and then find post #29 amongst the returned rows, whereas the second one might do something like SELECT * FROM posts WHERE user_id = x AND id = 29 LIMIT 0,1. Is it quicker to fetch everything without any criteria and let Ruby search within the returned rows, or are criteria and a limit quicker? Or does it depend on the length/width of the table and countless other things? Thanks
The SQL query will be the same in both cases, so there is no difference in execution time. The first statement is more idiomatic, though, and hence should be preferred.