Rails: Calling .limit(5) changes order of results - ruby-on-rails

I have a search function that basically returns an ordered list of model records. The problem is that whenever I call .search.limit(5), the results are in a different order from when I call .search.
Here is some of my method
def self.search(server_name, pvp_type)
  if server_name.nil?
    result = Rom::Leaderboard.order('pvp_vs desc, win_percent desc').limit(200)
  end
end
When I call
Rom::Leaderboard.search(nil, 2).pluck(:actor_name)
SQL Translation:
SELECT "rom_leaderboards"."actor_name" FROM "rom_leaderboards" WHERE "rom_leaderboards"."pvp_type" = 2 ORDER BY pvp_vs desc, win_percent desc LIMIT 200
I get the following results:
[Zarglon, Lirav, adf, sdfa, Nonad, ...]
Zarglon and Lirav have the same pvp_vs & win_percent attribute values; adf, sdfa, and Nonad also have the same relationship.
Now when I call
Rom::Leaderboard.search(nil, 2).limit(5).pluck(:actor_name)
SQL Translation:
SELECT "rom_leaderboards"."actor_name" FROM "rom_leaderboards" WHERE "rom_leaderboards"."pvp_type" = 2 ORDER BY pvp_vs desc, win_percent desc LIMIT 5
I get the following results:
[Lirav, Zarglon, sdfa, Nonad, adf]
These queries are both correct (since search returns an ordered list based on pvp_vs & win_percent, and both lists are ordered correctly). But I want them to be the same. For some reason limit changes this order. Is there any way to keep them the same?

Suppose you try to order this array-of-arrays by the first element:
[
[ 1, 1 ],
[ 1, 2 ],
[ 1, 3 ]
]
Both of these (and several others) are valid results because you have duplicate sort keys:
[ [1,1], [1,2], [1,3] ]
[ [1,3], [1,1], [1,2] ]
You're encountering the same problem inside the database. You say that:
Zarglon and Lirav have the same pvp_vs & win_percent attribute values; adf, sdfa, and Nonad also have the same relationship.
So those five values can appear in any order and still satisfy your specified ORDER BY condition. They don't even have to come out of the database in the same order in two executions of the same query.
If you want consistent ordering, you need to ensure that each row in your result set has a unique sort key so that ties will be broken consistently. This is ActiveRecord so you'll have a unique id available so you can use that to break your ordering ties:
result = Rom::Leaderboard.order('pvp_vs desc, win_percent desc, id').limit(200)
# --------------------------------------------------------------^^
That will give you a well defined and unique ordering.
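To see why the extra id key matters, here is a minimal plain-Ruby sketch with no database involved. The Row struct and the sample values are hypothetical stand-ins for the leaderboard records; only the names come from the question.

```ruby
# Hypothetical in-memory rows standing in for rom_leaderboards records.
Row = Struct.new(:id, :actor_name, :pvp_vs, :win_percent)

ROWS = [
  Row.new(1, "Zarglon", 10, 50),
  Row.new(2, "Lirav",   10, 50),
  Row.new(3, "adf",      8, 40),
  Row.new(4, "sdfa",     8, 40),
  Row.new(5, "Nonad",    8, 40)
].freeze

# Mirrors ORDER BY pvp_vs DESC, win_percent DESC, id ASC.
# Negating the numeric keys sorts them descending; id breaks the ties.
def ranked(rows)
  rows.sort_by { |row| [-row.pvp_vs, -row.win_percent, row.id] }
end

# The rows may arrive in any order, yet the ranking never changes,
# so a shorter limit is always a prefix of the longer list.
full  = ranked(ROWS.shuffle).map(&:actor_name)
top_3 = ranked(ROWS.shuffle).first(3).map(&:actor_name)
puts full.inspect
puts top_3.inspect
```

Without the id in the sort key, the two calls would be free to return the tied rows in different orders, which is what the LIMIT 5 and LIMIT 200 queries are doing.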


Active Record .includes with where clause

I'm attempting to avoid an N+1 query by using includes, but I need to filter out some of the child records. Here's what I have so far:
Column.includes(:tickets).where(board_id: 1, tickets: {sprint_id: 10})
The problem is that only Columns containing Tickets with sprint_id of 10 are returned. I want to return all Columns with board_id of 1, and pre-fetch only the tickets with a sprint_id of 10, so that column.tickets is either an empty list or a list of Ticket objects with sprint_id 10.
This is how includes is intended to work. When you add a where clause it applies to the entire query and not just loading the associated records.
One way of doing this is by flipping the query backwards:
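# A ticket belongs to a single column, so the association is :column;
# the where hash still uses the table name (columns).
columns = Ticket.eager_load(:column)
  .where(sprint_id: 10, columns: { board_id: 1 })
  .map(&:column)
  .uniq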
Column.includes(:tickets).where(board_id: 1, tickets: {sprint_id: 10}) makes two SQL queries: one to select the columns that match the specified where clause, and another to select and load the tickets whose column_id equals the id of one of the matched columns.
To get all the related columns without loading unwanted tickets, you can do this:
columns = Column.where(board_id: 1).to_a
tickets = Ticket.where(column_id: columns.map(&:id), sprint_id: 10).to_a
This way you won't be able to call #tickets on each column (as it will again make a database query and you'll have the N+1 problem) but to have a similar way of accessing a column's tickets without making any queries you can do something like this:
grouped_tickets = tickets.group_by(&:column_id)
columns.each do |column|
  column_tickets = grouped_tickets[column.id]
  # Do something with column_tickets if they're present.
end
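The lookup above returns nil for a column with no matching tickets. A small plain-Ruby sketch of the same grouping that hands back an empty list instead is shown here; the Column and Ticket structs and the tickets_by_column helper are hypothetical stand-ins for the models, not part of the original answer.

```ruby
# Hypothetical stand-ins for the ActiveRecord models.
Column = Struct.new(:id, :name)
Ticket = Struct.new(:id, :column_id, :sprint_id)

# Returns { column_id => [tickets] }, with an empty array for every
# column that has no tickets in the given list.
def tickets_by_column(columns, tickets)
  grouped = tickets.group_by(&:column_id)
  columns.to_h { |column| [column.id, grouped.fetch(column.id, [])] }
end

columns = [Column.new(1, "Todo"), Column.new(2, "Done")]
tickets = [Ticket.new(10, 1, 10), Ticket.new(11, 1, 10)]

tickets_by_column(columns, tickets).each do |column_id, column_tickets|
  puts "column #{column_id}: #{column_tickets.map(&:id).inspect}"
end
```

This keeps the two-query approach intact while giving each column either an empty list or its sprint tickets, which is the shape the question asked for.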

How to combine 3 SQL request into one and order it Rails

I'm creating filter for my Point model on Ruby on Rails app. App uses ActiveAdmin+Ransacker for filters. I wrote 3 methods to filter the Point:
def self.filter_by_customer_bonus(bonus_id)
Point.joins(:customer).where('customers.bonus_id = ?', bonus_id)
end
def self.filter_by_classificator_bonus(bonus_id)
Point.joins(:setting).where('settings.bonus_id = ?', bonus_id)
end
def self.filter_by_bonus(bonus_id)
Point.where(bonus_id: bonus_id)
end
Everything works fine, but I need to merge the results of the 3 methods into one array. When Point.count is greater than 1,000,000 (on the production server, for example), this works too slowly, so I need to merge all of them into one method. The problem is that I need to order the final merged array this way:
The result array should start with the results of the first method, followed by the results of the second method, and then the results of the third.
Is it possible to combine these 3 SQL queries into 1 to make it faster, and order the result as described above?
For example my Points are [1,2,3,4,5,6,7,8,9,10]
Result of first = [1,2,3]
Result of second = [2,3,4]
Result of third = [5,6,7]
After the merge I should get [1,2,3,4,5,6,7], but it should come from 1 method, not 3 methods plus a merge. Hope you understand me :)
UPDATE:
The result of the first answer:
Point Load (8.0ms) SELECT "points".* FROM "points" INNER JOIN "customers" ON "customers"."number" = "points"."customer_number" INNER JOIN "managers" ON "managers"."code" = "points"."tp" INNER JOIN "settings" ON "settings"."classificator_id" = "managers"."classificator_id" WHERE "points"."bonus_id" = $1 AND "customers"."bonus_id" = $2 AND "settings"."bonus_id" = $3 [["bonus_id", 2], ["bonus_id", 2], ["bonus_id", 2]]
It returns an empty array.
You can union these using or (documentation):
def self.filter_trifecta(bonus_id)
(
filter_by_customer_bonus(bonus_id)
).or(
filter_by_classificator_bonus(bonus_id)
).or(
filter_by_bonus(bonus_id)
)
end
Note: you might have to hoist those joins up to the first condition; I'm not sure whether .or will handle those forks well as-is.
The query below gives you all the results in a single query. If you have indexes on the foreign keys used here, it should be able to handle a million records.
The one provided earlier does an AND on all 3 queries, which is why you got zero results. You need a union, so the query below should work. (Note: if you are using Rails 5, there is Active Record syntax for this, which the first answer showed.)
Updated:
Point.from(
"(#{Point.joins(:customer).where(customers: { bonus_id: bonus_id }).to_sql}
UNION
#{Point.joins(:setting).where(settings: { bonus_id: bonus_id }).to_sql}
UNION
#{Point.where(bonus_id: bonus_id).to_sql})
AS points")
Instead you can also use your 3 methods like below:
Point.from("(#{Point.filter_by_customer_bonus(bonus_id).to_sql}
UNION
#{Point.filter_by_classificator_bonus(bonus_id).to_sql}
UNION
#{Point.filter_by_bonus(bonus_id).to_sql}
) as points")
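One caveat: SQL UNION removes duplicates, but it does not by itself guarantee the order of the rows unless the outer query has an ORDER BY. The merge semantics the question asks for (first method's results, then the second's, then the third's, without duplicates) can be sketched in plain Ruby. The merge_in_priority_order helper is hypothetical and the id lists are the ones from the question.

```ruby
# Merges several result lists, keeping the first occurrence of each
# element and preserving the order in which the lists were given.
def merge_in_priority_order(*result_sets)
  result_sets.reduce([]) { |merged, ids| merged | ids }
end

first  = [1, 2, 3]
second = [2, 3, 4]
third  = [5, 6, 7]

p merge_in_priority_order(first, second, third)
# => [1, 2, 3, 4, 5, 6, 7]
```

To get the same ordering out of the database, each branch of the UNION would need to carry some kind of priority value that the outer query can sort on; that part is left as an assumption here, not something the answers above show.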

Rails: AND operator in a has_many association

My relationship is a Client can have many ClientJobs. I want to be able to find clients that perform both Job a and Job b. I'm using 3 select boxes so I can pick a maximum of three jobs to select from. The select boxes are populated from the database.
I know how to test for 1 job with the query below. But I need a way to use an AND operator to test that both jobs exist for that client.
@clients = Client.includes("client_jobs").where(
client_jobs: { job_name: params[:job1]})
Unfortunately it's easy to do an IN operation like below, but I'm thinking the syntax for AND should be similar....I hope
@clients = Client.includes("client_jobs").where(
client_jobs: { job_name: [params[:job1], params[:job2]]})
EDIT: Posting the sql statement that hits the database from the answer below
Core Load (0.6ms) SELECT `clients`.* FROM `clients`
CoreStatistic Load (1.9ms) SELECT `client_jobs`.* FROM `client_jobs`
WHERE `client_jobs`.`client_id` IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10,........)
The second query runs through every client_job in the database. It's never tested against params[:job1], params[:job2], etc. So @clients returns nil, crashing my view template
(undefined method `map' for nil:NilClass)
In my opinion, a better approach than self-joins is to simply join ClientJobs and then use GROUP BY and HAVING clauses to filter out only those records that exactly match the given associated records.
performed_jobs = %w(job job2 job3)
Client.joins(:client_jobs).
where(client_jobs: { job_name: performed_jobs }).
group("clients.id").
having("count(*) = #{performed_jobs.count}")
Let's walk through this query:
first two clauses join the ClientJobs to Clients and filter out only those, that have any of the three jobs defined (it uses the IN clause)
next, we group these joined records by Client.id so that we get the clients back
finally, the having clause ensures we only return those clients that had exactly 3 ClientJob records joined in, i.e. only those that had all the three client jobs defined.
It is the trick with HAVING(COUNT(*) = ...) that turns the IN clause (which is essentially an OR-ed list of options) into a "must have all these" clause.
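The same filter, group, and count steps can be traced on in-memory data. This is a plain-Ruby sketch with a hypothetical ClientJobRow struct and helper name; it assumes each client has at most one row per job name, which the count(*) comparison also relies on (otherwise counting distinct job names would be safer).

```ruby
# Hypothetical stand-in for a client_jobs row.
ClientJobRow = Struct.new(:client_id, :job_name)

# Returns the ids of clients that have every job listed in `required`.
def clients_with_all_jobs(rows, required)
  rows
    .select { |row| required.include?(row.job_name) }   # WHERE job_name IN (...)
    .group_by(&:client_id)                               # GROUP BY clients.id
    .select { |_id, jobs| jobs.size == required.size }   # HAVING count(*) = N
    .keys
end

rows = [
  ClientJobRow.new(1, "a"), ClientJobRow.new(1, "b"),
  ClientJobRow.new(2, "a"),
  ClientJobRow.new(3, "b"), ClientJobRow.new(3, "a"), ClientJobRow.new(3, "c")
]

p clients_with_all_jobs(rows, %w[a b])
```

Client 2 matches the IN filter but is dropped by the count check, because only one of the two required jobs is present.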
To do this in a single SQL query try the following:
jobs_with_same_user = ClientJob.select(:user_id).where(job_name: "<job_name1>", user_id: ClientJob.select(:user_id).where(job_name: "<job_name2>"))
@clients = Client.where(id: jobs_with_same_user)
Here's what this query is doing:
1. Select the user_ids of all client jobs with [job_name2].
2. Select the user_ids of all client jobs whose user_id is IN the result set from (1) AND whose job name is [job_name1].
3. Select all clients, using (2) as a subquery.
Not many know this but Rails 4+ supports subqueries. Basically this is a self join acting as subquery for the clients:
SELECT *
FROM clients
WHERE id IN <jobs_with_same_user>
Also, I'm not sure if you're referencing the client_jobs association in your view, but if you are, add the includes statement to avoid an N+1 query:
@clients = Client.includes(:client_jobs).where(id: jobs_with_same_user)
EDIT
If you prefer, the same result can be achieved with a self-referencing inner join:
jobs_with_same_user = ClientJob
.select("client_jobs.user_id AS user_id")
.joins("JOIN client_jobs inner_client_jobs ON inner_client_jobs.user_id=client_jobs.user_id")
.where(client_jobs: { job_name: "<first_job_name1>" }, inner_client_jobs: { job_name: "<job_name2>" })
@clients = Client.where(id: jobs_with_same_user)

How do I query on a subset of ActiveModel records?

I've rewritten this question as my previous explanation was causing confusion.
In the SQL world, you have an initial record set that you apply a query to. The output of this query is the result set. Generally, the initial record set is an entire table of records and the result set is the records from the initial record set that match the query ruleset.
I have a use case where I need my application to occasionally operate on only a subset of records in a table. If a table has 10,000 records in it, I'd like my application to behave like only the first 1,000 records exist. These should be the same 1,000 records each time. In other words, I want the initial record set to be the first 1,000 devices in a table (when ordered by primary key), and the result set the resulting records from these first 1,000 devices.
Some solutions have been proposed, and they revealed that my initial description was not very clear. To be more explicit, I am not trying to implement pagination. I'm also not trying to limit the number of results I receive (which .limit(1000) would indeed achieve).
Thanks!
This is the line in your question that I don't understand:
This causes issues though with both of the calls, as limit limits the results of the query, not the database rows that the query is performed on.
This is not a Rails thing, this is a SQL thing.
Device.limit(n) runs SELECT * FROM devices LIMIT n
Limit always returns a subset of the queried result set.
Would first(n) accomplish what you want? It will both order the result set ascending by the PK and limit the number of results returned.
SQL Statements can be chained together. So if you have your subset, you can then perform additional queries with it.
my_subset = Device.where(family: "Phone")
# SQL: SELECT * FROM Device WHERE `family` = "Phone"
my_results = my_subset.where(style: "Touchscreen")
# SQL: SELECT * FROM Device WHERE `family` = "Phone" AND `style` = "Touchscreen"
Which can also be written as:
my_results = Device.where(family: "Phone").where(style: "Touchscreen")
my_results = Device.where(family: "Phone", style: "Touchscreen")
# SQL: SELECT * FROM Device WHERE `family` = "Phone" AND `style` = "Touchscreen"
From your question, if you'd like to select the first 1,000 rows (ordered by primary key, pkey) and then query against that, you'll need to do:
my_results = Device.find_by_sql("SELECT *
FROM (SELECT * FROM devices ORDER BY pkey ASC LIMIT 1000) AS first_devices
WHERE `more_searching` = 'happens here'")
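The difference between querying inside the first N rows and limiting the query results to N rows is easy to miss, so here is a plain-Ruby sketch of both on in-memory data. The DeviceRow struct and both helper names are hypothetical.

```ruby
# Hypothetical stand-in for a devices row.
DeviceRow = Struct.new(:id, :family)

# Like: SELECT * FROM (SELECT ... ORDER BY id LIMIT n) sub WHERE family = ?
# The subset is fixed first, then the condition is applied inside it.
def filter_within_first(rows, n, family)
  rows.sort_by(&:id).first(n).select { |device| device.family == family }
end

# Like: SELECT ... WHERE family = ? ORDER BY id LIMIT n
# The condition is applied to the whole table, then the result is cut to n.
def first_of_filtered(rows, n, family)
  rows.select { |device| device.family == family }.sort_by(&:id).first(n)
end

rows = (1..6).map { |i| DeviceRow.new(i, i.even? ? "Phone" : "Tablet") }

p filter_within_first(rows, 3, "Phone").map(&:id)
p first_of_filtered(rows, 3, "Phone").map(&:id)
```

With six rows and n = 3, the first helper only ever sees ids 1 to 3 and so finds a single phone, while the second finds three phones from across the whole table.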
You could specifically ask for a set of IDs:
Device.where(id: (1..4).to_a)
That will construct a WHERE clause like:
WHERE id IN (1,2,3,4)

Rails: Find biggest number out of val.size

user = SkillUser.find_all_by_skill_id(skill_id)
user.size
gives me: 1 2 2 1 3 1 3 1 3 2 1 1 3
How can I get the biggest value (in this case 3) out of this row of numbers?
Thanks for help
You can use the maximum calculation method on your ActiveRecord relation:
SkillUser.maximum(:rating)
If you want the maximum of an attribute called rating.
If you want to count the number of users per skill id, try:
SkillUser.count(:group => :skill_id).max_by { |skill_id,count| count }
This gives you both the skill_id and the number of users for the skill with most users.
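The count-then-max_by step works the same way on any hash of counts. A minimal plain-Ruby sketch, assuming the skill ids have already been loaded into an array (the busiest_skill helper and the sample ids are hypothetical):

```ruby
# Counts how often each skill id occurs and returns the
# [skill_id, count] pair with the highest count (nil for an empty list).
def busiest_skill(skill_ids)
  counts = skill_ids.each_with_object(Hash.new(0)) do |skill_id, tally|
    tally[skill_id] += 1
  end
  counts.max_by { |_skill_id, count| count }
end

skill_id, user_count = busiest_skill([7, 7, 9, 7, 9, 4])
puts "skill #{skill_id} has #{user_count} users"
```

Doing the counting in Ruby like this means every row is loaded first, so for large tables the grouped SQL count is the better fit.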
For a more efficient way (by doing the whole calculation in SQL), try:
SkillUser.limit(1).reverse_order.count(:group => :skill_id, :order => :count)
# Giving the SQL:
# => SELECT COUNT(*) AS count_all, "skill_users"."skill_id" AS skill_id
# FROM "skill_users" GROUP BY "skill_users"."skill_id"
# ORDER BY "skill_users"."count" DESC LIMIT 1
Be aware that count must be called last because it doesn't return an ActiveRelation for you to further scope the query.
You should use ActiveRecord::Calculations
http://ar.rubyonrails.org/classes/ActiveRecord/Calculations/ClassMethods.html
for performance reasons
1.9.3-194 (main):0 > User.maximum(:id)
(1.6ms) SELECT MAX("users"."id") AS max_id FROM "users"
=> 3
1. The fastest way to find a single maximum value in an unsorted list of integers is to scan the list from left to right and remember the largest value so far.
2. If you sort the list first, you get the additional benefit of easily finding the 2nd, 3rd etc. largest values as well.
3. If you take one of the "maximum" methods hidden in Ruby ... you should check what the implementors are doing to pick the max and compare it to 1. and 2. above :-)
Explanations:
to 1. Doing it this way, you only have to visit each value in the list exactly once and compare it once to the maximum so far.
to 2. Sorting costs O(n*log n) operations on average for a list with n entries. Obviously this is more than the O(n) of solution 1., but you get a bit more out of it.
to 3. Well.. I prefer knowing what happens, but your preferences might vary
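Options 1 and 2 can be sketched in a few lines of plain Ruby. The helper names linear_max and top_k are hypothetical; the sample values are the sizes listed in the question.

```ruby
# Option 1: a single left-to-right scan, remembering the largest value so far.
# Each element is visited and compared once, so the cost is O(n).
def linear_max(values)
  largest = nil
  values.each { |value| largest = value if largest.nil? || value > largest }
  largest
end

# Option 2: sort first (O(n log n)), then read off as many of the
# largest values as needed.
def top_k(values, k)
  values.sort.reverse.first(k)
end

sizes = [1, 2, 2, 1, 3, 1, 3, 1, 3, 2, 1, 1, 3]

p linear_max(sizes)
p top_k(sizes, 2)
```

For a plain array, the built-in sizes.max gives the same answer as linear_max.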
