Limit an array by the sum of a value within the records in rails3 - ruby-on-rails

So let's say I have a Post model where each record has a field "num" holding a random number, plus a user_id.
So I make this:
@posts = Post.where(:user_id => 1)
Now let's say I want to limit the records in my @posts array so that their num values sum to 50 or more (with only the final record going over the limit). So it would be adding post.num + post2.num + post3.num etc., until the total reaches at least 50.
Is there a way to do this?

I would say to just grab all of the records like you already are:
@posts = Post.where(:user_id => 1)
and then use Ruby to do the rest:
sum = 0
selected = []
@posts.each do |post|
  break if sum >= 50
  selected << post
  sum += post.num
end
There's probably a more elegant way, but this will work. It collects posts in order until the sum is equal to or exceeds 50, so selected ends up holding the posts up to and including the one that crosses the limit. Hopefully I understood your question.
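For experimenting outside Rails, the same idea can be sketched as a small helper. This is only a sketch: the Post struct and the take_until_sum name are hypothetical stand-ins, and it assumes the records are already in the order you want to add them up in.

```ruby
# Hypothetical stand-in for a Post record; only `num` matters here.
Post = Struct.new(:id, :num)

# Returns the leading records whose `num` values add up to at least `limit`,
# including the record that crosses the limit. If the total never reaches
# the limit, every record is returned.
def take_until_sum(records, limit)
  sum = 0
  records.take_while do |record|
    reached = sum >= limit # was the limit already reached before this record?
    sum += record.num
    !reached
  end
end

posts = [Post.new(1, 20), Post.new(2, 25), Post.new(3, 10), Post.new(4, 30)]
puts take_until_sum(posts, 50).map(&:id).inspect # prints [1, 2, 3]
```

With an ActiveRecord relation, the same block should work on @posts, since a relation is enumerable once loaded.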

You need to use PostgreSQL window functions.
This gives you the rows whose running sum is lower than 50. The window is ordered by id here, and because a window function result cannot be referenced in the WHERE clause of the same query level, it is wrapped in a subquery:
SELECT t.id, t.num_sum
FROM (
  SELECT a.id, sum(a.num) OVER (ORDER BY a.id) AS num_sum
  FROM posts a WHERE a.user_id = 1
) t
WHERE t.num_sum < 50
But your case is trickier, as you want to go over the limit by one row:
WITH running AS (
  SELECT a.id, sum(a.num) OVER (ORDER BY a.id) AS num_sum
  FROM posts a
  WHERE a.user_id = 1
)
SELECT id, num_sum
FROM running
WHERE num_sum <= (
  SELECT MIN(num_sum)
  FROM running
  WHERE num_sum >= 50
)
You then have to convert this SQL to Arel.
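To check the logic of the window-function approach before translating it, here is a plain Ruby sketch that mirrors it step by step: a running sum per row, the smallest running sum that reaches the limit, and a filter on that cutoff. The Row struct and both method names are hypothetical, and ordering by id is an assumption.

```ruby
# Hypothetical stand-in for a row of the posts table.
Row = Struct.new(:id, :num)

# Pairs each row with its running total, like sum(num) OVER (ORDER BY id).
def with_running_sum(rows)
  total = 0
  rows.sort_by(&:id).map { |row| [row, total += row.num] }
end

# Keeps rows whose running total is <= the smallest running total >= limit.
def rows_up_to_limit(rows, limit)
  running = with_running_sum(rows)
  cutoff = running.map(&:last).select { |sum| sum >= limit }.min
  return [] if cutoff.nil? # mirrors SQL: a comparison with NULL matches no row
  running.select { |_, sum| sum <= cutoff }.map(&:first)
end

rows = [Row.new(3, 10), Row.new(1, 20), Row.new(2, 25), Row.new(4, 30)]
puts rows_up_to_limit(rows, 50).map(&:id).inspect # prints [1, 2, 3]
```

Note the edge case this makes visible: when the total never reaches the limit, the cutoff is missing and no rows come back at all.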

Related

Rails .where query chained to sql function, is there a way to call it on the results without converting them to an array?

I have a method called ranked_users that ranks users' response rates in our system:
def ranked_users
  User.joins(:responds).group(:id).select(
    "users.*, SUM(CASE WHEN answers.response != 3 THEN 1 ELSE 0 END) avg, RANK() OVER (
      ORDER BY SUM(CASE WHEN answers.response != 3 THEN 1 ELSE 0 END) DESC,
      CASE WHEN users.id = '#{current_user.id}' THEN 1 ELSE 0 END DESC
    ) rank"
  )
  .where('users.active = true')
  .where('answers.created_at BETWEEN ? AND ?', Time.now - 12.months, Time.now)
end
result = ranked_users
I then take the top three with top_3 = ranked_users.limit(3)
If the user is not in the top 3, I want to append them with their rank to the list:
user_rank = result.find_by(id: current_user.id)
Whenever I call user_rank.rank it returns 1. I know this is because it's applying the find_by clause first and then ranking them. Is there a way to enforce the find_by clause happens only on the result of the first query? I tried doing result.load.find_by(...) but had the same issue. I could convert the entire result into an array but I want the solution to be highly scalable.
If you expect lots of users with lots of answers and a high load on your rating system, you can create a materialized view for the ranking query (user_id, avg, rank, etc.) and refresh it periodically (say, a few times per day or even less often) instead of calculating the rank every time. The scenic gem helps with this.
You can even add indexes on rank and user id to the view, so that your query becomes two simple, fast reads from it.
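The reason a precomputed ranking avoids the "always rank 1" problem is the order of operations: the rank is computed over all users first, and the lookup happens afterwards. A plain Ruby sketch of that difference follows; the UserScore struct and rank_scores are hypothetical and only imitate SQL RANK() semantics.

```ruby
# Hypothetical stand-in for a user with an aggregated response score.
UserScore = Struct.new(:id, :score)

# Assigns SQL RANK()-style ranks by descending score:
# ties share a rank, and the rank after a tie skips ahead.
def rank_scores(scores)
  sorted = scores.sort_by { |entry| -entry.score }
  ranks = {}
  sorted.each_with_index do |entry, index|
    previous = index.zero? ? nil : sorted[index - 1]
    tied = !previous.nil? && previous.score == entry.score
    ranks[entry.id] = tied ? ranks[previous.id] : index + 1
  end
  ranks
end

scores = [UserScore.new(1, 5), UserScore.new(2, 9), UserScore.new(3, 9), UserScore.new(4, 2)]

# Rank first, then look the user up: the rank reflects the whole set.
puts rank_scores(scores)[4] # prints 4

# Filter first, then rank: the single remaining row is always rank 1.
puts rank_scores(scores.select { |entry| entry.id == 4 })[4] # prints 1
```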

Given a record and order conditions, find record(s) after or before

For example, you have a list of items, sorted by priority. You have 10,000 items! If you are showing the user a single item, how do you provide buttons for the user to see the previous item or the next item (what are these items)?
You could pass the item's position to the item page and use OFFSET in your SQL query. The downside of this, apart from having to pass a number that may change, is that the database cannot jump to the offset; it has to read every record until it reaches, say, the 9001st record. This is slow. Having searched for a solution, I could not find one, so I wrote order_query.
order_query uses the same ORDER BY query, but also includes a WHERE clause that excludes records before (for next) or after (for prev) the current one.
Here is an example of what the criteria could look like (using the gem above):
p = Issue.find(31).relative_order_by_query(Issue.visible,
[[:priority, %w(high medium low)],
[:valid_votes_count, :desc, sql: '(votes - suspicious_votes)'],
[:updated_at, :desc],
[:id, :desc]])
p.before #=> ActiveRecord::Relation<...>
p.previous #=> Issue<...>
p.position #=> 5
p.next #=> Issue<...>
p.after #=> ActiveRecord::Relation<...>
Have I just reinvented the wheel here? I am very interested in other approaches of doing this on the backend.
Internally this gem builds a query that depends on the current record's order values and looks like:
SELECT ... WHERE
x0 OR
y0 AND (x1 OR
y1 AND (x2 OR
y2 AND ...))
ORDER BY ...
LIMIT 1
where x corresponds to the > / < terms and y to the = terms (for resolving ties), per order criterion.
Example query from the test suite log:
-- Current record: priority='high' (votes - suspicious_votes)=4 updated_at='2014-03-19 10:23:18.671039' id=9
SELECT "issues".* FROM "issues" WHERE
("issues"."priority" IN ('medium','low') OR
"issues"."priority" = 'high' AND (
(votes - suspicious_votes) < 4 OR
(votes - suspicious_votes) = 4 AND (
"issues"."updated_at" < '2014-03-19 10:23:18.671039' OR
"issues"."updated_at" = '2014-03-19 10:23:18.671039' AND
"issues"."id" < 9)))
ORDER BY
"issues"."priority"='high' DESC,
"issues"."priority"='medium' DESC,
"issues"."priority"='low' DESC,
(votes - suspicious_votes) DESC,
"issues"."updated_at" DESC,
"issues"."id" DESC
LIMIT 1
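The nested x/y structure can be generated recursively. The sketch below is not the gem's code: the criteria format and the after_condition name are hypothetical, and values are interpolated directly only to keep the shape readable (real code should use bind parameters).

```ruby
# Builds the "records after the current one" condition.
# Each criterion is [sql_expression, direction, current_value] (hypothetical format).
# For :desc the following records have smaller values, for :asc larger ones.
def after_condition(criteria)
  expr, direction, value = criteria.first
  op = direction == :desc ? "<" : ">"
  strict = "#{expr} #{op} #{value}" # the x term
  rest = criteria.drop(1)
  return strict if rest.empty?
  # the y term (a tie on this criterion) defers to the remaining criteria
  "#{strict} OR #{expr} = #{value} AND (#{after_condition(rest)})"
end

puts after_condition([["votes", :desc, 4], ["id", :desc, 9]])
# prints: votes < 4 OR votes = 4 AND (id < 9)
```

Because AND binds tighter than OR in SQL, the generated string needs parentheses only around each nested tail.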
I found an alternative approach, and it uses a construct from the SQL '92 standard (Predicates 209), the row values constructor comparison predicate:
Let Rx and Ry be the two row value constructors of the comparison predicate and let RXi and RYi be the i-th row value constructor elements of Rx and Ry, respectively. "Rx comp op Ry" is true, false, or unknown as follows:
"Rx = Ry" is true if and only if RXi = RYi for all i.
"Rx <> Ry" is true if and only if RXi <> RYi for some i.
"Rx < Ry" is true if and only if RXi = RYi for all i < n and RXn < RYn for some n.
"Rx > Ry" is true if and only if RXi = RYi for all i < n and RXn > RYn for some n.
I found an example in this article by Markus Winand. Row value constructor comparison predicate can be used like this:
SELECT *
FROM sales
WHERE (sale_date, sale_id) < (?, ?)
ORDER BY sale_date DESC, sale_id DESC
This is roughly equivalent to this query:
SELECT *
FROM sales
WHERE sale_date < ? OR (sale_date = ? AND sale_id < ?)
ORDER BY sale_date DESC, sale_id DESC
The first caveat is that, to use this directly, all the order components have to be in the same direction; otherwise more fiddling is required. The other is that, despite being standard, row value comparison predicates are not supported by most databases (they do work on PostgreSQL).
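Ruby arrays happen to compare the same way (element by element, first difference decides), which makes the equivalence of the two queries easy to check. This is only an illustration with hypothetical method names; the date strings are assumed to be in ISO format so that string order matches date order.

```ruby
# Row-value comparison (a, b) < (x, y), using Ruby's lexicographic Array#<=>.
def row_less_than?(left, right)
  (left <=> right) == -1
end

# The expanded form: a < x OR (a = x AND b < y).
def expanded_less_than?(left, right)
  left[0] < right[0] || (left[0] == right[0] && left[1] < right[1])
end

current = ["2014-03-19", 9] # (sale_date, sale_id) of the current row
candidates = [["2014-03-18", 50], ["2014-03-19", 8], ["2014-03-19", 9], ["2014-03-20", 1]]

candidates.each do |row|
  puts "#{row.inspect}: #{row_less_than?(row, current)} / #{expanded_less_than?(row, current)}"
end
```

Both predicates agree on every candidate: the first two rows sort before the current one, the last two do not.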

render query count(DISTINCT "reviews".id) as my_count even if count result is 0

In the query below, Rails counts how many reviews and comments each publication has and orders the results in DESC order.
For instance:
publication one: 2 reviews + 10 comments = 12 (my_count)
publication two: 2 reviews + 5 comments = 7 (my_count)
In the cases above, the query finds and renders the publications as expected. However, if:
publication three: 0 reviews + 5 comments = 5 (my_count)
then the query will not render publication three, because its review count is 0. How can I make it render even if one or both values are 0? Basically, I want to render all records in DESC order, even when a value is 0.
Thanks for guidance!
@publication = Publication.joins(:reviews, :publication_comments)
  .select('"publications".*, count(DISTINCT "reviews".id) + count(DISTINCT "publication_comments".id) as my_count')
  .group('"publications".id')
  .order("my_count DESC")
You need to make a left join with the reviews table, and probably the publication_comments table too. Please try this (left_outer_joins is available in Rails 5 and later):
@publication = Publication.left_outer_joins(:reviews, :publication_comments)
  .select('"publications".*, count(DISTINCT "reviews".id) + count(DISTINCT "publication_comments".id) as my_count')
  .group('"publications".id')
  .order("my_count DESC")
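To see why the join type matters, here is a small in-memory sketch using the numbers from the question. The hashes and method names are hypothetical; they only imitate what an inner join and a left join keep.

```ruby
# Hypothetical data: publication_id => ids of associated rows.
REVIEWS  = { 1 => [101, 102], 2 => [103, 104] }.freeze
COMMENTS = { 1 => (1..10).to_a, 2 => (11..15).to_a, 3 => (16..20).to_a }.freeze
PUBLICATION_IDS = [1, 2, 3].freeze

# Inner join: a publication survives only if it has rows in both tables.
def inner_join_ids
  PUBLICATION_IDS.select { |id| REVIEWS.key?(id) && COMMENTS.key?(id) }
end

# Left join: every publication survives; a missing association counts as 0.
def left_join_counts
  counts = PUBLICATION_IDS.map do |id|
    [id, REVIEWS.fetch(id, []).size + COMMENTS.fetch(id, []).size]
  end
  counts.sort_by { |_, count| -count } # my_count DESC
end

p inner_join_ids   # prints [1, 2]
p left_join_counts # prints [[1, 12], [2, 7], [3, 5]]
```

Publication three has no reviews, so the inner join drops it, while the left join keeps it with a count of 5.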

How to get records based on an offset around a particular record?

I'm building a search UI which searches for comments. When a user clicks on a search result (comment), I want to show the surrounding comments.
My model:
Group (id, title) - A Group has many comments
Comment (id, group_id, content)
For example, when a user clicks on a comment with comment.id equal to 26, I would first find all the comments for that group:
comment = Comment.find(26)
comments = Comment.where(:group_id => comment.group_id)
I now have all of the group's comments. What I then want to do is show comment.id 26, with a max of 10 comments before and 10 comments after.
How can I modify comments to show that offset?
Sounds simple, but it's tricky to get the best performance for this. In any case, you must let the database do the work. That will be faster by an order of magnitude than fetching all rows and filtering / sorting them on the client side.
If by "before" and "after" you mean smaller / bigger comment.id, and we further assume that there can be gaps in the id space, this one query should do all:
WITH x AS (SELECT id, group_id FROM comment WHERE id = 26) -- enter value once
(
SELECT *
FROM x
JOIN comment c USING (group_id)
WHERE c.id > x.id
ORDER BY c.id
LIMIT 10
)
UNION ALL
(
SELECT *
FROM x
JOIN comment c USING (group_id)
WHERE c.id < x.id
ORDER BY c.id DESC
LIMIT 10
)
I'll leave paraphrasing that in Ruby syntax to you, that's not my area of expertise.
Returns 10 earlier comments and 10 later ones. Fewer if fewer exist. Use <= in the 2nd leg of the UNION ALL query to include the selected comment itself.
If you need the rows sorted, add another query level on top with ORDER BY.
Should be very fast in combination with these two indexes for the table comment:
one on (id) - probably covered automatically by the primary key.
one on (group_id, id)
For read-only data you could create a materialized view with a gap-less row-number that would make this even faster.
There is more explanation about parentheses, indexes, and performance in this closely related answer.
Something like:
comment = Comment.find(26)
before_comments = Comment.
  where('created_at <= ?', comment.created_at).
  where('id != ?', comment.id).
  where(group_id: comment.group_id).
  order('created_at DESC').limit(10)
# The later comments must be ordered ascending, so that the limit
# keeps the 10 closest to the selected comment rather than the 10 newest.
after_comments = Comment.
  where('created_at >= ?', comment.created_at).
  where('id != ?', comment.id).
  where(group_id: comment.group_id).
  order('created_at ASC').limit(10)
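The expected result of either approach can be checked against a small in-memory sketch. It assumes "before" and "after" mean smaller and larger ids within one group, with gaps allowed; surrounding_ids is a hypothetical helper, not something Rails provides.

```ruby
# Returns the target id with up to `count` ids before and after it,
# in ascending order. Fewer are returned if fewer exist.
def surrounding_ids(ids, target_id, count = 10)
  sorted = ids.sort
  before = sorted.select { |id| id < target_id }.last(count)  # the closest smaller ids
  after  = sorted.select { |id| id > target_id }.first(count) # the closest larger ids
  before + [target_id] + after
end

group_comment_ids = [1, 2, 5, 9, 26, 30, 31, 40]
p surrounding_ids(group_comment_ids, 26, 2) # prints [5, 9, 26, 30, 31]
```

This loads every id of the group, so it is only meant for verifying query results on small data, not as a replacement for doing the work in the database.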

Select n objects randomly with condition in Rails

I have a model called Post, with a column called vote, and the table holds a large number of posts.
I want to randomly select n posts that have >= x votes. n is very small compared to the number of posts.
What is the best way to do this? I've tried a couple of ways that seem to be very inefficient. Thanks
If you're on MySQL, you could order all the posts that meet the greater than criteria randomly and select the top n.
The actual query would look like
SELECT * FROM posts WHERE votes >= x ORDER BY rand() LIMIT n
Haven't tested this, but something like this should work in Rails 3 and later (Post.all(:conditions => ...) only works up to Rails 3):
Post.where("votes >= ?", x).order(Arel.sql("rand()")).limit(n)
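For comparison, the same selection done in memory looks like the sketch below. The VotedPost struct and random_posts are hypothetical, and this only makes sense when the filtered set is small enough to load; with a large table, the database-side query stays the better fit.

```ruby
# Hypothetical stand-in for a Post record with a vote count.
VotedPost = Struct.new(:id, :votes)

# Picks up to `n` distinct posts at random among those with at least
# `min_votes` votes. Array#sample(n) never returns the same element twice.
def random_posts(posts, min_votes, n)
  posts.select { |post| post.votes >= min_votes }.sample(n)
end

posts = (1..100).map { |id| VotedPost.new(id, id % 10) }
picked = random_posts(posts, 7, 3)
puts picked.map(&:id).inspect # three random ids whose votes are 7, 8 or 9
```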

Resources