I have a Rails application with the following models:
User
Bet
User has_many bets and Bet belongs_to User. Every Bet has a profitloss value, which states how much the User has won or lost on that Bet.
So to calculate how much a specific User has won overall, I sum his bets in the following way:
user.bets.sum(:profitloss)
I would like to show the User his ranking compared to all the other Users, which could look something like this:
"Your overall ranking: 37th place"
To do so I need to sum up the overall winnings per User, and find out in which position the current user is.
How do I do that, and how can I do it so that it doesn't overload the server? :)
Thanks!
You can try something similar to
User.joins(:bets).
select("users.id, sum(bets.profitloss) as accumulated").
group("users.id").
order("accumulated DESC")
and then search in the resulting list of "users" (not real users; they have only two meaningful attributes, their ID and an accumulated attribute with the sum) for the one corresponding to the current user.
In any case to get a single user's position, you have to calculate all users' accumulated, but at least this is only one query. Even better, you can store in the user model the accumulated value, and query just it for ranking.
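The "search in the resulting list" step can be sketched in plain Ruby. This is only an illustration: Row is a hypothetical stand-in for the objects the grouped query returns, and rank_of is a made-up helper name, not part of Rails.

```ruby
# Hypothetical stand-in for the rows the grouped query returns:
# each row carries only an id and an accumulated sum.
Row = Struct.new(:id, :accumulated)

# Returns the 1-based position of user_id in rows, or nil if it is absent.
# Assumes rows are already sorted by accumulated, highest first.
def rank_of(rows, user_id)
  index = rows.index { |row| row.id == user_id }
  index && index + 1
end

rows = [Row.new(3, 50), Row.new(1, 20), Row.new(2, -5)]
puts rank_of(rows, 1)
```

Note that this walks the whole list in memory, so it only shifts the cost from the database to Ruby; the stored-total idea avoids that.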
If you have a large number of Users and Bets, you won't be able to compute and sort the global profitloss of each user "on demand", so I suggest that you use a rake task that you schedule regularly (once a day, every hour, etc...)
Add a column position in the User model, get the list of all Users, compute their global profitloss, sort the list of Users with their profitloss, and finally update the position attribute of each User with their position in the list.
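A rough sketch of the computation such a task would run, in plain Ruby without the database. The hash shape for a bet and the name positions_by_user are assumptions for illustration; in the real task the totals would come from the bets table and the result would be written to the position column.

```ruby
# bets: array of hashes with :user_id and :profitloss (hypothetical shape).
# Returns a hash of user_id => position, where 1 is the highest total.
def positions_by_user(bets)
  totals = Hash.new(0)
  bets.each { |bet| totals[bet[:user_id]] += bet[:profitloss] }

  totals
    .sort_by { |_user_id, total| -total }
    .map.with_index(1) { |(user_id, _total), position| [user_id, position] }
    .to_h
end

bets = [
  { user_id: 1, profitloss: 10 },
  { user_id: 2, profitloss: 30 },
  { user_id: 1, profitloss: -5 },
  { user_id: 3, profitloss: 20 }
]
p positions_by_user(bets)
```

Users with equal totals get consecutive positions in an unspecified order here; whether ties should share a position is a design decision the task has to make.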
The best way to do this is to keep a precalculated total in your database, either on the user model itself or on a separate model that has a 1:1 relation to user. If you don't, you have to calculate the sum for all users every time you need a rating, which means a full-table operation on the bets table. That said, this query will give you the desired result; if more than one person has the same total, they all get the same rating X:
select id, (select count(h.id) from users u inner join
(select user_id, sum(profitloss) as `total` from bets group by user_id) b2
on b2.user_id = u.id, (select id from users) h inner join
(select user_id, sum(profitloss) as `total` from bets group by user_id) b
on b.user_id = h.id where u.id = 1 and (b.total > b2.total))
as `rating` from users where id = 1;
You will need to plug the user's id into the query wherever it says id = 1.
If you add a column to the users table to keep track of each user's total, the query is a little simpler. In this example the column name is total_profit_loss:
select id, total_profit_loss, (select count(h.username)+1 from users u,
(select username, total_profit_loss from users) h
where u.id = 1 and (h.total_profit_loss > u.total_profit_loss))
as `rating` from users where id = 1;
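The rule behind that last query ("count everyone with a strictly higher total, then add one") is easier to see in plain Ruby. This is a sketch under the assumption that the totals are already available as a hash of user id to total; rating_for is a hypothetical name.

```ruby
# totals: hash of user_id => total_profit_loss (hypothetical shape).
# A user's rating is one more than the number of users strictly ahead,
# so users with equal totals share the same rating.
def rating_for(totals, user_id)
  own_total = totals.fetch(user_id)
  totals.count { |_id, total| total > own_total } + 1
end

totals = { 1 => 50, 2 => 50, 3 => 20 }
puts rating_for(totals, 3)
```

With this rule two users tied at the top are both rated 1 and the next user is rated 3, not 2.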
I am having trouble figuring out the proper syntax to structure this query correctly. I am trying to show ALL records from both the SalesHistoryDetail table AND the SalesVsBudget table. I believe my query allows some of the records in SalesVsBudget not to be pulled, whereas I want them all for that period, regardless of whether there was a corresponding sale. Here is my code:
SELECT MAX(a.DispatchCenterOrderKey) AS DispatchCenter,
a.CustomerKey,
CASE WHEN a.CustomerKey IN
(SELECT AddressKey
FROM FinancialData.dbo.DimAddress
WHERE AddressKey >= 99000 AND AddressKey <= 99599) THEN 1 ELSE 0 END AS InterCompanyFlag,
MAX(a.Customer) AS Customer,
a.SalesmanID,
MAX(a.Salesman) AS Salesman,
a.SubCategoryKey,
MAX(a.SubCategoryDesc) AS Subcategory,
SUM(a.Value) AS SalesAmt,
b.FiscalYear AS Year,
b.FiscalWeekOfYear AS Week,
MAX(c.BudgetLbs) AS BudgetLbs,
MAX(c.BudgetDollars) AS BudgetDollars
FROM dbo.SalesHistoryDetail AS a
LEFT OUTER JOIN dbo.M_DateDim AS b ON a.InvoiceDate = b.Date
FULL OUTER JOIN dbo.SalesVsBudget AS c ON a.SalesmanID = c.SalesRepKey
AND a.CustomerKey = c.CustomerKey
AND a.SubCategoryKey = c.SubCategoryKey
AND b.FiscalYear = c.Year AND b.FiscalWeekOfYear = c.WeekNo
GROUP BY a.SalesmanID, a.CustomerKey, a.SubCategoryKey, b.FiscalYear, b.FiscalWeekOfYear
There are two different data sets that I am pulling from: the SalesHistoryDetail table and the SalesVsBudget table. I'm hoping to get ALL BudgetLbs and BudgetDollars values from the SalesVsBudget table regardless of whether they match in the join. I want all of the matching joined records too, but I also want EVERY record from SalesVsBudget. Essentially I want to show ALL sales records and reference the budget values from SalesVsBudget when the salesman, customer, subcategory, year and week match, but I also want to see budget entries that fall in my date range and have no corresponding sales records in that period. Hopefully that makes sense. I feel I am very close, but my budget numbers don't reflect the whole story, and I think that is because some of my records are being excluded! Please help.
I was able to accomplish this by experimenting with the FULL OUTER JOIN. My problem was that there were more records in SalesVsBudget than in SalesHistory_V. I therefore had to make SalesVsBudget the initial FROM table and join SalesHistory_V to it with a FULL OUTER JOIN, after which all records lined up.
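For anyone unsure what a FULL OUTER JOIN should return here, a small plain-Ruby sketch of the idea: every key from either side appears once, with nil where the other side has no match. The hash shapes, the sample keys and the name full_outer_join are all made up for illustration and are not the real tables.

```ruby
# sales and budgets: hashes keyed by the join columns (hypothetical shape),
# e.g. [salesman, customer] => amount.
# Every key from either side appears once; a missing side is nil.
def full_outer_join(sales, budgets)
  (sales.keys | budgets.keys).map do |key|
    { key: key, sales_amt: sales[key], budget_dollars: budgets[key] }
  end
end

sales   = { ["rep1", "custA"] => 100 }
budgets = { ["rep1", "custA"] => 120, ["rep2", "custB"] => 80 }
full_outer_join(sales, budgets).each { |row| p row }
```

If budget rows without sales still go missing in the real query, it is worth checking whether the GROUP BY columns all come from the sales side, since those are NULL for unmatched budget rows.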
I have the following code to join two tables microposts and activities with micropost_id column and then order based on created_at of activities table with distinct micropost id.
Micropost.joins("INNER JOIN activities ON
(activities.micropost_id = microposts.id)").
where('activities.user_id= ?',id).order('activities.created_at DESC').
select("DISTINCT (microposts.id), *")
which should return all micropost columns. This is not working in my development environment:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
If I add activities.created_at to the SELECT DISTINCT, I get repeated micropost ids, because the rows have distinct activities.created_at values. I have searched a lot to get this far, but the problem always persists because of this Postgres rule, which exists to avoid an arbitrary selection.
I want to select distinct micropost ids, ordered by activities.created_at.
Please help.
To start with, we need to quickly cover what SELECT DISTINCT is actually doing. It looks like just a nice keyword to make sure you only get back distinct values, which shouldn't change anything, right? Except as you're finding out, behind the scenes, SELECT DISTINCT is actually acting more like a GROUP BY. If you want to select distinct values of something, you can only order that result set by the same values you're selecting -- otherwise, Postgres doesn't know what to do.
To explain where the ambiguity comes from, consider this simple set of data for your activities:
CREATE TABLE activities (
id INTEGER PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE,
micropost_id INTEGER REFERENCES microposts(id)
);
INSERT INTO activities (id, created_at, micropost_id)
VALUES (1, current_timestamp, 1),
(2, current_timestamp - interval '3 hours', 1),
(3, current_timestamp - interval '2 hours', 2)
You stated in your question that you want "distinct micropost_id" "based on order of activities.created_at". It's easy to order these activities by descending created_at (1, 3, 2), but both 1 and 2 have the same micropost_id of 1. So if you want the query to return just micropost IDs, should it return 1, 2 or 2, 1?
If you can answer the above question, you need to take your logic for doing so and move it into your query. Let's say that, and I think this is pretty likely, you want this to be a list of microposts which were most recently acted on. In that case, you want to sort the microposts in descending order of their most recent activity. Postgres can do that for you, in a number of ways, but the easiest way in my mind is this:
SELECT micropost_id
FROM activities
JOIN microposts ON activities.micropost_id = microposts.id
GROUP BY micropost_id
ORDER BY MAX(activities.created_at) DESC
Note that I've dropped the SELECT DISTINCT bit in favor of using GROUP BY, since Postgres handles them much better. The MAX(activities.created_at) bit tells Postgres to, for each group of activities with the same micropost_id, sort by only the most recent.
You can translate the above to Rails like so:
Micropost.select('microposts.*')
.joins("JOIN activities ON activities.micropost_id = microposts.id")
.where('activities.user_id' => id)
.group('microposts.id')
.order('MAX(activities.created_at) DESC')
Hope this helps! You can play around with this sqlFiddle if you want to understand more about how the query works.
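To see what the GROUP BY with MAX(created_at) ordering does to the three sample activities without a database, here is a plain-Ruby sketch of the same logic. Activity is a hypothetical Struct, not the ActiveRecord model, and the method name is made up.

```ruby
# Hypothetical stand-in for an activities row.
Activity = Struct.new(:id, :created_at, :micropost_id)

# Groups activities by micropost, keeps each group's most recent
# created_at, and returns micropost ids with the latest activity first.
def micropost_ids_by_latest_activity(activities)
  activities
    .group_by(&:micropost_id)
    .map { |micropost_id, group| [micropost_id, group.map(&:created_at).max] }
    .sort_by { |_micropost_id, latest| latest }
    .reverse
    .map(&:first)
end

now = Time.now
activities = [
  Activity.new(1, now, 1),
  Activity.new(2, now - 3 * 3600, 1),
  Activity.new(3, now - 2 * 3600, 2)
]
p micropost_ids_by_latest_activity(activities)
```

Micropost 1 comes first because its newest activity (id 1) is more recent than the only activity of micropost 2, which resolves the "1, 2 or 2, 1" ambiguity explicitly.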
Try the below code
Micropost.select('microposts.*, activities.created_at')
.joins("INNER JOIN activities ON (activities.micropost_id = microposts.id)")
.where('activities.user_id= ?',id)
.order('activities.created_at DESC')
.uniq
I want to expand this question.
order by foreign key in activerecord
I'm trying to order a set of records based on a value in a really large table.
When I use join, it brings all the "other" records data into the objects.. As join should..
#table users 30+ columns
#table bids 5 columns
record = Bid.find(:all,:joins=>:users, :order=>'users.ranking DESC' ).first
Now record holds 35 fields..
Is there a way to do this without the join?
Here's my thinking..
With the join I get this query
SELECT * FROM "bids"
left join users on runner_id = users.id
ORDER BY ranking LIMIT 1
Now I can add a select to the code so I don't get the full user table, but putting a select in a scope is dangerous IMHO.
When I write sql by hand.
SELECT * FROM bids
order by (select users.ranking from users where users.id = runner_id) DESC
limit 1
I believe this is a faster query, based on the "explain" it seems simpler.
More important than speed though is that the second method doesn't have the 30 extra fields.
If I build in a custom select inside the scope, it could explode other searches on the object if they too have custom selects (there can be only one)
What you want to achieve is something along the lines of
SELECT b.* from bids b inner join users u on u.id=b.user_id order by u.ranking desc
In Active Record I would write it as:
Bid.joins("inner join users u on bids.user_id=u.id").order("u.ranking desc")
I think it's the only way to make a join without fetching all attributes from the user model.
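The underlying idea, ordering bids by a value looked up from users while the result still contains only bid fields, can be sketched in plain Ruby. Bid here is a hypothetical Struct with only two of the columns, and the ranking hash stands in for the users table.

```ruby
# Hypothetical stand-in for a bids row; runner_id points at a user.
Bid = Struct.new(:id, :runner_id)

# Returns the bid whose runner has the highest ranking.
# ranking_by_user_id: hash of user id => ranking (stand-in for users).
# The returned object is still a plain Bid with no user fields attached.
def top_bid(bids, ranking_by_user_id)
  bids.max_by { |bid| ranking_by_user_id.fetch(bid.runner_id) }
end

bids = [Bid.new(1, 10), Bid.new(2, 11), Bid.new(3, 12)]
rankings = { 10 => 5, 11 => 9, 12 => 7 }
p top_bid(bids, rankings)
```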
I am trying to generate a report to screen of accounting transaction history. In most situations it is one display row per record in the AccountingTransaction table. But occasionally there are transactions that I wish to display to the end user as one transaction which are really, behind the scenes, two accounting transactions. This is caused by deferral of revenues and fund splitting since this app is a fund accounting app.
If I display all rows one by one, those double entries look odd to the user since the fund splitting and deferral is "behind the scenes". So I want to roll up all the related transactions into one display row on screen.
I have my query now using group by to group the related transactions
@history = AccountingTransaction.where("customer_id in (?) AND no_download <> 1", customers_in_account).group(:transaction_type_id, :reference_id).order(:created_at)
as I loop through I get the transactions grouped as I want but I am struggling with how to display the total sum of the 'credit' field for all records in the group. (It is only showing the credit for the first record of the group) If I add a .sum(:credit) to my query, of course, it returns the sums just as I want but not all the other data.
Is there a way for me to group these records like in my @history query and also get the sum of the credit field for each respective group?
* Addition *
What I really want is what the following SQL query would give me.
SELECT transaction_type_id, reference_id, sum(credit)
FROM accounting_transactions
WHERE customer_id in (21,22,23,24) AND no_download <> 1
GROUP BY reference_id, transaction_type_id ORDER BY created_at
I'm not sure you can do "ORDER BY created_at" and not include it in the select fields, but here is an example.
@history = AccountingTransaction.
select([:reference_id, :transaction_type_id, :created_at]).
select(AccountingTransaction.arel_table[:credit].sum.as("credit_sum")).
where("customer_id in (?) AND no_download <> 1", customers_in_account).
group(:transaction_type_id, :reference_id).
order(:created_at)
To access the credit_sum you could do:
@history[0].attributes["credit_sum"]
I guess if you'd like, you could create a method:
def credit_sum
attributes["credit_sum"]
end
EDIT *
As stated in comments you can access the attribute directly:
@history[0].credit_sum
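If it helps to check the numbers, the same grouping and per-group sum can be sketched in plain Ruby. The hash shape for a transaction and the name credit_sums are assumptions for illustration only.

```ruby
# transactions: array of hashes with :transaction_type_id, :reference_id
# and :credit (hypothetical shape).
# Returns a hash of [transaction_type_id, reference_id] => summed credit.
def credit_sums(transactions)
  transactions
    .group_by { |t| [t[:transaction_type_id], t[:reference_id]] }
    .transform_values { |group| group.sum { |t| t[:credit] } }
end

transactions = [
  { transaction_type_id: 1, reference_id: 7, credit: 40 },
  { transaction_type_id: 1, reference_id: 7, credit: 60 },
  { transaction_type_id: 2, reference_id: 8, credit: 25 }
]
p credit_sums(transactions)
```

The two rows that share a type and reference collapse into one entry with their credits added, which is the rolled-up display row the question asks for.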
Here are my models and associations:
User has many Awards
Award belongs to User
Prize has many Awards
Award belongs to Prize
Let's pretend that there are four Prizes (captured as records):
Pony
Toy
Gum
AwesomeStatus
Every day a User can be awarded one or more of these Prizes. But the User can only receive each Prize once per day. If the User wins AwesomeStatus, for ex, a record is added to the Awards table with a fk to User and Prize. Obviously, if the User doesn't win the AwesomeStatus for the day, no record is added.
At the end of the day (before midnight, let's say), I want to return a list of Users who lost their AwesomeStatus. (Of course, to lose your AwesomeStatus, you had to have it the day before.) Unfortunately, in my case, I don't think observers will work, and I will have to rely on a script. Regardless, how would you go about determining which Users lost their AwesomeStatus? Note: don't make your solution overly dependent on the period of time -- in this case a day. I want to maintain flexibility in how many times per whatever period Users have an opportunity to win the prize (and to also lose it).
I would probably do something like this:
The class Award should also have a column awarded_at which contains the day the prize was awarded. So when it is time to create the award it can be done like this:
# This will make sure that no award will be created if it already exists for the current date
@user.awards.find_or_create_by_prize_id_and_awarded_at(@prize.id, Time.now.strftime("%Y-%m-%d"))
And then we can have a scope to load all users with an award that will expire today and no active awards for the supplied prize.
# user.rb
scope :are_losing_award, lambda { |prize_id, expires_after|
joins("INNER JOIN awards AS expired_awards ON users.id = expired_awards.user_id AND expired_awards.awarded_at = '#{(Time.now - expires_after.days).strftime("%Y-%m-%d")}'
LEFT OUTER JOIN awards AS active_awards ON users.id = active_awards.user_id AND active_awards.awarded_at > '#{(Time.now - expires_after.days).strftime("%Y-%m-%d")}' AND active_awards.prize_id = #{prize_id}").
where("expired_awards.prize_id = ? AND active_awards.id IS NULL", prize_id)
}
So then we can call it like this:
# Give me all users who got the prize three days ago and have not gotten it again since
User.are_losing_award(@prize.id, 3)
There might be some ways to write the scope better with ARel queries or something, I'm no expert with that yet, but this way should work until then :)
I'd add an integer "time period" field to awards, which stands for a given period of time (day, week, 5 hour period, whatever you want).
Now, you can search the awards table for users who have the award status at t-1, but not at t:
SELECT prev.user_id
FROM awards prev
LEFT OUTER JOIN awards current ON prev.user_id = current.user_id
AND prev.prize_id = current.prize_id
AND current.time_period = 1000
WHERE prev.prize_id = 1
AND current.prize_id IS NULL
AND prev.time_period = 999
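The same "had it at t-1, does not have it at t" check can be written as a set difference in plain Ruby. Award is a hypothetical Struct mirroring the three columns used in the query, and users_who_lost is a made-up name.

```ruby
# Hypothetical stand-in for an awards row with the integer time_period field.
Award = Struct.new(:user_id, :prize_id, :time_period)

# Returns the ids of users who held prize_id in the period before
# current_period but do not hold it in current_period.
def users_who_lost(awards, prize_id, current_period)
  holders = lambda do |period|
    awards
      .select { |a| a.prize_id == prize_id && a.time_period == period }
      .map(&:user_id)
  end
  (holders.call(current_period - 1) - holders.call(current_period)).uniq
end

awards = [
  Award.new(1, 1, 999), Award.new(1, 1, 1000),
  Award.new(2, 1, 999),
  Award.new(3, 2, 999)
]
p users_who_lost(awards, 1, 1000)
```

Because the period is just an integer, the length of a period (a day, a week, five hours) stays a decision of whatever code assigns time_period, which keeps the check itself independent of it.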
Just use updated_at, or add an awarded_at like suggested above and use it like this:
scope :awarded, proc {|date| where(["updated_at <= ?", date])}
In your Award model. Print it like this, maybe:
awesome_status = Prize.find_by_name('AwesomeStatus')
p "Users who do not have AwesomeStatus anymore:"
User.all.each {|user| p user.username if user.awards.awarded(1.day.ago).collect(&:prize_id).include?(awesome_status.id)}
If you want it to be dynamic, displayed somewhere, etc. throw a 'lasts_for' into Prize and compare against it and simply write a maintenance cronjob that sets an 'active' boolean on Award to false instead of deleting the association.