Report using Rails ActiveRecord group by - ruby-on-rails

I am trying to generate a report to screen of accounting transaction history. In most situations it is one display row per record in the AccountingTransaction table. But occasionally there are transactions that I wish to display to the end user as one transaction which are really, behind the scenes, two accounting transactions. This is caused by deferral of revenues and fund splitting since this app is a fund accounting app.
If I display all rows one by one, those double entries look odd to the user since the fund splitting and deferral is "behind the scenes". So I want to roll up all the related transactions into one display row on screen.
I have my query now using group by to group the related transactions
#history = AccountingTransaction.where("customer_id in (?) AND no_download <> 1", customers_in_account).group(:transaction_type_id, :reference_id).order(:created_at)
as I loop through I get the transactions grouped as I want but I am struggling with how to display the total sum of the 'credit' field for all records in the group. (It is only showing the credit for the first record of the group) If I add a .sum(:credit) to my query, of course, it returns the sums just as I want but not all the other data.
Is there a way for me to group these records like in my #history query and also get the sum of the credit field for each respective group?
* Addition *
What I really want is what the following SQL query would give me.
SELECT transaction_type_id, reference_id, sum(credit)
WHERE customer_id in (21,22,23,24) AND no_download <> 1
GROUP BY reference_id, transaction_type_id ORDER BY created_at

I'm not sure you can do "ORDER BY created_at" and not include it in the select fields, but here is an example.
#history = AccountingTransaction.
select([:reference_id, :transaction_type_id, :created_at]).
select(AccountingTransaction.arel_table[:credit].sum.as("credit_sum")).
where("customer_id in (?) AND no_download <> 1", customers_in_account).
group(:transaction_type_id, :reference_id).
order(:created_at)
To access the credit_sum you could do:
#history[0].attributes["credit_sum"]
I guess if you'd like, you could create a method:
def credit_sum
attributes["credit_sum"]
end
EDIT *
As stated in comments you can access the attribute directly:
#history[0].credit_sum

Related

How to delete all logs except last 100 for each user in single table?

I have a single logs table which contains entries for users. I want to (prune) delete all but the last 100 for each user. I'd like to do this in the most efficient way (one statement using ActiveRecord if possible).
I know I can use the following:
.order(created_at: :desc) to get the records sorted
.offset(100) to get all records except the ones I want to keep
.ids to pluck the record ids
select(:user_id).distinct to get a list of all users in the table
The table has id, user_id, created_at columns (and others not pertinent to this question).
Each user should have at least the last 100 log entries remaining the logs table.
Not really sure how to do this using ruby syntax with my Log model. If it can't be done efficiently using ruby then I'll resort to using the SQL equivalent.
Any help much appreciated.
In SQL, you could do this:
DELETE FROM logs
USING (SELECT id
FROM (SELECT id,
row_number()
OVER (PARTITION BY user_id
ORDER BY created_at DESC)
AS rownr
FROM logs
) AS a
WHERE rownr > 100
) AS b
WHERE logs.id = b.id;
If the table is large, this will be slow.

How to show all records from multiple tables regardless of match on join statement

I am trouble figuring out the proper syntax to structure this query correctly. I am trying to show ALL records from both the SalesHistoryDetail AND from the SalesVsBudget table. I believe my query allows for some of the records on SalesVsBudget to not be pulled, whereas I want them all for that period, regardless of whether there was a corresponding sale. Here is my code:
SELECT MAX(a.DispatchCenterOrderKey) AS DispatchCenter,
a.CustomerKey,
CASE WHEN a.CustomerKey IN
(SELECT AddressKey
FROM FinancialData.dbo.DimAddress
WHERE AddressKey >= 99000 AND AddressKey <= 99599) THEN 1 ELSE 0 END AS InterCompanyFlag,
MAX(a.Customer) AS Customer,
a.SalesmanID,
MAX(a.Salesman) AS Salesman,
a.SubCategoryKey,
MAX(a.SubCategoryDesc) AS Subcategory,
SUM(a.Value) AS SalesAmt,
b.FiscalYear AS Year,
b.FiscalWeekOfYear AS Week,
MAX(c.BudgetLbs) AS BudgetLbs,
MAX(c.BudgetDollars) AS BudgetDollars
FROM dbo.SalesHistoryDetail AS a
LEFT OUTER JOIN dbo.M_DateDim AS b ON a.InvoiceDate = b.Date
FULL OUTER JOIN dbo.SalesVsBudget AS c ON a.SalesmanID = c.SalesRepKey
AND a.CustomerKey = c.CustomerKey
AND a.SubCategoryKey = c.SubCategoryKey
AND b.FiscalYear = c.Year AND b.FiscalWeekOfYear = c.WeekNo
GROUP BY a.SalesmanID, a.CustomerKey, a.SubCategoryKey, b.FiscalYear, b.FiscalWeekOfYear
There are two different data sets that I am pulling from, obviously the SalesHistoryDetail table and the SalesVsBudget table. I'm hoping to get ALL budgetLbs, and BudgetDollars values from the SalesVsBudget table regardless of whether they match in the join. I want all of the matching joining records too, but I also want EVERY record from SalesVsBudget. Essentially I want to show ALL sales records and I want to reference the budget values from SalesVsBudget when the salesman,customer,subcategory, year and week match but I also want to see budget entries that fall in my date range that don't have corresponding sales records in that period. Hopefully that makes sense. I feel I am very close, but my budget numbers doesn't reflect the whole story and I think that is because some of my records are being excluded! Please help.
I was able to accomplish this through playing with the FULL OUTER JOIN. My problems was there were more records in SalesVsBudget than SalesHistory_V. Therefore I had to make SalesVsBudget the initial FROM table and SaleHistory_V with a FULL OUTER JOIN and all records lined up.

ActiveRecord query searching for duplicates on a column, but returning associated records

So here's the lay of the land:
I have a Applicant model which has_many Lead records.
I need to group leads by applicant email, i.e. for each specific applicant email (there may be 2+ applicant records with the email) i need to get a combined list of leads.
I already have this working using an in-memory / N+1 solution
I want to do this in a single query, if possible. Right now I'm running one for each lead which is maxing out the CPU.
Here's my attempt right now:
Lead.
all.
select("leads.*, applicants.*").
joins(:applicant).
group("applicants.email").
having("count(*) > 1").
limit(1).
to_a
And the error:
Lead Load (1.2ms) SELECT leads.*, applicants.* FROM "leads" INNER
JOIN "applicants" ON "applicants"."id" = "leads"."applicant_id"
GROUP BY applicants.email HAVING count(*) > 1 LIMIT 1
ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: column
"leads.id" must appear in the GROUP BY clause or be used in an
aggregate function
LINE 1: SELECT leads.*, applicants.* FROM "leads" INNER JOIN
"appli...
This is a postgres specific issue. "the selected fields must appear in the GROUP BY clause".
must appear in the GROUP BY clause or be used in an aggregate function
You can try this
Lead.joins(:applicant)
.select('leads.*, applicants.email')
.group_by('applicants.email, leads.id, ...')
You will need to list all the fields in leads table in the group by clause (or all the fields that you are selecting).
I would just get all the records and do the grouping in memory. If you have a lot of records, I would paginate them or batch them.
group_by_email = Hash.new { |h, k| h[k] = [] }
Applicant.eager_load(:leads).each_batch(10_000) do |batch|
batch.each do |applicant|
group_by_email[:applicant.email] << applicant.leads
end
end
You need to use a .where rather than using Lead.all. The reason it is maxing out the CPU is you are trying to load every lead into memory at once. That said I guess I am still missing what you actually want back from the query so it would be tough for me to help you write the query. Can you give more info about your associations and the expected result of the query?

Summary Count by Group with ActiveRecord

I have a Rails application that holds user data (in an aptly named user_data object). I want to display a summary table that shows me the count of total users and the count of users who are still active (status = 'Active'), created each month for the past 12 months.
In SQL against my Postgres database, I can get the result I want with the following query (the date I use in there is calculated by the application, so you can ignore that aspect):
SELECT total.creation_month,
total.user_count AS total_count,
active.user_count AS active_count
FROM
(SELECT date_trunc('month',"creationDate") AS creation_month,
COUNT("userId") AS user_count
FROM user_data
WHERE "creationDate" >= to_date('2015 12 21', 'YYYY MM DD')
GROUP BY creation_month) AS total
LEFT JOIN
(SELECT date_trunc('month',"creationDate") AS creation_month,
COUNT("userId") AS user_count
FROM user_data
WHERE "creationDate" >= to_date('2015 12 21', 'YYYY MM DD')
AND status = 'Active'
GROUP BY creation_month) AS active
ON total.creation_month = active.creation_month
ORDER BY creation_month ASC
How do I write this query with ActiveRecord?
I previously had just the total user count grouped by month in my display, but I am struggling with how to add in the additional column of active user counts.
My application is on Ruby 2.1.4 and Rails 4.1.6.
I gave up on trying to do this the ActiveRecord way. Instead I just constructed my query into a string and passed the string into
ActiveRecord::Base.connection.execute(sql_string)
This had the side effect that my result set came out as a array instead of a set of objects. So getting at the values went from a syntax (where user_data is the name assigned to a single record from the result set) like
user_data.total_count
to
user_data['total_count']
But that's a minor issue. Not worth the hassle.

Ranking position

I have a Rails application with the following models:
User
Bet
User has many_bets and Bets belongs_to User. Every Bet has a Profitloss value, which states how much the User has won/lost on that Bet.
So to calculate how much a specific User has won overall I cycle through his bets in the following way:
User.bets.sum(:profitloss)
I would like to show the User his ranking compared to all the other Users, which could look something like this:
"Your overall ranking: 37th place"
To do so I need to sum up the overall winnings per User, and find out in which position the current user is.
How do I do that and how to do it, so it don't overload the server :)
Thanks!
You can try something similar to
User.join(:bets).
select("users.id, sum(bets.profitloss) as accumulated").
group("users.id").
order("accumulated DESC")
and then search in the resulting list of "users" (not real users, they have only two meaningful attributes, their ID and a accumulated attribute with the sum), for the one corresponding to the current one.
In any case to get a single user's position, you have to calculate all users' accumulated, but at least this is only one query. Even better, you can store in the user model the accumulated value, and query just it for ranking.
If you have a large number of Users and Bets, you won't be able to compute and sort the global profitloss of each user "on demand", so I suggest that you use a rake task that you schedule regularly (once a day, every hour, etc...)
Add a column position in the User model, get the list of all Users, compute their global profitloss, sort the list of Users with their profitloss, and finally update the position attribute of each User with their position in the list.
Best way to do it is to keep a pre calculated total in your database either on user model itself or on a separate model that has 1:1 relation to user. If you don't do this, you will have to calculate sum for all users at all times in order to get their rating, which means full table operation on bets table. This said, this query will give you desired results, if more than 1 person has the same total, it will count both as rating X:
select id, (select count(h.id) from users u inner join
(select user_id, sum(profitloss) as `total` from bets group by user_id) b2
on b2.user_id = u.id, (select id from users) h inner join
(select user_id, sum(profitloss) as `total` from bets group by user_id) b
on b.user_id = h.id where u.id = 1 and (b.total > b2.total))
as `rating` from users where id = 1;
You will need to plug user.id into query in where id = X
if you add a column to user table to keep track of their total, query is a little simpler, in this example column name is total_profit_loss:
select id, total_profit_loss, (select count(h.username)+1 from users u,
(select username, score from users) h
where id = 1 and (h.total_profit_loss > u.total_profit_loss))
as `rating` from users where id = 1;

Resources