"Complex" grouping and indexing in rails?

I have a problem to which I can't seem to find a simple solution.
I want to achieve the following:
* I have a list of tasks, each with an owner and a due date
* I want to display a list of all tasks grouped by owner
* I want to sort the owners by due date: the owner with the earliest due date first, followed by the owner with the second earliest, and so on
* I want to be able to paginate the results, preferably with will_paginate
To illustrate, this is the result I am looking for:
Harry
- task 1, due date 1
- task 3, due date 4
Ben
- task 2, due date 2
Carol
- task 4, due date 3
So far, I can query for all owners with tasks, sort them on a virtual attribute with their "earliest due date" and then display them and their tasks in a view.
There are multiple problems with this approach, imo:
* I run multiple queries from the view (owner.tasks.each etc.). I always learned that running queries from the view is bad.
* I'm doing the sorting by loading all records into memory (virtual attribute), which could become a problem with large record sets.
* I have no idea how to add pagination that takes the number of tasks displayed into account.
I can't seem to crack this nut, so any help or hints would be greatly appreciated!
Erwin

Try this query (you did not provide sample data, ideally as SQL, that we could experiment with ourselves):
SELECT
u.id as owner_id, u.name as owner_name, t.id, t.due_date
FROM users u
INNER JOIN tasks m ON u.id = m.owner_id
INNER JOIN tasks t ON u.id = t.owner_id
GROUP BY u.id, u.name, t.id, t.due_date
ORDER BY MIN(m.due_date), t.due_date
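You should get all the data you need in the proper order, and you can paginate by applying LIMIT and OFFSET to it (or by converting it to an ActiveRecord query and passing it to will_paginate). Note that each result row is one task, so a page counts tasks rather than owners.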
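To make the intended ordering concrete, here is a minimal plain-Ruby sketch (no database involved) that reproduces what ORDER BY MIN(m.due_date), t.due_date is meant to produce, using the sample data from the question. The task_row struct and the integer due dates are illustrative stand-ins, not part of the original schema.

```ruby
# Hypothetical stand-in for a row of the tasks table; due dates are plain integers.
task_row = Struct.new(:name, :owner, :due)

tasks = [
  task_row.new("task 1", "Harry", 1),
  task_row.new("task 2", "Ben",   2),
  task_row.new("task 3", "Harry", 4),
  task_row.new("task 4", "Carol", 3)
]

# Group by owner, sort each owner's tasks by due date,
# then sort the owners by their earliest due date.
grouped = tasks
  .group_by(&:owner)
  .map { |owner, owned| [owner, owned.sort_by(&:due)] }
  .sort_by { |_owner, owned| owned.first.due }

grouped.each do |owner, owned|
  puts owner
  owned.each { |t| puts "- #{t.name}, due date #{t.due}" }
end
```

This sorts in memory, which is exactly what the question wants to avoid for large tables, so it is only useful as a way to check the rows the SQL query returns.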

Related

JIRA Query to track any and all activity between certain hours

A contractor I'm working with wants to give us a small bonus based on any "overtime" work we may or may not have done. Problem is, we didn't exactly keep a log of any activity we did outside of normal working hours. I'm trying to use my Jira activity as a starting point. How can I track any and all activity (leaving a comment, changing a status, assigning the ticket to someone else, etc.) conducted by me within certain hours, within certain dates?
For example, any and all activity conducted by me between the hours of 7PM - 8AM from February 3, 2021 to January 15, 2022.

You can get it from the Jira database.
For any activity, you can use the following SQL query to retrieve every change in a project:
SELECT p.pname, p.pkey, i.issuenum, cg.ID, cg.issueid, au.lower_user_name, cg.AUTHOR, cg.CREATED, ci.FIELDTYPE, ci.FIELD, ci.OLDVALUE, ci.OLDSTRING, ci.NEWVALUE, ci.NEWSTRING
FROM changegroup cg
inner join jiraissue i on cg.issueid = i.id
inner join project p on i.project = p.id
inner join changeitem ci on ci.groupid = cg.id
inner join app_user au on cg.author = au.user_key
WHERE p.pkey = '<PROJECT_KEY>'
order by 1,3,4;
You can modify that query by filtering on cg.CREATED to get the activity between certain dates and hours.
NOTE: The query may need adjusting depending on your database.
NOTE 2: See the Atlassian link for further information.
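As a rough illustration of the filter described above, the following plain-Ruby sketch keeps only timestamps that fall between 7 PM and 8 AM inside the date range from the question. The sample timestamps are made up, and in practice the same condition would go into the WHERE clause on cg.CREATED using your database's own date and time functions.

```ruby
# Date range from the question. The upper bound is exclusive,
# so all of January 15, 2022 is included.
range_start = Time.new(2021, 2, 3)
range_end   = Time.new(2022, 1, 16)

# True when the time of day is 19:00 or later, or earlier than 08:00.
after_hours = ->(t) { t.hour >= 19 || t.hour < 8 }
in_range    = ->(t) { t >= range_start && t < range_end }

# Made-up sample values standing in for cg.CREATED.
created = [
  Time.new(2021, 3, 1, 20, 15), # evening, inside the date range
  Time.new(2021, 3, 2, 10, 0),  # normal working hours
  Time.new(2021, 3, 3, 7, 59),  # early morning, inside the date range
  Time.new(2022, 2, 1, 22, 0)   # evening, but after the date range
]

overtime = created.select { |t| in_range.call(t) && after_hours.call(t) }
puts overtime
```

Because the window crosses midnight, the hour check needs an OR rather than a single BETWEEN.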

Need help writing sql query in rails so I get ActiveRelation

I need an ActiveRecord relation that gives me the latest record for each region, city, and bed combination. I have the SQL query written as below, but I need to figure out whether there is a way to use a different approach so that it returns an ActiveRecord relation and not an array. Any suggestions?
Current query:
@current_ltm_market_stats = LtmStatsByBedCount.find_by_sql(" SELECT *
FROM ltm_stats_by_bed_counts lstats
WHERE lstats.city_id = '#{@city_id}'
#{@region_id_condition}
AND (lstats.beds,lstats.city_id,lstats.region_id,lstats.reporting_date)
IN (SELECT lstats.beds,
lstats.city_id,
lstats.region_id,
max(lstats.reporting_date)
FROM ltm_stats_by_bed_counts lstats
WHERE lstats.city_id = '#{@city_id}'
#{@region_id_condition}
GROUP BY city_id, region_id, beds)
ORDER BY lstats.year DESC,lstats.month DESC")
I had tried this before, which did result in a relation, but it runs really slowly and the result is not exactly the same. Are there any better Rails ways to do this?
@all_ltm_market_stats = LtmStatsByBedCount.where(city_id: @market.city_id, region_id: @market.region_id)
@current_ltm_market_stats = @latest_year_ltm_market_stats.where(month: @latest_year_ltm_market_stats.all_ltm_market_stats.select('Max(year)'))

The information in the question is incomplete, so I might have to update my answer when additional details are added, but here is an initial draft based on what is available:
@current_ltm_market_stats = LtmStatsByBedCount.
where(city_id: @city_id).
where(@region_id_condition).
where("(beds, city_id, region_id, reporting_date) IN (
SELECT lstats.beds,
lstats.city_id,
lstats.region_id,
max(lstats.reporting_date)
FROM ltm_stats_by_bed_counts lstats
WHERE lstats.city_id = '#{@city_id}'
#{@region_id_condition}
GROUP BY city_id, region_id, beds)").
order(year: :desc, month: :desc)
Note that you might have to adjust your @region_id_condition a bit for this to work.
In theory it is equivalent to your SQL version (it should generate the same SQL, apart from the table alias) and it returns an ActiveRecord relation, which is the only requirement in the question. The SQL could obviously be improved with additional information as well.
Additionally, you will want carefully chosen indexes on this table if you are going to run this query frequently on larger datasets.

Second foreign_key to speed up query

Say you are creating an IMDb-type site for TV shows. You have a Show with many attached episodes and a bunch of people.
Right now I link people to episodes through a contribution table, but if I want to make a list of all the shows they are on, I have to go through episodes.
Since this query takes a long time I was thinking about adding show_id to the contributions table. Is this common practice to increase performance or is there another way I haven't thought of?

"Since this query takes a long time"
Have you run a SQL explain plan to show why this is the case? What is the actual SQL query that is being run, and are you doing things like ordering or running subqueries within it?
If I understand your structure it is something like this:
|people| 1---n |contribution| n---1 |episodes| n---1 |shows|
A SQL select of this sort:
select distinct s.name
from shows s,
episodes e,
contribution c
where c.people_id = <id>
and c.episode_id = e.id
and e.show_id = s.id
should really not have performance issues unless there are no indexes on the tables or the tables are massive.

Here's a way to use where id in ( ... ) to select all shows a specific person appeared in. Since contributions has no show_id column, the subquery takes it from episodes (this assumes Contribution belongs_to :episode):
Show.where(id: Contribution.joins(:episode)
                           .where(person_id: person_id)
                           .select("episodes.show_id"))
You may also want to try EXISTS:
Show.where("EXISTS (SELECT 1 FROM contributions c
            JOIN episodes e ON e.id = c.episode_id
            WHERE c.person_id = ? AND e.show_id = shows.id)", person_id)

Order with DISTINCT ids in rails with postgres

I have the following code, which joins the microposts and activities tables on micropost_id and then orders by the created_at column of activities, keeping only distinct micropost ids.
Micropost.joins("INNER JOIN activities ON
(activities.micropost_id = microposts.id)").
where('activities.user_id= ?',id).order('activities.created_at DESC').
select("DISTINCT (microposts.id), *")
which should return all micropost columns. This is not working in my development environment:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
If I add activities.created_at to the SELECT DISTINCT, I get repeated micropost ids because each row has a distinct activities.created_at value. I have searched a lot to get this far, but the problem persists because of this Postgres rule, which exists to avoid an arbitrary selection.
I want to select distinct micropost_id values, ordered by activities.created_at.
Please help..

To start with, we need to quickly cover what SELECT DISTINCT is actually doing. It looks like just a nice keyword to make sure you only get back distinct values, which shouldn't change anything, right? Except as you're finding out, behind the scenes, SELECT DISTINCT is actually acting more like a GROUP BY. If you want to select distinct values of something, you can only order that result set by the same values you're selecting -- otherwise, Postgres doesn't know what to do.
To explain where the ambiguity comes from, consider this simple set of data for your activities:
CREATE TABLE activities (
id INTEGER PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE,
micropost_id INTEGER REFERENCES microposts(id)
);
INSERT INTO activities (id, created_at, micropost_id)
VALUES (1, current_timestamp, 1),
(2, current_timestamp - interval '3 hours', 1),
(3, current_timestamp - interval '2 hours', 2)
You stated in your question that you want "distinct micropost_id" "based on order of activities.created_at". It's easy to order these activities by descending created_at (1, 3, 2), but both 1 and 2 have the same micropost_id of 1. So if you want the query to return just micropost IDs, should it return 1, 2 or 2, 1?
If you can answer the above question, you need to take your logic for doing so and move it into your query. Let's say that, and I think this is pretty likely, you want this to be a list of microposts which were most recently acted on. In that case, you want to sort the microposts in descending order of their most recent activity. Postgres can do that for you, in a number of ways, but the easiest way in my mind is this:
SELECT micropost_id
FROM activities
JOIN microposts ON activities.micropost_id = microposts.id
GROUP BY micropost_id
ORDER BY MAX(activities.created_at) DESC
Note that I've dropped the SELECT DISTINCT bit in favor of GROUP BY, since Postgres lets you order a grouped result by an aggregate. The MAX(activities.created_at) bit tells Postgres to sort each group of activities sharing a micropost_id by only its most recent one.
You can translate the above to Rails like so:
Micropost.select('microposts.*')
.joins("JOIN activities ON activities.micropost_id = microposts.id")
.where('activities.user_id' => id)
.group('microposts.id')
.order('MAX(activities.created_at) DESC')
Hope this helps! You can play around with this sqlFiddle if you want to understand more about how the query works.
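If it helps to see the grouping logic outside of SQL, here is a small plain-Ruby sketch that mirrors GROUP BY micropost_id followed by ORDER BY MAX(created_at) DESC, using the three sample activities from above. The activity struct is only a stand-in for the table rows.

```ruby
# Stand-in for rows of the activities table.
activity = Struct.new(:id, :created_at, :micropost_id)
now = Time.now

activities = [
  activity.new(1, now,            1),
  activity.new(2, now - 3 * 3600, 1),
  activity.new(3, now - 2 * 3600, 2)
]

# GROUP BY micropost_id, then ORDER BY MAX(created_at) DESC.
ordered_ids = activities
  .group_by(&:micropost_id)
  .map { |micropost_id, group| [micropost_id, group.map(&:created_at).max] }
  .sort_by { |_micropost_id, latest| latest }
  .reverse
  .map(&:first)

p ordered_ids # micropost 1 first, because its latest activity is the newest
```

Each micropost appears once, and the ambiguity is gone because every group is reduced to a single timestamp before sorting.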

Try the code below:
Micropost.select('microposts.*, activities.created_at')
.joins("INNER JOIN activities ON (activities.micropost_id = microposts.id)")
.where('activities.user_id= ?',id)
.order('activities.created_at DESC')
.distinct # `uniq` on a relation was removed in Rails 5.1; `distinct` is the current name

Report using Rails ActiveRecord group by

I am trying to generate a report to screen of accounting transaction history. In most situations it is one display row per record in the AccountingTransaction table. But occasionally there are transactions that I wish to display to the end user as one transaction which are really, behind the scenes, two accounting transactions. This is caused by deferral of revenues and fund splitting since this app is a fund accounting app.
If I display all rows one by one, those double entries look odd to the user since the fund splitting and deferral is "behind the scenes". So I want to roll up all the related transactions into one display row on screen.
I have my query now using group by to group the related transactions
@history = AccountingTransaction.where("customer_id in (?) AND no_download <> 1", customers_in_account).group(:transaction_type_id, :reference_id).order(:created_at)
as I loop through I get the transactions grouped as I want but I am struggling with how to display the total sum of the 'credit' field for all records in the group. (It is only showing the credit for the first record of the group) If I add a .sum(:credit) to my query, of course, it returns the sums just as I want but not all the other data.
Is there a way for me to group these records like in my @history query and also get the sum of the credit field for each respective group?
* Addition *
What I really want is what the following SQL query would give me.
SELECT transaction_type_id, reference_id, sum(credit)
FROM accounting_transactions
WHERE customer_id in (21,22,23,24) AND no_download <> 1
GROUP BY reference_id, transaction_type_id ORDER BY created_at

I'm not sure you can do "ORDER BY created_at" and not include it in the select fields, but here is an example.
@history = AccountingTransaction.
select([:reference_id, :transaction_type_id, :created_at]).
select(AccountingTransaction.arel_table[:credit].sum.as("credit_sum")).
where("customer_id in (?) AND no_download <> 1", customers_in_account).
group(:transaction_type_id, :reference_id).
order(:created_at)
To access the credit_sum you could do:
@history[0].attributes["credit_sum"]
I guess if you'd like, you could create a method:
def credit_sum
  attributes["credit_sum"]
end
* EDIT *
As stated in the comments, you can access the attribute directly:
@history[0].credit_sum
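For reference, the roll-up that the grouped query performs can be sketched in plain Ruby as follows. The row struct and the sample amounts are invented for illustration; only the column names come from the question.

```ruby
# Stand-in for rows of the accounting transactions table (amounts are made up).
row = Struct.new(:transaction_type_id, :reference_id, :credit)

rows = [
  row.new(1, 100, 25),
  row.new(1, 100, 75), # second half of a split transaction
  row.new(2, 101, 40)
]

# Group on the same two columns as the query, then sum credit per group.
credit_sums = rows
  .group_by { |r| [r.transaction_type_id, r.reference_id] }
  .transform_values { |group| group.sum(&:credit) }

credit_sums.each do |(type_id, ref_id), total|
  puts "type #{type_id}, reference #{ref_id}: #{total}"
end
```

The two rows sharing transaction type 1 and reference 100 collapse into one display row with the combined credit, which is the behaviour the question asks for.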
