Join Issue - How to bring back only 1 outer join per row

I don't think my title explained it very well :)
I have a query that has quite a few joins on it.
SELECT
sof_slot_games.launch_date,
sof_slot_games.game_name,
sof_reviews.review_content,
sof_slot_games.slot_game_id,
sof_slot_game_details.no_of_reels,
sof_slot_game_details.paylines,
sof_reviews.reg_timestamp,
sof_developers.developer_name,
sof_slot_games.game_slug,
sof_slot_game_images.game_image
FROM
sof_reviews
Inner Join sof_slot_games ON sof_slot_games.slot_game_id = sof_reviews.slot_game_id
Inner Join sof_slot_game_details ON sof_slot_games.slot_game_id = sof_slot_game_details.slot_game_id
Inner Join sof_developers ON sof_slot_games.developer_id = sof_developers.developer_id
left outer Join sof_slot_game_images ON sof_reviews.slot_game_id = sof_slot_game_images.slot_game_id
WHERE
sof_slot_game_images.image_type_id = '3'
ORDER BY
sof_slot_games.launch_date DESC
limit 0,20
The problem is that I want to return just 1 game_image per row. The games themselves are basically the most recent 20 games by launch_date. But if I join with the game_image (type=3), then it will bring back multiple rows for the same game if that game has multiple images.
I want to really just pick the most recent 20 games and then pull back the 1st image for each. This is to go into an RSS Feed for the top 20 games which is why I want to do it this way (in case anyone is wondering) :)
I've been trying to figure this out... I know I have done it before, but my brain is not reminding me what I did :)
Thanks!
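One way to get a single image per game is a correlated scalar subquery instead of a join, sketched below in Python with an in-memory SQLite database so it can be run as-is. It is a simplified illustration, not the full query: it keeps only the games and images tables, the image_id column is a hypothetical key that the question does not show, and "first image" is assumed to mean "lowest image_id". Note also that filtering sof_slot_game_images.image_type_id in the WHERE clause of the original query discards games without a type-3 image, so the LEFT OUTER JOIN there behaves like an inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sof_slot_games (
    slot_game_id INTEGER PRIMARY KEY, game_name TEXT, launch_date TEXT);
CREATE TABLE sof_slot_game_images (
    image_id INTEGER PRIMARY KEY,  -- hypothetical key, not shown in the question
    slot_game_id INTEGER, image_type_id INTEGER, game_image TEXT);
INSERT INTO sof_slot_games VALUES (1, 'Alpha', '2012-01-01'), (2, 'Beta', '2012-02-01');
INSERT INTO sof_slot_game_images VALUES
    (10, 1, 3, 'alpha-a.png'), (11, 1, 3, 'alpha-b.png'), (12, 1, 1, 'alpha-icon.png');
""")

# The scalar subquery returns at most one image per game, so a game with
# several type-3 images still yields exactly one output row.
ONE_IMAGE_PER_GAME = """
SELECT g.game_name,
       (SELECT i.game_image
          FROM sof_slot_game_images AS i
         WHERE i.slot_game_id = g.slot_game_id
           AND i.image_type_id = 3
         ORDER BY i.image_id
         LIMIT 1) AS game_image
  FROM sof_slot_games AS g
 ORDER BY g.launch_date DESC
 LIMIT 20
"""

rows = conn.execute(ONE_IMAGE_PER_GAME).fetchall()
for row in rows:
    print(row)
```

Games without a type-3 image come back with a NULL game_image rather than disappearing, which is usually what a "top 20" feed wants.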

Related

Preventing duplicate values in multiple joins of the same table

I have two tables in my project that I need to join together in a somewhat complicated way, and it is giving me very strange issues.
I have a concept of teams and a concept of FeedItems. A FeedItem means the team has solved a challenge. I need to know when each team last solved a challenge, and I also need to calculate the sum of its point-based FeedItems.
SELECT COALESCE(sum(challenges.point_value), 0) + COALESCE(sum(point_feed_items.point_value), 0) as team_score,
GREATEST(MAX(pentest_feed_items.created_at), MAX(point_feed_items.created_at)) as last_solve_time, teams.* FROM "teams"
LEFT JOIN feed_items AS point_feed_items
ON point_feed_items.team_id = teams.id
AND point_feed_items.type IN ('StandardSolvedChallenge', 'ScoreAdjustment')
LEFT JOIN feed_items AS pentest_feed_items
ON pentest_feed_items.team_id = teams.id
AND pentest_feed_items.type IN ('PentestSolvedChallenge')
LEFT JOIN challenges ON challenges.id = point_feed_items.challenge_id
AND challenges.type IN ('StandardChallenge') WHERE "teams"."division_id" = $1
GROUP BY teams.id ORDER BY "teams"."created_at" ASC
This works nearly all the time. I am just running into an edge case where the same ScoreAdjustment is sometimes counted more than once in the point_feed_items.point_value sum. I called COUNT(point_feed_items.point_value) and verified that I somehow had 3 elements coming back even though there should only be 1. So far I have been unable to figure out either why the same element sometimes comes back multiple times, or how to apply DISTINCT as part of the LEFT JOIN to avoid the problem completely.
I did find that removing the 2nd LEFT JOIN did fix the issue, however I need the data from that LEFT JOIN.
To put the issue another way, I replaced COALESCE(sum(point_feed_items.point_value), 0) with COALESCE(COUNT(point_feed_items.point_value), 0) and verified with no ScoreAdjustments in the database that it returned 0. I then created one ScoreAdjustment with the correct team and COALESCE(COUNT(point_feed_items.point_value), 0) then returned 3 instead of 1. Am I misunderstanding how LEFT JOIN AS works?
This is part of a Rails app, however it is mostly written as a manual query for better performance.
Turns out this was all due to a misunderstanding of how LEFT JOIN works when you do it multiple times.
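I was thinking of each LEFT JOIN as attaching its rows to teams independently of the other joins.
In fact each LEFT JOIN is applied to the result of the joins before it, so every point_feed_items row is repeated once for every matching pentest_feed_items row. For example, a team with one ScoreAdjustment and three PentestSolvedChallenge items produces three joined rows, and the adjustment is summed three times.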
I went ahead and switched my query over to look as follows:
SELECT COALESCE(sum(point_feed_items.team_score), 0) as team_score,
GREATEST(MAX(pentest_feed_items.last_solve_time),
MAX(point_feed_items.last_solve_time)) as last_solve_time,
teams.*
FROM "teams"
LEFT JOIN LATERAL
(
SELECT
COALESCE(sum(challenges.point_value), 0) + COALESCE(sum(feed_items.point_value), 0) as team_score,
MAX(feed_items.created_at) as last_solve_time
FROM feed_items
LEFT JOIN challenges ON challenges.id = feed_items.challenge_id AND challenges.type IN ('StandardChallenge')
WHERE feed_items.team_id = teams.id
AND feed_items.type IN ('StandardSolvedChallenge', 'ScoreAdjustment')
) AS point_feed_items ON true
LEFT JOIN LATERAL
(
SELECT MAX(feed_items.created_at) as last_solve_time
FROM feed_items
WHERE feed_items.team_id = teams.id
AND feed_items.type IN ('PentestSolvedChallenge')
) AS pentest_feed_items ON true
WHERE "teams"."division_id" = $1 GROUP BY teams.id
And everything works fine now.
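The row multiplication described in this answer can be reproduced with a few rows. The sketch below uses Python with an in-memory SQLite database; the tables are cut down to the columns that matter and the sample data is made up. SQLite has no LATERAL, so the fix is shown with a pre-aggregated subquery instead, which relies on the same idea: aggregate each set of feed items on its own before joining it to teams.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (id INTEGER PRIMARY KEY);
CREATE TABLE feed_items (
    id INTEGER PRIMARY KEY, team_id INTEGER, type TEXT, point_value INTEGER);
INSERT INTO teams VALUES (1);
INSERT INTO feed_items VALUES
    (1, 1, 'ScoreAdjustment', 50),
    (2, 1, 'PentestSolvedChallenge', NULL),
    (3, 1, 'PentestSolvedChallenge', NULL),
    (4, 1, 'PentestSolvedChallenge', NULL);
""")

# Two LEFT JOINs against the same parent: the single adjustment row is
# repeated once per pentest row before the aggregates run.
FAN_OUT = """
SELECT COUNT(p.point_value), COALESCE(SUM(p.point_value), 0)
  FROM teams
  LEFT JOIN feed_items AS p
         ON p.team_id = teams.id AND p.type = 'ScoreAdjustment'
  LEFT JOIN feed_items AS t
         ON t.team_id = teams.id AND t.type = 'PentestSolvedChallenge'
 GROUP BY teams.id
"""

# Aggregating inside a subquery first means at most one row per team is
# joined, so the other join cannot multiply it.
PRE_AGGREGATED = """
SELECT COALESCE(p.n, 0), COALESCE(p.total, 0)
  FROM teams
  LEFT JOIN (SELECT team_id, COUNT(point_value) AS n, SUM(point_value) AS total
               FROM feed_items
              WHERE type = 'ScoreAdjustment'
              GROUP BY team_id) AS p
         ON p.team_id = teams.id
"""

fan_out = conn.execute(FAN_OUT).fetchone()
fixed = conn.execute(PRE_AGGREGATED).fetchone()
print(fan_out)  # (3, 150): one adjustment counted three times
print(fixed)    # (1, 50)
```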

How to show all records from multiple tables regardless of match on join statement

I am having trouble figuring out the proper syntax to structure this query correctly. I am trying to show ALL records from both the SalesHistoryDetail AND the SalesVsBudget table. I believe my query allows some of the records in SalesVsBudget to not be pulled, whereas I want them all for that period, regardless of whether there was a corresponding sale. Here is my code:
SELECT MAX(a.DispatchCenterOrderKey) AS DispatchCenter,
a.CustomerKey,
CASE WHEN a.CustomerKey IN
(SELECT AddressKey
FROM FinancialData.dbo.DimAddress
WHERE AddressKey >= 99000 AND AddressKey <= 99599) THEN 1 ELSE 0 END AS InterCompanyFlag,
MAX(a.Customer) AS Customer,
a.SalesmanID,
MAX(a.Salesman) AS Salesman,
a.SubCategoryKey,
MAX(a.SubCategoryDesc) AS Subcategory,
SUM(a.Value) AS SalesAmt,
b.FiscalYear AS Year,
b.FiscalWeekOfYear AS Week,
MAX(c.BudgetLbs) AS BudgetLbs,
MAX(c.BudgetDollars) AS BudgetDollars
FROM dbo.SalesHistoryDetail AS a
LEFT OUTER JOIN dbo.M_DateDim AS b ON a.InvoiceDate = b.Date
FULL OUTER JOIN dbo.SalesVsBudget AS c ON a.SalesmanID = c.SalesRepKey
AND a.CustomerKey = c.CustomerKey
AND a.SubCategoryKey = c.SubCategoryKey
AND b.FiscalYear = c.Year AND b.FiscalWeekOfYear = c.WeekNo
GROUP BY a.SalesmanID, a.CustomerKey, a.SubCategoryKey, b.FiscalYear, b.FiscalWeekOfYear
There are two different data sets that I am pulling from: the SalesHistoryDetail table and the SalesVsBudget table. I'm hoping to get ALL BudgetLbs and BudgetDollars values from the SalesVsBudget table regardless of whether they match in the join. I want all of the matching joined records too, but I also want EVERY record from SalesVsBudget. Essentially I want to show ALL sales records and reference the budget values from SalesVsBudget when the salesman, customer, subcategory, year and week match, but I also want to see budget entries that fall in my date range and don't have corresponding sales records in that period. Hopefully that makes sense. I feel I am very close, but my budget numbers don't reflect the whole story, and I think that is because some of my records are being excluded! Please help.
I was able to accomplish this by playing with the FULL OUTER JOIN. My problem was that there were more records in SalesVsBudget than in SalesHistory_V. Therefore I had to make SalesVsBudget the initial FROM table and bring in SalesHistory_V with a FULL OUTER JOIN, after which all records lined up.
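One possible reason budget rows looked missing in the original query: its SELECT list and GROUP BY use only a. and b. columns, which are all NULL for a budget row with no matching sale, so every such row collapses into a single NULL group. The sketch below (Python with an in-memory SQLite database, made-up two-column keys instead of the five in the question) totals sales per key first and then takes the key from whichever side has it. T-SQL supports FULL OUTER JOIN directly; it is emulated here with LEFT JOIN plus UNION ALL only because older SQLite versions lack it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (salesman_id INTEGER, week INTEGER, amount INTEGER);
CREATE TABLE budget (salesman_id INTEGER, week INTEGER, budget_dollars INTEGER);
INSERT INTO sales VALUES (1, 1, 100), (1, 1, 50), (2, 1, 70);
INSERT INTO budget VALUES (1, 1, 200), (3, 1, 300);
""")

# First branch: every sales total, with its budget when one exists.
# Second branch: budget rows that matched no sale, keyed by the budget side.
SALES_VS_BUDGET = """
WITH sales_totals AS (
    SELECT salesman_id, week, SUM(amount) AS sales_amt
      FROM sales
     GROUP BY salesman_id, week
)
SELECT s.salesman_id, s.week, s.sales_amt, b.budget_dollars
  FROM sales_totals AS s
  LEFT JOIN budget AS b
         ON b.salesman_id = s.salesman_id AND b.week = s.week
UNION ALL
SELECT b.salesman_id, b.week, NULL, b.budget_dollars
  FROM budget AS b
  LEFT JOIN sales_totals AS s
         ON s.salesman_id = b.salesman_id AND s.week = b.week
 WHERE s.salesman_id IS NULL
 ORDER BY 1, 2
"""

report = conn.execute(SALES_VS_BUDGET).fetchall()
for row in report:
    print(row)
```

Salesman 2 has sales but no budget, salesman 3 has a budget but no sales, and both still appear with their own keys.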

getting the count of another table values in existing query Rails

Currently I am getting data from my tables like this:
cameras = Camera.joins("left JOIN users on cameras.owner_id = users.id")
.joins("left JOIN vendor_models vm on cameras.model_id = vm.id")
.joins("left JOIN vendors v on vm.vendor_id = v.id")
.where(condition).order(sorting(col_for_order, order_for)).decorate
This generates the following query:
SELECT "cameras".* FROM "cameras" left JOIN users on cameras.owner_id = users.id left JOIN vendor_models vm on cameras.model_id = vm.id left JOIN vendors v on vm.vendor_id = v.id
I have another table, camera_shares, which is related to cameras: each row in camera_shares has a camera_id that tells us which camera the share is for. I want to calculate the count of camera_shares per camera. For example, if there is a camera with id 12 and there are 20 camera_shares with camera_id = 12, then I want to get 20 for that camera. How can I do that as part of the Rails query shown above?
Seems rather straightforward. Suppose you want to find the number of camera_shares for @camera:
CameraShare.where(camera: @camera).count
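If the count is needed for every camera in the same query rather than for one camera at a time, the usual SQL shape is a LEFT JOIN to camera_shares with COUNT and GROUP BY. The sketch below shows only that SQL, run through Python with an in-memory SQLite database; the name column and the sample rows are made up for illustration. In Rails the same shape could presumably be attached to the existing relation with joins, select and group, but that part is an untested assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cameras (id INTEGER PRIMARY KEY, name TEXT);  -- name is hypothetical
CREATE TABLE camera_shares (id INTEGER PRIMARY KEY, camera_id INTEGER);
INSERT INTO cameras VALUES (12, 'lobby'), (13, 'garage');
INSERT INTO camera_shares (camera_id) VALUES (12), (12), (12);
""")

# COUNT(cs.id) ignores the NULL produced by the LEFT JOIN, so a camera
# with no shares reports 0 instead of 1.
SHARE_COUNTS = """
SELECT cameras.id, cameras.name, COUNT(cs.id) AS shares_count
  FROM cameras
  LEFT JOIN camera_shares AS cs ON cs.camera_id = cameras.id
 GROUP BY cameras.id
 ORDER BY cameras.id
"""

counts = conn.execute(SHARE_COUNTS).fetchall()
for row in counts:
    print(row)
```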

Second foreign_key to speed up query

Say you are creating an IMDb-type site for TV shows. You have a Show with many attached episodes and a bunch of people.
Right now I link people to episodes through a contribution table, but if I want to make a list of all the shows they are on, I have to go through episodes.
Since this query takes a long time, I was thinking about adding show_id to the contributions table. Is this common practice to increase performance, or is there another way I haven't thought of?
"Since this query takes a long time"
Have you run a SQL explain plan to show why this is the case? What is the actual SQL query that is being run, and are you doing things like ordering or running subqueries within it?
If I understand your structure it is something like this:
|people| 1---n |contribution| n---1 |episodes| n---1 |shows|
A SQL select of this sort:
select distinct s.name
from shows s,
episodes e,
contribution c
where c.people_id = <id>
and c.episode_id = e.id
and e.show_id = s.id
should really not have performance issues unless there are no indexes on the tables or the tables are massive.
Here's a way using where id in ( ... ) to select all shows a specific person appeared in:
# Assumes Contribution belongs_to :episode; show_id lives on episodes, not contributions.
Show.where(id: Contribution.joins(:episode)
  .where(person_id: personId)
  .group("episodes.show_id")
  .select("episodes.show_id"))
You may also want to try EXISTS:
Show.where("EXISTS(SELECT 1 from contributions c
  join episodes e on e.id = c.episode_id
  where c.person_id = ? and e.show_id = shows.id)", personId)
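To compare the two approaches at the SQL level, the sketch below runs the EXISTS variant and the plain DISTINCT join side by side, using Python with an in-memory SQLite database. The schema follows the structure guessed at above, the sample rows are made up, and the index on contributions (person_id, episode_id) is a suggestion that the thread does not confirm exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shows (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE episodes (id INTEGER PRIMARY KEY, show_id INTEGER);
CREATE TABLE contributions (
    id INTEGER PRIMARY KEY, person_id INTEGER, episode_id INTEGER);
-- Hypothetical index: lets the lookup by person avoid a full table scan.
CREATE INDEX idx_contributions_person ON contributions (person_id, episode_id);
INSERT INTO shows VALUES (1, 'Show A'), (2, 'Show B');
INSERT INTO episodes VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO contributions (person_id, episode_id) VALUES (7, 1), (7, 2), (8, 3);
""")

# EXISTS stops at the first matching contribution per show, so no
# duplicates are produced even though person 7 worked on two episodes.
VIA_EXISTS = """
SELECT s.name FROM shows AS s
 WHERE EXISTS (SELECT 1 FROM contributions AS c
                 JOIN episodes AS e ON e.id = c.episode_id
                WHERE c.person_id = ? AND e.show_id = s.id)
 ORDER BY s.name
"""

# The join produces one row per episode and needs DISTINCT to collapse them.
VIA_DISTINCT = """
SELECT DISTINCT s.name FROM shows AS s
  JOIN episodes AS e ON e.show_id = s.id
  JOIN contributions AS c ON c.episode_id = e.id
 WHERE c.person_id = ?
 ORDER BY s.name
"""

via_exists = conn.execute(VIA_EXISTS, (7,)).fetchall()
via_distinct = conn.execute(VIA_DISTINCT, (7,)).fetchall()
print(via_exists, via_distinct)
```

Both return the same shows, so a denormalised show_id on contributions is only worth considering if the query is still slow with suitable indexes in place.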

Using distinct in a join

I'm still a novice at SQL and I need to run a report which JOINs 3 tables. The third table has duplicates of fields I need. So I tried to join with a distinct option, but that didn't work. Can anyone suggest the right code I could use?
My Code looks like this:
SELECT
C.CUSTOMER_CODE
, MS.SALESMAN_NAME
, SUM(C.REVENUE_AMT)
FROM C_REVENUE_ANALYSIS C
JOIN M_CUSTOMER MC ON C.CUSTOMER_CODE = MC.CUSTOMER_CODE
/* This following JOIN is the issue. */
JOIN M_SALESMAN MS ON MC.SALESMAN_CODE = (SELECT SALESMAN_CODE FROM M_SALESMAN WHERE COMP_CODE = '00')
WHERE REVENUE_DATE >= :from_date
AND REVENUE_DATE <= :to_date
GROUP BY C.CUSTOMER_CODE, MS.SALESMAN_NAME
I also tried a different variation to get a DISTINCT.
/* I also tried this variation to get a distinct */
JOIN M_SALESMAN MS ON MC.SALESMAN_CODE =
(SELECT distinct(SALESMAN_CODE) FROM M_SALESMAN)
Please can anyone help? I would truly appreciate it.
Thanks in advance.
select
c.customer_code,
ms.salesman_name,
SUM(c.revenue_amt)
FROM
c_revenue_analysis c,
m_customer mc,
m_salesman ms
where
c.customer_code = mc.customer_code
AND mc.salesman_code = ms.salesman_code
AND ms.comp_code = '00'
AND revenue_date BETWEEN :from_date AND :to_date
group by
c.customer_code, ms.salesman_name
The above returns one row per combination of customer code and salesman name, with the SUM of revenue amount, where the c.customer_code matches an mc.customer_code AND that same mc record matches an ms.salesman_code AND that ms record has a comp_code of '00' AND the revenue_date is between the from and to variables. Because the result is grouped by customer code and salesman name, a customer can only appear more than once if it maps to more than one salesman name.
To explain, if you're just doing a straight inner JOIN, you don't strictly need the JOIN keywords; the join conditions can go in the WHERE clause. I find the keywords tend to convolute things, and only use them for an "odd" join, like a LEFT/RIGHT join. I don't know your data model so the above MIGHT still return duplicates but, if so, let me know.
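To see why the comp_code filter matters, here is a small sketch in Python with an in-memory SQLite database. The data is made up and assumes the "duplicates" in M_SALESMAN are the same salesman_code stored once per comp_code, which the question does not confirm. Joining on salesman_code alone repeats each revenue row once per duplicate, inflating the SUM; adding the comp_code condition leaves one salesman row per customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE c_revenue_analysis (customer_code TEXT, revenue_amt INTEGER);
CREATE TABLE m_customer (customer_code TEXT, salesman_code TEXT);
CREATE TABLE m_salesman (salesman_code TEXT, comp_code TEXT, salesman_name TEXT);
INSERT INTO c_revenue_analysis VALUES ('C1', 100), ('C1', 50);
INSERT INTO m_customer VALUES ('C1', 'S1');
-- Assumed shape of the duplicates: one row per company code.
INSERT INTO m_salesman VALUES ('S1', '00', 'Ann'), ('S1', '01', 'Ann');
""")

BASE = """
SELECT c.customer_code, ms.salesman_name, SUM(c.revenue_amt)
  FROM c_revenue_analysis AS c
  JOIN m_customer AS mc ON mc.customer_code = c.customer_code
  JOIN m_salesman AS ms ON ms.salesman_code = mc.salesman_code {extra}
 GROUP BY c.customer_code, ms.salesman_name
"""

# Without the filter each revenue row matches both salesman rows.
duplicated = conn.execute(BASE.format(extra="")).fetchall()
# With the filter only the comp_code '00' row joins.
filtered = conn.execute(BASE.format(extra="AND ms.comp_code = '00'")).fetchall()

print(duplicated)  # [('C1', 'Ann', 300)]
print(filtered)    # [('C1', 'Ann', 150)]
```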
