Number of joins in select query

I have a business table with 50 foreign key columns that refer to other master data tables.
To fetch all the data, my query has to join all 50 reference tables, like this:
select ct.id , ct.name , ct.description , st.value , pr.value , sv.value , ....
from
core_table ct
left outer join domain_value st on ct.status_fk = st.id
left outer join domain_value pr on ct.priority_fk = pr.id
left outer join domain_value sv on ct.severity_fk = sv.id
.......
.......
So I need to make 50 left outer joins like this.
Is it right to do 50 left outer joins this way, or is there a more optimized way to achieve this?

Is too many Left Joins a code smell?
It's a perfectly legitimate solution for some designs.
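As a rough illustration of the pattern in the question, the sketch below joins the same lookup table several times under different aliases. It uses an in-memory SQLite database and only three of the foreign keys; the sample rows and the `fk_columns` mapping are assumptions made up for the example, and generating the join clauses from a mapping is just one way to keep such a query manageable, not something the original poster did.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE domain_value (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE core_table (
        id INTEGER PRIMARY KEY,
        name TEXT,
        status_fk INTEGER REFERENCES domain_value(id),
        priority_fk INTEGER REFERENCES domain_value(id),
        severity_fk INTEGER REFERENCES domain_value(id)
    );
    INSERT INTO domain_value VALUES (1, 'open'), (2, 'high'), (3, 'minor');
    INSERT INTO core_table VALUES (10, 'ticket A', 1, 2, 3);
    INSERT INTO core_table VALUES (11, 'ticket B', 1, NULL, NULL);
""")

# Hypothetical mapping: join alias -> foreign key column on core_table.
fk_columns = {"st": "status_fk", "pr": "priority_fk", "sv": "severity_fk"}

select_list = ", ".join(f"{alias}.value" for alias in fk_columns)
joins = "\n".join(
    f"LEFT OUTER JOIN domain_value {alias} ON ct.{col} = {alias}.id"
    for alias, col in fk_columns.items()
)
query = (
    f"SELECT ct.id, ct.name, {select_list}\n"
    f"FROM core_table ct\n{joins}\nORDER BY ct.id"
)

rows = conn.execute(query).fetchall()
print(rows)
```

Each join here is a lookup on the primary key of `domain_value`, so it can match at most one row and the result keeps exactly one row per `core_table` row, with NULL where a foreign key is empty.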

Related

How do you conduct an outer join in typeorm?

I am using typeorm in a project and want to conduct a left outer join. I went through their docs and found no mention of this at all, and I am not convinced that left join will do what I need it to do. Is there a way to conduct an outer join in typeorm?
Just to share, this is the query I am trying to convert to typeorm:
SELECT bsm."bId"
FROM (
SELECT bs."id" "statusId", bs.label label, bs_mapping."bId" "bId"
FROM bs_mapping
LEFT JOIN bs
ON bs.id = bs_mapping."binStatusId"
) bsm
LEFT OUTER JOIN tt_bs
ON tt_bs."statusId" = bsm."statusId"
WHERE ((bsm.label = 'inactive' OR bsm.label = 'block') AND
tt_bs."ttId" IS NULL)
GROUP BY bsm."bId"
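On the worry that a plain left join might not behave like a left outer join: in standard SQL the OUTER keyword is optional, so LEFT JOIN and LEFT OUTER JOIN name the same operation. The sketch below checks this on an in-memory SQLite database rather than through typeorm; the tables are a cut-down, hypothetical version of the ones in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bs (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE bs_mapping (bId INTEGER, binStatusId INTEGER);
    INSERT INTO bs VALUES (1, 'inactive');
    -- bId 101 points at a status that does not exist in bs.
    INSERT INTO bs_mapping VALUES (100, 1), (101, 2);
""")

with_outer = conn.execute("""
    SELECT m.bId, s.label
    FROM bs_mapping m
    LEFT OUTER JOIN bs s ON s.id = m.binStatusId
    ORDER BY m.bId
""").fetchall()

without_outer = conn.execute("""
    SELECT m.bId, s.label
    FROM bs_mapping m
    LEFT JOIN bs s ON s.id = m.binStatusId
    ORDER BY m.bId
""").fetchall()

print(with_outer)
print(without_outer)
```

Both queries keep the unmatched mapping row and fill its label with NULL.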

Preventing duplicate values in multiple joins of the same table

I have two tables in my project that I need to join together in a somewhat complicated way, and it is giving me very strange issues.
I have a concept of teams and a concept of FeedItems. A FeedItem means the team has solved a challenge. I need to know the last time each team solved a challenge, and I also need to calculate the sum of point-based FeedItems.
SELECT COALESCE(sum(challenges.point_value), 0) + COALESCE(sum(point_feed_items.point_value), 0) as team_score,
GREATEST(MAX(pentest_feed_items.created_at), MAX(point_feed_items.created_at)) as last_solve_time, teams.* FROM "teams"
LEFT JOIN feed_items AS point_feed_items
ON point_feed_items.team_id = teams.id
AND point_feed_items.type IN ('StandardSolvedChallenge', 'ScoreAdjustment')
LEFT JOIN feed_items AS pentest_feed_items
ON pentest_feed_items.team_id = teams.id
AND pentest_feed_items.type IN ('PentestSolvedChallenge')
LEFT JOIN challenges ON challenges.id = point_feed_items.challenge_id
AND challenges.type IN ('StandardChallenge') WHERE "teams"."division_id" = $1
GROUP BY teams.id ORDER BY "teams"."created_at" ASC
This works nearly all the time, I am just running into an edge case where I will sometimes end up with the same ScoreAdjustment in the point_feed_items.point_value sum. I called COUNT(point_feed_items.point_value) and verified that I somehow had 3 elements coming back even though there should only be 1. I have so far been unable to figure out either why the same element is sometimes coming back multiple times, or how to call DISTINCT as part of the LEFT JOIN to avoid the problem completely.
I did find that removing the 2nd LEFT JOIN did fix the issue, however I need the data from that LEFT JOIN.
To put the issue another way, I replaced COALESCE(sum(point_feed_items.point_value), 0) with COALESCE(COUNT(point_feed_items.point_value), 0) and verified with no ScoreAdjustments in the database that it returned 0. I then created one ScoreAdjustment with the correct team and COALESCE(COUNT(point_feed_items.point_value), 0) then returned 3 instead of 1. Am I misunderstanding how LEFT JOIN AS works?
This is part of a Rails app; however, it is mostly written as a manual query for better performance.
Turns out this was all due to a misunderstanding of how LEFT JOIN works when you do it multiple times.
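I was thinking of each LEFT JOIN as attaching its rows to the team independently of the other joins.
In reality, each additional LEFT JOIN is applied to the result of the previous ones, so every matching point_feed_items row is repeated once for each matching pentest_feed_items row.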
I went ahead and switched my query over to look as follows:
SELECT COALESCE(sum(point_feed_items.team_score), 0) as team_score,
GREATEST(MAX(pentest_feed_items.last_solve_time),
MAX(point_feed_items.last_solve_time)) as last_solve_time,
teams.*
FROM "teams"
LEFT JOIN LATERAL
(
SELECT
COALESCE(sum(challenges.point_value), 0) + COALESCE(sum(feed_items.point_value), 0) as team_score,
MAX(feed_items.created_at) as last_solve_time
FROM feed_items
LEFT JOIN challenges ON challenges.id = feed_items.challenge_id AND challenges.type IN ('StandardChallenge')
WHERE feed_items.team_id = teams.id
AND feed_items.type IN ('StandardSolvedChallenge', 'ScoreAdjustment')
) AS point_feed_items ON true
LEFT JOIN LATERAL
(
SELECT MAX(feed_items.created_at) as last_solve_time
FROM feed_items
WHERE feed_items.team_id = teams.id
AND feed_items.type IN ('PentestSolvedChallenge')
) AS pentest_feed_items ON true
WHERE "teams"."division_id" = $1 GROUP BY teams.id
And everything works fine now.
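The row multiplication described above can be reproduced with a small sketch. It uses an in-memory SQLite database with made-up data: one team, one ScoreAdjustment worth 50 points, and three PentestSolvedChallenge items. SQLite has no LATERAL, so the second query swaps in pre-aggregated derived tables, which is a different technique from the LATERAL rewrite above but rests on the same idea of aggregating each kind of feed item before joining.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY);
    CREATE TABLE feed_items (
        id INTEGER PRIMARY KEY,
        team_id INTEGER,
        type TEXT,
        point_value INTEGER
    );
    INSERT INTO teams VALUES (1);
    INSERT INTO feed_items VALUES
        (1, 1, 'ScoreAdjustment', 50),
        (2, 1, 'PentestSolvedChallenge', NULL),
        (3, 1, 'PentestSolvedChallenge', NULL),
        (4, 1, 'PentestSolvedChallenge', NULL);
""")

# Two LEFT JOINs on the same parent: the single ScoreAdjustment row is
# paired with each of the three pentest rows before aggregation.
fan_out = conn.execute("""
    SELECT COUNT(p.point_value), COALESCE(SUM(p.point_value), 0)
    FROM teams t
    LEFT JOIN feed_items p
        ON p.team_id = t.id AND p.type = 'ScoreAdjustment'
    LEFT JOIN feed_items q
        ON q.team_id = t.id AND q.type = 'PentestSolvedChallenge'
    GROUP BY t.id
""").fetchone()

# Aggregate each branch first, so each join adds at most one row per team.
pre_aggregated = conn.execute("""
    SELECT COALESCE(p.cnt, 0), COALESCE(p.total, 0)
    FROM teams t
    LEFT JOIN (
        SELECT team_id, COUNT(point_value) AS cnt, SUM(point_value) AS total
        FROM feed_items
        WHERE type = 'ScoreAdjustment'
        GROUP BY team_id
    ) p ON p.team_id = t.id
    LEFT JOIN (
        SELECT team_id, COUNT(*) AS cnt
        FROM feed_items
        WHERE type = 'PentestSolvedChallenge'
        GROUP BY team_id
    ) q ON q.team_id = t.id
""").fetchone()

print(fan_out)         # count and sum inflated by the pentest rows
print(pre_aggregated)  # count and sum of the single adjustment
```

With this data the first query counts the adjustment three times, matching the count of 3 reported in the question.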

SQL Join issue with 3 tables and a subquery

Tables Involved:
account, user, service, accesshist
I want to include all records in the account table, and only the data from the other tables when it exists.
Count from account: 5064
rows returned from query below: 4915
select u.last_name, u.first_name, a.username, ll.mxlogin, si.servicename, a.islockedout
from account a
join service si on a.serviceid = si.serviceid
left outer join user u on u.loginid = a.username
left outer join(select max(loginattemptdate) as MxLogin, usernameattempted from accesshist where isloginsuccessful = 1
group by usernameattempted) ll
on a.username = ll.usernameattempted
where a.isenabled = 1
order by ll.mxlogin, u.last_name
I've narrowed it down that the subquery join is the part causing the number of rows to be reduced, but I am unsure how to correct it. Any insight is greatly appreciated!
Have you tried changing the first join to a left outer join?
select u.last_name, u.first_name, a.username, ll.mxlogin, si.servicename, a.islockedout
from account a
left outer join service si on a.serviceid = si.serviceid
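To show why the suggestion targets the first join, here is a minimal sketch on an in-memory SQLite database. The three accounts are invented for the example: one has a matching service, one has no serviceid, and one points at a serviceid that does not exist. Whether the real data contains such rows is an assumption that would need checking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE service (serviceid INTEGER PRIMARY KEY, servicename TEXT);
    CREATE TABLE account (username TEXT, serviceid INTEGER, isenabled INTEGER);
    INSERT INTO service VALUES (1, 'mail');
    INSERT INTO account VALUES
        ('alice', 1, 1),     -- matching service
        ('bob', NULL, 1),    -- no service assigned
        ('carol', 99, 1);    -- service id with no row in service
""")

# Inner join: accounts without a matching service row are dropped.
inner_count = conn.execute("""
    SELECT COUNT(*)
    FROM account a
    JOIN service si ON a.serviceid = si.serviceid
    WHERE a.isenabled = 1
""").fetchone()[0]

# Left outer join: every enabled account is kept.
left_count = conn.execute("""
    SELECT COUNT(*)
    FROM account a
    LEFT OUTER JOIN service si ON a.serviceid = si.serviceid
    WHERE a.isenabled = 1
""").fetchone()[0]

print(inner_count, left_count)
```

A left outer join on its own never removes rows from the left side, so a shrinking row count usually comes from an inner join or a WHERE filter. Here `where a.isenabled = 1` also removes disabled accounts, which is worth comparing against the unfiltered count of 5064.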

Solr5.4 indexing using DataImportHandler(joins)

I am using Solr 5.4 and indexing entities with the Data Import Handler. The entity queries use inner joins and left joins, as shown below.
Does Solr 5.4 support inner joins and left joins?
I have a query that relates 7 tables, using inner joins for some tables and left joins for others.
Below is my query, as declared in the data-config.xml file.
SELECT rel.*,
PRODUCT_SRC_SYS.PRODUCT_SRC_NME AS sourceName
FROM (SELECT REF.*,
PRODUCT_RLSP.OBJ_ID AS relObjId
FROM (SELECT prdct.*,
PRODUCT_CATEGORY.OBJ_ID AS refObjId,
PRODUCT_CATEGORY_MAPG.category_mapng_Name
AS refCategoryMapngName,
PRODUCT_CATEGORY_LIFE_CYL.LIFE_CYL_STAT_TYP
AS refStatus
FROM (SELECT PRODUCT.RGSTRY_ID
AS id,
PRODUCT.OBJ_ID
AS obj_id,
PRODUCT.PRODUCT_NAME
As product_name,
PRODUCT_STAT.PRODUCT_STAT_TYP
AS status
FROM PRODUCT
LEFT JOIN PRODUCT_STAT
ON PRODUCT.OBJ_ID =
PRODUCT_STAT.OBJ_ID)
prdct
LEFT JOIN PRODUCT_CATEGORY
ON prdct.product_name =
PRODUCT_CATEGORY.category_name
INNER JOIN PRODUCT_CATEGORY_MAPG
ON PRODUCT_CATEGORY.OBJ_ID =
PRODUCT_CATEGORY_MAPG.OBJ_ID
INNER JOIN PRODUCT_CATEGORY_LIFE_CYL
ON PRODUCT_CATEGORY.OBJ_ID =
PRODUCT_CATEGORY_LIFE_CYL.OBJ_ID)
REF
LEFT JOIN PRODUCT_RLSP
ON REF.obj_id =
PRODUCT_RLSP.OBJ_ID) rel
LEFT JOIN PRODUCT_SRC_SYS
ON rel.obj_id = PRODUCT_SRC_SYS.OBJ_ID
Any help is highly appreciated!!

Get incremental changes between Hive partitions

I have a nightly job that computes some data in Hive. The output is partitioned by day.
Fields:
id bigint
rank bigint
Yesterday
output/dt=2013-10-31
Today
output/dt=2013-11-01
I am trying to figure out if there is an easy way to get the incremental changes between today and yesterday.
I was thinking about doing a left outer join, but I am not sure what that looks like since it's the same table.
This is what it might look like with two different tables:
SELECT * FROM a LEFT OUTER JOIN b
ON (a.id=b.id AND a.dt='2013-11-01' and b.dt='2013-10-31' ) WHERE a.rank!=b.rank
But on the same table it is
SELECT * FROM a LEFT OUTER JOIN a
ON (a.id=a.id AND a.dt='2013-11-01' and a.dt='2013-10-31' ) WHERE a.rank!=a.rank
Suggestions?
This would work
SELECT a.*
FROM A a LEFT OUTER JOIN A b ON a.id = b.id
WHERE a.dt='2013-11-01' AND b.dt='2013-10-31' AND <your-rank-conditions>;
In terms of efficiency, this would need only 1 MapReduce job.
So I figured it out... Using Subqueries and Joins
select * from (select * from table where dt='2013-11-01') a
FULL OUTER JOIN
(select * from table where dt='2013-10-31') b
on (a.id=b.id)
where a.rank!=b.rank or a.rank is null or b.rank is null
The above will give you the diff, where a is today's partition and b is yesterday's.
You can take the diff and figure out what you need to ADD/UPDATE/REMOVE:
UPDATE if a.rank is not null and b.rank is not null, i.e. the rank changed
DELETE if a.rank is null and b.rank is not null, i.e. the user is no longer ranked
ADD if a.rank is not null and b.rank is null, i.e. this is a new user
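The classification above can be tried out on a small data set. The sketch below uses an in-memory SQLite database instead of Hive, and because older SQLite builds lack FULL OUTER JOIN it emulates one with a LEFT JOIN plus a UNION ALL of the rows that exist only in yesterday's partition. The table name `daily_output` and the sample rows are assumptions for the example, and the query tests `id IS NULL` rather than `rank IS NULL` to detect a missing side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_output (id INTEGER, rank INTEGER, dt TEXT);
    INSERT INTO daily_output VALUES
        (1, 10, '2013-10-31'), (2, 20, '2013-10-31'), (3, 30, '2013-10-31'),
        (1, 10, '2013-11-01'), (2, 25, '2013-11-01'), (4, 40, '2013-11-01');
""")

diff = conn.execute("""
    WITH a AS (SELECT id, rank FROM daily_output WHERE dt = '2013-11-01'),
         b AS (SELECT id, rank FROM daily_output WHERE dt = '2013-10-31')
    -- Rows present today: new ids are ADD, changed ranks are UPDATE.
    SELECT a.id,
           CASE WHEN b.id IS NULL THEN 'ADD' ELSE 'UPDATE' END AS action
    FROM a LEFT JOIN b ON a.id = b.id
    WHERE b.id IS NULL OR a.rank != b.rank
    UNION ALL
    -- Rows present only yesterday are DELETE.
    SELECT b.id, 'DELETE'
    FROM b LEFT JOIN a ON a.id = b.id
    WHERE a.id IS NULL
    ORDER BY 1
""").fetchall()

print(diff)
```

Id 1 has the same rank on both days, so it does not appear in the diff at all.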
