Issues with joining multiple tables - join

I'm really struggling at the moment trying to work out how to join multiple tables without duplicating data.
At the moment I have 8 tables that I was wanted to get various information from per member of staff like the below:
SDQ score, Goal scores, CHI score, number of appointments, number of dna appointments
The tables and field I can see to join are as follows
tblSDQ - Assessed_By_Staff_ID
tblGoals - Recorded_By_Staff_ID
tblCHI - Recorded_By_Staff_ID
tblReferral - Staff_ID
tblStaff - Staff_ID
tblDiaryAppointment - needs to connect to tblDiaryAppointmentClinician using Clinician_Invitee_Staff_ID
I hope someone can help or advice. I just don't know if it's even possible to join all these tables using the same field, or if its possible to join them but then return a number of entries but then just count others?

Syntax depends on a rdbms you are using.
You could use join with specified join fields from both tables:
select bla-bla
from table1
join table2 on ( table1.fileld_name1 = table2.fileld_name2 )
https://dev.mysql.com/doc/refman/5.0/en/join.html
if you need outer join (to show nulls for optional tables data) you could use this:
join table2 on ( table1.fileld_name1 = table2.fileld_name2 or table2.field_name2 is null )
to join with couns you could use subqueries like this
join ( select field_name3, coint(*) as cnt from table3 goup by field_name3 ) AS table3_counts
...
where ( table3_counts.field_name3 = ... or table3_counts.field_name3 is null )
https://dev.mysql.com/doc/refman/5.0/en/from-clause-subqueries.html
PS: Joins are often slow. It's better to denormalize tables to eliminate joins and gain performance. Or do simple selects and join in backend code.

Related

Conditional join of two tables in Knime

I am new to Knime Analytics.
I have two tables and I need to join them not by equality, but by the difference between the values of two fields. (In sql it would look like "table1 join table2 on abs(table1.mass - table2.mass)<0.005 ") but in nodes I have found only node, that join by equality.
Are there any nodes for conditional joining tables or something like this?
The only way I can think of to do this is as follows. Use a Cross Joiner node to join all rows of each table to all rows of the second table. Now use a Java Snippet Row Filter on the joined table, with the following snippet code
return Math.abs($mass$.doubleValue() - $mass (#1)$.doubleValue()) < 0.005;
(Assuming that both incoming tables have a column called 'mass', which will become 'mass' and 'mass (#1)' after the Cross Joiner

Rails: Collect records whose join tables appear in two queries

There are three models that matter here: Objective, Student, and Seminar. All are associated with has_and_belongs_to_many.
There is an ObjectiveStudent join model that includes columns "ready" and "points_all_time". There is an ObjectiveSeminar join model that includes column "priority".
I need to collect all of the objectives that are associated with a given student and also with a given seminar.
They need to also be marked with a "priority" above zero in the seminar. So I think I need this line:
obj_sems = ObjectiveSeminar.where(:seminar => given_seminar).where("priority > ?", 0)
Finally, they need to also be objectives where the student is ready, but has not scored above 7. So I think I need this line:
obj_studs = ObjectiveStudent.where(:user => given_student, :ready => true).where("points_all_time <= ?", 7)
Is there a way to gather all the objectives whose join table records appear in both of the above queries? Note that neither of the lists return objectives; they return objective_seminars, and objective_students, respectively. My end goal is to collect the objectives that meet all of the above criteria.
Or am I approaching this all wrong?
Bonus question: I would also love to sort the objectives by their priority in the given seminar. But I'm afraid that would add too much to the database load. What are your thoughts on this?
Thank you in advance for any insight.
In order to get Objectives you'll need to start your query from that.
In order to query with an AND condition the associated tables, you'll need inner joins with these tables.
Finally you'll need a distinct operator to only fetch each objective once.
The extended version of what (I think) you need is:
Objective.joins(objective_seminars: :seminar, objective_student: :student).
where(seminars: seminar_search_params, strudents: student_search_params).
where('objective_seminars.priority > 0').
where('objective_students.ready = 1 AND points_all_time <= 7').
order('objective_seminars.priority ASC').
distinct
Now for the database load it all depends on your indexes and the size of your tables.
The above query will translate to the following SQL (or something similar).
SELECT DISTINCT objectives.* FROM objectives
INNER JOIN objective_students ON objective_students.objective_id = objectives.id
INNER JOIN students ON students.id = objective_students.student_id
INNER JOIN objective_seminars ON objective_seminars.objective_id = objectives.id
INNER JOIN seminars ON seminars.id = objective_seminars.seminar_id
WHERE seminars_query AND
students_query AND
objective_seminars.priority > 0 AND
objective_students.ready = 1 AND points_all_time <= 7 AND
objective_seminars.priority ASC
So you'll need to add or extend your indexes so that all 5 tables queries can have an index helping out. The actual index implementation is up to you and depends on your application's specific (read - write load, tables size, cardinality etc)

Joining two tables based on matching two columns

I'm trying to join two tables:
Table A has three columns: State, County, and Count (of Farmer's Markets in said county)
Table B has several columns: State, County, and several data columns (like food access score)
I'm trying to combine them in such a way as to put the Count for each State/County combination (since there are multiple counties with the same name) together with the State and County and data columns from Table B.
I've been banging my head on SAS, trying to get a join to cooperate. I read a few other questions on here, but I can't find where the mistake is in my code.
PROC SQL;
CREATE TABLE WORK.QUERY1
AS
SELECT FMDV4.State, FMDV4.County, FMDV4.Count, CFSDV1.GROC14,
CFSDV1.SUPERC14, CFSDV1.CONVS14, CFSDV1.SPECS14, CFSDV1.FOODINSEC_13_15,
CFSDV1.PCT_LACCESS_POP15, CFSDV1.DIRSALES_FARMS12, CFSDV1.FMRKT16,
CFSDV1.FOODHUB16, CFSDV1.CSA12, CFSDV1.POVRATE15, CFSDV1.PERPOV10
FROM FNLPRJT.CFSDV1 AS CFSDV1
INNER JOIN FNLPRJT.FMDV4 AS FMDV4
ON (( CFSDV1.State = FMDV4.State ) AND ( CFSDV1.County =
FMDV4.County ));
QUIT;
I also tried a few variants, like:
PROC SQL;
CREATE TABLE WORK.QUERY1
AS
SELECT FMDV4.State, FMDV4.County, FMDV4.Count, CFSDV1.GROC14,
CFSDV1.SUPERC14, CFSDV1.CONVS14, CFSDV1.SPECS14, CFSDV1.FOODINSEC_13_15,
CFSDV1.PCT_LACCESS_POP15, CFSDV1.DIRSALES_FARMS12, CFSDV1.FMRKT16,
CFSDV1.FOODHUB16, CFSDV1.CSA12, CFSDV1.POVRATE15, CFSDV1.PERPOV10
FROM FNLPRJT.CFSDV1 AS CFSDV1
INNER JOIN FNLPRJT.FMDV4 AS FMDV4
ON CFSDV1.State = FMDV4.State
WHERE CFSDV1.County = FMDV4.County;
QUIT;
I get a table of 0 rows with the columns as they should be (State, County, Count, ). I'm just missing the dang data! Can anyone please help me find my mistake?
Can you try
propcase(CFSDV1.State) = propcase(FMDV4.State)
and
propcase(CFSDV1.County) = propcase(FMDV4.County);
If this doesn't work try character functions like trim and compress to remove any blanks that might be present in the data.

Unusual Joins SQL

I am having to convert code written by a former employee to work in a new database. In doing so I came across some joins I have never seen and do not fully understand how they work or if there is a need for them to be done in this fashion.
The joins look like this:
From Table A
Join(Table B
Join Table C
on B.Field1 = C.Field1)
On A.Field1 = B.Field1
Does this code function differently from something like this:
From Table A
Join Table B
On A.Field1 = B.Field1
Join Table C
On B.Field1 = C.Field1
If there is a difference please explain the purpose of the first set of code.
All of this is done in SQL Server 2012. Thanks in advance for any help you can provide.
I could create a temp table and then join that. But why use up the cycles\RAM on additional storage and indexes if I can just do it on the fly?
I ran across this scenario today in SSRS - a user wanted to see all the Individuals granted access through an AD group. The user was using a cursor and some temp tables to get the users out of AD and then joining the user to each SSRS object (Folders, reports, linked reports) associated with the AD group. I simplified the whole thing with Cross Apply and a sub query.
GroupMembers table
GroupName
UserID
UserName
AccountType
AccountTypeDesc
SSRSOjbects_Permissions table
Path
PathType
RoleName
RoleDesc
Name (AD group name)
The query needs to return each individual in an AD group associated with each report. Basically a Cartesian product of users to reports within a subset of data. The easiest way to do this looks like this:
select
G.GroupName, G.UserID, G.Name, G.AccountType, G.AccountTypeDesc,
[Path], PathType, RoleName, RoleDesc
from
GroupMembers G
cross apply
(select
[Path], PathType, RoleName, RoleDesc
from
SSRSOjbects_Permissions
where
Name = G.GroupName) S;
You could achieve this with a temp table and some outer joins, but why waste system resources?
I saw this kind of joins - it's MS Access style for handling multi-table joins. In MS Access you need to nest each subsequent join statement into its level brackets. So, for example this T-SQL join:
SELECT a.columna, b.columnb, c.columnc
FROM tablea AS a
LEFT JOIN tableb AS b ON a.id = b.id
LEFT JOIN tablec AS c ON a.id = c.id
you should convert to this:
SELECT a.columna, b.columnb, c.columnc
FROM ((tablea AS a) LEFT JOIN tableb AS b ON a.id = b.id) LEFT JOIN tablec AS c ON a.id = c.id
So, yes, I believe you are right in your assumption

Using distinct in a join

I'm still a novice at SQL and I need to run a report which JOINs 3 tables. The third table has duplicates of fields I need. So I tried to join with a distinct option but hat didn't work. Can anyone suggest the right code I could use?
My Code looks like this:
SELECT
C.CUSTOMER_CODE
, MS.SALESMAN_NAME
, SUM(C.REVENUE_AMT)
FROM C_REVENUE_ANALYSIS C
JOIN M_CUSTOMER MC ON C.CUSTOMER_CODE = MC.CUSTOMER_CODE
/* This following JOIN is the issue. */
JOIN M_SALESMAN MS ON MC.SALESMAN_CODE = (SELECT SALESMAN_CODE FROM M_SALESMAN WHERE COMP_CODE = '00')
WHERE REVENUE_DATE >= :from_date
AND REVENUE_DATE <= :to_date
GROUP BY C.CUSTOMER_CODE, MS.SALESMAN_NAME
I also tried a different variation to get a DISTINCT.
/* I also tried this variation to get a distinct */
JOIN M_SALESMAN MS ON MC.SALESMAN_CODE =
(SELECT distinct(SALESMAN_CODE) FROM M_SALESMAN)
Please can anyone help? I would truly appreciate it.
Thanks in advance.
select distinct
c.customer_code,
ms.salesman_code,
SUM(c.revenue_amt)
FROM
c_revenue c,
m_customer mc,
m_salesman ms
where
c.customer_code = mc.customer_code
AND mc.salesman_code = ms.salesman_code
AND ms.comp_code = '00'
AND Revenue_Date BETWEEN (from_date AND to_date)
group by
c.customer_code, ms.salesman_name
The above will return you any distinct combination of Customer Code, Salesman Code and SUM of Revenue Amount where the c.CustomerCode matches an mc.customer_code AND that same mc record matches an ms.salesman_code AND that ms record has a comp_code of '00' AND the Revenue_Date is between the from and to variables. Then, the whole result will be grouped by customer code and salesman name; the only thing that will cause duplicates to appear is if the SUM(revenue) is somehow different.
To explain, if you're just doing a straight JOIN, you don't need the JOIN keywords. I find it tends to convolute things; you only need them if you're doing an "odd" join, like an LEFT/RIGHT join. I don't know your data model so the above MIGHT still return duplicates but, if so, let me know.

Resources