I have five tables: users, interests, animals, interests_animals, interests_users.
User
foo
Interest 1, 2, 3
Animal 1, 2, 3
foo has interests 1, 2, 3
Interest 1 has Animal 1, 2
Interest 2 has Animal 1, 2, 3
Interest 3 has Animal 3
I need to return all animals through interests, grouped by the interest id of foo and ordered by animal count.
I am trying this:
SELECT animals.* FROM animals
INNER JOIN interests_animals ON animals.id = interests_animals.animal_id
INNER JOIN interests ON interests_animals.interest_id = interests.id
INNER JOIN interests_users ON interests.id = interests_users.interest_id
WHERE interests_users.user_id = XXX
GROUP BY animals.id
ORDER BY COUNT(interests_animals.animal_id);
I need the results returned in the order 2, 1, 3, but the query always returns 1, 2, 3.
You need to explicitly list in the SELECT clause the column(s) you GROUP BY.
All other parts of the SELECT clause must be aggregates like count(), sum(), etc.
Note that we use count(distinct ...) here because each animal ID might appear multiple times due to the chain of JOINs:
SELECT
  interests.id,
  COUNT(DISTINCT animals.id) AS animals_count
FROM animals
JOIN interests_animals ON animals.id = interests_animals.animal_id
JOIN interests ON interests_animals.interest_id = interests.id
JOIN interests_users ON interests.id = interests_users.interest_id
WHERE interests_users.user_id = XXX
GROUP BY 1
ORDER BY 2 DESC;
-- in GROUP BY and ORDER BY, it is usually convenient to use just numbers -- "1" means "the 1st column of SELECT clause", etc.
Also, "INNER" is an optional keyword (simply "JOIN" and "INNER JOIN" are the same thing).
Also, as a side note, you might find it useful to add this to your SELECT clause (array_agg is a PostgreSQL aggregate):
, array_agg(animals.id order by animals.id) as animal_ids
-- this will give you an integer array of all animal IDs related to a particular interest, in order.
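The grouped count described above can be checked against the sample data from the question. This is a minimal sketch using Python's built-in sqlite3 module; the table layouts and the user id 1 for foo are assumptions, since the question does not show the schema, and array_agg is left out because SQLite does not provide it.

```python
import sqlite3

# In-memory database; column names are assumed from the JOIN conditions above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE interests (id INTEGER PRIMARY KEY);
CREATE TABLE animals (id INTEGER PRIMARY KEY);
CREATE TABLE interests_users (interest_id INTEGER, user_id INTEGER);
CREATE TABLE interests_animals (interest_id INTEGER, animal_id INTEGER);
INSERT INTO users VALUES (1, 'foo');
INSERT INTO interests VALUES (1), (2), (3);
INSERT INTO animals VALUES (1), (2), (3);
INSERT INTO interests_users VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO interests_animals VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 3);
""")

# One row per interest of the user, ordered by how many animals it has.
rows = conn.execute("""
    SELECT interests.id, COUNT(DISTINCT animals.id) AS animals_count
    FROM animals
    JOIN interests_animals ON animals.id = interests_animals.animal_id
    JOIN interests ON interests_animals.interest_id = interests.id
    JOIN interests_users ON interests.id = interests_users.interest_id
    WHERE interests_users.user_id = ?
    GROUP BY interests.id
    ORDER BY animals_count DESC
""", (1,)).fetchall()

print(rows)  # [(2, 3), (1, 2), (3, 1)]
```

Interest 2 comes first with three animals, then interest 1, then interest 3, which is the 2, 1, 3 order the question asks for.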
Issue:
You are given three tables: Students, Friends and Packages.
Students contains two columns: ID and Name.
Friends contains two columns: ID and Friend_ID (ID of the ONLY best friend).
Packages contains two columns: ID and Salary (offered salary in $ thousands per month).
Write a query to output the names of those students whose best friends got offered a higher salary than them. Names must be ordered by the salary amount offered to the best friends. It is guaranteed that no two students got same salary offer.
Code:
This is the code that I have come up with but it does not produce correct results. Can anyone let me know why?
select TableA.name
from
(select s.id,s.name,p.salary from students s inner join packages p on s.id=p.id) TableA,
(select f.id,f.friend_id, p2.salary from friends f inner join packages p2 on f.friend_id=p2.id) TableB
where TableA.id=TableB.id And TableA.salary>TableB.salary
order by TableB.salary desc;
I think in your query you wrote AND TableA.salary > TableB.salary instead of AND TableA.salary < TableB.salary.
I also think your query can be written more concisely.
On MS SQL Server (it works on MySQL too, as the query is very basic), you can try this one:
SELECT s.id,
       s.NAME,
       p.salary,
       f.friend_id,
       p2.salary AS friend_salary
FROM students s
INNER JOIN packages p ON s.id = p.id
LEFT JOIN friends f ON f.id = s.id
LEFT JOIN packages p2 ON f.friend_id = p2.id
WHERE p.salary < p2.salary
ORDER BY s.id;
Output:
id  NAME  salary  friend_id  friend_salary
1   John  1000    2          1200
3   Pete  800     1          1000
Sample data:
CREATE TABLE students (id int, NAME VARCHAR(30));
CREATE TABLE packages (id int, salary INT);
CREATE TABLE friends (id int, friend_id INT);
INSERT INTO students values (1,'John');
INSERT INTO students values (2,'Arthur');
INSERT INTO students values (3,'Pete');
INSERT INTO packages values (1,1000);
INSERT INTO packages values (2,1200);
INSERT INTO packages values (3,800);
INSERT INTO friends values (1,2);
INSERT INTO friends values (2,3);
INSERT INTO friends values (3,1);
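The sample data above can be loaded into an in-memory SQLite database to check the comparison direction and the ordering the question asks for. This is only a sketch using Python's sqlite3 module; the query shape (two joins onto packages, with the alias fp for the friend's package) is one possible way to write it, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INT, name VARCHAR(30));
CREATE TABLE packages (id INT, salary INT);
CREATE TABLE friends (id INT, friend_id INT);
INSERT INTO students VALUES (1, 'John'), (2, 'Arthur'), (3, 'Pete');
INSERT INTO packages VALUES (1, 1000), (2, 1200), (3, 800);
INSERT INTO friends VALUES (1, 2), (2, 3), (3, 1);
""")

# p is the student's own package, fp is the best friend's package.
names = [name for (name,) in conn.execute("""
    SELECT s.name
    FROM students s
    JOIN packages p ON p.id = s.id
    JOIN friends f ON f.id = s.id
    JOIN packages fp ON fp.id = f.friend_id
    WHERE fp.salary > p.salary   -- friend was offered more than the student
    ORDER BY fp.salary           -- ordered by the friend's offer
""")]

print(names)  # ['Pete', 'John']
```

Pete comes before John because Pete's friend was offered 1000 and John's friend 1200; Arthur is excluded because his friend was offered less than he was.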
I used CTEs for readability. I am not sure whether this is fully optimized, but it yields the result the question expects.
with std_salary as (
    SELECT s.id, s.name, p.salary
    FROM Students s
    JOIN Packages p
      ON s.id = p.id
),
friend_salary as (
    SELECT f.id, p.salary
    FROM Friends f
    JOIN Packages p
      ON f.friend_id = p.id
)
SELECT name
FROM
    (SELECT std_salary.name, std_salary.salary as own, friend_salary.salary as friend
     FROM std_salary
     JOIN friend_salary
       ON std_salary.id = friend_salary.id) as final
WHERE final.own < final.friend
ORDER BY final.friend;
This worked for me in MS SQL
SELECT a.name
FROM (SELECT students.id as main_id, students.name, packages.salary
FROM students join packages on students.id = packages.id) a
JOIN (SELECT f.id as main_id1, p.salary
FROM friends f JOIN packages p ON f.friend_id = p.id) b
ON a.main_id = b.main_id1
WHERE b.salary>a.salary
ORDER BY b.salary ASC;
You have written 'where TableA.salary>TableB.salary', which finds rows where your salary is greater than your friend's. The question asks for the opposite (names where the friend's salary is greater than yours), so change it to 'where TableB.salary>TableA.salary' and it should work.
select my_name
from (select s.id as my_id, s.name as my_name, p.salary as my_salary
      from students s
      inner join packages p on s.id = p.id) as my_tbl
inner join (select f.id as id, f.friend_id as frnd_id, p.salary as frnd_salary
            from friends f
            inner join packages p on f.friend_id = p.id) as frnd_tbl
  on my_id = id
where frnd_salary > my_salary
order by frnd_salary;
What I want to do is to join table and sum 3 columns.
self.document_products.joins("JOIN products ON products.id = document_products.product_id").group("products.tax_id").select("sum(a), sum(b), sum(c)")
Gives me
#<ActiveRecord::Relation [#<DocumentProduct id: nil>]>
Something like that works:
self.document_products.joins("JOIN products ON products.id = document_products.product_id").group("products.tax_id").sum("a")
But I want to have 3 sums. I can't do sum("a, b, c"). Where is the problem?
The code builds a SQL query using ActiveRecord's chained method syntax. You can call .to_sql at the end of most such chains (as long as the result is still an ActiveRecord relation rather than, say, an Array) to see the generated SQL, or inspect the log if logging is on. Consider the common part of the chain:
self.document_products.joins("JOIN products ON products.id = document_products.product_id").group("products.tax_id")
This generates something like (might not be exact, because I'm guessing a little about your application):
SELECT "document_products".* FROM "document_products" JOIN products ON products.id = document_products.product_id WHERE "document_products"."document_id" = 1497 GROUP BY products.tax_id
The two final methods you list are very different: select chooses which columns the query returns, whereas sum is an aggregate calculation that returns a single value for each group. With the select, something like the following is generated:
SELECT SUM(products.a), SUM(products.b), SUM(products.c) FROM "document_products" JOIN products ON products.id = document_products.product_id WHERE "document_products"."document_id" = 1497 GROUP BY products.tax_id
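This query does compute the sums, but nothing in the result says which tax_id each row belongs to, and the record prints as #<DocumentProduct id: nil>, most likely because inspect only lists the model's own columns and none of them was selected. Including the GROUP BY column in the SELECT part yields the necessary information. Try something like this: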
self.document_products.joins("JOIN products ON products.id = document_products.product_id").group("products.tax_id").select("products.tax_id, sum(a), sum(b), sum(c)")
This generates SQL something like:
SELECT products.tax_id, SUM(products.a), SUM(products.b), SUM(products.c) FROM "document_products" JOIN products ON products.id = document_products.product_id WHERE "document_products"."document_id" = 1497 GROUP BY products.tax_id
This appears to return the necessary information, and is, I think, what you're looking for (or close to it).
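To see what the final SQL returns, the generated query can be run directly against a small database. The sketch below uses Python's sqlite3 module rather than Rails; the table layouts, the sample values, and the placement of columns a, b and c on products are all assumptions, in line with the guess made in the SQL above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: a, b, c are assumed to live on products.
CREATE TABLE products (id INTEGER PRIMARY KEY, tax_id INTEGER, a INTEGER, b INTEGER, c INTEGER);
CREATE TABLE document_products (id INTEGER PRIMARY KEY, document_id INTEGER, product_id INTEGER);
INSERT INTO products VALUES (1, 10, 1, 2, 3), (2, 10, 4, 5, 6), (3, 20, 7, 8, 9);
INSERT INTO document_products VALUES (1, 1497, 1), (2, 1497, 2), (3, 1497, 3);
""")

# Same shape as the SQL above: the grouping column plus one SUM per column.
rows = conn.execute("""
    SELECT products.tax_id, SUM(products.a), SUM(products.b), SUM(products.c)
    FROM document_products
    JOIN products ON products.id = document_products.product_id
    WHERE document_products.document_id = 1497
    GROUP BY products.tax_id
    ORDER BY products.tax_id
""").fetchall()

print(rows)  # [(10, 5, 7, 9), (20, 7, 8, 9)]
```

Each row carries the tax_id alongside its three sums, so the caller can tell the groups apart.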
Let's say a Book model HABTM categories; for example, book A has categories "CA" and "CB". How can I retrieve book A if I query using only "CA" and "CB"? I know about .where("category_id in (1,2)"), but it uses an OR operation. I need something like an AND operation.
Edited
I also need to be able to get books from category CA only. And how do I include query criteria such as .where("book.p_year = 2012")?
ca = Category.find_by_name('CA')
cb = Category.find_by_name('CB')
Book.where(:id => (ca.book_ids & cb.book_ids)) # & returns elements common to both arrays.
Otherwise you'd need to query the join table directly in SQL, group the results by book_id, count them, and only return rows where the count is at least equal to the number of categories. Something like this (I'm not certain of the syntax, so double check it if you go this route; I'm also not sure it would be any faster than the above):
SELECT book_id, count(*) as c from books_categories where category_id IN (1,2) group by book_id having count(*) >= 2;
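The group-and-count idea can be tried out on a small data set, including the extra p_year filter asked about in the question. This is a sketch using Python's sqlite3 module; the books table, its p_year column, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (id INTEGER PRIMARY KEY, p_year INTEGER);
CREATE TABLE books_categories (book_id INTEGER, category_id INTEGER);
INSERT INTO books VALUES (1, 2012), (2, 2012), (3, 2012), (4, 2011);
-- Books 1 and 4 are in both categories; book 2 only in 1, book 3 only in 2.
INSERT INTO books_categories VALUES (1, 1), (1, 2), (2, 1), (3, 2), (4, 1), (4, 2);
""")

wanted = [1, 2]  # category ids that must ALL be present
placeholders = ", ".join("?" for _ in wanted)

book_ids = [book_id for (book_id,) in conn.execute(f"""
    SELECT b.id
    FROM books b
    JOIN books_categories bc ON bc.book_id = b.id
    WHERE bc.category_id IN ({placeholders})
      AND b.p_year = ?
    GROUP BY b.id
    HAVING COUNT(DISTINCT bc.category_id) = ?
    ORDER BY b.id
""", (*wanted, 2012, len(wanted)))]

print(book_ids)  # [1]
```

Book 4 also belongs to both categories but is filtered out by the year condition; dropping that condition would return books 1 and 4.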
Is it possible to link product IDs from different tables to one universal product ID? E.g. ID 1014 from table A and ID 2015 from table B to one universal ID 10 in table C?
In this case you could do something like this:
First your internal products:
master_id, name, description, etc...
1, "Keyboard", "Nice"
2, "Mouse", "Microsoft"
3, "Monitor", "Bright"
4, "Printer", "Not the best"
Second, table a and table b would each have a master_id column that references one of those ids.
Then to select all keyboards from table a or table b:
SELECT * FROM table_a JOIN products ON table_a.master_id = products.master_id
WHERE products.master_id = 1;
SELECT * FROM table_b JOIN products ON table_b.master_id = products.master_id
WHERE products.master_id = 1;
You can then get all keyboards from BOTH tables via UNION (this assumes both tables have the same columns):
SELECT * FROM table_a JOIN products ON table_a.master_id = products.master_id
WHERE products.master_id = 1
UNION
SELECT * FROM table_b JOIN products ON table_b.master_id = products.master_id
WHERE products.master_id = 1;
And welcome to Stack Overflow!
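The master_id layout can be sketched end to end with a small in-memory database. This uses Python's sqlite3 module; the id columns on table_a and table_b, the extra mouse row, and the source label column are assumptions added for illustration, with 1014 and 2015 taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (master_id INTEGER PRIMARY KEY, name TEXT, description TEXT);
CREATE TABLE table_a (id INTEGER PRIMARY KEY, master_id INTEGER REFERENCES products(master_id));
CREATE TABLE table_b (id INTEGER PRIMARY KEY, master_id INTEGER REFERENCES products(master_id));
INSERT INTO products VALUES (1, 'Keyboard', 'Nice'), (2, 'Mouse', 'Microsoft');
INSERT INTO table_a VALUES (1014, 1), (1015, 2);
INSERT INTO table_b VALUES (2015, 1);
""")

# Explicit column lists keep both halves of the UNION the same shape.
rows = conn.execute("""
    SELECT 'table_a' AS source, table_a.id, products.name
    FROM table_a JOIN products ON table_a.master_id = products.master_id
    WHERE products.master_id = 1
    UNION ALL
    SELECT 'table_b' AS source, table_b.id, products.name
    FROM table_b JOIN products ON table_b.master_id = products.master_id
    WHERE products.master_id = 1
    ORDER BY 1
""").fetchall()

print(rows)  # [('table_a', 1014, 'Keyboard'), ('table_b', 2015, 'Keyboard')]
```

Both local ids resolve to the same master product, and the mouse row in table_a is left out by the master_id filter.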
I’m working in SQL Server with the following sample problem. Brandon prefers PCs and Macs, Sue prefers PCs only, and Alan prefers Macs. The data would be represented something like this. I've had to make some compromises here, but hopefully you get the picture:
TABLE 1: User
uID (INT PK), uName (VARCHAR)
1 'Brandon'
2 'Sue'
3 'Alan'
TABLE 2: Computer
cID (INT PK), cName (VARCHAR)
1 'Mac'
2 'PC'
TABLE 3: UCPref --Stores the computer preferences for each user
uID (INT FK), cID (INT FK)
1 1
1 2
2 2
3 1
Now, if I want to select everyone who likes PCs OR Macs, that would be quite easy. There are a dozen ways to do it, but if I have a list of items fed in, then the IN clause is quite straightforward:
SELECT u.uName
FROM User u
INNER JOIN UCPref p ON u.uID = p.uID
WHERE cID IN (1,2)
The problem I have is: what happens when I want to select ONLY the people who like BOTH PCs AND Macs? I can do it with multiple subqueries, but that isn’t very scalable.
SELECT u.uName
FROM User u
INNER JOIN UCPref p ON u.uID = p.uID
WHERE u.uID IN (SELECT uID FROM UCPref WHERE cID = 1)
AND u.uID IN (SELECT uID FROM UCPref WHERE cID = 2)
How does one write this query so that it returns the users who prefer multiple computers, taking into consideration that there may be hundreds, maybe thousands, of different kinds of computers (meaning no subqueries)? If only you could modify the IN clause to have a keyword like 'ALL' to indicate that you want to match only those records that have all of the items in the parentheses?
SELECT u.uName
FROM User u
INNER JOIN UCPref p ON u.uID = p.uID
WHERE cID IN *ALL* (1,2)
Using JOINs:
SELECT u.uname
FROM USERS u
JOIN UCPREF mac_pref ON mac_pref.uid = u.uid
JOIN COMPUTER mac ON mac.cid = mac_pref.cid
AND mac.cname = 'Mac'
JOIN UCPREF pc_pref ON pc_pref.uid = u.uid
JOIN COMPUTER pc ON pc.cid = pc_pref.cid
AND pc.cname = 'PC'
I'm using table aliases, because I'm JOINing onto the same table twice.
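A sketch of the join approach on the question's data, using Python's sqlite3 module. It joins the preference table once per required computer, so that one preference row can match 'Mac' and a different row can match 'PC'; the aliases mac_pref and pc_pref are names chosen here for illustration, and the preference rows follow the prose of the question (Brandon both, Sue PC only, Alan Mac only).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (uid INTEGER PRIMARY KEY, uname TEXT);
CREATE TABLE computer (cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE ucpref (uid INTEGER, cid INTEGER);
INSERT INTO users VALUES (1, 'Brandon'), (2, 'Sue'), (3, 'Alan');
INSERT INTO computer VALUES (1, 'Mac'), (2, 'PC');
INSERT INTO ucpref VALUES (1, 1), (1, 2), (2, 2), (3, 1);
""")

# One pass over ucpref per required computer, each under its own alias.
both = [name for (name,) in conn.execute("""
    SELECT u.uname
    FROM users u
    JOIN ucpref mac_pref ON mac_pref.uid = u.uid
    JOIN computer mac ON mac.cid = mac_pref.cid AND mac.cname = 'Mac'
    JOIN ucpref pc_pref ON pc_pref.uid = u.uid
    JOIN computer pc ON pc.cid = pc_pref.cid AND pc.cname = 'PC'
""")]

print(both)  # ['Brandon']
```

This shape needs one extra pair of joins per required computer, so it suits a short, fixed list better than hundreds of items.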
Using EXISTS:
SELECT u.uname
FROM USERS u
WHERE EXISTS (SELECT NULL
FROM UCPREF ucp
WHERE ucp.uid = u.uid
AND ucp.cid IN (1, 2)
GROUP BY ucp.uid
HAVING COUNT(DISTINCT ucp.cid) = 2)
If you're going to use the IN clause, you have to use GROUP BY/HAVING, but there is a risk in the COUNT(). With COUNT(*), a user whose preference for the same computer is stored twice reaches a count of 2 from the duplicate rows, which is valid to SQL and gives you a false positive. Counting distinct values with COUNT(DISTINCT ...) avoids this where the database supports it.
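The false positive is easy to reproduce. In this sketch (Python's sqlite3 module; the helper users_matching and the duplicated row for Sue are assumptions added for illustration), Sue's PC preference is stored twice, so COUNT(*) reaches 2 for her even though she only likes one kind of computer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (uid INTEGER PRIMARY KEY, uname TEXT);
CREATE TABLE ucpref (uid INTEGER, cid INTEGER);
INSERT INTO users VALUES (1, 'Brandon'), (2, 'Sue'), (3, 'Alan');
-- Mac is cid 1, PC is cid 2; Sue's PC preference is stored twice on purpose.
INSERT INTO ucpref VALUES (1, 1), (1, 2), (2, 2), (2, 2), (3, 1);
""")

def users_matching(count_expr):
    """Return users whose preferences in (1, 2) satisfy count_expr = 2."""
    sql = f"""
        SELECT u.uname
        FROM users u
        JOIN ucpref p ON p.uid = u.uid
        WHERE p.cid IN (1, 2)
        GROUP BY u.uid, u.uname
        HAVING {count_expr} = 2
        ORDER BY u.uname
    """
    return [name for (name,) in conn.execute(sql)]

with_count_star = users_matching("COUNT(*)")
with_count_distinct = users_matching("COUNT(DISTINCT p.cid)")

print(with_count_star)      # ['Brandon', 'Sue']
print(with_count_distinct)  # ['Brandon']
```

A unique constraint on (uid, cid) in the preference table would also rule out the duplicate rows, in which case COUNT(*) and COUNT(DISTINCT ...) give the same answer.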