Displaying not equal to - sqlplus

select DISTINCT STORECODE, SUPPCODE from stocks
where suppcode = 'S2'
From this code I get the Storecodes that have the supplier code 'S2'.
I would like to display where:
where suppcode != 'S2'
However, when I run this I then get the other store codes. e.g. S1, S3, S4, S5, S6. I would like to only return the Storecodes that don't have S2 as a supplier.
Many thanks!

select distinct storecode
from stocks
where storecode not in
(select distinct storecode
from stocks
where supplier = 'S2')

Related

Get the average of the most recent records within groups with ActiveRecord

I have the following query, which calculates the average number of impressions across all teams for a given name and league:
#all_team_avg = NielsenData
.where('name = ? and league = ?', name, league)
.average('impressions')
.to_i
However, there can be multiple entries for each name/league/team combination. I need to modify the query to only average the most recent records by created_at.
With the help of this answer I came up with a query which gets the result that I need (I would replace the hard-coded WHERE clause with name and league in the application), but it seems excessively complicated and I have no idea how to translate it nicely into ActiveRecord:
SELECT avg(sub.impressions)
FROM (
WITH summary AS (
SELECT n.team,
n.name,
n.league,
n.impressions,
n.created_at,
ROW_NUMBER() OVER(PARTITION BY n.team
ORDER BY n.created_at DESC) AS rowcount
FROM nielsen_data n
WHERE n.name = 'Social Media - Twitter Followers'
AND n.league = 'National Football League'
)
SELECT s.*
FROM summary s
WHERE s.rowcount = 1) sub;
How can I rewrite this query using ActiveRecord or achieve the same result in a simpler way?
When all you have is a hammer, everything looks like a nail.
Sometimes, raw SQL is the best choice. You can do something like:
#all_team_avg = NielsenData.find_by_sql("...your_sql_statement_here...")

Joining and filtering out unnecessary data

I need to build a query with the following requirements.
The two tables to use are
MASTER_ARCHIVE and
REP_PROFILE
As of now we are only interested in reps at the wirehouses: Wells Fargo, Morgan Stanley, UBS, Merrill Lynch
To get reps from only these firms, I need to filter the Rep Profile table by Firm ID (The Firm IDs can be found in Firm table), and can filter the Master Archive table on FIRM_CRD
What we need is 2 sets of data:
1) A list of wirehouse reps that are in the Master Archive table, but not in the Rep Profile table
2) A list of wirehouse reps that are in the Rep Profile Table, but not in the Master Archive table
Does anyone have an idea of what type of Joins and filter conditions that I would use to get the data that I'm looking for?
This is what I currently came up with!!!!
SELECT *
FROM MASTER_ARCHIVE E
Left JOIN REP_PROFILE R
ON E.REP_CRD = R.CRD_NUMBER
WHERE E.FIRM_ID IN ('F206','F443','F474','F458')
MINUS
SELECT *
FROM MASTER_ARCHIVE E
JOIN REP_PROFILE R
ON E.REP_CRD = R.CRD_NUMBER
WHERE E.FIRM_ID IN ('F206','F443','F474','F458')
--ORDER BY NAME Name
i don't understand so much, but try with this
SELECT *
FROM MASTER_ARCHIVE E
LEFT JOIN REP_PROFILE R
ON E.REP_CRD = R.CRD_NUMBER
WHERE E.FIRM_ID IN ('F206','F443','F474','F458')
AND R.CRD_NUMBER IS NULL

Emulating an interval join in hive

I am using hive 0.13.
I have two tables:
data table. columns: id, time. 1E10 rows.
mymap table. columns: id, name, start_time, end_time. 1E6 rows.
For each row in the data table I want to get the name from the mymap table matching the id and the time interval. So I want to do a join like:
select data.id, time, name from data left outer join mymap on data.id = mymap.id and time>=start_time and time<end_time
It is known that for every row in data there are 0 or 1 matches in mymap.
The above query is not supported in hive as it is a non-equi-join. Moving the inequality conditions into a where filter does not work cause the join explodes before the filter is applied:
select data.id, time, name from data left outer join mymap on data.id = mymap.id where mymap.id is null or (time>=start_time and time<end_time)
(I am aware that the queries are not exactly equivalent due to cases where there is a match for id but no matching interval. This can be solved as I describe here: Hive: work around for non equi left join)
How can I go about this?
You could perform your join and then query from that table. I didn't test this code, but it would read something like
select id
,time
,name
from (
select d.id
,d.time
,m.name
,m.start_time
,m.end_time
from data as d LEFT OUTER JOIN mymap as m
ON d.id = m.id
) x
where time>=start_time
AND time<end_time
You could potentially get around this issue by flattening out the data structure in table2 and using a UDF to process the joined records.
select
id,
time,
nameFinderUDF(b.name_list, time) as name
from
data a
LEFT OUTER JOIN
(
select
id,
collect_set(array(name,cast(start_time as string),cast(end_time as string))) as name_list
from
mymap
group by
id
) b
ON (a.id=b.id)
With a UDF that does something like:
public String evaluate(ArrayList<ArrayList<String>> name_list,Long time) {
for (int i;i<name_list.length;i++) {
if (time >= Long.parseLong(name_list[i][1]) && time <= Long.parseLong(name_list[i][2])) {
return name_list[i][0]
return null;
}
This approach should make the merge 1 to 1, but it could create a fairly large data structure repeated many times. It is still quite a bit more efficient than a straight join.

How to get records from multiple condition from a same column through associated table

Let say a book model HABTM categories, for an example book A has categories "CA" & "CB". How can i retrieve book A if I query using "CA" & "CB" only. I know about the .where("category_id in (1,2)") but it uses OR operation. I need something like AND operation.
Edited
And also able to get books from category CA only. And how to include query criteria such as .where("book.p_year = 2012")
ca = Category.find_by_name('CA')
cb = Category.find_by_name('CB')
Book.where(:id => (ca.book_ids & cb.book_ids)) # & returns elements common to both arrays.
Otherwise you'd need to abuse the join table directly in SQL, group the results by book_id, count them, and only return rows where the count is at least equal to the number of categories... something like this (but I'm sure it's wrong so double check the syntax if you go this route. Also not sure it would be any faster than the above):
SELECT book_id, count(*) as c from books_categories where category_id IN (1,2) group by book_id having count(*) >= 2;

linQ query joining multiple tables

How do i link 2 tables
Events[ Event_ID , Hobby_Code]
Hobbies[Hobby_Code]
Members[Member_ID,Hobby_Code,Event_ID, Member_Email]
so that when i sent an email to the member the hobby code's of the Members and Events Table must be equal? I want to link Events and Members.
var q = from m in listOfMemebers
join e in listOfEvents on m.Hobby_Code equals e.Hobby_Code
select m;
This will select all the members that have an Hobby_Code that exists on some event

Resources