Limit the results of a select from geo.places.neighbors in YQL to a maximum distance - yql

With
select placeTypeName, name from geo.places.neighbors where neighbor_woeid in (select woeid from geo.places where text="sunnyvale, usa" limit 1)
I can get neighboring places for a location via YQL. But how can I limit the results to a maximum distance, e.g. 10 km, 20 km, or 30 km?

Related

Active Record query Distinct Group by

I have a query where I am trying to find the minimum score per grade. Within a grade, there can be multiple users with the same minimum score.
Example: user A has a score of 2 and user B has a score of 2, so my expectation is to get both users, grouped by grade.
However, I am only getting one user. The query is:
users = Users.all
user_score = users
  .where.not(score: [nil, 0])
  .select('DISTINCT ON ("users"."grade") grade, "users".*')
  .order('"users"."grade" ASC, "users"."score" ASC')
  .group_by(&:grade)
Please, can someone guide me on what I am doing wrong here?
DISTINCT ON will keep only one row per grade in the result, so there is no way to get multiple users with the same minimum score in your query.
I think you can achieve the desired result with a window function:
SELECT * FROM (
  SELECT *, rank() OVER (PARTITION BY grade ORDER BY score) AS places
  FROM users
  WHERE score IS NOT NULL AND score != 0
) AS ranked_by_score
WHERE places = 1;
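To see why rank() solves the tie problem, here is a minimal, self-contained sketch; the sample rows are made up for illustration (two users tied at the grade's minimum score). Both tied users get places = 1, so the outer filter keeps them both, unlike DISTINCT ON, which would keep only one:
SELECT * FROM (
  SELECT *, rank() OVER (PARTITION BY grade ORDER BY score) AS places
  FROM (VALUES ('A', 5, 2), ('B', 5, 2), ('C', 5, 7)) AS users(name, grade, score)
) AS ranked_by_score
WHERE places = 1;
-- returns both A and B (places = 1); C gets places = 3 and is filtered out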

Limit query results

I have an app that includes music charts to showcase the top tracks (it shows the top 10).
However, I'm trying to limit the charts so that any particular user cannot have more than one track on the top charts at the same time.
If you need any more info, just let me know.
You can use the row_number() function, which gives a running number that resets when the user id changes. You can then use that in a WHERE clause to create a per-user limit:
SELECT * FROM (
  SELECT COALESCE(sum(plays.clicks), 0),
         row_number() OVER (PARTITION BY users.id ORDER BY COALESCE(sum(plays.clicks), 0) DESC),
         users.id AS user_id,
         tracks.*
  FROM tracks
  JOIN plays
    ON tracks.id = plays.track_id
   AND plays.created_at > now() - interval '14 days'
  INNER JOIN albums
    ON tracks.album_id = albums.id
  INNER JOIN users
    ON albums.user_id = users.id
  GROUP BY users.id, tracks.id
  ORDER BY 1 DESC
) sq1
WHERE row_number <= 2
LIMIT 10;
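A stripped-down sketch of the same per-user cutoff, with made-up ids and play counts rather than the real schema, showing how row_number() restarts for each user; lowering the cutoff to rn <= 1 would enforce the question's one-track-per-user rule:
SELECT * FROM (
  SELECT user_id, track_id, plays,
         row_number() OVER (PARTITION BY user_id ORDER BY plays DESC) AS rn
  FROM (VALUES (1, 10, 500), (1, 11, 300), (1, 12, 100), (2, 20, 400)) AS t(user_id, track_id, plays)
) ranked
WHERE rn <= 2
ORDER BY plays DESC
LIMIT 10;
-- user 1 keeps tracks 10 and 11 (rn = 1, 2), track 12 (rn = 3) is dropped; user 2 keeps track 20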

How to use joins and averages together in Hive queries

I have two tables in Hive:
Table1: uid, txid, amt, vendor
Table2: uid, txid
Now I need to join the tables on txid, which basically confirms a transaction was finally recorded. Some transactions will be present only in Table1 and not in Table2.
I need to find the average transaction match rate per user (uid), per vendor. Then I need to find the average of these averages by adding them up and dividing by the number of unique users per vendor.
Let's say I have the data:
Table1:
u1,120,44,vend1
u1,199,33,vend1
u1,100,23,vend1
u1,101,24,vend1
u2,200,34,vend1
u2,202,32,vend2
Table2:
u1,100
u1,101
u2,200
u2,202
Example for vendor vend1:
u1 -> avg transaction find rate = 2 (matches found in both Table1 and Table2) / 4 (total occurrences in Table1) = 0.5
u2 -> avg transaction find rate = 1/1 = 1
Avg of avgs = (0.5 + 1) (sum of avgs) / 2 (total unique users) = 0.75
Required output:
vend1,0.75
vend2,1
I can't seem to get both the count of matches and the count of occurrences in Table1 in one Hive query, per user, per vendor. This is the query I have reached, and I can't figure out how to change it further.
SELECT A.vendor, A.uid, count(*) AS totalmatchesperuser FROM Table1 A JOIN Table2 B ON A.uid = B.uid AND B.txid = A.txid GROUP BY vendor, A.uid
Any help would be great.
I think you are running into trouble with your JOIN. When you JOIN by txid and uid, you lose the total number of uids per group. If I were you, I would assign a column of 1s to Table2, name the column something like success or transaction, and do a LEFT OUTER JOIN. Then in your new table you will have a column with the number 1 in it if there was a completed transaction and NULL otherwise. You can then use a CASE statement to convert these NULLs to 0.
Query:
select vendor
      ,(SUM(avg_uid) / COUNT(uid)) as avg_of_avgs
from (
    select vendor
          ,uid
          ,AVG(complete) as avg_uid
    from (
        select uid
              ,txid
              ,amt
              ,vendor
              ,case when success is null then 0
                    else success
               end as complete
        from (
            select A.*
                  ,B.success
            from table1 as A
            LEFT OUTER JOIN table2 as B
              ON B.txid = A.txid
        ) x
    ) y
    group by vendor, uid
) z
group by vendor
Output:
vend1 0.75
vend2 1.0
B.success in line 17 is the column of 1s that I put in table2 before the JOIN. If you are curious about CASE statements in Hive, the Hive documentation covers them.
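To make the mechanism concrete, here are the intermediate complete values the CASE expression assigns for the question's sample rows (a worked sketch based on that data, not actual query output):
uid  txid  vendor  complete
u1   120   vend1   0   (no match in Table2, success is NULL)
u1   199   vend1   0   (no match in Table2, success is NULL)
u1   100   vend1   1
u1   101   vend1   1
u2   200   vend1   1
u2   202   vend2   1
AVG(complete) is 0.5 for (vend1, u1) and 1 for (vend1, u2), so vend1's avg_of_avgs = (0.5 + 1) / 2 = 0.75; vend2 has only u2 with an average of 1, giving 1.0.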
Amazing and precise answer by GoBrewers14!! Thank you so much; I was looking at it from the wrong perspective.
I made small changes to the query to finally get things done.
I didn't need to add a "success" column to table2. I checked B.txid in the above query instead of B.success: B.txid will be NULL if a match is not found and will have some value if a match is found, so it captures the success and failure conditions by itself without adding a new column. I then mapped NULL to 0 and non-NULL to 1 in the CASE expression above it. I also renamed some columns, as Hive was finding them ambiguous.
The final query looks like:
select vendr
      ,(SUM(avg_uid) / COUNT(usrid)) as avg_of_avgs
from (
    select vendr
          ,usrid
          ,AVG(complete) as avg_uid
    from (
        select usrid
              ,txnid
              ,amnt
              ,vendr
              ,case when success is null then 0
                    else 1
               end as complete
        from (
            select A.uid as usrid, A.vendor as vendr, A.amt as amnt, A.txid as txnid
                  ,B.txid as success
            from Table1 as A
            LEFT OUTER JOIN Table2 as B
              ON B.txid = A.txid
        ) x
    ) y
    group by vendr, usrid
) z
group by vendr;

facebook api limit offset does not work

I have FQL queries like these:
SELECT name, page_id, type, pic_cover, fan_count, about FROM page where page_id IN (SELECT page_id FROM page_fan WHERE uid=me()) LIMIT 100 OFFSET 0
SELECT name, page_id, type, pic_cover, fan_count, about FROM page where page_id IN (SELECT page_id FROM page_fan WHERE uid=me()) LIMIT 100 OFFSET 100
Although the first query works very well, the second one doesn't return anything. Has there been any change in the Facebook API for OFFSET? Is there any way to handle this problem?
Guess it's too late. The problem is with using the IN keyword. The second query must be:
SELECT name, page_id, type, pic_cover, fan_count, about FROM page where page_id IN (SELECT page_id FROM page_fan WHERE uid=me() limit 100 offset 100)
LIMIT and OFFSET go in the inner query.
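For reference, the first page follows the same pattern, with LIMIT and OFFSET inside the subquery (a sketch of the analogous query, not taken from the question):
SELECT name, page_id, type, pic_cover, fan_count, about FROM page where page_id IN (SELECT page_id FROM page_fan WHERE uid=me() limit 100 offset 0)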
Hope this helps.

row_number() with an unspecified window `row_number() OVER ()`

I'm building a paginated scoreboard using Postgres 9.1.
There are several criteria that users can sort the scoreboard by, ascending or descending. There is a feature to let users find "their row" across the multiple pages of the scoreboard, and it must reflect the user's selected sorting criteria.
I am using postgres's row_number function to find their offset into the result set to return the page where the user can find their row.
Everything I'm reading about row_number seems to imply that bad things happen to people who don't specify an ordering within the row_number window. E.g. row_number() OVER (ORDER BY score_1) is OK, row_number() OVER () is bad.
My case is different from the examples I've read about in that I am explicitly ordering my query; I realize the DB engine may not return the results in any particular order if I don't.
But I'd like to just specify ordering at the level of the entire query and get the row_number of the results, without having to duplicate my ordering specification with the row_number's window.
So this is what I'd like to do, and it "seems to work".
SELECT
id,
row_number() OVER () AS player_position,
score_1,
score_2,
score_3
FROM my_table
ORDER BY (score_1 ASC | score_1 DESC | score_2 ASC | score_2 DESC | score_3 ASC | score_3 DESC)
where player_position reflects the player's rank under whatever criteria I'm ordering by.
But the documentation I've read tells me I should do it like this:
SELECT
id,
row_number() OVER (ORDER BY score_1 ASC) AS player_position,
score_1,
score_2,
score_3
FROM my_table
ORDER BY score_1 ASC
or
SELECT
id,
row_number() OVER (ORDER BY score_2 DESC) AS player_position,
score_1,
score_2,
score_3
FROM my_table
ORDER BY score_2 DESC
The real reason I'd like to avoid redundantly specifying the ordering for the row_number window is to keep my query amenable to the ActiveRecord ORM. I want to have my base scoreboard query and chain the ordering onto it.
e.g. Ultimately, I want to be able to do this:
Players.scoreboard.order('score_1 ASC')
Players.scoreboard.order('score_2 DESC')
etc...
Is it possible?
Try moving your main query into a subquery with an ORDER BY and applying ROW_NUMBER() in the outermost query:
SELECT y.*,
       ROW_NUMBER() OVER () AS player_position
FROM (
  SELECT id,
         score_1,
         score_2,
         score_3
  FROM my_table
  ORDER BY <whatever>
) AS y
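For example, with one concrete ordering filled in (assuming the same table and column names as the question), the chained-order case would look like this; the inner ORDER BY determines which row receives which player_position:
SELECT y.*,
       ROW_NUMBER() OVER () AS player_position
FROM (
  SELECT id,
         score_1,
         score_2,
         score_3
  FROM my_table
  ORDER BY score_2 DESC
) AS y;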
