Retain view before releasing it? - esper

I have this query that groups ids into subviews, and each view should contain at least 3 unique val values:
select count(val), count(distinct val)
from A.std:groupwin(id).win:expr_batch(count(distinct val) >= 3) group by id
Let's say val can be 1, 2, or 3.
The problem is that I can have 1000 events with val 1 and 2, and as soon as a single event with val 3 enters the window, it is released.
How can I wait for at least 1 minute and let other events enter before releasing the window?
Thanks


any way to limit result by node

I have this query:
MATCH (user:Users)-[buy:Sales]->(item:Items)<-[buy2:Sales]- (user2:Users)-[buy_other:Sales]->(item2:Items)
where item.category = item2.category
return
user.mail, item2.id
The idea is to get items that the first user might be interested in because another user (user2) also bought them, but I want to limit the results to at most 2 item2 ids per user.
I know I can limit results in general, with LIMIT 10 for example, but then those 10 results could all belong to the same user.
Any help? Thanks in advance.
You can do it by collecting the items per user with COLLECT and taking the first n elements of the resulting list.
MATCH (user:Users)-[buy:Sales]->(item:Items)<-[buy2:Sales]- (user2:Users)-[buy_other:Sales]->(item2:Items)
WHERE item.category = item2.category
// this is where you collect and get some items of it
WITH user,COLLECT(item2)[0..2] AS item2s
UNWIND item2s AS item2
RETURN
user.mail, item2.id
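The collect-then-slice step can also be sketched outside Cypher. The Python below only illustrates the logic, assuming the matched (user mail, item2 id) rows arrive as plain tuples; the function name `limit_per_user` and the sample data are hypothetical.

```python
from collections import defaultdict

def limit_per_user(rows, n=2):
    """Keep at most n items per user, preserving first-seen order."""
    collected = defaultdict(list)  # plays the role of COLLECT(item2)
    for mail, item_id in rows:
        collected[mail].append(item_id)
    # The slice [:n] mirrors [0..2]; the nested loop mirrors UNWIND.
    return [(mail, item_id)
            for mail, items in collected.items()
            for item_id in items[:n]]

# Hypothetical sample rows: three items for one user, one for another.
rows = [("a@x.com", 1), ("a@x.com", 2), ("a@x.com", 3), ("b@x.com", 7)]
result = limit_per_user(rows)
```

As in the Cypher version, which items survive depends on the order in which rows were collected, so an explicit ordering before collecting is needed if specific items should be preferred.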

JOIN ON second highest value (Impala)

I don't know how or even if this is possible.... I am trying to JOIN tables on the second highest value. I tried rowNumber, lag, lead & rank but haven't been able to get any of them to do what I need. To summarize, I'm just trying to shift the activitydate table down one row to join on rollDate minus 1 (but can't use -1 because they are not consistent dates, there are days missing.)
Does anyone know a good way to do this? Any suggestions are appreciated!
Select
ds.activitydate
,sum(ws.weeklyTotals / ds.daysBetween) as newRunRates -- getting an average of daily activity from weekly totals
from
(select
fsc.activitydate
,fsc.weekstart
,max(fsc.activitydate) OVER (partition by fsc.weekstart) as rollUpDate
,datediff(to_date(max(fsc.activitydate) OVER (partition by fsc.weekstart)), to_date(fsc.weekstart)) + 1 as daysBetween
from fiscalcalendar fsc
) ds -- used this to get a week-ending date bc that is what I need to join on. I only have a week start in this table
left join
(select
activitydate_iso
,count(distinct assignedmaincomponentid) as weeklyTotals
from activityTable
group by 1
) ws -- weeklySplits -- this gives me my weekly totals by a week ending date
on ds.rollUpDate = ws.activitydate_iso
-- need this join logic to actually be
-- on ds.rollUpDate = (max(ws.activitydate_iso) where activitydate_iso < rollUpDate)
where activitydate between '2020-05-22' and '2020-06-15'
group by 1,2
order by 1,2
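The join condition described in the comments (match each rollUpDate to the latest activitydate_iso strictly before it) can be sketched in Python. This illustrates the intended lookup only and is not Impala syntax; `previous_date` and the sample dates are hypothetical.

```python
from bisect import bisect_left
from datetime import date

def previous_date(sorted_dates, target):
    """Return the latest date strictly before target, or None if there is none."""
    i = bisect_left(sorted_dates, target)  # index of the first date >= target
    return sorted_dates[i - 1] if i > 0 else None

# Week-ending dates with irregular gaps, so "target minus one day" would not work.
week_ends = [date(2020, 5, 22), date(2020, 5, 29), date(2020, 6, 5)]
prev = previous_date(week_ends, date(2020, 6, 5))
none_before = previous_date(week_ends, date(2020, 5, 22))
```

The lookup depends only on the ordering of the dates, not on the distance between them, which is why missing days do not matter.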

Duplicates in the result of a subquery

I am trying to count distinct sessionIds from a measurement. Since sessionId is a tag, I count the distinct entries in a "parent" query, because distinct() doesn't work on tags.
In the subquery, I use a group by sessionId limit 1 to still benefit from the index (if there is a more efficient technique, I have ears wide open but I'd still like to understand what's going on).
I have those two variants:
> select count(distinct(sessionId)) from (select * from UserSession group by sessionId limit 1)
name: UserSession
time count
---- -----
0 3757
> select count(sessionId) from (select * from UserSession group by sessionId limit 1)
name: UserSession
time count
---- -----
0 4206
To my understanding, those should return the same number, since group by sessionId limit 1 already returns distinct sessionIds (in the form of groups).
And indeed, if I execute:
select * from UserSession group by sessionId limit 1
I have 3757 results (groups), not 4206.
In fact, as soon as I put this in a subquery and re-select fields in a parent query, some sessionIds occur multiple times in the final result. Not all of them, since there are 17549 rows in total, but some do.
This suggests that the limit 1 is partly working, but some sessionIds still get multiple entries when re-selected. Maybe some kind of undefined behaviour?
I can confirm that I get the same result.
In my experience using nested queries does not always deliver what you expect/want.
Depending on how you use this you could retrieve a list of all values for a tag with:
SHOW TAG VALUES FROM UserSession WITH KEY=sessionId
Or to get the cardinality (number of distinct values for a tag):
SHOW TAG VALUES EXACT CARDINALITY FROM UserSession WITH KEY=sessionId
This returns a single row with a single column, count, containing a number. You can remove the EXACT modifier if you don't need an exact result; see SHOW TAG VALUES CARDINALITY in the InfluxDB documentation.

Esper - concatenate values from multiple rows to a list

I have an Esper query that returns multiple rows, but I'd like to instead get one row, where that row has a list (or concatenated string) of all of the values from the (corresponding columns of the) matching rows that my current query returns.
For example:
SELECT Name, avg(latency) as avgLatency
FROM MyStream.win:time(5 min)
GROUP BY Name
HAVING avgLatency / 1000 > 60
OUTPUT last every 5 min
Returns:
Name avgLatency
---- ----------
A 65
B 70
C 75
What I'd really like:
Name
----
{A, B, C}
Is this possible to do via the query itself? I tried to make this work using subqueries, but I'm not working with multiple streams. I also can't find any aggregation functions or enumeration methods in the Esper documentation that fit what I'm trying to do.
Thanks to anybody that has any insight or direction for me here.
EDIT:
If this can't be done via the query, I'm open to changing the subscriber, or anything else, if necessary.
You can have a subscriber or listener do the concat. There is a "Multi-Row Delivery" for subscribers. Or use a table like below.
// create table to hold aggregation result
create table LatencyTable(name string primary key, avgLatency avg(double));
// update aggregations in table from events coming in
into table LatencyTable select name, avg(latency) as avgLatency from MyStream#time(5 min) group by name;
// do a select with the "aggregate" enumeration method
select (select * from LatencyTable where avgLatency > x).aggregate(....) from pattern[every timer:interval(5 min)]
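The `aggregate` enumeration method works like a fold: it starts from a seed value and combines it with each row in turn. A rough Python analogue of that concatenation step follows, assuming the table rows are available as (name, avgLatency) pairs. The function `concat_names`, the sample rows, and the threshold are made up for illustration, and the threshold is applied to the average directly rather than to avgLatency / 1000 as in the question.

```python
from functools import reduce

def concat_names(rows, threshold):
    """Fold the names of rows whose average exceeds threshold into one string."""
    matching = [name for name, avg_latency in rows if avg_latency > threshold]
    # reduce plays the role of aggregate(seed, accumulator); the seed is "".
    return reduce(lambda acc, name: f"{acc}, {name}" if acc else name,
                  matching, "")

# Hypothetical rows mirroring the question's output, plus one below the threshold.
rows = [("A", 65), ("B", 70), ("C", 75), ("D", 40)]
names = concat_names(rows, 60)
```

The same fold could equally run inside a subscriber or listener that receives all rows at once, which is the other option mentioned in the answer.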

Mysql matchmaking / pairing

I'm currently working on a 1v1 online game and I ran into a problem when trying to match up players.
A player who wants to play gets put into a matchmaking table
id, user, amount
Now I want to query the table matchmaking for the best possible pairs of users (So, users who want to play for the same amount)
I also want users who are waiting for a longer time (smaller id), to be paired up first.
So far I have this query:
SELECT *
FROM matchmaking a, wpr_matchmaking b
WHERE a.user != b.user
AND a.amount = b.amount
ORDER BY a.id ASC , b.id ASC
LIMIT 0 , 30
This returns all possible pairings, so in a table with this content:
id, user, amount
1, 1, 10
2, 2, 10
3, 3, 10
I get the pairs:
1,2
1,3
2,1
2,3
3,1
3,2
Whereas I only want 1,2 returned in that case.
How do I make it only show me each user at most once?
Edit: adding the condition 'and a.id < b.id' to the query halves the number of pairings, but there are still too many.
Do you just want to match the best pair, remove those two users, and then rerun the query? If so, keep the a.id < b.id condition and add LIMIT 1 (MySQL has no SELECT TOP 1; that is SQL Server syntax).
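If all pairs are wanted in one pass rather than one pair per query, the pairing can be done greedily in application code. This is a minimal sketch, assuming the table has been fetched as (id, user, amount) tuples and that each user appears at most once; `pair_players` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

def pair_players(rows):
    """Pair waiting players with equal amounts, smallest id (longest wait) first."""
    waiting = defaultdict(list)
    for row_id, user, amount in sorted(rows):  # tuples sort by id first
        waiting[amount].append(user)
    pairs = []
    for users in waiting.values():
        # Take users two at a time; an odd one out keeps waiting.
        for i in range(0, len(users) - 1, 2):
            pairs.append((users[i], users[i + 1]))
    return pairs

# The example table from the question.
rows = [(1, 1, 10), (2, 2, 10), (3, 3, 10)]
pairs = pair_players(rows)
```

With the example table this yields only the pair of users 1 and 2, and user 3 stays in the queue. Pairs come out grouped by amount, so an extra sort would be needed if a global order by waiting time matters.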
