How to find pairs of events with postgres?

I have an events table:
ts | user | reason
----------------------------+--------+--------
2018-06-01 10:44:15.52+01 | 359999 | START
2018-06-01 10:44:29.521+01 | 359999 | STOP
2018-06-01 10:44:43.52+01 | 359998 | START
2018-06-01 10:44:55.52+01 | 359999 | START
2018-06-01 10:44:59.521+01 | 359998 | STOP
2018-06-01 10:45:07.52+01 | 359999 | STOP
2018-06-01 10:46:16.52+01 | 359999 | START
And I want to find the pairs of events:
user | start | stop
--------+----------------------------+----------------------------
359999 | 2018-06-01 10:44:15.52+01 | 2018-06-01 10:44:29.521+01
359998 | 2018-06-01 10:44:43.52+01 | 2018-06-01 10:44:59.521+01
359999 | 2018-06-01 10:44:55.52+01 | 2018-06-01 10:45:07.52+01
359999 | 2018-06-01 10:46:16.52+01 |
What sort of query could do this?

You can do this pretty easily with a window function. Among other things, these let you reference the next/previous row in a query result (via lead() and lag()). For example:
SELECT "user", ts AS start, next_ts AS stop
FROM (
    SELECT *, lead(ts) OVER (PARTITION BY "user" ORDER BY ts) AS next_ts
    FROM events
    WHERE reason IN ('START', 'STOP')
) AS ts_pairs
WHERE reason = 'START';
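To make the lead() step concrete, here is a minimal Python sketch of the same logic applied to the sample rows from the question. It is an illustration only, not how PostgreSQL evaluates the query; the function name pair_events is made up for this example, and timestamps are kept as strings, which is safe here only because they all share one format and offset.

```python
from collections import defaultdict

# Sample rows from the question: (ts, user, reason).
events = [
    ("2018-06-01 10:44:15.52+01", 359999, "START"),
    ("2018-06-01 10:44:29.521+01", 359999, "STOP"),
    ("2018-06-01 10:44:43.52+01", 359998, "START"),
    ("2018-06-01 10:44:55.52+01", 359999, "START"),
    ("2018-06-01 10:44:59.521+01", 359998, "STOP"),
    ("2018-06-01 10:45:07.52+01", 359999, "STOP"),
    ("2018-06-01 10:46:16.52+01", 359999, "START"),
]


def pair_events(rows):
    """Mimic lead(ts) OVER (PARTITION BY user ORDER BY ts), keeping START rows."""
    by_user = defaultdict(list)
    for ts, user, reason in rows:
        if reason in ("START", "STOP"):
            by_user[user].append((ts, reason))

    pairs = []
    for user, user_rows in by_user.items():
        user_rows.sort()  # ORDER BY ts within the partition
        for i, (ts, reason) in enumerate(user_rows):
            # lead(ts): the ts of the next row for this user, or None at the end
            next_ts = user_rows[i + 1][0] if i + 1 < len(user_rows) else None
            if reason == "START":
                pairs.append((user, ts, next_ts))
    return sorted(pairs, key=lambda p: p[1])


pairs = pair_events(events)
for user, start, stop in pairs:
    print(user, start, stop)
```

Note that lead() takes the next row regardless of its reason, so if a user has two START rows in a row, the first one is paired with the second START's timestamp rather than with a STOP.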

Just got this working. Is there a more efficient way?
select "user", reason, ts as start, (
    select ts
    from events as stops
    where stops.ts > starts.ts
    and stops.reason = 'STOP'
    and stops."user" = starts."user"
    order by ts asc  -- nearest following STOP, not the last one
    limit 1
) as stop
from events as starts
where reason = 'START'
order by ts
;

Related

Neo4j Cypher: How to optimize a NOT EXISTS Query when cardinality is high

The query below takes over 1 second and consumes about 7 MB when the cardinality between users and posts is about 8000 (one user views about 8000 posts). It is difficult to scale because latency and memory consumption are high and grow linearly. Is it possible to model this differently and/or optimise the query?
Query
PROFILE MATCH (u:User)-[:CREATED]->(p:Post) WHERE NOT (:User{ID: 2})-[:VIEWED]->(p) RETURN p.ID
Plan
| Plan | Statement | Version | Planner | Runtime | Time | DbHits | Rows | Memory (Bytes) |
+-----------------------------------------------------------------------------------------------------------+
| "PROFILE" | "READ_ONLY" | "CYPHER 4.1" | "COST" | "INTERPRETED" | 1033 | 3721750 | 10 | 6696240 |
+-----------------------------------------------------------------------------------------------------------+
+------------------------------+-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| Operator | Details | Estimated Rows | Rows | DB Hits | Cache H/M | Memory (Bytes) | Ordered by |
+------------------------------+-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| +ProduceResults#neo4j | `p.ID` | 2158 | 10 | 0 | 0/0 | | |
| | +-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| +Projection#neo4j | p.ID AS `p.ID` | 2158 | 10 | 10 | 0/0 | | |
| | +-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| +Filter#neo4j | u:User | 2158 | 10 | 10 | 0/0 | | |
| | +-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| +Expand(All)#neo4j | (p)<-[anon_15:CREATED]-(u) | 2158 | 10 | 20 | 0/0 | | |
| | +-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| +AntiSemiApply#neo4j | | 2158 | 10 | 0 | 0/0 | | |
| |\ +-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| | +Expand(Into)#neo4j | (anon_47)-[anon_61:VIEWED]->(p) | 233 | 0 | 3695819 | 0/0 | 6696240 | anon_47.ID ASC |
| | | +-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| | +NodeUniqueIndexSeek#neo4j | UNIQUE anon_47:User(ID) WHERE ID = $autoint_0 | 8630 | 8630 | 17260 | 0/0 | | anon_47.ID ASC |
| | +-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+
| +NodeByLabelScan#neo4j | p:Post | 8630 | 8630 | 8631 | 0/0 | | |
+------------------------------+-----------------------------------------------+----------------+------+---------+-----------+----------------+----------------+

Yes, this can be improved.
First, let's understand what this is doing.
It starts with a NodeByLabelScan. That makes sense, there's no avoiding that.
But then, for every node of the label (the following executes PER ROW!), it matches to user 2, and expands all :VIEWED relationships from user 2 to see if any of them is the post for that particular row.
Can you see why this is inefficient? There are 8630 post nodes according to the PROFILE plan, so user 2 is looked up by index 8630 times, and their :VIEWED relationships are expanded 8630 times. Why 8630 times? Because this is happening per :Post node.
Instead, try this:
MATCH (:User{ID: 2})-[:VIEWED]->(viewedPost)
WITH collect(viewedPost) as viewedPosts
MATCH (:User)-[:CREATED]->(p:Post)
WHERE NOT p IN viewedPosts
RETURN p.ID
This changes things up a bit.
First it matches to user 2's viewed posts (the lookup and expansion is performed only once), then those viewed posts are collected.
Then it will do a label scan, and filter such that the post isn't in the collection of viewed posts.
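The difference between the two query shapes can be sketched outside the database. The Python below is only an analogy under made-up data (all_posts, viewed_by_user and both function names are hypothetical): it counts how often the user's viewed posts are fetched in each approach. Using a set for membership is a choice of this sketch, not a claim about how Neo4j evaluates IN on a collected list.

```python
# Hypothetical stand-ins for the graph: post IDs and the posts each user viewed.
all_posts = list(range(1, 11))
viewed_by_user = {2: [1, 2, 3, 4, 5, 6, 7, 8]}


def unviewed_per_row(posts, viewed, user_id):
    """Shape of the original plan: fetch and scan the user's views for every post."""
    result = []
    lookups = 0
    for post in posts:
        views = viewed[user_id]  # repeated once per post row
        lookups += 1
        if post not in views:    # linear scan of the viewed list
            result.append(post)
    return result, lookups


def unviewed_collect_once(posts, viewed, user_id):
    """Shape of the rewritten query: collect the viewed posts once, then filter."""
    viewed_posts = set(viewed[user_id])  # single lookup, like collect(viewedPost)
    return [p for p in posts if p not in viewed_posts], 1


print(unviewed_per_row(all_posts, viewed_by_user, 2))
print(unviewed_collect_once(all_posts, viewed_by_user, 2))
```

Both functions return the same posts; only the number of lookups differs (once per post versus once in total).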

InfluxDB: select the time difference between two rows having the same field value

I have a table like this on InfluxDB:
+---------------+-----------------+--------+--------------------------+
| time | sequence_number | action | session_id |
+---------------+-----------------+--------+--------------------------+
| 1433322591220 | 270001 | delete | 556d85bfe26c3b3864617605 |
| 1433322553324 | 250001 | delete | 556d88e4e26c3b3b83c99d32 |
| 1433241828472 | 230001 | create | 556d88e4e26c3b3b83c99d32 |
| 1433241023633 | 80001 | create | 556d85bfe26c3b3864617605 |
| 1433239305306 | 70001 | create | 556d7f09e26c3b34e872b2ba |
+---------------+-----------------+--------+--------------------------+
Now I want to find how long each session lasted from creation to deletion, that is, the time where action=delete minus the time where action=create for rows with the same session_id.
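For reference, the calculation being asked for can be sketched client-side in Python. This is not an InfluxDB query, just the intended arithmetic on the rows shown above; the function name session_durations is made up, and treating the time column as epoch milliseconds is an assumption based on the 13-digit values.

```python
# Rows copied from the table above: (time, action, session_id).
# Assumption: time is in epoch milliseconds.
rows = [
    (1433322591220, "delete", "556d85bfe26c3b3864617605"),
    (1433322553324, "delete", "556d88e4e26c3b3b83c99d32"),
    (1433241828472, "create", "556d88e4e26c3b3b83c99d32"),
    (1433241023633, "create", "556d85bfe26c3b3864617605"),
    (1433239305306, "create", "556d7f09e26c3b34e872b2ba"),
]


def session_durations(points):
    """Return {session_id: delete time - create time} for sessions that have both."""
    created, deleted = {}, {}
    for time_ms, action, session_id in points:
        if action == "create":
            created[session_id] = time_ms
        elif action == "delete":
            deleted[session_id] = time_ms
    return {sid: deleted[sid] - created[sid] for sid in created if sid in deleted}


durations = session_durations(rows)
for session_id, elapsed in durations.items():
    print(session_id, elapsed)
```

The session 556d7f09e26c3b34e872b2ba has no delete row, so it is left out of the result.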

Heroku not listing pg_search_documents table

This is pretty weird, but for some reason Heroku doesn't seem to show the pg_search_documents table when I list tables using the heroku-sql-console.
>> heroku sql
SQL> show tables
+------------------------+
| table_name |
+------------------------+
| activity_notifications |
| attachments |
| businesses |
| color_modes |
| comments |
| counties |
| customer_employees |
| customers |
| delayed_jobs |
| file_imports |
| invitations |
| invoices |
| jobs |
| paper_stocks |
| paper_weights |
| quotes |
| rails_admin_histories |
| schema_migrations |
| tax_rates |
| users |
+------------------------+
As you can see, no mention of pg_search.
Then, in the same session,
SQL> select * from pg_search_documents;
+---------------------------------------------------------------------------------------------------------------------+
| id | content | searchable_id | searchable_type | created_at | updated_at |
+---------------------------------------------------------------------------------------------------------------------+
| 3 | Energy Centre | 3 | Customer | 2012-12-03 19:33:55 -0800 | 2012-12-03 19:33:55 -0800 |
+---------------------------------------------------------------------------------------------------------------------+
It's also interesting that the show tables command lists only 20 tables whereas heroku pg:info says there are 21.
The reason this is a problem rather than a curiosity is because I can't get heroku db:pull to pull down the pg_search_documents table (everything else pulls fine) and I can't test migrations on production data.
I'm using PG Version: 9.1.6 on heroku and PostgreSQL 9.2.1 locally. Also PgSearch 0.5.7.
Any ideas what the issue is?

Unix script to count lines between patterns in output

I am trying to write a Unix script that counts the number of rows (here, 2) between
two "|DATE and TIME | XXXXXX |" lines.
Is there a method that combines egrep and wc -l?
Output:
|-------------------------------|
|DATE and TIME | XXXXXX |
|-------------------------------|
| 21-NOV-2012 15:56:51 | 1259 |
| 21-NOV-2012 15:56:51 | 1364 |
|-------------------------------|
|DATE and TIME | XXXXXX |
|-------------------------------|
| 21-NOV-2012 16:06:55 | 1259 |
| 21-NOV-2012 16:06:55 | 1364 |
|-------------------------------|

If the expected result is 9 (there are empty lines in your example, and I don't know if they should be counted):
awk '/DATE and TIME/&&!f{f=1;next;}/DATE and TIME/&&f{print x;exit;}f{x++}' file
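The awk one-liner is a small state machine: skip until the first header, count until the second, then stop. A Python sketch of the same idea follows, using the output as it appears in the question here (without empty lines). The names count_between and is_data_row and the optional keep filter are additions for illustration, not part of the awk command.

```python
def count_between(lines, marker="DATE and TIME", keep=lambda line: True):
    """Count lines between the first and second line containing marker.

    keep decides which of those lines are counted (all of them by default).
    Returns None if the marker appears fewer than twice.
    """
    seen_first = False
    count = 0
    for line in lines:
        if marker in line:
            if seen_first:
                return count  # second header reached: stop counting
            seen_first = True
            continue
        if seen_first and keep(line):
            count += 1
    return None


def is_data_row(line):
    """Treat anything that is not blank and not a |-----| rule as a data row."""
    return bool(line.strip()) and not line.startswith("|-")


sample = [
    "|-------------------------------|",
    "|DATE and TIME | XXXXXX |",
    "|-------------------------------|",
    "| 21-NOV-2012 15:56:51 | 1259 |",
    "| 21-NOV-2012 15:56:51 | 1364 |",
    "|-------------------------------|",
    "|DATE and TIME | XXXXXX |",
    "|-------------------------------|",
    "| 21-NOV-2012 16:06:55 | 1259 |",
    "| 21-NOV-2012 16:06:55 | 1364 |",
    "|-------------------------------|",
]

print(count_between(sample))                    # every line between the headers
print(count_between(sample, keep=is_data_row))  # data rows only
```

On this sample the first call counts 4 lines (two rules and two data rows) and the second counts the 2 data rows the question asks about.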

Access violation while the program was idle - no trace information to track down the bug

I have a program that just popped up an AV. Until now EurekaLog could find the source code line that generated the error, but now it displays only this:
Access violation at address 7E452E4E in module 'USER32.dll'. Read of address 00000015.
Call Stack Information:
--------------------------------------------------------------------------------------------
|Address |Module |Unit |Class|Procedure/Method |Line |
--------------------------------------------------------------------------------------------
|Running Thread: ID=2640; Priority=0; Class=; [Main] |
|------------------------------------------------------------------------------------------|
|77F16A7E|GDI32.dll | | |IntersectClipRect | |
|7E433000|USER32.dll | | |EditWndProc | |
|7E42A993|USER32.dll | | |CallWindowProcA | |
|7E42A97D|USER32.dll | | |CallWindowProcA | |
|7E429011|USER32.dll | | |OffsetRect | |
|7E4196C2|USER32.dll | | |DispatchMessageA | |
|7E4196B8|USER32.dll | | |DispatchMessageA | |
|00625E13|Amper.exe |Amper.DPR | | |76[16]|
|7C915511|ntdll.dll | | |RtlFindActivationContextSectionString| |
|7C915D61|ntdll.dll | | |RtlFindCharInUnicodeString | |
|7C910466|ntdll.dll | | |RtlFreeUnicodeString | |
|7C80B87C|kernel32.dll | | |IsDBCSLeadByte | |
|7C9113ED|ntdll.dll | | |RtlDeleteCriticalSection | |
|7C80EEF5|kernel32.dll | | |FindClose | |
|7C901000|ntdll.dll | | |RtlEnterCriticalSection | |
|7C912CFF|ntdll.dll | | |LdrLockLoaderLock | |
|7C9010E0|ntdll.dll | | |RtlLeaveCriticalSection | |
|7C912D19|ntdll.dll | | |LdrUnlockLoaderLock | |
|7C9166C1|ntdll.dll | | |LdrGetDllHandleEx | |
|7C9166B3|ntdll.dll | | |LdrGetDllHandle | |
|7C9166A0|ntdll.dll | | |LdrGetDllHandle | |
|7C912A8D|ntdll.dll | | |RtlUnicodeToMultiByteN | |
|7C912C21|ntdll.dll | | |RtlUnicodeStringToAnsiString | |
|7C901000|ntdll.dll | | |RtlEnterCriticalSection | |
|7C912CC9|ntdll.dll | | |LdrLockLoaderLock | |
|7C912CFF|ntdll.dll | | |LdrLockLoaderLock | |
|7C9010E0|ntdll.dll | | |RtlLeaveCriticalSection | |
|7C912D19|ntdll.dll | | |LdrUnlockLoaderLock | |
|7C90CF78|ntdll.dll | | |ZwAllocateVirtualMemory | |
|7C90CF6E|ntdll.dll | | |ZwAllocateVirtualMemory | |
|7C9010E0|ntdll.dll | | |RtlLeaveCriticalSection | |
|7C80BA57|kernel32.dll | | |VirtualQueryEx | |
|7C80BA40|kernel32.dll | | |VirtualQueryEx | |
|7C80BA81|kernel32.dll | | |VirtualQuery | |
|7C901000|ntdll.dll | | |RtlEnterCriticalSection | |
|7C912CC9|ntdll.dll | | |LdrLockLoaderLock | |
|7C912CFF|ntdll.dll | | |LdrLockLoaderLock | |
|7C9010E0|ntdll.dll | | |RtlLeaveCriticalSection | |
--------------------------------------------------------------------------------------------
The program was totally idle while I got the error and its window was hidden by other windows. FastMM is active and set to full debug but it indicates no memory overwrite.
Any hints about how to find the origin of this AV?
Win XP, Delphi 7

I don't see an EditWndProc() method in user32.dll, but Delphi has a couple -- one dealing with combobox messages and one dealing with tree views. Given MS's comctrl mess, I'd guess you have a tree view?
Check your tree view stuff. Given IntersectClipRect's parameters, it's easy to guess that it's being passed an invalid device context -- so...are you doing any custom painting for your tree view? If so, are you checking that the canvas handle is not nil before you begin painting (try assertions if nothing else)?

I just wonder what's on line 76[16] in Amper.exe... That line number might be a hint of the location of the error.
Then again, when it's just happening during an idle moment then it basically happens when the system is processing Windows messages like the mouse moving, keyboard events, timer updates and a lot more.
It sometimes helps to search for the error message plus code. I've done a quick scan and found this KB from MS which suggests that this kind of error can happen when you call certain Windows APIs with invalid parameters. That KB doesn't apply to your error, but it gives you an idea of what to check: any Windows API call you make in your own code.
Does it also generate this exception in the IDE, while you're debugging?

That's what EurekaLog does when it has nothing to work with. You need to rebuild and have the linker produce a detailed map file. That's how it knows what to apply its stack trace to.
