Finding which child is using up all my memory in Erlang

I am troubleshooting a crashing Erlang program. It runs out of memory. It has several children started by OTP (one_for_one in the supervisor), and some started with spawn.
I start the program and drop into the Erlang prompt, (test@test)1>. I'd like to see how much memory each of these children is using from there. I've searched online and not found anything, but this seems like a common enough need to already have a solution.
How can I find the memory utilization of each child, in Erlang, from the system prompt?

Did you try observer?
When you get the prompt, type observer:start(). Then, in the Application tab, you can see all the applications and, for each of them, their processes. For each process you can get the memory usage by opening its process_info subwindow.

Try erlang:process_info/2 with memory in ItemList
process_info(Pid, ItemList) -> InfoTupleList | [] | undefined
Types
Pid = pid()
ItemList = [Item]
Item = process_info_item()
InfoTupleList = [InfoTuple]
InfoTuple = process_info_result_item()
process_info_item() =
backtrace |
binary |
catchlevel |
current_function |
current_location |
current_stacktrace |
dictionary |
error_handler |
garbage_collection |
garbage_collection_info |
group_leader |
heap_size |
initial_call |
links |
last_calls |
memory |
message_queue_len |
messages |
min_heap_size |
min_bin_vheap_size |
monitored_by |
monitors |
message_queue_data |
priority |
reductions |
registered_name |
sequential_trace_token |
stack_size |
status |
suspending |
total_heap_size |
trace |
trap_exit
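Putting the two answers together at the shell, here is a sketch (my_sup is a placeholder for your supervisor's registered name) that lists the supervised children with their memory use, then ranks every process on the node by memory, which also catches the plain spawn'ed processes no supervisor knows about:

```erlang
%% Memory footprint of each supervised child. `my_sup' is a placeholder
%% for your supervisor's registered name.
[{Id, Pid, erlang:process_info(Pid, memory)}
 || {Id, Pid, _Type, _Mods} <- supervisor:which_children(my_sup)].

%% Top 5 processes on the whole node by memory. The inner generator
%% skips processes that died mid-scan (process_info returns undefined,
%% which does not match {memory, M}).
lists:sublist(
  lists:reverse(
    lists:keysort(2,
      [{P, M} || P <- erlang:processes(),
                 {memory, M} <- [erlang:process_info(P, memory)]])),
  5).
```

The memory item counts heap, stack, and internal structures in bytes, so it is a reasonable first-pass indicator of which child to blame.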

Slowly increasing memory usage of Dask Scheduler

I'm running a test:
import time
from random import randint

from dask.distributed import Client

client = Client('127.0.0.1:8786')

def x(i):
    return {}

while True:
    start = time.time()
    a = client.submit(x, randint(0, 1000000))
    res = a.result()
    del a
    end = time.time()
    print("Ran on %s with res %s" % (end - start, res))

client.shutdown()
del client
I used it (with more code) to get an estimate of my queries performance. But for this example I've removed all things I could think of.
The above code leaks roughly 0.1 MB per second, which I would guesstimate to roughly 0.3MB per 1000 calls.
Am I doing something wrong in my code?
My Python debugging skills are a bit rusty (by which I mean I last used objgraph on Orbited, the precursor to websockets, in 2009: https://pypi.python.org/pypi/orbited), but from what I can see, checking the number of references before and after:
Counting objects in the scheduler, before and after using objgraph.show_most_common_types()
| What        | Before | After | Diff  |
|-------------|--------|-------|-------|
| function    | 33318  | 33399 | 81    |
| dict        | 17988  | 18277 | 289   |
| tuple       | 16439  | 28062 | 11623 |
| list        | 10926  | 11257 | 331   |
| OrderedDict | N/A    | 7168  | 7168  |
It's not a huge amount of RAM in any case, but digging deeper I found that scheduler._transition_counter is 11453 and scheduler.transition_log is filled with:
('x-25ca747a80f8057c081bf1bca6ddd481', 'released', 'waiting',
OrderedDict([('x-25ca747a80f8057c081bf1bca6ddd481', 'processing')]), 4121),
('x-25ca747a80f8057c081bf1bca6ddd481', 'waiting', 'processing', {}, 4122),
('x-25cb592650bd793a4123f2df39a54e29', 'memory', 'released', OrderedDict(), 4123),
('x-25cb592650bd793a4123f2df39a54e29', 'released', 'forgotten', {}, 4124),
('x-25ca747a80f8057c081bf1bca6ddd481', 'processing', 'memory', OrderedDict(), 4125),
('x-b6621de1a823857d2f206fbe8afbeb46', 'released', 'waiting', OrderedDict([('x-b6621de1a823857d2f206fbe8afbeb46', 'processing')]), 4126)
First error on my part
Which of course led me to realise the first error on my part was not configuring transition-log-length.
After setting configuration transition-log-length to 10:
| What           | Before | After | Diff |
|----------------|--------|------|------|
| function       | 33323  | 33336 | 13   |
| dict           | 17987  | 18120 | 133  |
| tuple          | 16530  | 16342 | -188 |
| list           | 10928  | 11136 | 208  |
| _lru_list_elem | N/A    | 5609  | 5609 |
A quick google found that _lru_list_elem is made by functools.lru_cache, which in turn is invoked in key_split (in distributed/utils.py).
That is the LRU cache, of up to 100 000 items.
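The capped behaviour of an lru_cache is easy to see in isolation. Below, key_split_like is a made-up stand-in for distributed's key_split, and the maxsize is shrunk from 100 000 to 100 so the demo runs instantly:

```python
from functools import lru_cache

# Hypothetical stand-in for key_split in distributed/utils.py; the real
# one uses maxsize=100000, shrunk here to 100.
@lru_cache(maxsize=100)
def key_split_like(key):
    # Strip the trailing hash from keys shaped like 'x-<hash>'
    return key.rsplit('-', 1)[0]

# Feed it ten times more distinct keys than the cache can hold
for i in range(1000):
    key_split_like('x-%d' % i)

# currsize is capped at maxsize: old entries are evicted, so the cache
# contributes bounded memory, not a leak
print(key_split_like.cache_info().currsize)  # 100
```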
Second try
Based on the code, it appears as if Dask should climb up to roughly 100k _lru_list_elem entries.
After running my script again and watching the memory, it climbs quite fast until I approach 100k _lru_list_elem; afterwards it stops climbing almost entirely, pretty much flat-lining after 100k.
So no leak, but it was fun to get my hands dirty in the Dask source code and Python memory profilers.
For diagnostic, logging, and performance reasons the Dask scheduler keeps records on many of its interactions with workers and clients in fixed-sized deques. These records do accumulate, but only to a finite extent.
We also try to ensure that we don't keep around anything that would be too large.
Seeing memory use climb up until a nice round number like what you've seen and then stay steady seems to be consistent with this.
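The fixed-size record keeping described above behaves like Python's bounded deques. A minimal sketch (the tuples are made-up transition-log entries, and maxlen=10 matches the transition-log-length setting from the question):

```python
from collections import deque

# A deque with maxlen discards its oldest entries once full, so memory
# stays bounded no matter how many transitions are recorded.
log = deque(maxlen=10)
for i in range(1000):
    log.append(('x-%d' % i, 'released', 'waiting'))

print(len(log))    # 10
print(log[0][0])   # 'x-990': only the ten most recent entries remain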

Neo4j CSV import query super slow, when setting relationships

I am trying to evaluate Neo4j (using the community version).
I am importing some data (1 million rows) using the LOAD CSV process. It needs to match previously imported nodes to create a relationship between them.
Here is my query:
//Query #3
//create edges between Tr and Ad nodes
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///1M.txt'
AS line
FIELDTERMINATOR '\t'
//find appropriate tx and ad
MATCH (tx:Tr { txid: TOINT(line.txid) }), (ad:Ad {p58: line.p58})
//create the edge (relationship)
CREATE (tx)-[out:OUT_TO]->(ad)
//set properties on the edge
SET out.id= TOINT(line.id)
SET out.n = TOINT(line.n)
SET out.v = TOINT(line.v)
I have indices on:
Indexes
ON :Ad(p58) ONLINE (for uniqueness constraint)
ON :Tr(txid) ONLINE
ON :Tr(h) ONLINE (for uniqueness constraint)
This query has been running for 5 days now and it has so far created 270K relationships (out of 1M).
Java heap is 4g
Machine has 32G of RAM and an SSD for a drive, only running linux and Neo4j
Any hints to speed this process up would be highly appreciated.
Should I try the enterprise edition?
Query Plan:
+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
If a part of a query contains multiple disconnected patterns,
this will build a cartesian product between all those parts.
This may produce a large amount of data and slow down query processing.
While occasionally intended,
it may often be possible to reformulate the query that avoids the use of this cross product,
perhaps by adding a relationship between the different parts or by using OPTIONAL MATCH (identifier is: (ad))
20 ms
Compiler CYPHER 3.0
Planner COST
Runtime INTERPRETED
+---------------------------------+----------------+---------------------+----------------------------+
| Operator | Estimated Rows | Variables | Other |
+---------------------------------+----------------+---------------------+----------------------------+
| +ProduceResults | 1 | | |
| | +----------------+---------------------+----------------------------+
| +EmptyResult | | | |
| | +----------------+---------------------+----------------------------+
| +Apply | 1 | line -- ad, out, tx | |
| |\ +----------------+---------------------+----------------------------+
| | +SetRelationshipProperty(4) | 1 | ad, out, tx | |
| | | +----------------+---------------------+----------------------------+
| | +CreateRelationship | 1 | out -- ad, tx | |
| | | +----------------+---------------------+----------------------------+
| | +ValueHashJoin | 1 | ad -- tx | ad.p58; line.p58 |
| | |\ +----------------+---------------------+----------------------------+
| | | +NodeIndexSeek | 1 | tx | :Tr(txid) |
| | | +----------------+---------------------+----------------------------+
| | +NodeUniqueIndexSeek(Locking) | 1 | ad | :Ad(p58) |
| | +----------------+---------------------+----------------------------+
| +LoadCSV | 1 | line | |
+---------------------------------+----------------+---------------------+----------------------------+
OKAY, so by splitting the MATCH statement into two it sped up the query immensely. Thanks @William Lyon for pointing me to the Plan. I noticed the warning.
The old MATCH statement:
MATCH (tx:Tr { txid: TOINT(line.txid) }), (ad:Ad {p58: line.p58})
split into two:
MATCH (tx:Tr { txid: TOINT(line.txid) })
MATCH (ad:Ad {p58: line.p58})
On 750K relationships the query took 83 seconds.
Next up: the 9 million row CSV LOAD.
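For reference, assuming the same file and properties as in the question, the full import with the split MATCH would look like:

```cypher
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///1M.txt'
AS line
FIELDTERMINATOR '\t'
MATCH (tx:Tr { txid: TOINT(line.txid) })
MATCH (ad:Ad { p58: line.p58 })
CREATE (tx)-[out:OUT_TO]->(ad)
SET out.id = TOINT(line.id)
SET out.n = TOINT(line.n)
SET out.v = TOINT(line.v)
```

With two separate MATCH clauses the planner can use each index seek on its own row, instead of joining the two lookups per CSV line as the original plan's ValueHashJoin did.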

delphi call delphi dll memory leak caused by pchar

So let's talk about DLLs.
If you want to pass a string to a DLL call, you must make the procedure's parameter a PChar. Otherwise you get data corruption.
So we say that our DLL import is:
procedure LookPchar(pfff: PChar); stdcall; external 'OutDll.dll';
which is nice. Now let's look at what we declare in the DLL's dpr:
procedure LookPchar(pfff: PChar);
begin
  with TForm1.Create(nil) do
  try
    Show;
    FireDacConnection.ConnectionName := Copy(pfff, 1, 100);
  finally
    Free;
  end;
end;

exports LookPchar;
Well, in the DLL we have a form with a FireDacConnection on it, but any component or object in it will do.
The problem is that this PChar is released twice and causes memory leaks. I can't find a way to pass the PChar without causing memory leaks.
You may use FastMM; I use EurekaLog, which writes:
|+Leak #2: Type=UnicodeString: Ref count - 1, Content: "\r\n"; Total size=18; Count=1 |
Why does the UnicodeString get a ref count of -1? How do I prevent it? How do I pass the Unicode string correctly?
What I tried:
Passing it as const.
Copying it (as in the example, and with StrPCopy and StrCopy).
Using a local variable to hold a copy of the PChar.
Edit:
Adding the calling code:
var
  ConnectionName: WideString;
begin
  ConnectionName := 'This Is My String';
  LookPChar(PChar(ConnectionName));
end;
Adding the leak log dump:
|+Leak #2: Type=UnicodeString: Ref count - 1, Content: "\r\n"; Total size=18; Count=1 |

Call stack (the EurekaLog columns were mangled by the editor; cleaned up to module | unit | routine):

outDll.dll | System         | _NewUnicodeString
myapp.exe  | (caller)       | TForm2.Button4Click
myapp.exe  | Vcl.Controls   | TControl.Click
myapp.exe  | Vcl.Controls   | TWinControl.WndProc
myapp.exe  | Vcl.StdCtrls   | TButtonControl.WndProc
myapp.exe  | Vcl.Controls   | DoControlMsg
myapp.exe  | Vcl.Controls   | TWinControl.WndProc
myapp.exe  | Vcl.Forms      | TCustomForm.WndProc
myapp.exe  | Vcl.Controls   | TWinControl.MainWndProc
myapp.exe  | System.Classes | StdWndProc
user32.dll / ntdll.dll / comctl32.dll | window-message dispatch (SendMessageW, KiUserCallbackDispatcher, LoadIconMetric, ...)
myapp.exe  | Vcl.Controls   | TWinControl.DefaultHandler
myapp.exe  | Vcl.Controls   | TWinControl.WndProc
myapp.exe  | Vcl.StdCtrls   | TButtonControl.WndProc
myapp.exe  | System.Classes | StdWndProc
user32.dll | USER32         | DispatchMessageW
Sorry it's unclear; I don't know how to keep tabs in the Stack Overflow editor.
Copy(pfff,1,100) is rather odd. You can use pfff directly and have the compiler automatically convert from a pointer to a null-terminated character array to a string.
FireDacConnection.ConnectionName := pfff;
It would surely make sense to do that before calling Show. It certainly seems pretty weird that you show a form modeless, then set the connection name, and then free the form. Indeed, even showing a form in a DLL looks odd.
That said, this isn't the cause of your problem. The only explanation for a leak in your code is a calling convention mismatch, or an error at the call site. Passing a PChar, and taking a copy, as you do, won't leak.
The calling convention in the implementation appears to be register. The declaration in your DLL should be:
procedure LookPchar(pfff:Pchar); stdcall;
Or did you not show the stdcall in the DLL code?
You might have made a mistake at the call site. Perhaps the leak is there. We cannot see that code.
Looking at your various edits, FastMM is reporting a leak that is not produced by any of the code in the question. You will need to isolate the issue before you can solve it. That's your next step.
Using PChar is fine for input. In the other direction, from callee to caller, there are many options, but you have not asked about that here. And there are many many questions on that topic.

Neo4j - count very slow

I am running this query (bisac_code is uniquely indexed).
Execution time is more than 2.5 minutes.
52 main codes are selected from almost 4000 in total.
The total number of wokas is very large, 19 million nodes.
Are there any possibilities to make it run faster?
neo4j-sh (?)$ MATCH (b:Bisac)-[r:INCLUDED_IN]-(w:Woka)
> WHERE (b.bisac_code =~ '.*000000')
> RETURN b.bisac_code as bisac_code, count(w) as wokas_count
> ORDER BY b.bisac_code
> ;
+---------------------------+
| bisac_code | wokas_count |
+---------------------------+
| "ANT000000" | 13865 |
| "ARC000000" | 32905 |
| "ART000000" | 79600 |
| "BIB000000" | 2043 |
| "BIO000000" | 256082 |
| "BUS000000" | 226173 |
| "CGN000000" | 16424 |
| "CKB000000" | 26410 |
| "COM000000" | 44922 |
| "CRA000000" | 18720 |
| "DES000000" | 2713 |
| "DRA000000" | 62610 |
| "EDU000000" | 228182 |
| "FAM000000" | 42951 |
| "FIC000000" | 474004 |
| "FOR000000" | 41999 |
| "GAM000000" | 8803 |
| "GAR000000" | 37844 |
| "HEA000000" | 36939 |
| "HIS000000" | 3908869 |
| "HOM000000" | 5123 |
| "HUM000000" | 29270 |
| "JNF000000" | 40396 |
| "JUV000000" | 200144 |
| "LAN000000" | 89059 |
| "LAW000000" | 153138 |
| "LCO000000" | 1528237 |
| "LIT000000" | 89611 |
| "MAT000000" | 58134 |
| "MED000000" | 80268 |
| "MUS000000" | 75997 |
| "NAT000000" | 35991 |
| "NON000000" | 107513 |
| "OCC000000" | 42134 |
| "PER000000" | 26989 |
| "PET000000" | 4980 |
| "PHI000000" | 72069 |
| "PHO000000" | 8546 |
| "POE000000" | 104609 |
| "POL000000" | 309153 |
| "PSY000000" | 55710 |
| "REF000000" | 96477 |
| "REL000000" | 133619 |
| "SCI000000" | 86017 |
| "SEL000000" | 40901 |
| "SOC000000" | 292713 |
| "SPO000000" | 172284 |
| "STU000000" | 10508 |
| "TEC000000" | 77459 |
| "TRA000000" | 9093 |
| "TRU000000" | 12041 |
| "TRV000000" | 27706 |
+---------------------------+
52 rows
198310 ms
And the response time is not consistent; after a while it drops to less than half a minute.
52 rows
31207 ms
In Neo4j 2.3 there will be index support for prefix LIKE searches but probably not for postfix ones.
There are two ways of making @user2194039's solution faster:
Use a path expression to count the Wokas per Bisac:
MATCH (b:Bisac) WHERE (b.bisac_code =~ '.*000000')
WITH b, size((b)-[:INCLUDED_IN]->()) as wokas_count
RETURN b.bisac_code as bisac_code, wokas_count
ORDER BY b.bisac_code
Mark the Bisacs matching that pattern with a label:
MATCH (b:Bisac) WHERE (b.bisac_code =~ '.*000000') SET b:Main;
MATCH (b:Main:Bisac)
WITH b, size((b)-[:INCLUDED_IN]->()) as wokas_count
RETURN b.bisac_code as bisac_code, wokas_count
ORDER BY b.bisac_code;
The slow speed is caused by your regular expression pattern matching (=~ ). Although your bisac_code is indexed, the regex match causes the index to be ineffective. The index only works when you are matching full bisac_code values.
Cypher does include some string manipulation facilities that might let you get by without using a regex =~, but I doubt it would make any difference, because the index will still be useless.
I might suggest considering if you can further categorize your bisac_codes so that you do not need to do a pattern match. Maybe an extra indexed property that somehow denotes those codes that end in 000000?
If you do not want to add properties, you may try matching only the Bisacs first, and then including the Wokas. Something like this:
MATCH (b:Bisac) WHERE (b.bisac_code =~ '.*000000')
WITH b
MATCH (b)-[r:INCLUDED_IN]-(w:Woka)
RETURN b.bisac_code as bisac_code, count(w) as wokas_count
ORDER BY b.bisac_code
This may help Cypher stick to the 4000 Bisac nodes while doing the pattern match, before getting involved with all 19 million Woka nodes, but I am not sure if this will make a material difference. Even slogging through 4000 nodes (effectively without an index) is a slow process.
Hash Tables in Database Indexing
The reason that your index is ineffective for regex pattern matching is that Neo4j likely uses a hash table for indexing properties. This is common to many databases. Wikipedia has an article here.
The basics though are that the index is not storing all of the properties that you want to search through. It is storing values that represent the properties you want to search through, and the representation is only valid for the whole property. If you are searching for only a part of the property value, the hashes stored in the index are useless, and the database must search through the properties the old-fashioned way -- one by one.
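A toy illustration of that point, with a plain dict playing the role of an exact-match index over made-up bisac_code values:

```python
# An exact-match "index" over bisac_code values, modelled as a plain dict
# (made-up sample data).
index = {code: node_id
         for node_id, code in enumerate(['ANT000000', 'ANT001000', 'HIS000000'])}

# Exact lookup: one hash probe, effectively O(1).
print('HIS000000' in index)  # True

# Pattern lookup ('.*000000'): the hash of a full value says nothing about
# its suffix, so every key must be checked one by one, O(n).
matches = [code for code in index if code.endswith('000000')]
print(matches)  # ['ANT000000', 'HIS000000']
```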
Edit re: your edit
The improvement in response time after running this query multiple times is certainly due to caching. Neo4j is remembering that you access the Bisac nodes and bisac_code properties frequently, and is keeping them in memory. This makes future queries faster because the values do not need to be read off disk.
However, eventually those nodes and properties will likely be dropped from the cache, as Neo4j finds you manipulating different nodes, which it will cache instead. There are only so many nodes Neo4j can cache before running out of memory, so it picks the most recent and/or frequently used data.

Keeping JavaScript state in an ePub's spine

Is it possible to keep code state across pages in an ePub? More specifically do readers like iBooks allow this type of state?
spine.js
+---------+----------+
| | |
+--------+ +--------+ +--------+
| Page 1 | | Page 2 | | Page |
| Quiz1 | | Quiz2 | | (n) |
| | | | | Result |
| | | | | |
+--------+ +--------+ +--------+
In this example, the last page could contain a score but state is required. WebSQL is out of the question since it's not supported by webkit ereaders and websockets demand a connection. Any thoughts?
No. Each HTML file is independent. To share information, you'll need to use some kind of local storage such as window.localStorage, but it's very hard to find out what device supports what level of HTML5.
UPDATE: This thread says localStorage is in fact supported.
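Assuming the reading system does expose window.localStorage, the score could be accumulated across spine documents with something like the sketch below. The function and key names are made up, and storage is passed as a parameter so the sketch can be exercised outside a browser; in a real ePub each page would pass window.localStorage:

```javascript
// Save one quiz's score from that quiz's page.
function saveScore(quizId, score, storage) {
  storage.setItem('quiz:' + quizId, String(score));
}

// On the results page, sum whatever scores earlier pages stored;
// quizzes the reader skipped simply contribute 0.
function totalScore(quizIds, storage) {
  return quizIds.reduce(function (sum, id) {
    var v = storage.getItem('quiz:' + id);
    return sum + (v === null ? 0 : parseInt(v, 10));
  }, 0);
}
```

Whether the values actually survive from one spine item to the next still depends on the reader's localStorage support.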