gdb on iPad failing to dump memory - iOS

I'm attempting to look at the memory of an app running on my iPad. I've got the pid of the app and I'm able to attach to the process with gdb.
iPad:~/dev root# gdb -p 3839
GNU gdb 6.3.50.20050815-cvs (Fri May 20 08:08:42 UTC 2011)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "--host=arm-apple-darwin9 --target=".
/private/var/root/dev/3839: No such file or directory
Attaching to process 3839.
0x35824004 in ?? ()
(gdb)
However, when I attempt to dump one of the memory regions:
(gdb) info mach-regions
Region from 0x22000 to 0x24000 (---, max r-x; copy, private, not-reserved)
... from 0x24000 to 0x269000 (---, max r-x; copy, private, not-reserved)
[...]
... from 0x267a000 to 0x267b000 (---, max rwx; copy, private, not-reserved)
... from 0x267b000 to 0x267c000 (---, max rwx; copy, private, not-reserved)
... from 0x267c000 to 0x267d000 (---, max rwx; copy, private, not-reserved)
... from 0x267d000 to 0x267f000 (---, max rwx; copy, private, not-reserved)
... from 0x267f000 to 0x2680000 (---, max rwx; copy, private, not-reserved)
[...]
(gdb) dump binary memory output.bin 0x267c000 0x267d000
I get this error:
gdb stack crawl at point of internal error:
0 gdb 0x00170e74 internal_vproblem + 124
1 gdb 0x0016dd68 internal_verror + 52
2 gdb 0x0016dd94 align_down + 0
3 gdb 0x0016def8 gdb_check_fatal + 36
4 gdb 0x001eba9c mach_xfer_memory + 588
5 gdb 0x000c7ca8 default_xfer_partial + 256
6 gdb 0x000ca04c target_xfer_partial + 1004
7 gdb 0x000ca404 target_read_partial + 56
8 gdb 0x000ca488 target_read + 120
9 gdb 0x000ca674 target_read_memory + 68
10 gdb 0x00003f5c dump_memory_to_file + 276
11 gdb 0x0016c804 execute_command + 1160
12 gdb 0x000aba88 command_handler + 228
13 gdb 0x000acd94 command_line_handler + 768
14 gdb 0x002236b4 rl_callback_read_char + 160
../../gdb-1518/src/gdb/macosx/macosx-nat-mutils.c:772: internal-error: assertion failure in function "mach_xfer_memory": r_end >= cur_memaddr
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y
Once I exit gdb at that point, the app crashes as well.
It seems like I'm doing everything right here... why is this error happening? And how would one go about making it work?
I'm running iOS 5.1.1 on the first generation iPad.

Ah, apparently the version of GDB that ships with Cydia doesn't work on iOS 5. These instructions worked for me:
GNU Debugger (gdb) is used to analyze the runtime behavior of an iOS application. In recent iOS versions, the GNU Debugger downloaded directly from Cydia is broken and not functioning properly. Following the Pod 2g blog post also did not help me.
To get rid of this problem, add http://cydia.radare.org as a Cydia source and download the latest GNU Debugger (build 1708). GDB build 1708 works on iOS 5.x.
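With build 1708 installed, the commands to retry are just the ones from the session above; detaching before quitting should also avoid killing the target app when you're done. A rough sketch, reusing the pid and addresses from my session:
gdb -p 3839
(gdb) info mach-regions
(gdb) dump binary memory output.bin 0x267c000 0x267d000
(gdb) detach
(gdb) quit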

App terminated due to CPU use in iOS 13.2

I have a location-service-based tracking and geofencing app which would run for days and weeks in the background on iOS 12.2 and later devices.
Now with iOS 13.2 the app gets terminated after a variable amount of time, but at least several hours, due to excessive CPU usage:
Date/Time: 2019-11-09 23:25:18 +0200
End time: 2019-11-09 23:26:06 +0200
OS Version: iPhone OS 13.2 (Build 17B84)
Architecture: arm64
Report Version: 29
Incident Identifier: 5B46660C-A347-477F-8AE2-B1401080892B
Data Source: Microstackshots
Shared Cache: 0x44db8000 94FD24C8-F407-3A82-8D27-367F5B6C7BEC
Command: Anchorwatch
Path: /private/var/containers/Bundle/Application/32CD03B2-449C-4E84-8E06-79FA9B50F3A9/Anchorwatch.app/Anchorwatch
Identifier: de.sioned.Anchorwatch
Version: 2.2.1 (14)
Beta Identifier: 68A95EFC-B341-476C-9277-D711242471EC
PID: 57424
Event: cpu usage
Action taken: Process killed
CPU: 48 seconds cpu time over 48 seconds (99% cpu average), exceeding limit of 80% cpu over 60 seconds
CPU limit: 48s
Limit duration: 60s
CPU used: 48s
CPU duration: 48s
Duration: 48.39s
Duration Sampled: 14.08s
Steps: 16
Hardware model: iPad6,12
Active cpus: 2
Heaviest stack for the target process:
16 ??? (libsystem_pthread.dylib + 49032) [0x1c4f64f88]
16 ??? (libdispatch.dylib + 74516) [0x1c4ecb314]
16 ??? (libdispatch.dylib + 36344) [0x1c4ec1df8]
16 ??? (libdispatch.dylib + 33488) [0x1c4ec12d0]
16 ??? (libdispatch.dylib + 104312) [0x1c4ed2778]
16 ??? (libdispatch.dylib + 33948) [0x1c4ec149c]
16 ??? (libdispatch.dylib + 124996) [0x1c4ed7844]
16 ??? (libdispatch.dylib + 125216) [0x1c4ed7920]
16 ??? (libsystem_kernel.dylib + 158196) [0x1c50419f4]
Powerstats for: Anchorwatch [57424]
Bundle ID: de.sioned.Anchorwatch
Adam ID: 0
Is first party: No
App version: 2.2.1
Build version: 14
Is Beta: No
Share with Devs: No
UUID: 1C943425-70F9-3670-98D0-45D3051B4BB7
Path: /private/var/containers/Bundle/Application/32CD03B2-449C-4E84-8E06-79FA9B50F3A9/Anchorwatch.app/Anchorwatch
Architecture: arm64
Footprint: 1046.66 MB
Start time: 2019-11-09 23:25:52 +0200
End time: 2019-11-09 23:26:06 +0200
Num samples: 16 (100%)
CPU Time: 13.971s
Primary state: 15 samples Non-Frontmost App, Non-Suppressed, Kernel mode, Effective Thread QoS Background, Requested Thread QoS Default, Override Thread QoS Unspecified
User Activity: 0 samples Idle, 0 samples Active, 16 samples Unknown
Power Source: 0 samples on Battery, 0 samples on AC, 16 samples Unknown
16 _pthread_wqthread + 275 (libsystem_pthread.dylib + 49032) [0x1c4f64f88]
16 _dispatch_workloop_worker_thread + 587 (libdispatch.dylib + 74516) [0x1c4ecb314]
16 _dispatch_lane_invoke$VARIANT$mp + 419 (libdispatch.dylib + 36344) [0x1c4ec1df8]
16 _dispatch_lane_serial_drain$VARIANT$mp + 299 (libdispatch.dylib + 33488) [0x1c4ec12d0]
16 _dispatch_mach_invoke$VARIANT$mp + 471 (libdispatch.dylib + 104312) [0x1c4ed2778]
16 _dispatch_lane_serial_drain$VARIANT$mp + 759 (libdispatch.dylib + 33948) [0x1c4ec149c]
16 _dispatch_event_loop_drain$VARIANT$mp + 315 (libdispatch.dylib + 124996) [0x1c4ed7844]
16 _dispatch_kq_drain + 123 (libdispatch.dylib + 125216) [0x1c4ed7920]
16 kevent_id + 8 (libsystem_kernel.dylib + 158196) [0x1c50419f4]
1 <User mode>
Binary Images:
0x102d80000 - ??? Anchorwatch <1C943425-70F9-3670-98D0-45D3051B4BB7> /private/var/containers/Bundle/Application/32CD03B2-449C-4E84-8E06-79FA9B50F3A9/Anchorwatch.app/Anchorwatch
0x1c4eb9000 - 0x1c4f2dfff libdispatch.dylib <B7EED4C7-560D-3DA6-9B50-ED52A150AAC6> /usr/lib/system/libdispatch.dylib
0x1c4f59000 - 0x1c4f69fff libsystem_pthread.dylib <F8B082D8-24D9-3B1E-B80B-645FC8A88E14> /usr/lib/system/libsystem_pthread.dylib
0x1c501b000 - 0x1c5048fff libsystem_kernel.dylib <AE4C3D7A-7D08-33E7-BCC6-11AC821B4E48> /usr/lib/system/libsystem_kernel.dylib
While in the background, the app does nothing more than write each location update to a SQLite database and check the current location against a safety perimeter.
There is no reason to assume that the app would suddenly require such high CPU usage after several hours.
I don't have the slightest idea how to approach the problem, and I am not even sure whether I can do anything about it or whether it is a bug in 13.2 that I have to wait on a fix for.
My first assumption was that the old iOS 12.0 bug, which terminated background apps without reason and which was fixed in 12.2, was back. But this seems to be something new: I was able to run the test app from iOS 12 terminates apps in the background for no reason for several days.
Any hint on how to interpret the log and take action?
Edit:
It seems that this is related to iOS 13.2 message: nehelper sent invalid result code [1] for Wi-Fi information request.
The location service queries the WiFi interface at one-second intervals. Instruments shows that these query calls have a memory leak.
Turning off WiFi on the device seems to avoid the problem, though that is no real solution. I am now waiting for iOS 13.3 to leave beta.
It turned out that a third-party framework I am using makes calls to CNCopyNetworkInfo at one-second intervals. These calls could not be fulfilled because my app did not have the WiFi capability enabled, and thus the calls to CNCopyNetworkInfo caused small memory leaks that added up over time.
After enabling the WiFi access capability, the memory leaks vanished.
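For anyone looking for the same switch: the capability is "Access WiFi Information" in Xcode's Signing & Capabilities tab. As far as I know it maps to an entitlement along these lines in the app's .entitlements file (a sketch, not copied verbatim from my project):
<!-- "Access WiFi Information" capability; key name to the best of my knowledge -->
<key>com.apple.developer.networking.wifi-info</key>
<true/>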

"no next heap size found: 18446744071789822643, offset 0"

I've written a simulator, which is distributed over two hosts. When I launch a few thousand processes, after about 10 minutes and half a million events written, my main Erlang (OTP v22) virtual machine crashes with this message:
no next heap size found: 18446744071789822643, offset 0.
It's always that same number - 18446744071789822643.
Because my server is very capable, the crash dump is also huge and I can't view it on my headless server (no WX installed).
Are there any tips on what I can look at?
What would be the first things I can try out to debug this issue?
First, see what memory() says:
> memory().
[{total,18480016},
{processes,4615512},
{processes_used,4614480},
{system,13864504},
{atom,331273},
{atom_used,306525},
{binary,47632},
{code,5625561},
{ets,438056}]
Check which one is growing - processes, binary, ets?
If it's processes, try typing i(). in the Erlang shell while the processes are running. You'll see something like:
Pid Initial Call Heap Reds Msgs
Registered Current Function Stack
<0.0.0> otp_ring0:start/2 233 1263 0
init init:loop/1 2
<0.1.0> erts_code_purger:start/0 233 44 0
erts_code_purger erts_code_purger:wait_for_request 0
<0.2.0> erts_literal_area_collector:start 233 9 0
erts_literal_area_collector:msg_l 5
<0.3.0> erts_dirty_process_signal_handler 233 128 0
erts_dirty_process_signal_handler 2
<0.4.0> erts_dirty_process_signal_handler 233 9 0
erts_dirty_process_signal_handler 2
<0.5.0> erts_dirty_process_signal_handler 233 9 0
erts_dirty_process_signal_handler 2
<0.8.0> erlang:apply/2 6772 238183 0
erl_prim_loader erl_prim_loader:loop/3 5
Look for a process with a very big heap, and that's where you'd start looking for a memory leak.
(If you weren't running headless, I'd suggest starting Observer with observer:start(), and look at what's happening in the Erlang node.)
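If the node has already crashed and all you have is the huge erl_crash.dump on the headless box, you can still get a rough per-process ranking with ordinary shell tools instead of the wx-based crashdump_viewer. A sketch, assuming each =proc: section of the dump carries exactly one Stack+heap: line (the exact dump layout can vary between OTP releases):
# pair each process id with its Stack+heap size (in words) and list the biggest ones
grep -E '^(=proc:|Stack[+]heap:)' erl_crash.dump | paste - - | sort -k3 -rn | head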

PostScript file - Image instead of text

With a PostScript driver (Xerox, Canon, HP, any of them), when I create a PS file, for example when I print the test page from the printer properties, I get:
OK:
The rendered result is correct (in GSview, for example).
Not OK:
The file size is too big, more than 4 MB.
When I open the file in an editor, I see one big image (doNimage). I think that is the reason for the large file size.
The example file : https://drive.google.com/open?id=0B9bet657DEU5alV6WFZZdDFjMmc
I'm on Windows 10, with a similar problem on Windows Server 2012 R2.
I left the driver configuration at its defaults.
Does anyone have an idea?
Thanks a lot.
Regards.
I don't understand your problem; the file you posted a link to contains text. Here's an example:
360 4485 M <202530360E0F1102381030100D100B0824152D30103102020C302A1E19181B1E1730132E28301530132D3B02230B2A2E22081308>[46 16 28 70 18 42 44 44 54 32 28 32 36 32 25 39 65 40 40 28 32 44 44 44 18 28 53 45 20 47 38 45
40 28 34 40 40 28 40 28 34 40 18 44 44 25 53 40 16 39 34 0]xS
M is a moveto and xS uses the xshow operator to draw the glyphs represented by the character codes in the hexstring, using the values in the array to modify the width of each glyph.
If you were expecting to see ASCII character codes, you are going to be sadly disappointed: the file uses an incrementally downloaded subset TrueType font, so the character codes are defined as they are encountered; that is, the first glyph used is given character code 1, the second character code 2, and so on.
Even without that, using ASCII would limit the languages that could be supported. Back in the 1980s that maybe didn't seem like a problem, but it's a long time since that was considered acceptable.
If you were expecting to be able to modify the text by editing it in a text editor, forget it. PostScript is a programming language, and the output of a PostScript printer driver is a machine-generated program. It's a lengthy process even for a skilled user of the language to decipher what the program is doing. The program is not amenable to alteration; if there's a fault in the output, correct the original document and recreate the PostScript program from it.
PostScript is not an editable format.
Thanks, all, for your responses. I see I was not very clear in my question.
Here is the situation:
With the PS driver on Windows Server 2008, I get this file:
http://expirebox.com/download/0bb511565377e8b74eead67641fe7f68.html
Inside the file I can see the text "Page de test d\222imprimante"
On Windows Server 2012 R2:
http://expirebox.com/download/60fa957cba97c82bbcd5c0e975825b52.html
I can't see any text. It's a printer test page too.
I need to see the text because I will print documents with codes inside: codes for the printer to identify the page type (for example, a white page for tray no. 1, a yellow page for tray 2).
KenS: I understand your point. But why does the same driver give a different file?
I checked whether it's really the same driver. The only difference I see is the OS: one is x86, the other x64.
Thanks.
Regards.

How to debug a leak in native memory on the JVM?

We have a Java application running on Mule. We have the Xmx value configured at 6144M, but we routinely see the overall memory usage climb and climb. It was getting close to 20 GB the other day before we proactively restarted it.
Thu Jun 30 03:05:57 CDT 2016
top - 03:05:58 up 149 days, 6:19, 0 users, load average: 0.04, 0.04, 0.00
Tasks: 164 total, 1 running, 163 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.2%us, 1.7%sy, 0.0%ni, 93.9%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 24600552k total, 21654876k used, 2945676k free, 440828k buffers
Swap: 2097144k total, 84256k used, 2012888k free, 1047316k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3840 myuser 20 0 23.9g 18g 53m S 0.0 79.9 375:30.02 java
The jps command shows:
10671 Jps
3840 MuleContainerBootstrap
The jstat command shows:
S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
37376.0 36864.0 16160.0 0.0 2022912.0 1941418.4 4194304.0 445432.2 78336.0 66776.7 232 7.044 17 17.403 24.447
The startup arguments are (sensitive bits have been changed):
3840 MuleContainerBootstrap -Dmule.home=/mule -Dmule.base=/mule -Djava.net.preferIPv4Stack=TRUE -XX:MaxPermSize=256m -Djava.endorsed.dirs=/mule/lib/endorsed -XX:+HeapDumpOnOutOfMemoryError -Dmyapp.lib.path=/datalake/app/ext_lib/ -DTARGET_ENV=prod -Djava.library.path=/opt/mapr/lib -DksPass=mypass -DsecretKey=aeskey -DencryptMode=AES -Dkeystore=/mule/myStore -DkeystoreInstance=JCEKS -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dmule.mmc.bind.port=1521 -Xms6144m -Xmx6144m -Djava.library.path=%LD_LIBRARY_PATH%:/mule/lib/boot -Dwrapper.key=a_guid -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.disable_console_input=TRUE -Dwrapper.pid=10744 -Dwrapper.version=3.5.19-st -Dwrapper.native_library=wrapper -Dwrapper.arch=x86 -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 -Dwrapper.lang.domain=wrapper -Dwrapper.lang.folder=../lang
Adding up the "capacity" items from jps shows that only my 6144m is being used for java heap. Where the heck is the rest of the memory being used? Stack memory? Native heap? I'm not even sure how to proceed.
If left to continue growing, it will consume all memory on the system and we will eventually see the system freeze up throwing swap space errors.
I have another process that is starting to grow. Currently at about 11g resident memory.
pmap 10746 > pmap_10746.txt
cat pmap_10746.txt | grep anon | cut -c18-25 | sort -h | uniq -c | sort -rn | less
Top 10 entries by count:
119 12K
112 1016K
56 4K
38 131072K
20 65532K
15 131068K
14 65536K
10 132K
8 65404K
7 128K
Top 10 entries by allocation size:
1 6291456K
1 205816K
1 155648K
38 131072K
15 131068K
1 108772K
1 71680K
14 65536K
20 65532K
1 65512K
And top 10 by total size:
Count Size Aggregate
1 6291456K 6291456K
38 131072K 4980736K
15 131068K 1966020K
20 65532K 1310640K
14 65536K 917504K
8 65404K 523232K
1 205816K 205816K
1 155648K 155648K
112 1016K 113792K
This seems to be telling me that because Xmx and Xms are set to the same value, there is a single allocation of 6291456K for the Java heap. The other allocations are NOT Java heap memory. What are they? They are being allocated in rather large chunks.
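For reference, the "top 10 by total size" view can be reproduced from the same pmap output with something like this (a sketch; the cut column range assumes the same pmap output format as in the earlier command):
# count, size, and aggregate (count * size) of anonymous mappings, largest aggregate first
grep anon pmap_10746.txt | cut -c18-25 | sort | uniq -c |
  awk '{sz=$2; sub("K","",sz); printf "%7d %10s %12dK\n", $1, $2, $1*sz}' |
  sort -k3 -rn | head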
Expanding a bit more on Peter's answer.
You can take a binary heap dump from within VisualVM (right-click on the process in the left-hand list, then on Heap Dump; it will appear right below shortly after). If you can't attach VisualVM to your JVM, you can also generate the dump with this:
jmap -dump:format=b,file=heap.hprof $PID
Then copy the file and open it with VisualVM (File, Load, select the heap dump file type, find the file).
As Peter notes, a likely cause for the leak may be uncollected DirectByteBuffers (e.g. some instance of another class is not properly dereferencing the buffers, so they are never GC'd).
To identify where these references are coming from, you can use VisualVM to examine the heap and find all instances of DirectByteBuffer in the "Classes" tab. Find the DirectByteBuffer class, right-click, and go to the instances view.
This will give you a list of instances. You can click on one and see who's keeping a reference to each one:
Note the bottom pane: we have a "referent" of type Cleaner and two "mybuffer" entries. These would be fields in other classes that are referencing the DirectByteBuffer instance we drilled into (it should be OK if you ignore the Cleaner and focus on the others).
From this point on you need to proceed based on your application.
Another equivalent way to get the list of DBB instances is from the OQL tab. This query:
select x from java.nio.DirectByteBuffer x
Gives us the same list as before. The benefit of using OQL is that you can execute more complex queries. For example, this gets all the instances that are keeping a reference to a DirectByteBuffer:
select referrers(x) from java.nio.DirectByteBuffer x
What you can do is take a heap dump and look for objects which store data off-heap, such as ByteBuffers. Those objects will appear small but are a proxy for larger off-heap memory areas. See if you can determine why lots of those might be retained.

Reducing jitter of a serial NTP refclock

I am currently trying to connect my DIY DCF77 clock to ntpd (using Ubuntu). I followed the instructions here: http://wiki.ubuntuusers.de/Systemzeit.
With ntpq I can see the DCF77 clock
~$ ntpq -c peers
remote refid st t when poll reach delay offset jitter
==============================================================================
+dispatch.mxjs.d 192.53.103.104 2 u 6 64 377 13.380 12.608 4.663
+main.macht.org 192.53.103.108 2 u 12 64 377 33.167 5.008 4.769
+alvo.fungus.at 91.195.238.4 3 u 15 64 377 16.949 7.454 28.075
-ns1.blazing.de 213.172.96.14 2 u - 64 377 10.072 14.170 2.335
*GENERIC(0) .DCFa. 0 l 31 64 377 0.000 5.362 4.621
LOCAL(0) .LOCL. 12 l 927 64 0 0.000 0.000 0.000
So far this looks OK. However, I have two questions.
What exactly is the sign of the offset? Is .DCFa. ahead of the system clock or behind it?
.DCFa. points to refclock 0, which is a DIY DCF77 clock emulating a Meinberg clock. It is connected to my Ubuntu Linux box via an FTDI USB-serial adapter running at 9600 7E2. I verified with a DSO that it emits the time with jitter significantly below 1 ms, so I assume the jitter is introduced by either the FTDI adapter or the kernel. How would I find out, and how can I reduce it?
Part One:
Positive offsets indicate time in the client is behind time on the server.
Negative offsets indicate that time in the client is ahead of time on the server.
I always remember this as "what needs to happen to my clock?"
+0.123 = Add 0.123 to me
-0.123 = Subtract 0.123 from me
Part Two:
Yes, USB serial converters add jitter. Get a real serial port. :) You can also use setserial to tell the kernel that the serial port needs to be low_latency. Just apt-get install setserial, as sketched below.
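A minimal sketch, assuming the FTDI adapter shows up as /dev/ttyUSB0 (the device name is an assumption):
sudo apt-get install setserial
# ask the driver to pass characters through with minimal buffering
sudo setserial /dev/ttyUSB0 low_latency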
Bonus Points:
Lose the unreferenced local clock entry. NO LOCL!!!!
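On Ubuntu that is typically a pair of lines like these in /etc/ntp.conf (a sketch of the usual form; stratum 12 matches the LOCAL(0) line in your ntpq output). Delete or comment them out and restart ntpd:
# undisciplined local clock - remove if you have real references
server 127.127.1.0
fudge 127.127.1.0 stratum 12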