As the title suggests, I want to find out where the value at a specific address came from. I am debugging an iOS game with lldb. The game has a multiplier value of 0.4 (how fast combos decrease). I can change this value with Cheat Engine, but I want to know which assembly instruction writes this value to that address so I can patch that instruction with a hex editor. I normally use watchpoints, breakpoints, etc. for variable values, but in this case the value is constant and is set immediately when the app starts.
The equivalent instructions in lldb for Jester's gdb steps are:
(lldb) image lookup -va <ADDRESS>
That will tell you everything lldb knows about that address.
Is it possible to alter the execution path with a kprobe and terminate kernel function execution? While searching, I came across the post "Replace system call in linux kernel 3".
AFAIK, one can change the return value using a kretprobe, but what I'm looking for is conditionally terminating kernel function execution from within a kprobe handler. Has this been tried before? Thanks!
I found this in the kernel docs, so it seems doable:
Changing Execution Path
Since kprobes can probe into a running kernel code, it can change the
register set, including instruction pointer. This operation requires
maximum care, such as keeping the stack frame, recovering the
execution path etc. Since it operates on a running kernel and needs
deep knowledge of computer architecture and concurrent computing, you
can easily shoot your foot.
If you change the instruction pointer (and set up other related
registers) in pre_handler, you must return !0 so that kprobes stops
single stepping and just returns to the given address. This also means
post_handler should not be called anymore.
Note that this operation may be harder on some architectures which use
TOC (Table of Contents) for function call, since you have to setup a
new TOC for your function in your module, and recover the old one
after returning from it.
I have been experimenting with accessing memory used by other programs and I've encountered somewhat strange (to me) results.
First I created a variable in my first program and gave it the value 10. Then I looked at its address and assigned it manually to a pointer in my second program. After that I tried to dereference the pointer and (to my surprise) the program didn't crash. Instead it printed the dereferenced pointer's value as 0.
Next I created a few other programs to experiment with this. In my first program I created a pointer and assigned 'new int' to it. Then I checked the address of the int and manually assigned it to another pointer in my second program. This time, when I tried to dereference the pointer in my second program, it did crash.
Could someone explain why the difference happened? And why was the dereferenced pointer 0?
Sorry for a possibly stupid question :/
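This is because the addresses your program prints are virtual addresses. Virtual addresses are relative to the address space of each individual program; they are translated to physical memory addresses at run time, using mappings maintained by the operating system.
So you didn't really access the memory of one program from the other. The address you copied was interpreted within the second program's own address space: in the first experiment something happened to be mapped there, so the read succeeded and returned whatever that memory held (0 in your case); in the second, nothing was mapped at that address, so the access crashed.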
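A minimal sketch of the same effect, assuming a POSIX system where os.fork is available. The function name and the values 10 and 99 are illustrative and not taken from the question. After a fork, parent and child see the same virtual address for a variable, yet a write made in one process is invisible to the other, because each process has its own mapping from virtual to physical memory.

```python
import ctypes
import os


def fork_and_compare():
    """Return (parent_addr, child_addr, parent_value, child_value)."""
    counter = ctypes.c_int(10)
    parent_addr = ctypes.addressof(counter)
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: same virtual address, but a private copy of the memory.
        try:
            os.close(read_fd)
            counter.value = 99
            report = f"{ctypes.addressof(counter)} {counter.value}"
            os.write(write_fd, report.encode())
            os.close(write_fd)
        finally:
            os._exit(0)  # never fall through into the parent's code

    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)

    child_addr, child_value = (int(part) for part in data.decode().split())
    return parent_addr, child_addr, counter.value, child_value


if __name__ == "__main__":
    p_addr, c_addr, p_val, c_val = fork_and_compare()
    print(f"parent: address {p_addr:#x} holds {p_val}")
    print(f"child:  address {c_addr:#x} holds {c_val}")
```

The two printed addresses match while the values differ, which is why copying a raw address from one program into another does not give access to the first program's data.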
I am profiling a crucial loop in my app and I see an interesting option in the Time Profiler settings:
"View as value"
When I select it, I see numbers with an 'x' suffix instead of the standard percentage values.
As an example, see attached screenshot (assembly code view):
The and instruction has a value of 442x. That seems heavy for such a simple assembly instruction, especially compared to the others in the loop.
What do those numbers mean? Do they somehow refer to the CPU cycles spent on a given line of code?
My goal is:
Given a suspended thread in a Delphi-compiled 32 or 64-bit Windows program, to walk the stack (doable)
Given stack entries, to enumerate the local variables in each method and their values. That is, at the very least, find their address and type (integer32/64/signed/unsigned, string, float, record, class...) the combination of which can be used to find their value.
The first is fine and it's the second that this question is about. At a high level, how do you enumerate local variables given a stack entry in Delphi?
At a low level, this is what I've been investigating:
RTTI: does not list this kind of information about methods. This was not something I actually ever thought was a realistic option, but listing here anyway.
Debug information: Loading the debug info produced for a debug build.
Map files: even a detailed map file (a text-format file! Open one and have a look) does not contain local variable info. It's basically a list of addresses and source file line numbers. Great for address to file&line correlation, e.g. the blue dots in the gutter; not great for more detailed information
Remote debugging information (RSM file) - no known information on its contents or format.
TD32/TDS files: my current line of research. They contain global and local symbols among a lot of other information.
The problems I'm encountering here are:
There's no documentation of the TD32 file format (that I can find.)
Most of my knowledge of them comes from the Jedi JCL code using them (JclTD32.pas) and I'm not sure how to use that code, or whether the structures there are extensive enough to show local vars. I'm pretty certain it will handle global symbols, but I'm very uncertain about local. There are a wide variety of constants defined and without documentation for the format, to read what they mean, I'm left guessing. However, those constants and their names must come from somewhere.
Source I can find using TDS info does not load or handle local symbols.
If this is the right approach, then this question becomes 'Is there documentation for the TDS/TD32 file format, and are there any code samples that load local variables?'
A code sample isn't essential but could be very useful, even if it's very minimal.
Check whether the binary contains any debugging symbols.
You could also use GDB (on Windows, a port of it). It would be great if you found a .dbg or .dSYM file. They contain debugging information that maps back to the source code, e.g.
gdb> list foo
56 void foo()
57 {
58 bar();
59 sighandler_t fnc = signal(SIGHUP, SIG_IGN);
60 raise(SIGHUP);
61 signal(SIGHUP, fnc);
62 baz(fnc);
63 }
If you don't have any debugging files, you may try to get MinGW or Cygwin and use nm(1). It will read symbol names from the binary. They may contain some types, like C++ ones:
int abc::def::Ghi::jkl(const std::string, int, const void*)
Don't forget to add the --demangle option, or you'll get something like:
__ZN11MRasterFont21getRasterForCharacterEh
instead of:
MRasterFont::getRasterForCharacter(unsigned char)
Take a look at the Open Architecture Handbook: http://download.xskernel.org/docs/file%20formats/omf/borland.txt. It is old, but you may find some relevant information about the file format.
I have parsed out the addresses, file names and line numbers from a dSYM file for an iOS app. I basically have a table that maps an address to a file name and line number, which is very helpful for debugging.
To get the actual lookup address, I use the stack trace address from the crash report and use the formula specified in this answer: https://stackoverflow.com/a/13576028/2758234. So something like this.
(actual lookup address)
= (stack trace address) + (virtual memory slide) - (image load address)
I use that address and look it up on my table. The file name I get is correct, but the line number always points to the end of the function or method that was called, not the actual line that called the following function on the stack trace.
I read somewhere, can't remember where, that frame addresses have to be de-tagged, because they are aligned to double the system pointer size. On 32-bit systems the pointer size is 4 bytes, so we de-tag using 8 bytes, with a formula like this:
(de-tagged address) = (tagged address) & ~(sizeof(uintptr_t)*2 - 1)
where uintptr_t is the data type used for pointers in Objective-C.
After doing this, the lookup sort of works, but I have to do something like find the closest address that is less than or equal to the de-tagged address.
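The procedure described above could be sketched as follows. This is only an illustration: the table contents, file names and function names are hypothetical, and the table is assumed to be a list of (address, file, line) tuples sorted by address.

```python
import bisect


def detag(address: int, pointer_size: int = 4) -> int:
    """Clear the low bits so the address is aligned to 2 * pointer_size."""
    return address & ~(pointer_size * 2 - 1)


def closest_entry(table, address):
    """Return the entry with the greatest address <= the given address.

    `table` must be sorted by address. Returns None if every entry
    lies above the given address.
    """
    addresses = [entry[0] for entry in table]
    index = bisect.bisect_right(addresses, address) - 1
    return table[index] if index >= 0 else None


# Hypothetical line table: (address, file name, line number).
line_table = [
    (0x8100, "a.m", 10),
    (0x8110, "a.m", 12),
    (0x8128, "b.m", 40),
]

aligned = detag(0x811F)                   # 0x811F & ~7 == 0x8118
print(hex(aligned), closest_entry(line_table, aligned))
```

A binary search with bisect_right keeps the "closest address less than or equal to" lookup at O(log n) per query instead of scanning the whole table.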
Question #1:
Why do I have to de-tag a stack frame address? Why in the stack trace aren't the addresses already pointing to the right place?
Question #2:
Sometimes in the crash report there seems to be a missing frame. For example, if function1() calls function2() which calls function3() which calls function4(), in my stack trace I will see something like:
0 Exception
1 function4()
2 function3()
4 function1()
And the stack trace address for function3() (frame 2, above) doesn't even point to the right line number (but it is the right file, though), even after de-tagging. I see this even when I let Xcode symbolicate a crash report.
Why does this happen?
For question #1, the addresses in an iOS crash report have three components that are taken into account: The original load address of your app, the random slide value that was added to that address when your app was launched, and the offset within the binary. At the end of the crash report, there should be a line showing the actual load address of your binary.
To compute the slide, you need to take the actual load address from the crash report and subtract the original load address. This tells you the random slide value that was applied to this particular launch of the app.
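As a rough worked example of that arithmetic, with made-up numbers: the original load address 0x4000 and the function names are assumptions for illustration, not values from any real crash report.

```python
def compute_slide(actual_load_address: int, original_load_address: int) -> int:
    """Slide = where the binary actually loaded - where it was linked to load."""
    return actual_load_address - original_load_address


def unslid_address(crash_address: int, slide: int) -> int:
    """Map an address from the crash report back to the binary's own addresses."""
    return crash_address - slide


# Hypothetical values for illustration.
slide = compute_slide(actual_load_address=0x140000, original_load_address=0x4000)
print(hex(slide))                             # 0x13c000
print(hex(unslid_address(0x144100, slide)))   # 0x8100
```

The un-slid address is the one that would be looked up in a table built directly from the dSYM.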
I'm not sure how you derived your table - the problem may lie there. You may want to double check by using lldb. You can load your app into lldb and tell lldb that it should be loaded at address 0x140000 (this would be the actual load address from your crash report; don't worry about slides and original load addresses):
% xcrun lldb
(lldb) target create -d -a armv7 /path/to/myapp.app
(lldb) target modules load -f myapp __TEXT 0x140000
Now lldb has your binary loaded at the actual load address of this crash report. You can do all the usual queries in lldb, such as
(lldb) image lookup -v -a 0x144100
to do a verbose lookup on address 0x144100 (which might appear in your crash report).
You can also do a nifty "dump your internal line table" command in lldb with target modules dump line-table. For instance, I compiled a hello-world Mac app:
(lldb) tar mod dump line-table a.c
Line table for /tmp/a.c in `a.out
0x0000000100000f20: /tmp/a.c:3
0x0000000100000f2f: /tmp/a.c:4:5
0x0000000100000f39: /tmp/a.c:5:1
0x0000000100000f44: /tmp/a.c:5:1
(lldb)
I can change the load address of my binary and try dumping the line table again:
(lldb) tar mod load -f a.out __TEXT 0x200000000
section '__TEXT' loaded at 0x200000000
(lldb) tar mod dump line-table a.c
Line table for /tmp/a.c in `a.out
0x0000000200000f20: /tmp/a.c:3
0x0000000200000f2f: /tmp/a.c:4:5
0x0000000200000f39: /tmp/a.c:5:1
0x0000000200000f44: /tmp/a.c:5:1
(lldb)
I'm not sure I understand what you're doing with the de-tagging of the addresses. The addresses on the call stack are the return addresses of these functions, not the call instruction - so these may point to the line following the actual method invocation / dispatch source line, but that's usually easy to understand when you're looking at the source code. If all of your lookups are pointing to the end of the methods, I think your lookup scheme may have a problem.
As for question #2, the unwind of frame #1 can be a little tricky at times if frame #0 (the currently executing frame) is a leaf function that doesn't set up a stack frame, or is in the process of setting up a stack frame. In those cases, frame #1 can get skipped. But once you're past frame #1, especially on arm, the unwind should not miss any frames.
There is one very edge-casey wrinkle: when a function marked noreturn calls another function, the last instruction of the function may be a call -- with no function epilogue -- because the compiler knows it will never get control again. Pretty uncommon. But in that case, a simple-minded symbolication will give you a pointer to the first instruction of the next function in memory. Debuggers et al. use a trick where they subtract 1 from the return address when doing symbol / source line lookup to sidestep this issue, but it's not something casual symbolicators usually need to worry about. And you have to be careful not to do the decr-pc trick on the currently-executing function (frame 0), because a function may have just started executing and you don't want to back up the pc into the previous function and symbolicate incorrectly.
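The decr-pc trick can be sketched in a few lines. This is a simplified illustration with hypothetical names and addresses, not how any particular debugger implements it.

```python
def symbolication_address(frame_index: int, pc: int) -> int:
    """Pick the address to use for symbol / source line lookup.

    Frame 0 is the currently executing frame, so its pc is used as-is.
    For every other frame the pc is a return address, so back up by one
    to land inside the call instruction that produced it.
    """
    if frame_index == 0:
        return pc
    return pc - 1


# Hypothetical backtrace: (frame index, pc).
backtrace = [(0, 0x1000), (1, 0x2040), (2, 0x30A8)]
for index, pc in backtrace:
    print(index, hex(symbolication_address(index, pc)))
```

The adjusted address is only used for the lookup; the pc shown to the user stays the original return address.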