Symbolicating addresses programmatically - ios

I am looking for a way to symbolicate external app symbols (iOS) inside my own application (macOS), assuming I have the dSYM and system symbols.
Xcode symbolicates both app addresses as well as system framework addresses (UIKit, Foundation, etc.)
atos requires an image file and can symbolicate addresses from that image.
I am looking to symbolicate a large number of addresses in my own app. The addresses represent stack traces at various points in time. I would like to symbolicate the system framework addresses as well.
I found atosl, which uses dwarf.h and libdwarf.h to reimplement atos with varying degrees of success; however, this seems like a very low-level approach.
Are there any other ways to symbolicate a large number of addresses at once?

Here is symbolication I use in tests (requires XCTest): https://github.com/avito-tech/Mixbox/blob/db3206c95b71f35ae6032ff9b0baff13026608f4/Frameworks/TestsFoundation/Reporting/FileLineForFailureProvider/StackTrace/ExtendedStackTraceEntryFromStackTraceEntryConverterImpl.swift
I use the code to highlight failures in tests in Xcode without requiring testers to pass file: StaticString = #file, line: UInt = #line everywhere. The code is less readable with this boilerplate, and there is not much reason for it either, because ideally Xcode should be able to highlight the stack trace of a test failure...
Note that there is an issue. If you do not have the sources on the machine that executes the code, it doesn't symbolicate. Maybe it can be fixed quickly; I haven't even tried.
Also there are comments in the code about other options: atos, lldb, CoreSymbolication. I think CoreSymbolication is what you want to use. The solution I gave you is simple, but it is more dependent on XCTest, less configurable, and has some other flaws.
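For the atos option, a rough sketch of batching: group the frames by image so that each image needs only one atos call with many addresses. The frame tuple layout and the function names are made up for illustration; -o, -arch and -l are regular atos options. It assumes you already know each image's load address and have a matching symbol-rich binary (the DWARF file inside the dSYM for the app, a device-support copy for system frameworks).

```python
import subprocess
from collections import defaultdict


def group_frames_by_image(frames):
    """frames: iterable of (image_path, load_address, address) tuples (hypothetical layout)."""
    grouped = defaultdict(list)
    for image_path, load_address, address in frames:
        grouped[(image_path, load_address)].append(address)
    return dict(grouped)


def build_atos_command(image_path, load_address, addresses, arch="arm64"):
    """Build a single atos invocation that resolves many addresses of one image."""
    cmd = ["atos", "-o", image_path, "-arch", arch, "-l", hex(load_address)]
    cmd.extend(hex(address) for address in addresses)
    return cmd


def symbolicate(frames, arch="arm64"):
    """Run atos once per image (macOS only); atos prints one line per address."""
    result = {}
    for (image_path, load_address), addresses in group_frames_by_image(frames).items():
        cmd = build_atos_command(image_path, load_address, addresses, arch)
        output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        result.update(zip(addresses, output.splitlines()))
    return result
```

This still spawns one process per image, so it helps most when many addresses fall into a handful of images.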

Related

Reducing size of dSYM files for symbolication

We are looking for ways to reduce the size of dSYM files from Apple platforms. We need dSYM files just for symbolicating stack traces of crashes, and in the Crashlytics blog I read this:
These mappings actually hold much more than needed just for
symbolication, presenting some opportunities for optimization. They
have everything required for a generalized symbolic debugger to step
through and inspect your program, which may be a huge amount of
information. On iOS, we have seen dSYMs greater than 1GB in size! This
is a real opportunity for optimization, and we take advantage of this
in two ways. First, we extract just the mapping info we need into a
lightweight, platform-agnostic format. This results in a typical
space-saving of 20x when compared to an iOS dSYM.
Reducing size by 20x sounds very good, but I found little information on how this can be done. Do I need to learn the details of Mach-O and DWARF to achieve this, or can some of the command-line tools do it? I also wonder if the stripped version can be used directly afterwards for symbolicating.
Thanks.
Crashlytics converts dSYMs to cSYMs and uses these cSYMs for symbolication. This conversion happens on the client's Mac. You might want to look into the final size of the cSYM file that is actually uploaded to Crashlytics.
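To illustrate the general idea of keeping only the address-to-name mapping (this is not the cSYM format, whose layout I don't know), here is a minimal sketch. It assumes you have a text dump in the style of nm output, one "address type name" triple per line; the function names are made up.

```python
import bisect


def parse_symbol_lines(text):
    """Keep only (address, name) pairs for text-section symbols, sorted by address."""
    table = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1] in ("T", "t"):
            table.append((int(parts[0], 16), parts[2]))
    table.sort()
    return table


def lookup(table, address):
    """Return the symbol with the closest start address at or below the given address."""
    starts = [start for start, _ in table]
    index = bisect.bisect_right(starts, address) - 1
    return table[index][1] if index >= 0 else None
```

A table like this drops file and line information entirely, and because it stores no symbol sizes, any address past the last symbol resolves to that last symbol.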

Too many commands? Dyld Message: malformed mach-o: load commands size

Some iOS 9 devices in the wild seem to crash with the error message that I receive from the very basic crash reporting in Xcode only
dyld: malformed mach-o: load commands size (16464) > 16384
Unfortunately that's all the info I get. I can't even debug or reproduce locally.
Can anyone point me in the right direction here?
It occurs after updating my Cocoapods, so I guess there's one of them (or their dependency) that misbehaves.
After some investigation of my Mach-O binary, I saw that the sizeofcmds really is 16464.
If I understand correctly, there seems to be a load command size limit of 16384, can anyone confirm this?
Does that mean I should remove dylibs and everything should be fine?
At WWDC18 I went to an Apple engineer who is working on dyld. Here’s what he had to say:
The Dyld code is downloadable from https://opensource.apple.com (the one specific to us can be found inside macOS 10.12)
For iOS 9 the maximum size of load commands is indeed 16k, i.e. one memory page. There's no way around it: this is imposed by the OS itself. For customer service, telling people to update to iOS 10 would be viable (all devices that run iOS 9 can, except for the iPhone 4S).
Since iOS 10 the maximum size of commands is 32k
The majority of the size of the load commands is determined by the strings (paths) of the frameworks (use the command otool -L to see them)
Possible solutions:
Use fewer libraries (that was our go-to solution thus far, but we will change to umbrella libraries, see below)
Shorten names (this might break header lookup for CocoaPods; header maps might fix that inside the Xcode build process → maybe more high-level info in the WWDC18 session “Behind the scenes of the Xcode Build Process”)
Try to build static archives for libraries (they should not have dynamic resources; otherwise add copy phases and figure out where the resources are)
Build frameworks that re-export other frameworks (umbrella frameworks). Use -reexport-l as a linker flag (not done often) → this adds some runtime overhead when starting the app and uses a bit more memory (see man ld for info on re-exports)
The engineer recommended filing a bug report via bugreport.apple.com, because in the future even hitting the 32k limit is possible.
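To get a feel for how much the path strings contribute, here is a small estimate of the size of a single LC_LOAD_DYLIB command. It is a sketch based on my reading of the dylib_command layout (24 bytes of fixed fields, then the NUL-terminated path, padded to the pointer size); the function name is made up.

```python
DYLIB_COMMAND_FIXED_SIZE = 24  # cmd, cmdsize, name offset, timestamp, current and compatibility version


def dylib_command_size(install_name, pointer_size=8):
    """Estimate cmdsize of one LC_LOAD_DYLIB command for the given install name."""
    raw = DYLIB_COMMAND_FIXED_SIZE + len(install_name.encode("utf-8")) + 1  # +1 for the NUL
    # Round up to a multiple of the pointer size (8 on 64-bit, 4 on 32-bit).
    return (raw + pointer_size - 1) // pointer_size * pointer_size


def total_dylib_commands_size(install_names, pointer_size=8):
    """Sum the estimated sizes for a list of install names, e.g. taken from otool -L."""
    return sum(dylib_command_size(name, pointer_size) for name in install_names)
```

With this estimate, @rpath/Alamofire.framework/Alamofire costs 64 bytes and a shortened @rpath/A.framework/A costs 48, so shortening names only buys a few bytes per framework.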
I found a solution that will (at least temporarily) work for me - but I still encourage everyone to provide a real solution and more detailed insights.
Anyway:
Extract your binary
Xcode Archive
Export as IPA
Rename YourApp.ipa to YourApp.zip and extract
Navigate to the subfolder Payload to find your YourApp.app
Right click & Show Package Contents of your YourApp.app file
Copy your binary YourApp (no file extension) to a different location
Investigate your Mach-O binary
Run otool -f on your binary
Note the align value listed for both architectures, which, for me, says 2^14 (16384). This seems to be the threshold for the size of load commands.
Run otool -l on your binary
You'll see that the different architectures and their load commands are listed - as well as their sizeofcmds (size of commands).
Now the funny thing:
For arm64, the sizeofcmds (16464) was larger than the align (16384), while it wasn't for armv7.
Now I haven't found enough documentation on this, but I assume that align represents a threshold that should not be reached by the load command size. And also that it adjusts automatically (since we definitely do not have that many frameworks in our app, there have to be apps that have more).
So I guess the error came from this unlikely case, that the sizeofcmds differed between the architectures AND that one of them was actually valid (so that the align was not automatically adjusted).
Please correct me if I'm wrong, I am just assuming here and I really really want to understand why this happens.
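If you want to check sizeofcmds in a script instead of reading the otool -l output, the value sits in the Mach-O header itself. A minimal sketch, assuming a thin little-endian slice (a fat binary would first need one architecture extracted, for example with lipo -thin); the function names and the default limit parameter are only for illustration.

```python
import struct

MH_MAGIC = 0xFEEDFACE     # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF  # 64-bit Mach-O


def read_load_command_sizes(header_bytes):
    """Return (ncmds, sizeofcmds) from the first bytes of a thin Mach-O slice."""
    # Header fields: magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags
    magic, _, _, _, ncmds, sizeofcmds, _ = struct.unpack_from("<7I", header_bytes, 0)
    if magic not in (MH_MAGIC, MH_MAGIC_64):
        raise ValueError("not a thin little-endian Mach-O slice")
    return ncmds, sizeofcmds


def exceeds_limit(sizeofcmds, limit=16384):
    """True if the load commands are larger than the limit from the dyld message."""
    return sizeofcmds > limit
```

Reading the first 32 bytes of the extracted slice is enough, e.g. open(path, "rb").read(32).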
Solve the issue
Remove frameworks until you are under the sizeofcmds for both architectures.
I know this is not scalable, we were lucky (and stupid) that we still had one barely used framework in there that we could easily remove.
Fortunately this only seems to be an issue on iOS 9 and will therefore lose relevance over the next months; nevertheless, we should find out if I'm right.
Investigation ideas
My assumption that the align is automatically adjusted could be investigated by just putting in more and more frameworks to see if it actually does.
If so, adding frameworks would also solve the original issue - not nice, but at least slightly more scalable.
Sidenote
I don't feel like I shed enough light on the origins of this issue and I had a lot of assumptions.
While my solution works, I really hope you feel encouraged to investigate this as well and give a better answer.
So here's the problem:
The Mach-O header size is expected to be 16k (optimized for the platform's page size). In the reference by rachit it's basically the same thing, but the limit is 32K. Both are correct in that this is a hard limit of dyld, the loader.
The total size of load commands exceeds this max size. Removing frameworks and libraries works for you because that removes LC_LOAD_DYLIB commands (and there is no reason why you'd need so many frameworks anyway). Instead of removing frameworks, build your app from the ground up, starting with the core frameworks and adding more only as long as you get linker errors.
By the way, 'align' has nothing to do with this: alignment refers to the fat (universal) architecture slices and doesn't have anything to do with the Mach-O itself.
I was able to resolve this for my team after reviewing the result of otool -l. Turns out we had the same directory included in our framework search paths 5x causing our dylibs to be added as rpaths 5x.
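A quick way to spot this kind of duplication is to count the LC_RPATH entries in the otool -l output. A minimal sketch, assuming the usual "cmd LC_RPATH" / "path ... (offset N)" layout of that output; the function name is made up.

```python
from collections import Counter


def duplicate_rpaths(otool_output):
    """Return {path: count} for every LC_RPATH path that appears more than once."""
    lines = [line.strip() for line in otool_output.splitlines()]
    paths = []
    for index, line in enumerate(lines):
        if line == "cmd LC_RPATH":
            # The path line follows shortly after, e.g.
            # "path @executable_path/Frameworks (offset 12)"
            for follow in lines[index + 1:index + 4]:
                if follow.startswith("path "):
                    paths.append(follow[len("path "):].rsplit(" (offset", 1)[0])
                    break
    return {path: count for path, count in Counter(paths).items() if count > 1}
```

Feed it the captured text of otool -l for your binary; an empty dict means no rpath is repeated.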

How to make an object file that cannot be dead_stripped?

What is the easiest way to produce a Mach-O object file that does not have the SUBSECTIONS_VIA_SYMBOLS flag set, such that the linker (with -dead_strip) will not later try to cut the text section into pieces and guess which pieces are used?
I can use either a command-line option to llvm/gcc (4.2.1) that will prevent it from emitting .subsections_via_symbols in the first place, or a command-line tool that will remove the flag from an existing object file.
(Writing such a tool myself based on the Mach-O spec is an option, but if possible I'd rather not reinvent the wheel that hard).
Platform: iOS, cross-compiling from OS X with Xcode 4.5.
Background: We're supplying a static library that other companies build into apps. When our library encounters a problem it produces a crash report with a stack trace and certain other key information that (if we're lucky) we get to analyze later. Typically the apps as deployed have been stripped of debug information so interpreting stack traces is a problem. If we were making the app ourselves we would just save the DWARF debug data from before stripping and use that to decode the addresses in the incoming crash reports. But we can't depend on the app makers supplying us with such data from their linking steps.
What we're doing instead is to let the crash report include the run-time address of a selected function; from that we can deduce the offset between addresses in our linker map and addresses in the crash report. We're linking our entire library incrementally into a single .o before we stuff it into an .a; since it does only one big thing there wouldn't be much to save from removing unused functionality from it when the app is eventually linked. Unfortunately there are a few small pieces of code in the library that are sometimes not used (alternative API entry points for the main functionality, small helper functions for interpreting our error codes and the like), and if the app developer links with -dead_strip, the address reconstruction of crash reports is disturbed because the relative offsets in the final app differ from those in the linker map from our incremental link operation.
We can't realistically ask all app developers to disable dead-code stripping in their build process, so it seems a better way forward if we could mark our .o as "not dead-strippable" and have the eventual app linking respect that.
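For readers following along, the offset reconstruction described above can be sketched like this. Reporting more than one anchor function is my own assumption, added only to show how a dead-stripped layout would reveal itself; all names and addresses are hypothetical.

```python
def compute_slides(anchors):
    """anchors: {name: (runtime_address, linker_map_address)} -> {name: slide}."""
    return {name: runtime - mapped for name, (runtime, mapped) in anchors.items()}


def layout_is_intact(anchors):
    """All anchors must agree on one slide; otherwise code between them was removed or moved."""
    return len(set(compute_slides(anchors).values())) <= 1


def to_map_addresses(frame_addresses, slide):
    """Translate run-time stack frame addresses back to linker-map addresses."""
    return [address - slide for address in frame_addresses]
```

With a single anchor only the translation is possible; the consistency check needs at least two anchors placed on either side of the code that might be stripped.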
I solved it.
The output of an incremental link operation only has MH_SUBSECTIONS_VIA_SYMBOLS set if all the input objects have it set. And an object file produced from assembler input only has it set if there's an explicit directive set. So one can remove the flag by linking with an empty assembler input:
echo > empty.s
$(CC) $(CFLAGS) input.o empty.s -nostdlib -Wl,-r -o output.o
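To verify that the flag is really gone from output.o, you can test the flags word of the Mach-O header. A minimal sketch, assuming a thin little-endian object file; the function name is made up, while 0x2000 is the value of MH_SUBSECTIONS_VIA_SYMBOLS in mach-o/loader.h.

```python
import struct

MH_SUBSECTIONS_VIA_SYMBOLS = 0x2000
MACHO_MAGICS = (0xFEEDFACE, 0xFEEDFACF)  # 32-bit and 64-bit, little-endian


def has_subsections_via_symbols(header_bytes):
    """Check the flag in the header of a thin Mach-O object file."""
    (magic,) = struct.unpack_from("<I", header_bytes, 0)
    if magic not in MACHO_MAGICS:
        raise ValueError("not a thin little-endian Mach-O file")
    # flags is the seventh 32-bit field of the header, at byte offset 24
    (flags,) = struct.unpack_from("<I", header_bytes, 24)
    return bool(flags & MH_SUBSECTIONS_VIA_SYMBOLS)
```

Usage would be something like has_subsections_via_symbols(open("output.o", "rb").read(32)), which should be False after the relink.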

Debug iOS application on device without symbols

I need to debug the startup for an iOS application on an actual device... and by startup I mean the very first instruction that is executed when the OS hands control over to the app. Not "main". Also, this application doesn't have any symbols (i.e. the debug information isn't available... yet). I don't care if I have to debug at the CPU instruction level. I know how to do that (done it for over 30 years). I want the debugger to stop when control is about to transfer to the app. When I use the Attach|by Name command and run, it just says "Finished running".
Oh, and this application was not built in Xcode. It is, however, an application I built, signed and provisioned and moved to the device. The application does run since I can see the console output. Just in case you're thinking I'm some hacker trying to debug someone's application.
How's that for a tall order? I'll bet nobody can answer this... I've not been able to find any information on how I could do this even with an Xcode-built project. I wonder if it is simply not possible or "allowed" by the Apple overlords?
What do you say, Stack Overflow gods?
UPDATE: I should clarify something. This application is not built with any commercially available or open-source tool. I work with a tools vendor creating compilers, frameworks, and IDEs. IOW, you cannot get this tool... yet. In the process of bootstrapping a new tool chain, one regularly must resort to some very low-level raw debugging. Especially if there are bugs in the code generated by the tools.
I'm going to answer my own question because I think I've stumbled upon a solution. If anyone has anything more elegant and simple than this, please answer as well. On to the steps:
Starting with a raw monolithic iOS executable (not a bundled .app, but the actual binary mach-o file that is the machine code).
Create a new like-named empty Xcode project. Build and run it on the device.
Locate the output bundle's .app folder.
Copy the above raw iOS executable over the existing one in the .app bundle's folder.
The application will now have an invalid signature and cannot be deployed and run.
Run codesign against the app bundle (you can find out the command-line by running xcodebuild on the above Xcode project).
In the bundle's .app folder, run otool -h -l on the binary image. Locate the LC_UNIXTHREAD load command and find the value associated with the 'pc' register. This is the address where the OS loader will jump to your application. If this address is odd, then these are Thumb instructions; otherwise it will be ARM (I think that's how it works).
Add a symbolic breakpoint (I used GDB instead of LLDB) and enter the address as '*0x00001234' as the symbol.
Select Product|Perform Action|Run Without Building.
Assuming that GDB is able to evaluate the breakpoint expression and set the break point, and you've selected Product|Debug Workflow|Show Disassembly When Debugging, the process should break at the very first instruction to be executed in the application.
You can now single step the instructions and use the GDB console to get/set register values.
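The odd/even rule from the otool step can be written down as a tiny helper. This is only a sketch for 32-bit ARM; clearing the low bit before building the breakpoint expression is my assumption about what the debugger expects, and the function name is made up.

```python
def describe_entry_point(pc):
    """Interpret the 'pc' value of LC_UNIXTHREAD on 32-bit ARM."""
    is_thumb = bool(pc & 1)          # odd address: Thumb, even address: ARM
    instruction_address = pc & ~1    # the instruction itself sits at the even address
    return {
        "mode": "Thumb" if is_thumb else "ARM",
        "address": instruction_address,
        "breakpoint": "*0x%08x" % instruction_address,  # GDB address breakpoint syntax
    }
```

For example, a pc of 0x2001 would be treated as Thumb code starting at 0x2000.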
Your question does not make sense - main is the entry point into the application. It is the first code that should be encountered, unless possibly you have initialize() overridden for some classes (but even then I think main would get hit before the runtime).
I think you are seeing some kind of odd error on launch and you think you want to set a breakpoint on entry to catch it, but far more likely what would help you is to describe the problem on launch and let one of the 4000 people who have seen and fixed the same crash help you...
However, if you really want to use GDB to break on an application with no symbols (but that you launch from Xcode), you can have GDB break on an assembly address as per:
How to break on assembly instruction at a given address in gdb?
To find the address of main (or other methods) you can use otool or atos; there are some examples in this question:
Matching up offsets in iOS crash dump to disassembled binary
ADDITION:
If for some reason Xcode cannot launch your application for debugging, you could also jailbreak and install GDB on the device itself, which would give complete control over debugging. If Xcode can launch your application, I see no reason why being able to break at an arbitrary memory address does not give you the ability you seek...
One solution for applications with web views is to run them in the iOS Simulator and connect to that with the remote debugger in macOS Safari. This is off-topic, but maybe someone could benefit.
http://hiediutley.com/2011/11/22/debugging-ios-apps-using-safari-web-inspector/
Or use NetCat for iOS... not the most perfect solution, but at least you see what's going on.

OpenGL ES Analyzer for iPad

I'm trying to use the OpenGL ES Analyzer for my iPad application and I can't get it to show me any symbols from my code in the extended detail pane's stack trace. I see the names of UIKit and UIApplication and other Apple supplied frameworks in the stack trace, but the portion of the stack trace that represents calls into my code just shows up as instruction pointer values, and there are no symbols whatever.
When I run the same app in Xcode 4 I can debug into my code without problem, all symbols are there, etc. So I believe the application is compiled correctly in this regard.
Do others out there have this problem? The information this analyzer is collecting would be extremely useful if I could see where in my code these calls are being made...
Any pointers / workaround very much appreciated.
-Eric
Well, figured this out myself eventually, so just for completeness and for anyone else who runs into this:
It is necessary to have dSYM debugger output, i.e. "DWARF with dSYM File" in the "Debug Information Format" setting in the project.
I had changed this to be just DWARF as creating the dSYM was taking a long time each build cycle.
