Change Subroutine Names at Build Time to Avoid Collisions in Xcode - ios

BACKGROUND
I'm building an iOS app (which I'll just call MyApp from here) that will rely on calculations done by several separate static libraries (which I'll call Lib1, Lib2, Lib3,...). Each library is built in it's own project, then imported into a single workspace (so the workspace will contain MyApp, Lib1, Lib2, ...). More details on how this is set up here. The libraries are used by other products that are independent from MyApp, so I want to minimize any changes in the libraries. The libraries are also written in (plain) C, so there are no header files.
Certain function names are used by multiple libraries (so both Lib1 and Lib2 might each have a DoStuff method). Functions with the same name generally do the same thing, but there are some specifics about how that do it that can be different between libraries, so the actual code in DoStuff on Lib1 might be quite different than the code in DoStuff on Lib2. It would be very difficult to write one universal DoStuff that would be exactly the same in each library.
THE ISSUE
While the app is running, it isn't calling the correct DoStuff from the correct library. I found out about this because the wrong function was called during a debug session (which eventually caused the app to crash, due to the subtle differences in the DoStuff functions).
WHAT I'M LOOKING FOR
Each library has only one entry point from MyApp, and each entry point is uniquely named. If DoStuff is called from the entry point method of Lib1 (or any other method on Lib1, for that matter), then I want it to call the DoStuff method on Lib1. What's the best way to make that happen?
Is there any way (maybe through a setting somewhere in XCode) I can make it so that each library is it's own namespace? That would be my preferred way to fix the issue. I guess I could go through and rename the duplicate functions so that they are all unique (so the DoStuff method on Lib1 could be renamed to Lib1DoStuff, or something similar), but there are hundreds of functions that could have duplicate names, and we are going to be adding hundreds of libraries to the project, so having to go in and rename all the functions by hand and fix all the calls to them would take a significant amount of time, and my boss doesn't see that as a viable option.
UPDATE
After looking at the comments from Josh Caswell and some of the links he provided, it looks like it might be possible to automatically rename all the functions when the libraries are compiled, and that would be the best way to try to fix THE ISSUE above. From what I've seen, the objcopy that gets mentioned in a couple of the links in the comments isn't support on iOS. I eventually came across this blog entry, which talks about creating custom build rules for Xcode targets, and this blog that talks about custom build settings and build phases.
Am I right to assume that I can use scripts at some point in the build process to automatically append to the name of all the functions in each of my libraries, instead of doing it manually as I described in the last paragraph of the WHAT I'M LOOKING FOR section above? If so, which is the correct part of the build process to make those changes? Lastly, what would the syntax look like for doing something like that? The 'scripts' used in the different parts of the build processes certainly doesn't look like Obj-C. I've never used these 'scripts' before, so I'm completely in the dark on how I'd use them, and that's what I'm looking for help with.
I tried to be as clear as I could, but if there are any questions on what I'm asking please let me know.

Why isn't xcode calling the correct library function?
Let's say I have 3 C libraries as mentioned by you. Let's say it has the following code.
Library 1 - test1lib.a with code:
#include <stdio.h>
void doStuff()
{
printf("\nDoing stuff for lib1\n");
}
void uniqueEntryPoint1()
{
printf("\nUnique entry point for lib1\n");
doStuff();
}
Library 2 - test2lib.a with code:
#include <stdio.h>
void doStuff()
{
printf("\nDoing stuff for lib2\n");
}
void uniqueEntryPoint2()
{
printf("\nUnique entry point for lib2\n");
doStuff();
}
Library 3 - test3lib.a with code:
#include <stdio.h>
void doStuff()
{
printf("\nDoing stuff for lib3\n");
}
void uniqueEntryPoint3()
{
printf("\nUnique entry point for lib3\n");
doStuff();
}
Here each library has a unique function and one common function doStuff()
When we add these 3 libraries to xcode and link them. xcode links but does not load all the objects files. Let's say the objective C code is like this:
#implementation ViewController
- (void)viewDidLoad {
[super viewDidLoad];
// Do any additional setup after loading the view, typically from a nib.
uniqueEntryPoint1();
}
The output is
Unique entry point for lib1
Doing stuff for lib1
In this case xcode will only load symbols which are referred (library 1 objects) in this case.
If you are read about linker flags/options such as -all_load, -force_load and -objC, you will have better understanding.
If we add -all_load linker option, it will force linker to load all the objects of libraries so we will get the following error in xcode
ld: 2 duplicate symbols for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
The reason this fails as the linker detects that doStuff() is redefined multiple times.
The only way around this problem is change the linker input i.e. symbols present in these 3 libraries. This is already mentioned by Josh in the comments. I will add my $0.02 to it.
Possible Solutions
Solution 1:
The best solution (self explanatory) is to change the source code if you have access to the same.
Solution 2:
Use objcopy to rename or prefix the function as provided in the answer of this How to deal with symbol collisions between statically linked libraries?
Now your doubt on how to find the objcopy.
Option 1:
You can use this project https://github.com/RodAtDISA/llvm-objcopy. This will be tricky to compile as it builds along with llvm. You will have to follow instructions at http://llvm.org/docs/GettingStarted.html and http://llvm.org/docs/CMake.html.
If you rewrite the https://github.com/RodAtDISA/llvm-objcopy/blob/master/llvm-objcopy.cpp, you can probably reuse the parsing and object rewriting logic without depending upon llvm.
Option 2:
Compile and reuse binutils objcopy from https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=binutils/objcopy.c;h=2636ab4bcb34cf1e1e54db9933018a805b366727;hb=HEAD
Solution 3:
You can follow the answer provided by Richard in this link Rewriting symbols in static iOS libraries. This is more of a hack but using a hex editor you can rewrite the symbols if their length is kept the same. If you have more symbols and its a big library, you can consider using https://sourceforge.net/projects/bbe-/ and nm to write a script.
All of this is a considerable effort but apparently there is no shortcut.

Related

Prevent ArmClang to add calls to Standard C library

I am evaluating Keil Microvision IDE on STM32H753.
I am doing compiler comparison between ARMCC5 and AC6 in the different optimisation levels. AC6 is based on Clang.
My code is not using memcpy and I have unchecked "Use MicroLIB" in the project settings , However a basic byte per byte copy loop in my code is replaced by a memcpy with AC6 (only in "high" optimisation levels). It doesn't happen with ARMCC5.
I tried using compilation options to avoid that, as described here: -ffreestanding and -disable-simplify-libcalls, at both compiler and linker levels but it didn't change (for the second option, I get an error message saying that the option is not supported).
In the ARMCLANG reference guide i've found the options -nostdlib -nostdlibinc that prevent (??) the compiler to use any function of a standard lib.
However I still need the math.h function.
Do you know how to prevent clang to use functions from the Standard C Lib that are not explicitely called in the code ?
EDIT: here is a quick and dirty reproduceable example:
https://godbolt.org/z/AX8_WV
Please do not discuss the quality of this example, I know it is dumb !!, I know about memset, etc... It is just to understand the issue
gcc know a lot about the memcpy, memset and similar functions and even they are called "the builtin functions". If you do not want those functions to be used by default just use the command line option -fno-builtin
https://godbolt.org/z/a42m4j

Add #include's to the headers of a program using llvm clang

I need to add headers to an already existing program by transforming it with LLVM and Clang.
I have used clang's rewriter to accomplish a similar thing in the changing function names and arguments, etc.
But the header files aren't present in clang's AST. I already know we need to use PPCallbacks (https://clang.llvm.org/doxygen/classclang_1_1PPCallbacks.html) but I am in dire need of some examples on how to make it work with the rewriter if at all possible.
Alternatively, adding a #include statement just before the first
using namespace <namespace>;
Also works. I would like to know an example of this as well.
Any help would be appreciated.
There is a bit of confusion in your question. You need to understand in details how the preprocessor works. Be aware that most of C++ compilation happens after the preprocessing phase (so most C++ static analyzers work after that phase).
In other words, the C++ specification (and also the C specification) defines first what is preprocessing, and then what is the syntax and the semantics of the preprocessed form.
In other words, when compiling foo.cc your compiler see the preprocessed form foo.ii that you could obtain with clang++ -C -E foo.cc > foo.ii
In the 1980s the preprocessor /lib/cpp was a separate program forked by the compiler (and some temporary foo.ii was sitting on the disk and removed at end of compilation). Today, it is -for performance reasons- some initial processing done inside the compiler. But you could reason as if it was still separate.
Either you want to alter the Clang compiler, and it deals (like every other C++ compiler or C++ static analyzer) mostly with the preprocessed form. Then you don't want to add new #include-s, but you want to alter the flow of AST given to the compiler (after preprocessing), and that is a different question: you then want to add some AST between existing AST elements (independently of any preprocessor directives).
Or you want to automatically change the C++ source code. The hard part is determining what you want to change and at what place. I suppose that you have used complex stuff to determine that a #include <vector> has to be inserted after line 34 of file foo.cc. Once you've got that information (and getting it is the hard thing), doing the insertion is pretty trivial. For example, you could read every C++ source line, and insert your line when you have read enough lines.

CFBundleGetFunctionPointerForName and dlsym return NULL for exported function

I have a fork of the JavaScriptCore framework, where I have added a function of my own, which is exported. The framework compiles just find. Running nm on the framework reveals that the function (JSContextCreateBacktrace_unsafe) is indeed exported:
Leo-Natans-Wix-MPB:JavaScriptCore.framework lnatan$ nm -gU JavaScriptCore.framework/JavaScriptCore | grep JSContextCreateBacktrace
00000000004cb860 T _JSContextCreateBacktrace
00000000004cba10 T _JSContextCreateBacktrace_unsafe
However, I am unable to obtain the pointer of that function using CFBundleGetFunctionPointerForName or dlsym; both return NULL. At first, I used dlopen to open my framework, then tried using CFBundleCreate and then CFBundleGetFunctionPointerForName but that also returns NULL.
What could cause this?
Update
Something fishy is going on. I renamed one of the JSC functions, and nm reflects this. However, dlsym is still able to find the function with the original name, rather than the renamed.
It's hard to track this down since it's highly dependent on your specific environment and circumstances, but it is very likely you're running into this issue because the system image has already been loaded and you haven't changed the name of the framework.
If you look at the source code for dlopen in dyld/dyldAPIS.cpp:1458, you'll notice the context passed to dyld is configured with matchByInstallName = true. This context is then passed to load which executes the various stages necessary for image loading. There are a few phases worth noting:
loadPhase2 in dyld/dyld.cpp:2896 extracts the ending of the framework path and searches for it in the search path
loadPhase5check in dyld/dyld:2712 iterates over all loaded images and determines if any of them have a matching install name, and if one does, it returns that instead of loading a new one.
loadPhase5load in dyld/dyld:2601 finally loads the image if it wasn't loaded/found by any earlier steps. (It's worth noting loadPhase5check is executed first, since image loading is a two pass process.)
Given all of the above, I'd try renaming your framework to something besides JavaScriptCore.framework. Depending on the install name of both the system framework and your framework, I'd also recommend changing the install name. (There are plenty of blog articles and StackOverflow posts that document how to do this using install_name_tool -id.)

What is the best way to avoid duplicate symbols in project that will use my iOS framework and one of the dependencies?

Here is a quotation from the other post:
I'm working in a iOS project that includes a static library created by another company. The library include an old version of AFNeworking and I don't have any source files.
Now i need to use a more recent (and less bugged) version of afneworking, but i cannot include the same class twice in the project (of course) because all the "duplicate symbols"
My problem is that I'm preparing an iOS framework and I want to avoid this kind of situation in the future. I'm not talking about AFNetworking, but other quite popular iOS framework. In addition I applied some custom changes in the original framework code.
The one way to avoid "duplicate symbols" and "Class X is implemented in both Y and Z. One of the two will be used" that comes to my mind is to add some prefix to the original framework classes, but is this the right solution?
UPDATE 1:
I tried to apply John's solution but no joy. I have created a simplified project (here is the link to the repo) with two classes FrameworkClass which is present in framework target only, and SharedClass which is present in both framework and application targets, so maybe you can see if I'm doing something wrong. After application did launch I'm still getting:
objc[96426]: Class SharedClass is implemented in both .../TestFramework.framework/TestFramework and .../SymbolsVisibilityTest.app/SymbolsVisibilityTest. One of the two will be used. Which one is undefined
UPDATE 2:
Here is my output from nm based on the provided sample project's framework-output:
0000000000007e14 t -[FrameworkClass doFramework]
0000000000007e68 t -[SharedClass doShared]
U _NSLog
U _NSStringFromSelector
00000000000081f0 s _OBJC_CLASS_$_FrameworkClass
U _OBJC_CLASS_$_NSObject
0000000000008240 s _OBJC_CLASS_$_SharedClass
00000000000081c8 s _OBJC_METACLASS_$_FrameworkClass
U _OBJC_METACLASS_$_NSObject
0000000000008218 s _OBJC_METACLASS_$_SharedClass
0000000000007fb0 s _TestFrameworkVersionNumber
0000000000007f70 s _TestFrameworkVersionString
U ___CFConstantStringClassReference
U __objc_empty_cache
U _objc_release
U _objc_retainAutoreleasedReturnValue
U dyld_stub_binder`
UPDATE 3:
I did manage to "hide" SharedClass symbols by applying the solution by #bleater and my output from nm is now:
U _NSLog
U _NSStringFromSelector
00001114 S _OBJC_CLASS_$_FrameworkClass
U _OBJC_CLASS_$_NSObject
00001100 S _OBJC_METACLASS_$_FrameworkClass
U _OBJC_METACLASS_$_NSObject
U ___CFConstantStringClassReference
U __objc_empty_cache
U _objc_release
U _objc_retainAutoreleasedReturnValue
U dyld_stub_binder`
But I'm still getting double implementation warning in Xcode.
You should limit the visibility of symbols in any framework or library you are developing. Set the default visibility to hidden, and then explicitly mark all the functions in the public interface as visible.
This avoids all the problems you have described. You can then include any version of any public library (AFNetworking, SQLite, etc.), without fear of future conflict because anything linking to your framework or library won't be able to "see" those symbols.
To set the default visibility to hidden you can go into the project settings and set "Symbols Hidden by Default" to YES. It is set to NO unless you change it.
There are at least a couple of ways to mark the symbols from your public interface as "Visible". One is by using an exports file, another is to go through and explicitly mark certain functions as visible:
#define EXPORT __attribute__((visibility("default")))
EXPORT int MyFunction1();
The define is obviously just for convenience. You define EXPORT once and then just add EXPORT to all of your public symbols.
You can find official apple documentation on this here:
Runtime Environment Programming Guide
Update:
I took a look at your sample project, and it looks like I pointed you in the wrong direction. It appears that you can only truly hide C and C++ symbols. So if your were having this problem with a C lib (like sqlite), setting the default visibility to hidden would work. It looks like the nature of the Objective C runtime prevents you from truly making the symbols invisible. You CAN mark the visibility on these symbols, but with Objective-C it appears that is just a way to have the linker enforce what you should or shouldn't be able to use from the library (while still leaving them visible).
So if you redefine a Objective-C symbol in different compilation unit with the same name (by perhaps compiling in a new version of a popular open source library), then you will still have a conflict.
I think your only solution at this point is to do what you first suggested and prefix the symbols you are including into your framework with a unique identifier. It isn't a very elegant solution, but with the limits of the objective C runtime I believe it is probably the best solution available.
So the blog post by Kamil Burczyk was a good starting point, thanks for the hint MichaƂ Ciuba! It has covered most of the symbols, but it didn't cope with categories and class clusters. You can see what category methods are still exposed without any change by invoking nm with parameter list sth like:
nm MyLibrary | grep \( | grep -v "\[LIBRARYPREFIX" | grep -v \(MyLibrary | grep -v ") prefix_"
When it comes to categories we have 3 groups of categories, and they all require a specific, different approach:
Categories on classes that has been renamed by NamespacedDependencies.h
Categories on classes not renamed by NamespacedDependencies.h
Categories on class clusters like NSString, NSArray...
Ad 1.
Everything is ok - class name will be prefixed so category will exist on prefixed sumbol in object file
Ad 2.
This problem occours whenever inside of the dependency we have category on a class like NSObject. It would be exposed without any change in object file, thus would cause a conflict. My approach was to internally rename NSObject to PREFIX_NSObject, this ofcourse requires me also to create and add the PREFIX_NSObject class implementation to the project (empty implementation, just a subclass of original NSObject)
#import "PREFIX_NSObject.h"
#ifndef NSValueTransformer
#define NSValueTransformer __NS_SYMBOL(NSObject)
#endif
Ad 3.
We cannot apply Ad 2. approach here. Actual objects created by let's say PREFIX_NSArray class methods are still of type that wont derive from my presumable PREFIX_NSArray class, so this doesn't make sense as category methods defined on PREFIX_NSArray won't be visible on NSArray derived objects. I ended up by manually prefixing methods of those categories in source code.
It's kind of crazy workflow, but at least gives warranty that everything will be 'invisible' and won't cause a conflict.
It's always good idea to run nm to check if all category symbols are hidden:
nm MyLibrary | grep \( | grep -v "\[LIBRARYPREFIX" | grep -v \(MyLibrary | grep -v ") prefix_"

Unresolved external UrlCombineW

I have a file which looks something like this:
#include <Shlwapi.h>
...
void SomeFunction()
{
UrlCombineW(...)
}
This compiled just fine until I installed another Delphi component in C++ Builder IDE but now reports unresolved external for UrlCombineW. The above call was fine before installing this component.
It seems that the component is overwriting this in some way so I need to explicitly tell the compiler where to look for UrlCombineW. This is a function from Shlwapi.dll.
Compiler does not complain, but how do I explicitly tell the linker where to look for this function and avoid unresolved external error?
Expanding my comment to an answer.
You need to link to Shlwapi.lib in order for the linker to find the functions. (This explanation glosses over a few things, but a .lib, a library file, can be either a static or import library. A static library contains the functions themselves - it's basically a collection of .obj files bundled together; an import library says that functions X, Y and Z are found in a specific DLL.) Either way, if you link the .lib in you will get the functions that you need.
There are a couple of ways to do tell the linker to link in the file:
Use #pragma comment(lib, "Filename.lib") in a .cpp file somewhere. For your case, this is #pragma comment(lib, "Shlwapi.lib").
Add it to the project options, which in turn adds it to the linker command line. In C++ Builder you do this by actually adding the .lib file to the project, ie drag and drop it onto the project in the Project Manager, or use File > Add To Project.
Which you prefer is up to you. I tend to link to localized things locally - so in my code, there's only one unit which uses Shlwapi.h and the fact it does so is an implementation detail hidden from the outside, it's not shown in the interface. Therefore, in that file, I link using #pragma comment at the point I include the header. On the other hand, if you have something used far more widely - to pick the widest example, kernel32.lib - I would add that the project itself. (Note this is an example, you don't actually need to explicitly link to kernel32, that will be done for you!)

Resources