How to use ARM intrinsics in iOS? - ios

I need to compute the MSB (most significant bit) of millions of 32-bit integers on an iPad, very fast. I have my own (ugly) implementation of MSB written in plain C, which is slow. ARM processors have a CLZ (count leading zeros) hardware instruction, which would be very useful for that. According to the ARM reference there is an intrinsic C function __CLZ. How can I add support for ARM intrinsic functions to my Xcode project?
P.S. I've managed to find a way of accessing hardware CLZ from NEON (by including arm_neon.h), but that's not what I need, because it only works with vectors and I need a scalar MSB.

I found the ARM intrinsic function names on page 44 of ARM C Language Extensions. Some of them work in Xcode. This prints 31, as expected:
NSLog(@"%u", __builtin_clz(1));
Notes:
I haven't found any reference to this in the Apple docs. Most likely Xcode inherited those functions from LLVM or Clang.
You don't need to include any special headers or frameworks to use those functions. Xcode IDE autocomplete doesn't know about them.
Only a few functions from the extensions list are implemented. According to pages 12-13 of the same document there should be two header files: arm_acle.h for non-NEON intrinsics and arm_neon.h for NEON intrinsics. Xcode has only the second file, but some of the functions from the first file are declared somewhere else.
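As a minimal sketch of how the builtin could give a scalar MSB, assuming a GCC- or Clang-compatible compiler and a 32-bit unsigned int: the index of the highest set bit of a nonzero value is 31 minus its leading-zero count. The name msb_index is hypothetical, and zero is handled separately because the result of __builtin_clz(0) is undefined.

```c
#include <stdint.h>

/* Hypothetical helper: index (0..31) of the most significant set bit,
 * or -1 when x is zero. __builtin_clz(0) is undefined, so zero is
 * checked before the builtin is called. Assumes a GCC/Clang-compatible
 * compiler and a 32-bit unsigned int. */
static inline int msb_index(uint32_t x)
{
    if (x == 0) {
        return -1;
    }
    return 31 - __builtin_clz(x);
}
```

For example, msb_index(1) is 0 and msb_index(0x80000000u) is 31. Whether the call compiles down to a single CLZ instruction depends on the target and optimization settings, so it is worth checking the generated assembly.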

This may be obvious, but if you use ARM-specific instructions, you will not be able to run your app in the iOS simulator. The simulator uses the native x86-64 hardware of your Mac.
You could create a wrapper function that uses a compiler directive to select the ARM instruction, or falls back to the "ugly" code if you don't have support.
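The wrapper idea might look roughly like the sketch below. The names clz32 and clz32_portable are hypothetical, and the choice to test for the builtin with __has_builtin (rather than testing for an ARM target) is an assumption of this sketch, not something the answer specifies. The portable loop stands in for the "ugly" fallback.

```c
#include <stdint.h>

/* Clang (and recent GCC) provide __has_builtin; define it to 0
 * elsewhere so the preprocessor check below still compiles. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Portable fallback: count leading zeros by shifting left until the
 * top bit is set. Returns 32 for x == 0. */
static int clz32_portable(uint32_t x)
{
    int n = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Hypothetical wrapper: use the compiler builtin when it is available,
 * otherwise fall back to the portable loop. Returns 32 for x == 0 on
 * both paths, because __builtin_clz(0) is undefined. */
static inline int clz32(uint32_t x)
{
#if __has_builtin(__builtin_clz) || defined(__GNUC__)
    return x ? __builtin_clz(x) : 32;
#else
    return clz32_portable(x);
#endif
}
```

Callers use clz32 everywhere and never touch the builtin directly, so the same source builds for the device and for the simulator.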

Related

Does the Swift toolchain eliminate code that is never called?

If I create an Xcode project with the iOS Single View Application template and choose Swift for the language, will the compiler exclude from the release build (binary) functions that never get called?
I'm wondering because I want to include a third-party library that has a lot of superfluous classes & functions, and I want to keep my app small & fast.
While I agree with comments, it is unlikely to impact performance in any significant way even if it was included...
Xcode 6 uses Apple LLVM Compiler version 6.1. Depending on how closely it tracks the LLVM Developer Group's version, the optimization passes documented at http://llvm.org/docs/Passes.html should be available, such as -dce (Dead Code Elimination) and -adce (Aggressive Dead Code Elimination).
One way to know for sure what is included is to check the assembly output using the -emit-assembly option of the Swift compiler, or to open the binary in a disassembler such as Hopper ( http://www.hopperapp.com/download.html ).

Is there a way to detect VFP/NEON/Thumb/... on iOS at runtime?

So it's fairly easy to figure out what kind of CPU an iOS device runs by querying sysctlbyname("hw.cpusubtype", ...), but there seems to be no obvious way to figure out what features the CPU actually has (think VFP, NEON, Thumb, ...). Can someone think of a way to do this?
Basically, what I need is something similar to getauxval(AT_HWCAP) on Linux/Android, which returns a bit mask of features supported by the CPU.
A few things to note:
The information must be retrieved at runtime from the OS. No preprocessor defines.
Fat binaries are not a solution. I really do need to know this stuff in an ARM v6 binary.
Thanks in advance!
sysctlbyname has “hw.optional.neon”. I do not see a name for VFP, except “hw.optional.vfp_shortvector”, which is a deprecated feature.
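A minimal sketch of that query, assuming the key yields an integer flag and that a failed lookup should be reported separately from "no NEON". The helper name has_neon is hypothetical, and the non-Apple branch exists only so the sketch compiles elsewhere.

```c
#include <stddef.h>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper: returns 1 if the OS reports NEON support,
 * 0 if it reports none, and -1 if the query fails or the platform
 * is not an Apple one. */
static int has_neon(void)
{
#if defined(__APPLE__)
    int value = 0;
    size_t size = sizeof(value);
    if (sysctlbyname("hw.optional.neon", &value, &size, NULL, 0) != 0) {
        return -1; /* key missing or query failed */
    }
    return value != 0;
#else
    return -1; /* sysctlbyname is not assumed to exist here */
#endif
}
```

The result can be cached at startup and used to select a code path at runtime, which is the role getauxval(AT_HWCAP) plays in the question.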
Do a float matrix multiplication via Accelerate.framework and measure the execution time. The difference between NEON-driven and VFP-driven math will be large enough that you simply cannot miss it.
Thumb is always there, and the presence of NEON means armv7, and therefore Thumb-2.
First, consider carefully whether or not you really need to support armv6 binaries for iOS. According to published version share statistics, something like 98.5% of iOS devices are running iOS 5.0 or later, which does not support armv6 devices (armv6 binaries will still run on current iOS versions, obviously, but all new apps should really be targeting armv7; there’s basically zero reason for your customers to be shipping armv6 binaries for iOS today).
Similarly, your concerns about code size are misplaced. If you provide a fat library, and your customer builds an armv6 binary against it, only the armv6 bits of your library will be built into their application. Furthermore, code size is usually a nearly trivial fraction of application bundle size; most of the size of an application comes from other resources.
Ok. All that aside, if you really want to pursue this: VFP and thumb are supported on all iOS devices, so there’s no need to check for support. You can check for NEON and thumb-2 using the method that Eric Postpischil suggested (all armv7 iOS devices have NEON support, so availability of NEON coincides exactly with availability of thumb-2).

OS X: convert .dylib to .a/.o (dynamic to static)?

Suppose I've read this caveat, and I still want to use TBB as a statically-linked library. (Pretend I'm working in an environment where users aren't allowed to create their own dylibs.) But I don't really want to rewrite the TBB makefile to generate libtbb.a instead of libtbb.dylib.
Is there a simple command-line way to convert libtbb.dylib into libtbb.o with the same entry points?
I have heard a good argument for not being able to go the other way, from static to dynamic. Namely: dynamic libraries need to be PIC, and converting a non-PIC static library to PIC isn't feasible. But that argument doesn't apply in the other direction, as far as I know.
Here's someone saying it's impossible to convert .dll to .a on Windows, but I think they're just talking about the impossibility of breaking a .dll or .exe back up into its original .o files, not necessarily saying it would be impossible to create a linkable .o file with the same contents. Also, the situation on Windows is slightly odder than "real" PIC, although I don't think that's relevant.
Intel Threading Building Blocks (TBB) is available as binary for Windows, Mac and Linux. If you expect to use libtbb.dylib from the Mac distribution on iOS then you are out of luck. The Mac distribution is targeted for Intel (32 and 64 bits). Since iOS runs on ARM processors, you could not use it, even if you found a way to convert a dynamic library to a static library.
If you found a libtbb.dylib file somewhere else targeted for ARM, then you could probably use it on iOS. It's actually possible to load dynamic libraries on iOS. Have a look at the dlopen(3) man page.
Finally, you should read about Grand Central Dispatch (GCD) instead, which is built-in support for concurrent code execution on multicore hardware in iOS and OS X.

using a C Dll and lib in obj c - ios

I have a C lib and DLL file from a Windows application. I don't have the source code.
Is it possible to use them in an iOS application?
I have seen mixed responses and am confused.
If we had the source code, I think we would need to create a dylib, and then we could use it after including the relevant header file.
Please share any expert ideas to guide me in the right direction.
Appreciate your help.
mia
Dynamic Libraries are not permitted on iOS to begin with, but above that, the DLL file format is not recognized by Darwin or the underlying XNU Kernel at all, as the binary format is different.
Windows APIs are not usable on the Darwin OS either (Both Mac OS X and iOS are wrappers around the basic Darwin OS). You will need to rewrite the code from the DLL to use the POSIX and/or Objective-C APIs and compile it as a static library to use it.
You need to get an iOS-compatible library; there is no other way around it. There are several reasons:
iOS doesn't support DLLs, as they are a Windows format. Moreover, you can't use any dynamic library on iOS, as Apple restricts it.
DLLs are usually built for Intel CPUs, while iOS devices have ARM CPUs.
Most DLLs call Windows APIs. Are you sure this one doesn't?
No. If all you have is a compiled binary DLL, there is no way to use it on iOS. Unless you happen to have an ARM DLL for the upcoming Windows 8, your DLL contains either x86 or x86-64 machine code (or maybe IA64 if you have a lot of money), which absolutely will not run on iOS devices, which are all ARM architectures. Plus many more reasons.
If you have the source code, you can recompile it for iOS, either directly into your app, as a static library that can be linked in with your app, or as a dynamic library as part of a framework. But in all cases, you need to recompile it from source code using the iOS compiler.
You are going to have to recompile it as a static library (.a file). Apple doesn't allow dynamic libraries except for their own frameworks (so you can't compile it as a dylib).

Trying to mix in OpenCL with CUDA in NVIDIA's SDK template

I have been having a tough time setting up an experiment where I allocate memory with CUDA on the device, take that pointer to memory on the device, use it in OpenCL, and return the results. I want to see if this is possible. I had a tough time getting a CUDA project to work, so I just used NVIDIA's template project in their SDK. In the makefile I added -lOpenCL to the libs section of common.mk. Everything is fine when I do that, but when I add #include <CL/cl.h> to template.cu so I can start making OpenCL calls, I get over 100 errors. They all look similar to this, but with different function names at the end:
/usr/lib/gcc/x86_64-linux-gnu/4.4.1/include/xmmintrin.h(334): error:
identifier "__builtin_ia32_cmpeqps" is undefined
I am having a hard time figuring out why. Please help if you can. Also, if there is an easier way to set up a project that'll be able to call the CUDA and OpenCL APIs let me know.
I haven't really worked with CUDA, so I don't know how helpful my answer is.
From what I understand, you are trying to use OpenCL directly from your CUDA host code, which, if I remember correctly, is compiled using a compiler from NVIDIA instead of the standard gcc. So the problem is probably that this compiler doesn't implement the builtins needed by the mentioned headers.
Look here for a similar problem and its solution:
http://forums.nvidia.com/lofiversion/index.php?t88573.html
It seems you have to put everything that needs the OpenCL API into a different (non-CUDA) compilation unit so that it will be compiled by the non-NVIDIA compiler.
However, I wouldn't count on this working (since OpenCL buffers aren't just pointers to memory but should contain some meta-information too), simply because there is no real reason it should work, and if it does, there is no guarantee that it will continue to do so.
What you could try, if you really want to, is using OpenGL for the interop, since both OpenCL and CUDA have extensions that allow creating buffers from OpenGL buffers.
However, why do you need to do this? What's keeping you from using Apple's implementation in the short term, since IIRC it's open source and most of it (the OpenCL parts) should be platform independent anyway?

Resources