High-performance buffers in Objective-C - iOS

I'm wondering what the most suitable kind of buffer implementation is for audio data in Objective-C. I'm working with audio data on the iPhone, where I do some direct data manipulation/DSP on the audio while recording or playing, so performance matters. I've been doing iPhone development for some months now. Currently I'm dealing with C arrays of element type SInt16 or Float32, but I'm looking for something better.
AFAIK, the performance of pointer-iterated C arrays is unbeatable in an Objective-C environment. However, pointer arithmetic and C arrays are error-prone. You always have to make sure you don't access the arrays out of bounds, and you won't necessarily get an immediate runtime error if you do. And you have to make sure manually that you allocate and free the arrays correctly.
Thus, I'm looking for alternatives. What high-performance alternatives are there? Is there anything in Objective-C similar to C++'s std::vector?
By 'similar' I mean:
good performance
iterable with a pointer-/iterator-based loop
no overhead of boxing/unboxing basic data types like Float32 or SInt16 into Objective-C objects (BTW, what's the correct word for 'basic data types' in Objective-C?)
bounds-checking
possibility to copy/read/write chunks of other lists or arrays into and out of my searched-for list implementation
memory management included
I've searched and read quite a bit, and of course NSData and NSMutableArray are among the mentioned solutions. However, don't they double the processing cost because of the overhead of boxing/unboxing basic data types? That the code looks outright ugly, with a simple 'set' operation becoming some dinosaur named replaceObjectAtIndex:withObject:, isn't my concern, but it still subtly makes me think that this class is not made for me.

NSMutableData hits one of your requirements in that it brings Objective-C memory management semantics to plain C buffers. You can do something like this:
NSMutableData* data = [NSMutableData dataWithLength: sizeof(Float32) * numberOfFloats];
Float32* cFloatArray = (Float32*)[data mutableBytes];
You can then treat cFloatArray as a standard C array and use pointer iteration. When the NSMutableData object is deallocated, the memory backing it is freed. It doesn't give you bounds checking, but it delivers memory-management help while preserving the performance of C arrays.
Also, if you want some help from the tools with bounds-checking issues, read up on Xcode's Malloc Scribble, Malloc Guard Edges, and Guard Malloc options. These make the runtime much more sensitive to bounds problems. They're not useful in production, but they can help shake out issues during development.

The containers provided in the Foundation framework have little to offer for audio processing: they are on the whole rather heavyweight, and they don't provide extrinsic iterators.
Furthermore, none of the audio APIs in iOS or Mac OS X that interact with buffers of samples are Objective-C based, and none take Foundation framework containers as parameters.
Most likely, you would want to make use of the Accelerate Framework for DSP operations, and its APIs all work on arrays of floats or int16s.
Whilst all of those APIs are C-style, C++ and the STL are the obvious weapons of choice for your requirements, and they interwork cleanly with the rest of an application in the guise of Objective-C++. STL code frequently compiles down to something about as efficient as hand-crafted C.
To memory-manage your buffers, perhaps use std::array if you want bounds checking, or std::shared_ptr or std::unique_ptr with a custom deleter if you're not worried about that.
Places where an iterator is expected (for instance, the algorithm functions in <algorithm>) can usually also take pointers to basic types, such as your sample buffers.

Related

Using something other than a Swift array for mutable fixed-size thread-safe data passed to OpenGL buffer

I am trying to squeeze every bit of efficiency out of my application I am working on.
I have a couple arrays that follow the following conditions:
They are NEVER appended to, I always calculate the index myself
They are allocated once and never change size
It would be nice if they were thread safe as long as it doesn't cost performance
Some hold primitives like floats, or unsigned ints. One of them does hold a class.
Most of these arrays at some point are passed into a glBuffer
Never cleared just overwritten
Some of the arrays individual elements are changed entirely by = others are changed by +=
I currently am using Swift native arrays and am allocating them like var arr = [GLfloat](count: 999, repeatedValue: 0); however, I have been reading a lot of documentation, and it sounds like Swift arrays are much more abstract than a traditional C-style array. I am not even sure whether they are allocated in one block or more like a linked list with bits and pieces thrown all over the place. I believe the code above causes the array to be allocated in one contiguous block, but I'm not sure.
I worry that the abstract nature of Swift arrays is wasting a lot of precious processing time. As you can see from the conditions above, I don't need any of the fancy appending or safety features of Swift arrays. I just need it simple and fast.
My question is: In this scenario should I be using some other form of array? NSArray, somehow get a C-style array going, create my own data type?
I'm looking into thread safety: would a different array type that is more thread-safe, such as NSArray, be any slower?
Note that your requirements are contradictory, particularly #2 and #7. You can't operate on them with += and also say they will never change size. "I always calculate the index myself" also doesn't make sense: what else would calculate it? The requirements for things you will hand to glBuffer are radically different from the requirements for things that will hold objects.
If you construct the Array the way you say, you'll get contiguous memory. If you want to be absolutely certain that you have contiguous memory, use a ContiguousArray (but in the vast majority of cases this will give you little to no benefit while costing you complexity; there appear to be some corner cases in the current compiler that give a small advantage to ContiguousArray, but you must benchmark before assuming that's true). It's not clear what kind of "abstractness" you have in mind, but there are no secrets about how Array works. All of stdlib is open source. Go look and see if it does things you want to avoid.
For certain kinds of operations, it is possible for other types of data structures to be faster. For instance, there are cases where a dispatch_data is better and cases where a regular Data would be better and cases where you should use a ManagedBuffer to gain more control. But in general, unless you deeply know what you're doing, you can easily make things dramatically worse. There is no "is always faster" data structure that works correctly for all the kinds of uses you describe. If there were, that would just be the implementation of Array.
None of this makes sense to pursue until you've built some code and started profiling it in optimized builds to understand what's going on. It is very likely that different uses would be optimized by different kinds of data structures.
It's very strange that you ask whether you should use NSArray, since that would be wildly (orders of magnitude) slower than Array for dealing with very large collections of numbers. You definitely need to experiment with these types a bit to get a sense of their characteristics. NSArray is brilliant and extremely fast for certain problems, but not for that one.
But again, write a little code. Profile it. Look at the generated assembler. See what's happening. Watch particularly for any undesired copying or retain counting. If you see that in a specific case, then you have something to think about changing data structures over. But there's no "use this to go fast." All the trade-offs to achieve that in the general case are already in Array.

iOS: Memory consumption for data type allocation

It may be a basic question, but I'm just curious to know the answer. How much memory is occupied when we create a variable such as an int, NSString, NSDictionary, or NSData in an iOS Objective-C program? I assume it will be similar to a C program, based on the size of the data type -- or is it different?
Thank you!
In short, you can't. At least not in a useful fashion.
Just about every class will have any number of allocations that are not directly credited to the instance's own allocation. As well, many classes -- NSDictionary, NSArray, NSString -- are actually part of a class cluster, and thus what is actually allocated is a subclass. Finally, for the various collection and data classes, the size of the associated allocations will vary wildly based on their contents. Some classes -- UIImage, NSData, etc. -- may contain MBs of data that isn't actually represented by a heap allocation, in that they map data in from the filesystem.
Or, to summarize, class_getInstanceSize() is useless.
Instead, you need to focus on the memory usage of your app as a systemic characteristic of its operating behavior. The Allocations Instrument can do a good job of measuring memory usage and, more importantly, help you identify what is responsible for consumption therein.
You should consider the size of Objective-C objects an implementation detail that you can't rely on. They may happen to be consistent from one environment to another, but they offer no guarantee of that. Further, there's no consistent and complete way for you to measure their size since they may be implemented in all kinds of clever ways.
If you need precise management of memory allocations, just use C types, which are themselves entirely valid Objective-C.
You can get the instance size of any class by calling class_getInstanceSize. An example:
#import <objc/runtime.h>
size_t size = class_getInstanceSize([NSDictionary class]);
It depends on the content you would like to put into these classes (NSString, NSDictionary, NSData). You could say that NSMutableString and the like change their sizes, but they just reallocate their storage.
The document below from the Apple developer site will be helpful for you.
http://developer.apple.com/library/ios/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/MemoryManagementforYouriOSApp/MemoryManagementforYouriOSApp.html

Object Array in Objective C with ARC

I am writing an application that I'd like to speed up. One way I have thought to do this is by switching from using NSArray and NSMutableArray to using straight c-style arrays of pointers.
I had tried naively to just do:
MyObject** objects = (MyObject**) malloc(N/2 * sizeof(MyObject*));
This produces a compiler error under ARC, because ARC doesn't know the ownership of the object pointers behind a MyObject**;
it can be fixed by adding an explicit ownership qualifier.
My question is how is this memory being handled and how to do memory management mixing C and Objective-C objects.
Two solutions are
MyObject* __weak* objects = (MyObject* __weak*) malloc(N/2*sizeof(MyObject*));
MyObject* __strong* objects = (MyObject* __strong*) malloc(N/2*sizeof(MyObject*));
What are the differences between those two arrays, and how do I go about freeing/releasing them when done? Are NSArrays optimized to the point where this wouldn't result in much of a speedup?
Are NSArrays optimized to the point where this wouldn't result in much of a speedup?
Yes.
You should profile your code in Instruments -- chances are that even if you make heavy use of arrays, you're going to find that your code spends most of its time in places other than NSArray methods like -objectAtIndex:.
Taking that a step further, you should really be able to tell us whether NSArray is optimized sufficiently that you don't need to improve it. If you're looking to speed up your code by replacing NSArray, you should have already profiled your code and identified the expensive parts. Don't just guess at what needs to be improved; measure it.

What are the performance implications of these C# features?

I have been designing a component-based game library, with the overall intention of writing it in C++ (as that is my forte), with Ogre3D as the back-end. Now that I am actually ready to write some code, I thought it would be far quicker to test out my framework under the XNA4.0 framework (somewhat quicker to get results/write an editor, etc). However, whilst I am no newcomer to C++ or C#, I am a bit of a newcomer when it comes to doing things the "XNA" way, so to speak, so I had a few queries before I started hammering out code:
I read about using arrays rather than collections to avoid performance hits, then also read that this was not entirely true, and that if you enumerate over, say, a concrete List<> collection (as opposed to an IEnumerable<>), the enumerator is a value type used for each iteration, so there aren't any GC worries there. The article in question was from back in 2007. Does this still hold true, or do you experienced XNA developers have real-world gotchas about this? Ideally I'd like to settle on a chosen route before I write too much.
If arrays truly are the way to go, no questions asked, I assume that when it comes to resizing the array, you copy the old one into a newly allocated, larger one? Or is this off the mark? Do you attempt to never, ever resize an array? Won't the GC kick in for the old one if you do, or is the hit inconsequential?
As the engine was designed for C++, the design allows for use of lambdas and delegates. One design uses the fastdelegate library which is the fastest possible way of using delegates in C++. A more flexible, but slightly slower approach (though hardly noticeable in the world of C++) is to use C++0x lambdas and std::function. Ideally, I'd like to do something similar in XNA, and allow delegates to be used. Does the use of delegates cause any significant issues with regard to performance?
If there are performance considerations with regards to delegates, is there a difference between:
public delegate void myDelegate(int a, int b);
private void myFunction(int a, int b)
{
}
event myDelegate myEvent;
myEvent += myFunction;
vs:
public delegate void myDelegate(int a, int b);
event myDelegate myEvent;
myEvent += (int a, int b) => { /* ... */ };
Sorry if I have waffled on a bit, I prefer to be clear in my questions. :)
Thanks in advance!
Basically the only major performance issue to be aware of in C# that is different to what you have to be aware of in C++, is the garbage collector. Simply don't allocate memory during your main game loop and you'll be fine. Here is a blog post that goes into detail.
Now to your questions:
1) If a framework collection iterator could be implemented as a value-type (not creating garbage), then it usually (always?) has been. You can safely use foreach on, for example, List<>.
You can verify if you are allocating in your main loop by using the CLR Profiler.
2) Use Lists instead of arrays. They'll handle the resizing for you. You should use the Capacity property to pre-allocate enough space before you start gameplay to avoid GC issues. Using arrays you'd just have to implement all this functionality yourself - ugly!
The GC kicks in on allocations (not when memory becomes free). On Xbox 360 it kicks in for every 1MB allocated and is very slow. On Windows it is a bit more complicated - but also doesn't have such a huge impact on performance.
3) C# delegates are pretty damn fast. And faster than most people expect. They are about on par with method calls on interfaces. Here and here are questions that provide more details about delegate performance in C#.
I couldn't say how they compare to the C++ options. You'd have to measure it.
4) No. I'm fairly sure this code will produce identical IL. You could disassemble it and check, or profile it, though.
I might add - without checking myself - I suspect that having an event myDelegate will be slower than a plain myDelegate if you don't need all the magic of event.

Does functional programming take up more memory?

Warning! possibly a very dumb question
Does functional programming eat up more memory than procedural programming?
I mean... if your objects (data structures, whatever) are all immutable, don't you end up having more objects in memory at a given time?
Doesn't this eat up more memory?
It depends on what you're doing. With functional programming you don't have to create defensive copies, so for certain problems it can end up using less memory.
Many functional programming languages also have good support for laziness, which can further reduce memory usage as you don't create objects until you actually use them. This is arguably something that's only correlated with functional programming rather than a direct cause, however.
Persistent values, that functional languages encourage but which can be implemented in an imperative language, make sharing a no-brainer.
The generally accepted idea is that with a garbage collector there is some amount of wasted space at any given time (blocks that are already unreachable but not yet collected). On the other hand, without a garbage collector you very often end up copying values that are immutable and could be shared, just because it's too much of a mess to decide who is responsible for freeing the memory after use.
These ideas are expanded on a bit in this experience report which does not claim to be an objective study but only anecdotal evidence.
Apart from avoiding defensive copies by the programmer, a very smart implementation of pure functional programming languages like Haskell or Standard ML (which lack physical pointer equality) can actively recover sharing of structurally equal values in memory, e.g. as part of the memory management and garbage collection.
Thus you can have automatic hash consing provided by your programming language runtime-system.
Compare this with objects in Java: object identity is an integral part of the language definition. Even just exchanging one immutable String for another poses semantic problems.
There is indeed at least a tendency to regard memory as an abundant resource (which, in fact, it really is in most cases), but this applies to modern programming as a whole.
With multiple cores, parallel garbage collectors, and available RAM in the gigabytes, one concentrates on different aspects of a program than in earlier times, when every byte one could save counted. Remember when Bill Gates supposedly said "640K ought to be enough for anybody"?
I know I'm quite late to this question.
Functional languages do not, in general, use more memory than imperative or OO languages. It depends more on the code you write. Yes, F#, SML, Haskell and the like have immutable values (not variables), but for all of them, if you update e.g. a singly linked list, only what is necessary is recomputed.
Say you have a list of 5 elements, you remove the first 3, and you prepend new elements in front of what remains. The new list simply takes the pointer to the fourth element and points at that data, i.e. it reuses it, as seen below.
old list: [x0, x1, x2] --\
                          +--> [x3, x4]
new list: [y0, y1] ------/
If it were an imperative language, we could not do this, because the values x3 and x4 could very well change over time, and the list [x3, x4] could change too. And if the 3 removed elements are not used afterward, the memory they use can be cleaned up right away, in contrast to unused space in an array.
That all data is immutable (except IO) is a strength. It simplifies data-flow analysis from a non-trivial computation to a trivial one. Combined with an often very strong type system, this gives the compiler a lot of information about the code, which it can use for optimizations it normally could not do because of aliasing. Most often the compiler turns values that are recomputed recursively and discarded on each iteration (recursion) into a mutable computation. These two things underpin the promise that if your program compiles, it will work (with some assumptions).
If you look at the language Rust (not functional), just learning about its borrow system will teach you a lot about how and when things can be shared safely. It is a language that is painful to write code in unless you enjoy the compiler telling you you're wrong. Rust is for the most part the combination of more than 40 years of study of programming languages and type theory. I mention Rust because, despite the pain of writing in it, it promises that if your program compiles, there will be no dangling pointers or data races, even in multi-threaded programs. This is because it uses much of the research done on functional programming languages.
For a more complex example of functional programming using less memory: I have written a lexer/parser interpreter (the same as a generator, but without the need to generate a code file). When computing the states of the DFA (deterministic finite automaton), it uses immutable sets; because it computes new sets from already-computed sets, my code allocates less memory, simply because it reuses already-known data points instead of copying them into a new set.
To wrap up: yes, functional programming can use more memory than imperative programming, but most likely that's because you are using the wrong abstraction to mirror the problem. That is, if you try to do it the imperative way in a functional language, it will hurt you.
Try this book; it doesn't have much on memory management, but it is a good book to start with if you want to learn about compiler theory, and yes, it is legal to download. I have asked Torben; he is my old professor.
http://hjemmesider.diku.dk/~torbenm/Basics/
I'll throw my hat in the ring here. The short answer to the question is no, because a value being immutable is not the same as it having to stay in memory. For example, let's take this toy program:
x = 2
x = x * 3
x = x * 2
print(x)
This version uses mutation to compute new values. Compare it to the same program written without mutation:
x = 2
y = x * 3
z = y * 2
print(z)
At first glance, it appears this requires 3x the memory of the first program! However, just because a value is immutable doesn't mean it needs to be stored in memory. In the case of the second program, after y is computed, x is no longer necessary, because it isn't used for the rest of the program, and can be garbage collected, or removed from memory. Similarly, after z is computed, y can be garbage collected. So, in principle, with a perfect garbage collector, after we execute the third line of code, I only need to have stored z in memory.
Another oft-worried-about source of memory consumption in functional languages is deep recursion, for example computing a large factorial.
def calc_fact(x):
    if x > 1:
        return x * calc_fact(x - 1)
    else:
        return x
If I run calc_fact(100000), I could implement this in a way which requires storing 100000 pending frames in memory, or I could use tail-call elimination (basically keeping only the most recently computed value instead of every pending call). For less straightforward recursion you can resort to trampolining. So for functional languages which support this, recursion does not need to be a source of massive memory consumption either. However, not all nominally functional languages support it (for example, JavaScript does not).
