CFMutableArray grows beyond its capacity - ios

Consider a CFMutableArray object created with the following function call:
CFMutableArrayRef marray = CFArrayCreateMutable(kCFAllocatorDefault, 1, &kCFTypeArrayCallBacks);
According to the CFMutableArray Documentation, the second argument of CFArrayCreateMutable, which is called capacity, is "the maximum number of values that can be contained by the new array. The array starts empty and can grow to this number of values (and it can have less).
Pass 0 to specify that the maximum capacity is not limited. The value must not be negative."
However, if I append more than one value to my new array, it keeps growing. That is, if the array already holds one value and I append another with CFArrayAppendValue(marray, newValue), the value is stored and the array count goes to 2, exceeding its capacity.
So why does this happen? Did I misunderstand the documentation?

Interesting. I don't think you're misunderstanding the documentation. I will point out that CFMutableArrayRef is toll-free bridged with NSMutableArray on iOS and therefore interchangeable. On iOS, [NSMutableArray arrayWithCapacity:] is NOT so limited: its capacity is basically guidance, NOT a hard upper limit.
It might be worth filing a bug report on that, probably the docs are just wrong.
UPDATE: Just goes to show ya... always, always, always follow the maxim that the best docs are the source. I Cmd-clicked on CFArrayCreateMutable to look at the comment in the source .h file... and guess I was right, because it says capacity is a HINT that the implementation may ignore, as it apparently does in this case.
@function CFArrayCreateMutable
⋮
@param capacity A hint about the number of values that will be held by the
    CFArray. Pass 0 for no hint. The implementation may ignore this hint,
    or may use it to optimize various operations. An array's actual
    capacity is only limited by address space and available memory
    constraints. If this parameter is negative, the behavior is undefined.
Don't forget, header comments are written by the developers, whilst "docs" are written by a tech writer who doesn't have the same depth of knowledge.

From docs:
CFArrayAppendValue
Adds a value to an array giving it the new largest index.
Parameters
theArray
The array to which value is to be added. If theArray is a limited-capacity array and it is full before this operation, the behavior is undefined.
So I think you just have to make sure not to add a value to a full limited-capacity array, or the behavior will be undefined!
Or, you can pass 0 for the capacity parameter to specify that the maximum capacity is not limited.
Update:
As @Cliff Ribaudo pointed out, there seems to be a contradiction between the official documentation and the code documentation:
@param capacity A hint about the number of values that will be held
    by the CFArray. Pass 0 for no hint. The implementation may
    ignore this hint, or may use it to optimize various
    operations. An array's actual capacity is only limited by
    address space and available memory constraints.
So we can assume the online documentation is outdated and the code documentation is probably right.

Related

Reorder Buffer: Pointer values when it is full?

The re-order buffer used in the Tomasulo's algorithm for out-of-order execution and branch speculation has 2 pointers.
Head pointer: Points to the oldest instruction. This will be the next
instruction to be committed.
Tail pointer: Points where the newest instruction will be added.
Each ROB entry has 4 fields: instruction type, destination, value and ready field.
I've seen many sources teaching it this way. The ROB entry doesn't need a committed bit or busy bit.
For a ROB of size N.
If the ROB is empty or flushed, then TAIL=HEAD.
If the ROB has one free slot and HEAD=0, then TAIL=N-1.
If the ROB is full and HEAD=0, then TAIL=?
If TAIL=HEAD how do we know if the ROB is full or empty?
Normally it's a circular buffer (indices wrap around); the almost-full state doesn't require HEAD=0, the head can be at an arbitrary position.
Yes, there's an ambiguity between full and empty if you use only HEAD and TAIL read/write indices. As the Wikipedia article on circular buffers explains, one way to solve that is to track the used capacity, which can range over 0..n instead of only 0..n-1.
For thinking about CPUs, this is an implementation detail where it doesn't matter exactly how the hardware solves it; it's safe to assume that it correctly and efficiently implements a circular buffer that can use all its entries. (Or, unlikely as that is, that the effective ROB size is 1 less than the physical ROB size.)
BTW, your model with HEAD=0 corresponds to adding instructions at the current tail but retiring (committing) by shifting every entry down one index instead of moving HEAD. In a circular buffer, the condition to watch for is HEAD = (TAIL-1)%n or something like that.

When to unref a GVariant that has a floating reference?

https://developer.gnome.org/glib/unstable/glib-GVariant.html#g-variant-ref-sink
I have read the above glib manual which says: "GVariant uses a floating reference count system. All functions with names starting with g_variant_new_ return floating references." But where is the actual description of what a floating reference count is? I couldn't find a comprehensive description of it.
In particular I want to understand when there is a need to unreference a variant and when not to. For example:
GVariant *a_v = g_variant_new_boolean(TRUE);
GVariant *another_v = g_variant_new("v", a_v);
I think I don't need to unreference a_v because it is consumed by the second g_variant_new. Is that correct?
Do I need to unreference another_v (assuming another_v is not passed to anything else from that point on)?
Where is this documented? (I think I have the right understanding, inferred from various examples found while searching, but I can't seem to find the official glib documentation that explains this clearly.)
There is a section on floating references in the GObject reference manual which goes into a bit more detail. Floating references may seem a bit obscure, but they are really very useful for C so taking a few minutes to really understand them is a good idea.
I'm going to assume you understand how reference counting works; if not, there is a lot of documentation out there, so take a few minutes and read up on that first.
First, let's look at what would happen with your example if g_variant_new_boolean returned a regular reference. When you first get the value, the reference count would be 1. When you pass it to g_variant_new, g_variant_new will increase the reference count to 2. At some point I assume you'll dispose of another_v, at which point the reference count for a_v will drop to 1… but remember, the memory isn't released until the reference count reaches 0.
In order to get around this you have two options. The first is to make g_variant_new steal the caller's reference, which basically sucks as a solution. You give away your reference when you call g_variant_new (or any similar function), so in the future you need to manually ref a_v every time you want to pass it to something else.
The other option is to just unref it manually when you're done. It's not the end of the world, but it's easy to forget to do or get wrong (like by forgetting to unref it in an error path).
What GVariant does instead is return a "floating" ref. The easiest way to think of it (IMHO) is that the first time g_variant_ref_sink gets called it doesn't really do anything: it just "sinks" the floating ref, and the reference count goes from 1 to 1. Subsequent calls, however, will increase the reference count.
Now let's look at what actually happens with your example. g_variant_new_boolean returns a floating reference. You then pass it to g_variant_new, which calls g_variant_ref_sink, which sinks the floating reference. The reference count is now 1, and when another_v's refcount reaches 0, a_v's refcount will be decremented, in this case reaching 0, and everything will be freed. No need for you to call g_variant_unref.
The cool part about floating references, though, is what happens with something like this:
GVariant *a_v = g_variant_new_boolean(TRUE);
GVariant *another_v = g_variant_new("v", a_v);
GVariant *yet_another_v = g_variant_new("v", a_v);
When g_variant_new is called the second time, a_v's refcount will increment again (to 2). No need to call g_variant_ref before passing a_v to g_variant_new a second time: the second call looks just like the first, and consistency is a very nice feature in an API.
At this point it's probably obvious, but yes, you do need to call g_variant_unref on another_v (and, in that last example, yet_another_v).
The reference counting system is explained in the manual of GObject, in particular, in the section Object Memory Management.
When to use it may depend on your application (on how ownership of the variables will work).
The idea is similar to the way i-nodes work in Unix/Linux when handling files. A file is an object, located in a specific block of storage. Whenever you create a hard link to that file, the file gains one more owner (the reference count increases). Whenever you remove a link, the reference count decreases. When nothing owns the object anymore, it can be destroyed (and the space given back to the system).
Once an object is destroyed and nothing links to it, you cannot use it anymore. So if your object may have multiple owners, you will want reference counting: when one of the owners drops its reference, the object is not destroyed... not until the last of the owners releases it.

epoll_create and epoll_wait

I was wondering about the parameters of two APIs of epoll.
epoll_create(int size) - in this API, size is described as the size of the event pool. But it seems that having more events than the size still works. (I set the size to 2 and forced the event pool to hold 3 events... and it still works!?) So I was wondering what this parameter actually means, and I'm curious about its maximum value.
epoll_wait(int maxevents) - for this API, the definition of maxevents is straightforward. However, there seems to be a lack of information or advice on how to determine this parameter. I expect it should depend on the size of the epoll event pool. Any suggestions or advice would be great. Thank you!
1.
"man epoll_create"
DESCRIPTION
...
The size is not the maximum size of the backing store but just a hint
to the kernel about how to dimension internal structures. (Nowadays,
size is unused; see NOTES below.)
NOTES
Since Linux 2.6.8, the size argument is unused, but must be greater
than zero. (The kernel dynamically sizes the required data structures
without needing this initial hint.)
2.
Just determine a reasonable number yourself, but be aware that
giving it a small number may reduce efficiency a little bit,
because the smaller the number assigned to maxevents, the more often you may have to call epoll_wait() to consume all the events already queued on the epoll instance.

How can I avoid running out of memory with a growing TDictionary?

TDictionary<TKey,TValue> uses an internal array that is doubled if it is full:
newCap := Length(FItems) * 2;
if newCap = 0 then
newCap := 4;
Rehash(newCap);
This performs well with a moderate number of items, but near the upper limit it is very unfortunate, because it might throw an EOutOfMemory exception even though almost half of the memory is still available.
Is there any way to influence this behaviour? How do other collection classes deal with this scenario?
You need to understand how a dictionary works. A dictionary contains a list of "hash buckets" into which the items you insert are placed. That's a finite number, so once you fill them up you need to allocate more buckets; there's no way around it. Since the assignment of objects to buckets is based on the result of a hash function, you can't simply add buckets to the end of the array and put stuff in there: you need to re-allocate the whole list of buckets, re-hash everything, and put it into the (new) corresponding buckets.
Given this behavior, the only way to make the dictionary never re-allocate is to make sure it never gets full. If you know the number of items you'll insert into the dictionary, pass it as a parameter to the constructor and you'll be done: no more dictionary reallocations.
If you can't do that (you don't know how many items the dictionary will hold), you'll need to reconsider what made you select TDictionary in the first place and pick a data structure that offers a better compromise for your particular algorithm. For example, you could use binary search trees, since they balance by rotating information between existing nodes; no reallocations are ever needed.

What happens when increasing the size of a dynamic array?

I'd like to understand what happens when the size of a dynamic array is increased.
My understanding so far:
Existing array elements will remain unchanged.
New array elements are initialised to 0
All array elements are contiguous in memory.
When the array size is increased, will the extra memory be tacked onto the existing memory block, or will the existing elements be copied to an entirely new memory block?
Does changing the size of a dynamic array have consequences for pointers referencing existing array elements?
Thanks,
[edit] Incorrect assumption struck out. (New array elements are initialised to 0)
Existing array elements will remain unchanged: yes
New array elements are initialized to 0: yes (see update) no, unless it is an array of compiler-managed types such as string, another array, or a variant
All array elements are contiguous in memory: yes
When the array size is increased, the array will be copied. From the doc:
...memory for a dynamic array is reallocated when you assign a value to the array or pass it to the SetLength procedure.
So yes, increasing the size of a dynamic array does have consequences for pointers referencing existing array elements.
If you want to keep references to existing elements, use their index in the array (0-based).
Update
Comments by Rob and David prompted me to check the initialization of dynamic arrays in Delphi 5 (as I have that readily available anyway). First I used some code to create various types of dynamic arrays and inspected them in the debugger. They were all properly initialized, but that could still have been a result of prior initialization of the memory location where they were allocated. So I checked the RTL. It turns out D5 already has the FillChar statement in the DynArraySetLength method that Rob pointed to:
// Set the new memory to all zero bits
FillChar((PChar(p) + elSize * oldLength)^, elSize * (newLength - oldLength), 0);
In practice Embarcadero will always zero initialise new elements simply because to do otherwise would break so much code.
In fact it's a shame that they don't officially guarantee the zero allocation, because it is so useful. The point is that at the call site, when writing SetLength, you often don't know whether you are growing or shrinking the array. But the implementation of SetLength does know; clearly it has to. So it really makes sense to have a well-defined action for any new elements.
What's more, if they want people to be able to switch easily between managed and native worlds then zero allocation is desirable since that's what fits with the managed code.