When to unref a GVariant that has a floating reference? - glib

https://developer.gnome.org/glib/unstable/glib-GVariant.html#g-variant-ref-sink
I have read the above glib manual which says: "GVariant uses a floating reference count system. All functions with names starting with g_variant_new_ return floating references." But where is the actual description of what a floating reference count is? I couldn't find a comprehensive description of it.
In particular I want to understand when there is a need to unreference a variant and when not to. For example:
GVariant *a_v = g_variant_new_boolean(TRUE);
GVariant *another_v = g_variant_new("v", a_v);
I think I don't need to unreference a_v because it is consumed by the second g_variant_new. Is that correct?
Do I need to unreference another_v (assuming another_v is not passed to anything else from that point on)?
Where is this documented? (I think I have the right understanding by inferring from different examples found during search but can't seem to find the official glib documentation that explains this clearly).

There is a section on floating references in the GObject reference manual which goes into a bit more detail. Floating references may seem a bit obscure, but they are really very useful for C so taking a few minutes to really understand them is a good idea.
I'm going to assume you understand how reference counting works; if not, there is a lot of documentation out there, so take a few minutes and read up on that first.
First, let's look at what would happen with your example if g_variant_new_boolean returned a regular reference. When you first get the value, the reference count would be 1. When you pass it to g_variant_new, g_variant_new will increase the reference count to 2. At some point I assume you'll dispose of another_v, at which point the reference count for a_v will drop to 1… but remember, the memory isn't released until the reference count reaches 0, so a_v would leak.
In order to get around this you have two options. The first is to make g_variant_new steal the caller's reference, which basically sucks as a solution. You give away your reference when you call g_variant_new (or any similar function), so in the future you need to manually ref a_v every time you want to pass it to something else.
The other option is to just unref it manually when you're done. It's not the end of the world, but it's easy to forget to do or get wrong (like by forgetting to unref it in an error path).
What GVariant does instead is return a "floating" ref. The easiest way to think of it (IMHO) is that the first time g_variant_ref_sink gets called it doesn't really change the count; it just "sinks" the floating ref, turning it into a normal one. The reference count goes from 1 to 1. Subsequent calls to g_variant_ref_sink, like any call to g_variant_ref, will increase the reference count.
Now let's look at what actually happens with your example. g_variant_new_boolean returns a floating reference. You then pass it to g_variant_new, which calls g_variant_ref_sink, which sinks the floating reference. The reference count is now 1, and when another_v's refcount reaches 0, a_v's refcount will be decremented, in this case reaching 0, and everything will be freed. No need for you to call g_variant_unref on a_v.
The cool part about floating references, though, is what happens with something like this:
GVariant *a_v = g_variant_new_boolean(TRUE);
GVariant *another_v = g_variant_new("v", a_v);
GVariant *yet_another_v = g_variant_new("v", a_v);
When g_variant_new is called the second time, a_v's refcount will increment again (to 2). No need to call g_variant_ref before passing a_v to g_variant_new a second time: the second call looks just like the first, and consistency is a very nice feature in an API.
At this point it's probably obvious, but yes, you do need to call g_variant_unref on another_v (and, in that last example, yet_another_v).
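To make the sink-versus-increment behaviour concrete, here is a minimal toy model in plain C. It is only a sketch of the idea, not GLib's implementation: the names Node, node_new, node_ref_sink, node_unref and live_nodes are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy object with a floating flag. Hypothetical, NOT GLib's real code. */
typedef struct Node {
    int refcount;
    bool floating;
    struct Node *child;   /* child this node holds a reference to, or NULL */
} Node;

int live_nodes = 0;       /* number of nodes allocated and not yet freed */

/* Modelled on g_variant_ref_sink: claim the floating ref, or add a new one. */
Node *node_ref_sink(Node *n) {
    if (n->floating)
        n->floating = false;   /* count stays at 1 */
    else
        n->refcount++;
    return n;
}

/* Drop one reference; free the node (and release its child) at zero. */
void node_unref(Node *n) {
    assert(n->refcount > 0);
    if (--n->refcount == 0) {
        if (n->child != NULL)
            node_unref(n->child);
        free(n);
        live_nodes--;
    }
}

/* Modelled on g_variant_new_*: the result is floating, the child is sunk. */
Node *node_new(Node *child) {
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->refcount = 1;
    n->floating = true;
    n->child = child ? node_ref_sink(child) : NULL;
    live_nodes++;
    return n;
}
```

With this model, creating a, then b = node_new(a) and c = node_new(a), leaves a with a refcount of 2 without the caller ever touching a's count; unreffing b and c releases all three nodes.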

The reference counting system is explained in the manual of GObject, in particular, in the section Object Memory Management.
When to use it might depend on your application (how the ownership of the variables will work).
The idea is similar to the way inodes work in Unix/Linux when handling files. A file is an object, located in specific blocks in the storage. Whenever you create a hard link to that file, the file is owned by one extra directory entry (the link count increases). Whenever you remove a hard link, the link count decreases. When there is nothing owning the object, it can be destroyed (or the space can be given back to the system). Note that symbolic links do not take part in this counting.
Once an object has been destroyed, you cannot use it anymore. If your object might have multiple owners, then you might want to use reference counting, so that when one of these owners drops its reference, the object does not get destroyed... not until the last of the owners releases it.
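The link-count analogy can be observed directly on a POSIX system. The sketch below assumes a writable /tmp on a filesystem that supports hard links; the helper names link_count and demo_hard_links are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the hard-link count of path, or -1 on error. */
long link_count(const char *path) {
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}

/*
 * Creates a temp file, adds a second name for it, then removes the first.
 * counts[0..2] receive the link count observed after each step.
 * Returns 0 on success, -1 on failure.
 */
int demo_hard_links(long counts[3]) {
    char original[] = "/tmp/refcount_demo_XXXXXX";
    char alias[sizeof original + 8];
    int fd = mkstemp(original);
    if (fd < 0)
        return -1;
    close(fd);
    snprintf(alias, sizeof alias, "%s.link", original);

    counts[0] = link_count(original);   /* one name refers to the data */
    if (link(original, alias) != 0) {
        unlink(original);
        return -1;
    }
    counts[1] = link_count(original);   /* two names refer to the data */
    unlink(original);                   /* one "owner" lets go */
    counts[2] = link_count(alias);      /* data is still reachable */
    unlink(alias);                      /* last name gone: space can be reclaimed */
    return 0;
}
```

The count goes up with each extra name and down with each unlink, and the data only becomes reclaimable when the last name is removed.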


Related

Get iteration number in UMAT

Although it is easy to get the current step or increment number (variables KSTEP and KINC), I can't find an easy way to know the iteration number when inside the subroutine UMAT.
I know the following "if clause" will tell me if I'm in the first iteration of the first increment of the first step:
IF((KINC.EQ.1).AND.(SUM(STRAN+DSTRAN).EQ.0.0D0)) THEN
And I also know that I can get the iteration number writing to external files. However, is it possible to do it just inside the UMAT subroutine?
There is never really a reason to need to know the iteration number in a UMAT. If you think you need to know it, this is often a sign that there is a better way to achieve what you want.
You can use a common block to track how often you enter a umat, and also which iteration you are on. But I really recommend against this. There is no good reason to know the iteration number. Unless your algorithm is perfect it will cause you more problems than it's worth.
Also in your code to check for the first increment - that will not tell you when you are in a real iteration, it will happen in PRE most likely.

Instantiation within loop - primitives and objects

To make this language agnostic let's pseudo code something along the lines of:
for(int i=0;i<=N;i++){
double d=0;
userDefinedObject o=new userDefinedObject();
//effectively do something useful
o.destroy();
}
Now, this may get into deeper details between Java/C++/Python etc, but:
1 - Is doing this with primitives wrong or just sort of ugly/overkill (d could be defined above, and set to 0 in each iteration if need be).
2 - Is doing this with an object actually wrong? Now, I know Java will take care of the memory, but for C++ let's assume we have a proper destructor that we call.
Now - the question is quite succinct - is this wrong or just a matter of taste?
Thank you.
The Java garbage collector takes care of any allocated memory that is no longer referenced, which means that if you instantiate on each iteration, you allocate new memory and lose the reference to the previous object. That said, while the GC will reclaim the unreferenced memory, you also have to consider that memory allocation, and object initialization in particular, takes time and processing. So if you do this in a small program, you're probably not going to notice anything wrong. But let's say you're working with something like a Bitmap: the allocations can quickly eat up your memory.
For both cases, I'd say it is a matter of taste, but in a real-life project you should be sure that you really need to initialize within the loop.

CFMutableArray grows beyond its capacity

Consider a CFMutableArray object created with the following function call:
CFMutableArrayRef marray = CFArrayCreateMutable(kCFAllocatorDefault, 1, &kCFTypeArrayCallBacks);
According to the CFMutableArray Documentation, the second argument of CFArrayCreateMutable, which is called capacity, is "the maximum number of values that can be contained by the new array. The array starts empty and can grow to this number of values (and it can have less).
Pass 0 to specify that the maximum capacity is not limited. The value must not be negative."
However, if I append more than one value to my new array, it keeps growing. I mean, if the new array already has one value and I append a new one with CFArrayAppendValue(marray, newValue), this value is stored and the array count goes to 2, exceeding its capacity.
So why does this happen? Did I misunderstand the documentation?
Interesting. I don't think you're misunderstanding the documentation. I will point out that CFMutableArrayRef is "Toll Free Bridged" on iOS with NSMutableArray and therefore interchangeable. On iOS [NSMutableArray arrayWithCapacity:] is NOT so limited and capacity is basically "guidance" and NOT a hard upper limit.
It might be worth filing a bug report on that, probably the docs are just wrong.
UPDATE: Just goes to show ya... always, always, always follow the maxim of the best docs are the source. I CMD-clicked on CFArrayCreateMutable to look at the comment in the source .h file.... guess I was right, because that says 'capacity' is a HINT that the implementation may ignore, as it apparently does in this case.
@function CFArrayCreateMutable
⋮
@param capacity A hint about the number of values that will be held by the CFArray.
Pass 0 for no hint. The implementation may ignore this hint, or may use it to
optimize various operations. An array's actual capacity is only limited by
address space and available memory constraints). If this parameter is negative,
the behavior is undefined.
Don't forget, header comments are written by developers, whilst "docs" are written by some tech writer that doesn't have the same depth of knowledge.
From docs:
CFArrayAppendValue
Adds a value to an array giving it the new largest index.
Parameters
theArray
The array to which value is to be added. If theArray is a limited-capacity array and it is full before this operation, the behavior is undefined.
So, I think you should only make sure not to add a value to a full limited-capacity array, or the behavior will be undefined!
Or, you can pass 0 for the capacity parameter to specify that the maximum capacity is not limited.
Update:
As @Cliff Ribaudo pointed out, it seems there is a contradiction between the official documentation and the code documentation:
@param capacity A hint about the number of values that will be held
by the CFArray. Pass 0 for no hint. The implementation may
ignore this hint, or may use it to optimize various
operations. An array's actual capacity is only limited by
address space and available memory constraints).
So we can assume the online documentation is outdated and the code documentation is probably the correct one.
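To illustrate what "capacity is a hint" means in practice, here is a small growable array in plain C. It is a hypothetical sketch, not CoreFoundation's implementation; HintArray and its functions are invented names, and the doubling growth policy is just one common choice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array; NOT how CFMutableArray is implemented. */
typedef struct {
    void **items;
    size_t count;
    size_t allocated;
} HintArray;

/* capacity_hint only sizes the first allocation; 0 means "no hint". */
int hint_array_init(HintArray *a, size_t capacity_hint) {
    a->count = 0;
    a->allocated = capacity_hint ? capacity_hint : 4;
    a->items = malloc(a->allocated * sizeof *a->items);
    return a->items ? 0 : -1;
}

/* Appending past the hint simply reallocates; the hint is not a limit. */
int hint_array_append(HintArray *a, void *value) {
    if (a->count == a->allocated) {
        size_t bigger = a->allocated * 2;
        void **grown = realloc(a->items, bigger * sizeof *grown);
        if (grown == NULL)
            return -1;
        a->items = grown;
        a->allocated = bigger;
    }
    a->items[a->count++] = value;
    return 0;
}

/* Release the storage and reset the array to an empty state. */
void hint_array_free(HintArray *a) {
    free(a->items);
    a->items = NULL;
    a->count = a->allocated = 0;
}
```

Initialising with a hint of 1 and appending two values succeeds and leaves a count of 2, which matches the behaviour described in the question.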

How to detect "dangling pointers" if "Assigned()" can't do it?

In another question, I found out that the Assigned() function is identical to Pointer <> nil. It has always been my understanding that Assigned() was detecting these dangling pointers, but now I've learned it does not. Dangling Pointers are those which may have been created at one point, but have since been free'd and haven't been assigned to nil yet.
If Assigned() can't detect dangling pointers, then what can? I'd like to check my object to make sure it's really a valid created object before I try to work with it. I don't use FreeAndNil as many recommend, because I like to be direct. I just use SomeObject.Free.
Access Violations are my worst enemy - I do all I can to prevent their appearance.
If you have an object variable in scope and it may or may not be a valid reference, FreeAndNil is what you should be using. That or fixing your code so that your object references are more tightly managed so it's never a question.
Access Violations shouldn't be thought of as an enemy. They're bugs: they mean you made a mistake that needs fixed. (Or that there's a bug in some code you're relying on, but I find most often that I'm the one who screwed up, especially when dealing with the RTL, VCL, or Win32 API.)
It is sometimes possible to detect when the address a pointer points to resides in a memory block that is on the heap's list of freed memory blocks. However, this requires comparing the pointer to potentially every block in the heap's free list which could contain thousands of blocks. So, this is potentially a computationally intensive operation and something you would not want to do frequently except perhaps in a severe diagnostic mode.
This technique only works while the memory block that the pointer used to point to continues to sit in the heap free list. As new objects are allocated from the heap, it is likely that the freed memory block will be removed from the heap free list and put back into active play as the home of a new, different object. The original dangling pointer still points to the same address, but the object living at that address has changed. If the newly allocated object is of the same (or compatible) type as the original object now freed, there is practically no way to know that the pointer originated as a reference to the previous object. In fact, in this very special and rare situation, the dangling pointer will actually work perfectly well. The only observable problem might be if someone notices that the data has changed out from under the pointer unexpectedly.
Unless you are allocating and freeing the same object types over and over again in rapid succession, chances are slim that the new object allocated from that freed memory block will be the same type as the original. When the types of the original and the new object are different, you have a chance of figuring out that the content has changed out from under the pointer. However, to do that you need a way to know the type of the original object that the pointer referred to. In many situations in native compiled applications, the type of the pointer variable itself is not retained at runtime. A pointer is a pointer as far as the CPU is concerned - the hardware knows very little of data types. In a severe diagnostic mode it's conceivable that you could build a lookup table to associate every pointer variable with the type allocated and assigned to it, but this is an enormous task.
That's why Assigned() is not an assertion that the pointer is valid. It just tests that the pointer is not nil.
Why did Borland create the Assigned() function to begin with? To further hide pointerisms from novice and occasional programmers. Function calls are easier to read and understand than pointer operations.
The bottom line is that you should not be attempting to detect dangling pointers in code. If you are going to refer to pointers after they have been freed, set the pointer to nil when you free it. But the best approach is not to refer to pointers after they have been freed.
So, how do you avoid referring to pointers after they have been freed? There are a couple of common idioms that get you a long way.
Create objects in a constructor and destroy them in the destructor. Then you simply cannot refer to the pointer before creation or after destruction.
Use a local variable pointer that is created at the beginning of the function and destroyed as the last act of the function.
One thing I would strongly recommend is to avoid writing if Assigned() tests into your code unless it is expected behaviour that the pointer may not be created. Your code will become hard to read and you will also lose track of whether the pointer being nil is to be expected or is a bug.
Of course we all do make mistakes and leave dangling pointers. Using FreeAndNil is one cheap way to ensure that dangling pointer access is detected. A more effective method is to use FastMM in full debug mode. I cannot recommend this highly enough. If you are not using this wonderful tool, you should start doing so ASAP.
If you find yourself struggling with dangling pointers and you find it hard to work out why then you probably need to refactor the code to fit into one of the two idioms above.
You can draw a parallel with array indexing errors. My advice is not to check in code for validity of index. Instead use range checking and let the tools do the work and keep the code clean. The exception to this is where the input comes from outside your program, e.g. user input.
My parting shot: only ever write if Assigned if it is normal behaviour for the pointer to be nil.
Use a memory manager, such as FastMM, that provides debugging support, in particular to fill a block of freed memory with a given byte pattern. You can then dereference the pointer to see if it points at a memory block that starts with the byte pattern, or you can let the code run normally and raise an AV if it tries to access a freed memory block through a dangling pointer. The AV's reported memory address will usually be either exactly the byte pattern or close to it.
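The fill-on-free idea can be sketched with a toy fixed-size pool in C. This is not FastMM's code: the pool layout, the function names and the 0x80 fill byte are assumptions chosen for illustration. The pool keeps freed slots inside its own static array, so inspecting them here is well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE   32
#define SLOT_COUNT  8
#define POISON_BYTE 0x80   /* illustrative fill value */

static unsigned char pool[SLOT_COUNT][SLOT_SIZE];
static bool in_use[SLOT_COUNT];

/* Hand out the first unused slot, zeroed; NULL if the pool is full. */
void *pool_alloc(void) {
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            memset(pool[i], 0, SLOT_SIZE);
            return pool[i];
        }
    }
    return NULL;
}

/* Release a slot and overwrite its contents with the poison pattern. */
void pool_free(void *p) {
    size_t i = (size_t)((unsigned char *)p - &pool[0][0]) / SLOT_SIZE;
    assert(i < SLOT_COUNT && in_use[i]);
    memset(pool[i], POISON_BYTE, SLOT_SIZE);
    in_use[i] = false;
}

/* True if the slot p points at consists entirely of poison bytes. */
bool looks_freed(const void *p) {
    const unsigned char *bytes = p;
    for (size_t i = 0; i < SLOT_SIZE; i++)
        if (bytes[i] != POISON_BYTE)
            return false;
    return true;
}
```

Note the limitation: once the slot is handed out again by pool_alloc, looks_freed returns false for the old pointer even though it is still dangling, so the check only helps until the block is reused.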
Nothing can find a dangling (once valid but then not) pointer. It's your responsibility to either make sure it's set to nil when you free its content, or to limit the scope of the pointer variable to only be available within the scope it's valid. (The second is the better solution whenever possible.)
The core point is that the way objects are implemented in Delphi has some built-in design drawbacks:
there is no distinction between an object and a reference to an object. For "normal" variables, say a scalar (like int) or a record, these two use cases can be well told apart - there's either a type Integer or TSomeRec, or a type like PInteger = ^Integer or PSomeRec = ^TSomeRec, which are different types. This may sound like a negligible technicality, but it isn't: a SomeRec: TSomeRec denotes "this scope is the original owner of that record and controls its lifecycle", while SomeRec: PSomeRec tells "this scope uses a transient reference to some data, but has no control over the record's lifecycle". So, as dumb as it may sound, for objects there's virtually no one who has explicit control over other objects' lifecycles. The result is - surprise - that the lifecycle state of objects may in certain situations be unclear.
an object reference is just a simple pointer. Basically, that's ok, but the problem is that there's sure a lot of code out there which treats object references as if they were a 32bit or 64bit integer number. So if e.g. Embarcadero wanted to change the implementation of an object reference (and make it not a simple pointer any more), they would break a lot of code.
But if Embarcadero wanted to eliminate dangling object pointers, they would have to redesign Delphi object references:
when an object is freed, all references to it must be freed, too. This is only possible by double-linking both, i.e. the object instance must carry a list with all of the references to it, that is, all memory addresses where such pointers are (on the lowest level). Upon destruction, that list is traversed, and all those pointers are set to nil
a somewhat more comfortable solution would be that the "one" holding such a reference can register a callback to be informed when a referenced object is destroyed. In code: when I have a reference FSomeObject: TSomeObject I would want to be able to write in e.g. SetSomeObject: FSomeObject.OnDestruction := Self.HandleDestructionOfSomeObject. But then FSomeObject can't be a pointer; instead, it would have to be at least an (advanced) record type
Of course I can implement all that by myself, but that is tedious, and isn't it something that should be addressed by the language itself? They also managed to implement for x in ...

Reading from a stack and memory allocation at compile time

Objects can be put on and removed only from the top of a stack. But what about reading and writing their values? Please correct me if I'm wrong, but I think a process must be able to read from any part of the stack, since if only reading from the top were possible it would have to remove (and store somewhere) the whole content of the stack above a variable it wants to examine. But in that case, how does the process know where exactly in the stack a particular variable is? I suspect it just holds a pointer to it, but where is that pointer stored?
Another thing - reading about stacks I often find phrases like "All memory allocated on the stack is known at compile time." Well, I probably misunderstand this, so please tell me where's the flaw in my logic:
Suppose a local variable is created when an if() statement is true, and isn't when it's false. Whether it's true will turn out at run time. So at compile time there's no way to know if it should be created, hence I wouldn't think memory for it is allocated at all, as it would be wasteful. Consequently, it isn't created/known at compile time.
At compile time, it's known how much space each type needs: An Integer, for instance, is 4 Bytes wide on 32 bit platforms, and a class with 2 Integers consumes 8 Bytes. Whether this space is allocated for a specific variable is not necessarily known (may depend on an if, as you stated).
When you invoke a method, its parameters and the return address are pushed onto the stack (in a typical stack-based calling convention). To get one parameter, nothing is popped: it is read in place at its position, which is computed from the base pointer and the sizes of the parameters stored before it.
So it is not entirely true for this stack that you can access the top element only. It is, however, for the Stack data structure.
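The base-pointer addressing described above can be mimicked with a toy stack in C. This is a simplified model under stated assumptions, not how any particular compiler or CPU lays out frames: ToyStack and its functions are invented names, the stack grows downward, and each "function" reserves room for all of its locals in one step on entry, which is what compilers typically do even for a variable declared inside an if branch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STACK_BYTES 256

typedef struct {
    unsigned char mem[STACK_BYTES];
    size_t sp;   /* index of the current top; the stack grows downward */
    size_t bp;   /* base of the current frame */
} ToyStack;

void toy_init(ToyStack *s) {
    s->sp = STACK_BYTES;
    s->bp = STACK_BYTES;
}

/* Enter a function: reserve frame_bytes for all of its locals at once.
   Returns the caller's base so toy_leave can restore it. */
size_t toy_enter(ToyStack *s, size_t frame_bytes) {
    size_t saved_bp = s->bp;
    assert(frame_bytes <= s->sp);
    s->bp = s->sp;
    s->sp -= frame_bytes;
    return saved_bp;
}

/* Leave a function: discard its frame and restore the caller's base. */
void toy_leave(ToyStack *s, size_t saved_bp) {
    s->sp = s->bp;
    s->bp = saved_bp;
}

/* A local lives at a fixed offset below the frame base, known in advance. */
void toy_store(ToyStack *s, size_t offset, int value) {
    memcpy(&s->mem[s->bp - offset], &value, sizeof value);
}

int toy_load(const ToyStack *s, size_t offset) {
    int value;
    memcpy(&value, &s->mem[s->bp - offset], sizeof value);
    return value;
}
```

Frames are only added and removed at the top, but any local in the current frame can be read or written directly through its offset, without popping anything above it.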
