As I understand it, when a managed language (like Haxe) can and wants to compile to a non-managed language (like C++), it includes some form of garbage collector in the runtime.
I was wondering: would it be possible to completely abstract away memory management in the intermediate representation / abstract syntax tree, so that a garbage collector would not be needed and the target language's default behavior (stack allocations live until the end of their scope, heap allocations live until explicitly freed) could be used?
Thank you!
If I understood you correctly, you're asking whether it's possible to take a garbage-collected language and compile it to an equivalent program in a non-garbage-collected language, without introducing memory errors or leaks, just by adding frees in the right places (i.e. no reference counting, no tracking of references at run time, and no garbage collection algorithm or anything else at run time that could be considered garbage collection).
No, that is not possible. To do something like this, you'd have to be able to statically answer the question "At what point in the program is a given object no longer referenced?", which is a non-trivial semantic property and therefore undecidable per Rice's theorem.
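To make that concrete, here is a tiny sketch (written in Pascal-style code rather than Haxe, since the point is language-independent; GCache and DoWork are made up for the example). Where the hypothetical compiler should insert the free depends on run-time data, so no single program point is correct:

    var
      GCache: TObject = nil;   // some longer-lived, unit-level reference

    procedure DoWork(KeepIt: Boolean);
    var
      Obj: TObject;
    begin
      Obj := TObject.Create;
      Writeln(Obj.ClassName);  // use the object briefly
      if KeepIt then
        GCache := Obj;         // the reference escapes: Obj must now stay alive
                               // until whenever GCache stops referring to it
      // Where should the compiler insert "Obj.Free"? Right here only if KeepIt
      // is False, and KeepIt (or the later fate of GCache) can depend on
      // run-time input, so the "last reference" point is not a static property
      // of the program text.
    end;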
You could define a sufficiently restricted subset of the language (something like "only one live variable may hold a strong reference to an object at a time, and everything else must use weak references"), but programming in that subset would be so different from programming in the original language¹ that there wouldn't be much point in doing that.
¹ And perhaps more importantly: it would be highly unlikely that existing code would conform to that subset. So if there's a compiler that can compile my code to efficient GC-free native code, but only if I completely re-write my code to fit an awkward subset of the language, why wouldn't I just re-write the project in Rust instead? Especially since interop with libraries that aren't written in the subset would probably be infeasible as well.
Retain counts are the way memory is managed in Objective-C. When you create an object, it has a retain count of 1. When you send an object a retain message, its retain count is incremented by 1. We know that ARC does this automatically, but how does it do it? What technique does it use?
And I still wonder: if memory management is done automatically, why do we sometimes get bad-access errors when allocating or accessing objects?
I have already gone through this link: https://developer.apple.com/library/ios/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/MemoryMgmt.html
I think ARC (done by the compiler at compile time, by inserting retain/release calls where "necessary") relies on the scope of a variable, the block of code where it is defined (i.e. initialized), and whether its value is stored in another variable whose scope is broader than the initial variable's.
That's why you have to declare the kind of variable access and storage more precisely: to inform the compiler of your intentions for a variable.
But I also think ARC can't see further than the current file. ARC is trickier with global variables and inter-file dependencies.
So Apple introduced a more complex variable-declaration grammar to replace a very simple (IMO) retain/release pattern, so that developers don't have to worry about memory management.
That makes the Apple ecosystem accessible to a lot more developers who are used to managed languages (like web developers), so they can develop for iOS.
I think it is a mistake to make developers believe they can develop efficiently without understanding a concept as fundamental to IT as memory management.
But more developers for iOS means more programs developed and a more stable ecosystem in terms of activity, so more revenue for Apple :-)
You would be better off reading the ARC docs: https://developer.apple.com/library/ios/releasenotes/ObjectiveC/RN-TransitioningToARC/Introduction/Introduction.html
ARC will manage the memory for you, but it cannot stop you from writing programming errors such as only holding weak references to an object.
I am working on an iPhone app which does some video processing. I had to include two classes that are not ARC compliant (dealloc, releasing stuff), so I manually went and made them ARC compliant.
Later on I discovered the compiler flag that can make any class non-ARC within an ARC project: -fno-objc-arc.
My question is: if I do flag those classes with that compiler flag, what are the repercussions? Is there a performance hit? Is it a good idea? My app targets iOS 5.0 and up. I couldn't find any resources that discuss the pros and cons of doing this.
You ask:
If I do flag those classes with the compiler flag, what are the repercussions of this?
The only thing I believe you need to worry about is to make sure that the non-ARC library follows the Cocoa naming conventions associated with memory management (e.g. only return objects with a +1 retainCount if the name begins with alloc, new, copy, or mutableCopy). Otherwise ARC won't be able to properly manage the resulting object. Most well-written classes conform to this pattern, so you should be perfectly OK using the -fno-objc-arc flag, but it depends entirely upon the class in question.
[Is there a] performance hit?
There are no practical performance issues.
[Is] it a good idea?
All things being equal, I generally like to convert the code to ARC. A couple of situations where I might refrain from converting:
It is a library for which there is active development, and if I create my own personal ARC fork, I'll lose out on the future revisions of the library.
The library is incredibly complex and/or has constructs that are not easily converted to ARC.
Bottom line, if I can convert to ARC, I will. Usually in this process, I'll do the necessary testing to make sure I'm comfortable with the library, that there are no leaks, etc., so it's a productive (if annoying) process to go through. We're all responsible for the code we include in our projects and I don't think one should ever integrate code without going through some due diligence that is a natural by-product of an ARC-conversion and testing process.
If I convert to ARC, I offer to contribute the conversion back to the original author (e.g. via a GitHub "pull request" or whatever mechanism the author is open to) so it can be integrated into the code base.
At first glance, there are no performance issues with using or not using ARC. ARC is basically normal reference counting; it's just that the release calls are inserted by the compiler instead of the programmer.
Regarding Delphi memory management, what are your design strategies?
What are the use cases where you prefer to create and release objects manually?
What are the use cases where interfaces, TInterfacedObject descendants, and their reference-counting mechanism are preferred?
Have you identified any traps or difficulties with reference-counted objects?
Thanks for sharing your experience here.
Whenever you share objects between threads it is better to use interfaces. A shared object doesn't necessarily have one identifiable owner, so letting the thread that gives up the last reference to the interface free the implementing object is a natural fit. See the OmniThreadLibrary for a good example of how to make use of interfaces both for design and for overcoming some of the complicated ownership issues in multi-threaded code.
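For illustration, here is a minimal console sketch of that idea (IWorkItem, TWorkItem, TWorker and the GUID are made up for the example): the worker thread and the main code each hold a counted reference, and whichever of them releases last frees the object.

    program SharedViaInterface;
    {$APPTYPE CONSOLE}

    uses
      Windows, Classes;

    type
      IWorkItem = interface
        ['{D3C8B6A0-52F1-4CDE-9F00-ABCDEF012345}']
        procedure Process;
      end;

      TWorkItem = class(TInterfacedObject, IWorkItem)
        procedure Process;
        destructor Destroy; override;
      end;

      TWorker = class(TThread)
      private
        FItem: IWorkItem;            // the thread's own counted reference
      protected
        procedure Execute; override;
      public
        constructor Create(const AItem: IWorkItem);
      end;

    procedure TWorkItem.Process;
    begin
      Sleep(100);                    // pretend to do some work
    end;

    destructor TWorkItem.Destroy;
    begin
      Writeln('Freed by whichever side released the last reference');
      inherited;
    end;

    constructor TWorker.Create(const AItem: IWorkItem);
    begin
      inherited Create(False);       // the thread only starts running after
      FreeOnTerminate := True;       // the constructor has finished
      FItem := AItem;                // reference count goes up here
    end;

    procedure TWorker.Execute;
    begin
      FItem.Process;
      FItem := nil;                  // the thread gives up its reference
    end;

    var
      Item: IWorkItem;
    begin
      Item := TWorkItem.Create;      // reference count = 1
      TWorker.Create(Item);          // reference count = 2
      Item := nil;                   // the main code no longer "owns" the object;
                                     // it dies when the last reference is released
      Sleep(500);                    // crude wait so the demo can finish
    end.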
You should always prefer interfaces unless it's not possible due to VCL restrictions. I suspect that, had interfaces been available in Delphi 1.0, the VCL would have turned out very differently.
One minor consideration is to watch out for reference cycles. If A holds an interface to B and B holds an interface to A, they will both live forever.
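A minimal sketch of such a cycle (TNode is a made-up class): once the two objects hold counted references to each other, releasing the outside references never brings either count back to zero.

    type
      TNode = class(TInterfacedObject)
      public
        Partner: IInterface;          // strong (counted) link to the other node
        destructor Destroy; override;
      end;

    destructor TNode.Destroy;
    begin
      Writeln('TNode destroyed');     // never printed once the cycle exists
      inherited;
    end;

    procedure LeakACycle;
    var
      A, B: TNode;
      RefA, RefB: IInterface;
    begin
      A := TNode.Create;
      B := TNode.Create;
      RefA := A;                      // count(A) = 1
      RefB := B;                      // count(B) = 1
      A.Partner := RefB;              // count(B) = 2
      B.Partner := RefA;              // count(A) = 2
    end;                              // RefA and RefB are released here: both
                                      // counts drop to 1, never to 0, so both
                                      // objects (and anything they own) leak

The usual fix is to make one direction of the link uncounted: a plain object reference on one side, or, in newer Delphi versions, an interface field marked with the [Weak] attribute.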
Recently, we received a bug report from one of our users: something on the screen was displayed incorrectly in our software. Somehow, we could not reproduce this in our development environment (Delphi 2007).
After some further study, it appears that this bug only manifests itself when "Code optimization" is turned on.
Are there any people here with experience in hunting down such a Heisenbug? Any specific constructs or coding bugs that commonly cause such an issue in Delphi software? Any places you would start looking?
I'll also just start debugging the whole thing in the usual way, but any tips specific to Optimization-related bugs (*) would be more than welcome!
(*) Note: I don't mean to say that the bug is caused by the optimizer; I think it's much more likely some wonky construct in the code is somehow pushed "over the edge" by the optimizer.
Update
It seems the bug boils down to a record being fully initialized with zeros when code optimization is off, and the same record containing some random data when it is on. In this case, the random data seems to cause an enum type to contain invalid data (to my great surprise!).
Solution
The solution turned out to involve an uninitialized local record variable somewhere deep in the code. Apparently, without optimization the record happened to be reset (zeroed memory?), and with optimization turned on, it was filled with the usual garbage. Thanks to you all for your contributions; I learned a lot along the way!
Typically bugs of this form are caused by invalid memory access (reading uninitialised data, reading off the end of a buffer...) or thread race conditions.
The former will be affected by optimisations causing data layout to be rearranged in memory, and/or possibly by debug code that initialises newly allocated memory to some value; causing the incorrect code to "accidentally work".
The latter will be affected due to timings changing between optimisation levels. The former is generally much more likely.
If you have some automated way of making freshly allocated memory be filled with some constant value before it is passed to the program, and this makes the crash go away or become reproducible in the debug build, that'll provide a good point to start chasing things.
It could very well be a memory vs. register issue: your program running fine only because it relies on memory persisting after a free.
I would recommend running your application with FastMM4 in full debug mode to be sure of your memory management.
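For reference, wiring FastMM4 in is just a matter of making it the very first unit of the project. A sketch (project and form names are made up; it assumes the FastMM4 sources are on the search path and FullDebugMode is enabled in FastMM4Options.inc or via a project define):

    program MyApp;

    uses
      FastMM4,   // must be the very first unit so it installs itself as the
                 // memory manager before anything else allocates
      Forms,
      MainForm in 'MainForm.pas' {FormMain};

    {$R *.res}

    begin
      Application.Initialize;
      Application.CreateForm(TFormMain, FormMain);
      Application.Run;
    end.

In FullDebugMode, freshly allocated and freed blocks are filled with a recognisable debug pattern, which tends to turn "accidentally works" uninitialised-memory bugs into reproducible failures and use-after-free into immediate errors.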
Another (not free) tool which can be very useful in a case like this is Eurekalog.
Another thing that I've seen: a crash with the FPU registers being botched when calling some outside code (DLL, COM...) while with the debugger everything was OK.
A record that contains different data according to different compiler settings tells me one thing: that the record is not being explicitly initialised.
You may find that the setting of the compiler optimization flag is only one factor that might affect the content of that record - with any uninitialised data structures the one thing that you can rely on is that you can't rely on the initial content of the structure.
In simple terms:
class member data is initialised (to zeros) for new instances of the class
local variables (in functions and procedures) are NOT initialised, except for a few specific types: interface references, dynamic arrays and strings are initialised, and in records only the fields of those types are initialised. (Unit-level variables, unlike locals, are zero-initialised.)
The question as stated is now a little misleading, because it seems you found your Heisenbug easily enough. Now the issue is how to deal with it, and the answer is simply to explicitly initialise your record, so that you aren't reliant on whatever compiler behaviour or side-effect sometimes takes care of that for you and sometimes doesn't (see the sketch below).
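For illustration, this is all that "explicitly initialise" needs to mean for a plain local record (TDisplayMode and TSettings are made-up stand-ins for the real types):

    type
      TDisplayMode = (dmNormal, dmInverted);

      TSettings = record
        Mode: TDisplayMode;
        Scale: Double;
      end;

    procedure Render;
    var
      S: TSettings;                 // local record: lives on the stack, content undefined
    begin
      FillChar(S, SizeOf(S), 0);    // content is now deterministic regardless of
                                    // compiler settings; newer Delphi versions can
                                    // write S := Default(TSettings) instead, which
                                    // is also the safer choice for records holding
                                    // strings or interfaces
      S.Mode := dmNormal;
      S.Scale := 1.0;
      // ... use S ...
    end;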
Especially in purely native languages like Delphi, you should be more than careful not to abuse the freedom of being able to cast anything to anything.
IOW: one thing I have seen is someone copying the definition of a class (e.g. from the implementation section of the RTL or VCL) into his own code and then casting instances of the original class to his copy.
Now, after upgrading the library the original class came from, you might experience all kinds of weird stuff, like jumping into the wrong methods or buffer overflows.
There's also the habit of using signed integers as pointers and vice versa (instead of Cardinal).
This works perfectly fine as long as your process has only 2 GB of address space. But boot with the /3GB switch and you will see a lot of apps start acting crazy: those made the assumption "pointer = signed integer" at least somewhere.
Your customer uses 64-bit Windows? Chances are he has a larger address space for 32-bit apps. Pretty tough to debug without having such a test system available.
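A tiny 32-bit console sketch of that trap (not taken from any real project):

    program PointerVsInteger;
    {$APPTYPE CONSOLE}

    var
      Lo, Hi: Pointer;
    begin
      // Addresses above $7FFFFFFF become reachable for a 32-bit process under
      // /3GB, or on 64-bit Windows when the EXE is marked LARGEADDRESSAWARE.
      Lo := Pointer($7FFF0000);     // just below the 2 GB line
      Hi := Pointer($80010000);     // just above it
      // Wrong: the signed cast turns Hi into a negative number.
      Writeln('signed:   Hi > Lo = ', Integer(Hi) > Integer(Lo));    // FALSE
      // Right: Cardinal (or NativeUInt) preserves the address ordering.
      Writeln('unsigned: Hi > Lo = ', Cardinal(Hi) > Cardinal(Lo));  // TRUE
    end.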
Then there are race conditions.
Like having two threads, where one is very, very slow, so that you instinctively assume it will always finish last, and there's no code that handles the scenario where "Captain Slow" finishes first.
Changes in the underlying technologies can make these assumptions very wrong, very fast indeed.
Take a look at the upcoming breed of Flash-based super-mega-fast server storage.
Systems that can read and write Gigabytes per second. Applications that assume the IO stuff to be significantly slower than some calculations on in-memory values will easily fail on this kind of fast storage.
I could go on and on, but I gotta run right now...
Cheers
Code optimization does not necessarily mean that debug symbols have to be left out. Do a debug build with code optimization turned on; then you can still debug the program, and maybe the error will occur now.
One easy thing to do is turn on compiler warnings and hints, rebuild the project, and then fix all warnings and hints.
Cheers
If it is Delphi business code, with data-aware components etc., the following might not apply.
I, however, write machine vision code, which is fairly computational. Most of the unit tests are console based. I'm also involved with FPC, and over the years I have tested a lot with FPC, partially as a hobby, partially in desperate situations where I wanted any lead at all.
Some standard tricks that I have tried (in decreasing order of usefulness):
use -gv and run the code under Valgrind (in practice this means the application must run on Linux/FreeBSD, but for computational code and unit tests that can be doable)
compile using the FPC parameter -gt (= trash local variables, i.e. randomize local variables on procedure entry)
modify the heap manager to randomize the data of the blocks it hands out (also applicable to Delphi code; a rough sketch follows at the end of this answer)
try FPC's range/overflow checking and compiler hints
run on a Mac Mini (PowerPC) or on Win64; due to totally different rules and memory layouts this can catch pretty funky things
Tricks 2 and 3 together allow you to find most, if not all, initialization problems.
Try to find any clues, then go back to Delphi and search in a more focused way, debug, etc.
I do realize this is not easy. I have a lot of FPC experience and didn't have to figure everything out from scratch for these cases. Still, it might be worth a try, and it might be a motivation to start setting up your non-visual systems and unit tests to be FPC compatible and platform independent. Most of this work will be needed anyway, given the Delphi roadmap.
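A rough sketch of what such a heap-manager tweak can look like on the Delphi side (unit and routine names are made up; the exact field types of TMemoryManagerEx differ slightly between compiler versions, this matches recent ones where sizes are NativeInt):

    unit RandomFillMM;   // add as the first unit of the project, like FastMM4

    interface

    implementation

    var
      OldMM: TMemoryManagerEx;

    function RandomGetMem(Size: NativeInt): Pointer;
    begin
      Result := OldMM.GetMem(Size);
      if Result <> nil then
        FillChar(Result^, Size, Byte(Random(256)));  // trash every new block so
                                                     // nothing can rely on the
                                                     // heap being zeroed
    end;

    procedure InstallRandomFill;
    var
      NewMM: TMemoryManagerEx;
    begin
      GetMemoryManager(OldMM);
      NewMM := OldMM;                 // keep FreeMem/ReallocMem/AllocMem as-is
      NewMM.GetMem := RandomGetMem;   // (note: blocks grown by ReallocMem are
                                      // not randomized by this simple sketch)
      SetMemoryManager(NewMM);
    end;

    initialization
      Randomize;
      InstallRandomFill;

    end.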
For problems like this I always advise using log files.
Question: can you somehow detect the incorrect display from within the source code?
If not, my answer won't help you.
If yes, check for the incorrectness, and as soon as you find it, dump the stack to a log file (see post-mortem debugging for details about dumping and re-symbolizing the stack).
If you see that some data has been corrupted but you don't know how and when that happened, extract a function that performs such a validity test (logging when it fails), and call this function from more and more places throughout program execution (e.g. after each menu call). If you iterate this approach a few times, you have a good chance of finding the problem.
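A rough sketch of that idea (IsDisplayStateValid is a made-up placeholder for whatever check detects the incorrect display):

    // requires SysUtils in the uses clause (FileExists, FormatDateTime, ChangeFileExt)

    function IsDisplayStateValid: Boolean;
    begin
      Result := True;   // placeholder: put the real consistency check here
    end;

    procedure LogLine(const AMsg: string);
    var
      F: TextFile;
      FileName: string;
    begin
      FileName := ChangeFileExt(ParamStr(0), '.log');
      AssignFile(F, FileName);
      if FileExists(FileName) then Append(F) else Rewrite(F);
      try
        Writeln(F, FormatDateTime('yyyy-mm-dd hh:nn:ss.zzz', Now), ' ', AMsg);
      finally
        CloseFile(F);
      end;
    end;

    procedure CheckDisplayState(const AWhere: string);
    begin
      if not IsDisplayStateValid then
        LogLine('Display state invalid after: ' + AWhere);
        // a stack dump (JclDebug, madExcept, EurekaLog, ...) would go here too
    end;

    // Then sprinkle calls at ever finer-grained points until the first failing
    // call pinpoints where the corruption happens:
    //   CheckDisplayState('File/Open');
    //   CheckDisplayState('RecalculateLayout');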
Is this a local variable inside a procedure or function?
If so, then it lives on the stack, and will contain garbage. Depending on the execution path and compiler settings the garbage will change, potentially pushing your logic 'over the edge'.
--jeroen
Given your description of the problem, I think you had uninitialized data that you got away with while the optimizer was off, but which blew up with optimization on.