Protobuf map keeps allocating new elements to the arena, causing memory growth

I have code that updates the values of a protobuf map periodically. This code is simplified for clarity.
void my_periodically_called_function() {
    // Drop all existing entries, then repopulate the map.
    my_protobuf_map->clear();
    MyObject obj;
    obj.set_value(data);
    (*my_protobuf_map)["my_key"] = obj;
}
What happens is that the program's memory keeps growing with every iteration. After digging through protobuf's map.h, it seems that after the map is cleared and elements are re-added, operator[] simply allocates new entries on the arena (without freeing the older ones), which is obviously undesirable.
What is the most protobuf-friendly way to resolve this? I'd like a good way to delete specific memory from the arena.
An easy way to fix the problem would be to remove the clear(), but I'd like to keep it to avoid weird bugs with old state persisting.
Thanks in advance.

The way the protobuf C++ library implements arena allocation, there is no way to free an individual piece of memory; instead, all of it is freed at once by freeing the whole arena.
The main point of an arena allocator is to improve speed by making allocation a constant-time operation (it just increments a pointer).
In your case, it sounds like you'll either want to periodically free the arena and reconstruct the message, or otherwise use the heap allocator, which handles freeing memory per object.
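For illustration, here is a minimal sketch of the first option, assuming a hypothetical MyMessage type whose map<string, MyObject> field is called objects (the names and the reset interval are placeholders, not from the original post): the arena is reset every N iterations, which releases all of the old map entries at once, and the message is then rebuilt on the fresh arena.

#include <google/protobuf/arena.h>
#include "my_message.pb.h"  // hypothetical generated header

google::protobuf::Arena arena;
MyMessage* msg = google::protobuf::Arena::CreateMessage<MyMessage>(&arena);

void my_periodically_called_function() {
    static int iteration = 0;
    if (++iteration % 1000 == 0) {
        // Reset() releases every allocation the arena owns, including msg
        // and all map entries, so the message must be recreated afterwards.
        arena.Reset();
        msg = google::protobuf::Arena::CreateMessage<MyMessage>(&arena);
    }
    msg->clear_objects();
    MyObject obj;
    obj.set_value(42);  // placeholder for the real data
    (*msg->mutable_objects())["my_key"] = obj;
}

The other option is simply not to put this message on an arena at all; with heap allocation, clearing the map actually returns each entry's memory to the allocator.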

Related

Does Dart reuse memory for previously used instances?

It is hard to find a good heading for this, but I think my problem becomes clear if I post a small code snippet:
SomeObject instance = SomeObject(importantParameter);
// -> "instance" is now a reference to the instance somewhere in RAM
instance = SomeObject(anotherImportantParameter);
// -> "instance" is now a reference to a different instance somewhere in RAM
My question is now: is the RAM that was allocated at the first construction reused for the second construction? Or is the RAM of the first instance marked as unused for the garbage collector, and the second construction done with a completely new instance in a different portion of RAM?
If the first is true, what about this:
while (true) {
    final SomeObject instance = SomeObject(importantParameter);
}
Will the RAM then be reused each time the while loop repeats?
It's unspecified. The answer is a resounding "maybe".
The language specification never says what happens to unreachable objects, since it's unobservable to the program. (That's what being unreachable means).
In practice, the native Dart implementation uses a generational garbage collector.
The default behavior would be to allocate a new object in "new-space" and overwrite the reference to the previous object. That makes the previous object unreachable (as long as you haven't stored other references to it), and it can therefore be garbage collected. If you really go through objects quickly, that will be cheap, since the unreachable object is completely ignored on the next new-space garbage collection.
Allocating a lot of short-lived objects still has an overhead since it causes new-space GC to happen more often, even if the individual objects don't themselves cost anything.
There are also a number of optimizations that may change this behavior.
If your object is sufficiently simple and the compiler can see that no reference to it ever escapes, that it is never used in an identical() check, or ... any number of other relevant restrictions, then it might "allocation sink" the object. That means it never actually allocates the object; it just stores the contents somewhere, perhaps even on the stack, and it also inlines the methods so they refer to the data directly instead of going through a this pointer.
In that case, your code may actually reuse the memory of the previous object, because the compiler recognizes that it can.
Do not try to predict whether an optimization like this happens. The requirements can change at any time. Just write code that is correct and not unnecessarily complex, then the compiler will do its best to optimize in all the ways that it can.

How can we decide whether we should use autoreleasepool?

Since Apple's API is not open source, and this is not mentioned in the documentation, when writing in Swift we have no way to know whether a returned object is an autoreleased Objective-C object.
Hence, it becomes unclear when we should use autoreleasepool.
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/mmAutoreleasePools.html#//apple_ref/doc/uid/20000047-1041876
If you write a loop that creates many temporary objects, you may use an autorelease pool block inside the loop to dispose of those objects before the next iteration. Using an autorelease pool block in the loop helps to reduce the maximum memory footprint of the application.
Without autoreleasepool
for ... {
    FileManager.default.copyItem
    CGImageSourceCreateWithURL
    CGImageSourceCopyPropertiesAtIndex
    CGImageSourceCreateThumbnailAtIndex
    CGImageDestinationCreateWithURL
    CGImageDestinationFinalize
}
With autoreleasepool
for ... {
    autoreleasepool {
        FileManager.default.copyItem
        CGImageSourceCreateWithURL
        CGImageSourceCopyPropertiesAtIndex
        CGImageSourceCreateThumbnailAtIndex
        CGImageDestinationCreateWithURL
        CGImageDestinationFinalize
    }
}
I tried running an intensive loop over the above 2 pieces of code for comparison purposes.
I found no significant difference in their memory usage patterns, based on Xcode's memory report.
I was wondering, what are some good guidelines / thought processes for deciding whether we should apply autoreleasepool throughout our code?
I have this concern because I recently saw that autoreleasepool is required in code which involves FileHandle.read - https://stackoverflow.com/a/42935601/72437
Using FileManager to copy an item doesn't create a huge memory payload, and the Image I/O calls you're using save a lot of memory during the I/O process. In addition, Apple's image APIs cache results for the same file.
That's why your two versions don't show a significant difference: you didn't create any real memory payload.
You could try another way to validate the usage of autoreleasepool, and I can assure you it will show a tremendous difference.
Use a for-loop (10,000 times) to generate random strings (longer is better), and convert each string to UTF-8 data in each iteration. Then compare the memory growth with and without autoreleasepool.
Try it out.
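A minimal sketch of that test, using hypothetical helper names that are not from the original answer: bridging each string through NSString produces an autoreleased NSData, so the version wrapped in autoreleasepool should show a visibly flatter memory curve in Xcode's report.

import Foundation

// Hypothetical helper: builds a long random string for each iteration.
func makeRandomString(length: Int) -> String {
    let letters = "abcdefghijklmnopqrstuvwxyz0123456789"
    return String((0..<length).map { _ in letters.randomElement()! })
}

for _ in 0..<10_000 {
    autoreleasepool {  // remove this wrapper to compare memory growth
        let s = makeRandomString(length: 10_000)
        // The NSString bridge returns an autoreleased NSData; without the pool
        // those objects accumulate until an enclosing pool drains.
        _ = (s as NSString).data(using: String.Encoding.utf8.rawValue)
    }
}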

How does iOS handle memory corruption from low-level C / Objective-C code?

I have an iOS app that uses some legacy low-level memory manipulation code using pointers. I'm debugging an issue where multiple threads cause multiple copies of this code to be executed on global variables simultaneously, causing memory corruption by writing an invalid length or overwriting data.
The effect is that the length of the buffer below may change. I've seen iOS throw EXC_BAD_ACCESS or EXC_BREAKPOINT as a result of these calls.
My question is - would iOS always throw exceptions when I use memcpy incorrectly, or will it complain only when I write outside my allowed memory?
In other words, is my code free to corrupt my memory and create invisible issues, without causing exceptions, as long as it does not step outside allocated memory or access deallocated memory?
NSData* buffer = ...
Byte *array = (Byte*)malloc(buffer.length);
memcpy(array, buffer.bytes, buffer.length);
The last time I checked, or had such an issue, it would only complain when writing outside your own allowed memory. In my situation I was writing over whatever objects followed the address I was trying to write to. The result was what seemed like random crashes on objects as unrelated as NSString.
My mistake was something like the following:
MyStructure *myStructure = malloc(sizeof(myStructure)); // Incorrect
MyStructure *myStructure = malloc(sizeof(MyStructure)); // Fixed
A simple autocomplete error that led to days of hunting this bug. MyStructure was a fairly big one, so accessing some property (both reads and writes) would in this case simply overflow and read/write through whatever was after it. It eventually crashed at random; sometimes a bad access, other times just some random exception on a random object.
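To make that concrete, here is a small self-contained C sketch (the struct and values are hypothetical): the memcpy overruns one field but stays inside the same allocation, so nothing traps and the neighbouring field is silently corrupted. Strictly speaking the overrun is undefined behaviour, which is exactly why it can fail invisibly rather than raising EXC_BAD_ACCESS.

#include <stdio.h>
#include <string.h>

struct Record {
    char name[8];
    int  id;        // sits directly after 'name' in the same allocation
};

int main(void) {
    struct Record r = { "ok", 42 };
    // Copies 12 bytes into an 8-byte field: still inside 'r', so no trap,
    // but 'id' is quietly overwritten.
    memcpy(r.name, "AAAAAAAAAAAA", 12);
    printf("id is now %d\n", r.id);   // typically no longer 42
    return 0;
}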

If using an autoreleasepool to control references to a realm, is it good practice to call realm.invalidate before leaving the pool

I have an application that does a lot of background reading of a realm, during which time another background thread (i.e. not the main thread) may be writing to the same realm, so I am using an autoreleasepool on the background threads to ensure the thread's reference to the realm is reclaimed quickly. See the excerpt below:
autoreleasepool {
    do {
        let backgroundRealm = try Realm(configuration: self.configuration)
        // .... Do lots of reading
        backgroundRealm.beginWrite()
        // .... Do lots of writing here
        try backgroundRealm.commitWrite()
        // Is this good practice or not?
        backgroundRealm.invalidate()
    }
    catch {
        // ....
    }
}
From reading the documentation Using a realm across threads and inWriteTransaction, it is not clear whether a call to backgroundRealm.invalidate() after the commitWrite() and/or before leaving the autoreleasepool would help keep file sizes down and improve performance. Does this implicitly happen when the realm is reclaimed behind the scenes? Would the call to invalidate() only waste CPU cycles and provide no additional benefits?
Would a call to backgroundRealm.invalidate() help keep file sizes down and improve performance?
No. invalidate() has no impact on the file size. If you want to keep the file size down, you would need to use writeCopyToURL(_:encryptionKey:error:) to write a compacted copy. But there is no convenience method for an in-place compaction, which would require invalidating all accessors across threads.
Does this implicitly happen when the realm is reclaimed behind the scenes?
It wouldn't be necessary. A Realm is deallocated when there isn't any accessor left keeping a hold of it anymore, so there is nothing left to be invalidated.
Would the call to invalidate() only waste CPU cycles and provide no additional benefits?
As long as you don't leak accessors from your autoreleasepool, you should be fine. Calling invalidate() might help locate leaked objects later at runtime if you do leak them. But take care: when you access an invalidated object, it will fail.

Using C++ objects in Objective-C blocks

I have a C++ object obj which I want to access within a block:
MyCppClass obj;
void (^myBlock)() = ^{
    obj.test();
};
The problem here is that obj gets copied into the block, but I want to use a reference to the original object. I could use a pointer instead:
MyCppClass obj;
MyCppClass *objP = &obj;
void (^myBlock)() = ^{
    objP->test();
};
But I think dereferencing pointers takes more time than using references. (Performance is critical, because in my project such a block is called for each pixel of a large image.)
How can I access the object within the block?
Dereferencing a pointer is faster than a function call or a C++ method call, which involves several pointer dereferences to look up the method in the class, and then an actual function call, which allocates memory on the stack, saves away some registers to RAM, and then does the reverse when it returns. So don't worry about performance here. If the profiler shows you that a single pointer dereference causes performance issues, you have bigger problems, and you should probably drop down to hand-crafted assembler and forget about using C++ classes or blocks. Also, that's a very rare thing.
Use a pointer, or a C++ reference (the latter is the same under the hood, but it's harder to shoot yourself in the foot with it, and code stays more readable). Just keep in mind that this is only a solution if you are going to use the block in the same function as (or a function called by) the current function.
Because the C++ object is on the stack, it will go away, so if whoever you hand the block to copies it and calls it later, the pointer will be invalid. In the best case you will crash. In the worst case, the spot on the RAM chip that was used for that object is already in use for something else again, and you'll overwrite some other bit of memory, and cause a seemingly unrelated crash.
So, if you want to keep the block around, you'll have to do 'objP = new MyCppClass' to create the object, and later do 'delete objP' (maybe in the last call to the block, if you can detect that and don't need the object after that) to get rid of it.
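A minimal Objective-C++ sketch of that approach, keeping the names from the question; the wrapping function and the dispatch_async hand-off are just illustrative, not from the original answer:

#import <Foundation/Foundation.h>

class MyCppClass {
public:
    void test() { NSLog(@"test called"); }
};

void makeAndUseBlock(void) {
    MyCppClass *objP = new MyCppClass();   // heap allocation outlives this frame
    void (^myBlock)(void) = ^{
        objP->test();
    };
    // The block can be copied and run later, e.g. on another queue; delete the
    // object only after its last use.
    dispatch_async(dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0), ^{
        myBlock();
        delete objP;
    });
}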
