I'm sending a large file to a server, separated into chunks. I see high memory consumption when I call FileHandle.readData(ofLength:). The memory for each chunk is never deallocated, and after some time I get an out-of-memory exception and a crash.
The profiler shows the problem is in FileHandle.readData(ofLength:) (see screenshots):
func nextChunk(then: @escaping (Data?) -> Void) {
    self.previousOffset = self.fileHandle.offsetInFile
    autoreleasepool {
        let data = self.fileHandle.readData(ofLength: Constants.chunkLength)
        if data == Constants.endOfFile {
            then(nil)
        } else {
            then(data)
            self.currentChunk += 1
        }
    }
}
The allocations tool is simply showing you where the unreleased memory was initially allocated. It is up to you to figure out what you subsequently did with that object and why it was not released in a timely manner. None of the profiling tools can help you with that. They can only point to where the object was originally allocated, which is only the starting point for your research.
One possible problem might be if you are creating Data-based URLRequest objects. That means that while the associated URLSessionTask requests are in progress, the Data is held in memory. If so, you might consider using a file-based uploadTask instead. That avoids holding the Data associated with the body of the request in memory.
Once you start using a file-based uploadTask, that raises the question of whether you need or want to break the file up into chunks at all. A file-based uploadTask, even when sending very large assets, requires very little RAM at runtime. And, at some future point in time, you may even consider using a background session, so the uploads will continue even if the user leaves the app. The combination of these features may obviate the chunking altogether.
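A file-based upload might be sketched like this (the endpoint URL, file path, and header below are placeholders, not from the original question):

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// Hypothetical endpoint and file path; substitute your own.
let uploadURL = URL(string: "https://example.com/upload")!
let fileURL = URL(fileURLWithPath: "/path/to/large/asset.mov")

var request = URLRequest(url: uploadURL)
request.httpMethod = "POST"
request.setValue("application/octet-stream", forHTTPHeaderField: "Content-Type")

// The body is streamed from disk, so the full file is never held in RAM.
let task = URLSession.shared.uploadTask(with: request, fromFile: fileURL) { data, response, error in
    if let error = error {
        print("Upload failed:", error)
    } else {
        print("Upload finished")
    }
}
// task.resume() would start the upload.
```

The same `fromFile:` variant is what a background `URLSessionConfiguration` would require anyway, so structuring the code this way keeps that option open.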
As you may have surmised, the autoreleasepool may be unnecessary. It is intended to solve a very specific problem (where one creates and releases autoreleased objects in a tight loop). I suspect your problem rests elsewhere.
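For reference, the tight-loop scenario that autoreleasepool is designed for looks something like this (a sketch; processChunks and its parameters are hypothetical names, not from the question):

```swift
import Foundation

// Read a file chunk by chunk, draining autoreleased Data objects after every
// iteration so peak memory stays at roughly one chunk. Returns the chunk count.
func processChunks(at path: String, chunkLength: Int, handler: (Data) -> Void) -> Int {
    guard let handle = FileHandle(forReadingAtPath: path) else { return 0 }
    defer { handle.closeFile() }
    var chunks = 0
    while true {
        let done: Bool = autoreleasepool {
            let chunk = handle.readData(ofLength: chunkLength)
            guard !chunk.isEmpty else { return true }   // end of file
            handler(chunk)
            chunks += 1
            return false
        }
        if done { break }
    }
    return chunks
}
```

Each iteration's pool drains before the next read begins, which is exactly the situation where an explicit pool pays off.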
Related
Say I have an app and I notice it has high memory usage. How do I determine WHAT is taking up all the memory in terms of specific object(s). Can I do this through the Xcode Memory Debugger somehow? Instruments?
Take this code example:
class RootViewController: UIViewController {
    var image: UIImage?

    override func viewDidLoad() {
        super.viewDidLoad()

        let data = try! Data(contentsOf: URL(string: "https://effigis.com/wp-content/uploads/2015/02/Airbus_Pleiades_50cm_8bit_RGB_Yogyakarta.jpg")!)
        self.image = UIImage(data: data)
    }
}
The image at that URL is about 40 MB, and in this example contributes significantly to my app's large memory footprint.
How do I determine "Oh yeah, it's this UIImage right here taking up 40 MB of memory by itself!"
Short answer:
Unfortunately, there’s no simple “for this given large memory allocation, it is associated with this particular UIImage” mapping. You can use stack traces, either in Instruments’ “Allocations” tool or the Xcode “Debug Memory Graph” (with the “malloc stack” feature), to identify what was allocated where, but it’s exceedingly difficult to use this to trace from some large malloc for the image data back to the original UIImage object. For simple objects it works fine, but it’s a little more convoluted for images.
Long answer:
The challenge with images is that often the memory allocated for the image data is somewhat decoupled from the UIImage object itself. The allocation of the UIImage object is easily tracked back to where you instantiated it, but not the buffer for the data backing the image. Worse, when we supply this image to some image view, the stack trace for that image buffer will drop you into the rendering engine’s call tree, not your code, making it even harder.
That having been said, using Instruments, you can often get clues about what’s going on. For example, using the “Allocations” tool, go to the list of allocations and see what was allocated where. If you sort that list by size, you can see a stack trace, on the right, of where each allocation was made:
Now in this case, I used the image in a UIImageView, and therefore the resulting allocation is buried inside the iOS frameworks, not attributed directly to our code. But one can infer from the stack trace that this was the result of rendering this JPG in the UI.
So, while you can’t easily conclude “oh, that’s the specific Airbus Pleiades image,” you can at least conclude that the particular allocation was associated with some JPG.
A few unrelated observations:
I suspect you were just keeping your example simple, but obviously you would never use Data(contentsOf:) from the main thread like that. Your UI will be blocked and you risk having your app killed by the watchdog process.
You'd generally initiate the network request asynchronously:
let url = URL(string: "https://effigis.com/wp-content/uploads/2015/02/Airbus_Pleiades_50cm_8bit_RGB_Yogyakarta.jpg")!

URLSession.shared.dataTask(with: url) { data, _, _ in
    guard
        let data = data,
        let image = UIImage(data: data)
    else {
        return
    }

    DispatchQueue.main.async {
        self.image = image
    }
}.resume()
This not only avoids blocking the main thread, but also lets you use the URLResponse and Error parameters if you want any special handling for given errors (e.g. customized error messages in the UI or whatever).
When downloading large assets like this, if you don’t need to show the image in the UI immediately, you might use a download task instead, which has a much lower peak memory usage than Data(contentsOf:) or a dataTask:
let url = URL(string: "https://effigis.com/wp-content/uploads/2015/02/Airbus_Pleiades_50cm_8bit_RGB_Yogyakarta.jpg")!
let filename = url.lastPathComponent

URLSession.shared.downloadTask(with: url) { location, _, _ in
    guard let location = location else { return }

    do {
        let folder = try FileManager.default
            .url(for: .cachesDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
            .appendingPathComponent("images")
        try FileManager.default.createDirectory(at: folder, withIntermediateDirectories: true)
        let fileURL = folder.appendingPathComponent(filename)
        try FileManager.default.moveItem(at: location, to: fileURL)
    } catch {
        print(error)
    }
}.resume()
If you do this, you won’t require anything close to the 40 MB during the download process. That might be critical if downloading lots of assets or if you’re not immediately showing the image in the UI. Also, if you later choose to use a background URLSession, you can do this with download tasks, but not data tasks.
It’s worth noting that JPG images (and, to a lesser degree, PNG images) are generally compressed. Thus, you can easily find yourself downloading an asset whose file size is measured in kilobytes, but that requires megabytes when you go to use it. The general rule of thumb is that, regardless of the size of the file or the size of the control in which you’re using it, the memory required when you use the image is roughly 4 × width × height (measured in pixels).
For example, a 5,494 × 5,839 px image may take up 122 MB (!) when you go to use it. The particulars may vary, but 4 × width × height is a good assumption. When considering memory consumption, the size of the file is a misleading indication of the amount of memory that will be used; always consider the actual image dimensions, because the image is going to be uncompressed when you use it.
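That rule of thumb is straightforward to compute (a sketch; estimatedImageMemory is a hypothetical helper, using the example dimensions above):

```swift
import Foundation

// Estimated decompressed size of an image: 4 bytes (RGBA) per pixel,
// regardless of how small the compressed JPG or PNG file is.
func estimatedImageMemory(width: Int, height: Int) -> Int {
    return 4 * width * height
}

let bytes = estimatedImageMemory(width: 5_494, height: 5_839)
let megabytes = Double(bytes) / 1_048_576   // ≈ 122 MB for the example image
```

A few-hundred-kilobyte JPG and a hundred-plus-megabyte decoded bitmap can therefore be the same asset.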
In my answer above, I focused on Instruments’ Allocations tool. But it's worth noting that when diagnosing memory usage, the “Debug Memory Graph” feature is great when you’re trying to diagnose where the strong references are (great for identifying strong reference cycles). It’s not really relevant to this particular discussion, but can be useful if you’re tracking down where you used an image.
For example, here, I’ve downloaded your image (using URLSession) and not only set the image property of my view controller, but also used it in a UIImageView. This “Debug Memory Graph” tool is great for visualizing what is used where (but admittedly, not for correlating specific memory allocations to code):
I also edited my scheme’s diagnostic options to include the “malloc stack” feature, giving me the stack trace, on the right, like you see in the Allocations tool, above.
The Allocations instrument in Instruments can do this. Choosing Allocations List from the jump bar will show every memory allocation your app makes. Sort the table by allocation size to see the largest memory allocations.
What most developers are interested in is finding the code that allocates large amounts of memory. I answered that question at the following link:
Using instruments tool to locate leaks
I know the title of the question is about leaks, but the technique works the same for memory allocations.
It is said in Realm's doc:
You may also see this problem when accessing Realm using Grand Central Dispatch. This can happen when a Realm ends up in a dispatch queue’s autorelease pool as those pools may not be drained for some time after executing your code. The intermediate versions of data in the Realm file cannot be reused until the RLMRealm object is deallocated. To avoid this issue, you should use an explicit autorelease pool when accessing a Realm from a dispatch queue.
Does this mean that we must use an explicit autorelease pool every time in GCD, even under ARC? Can someone post a code sample? This is kind of important, but the official documentation does not emphasize it much.
You don't really have to use an explicit autorelease pool every time. It is more relevant for scenarios where you do a lot of concurrent transactions and could easily run the risk of tracking too many intermediate versions. Or when you want to make sure that you close the Realm file during the lifetime of your app by releasing all open accessors to it.
At this point, the docs are meant more as a heads-up about a technical limitation, and a hint for how to resolve it once you hit an issue like that, than as a general best practice. Sure, you could always do it and it wouldn't necessarily hurt you (à la "If you have a hammer, everything looks like a nail."), but you wouldn't necessarily have to.
The extra complication of what this exactly means is not required for everyone: using explicit autorelease pools requires a deeper understanding of ARC that is not a general requirement. If you have ideas for how this could be resolved in a better way, your feedback is more than welcome.
The section Using a Realm Across Threads gives an example for that, inserting a million objects in a background queue:
dispatch_async(queue) {
    autoreleasepool {
        // Get realm and table instances for this thread
        let realm = try! Realm()

        // Break up the writing blocks into smaller portions
        // by starting a new transaction
        for idx1 in 0..<1000 {
            realm.beginWrite()

            // Add row via dictionary. Property order is ignored.
            for idx2 in 0..<1000 {
                realm.create(Person.self, value: [
                    "name": "\(idx1)",
                    "birthdate": NSDate(timeIntervalSince1970: NSTimeInterval(idx2))
                ])
            }

            // Commit the write transaction
            // to make this data available to other threads
            try! realm.commitWrite()
        }
    }
}
It generally makes sense to wrap object creation at that scale in a separate autoreleasepool: you can't really predict with ARC when the object releases will happen, and the pool gives you an explicit point in time by which they will have happened at the latest, which makes your program more deterministic and easier for you and other humans to understand.
To avoid this issue, you should use an explicit autorelease pool when accessing a Realm from a dispatch queue.
Does this mean that we must use an explicit autorelease pool every time in GCD, even under ARC?
I disagree with the currently accepted answer. On any background thread (especially thread pools such as GCD), you should always force the close of the Realm instance as soon as it is no longer needed, to avoid version retention. In iOS, forcing the close of a Realm instance is possible with autoreleasepool { ... }.
So for background threads, it is generally recommended to always use an explicit autorelease pool:
dispatch_async(queue) {
    autoreleasepool {
        let realm = try! Realm()
        // ...
    }
}
It's also preferable to minimize the number of transactions you commit from background threads, so you should try to have one transaction instead of N:
// Perform all of the writes in a single transaction
realm.beginWrite()
for idx1 in 0..<1000 {
    // Add row via dictionary. Property order is ignored.
    for idx2 in 0..<1000 {
        realm.create(Person.self, value: [
            "name": "\(idx1)",
            "birthdate": NSDate(timeIntervalSince1970: NSTimeInterval(idx2))
        ])
    }
}
// Commit the write transaction
// to make this data available to other threads
try! realm.commitWrite()
Consider this simple Swift code that logs device motion data to a CSV file on disk.
let motionManager = CMMotionManager()
var handle: NSFileHandle? = nil

override func viewDidLoad() {
    super.viewDidLoad()
    let documents = NSSearchPathForDirectoriesInDomains(.DocumentDirectory, .UserDomainMask, true)[0] as NSString
    let file = documents.stringByAppendingPathComponent("/data.csv")
    NSFileManager.defaultManager().createFileAtPath(file, contents: nil, attributes: nil)
    handle = NSFileHandle(forUpdatingAtPath: file)

    motionManager.startDeviceMotionUpdatesToQueue(NSOperationQueue.currentQueue(), withHandler: { (data, error) in
        let data_points = [data.timestamp, data.attitude.roll, data.attitude.pitch, data.attitude.yaw,
                           data.userAcceleration.x, data.userAcceleration.y, data.userAcceleration.z,
                           data.rotationRate.x, data.rotationRate.y, data.rotationRate.z]
        let line = ",".join(data_points.map { $0.description }) + "\n"
        let encoded = line.dataUsingEncoding(NSUTF8StringEncoding)!
        self.handle!.writeData(encoded)
    })
}
I've been stuck on this for days. There appears to be a memory leak, as memory
consumption steadily increases until the OS suspends the app for exceeding resources.
It's critical that this app be able to run for long periods without interruption. Some notes:
I've tried using NSOutputStream and a CSV-writing library (CHCSVParser), but the issue is still present
Executing the logging code asynchronously (wrapping startDeviceMotionUpdatesToQueue in dispatch_async) does not remove the issue
Performing the sensor data processing in a background NSOperationQueue does fix the issue (only when maxConcurrentOperationCount >= 2). However, that causes concurrency issues in file writing: the output file is garbled, with lines interleaved with one another.
The issue does not seem to appear when logging accelerometer data only, but does seem to appear when logging multiple sensors (e.g. accelerometer + gyroscope). Perhaps there's a threshold of file writing throughput that triggers this issue?
The memory spikes seem to be spaced out at roughly 10 second intervals (steps in the above graph). Perhaps that's indicative of something? (could be an artifact of the memory instrumentation infrastructure, or perhaps it's garbage collection)
Any pointers? I've tried to use Instruments, but I don't have the skills to use it effectively. It seems that the exploding memory usage is caused by __NSOperationInternal. Here's a sample Instruments trace.
Thank you.
First, see this answer of mine:
https://stackoverflow.com/a/28566113/341994
You should not be looking at the Memory graphs in the debugger; believe only what Instruments tells you. Debug builds and Release builds are memory-managed very differently in Swift.
Second, if there is still trouble, try wrapping the interior of your handler in an autoreleasepool closure. I do not expect that that would make a difference, however (as this is not a loop), and I do not expect that it will be necessary, as I suspect that using Instruments will reveal that there was never any problem in the first place. However, the autoreleasepool call will make sure that autoreleased objects are not given a chance to accumulate.
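If you do try that, the change is mechanical; here is a sketch of the per-callback pool, reduced to plain Foundation (logLine is a hypothetical helper standing in for the body of the motion handler):

```swift
import Foundation

// Each call wraps its work in its own pool, so any autoreleased Foundation
// objects (the joined string, the encoded Data) die when the call returns.
func logLine(_ values: [Double], to handle: FileHandle) {
    autoreleasepool {
        let line = values.map { String($0) }.joined(separator: ",") + "\n"
        if let encoded = line.data(using: .utf8) {
            handle.write(encoded)
        }
    }
}
```

In the question's code, the motion handler would call something shaped like this with its ten sensor values.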
Is there a technique for avoiding undue memory consumption by testing the availability of memory before it's allocated? I understand that the general iOS approach is to optimize memory usage and respond to didReceiveMemoryWarning when necessary, but sometimes that doesn't cut it.
In my use case (image processing), I'm allocating space for a (potentially) large image using UIGraphicsBeginImageContext(). If the image is too big, I eventually get a didReceiveMemoryWarning. But, it's too late at that point: from a user experience perspective, it would've been better to prevent the user from working with such a large image to begin with; it would make more sense to say, "Sorry! Image size too big! Do something else!" before creating it than to say, "Ooops! Crashing now!"
I found a few SO threads on querying available memory and/or total physical memory, but using them is a messy and unreliable solution: there's no way to tell how much memory the OS is actually going to let you use at a given point in time, regardless of how much is free.
Basically, I want these semantics: (in "Swift-Java-ese")
try {
    UIGraphicsBeginImageContext(CGRect(x: reallyBig, y: reallyBig))
}
catch NotEnoughMemoryException {
    directUserToPickSmallerImage()
}

// The memory is mine; it's OK to use it
continueUsingBigImage()
Is there a methodology for doing this in iOS?
You might try pre-flighting with NSMutableData's init?(length: Int) and checking for nil.
let data: NSMutableData? = NSMutableData(length: 1000)
if data != nil {
    println("Success")
} else {
    println("Failure")
}
My question is about Core Data and memory not being released. I am doing a sync process, importing data from a web service which returns JSON. I load the data to import into memory, loop through it, and create NSManagedObjects. The imported data needs to create objects that have relationships to other objects; in total there are around 11,000. But to isolate the problem, I am right now only creating the items of the first and second level, leaving the relationships out; those are 9,043 objects.
I started checking the amount of memory used because the app was crashing at the end of the process (with the full data set). The first memory check is after loading the JSON into memory, so that the measurement only takes into consideration the creation and insertion of the objects into Core Data. What I use to check the memory used is this code (source):
- (void)get_free_memory {
    struct task_basic_info info;
    mach_msg_type_number_t size = sizeof(info);
    kern_return_t kerr = task_info(mach_task_self(),
                                   TASK_BASIC_INFO,
                                   (task_info_t)&info,
                                   &size);
    if (kerr == KERN_SUCCESS) {
        NSLog(@"Memory in use (in bytes): %f", (float)(info.resident_size/1024.0)/1024.0);
    } else {
        NSLog(@"Error with task_info(): %s", mach_error_string(kerr));
    }
}
My setup:
1 Persistent Store Coordinator
1 Main ManagedObjectContext (MMC) (NSMainQueueConcurrencyType used to read (only reading) the data in the app)
1 Background ManagedObjectContext (BMC) (NSPrivateQueueConcurrencyType, undoManager is set to nil, used to import the data)
The BMC is independent of the MMC: BMC is not a child context of MMC, and they do not share any parent context. I don't need BMC to notify changes to MMC; BMC only needs to create/update/delete the data.
Platform:
iPad 2 and 3
iOS: I have tested deployment targets 5.1 and 6.1; there is no difference
Xcode 4.6.2
ARC
Problem:
While importing the data, the memory used doesn't stop increasing, and iOS doesn't seem able to release it, even after the end of the process. As the data sample grows, this leads to memory warnings and then to the app being killed.
Research:
Apple documentation
Efficiently importing Data
Reducing Memory Overhead
Good recap of the points to have in mind when importing data to Core Data (Stackoverflow)
Tests done and analysis of the memory release. He seems to have the same problem as I do, and he sent an Apple bug report, with no response yet from Apple. (Source)
Importing and displaying large data sets (Source)
Indicates the best way to import large amount of data. Although he mentions:
"I can import millions of records in a stable 3MB of memory without
calling -reset."
This makes me think this might be somehow possible? (Source)
Tests:
Data Sample: creating a total of 9043 objects.
Turned off the creation of relationships, as the documentation says they are "expensive"
No fetching is being done
Code:
- (void)processItems {
    [self.context performBlock:^{
        for (int i = 0; i < [self.downloadedRecords count];) {
            @autoreleasepool {
                [self get_free_memory]; // prints current memory used
                for (NSUInteger j = 0; j < batchSize && i < [self.downloadedRecords count]; j++, i++) {
                    NSDictionary *record = [self.downloadedRecords objectAtIndex:i];
                    Item *item = [self createItem];
                    objectsCount++;
                    // fills in the item object with data from the record, no relationship creation is happening
                    [self updateItem:item WithRecord:record];
                    // creates the subitems, fills them in with data from record, relationship creation is turned off
                    [self processSubitemsWithItem:item AndRecord:record];
                }
                // Context save is done before draining the autoreleasepool, as specified in research 5)
                [self.context save:nil];
                // Faulting all the created items
                for (NSManagedObject *object in [self.context registeredObjects]) {
                    [self.context refreshObject:object mergeChanges:NO];
                }
                // Double-tap the previous action by resetting the context
                [self.context reset];
            }
        }
    }];
    [self check_memory]; // performs a repeated selector to [self get_free_memory] to view the memory after the sync
}
Measurement:
It goes from 16.97 MB to 30 MB; after the sync it goes down to 28 MB. Repeating the get_free_memory call every 5 seconds keeps the memory at 28 MB.
Other tests without any luck:
recreating the persistent store as indicated in research 2) has no effect
tested letting the thread wait a bit to see if the memory recovers, example 4)
setting the context to nil after the whole process
doing the whole process without saving the context at any point (therefore losing the info). That actually resulted in less memory being retained, leaving it at 20 MB. But it still doesn't decrease and... I need the info stored :)
Maybe I am missing something, but I have really tested a lot, and after following the guidelines I would expect to see the memory decrease again. I have run the Allocations instrument to check heap growth, and this seems to be fine too. Also no memory leaks.
I am running out of ideas to test/adjust... I would really appreciate if anyone could help me with ideas of what else I could test, or maybe pointing to what I am doing wrong. Or it is just like that, how it is supposed to work... which I doubt...
Thanks for any help.
EDIT
I have used Instruments to profile the memory usage with the Activity Monitor template, and the result shown in "Real Memory Usage" is the same as the one printed in the console with get_free_memory; the memory still never seems to get released.
OK, this is quite embarrassing... Zombies were enabled in the scheme: in the Arguments tab they were turned off, but in Diagnostics "Enable Zombie Objects" was checked...
Turning this off keeps the memory stable.
Thanks to those who read through the question and tried to solve it!
It seems to me the key takeaway of your favorite source ("3MB, millions of records") is the batching that is mentioned (beside disabling the undo manager, which is also recommended by Apple and very important).
I think the important thing here is that this batching has to apply to the @autoreleasepool as well.
It's insufficient to drain the autorelease pool every 1000
iterations. You need to actually save the MOC, then drain the pool.
In your code, try putting a second @autoreleasepool into the second for loop. Then adjust your batch size to fine-tune.
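Stripped of the Core Data specifics, the shape of that change looks like this (a sketch in Swift; importInBatches and saveBatch are hypothetical names, with saveBatch standing in for creating the objects, saving the MOC, and refreshing/resetting, all before the batch's pool drains):

```swift
import Foundation

// Generic batched-import skeleton: one autoreleasepool per batch, drained only
// after the batch has been handed to `saveBatch` (i.e. after the MOC save).
func importInBatches<T>(_ records: [T], batchSize: Int, saveBatch: ([T]) -> Void) -> Int {
    var batches = 0
    var index = 0
    while index < records.count {
        autoreleasepool {
            let end = min(index + batchSize, records.count)
            saveBatch(Array(records[index..<end]))   // create objects + save + refresh/reset here
            index = end
            batches += 1
        }
    }
    return batches
}
```

The batch size then becomes the single tuning knob: larger batches mean fewer saves but a higher per-batch memory peak.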
I have made tests with more than 500,000 records on an original iPad 1. The size of the JSON string alone was close to 40 MB. Still, it all works without crashes, and some tuning even leads to acceptable speed. In my tests, I could claim up to approximately 70 MB of memory on an original iPad.