Fully dealloc NSMutableDictionary / NSObject - ios

I am retrieving a tremendous amount of data from a server, and I have to quickly initialize and deallocate an NSMutableDictionary.
I would use Core Data, but the requirements say I have to use one allocated object for fast storage of incoming JSON data, save it to NSUserDefaults, and completely remove it. This will happen multiple times in a background process over a period of time.
Tests were run on a live iPhone 6S (not a simulator).
I first tested a completely bare "Single View Application": a bare-bones project alone consumes 9 MB of memory.
I then ran a test on an NSMutableDictionary. My test was to initialize it with data, remove all objects, and watch the RAM be restored.
The dictionary stored 10,000 tiny values (in viewDidLoad - for testing purposes):
_largeDictionary = [[NSMutableDictionary alloc] initWithDictionary:@{
    @"Item00000" : [NSNumber numberWithInt:0],
    ...
    @"Item10000" : [NSNumber numberWithInt:10000]}];
After initializing, I waited two seconds, removed all objects, and attempted to release the NSMutableDictionary.
dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(2.0 * NSEC_PER_SEC)), dispatch_get_main_queue(), ^{
    [_largeDictionary removeAllObjects];
    _largeDictionary = nil;
});
But the memory consumption did not drop; nothing indicated the NSMutableDictionary was being removed.
Does anyone have a solution to this problem? I thought this was taken care of when Apple deprecated the autorelease method. My live environment quickly initializes relatively large chunks of data per key-value pair, then deallocates and prepares for the next chunk. The next chunk may be stored in a different NSMutableDictionary. That's why it is necessary for me to completely remove the first NSMutableDictionary from memory. I have seen many answers pertaining to this question, but none more recent than 2013.
Update: this @property is (nonatomic, strong)
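One pattern worth noting for this kind of workload (a hedged sketch, not a confirmed fix; the method and key names are hypothetical): do each chunk's work inside an explicit @autoreleasepool so temporaries are released as soon as the chunk completes. Note also that even after every strong reference is gone, the heap allocator may keep the freed pages resident, so memory gauges can continue to show the old high-water mark without this being a leak.

- (void)processChunk:(NSDictionary *)chunk
{
    @autoreleasepool {
        NSMutableDictionary *buffer =
            [[NSMutableDictionary alloc] initWithDictionary:chunk];
        // NSUserDefaults only accepts property-list types (strings,
        // numbers, data, dates, arrays, dictionaries).
        [[NSUserDefaults standardUserDefaults] setObject:buffer
                                                  forKey:@"latestChunk"];
        [buffer removeAllObjects];
        buffer = nil; // last strong reference gone; the pool drains below
    }
}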
Thank you in advance.

Related

Objective-C iOS: Begin populating UITableView immediately from JSON Stream

I have a UITableView with cells to display images and text from a largish (5000 items) JSON file. I want to stream the JSON in and start updating the UITableView immediately, but can't seem to work out the plumbing for this.
- (NSArray *)parseJSONIntoImageObjectsFromData:(NSData *)rawJSONData {
    NSError *error;
    NSMutableArray *arrayOfImageObjects = [[NSMutableArray alloc] init];
    NSURL *myURL = [[NSURL alloc] initWithString:self.urlString];
    NSData *objects = [NSData dataWithContentsOfURL:myURL];
    NSInputStream *stream = [[NSInputStream alloc] initWithData:objects];
    [stream open];
    NSArray *arrayFromStream = [NSJSONSerialization JSONObjectWithStream:stream options:NSJSONReadingAllowFragments error:&error];
    for (NSDictionary *JSonDictionary in arrayFromStream) {
        NSLog(@"Count is %lu", (unsigned long)arrayOfImageObjects.count);
        NSInteger imgID = [JSonDictionary[@"id"] integerValue];
        ImageObject *newImageObject = [[ImageObject alloc] initWithID:imgID andTitle:JSonDictionary[@"title"] andThumbnailURL:JSonDictionary[@"thumbnailUrl"]];
        [arrayOfImageObjects addObject:newImageObject];
    }
    return arrayOfImageObjects;
}
This definitely gets them as a stream, as the NSLog output in the debug window reveals. But since the method waits for the full result before returning, it has to complete before anything is displayed. I'm a little puzzled about how to approach this and can't find a good code sample. Do I perhaps return a stream?
EDIT: I am not terribly concerned about the brief delay I am encountering, and I am sure the delay lies more in the retrieval than in the parsing; I just want to learn to retrieve the data as a stream and update the UITableView incrementally as a way to do this better. I enjoy working on data retrieval and manipulation and am trying to improve my skills by knowing more.
Also, the images are retrieved asynchronously at display time using an NSOperationQueue and don't really matter for this task.
If you benchmark this, I think you'll find that the parsing time of the JSON is inconsequential. The slow parts are going to be the download of the original JSON (and possibly the creation of the ImageObject objects). You should benchmark this in Instruments (use the "time profiler" tool) and use the "record waiting threads" option. See WWDC video Building Concurrent User Interfaces on iOS for a demonstration on how to use Instruments to diagnose these sorts of issues.
I would first retire dataWithContentsOfURL, as that runs synchronously. I would advise an asynchronous technique such as NSURLSession's dataTaskWithURL:completionHandler: (or, if you need to support pre-iOS 7, NSURLConnection's sendAsynchronousRequest:queue:completionHandler:).
Usually in these cases, the JSON is small enough, that the biggest delay stems from the network latency in making the initial request. I mention that so that you don't bother embarking on some major refactoring of the code for paging/streaming approaches without confirming that this will solve the problem.
Also, you haven't shared the ImageObject logic, but if it loads images synchronously, that's a likely candidate for refactoring to asynchronous retrieval, too. Without knowing more about that class, it's hard to advise further on that point.
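To make the first point concrete, here is a minimal sketch (the imageObjects property is an assumption) that fetches the JSON asynchronously with NSURLSession, parses it off the main thread, and reloads the table on the main queue:

NSURL *url = [NSURL URLWithString:self.urlString];
NSURLSessionDataTask *task = [[NSURLSession sharedSession] dataTaskWithURL:url
    completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
        // The completion handler runs on a background queue by default.
        if (!data) {
            NSLog(@"Download failed: %@", error);
            return;
        }
        NSArray *parsed = [self parseJSONIntoImageObjectsFromData:data];
        dispatch_async(dispatch_get_main_queue(), ^{
            self.imageObjects = parsed; // assumed property backing the table
            [self.tableView reloadData];
        });
    }];
[task resume];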
Define NSMutableArray *arrayOfImageObjects as a property or instance variable outside this method, and then in your for loop call [self.tableView reloadData] after maybe every 100 objects.
That's assuming that your numberOfRowsInSection is keying off of arrayOfImageObjects as well and cellForRowAtIndexPath is using it to populate the table data.
But also consider 'paging' your data, so as to load only 50 or so objects at once (assuming your API supports something like 'http://example.com/imagedata?page=1'). Then if the user flicks or scrolls the table view, you can make another API call, increasing the page number, adding the new set of data to your current set, and calling reloadData.
EDIT: I'm also assuming your parseJSONIntoImageObjectsFromData is running asynchronously. If not, then use something like AFNetworking (or NSURLConnection's sendAsynchronousRequest:queue:completionHandler:), and in the completion block you can start adding to your array; a sketch of the incremental reload follows.
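A hedged sketch of that incremental reload, assuming self.imageObjects backs both numberOfRowsInSection and cellForRowAtIndexPath, and that a hypothetical helper imageObjectFromJSON: builds one model object:

NSUInteger count = 0;
for (NSDictionary *JSonDictionary in arrayFromStream) {
    [self.imageObjects addObject:[self imageObjectFromJSON:JSonDictionary]];
    if (++count % 100 == 0) {
        // Hop to the main queue so the table picks up each batch of 100 rows.
        dispatch_async(dispatch_get_main_queue(), ^{
            [self.tableView reloadData];
        });
    }
}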

Core Data Import - Not releasing memory

My question is about Core Data and memory not being released. I am doing a sync process that imports data from a web service which returns JSON. I load the data to import into memory, loop through it, and create NSManagedObjects. The imported data needs to create objects that have relationships to other objects; in total there are around 11,000. But to isolate the problem, I am right now only creating the items of the first and second level, leaving the relationships out; those are 9,043 objects.
I started checking the amount of memory used because the app was crashing at the end of the process (with the full data set). The first memory check is after loading the JSON into memory, so that the measurement only takes into consideration the creation and insertion of the objects into Core Data. What I use to check the memory used is this code (source):
- (void)get_free_memory {
    struct task_basic_info info;
    mach_msg_type_number_t size = sizeof(info);
    kern_return_t kerr = task_info(mach_task_self(),
                                   TASK_BASIC_INFO,
                                   (task_info_t)&info,
                                   &size);
    if (kerr == KERN_SUCCESS) {
        NSLog(@"Memory in use (in MB): %f", (float)(info.resident_size/1024.0)/1024.0);
    } else {
        NSLog(@"Error with task_info(): %s", mach_error_string(kerr));
    }
}
My setup:
1 Persistent Store Coordinator
1 Main ManagedObjectContext (MMC) (NSMainQueueConcurrencyType used to read (only reading) the data in the app)
1 Background ManagedObjectContext (BMC) (NSPrivateQueueConcurrencyType, undoManager is set to nil, used to import the data)
The BMC is independent to the MMC, so BMC is no child context of MMC. And they do not share any parent context. I don't need BMC to notify changes to MMC. So BMC only needs to create/update/delete the data.
Platform:
iPad 2 and 3
iOS: I have tested deployment targets 5.1 and 6.1; there is no difference
Xcode 4.6.2
ARC
Problem:
While importing the data, the used memory doesn't stop increasing, and iOS doesn't seem able to drain it even after the end of the process. As the data sample grows, this leads to memory warnings and then to the app being closed.
Research:
Apple documentation
Efficiently importing Data
Reducing Memory Overhead
Good recap of the points to have in mind when importing data to Core Data (Stackoverflow)
Tests done and analysis of the memory release. He seems to have the same problem as I do, and he sent an Apple bug report with no response yet from Apple. (Source)
Importing and displaying large data sets (Source)
Indicates the best way to import large amount of data. Although he mentions:
"I can import millions of records in a stable 3MB of memory without
calling -reset."
This makes me think this might be somehow possible? (Source)
Tests:
Data Sample: creating a total of 9043 objects.
Turned off the creation of relationships, as the documentation says they are "expensive"
No fetching is being done
Code:
- (void)processItems {
    [self.context performBlock:^{
        for (int i = 0; i < [self.downloadedRecords count];) {
            @autoreleasepool {
                [self get_free_memory]; // prints current memory used
                for (NSUInteger j = 0; j < batchSize && i < [self.downloadedRecords count]; j++, i++) {
                    NSDictionary *record = [self.downloadedRecords objectAtIndex:i];
                    Item *item = [self createItem];
                    objectsCount++;
                    // fills in the item object with data from the record; no relationship creation is happening
                    [self updateItem:item WithRecord:record];
                    // creates the subitems and fills them in with data from the record; relationship creation is turned off
                    [self processSubitemsWithItem:item AndRecord:record];
                }
                // Context save is done before draining the autorelease pool, as specified in research 5)
                [self.context save:nil];
                // Fault all the created items
                for (NSManagedObject *object in [self.context registeredObjects]) {
                    [self.context refreshObject:object mergeChanges:NO];
                }
                // Double-tap the previous action by resetting the context
                [self.context reset];
            }
        }
    }];
    [self check_memory]; // performs a repeated selector to [self get_free_memory] to view the memory after the sync
}
Measurement:
It goes from 16.97 MB to 30 MB; after the sync it goes down to 28 MB. Repeating the get_free_memory call every 5 seconds keeps the memory at 28 MB.
Other tests without any luck:
recreating the persistent store as indicated in research 2) has no effect
letting the thread wait a bit to see if the memory recovers, as in example 4)
setting the context to nil after the whole process
doing the whole process without saving the context at any point (therefore losing the info). That actually resulted in less memory being retained, leaving it at 20 MB. But it still doesn't decrease and... I need the info stored :)
Maybe I am missing something, but I have really tested a lot, and after following the guidelines I would expect to see the memory decrease again. I have run the Allocations instrument to check heap growth, and this seems fine too. Also, no memory leaks.
I am running out of ideas to test/adjust... I would really appreciate it if anyone could help me with ideas of what else I could test, or point out what I am doing wrong. Or maybe it is just like that, how it is supposed to work... which I doubt...
Thanks for any help.
EDIT
I have used Instruments to profile the memory usage with the Activity Monitor template, and the result shown in "Real Memory Usage" is the same as the one printed in the console by get_free_memory; the memory still never seems to get released.
OK, this is quite embarrassing... Zombies were enabled in the scheme: they were turned off under Arguments, but "Enable Zombie Objects" was checked under Diagnostics...
Turning this off keeps the memory stable.
Thanks to everyone who read through the question and tried to solve it!
It seems to me the key takeaway of your favorite source ("3 MB, millions of records") is the batching that is mentioned -- besides disabling the undo manager, which is also recommended by Apple and very important.
I think the important thing here is that this batching has to apply to the @autoreleasepool as well.
It's insufficient to drain the autorelease pool every 1000
iterations. You need to actually save the MOC, then drain the pool.
In your code, try putting a second @autoreleasepool into the second for loop, then adjust your batch size to fine-tune; a sketch follows.
I have made tests with more than 500,000 records on an original iPad 1. The size of the JSON string alone was close to 40 MB. Still, it all works without crashes, and some tuning even leads to acceptable speed. In my tests, I could claim up to approx. 70 MB of memory on an original iPad.
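A hedged sketch of that nesting, reusing the names from the question's processItems (where exactly the inner pool sits, and the batch size, are things to tune):

[self.context performBlock:^{
    for (int i = 0; i < [self.downloadedRecords count];) {
        @autoreleasepool {
            for (NSUInteger j = 0; j < batchSize && i < [self.downloadedRecords count]; j++, i++) {
                @autoreleasepool { // second pool: per-record temporaries die here
                    NSDictionary *record = [self.downloadedRecords objectAtIndex:i];
                    Item *item = [self createItem];
                    [self updateItem:item WithRecord:record];
                    [self processSubitemsWithItem:item AndRecord:record];
                }
            }
            [self.context save:nil]; // save first, then let the outer pool drain
            [self.context reset];
        }
    }
}];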

Huge memory consumption while parsing JSON and creating NSManagedObjects

I'm parsing a JSON file on an iPad which is about 53 MB. The parsing is working fine; I'm using YAJLParser, which is a SAX-style parser, and have set it up like this:
NSData *data = [NSData dataWithContentsOfFile:path options:NSDataReadingMappedAlways|NSDataReadingUncached error:&parseError];
YAJLParser *parser = [[YAJLParser alloc] init];
parser.delegate = self;
[parser parse:data];
Everything worked fine until now, but the JSON file became bigger and now I'm suddenly experiencing memory warnings on the iPad 2. It receives 4 memory warnings and then just crashes. On the iPad 3 it works flawlessly, without any memory warnings.
I have started profiling it with Instruments and found a lot of CFNumber allocations (I stopped Instruments after a couple of minutes; I had it run until the crash before, and the CFNumber figure was at about 60 MB or more).
Opening the CFNumber detail showed a huge list of allocations (the screenshots are not reproduced here).
So what am I doing wrong? And what does that number (e.g. 72.8% in the last screenshot) stand for? I'm using ARC, so I'm not doing any release or retain calls myself.
Thanks for your help.
Cheers
EDIT: I have already asked the question about how to parse such huge files here: iPad - Parsing an extremely huge json - File (between 50 and 100 mb)
So the parsing itself seems to be fine.
See Apple's Core Data documentation on Efficiently Importing Data, particularly "Reducing Peak Memory Footprint".
You will need to make sure you don't have too many new entities in memory at once, which involves saving and resetting your context at regular intervals while you parse the data, as well as using autorelease pools well.
The general pseudo code would be something like this:
while (there is new data) {
    @autoreleasepool {
        importAnItem();
        if (we have imported more than 100 items) {
            [context save:...];
            [context reset];
        }
    }
}
So basically, put an autorelease pool around your main loop or parsing code. Count how many NSManagedObject instances you have created, and periodically save and reset the managed object context to flush these out of memory. This should keep your memory footprint down. The number 100 is arbitrary and you might want to experiment with different values.
Because you are saving the context for each batch, you may want to import into a temporary copy of your store in case something goes wrong and leaves you with a partial import. When everything is finished you can overwrite the original store.
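One way to realize that temporary-copy idea (a sketch only: self.storeDirectory and importSucceeded are hypothetical, and a second coordinator/context pointed at the copy would do the actual import):

NSFileManager *fm = [NSFileManager defaultManager];
NSURL *liveURL = [self.storeDirectory URLByAppendingPathComponent:@"Store.sqlite"];
NSURL *tempURL = [self.storeDirectory URLByAppendingPathComponent:@"Import.sqlite"];
[fm removeItemAtURL:tempURL error:NULL];             // discard any stale copy
[fm copyItemAtURL:liveURL toURL:tempURL error:NULL]; // work on the copy
// ... attach a persistent store coordinator to tempURL and run the batched import ...
if (importSucceeded) {
    [fm removeItemAtURL:liveURL error:NULL];
    [fm moveItemAtURL:tempURL toURL:liveURL error:NULL]; // swap the copy in
}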
Try using [self.managedObjectContext refreshObject:obj mergeChanges:NO] after a certain number of insert operations. This turns the NSManagedObjects back into faults and frees up some memory; a sketch follows below.
Apple Docs on provided methods
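A hedged sketch of that suggestion (the insertCount counter and the batch size of 500 are assumptions to tune):

if (++insertCount % 500 == 0) {
    [self.managedObjectContext save:nil];
    // Turn everything the context is holding back into faults.
    for (NSManagedObject *obj in self.managedObjectContext.registeredObjects) {
        [self.managedObjectContext refreshObject:obj mergeChanges:NO];
    }
}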

My Grand Central Dispatch usage: Am I using it correctly?

I am fetching data from a server in JSON format. It's only about 150 records, and I was not using GCD initially, but every now and again, when I hit the button in the app to view the table with data, it would delay for a couple of seconds before switching to the table view and displaying the data. So I implemented GCD, and now when I hit the button it switches to the table view immediately, but then there is a few seconds' delay in loading the data, which seems longer than the pre-GCD implementation. So I'm not sure if I am using GCD correctly, or if it's my server causing the delay (which I think is the culprit). Here is the implementation of GCD in a method called retrieveData, which I call in viewDidLoad as [self retrieveData]:
- (void)retrieveData
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSURL *url = [NSURL URLWithString:@"http://MY_URL/JSON/document.json"];
        NSData *data = [NSData dataWithContentsOfURL:url];
        dispatch_async(dispatch_get_main_queue(), ^{
            json = [NSJSONSerialization JSONObjectWithData:data options:kNilOptions error:nil];
            // Set up our exhibitors array
            exhibitorsArray = [[NSMutableArray alloc] init];
            for (int i = 0; i < json.count; i++) {
                // create exhibitors object
                NSString *blabel = [[json objectAtIndex:i] objectForKey:@"BoothLabel"];
                NSString *bName = [[json objectAtIndex:i] objectForKey:@"Name"];
                NSString *bURL = [[json objectAtIndex:i] objectForKey:@"HyperLnkFldVal"];
                exhibitors *myExhibitors = [[exhibitors alloc] initWithBoothName:bName andboothLabel:blabel andBoothURL:bURL];
                // Add our exhibitors object to our exhibitorsArray
                [exhibitorsArray addObject:myExhibitors];
                // Sort by name
                NSSortDescriptor *sort = [NSSortDescriptor sortDescriptorWithKey:@"name" ascending:YES];
                [exhibitorsArray sortUsingDescriptors:[NSMutableArray arrayWithObject:sort]];
            }
            [self.myTableView reloadData];
        });
    });
}
This is basically correct. Dispatch the data retrieval to the background queue, and then dispatch the model and UI update back to the main queue. Well done.
In terms of it being slower, I don't see anything there that would account for that. GCD introduces some overhead, but generally not observable. It may be a bit of a "watched pot never boils" issue.
A couple of unrelated thoughts, though:
I might suggest moving the sort outside the for loop, just before the reloadData: you're currently sorting the array 150 times. (If you were doing an insertion sort you could sort within the loop, but that's not what's happening here.) I'm not sure the performance gain will be observable, but there should be some modest improvement; see the sketch after this list.
You might want to make sure data is not nil (e.g. no network, or some other network issue), because if it is, JSONObjectWithData will crash; the sketch after this list guards against that.
Your json object is an external variable. It should probably be a local variable of your retrieveData method; there's no need to make it an instance variable, and a local variable is cleaner where appropriate.
You probably should adopt the naming convention whereby class names start with uppercase letters (e.g. Exhibitor instead of exhibitors).
Very minor point, but your blabel variable should probably be bLabel. Even better, I might rename these three variables boothLabel, boothName, and boothUrlString.
You're using an instance variable for exhibitorsArray. I presume you're doing this elsewhere, too. You might want to consider using declared properties instead.
You might want to turn on the network activity indicator before dispatching your code to the background, and turn it back off when you perform reloadData.
[[UIApplication sharedApplication] setNetworkActivityIndicatorVisible:YES];
If you wanted to get really sophisticated, you might reconsider whether you want to use GCD's global queues (because if you refresh 10 times quickly, all ten requests will run, whereas you probably only want the last one to run). This is a more complicated topic, so I won't cover it here, but if you're interested, you might want to refer to the discussion of operation queues in the Concurrency Programming Guide, in which you can create operations that are cancelable (and thus, when initiating a new operation, cancel the prior one(s)).
You might also want to refer to the Building Concurrent User Interfaces on iOS WWDC 2012 video.
But this is all tangential to your original question: Yes, you have tackled this appropriately.
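A hedged sketch pulling together the two points flagged above (guard against nil data, sort once after the loop); the names mirror the question's code:

dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    NSURL *url = [NSURL URLWithString:@"http://MY_URL/JSON/document.json"];
    NSData *data = [NSData dataWithContentsOfURL:url];
    dispatch_async(dispatch_get_main_queue(), ^{
        if (!data) {
            NSLog(@"Download failed; skipping parse");
            return; // JSONObjectWithData: raises on nil data
        }
        NSArray *parsedJson = [NSJSONSerialization JSONObjectWithData:data options:kNilOptions error:nil];
        NSMutableArray *results = [[NSMutableArray alloc] init];
        for (NSDictionary *record in parsedJson) {
            exhibitors *myExhibitors = [[exhibitors alloc] initWithBoothName:[record objectForKey:@"Name"]
                                                               andboothLabel:[record objectForKey:@"BoothLabel"]
                                                                 andBoothURL:[record objectForKey:@"HyperLnkFldVal"]];
            [results addObject:myExhibitors];
        }
        // Sort once, after the loop, instead of on every iteration.
        NSSortDescriptor *sort = [NSSortDescriptor sortDescriptorWithKey:@"name" ascending:YES];
        [results sortUsingDescriptors:@[sort]];
        exhibitorsArray = results;
        [self.myTableView reloadData];
    });
});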

Thread safety of NSMutableDictionary access and destruction

I have an application that downloads information from a web service and caches it in memory. Specifically, my singleton cache class contains an instance variable NSMutableDictionary *memoryDirectory which contains all of the cached data. The data in this cache can be redownloaded easily, so when I receive a UIApplicationDidReceiveMemoryWarningNotification I call a method that simply does this:
- (void)dumpCache:(NSNotification *)notification
{
    memoryDirectory = nil;
}
I’m a little worried about the thread safety here. (I’ll admit I don’t know much about threads in general, much less in Cocoa’s implementation.) The cache is a mutable dictionary whose values are mutable dictionaries, so there are two levels of keys to access data. When I write to the cache I do something like this:
- (void)addDataToCache:(NSData *)data
                forKey:(NSString *)key
                subkey:(NSString *)subkey
{
    if (!memoryDirectory)
        memoryDirectory = [[NSMutableDictionary alloc] init];
    NSMutableDictionary *methodDictionary = [memoryDirectory objectForKey:key];
    if (!methodDictionary) {
        [memoryDirectory setObject:[NSMutableDictionary dictionary] forKey:key];
        methodDictionary = [memoryDirectory objectForKey:key];
    }
    [methodDictionary setObject:data forKey:subkey];
}
I’m worried that sometime in the middle of the process, dumpCache: is going to nil out the dictionary and I’m going to be left doing a bunch of setObject:forKey: calls that don’t do anything. This isn’t fatal, but you can imagine the problems that might come up if it happens while I’m reading the cache.
Is it sufficient to wrap all of my cache reads and writes in some kind of @synchronized block? If so, what should it look like? (And should my dumpCache: be similarly wrapped?) If not, how should I ensure that what I’m doing is safe?
Instead of using an NSMutableDictionary, consider using NSCache, which is thread safe. See this answer for example usage. Good luck!
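To make that concrete, a minimal sketch (the memoryCache property name is an assumption, and flattening the two key levels into one compound key is a simplification): NSCache is documented as thread safe, and it evicts entries on its own under memory pressure, so an explicit dumpCache: may not even be needed.

// Assumed: @property (nonatomic, strong) NSCache *memoryCache; created once.
- (void)addDataToCache:(NSData *)data
                forKey:(NSString *)key
                subkey:(NSString *)subkey
{
    NSString *compoundKey = [NSString stringWithFormat:@"%@/%@", key, subkey];
    [self.memoryCache setObject:data forKey:compoundKey];
}

- (NSData *)cachedDataForKey:(NSString *)key subkey:(NSString *)subkey
{
    NSString *compoundKey = [NSString stringWithFormat:@"%@/%@", key, subkey];
    return [self.memoryCache objectForKey:compoundKey];
}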
