Core Data - Trying to prefetch attributes for all rows is slow - ios

I'm displaying some images in a collection view, and using SDWebImage to prefetch thumbnails for all of them.
I'm using a batch size of 20 in the NSFetchRequest, but when I iterate over every object to get the URL of the image I need, the performance gain from batching is wasted. Reloading the data takes 0.3s instead of 0.000295s, which results in an obvious delay in the UI.
I've tried setting [request setPropertiesToFetch:@[@"propertyName"]] but it doesn't seem to make a difference. I guess it's not the method I'm looking for.
Any suggestions?
Edit:
I am using a UICollectionView backed by an NSFetchedResultsController (and its delegate) as the data source. I create an NSFetchRequest with a batch size of 20 and fetch it via the FRC. I also need to get the list of URLs from ALL the fetched objects, i.e.:
NSMutableArray *urlList = [NSMutableArray array];
for (NSManagedObjectSubclass *object in frc.fetchedObjects) {
    // This loop causes the slowdown, because it faults in all the objects;
    // not what you'd want if you have many objects!
    [urlList addObject:object.url];
}
// This runs in the background and downloads (or gets from cache) a list of images.
[[SDWebImagePrefetcher sharedImagePrefetcher] prefetchURLs:urlList];

See NSFetchRequest's class reference, -setPropertiesToFetch:. "This value is only used if resultType is set to NSDictionaryResultType."
You're correct that iterating over all of the fetched objects kills the benefit of the small batch size. I'm curious, though, why you need to prefetch all of the image URLs at once. If you're doing this to trigger a download, look into doing that in -awakeFromFetch instead: pass object.url to your SDWebImagePrefetcher, which will enqueue the request and start or continue processing. You'll have to subclass NSManagedObject to do this.
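If all you need is the list of URLs, one option consistent with the documentation quoted above is a separate dictionary-type fetch that reads only that attribute, so no managed objects are faulted in. The following is a minimal sketch, not a definitive implementation. It assumes an entity named Photo with a string attribute named url (both names are hypothetical; substitute your own), and the SDWebImage import path depends on how the library is integrated into your project.

```objc
#import <CoreData/CoreData.h>
#import <SDWebImage/SDWebImagePrefetcher.h> // path may differ with your SDWebImage setup

// Fetches only the `url` attribute as dictionaries and hands the URLs to SDWebImage.
// "Photo" and "url" are assumed names, not taken from the question.
// Call this on the queue or thread that owns `context`.
static void PrefetchAllThumbnails(NSManagedObjectContext *context) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Photo"];
    request.resultType = NSDictionaryResultType; // needed for propertiesToFetch to take effect
    request.propertiesToFetch = @[@"url"];

    NSError *error = nil;
    NSArray *rows = [context executeFetchRequest:request error:&error];
    if (rows == nil) {
        NSLog(@"URL fetch failed: %@", error);
        return;
    }

    NSMutableArray *urls = [NSMutableArray arrayWithCapacity:rows.count];
    for (NSDictionary *row in rows) {
        id value = row[@"url"];
        // Skip rows where the attribute is missing or not a string.
        if (![value isKindOfClass:[NSString class]]) continue;
        NSURL *url = [NSURL URLWithString:value];
        if (url) [urls addObject:url];
    }
    [[SDWebImagePrefetcher sharedImagePrefetcher] prefetchURLs:urls];
}
```

The fetched results controller can keep its batch size of 20 for display; this second request only reads one attribute per row.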

Related

Multi-threading with core data and API requests

Intro
I've read a lot of tutorials and articles on Core Data concurrency, but I'm having an issue that is not often covered, or at least not covered in a real-world way, and I'm hoping someone can help. I've checked the related questions on SO and, as far as I can find, none of them answers this particular question.
Background
We have an existing application which fetches data from an API (on a background thread) and then saves the returned records into Core Data. We also need to display these records in the application at that time.
So the process we currently go through is to:
Make a network request for data (background)
Parse the data and map the objects to NSManagedObjects and save (background)
In the completion handler (main thread) we fetch records from core data with the same order and limit that we requested from the API.
Most tutorials on core data concurrency follow this pattern of saving in one thread and then fetching in another, but most of them give examples like:
NSArray *listOfPeople = ...;
[NSManagedObjectHelper saveDataInBackgroundWithContext:^(NSManagedObjectContext *localContext) {
    for (NSDictionary *personInfo in listOfPeople) {
        PersonEntity *person = [PersonEntity createInContext:localContext];
        [person setValuesForKeysWithDictionary:personInfo];
    }
} completion:^{
    self.people = [PersonEntity findAll];
}];
Source
So regardless of the number of records you get back, you just fetch everything. This works for small datasets, but I want to be more efficient. I've read many times that you should not read or write managed objects across threads, and fetching afterwards gets around that issue, but I don't want to fetch everything; I just want the new records.
My Problem
So, for my real-world example: I want to make a request to my API for the latest information (maybe anything older than my oldest record in Core Data) and save it; then I need the exact data returned from the API on the main thread, ready for display.
So my question is: when I reach my completion handler, how do I know what to fetch, i.e. what did the API return? A couple of methods I've considered so far:
After saving each record, store the ID in a temporary array and then perform a fetch where id IN array_of_ids.
If I am asking for the latest records, I could just use the count of records returned, then use an order by and limit in my request to get the latest x records.
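The first option could look roughly like the sketch below. It assumes each record carries a unique server identifier stored in an attribute called remoteID (a hypothetical name) on the PersonEntity entity from the example above. The background code would collect the identifiers it saved and hand only those plain values to the completion handler.

```objc
#import <CoreData/CoreData.h>

// Call on the main thread with the main-queue context once the background save has finished.
// `savedIDs` holds plain values (e.g. NSNumber or NSString), which are safe to pass between threads.
// "remoteID" is an assumed attribute name.
static NSArray *FetchPeopleWithRemoteIDs(NSArray *savedIDs, NSManagedObjectContext *mainContext) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"PersonEntity"];
    request.predicate = [NSPredicate predicateWithFormat:@"remoteID IN %@", savedIDs];
    request.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"remoteID" ascending:NO]];

    NSError *error = nil;
    NSArray *people = [mainContext executeFetchRequest:request error:&error];
    if (people == nil) {
        NSLog(@"Fetch failed: %@", error);
        return @[];
    }
    return people;
}
```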
My Question
I realize that the above could be answering my own question, but I want to know whether there is a better way, or whether one of those methods is much better than the other. I just have this feeling that I am missing something.
Thanks
EDIT:
Neither answer below actually addresses the question. This is about fetching and saving data in the background and then using the returned data on the main thread. I know it's not a good idea to pass managed objects between threads, so the common way around this is to fetch from Core Data after inserting. I want to work out the most efficient way to do that.
Have you looked at NSFetchedResultsController? Instead of fetching the presented objects into an array, you use the fetched results controller in a similar fashion. Through NSFetchedResultsControllerDelegate you are notified of all the changes performed in the background (rows added, removed, changed), so no manual tracking is needed.
I feel you are missing the case of two simultaneous API calls. Neither storing IDs nor counting created entities will work in that case. Consider adding a timestamp property to each PersonEntity.
I'm assuming that your intention is to display recently updated persons.
The calculation of the oldest timestamp to display can look like this:
@property NSDate *lastViewRefreshTime;
@property NSDate *oldestEntityToDisplay;
(...)
if (self.lastViewRefreshTime.timeIntervalSinceNow < -3) {
    self.oldestEntityToDisplay = self.lastViewRefreshTime;
}
self.lastViewRefreshTime = [NSDate date];
[self displayPersonsAddedAfter:self.oldestEntityToDisplay];
Now, if two API responses return within a period shorter than 3s, their data will be displayed together.

Core data slow processing updates on a background thread

I am having a major problem with my application's speed when processing updates on a background thread. Instruments shows that almost all of this time is spent inside performBlockAndWait, where I am fetching the objects which need updating.
My updates may come in by the hundreds, depending on the amount of time spent offline, and the approach I am currently using is to process them individually; i.e. run a fetch request to pull out the object, update it, then save.
It sounds slow and it is. The problem I have is that I don't want to load everything into memory at once, so I need to fetch the objects individually as I go. I also save as I go to ensure that an issue with a single update won't mess up the rest.
Is there a better approach?
I hit similarly slow performance when upserting a large collection of objects. In my case I'm willing to keep the full change set in memory and perform a single save, so the large volume of fetch requests dominated my processing time.
I got a significant performance improvement from maintaining an in-memory cache mapping my resources' primary keys to NSManagedObjectIDs. That allowed me to use existingObjectWithID:error: rather than a fetch request for each individual object.
I suspect I might do even better by collecting the primary keys for all resources of a given entity description, issuing a single fetch request for all of them at once (batching those results as necessary), and then processing the changes to each resource.
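A minimal sketch of that single-fetch idea, assuming the primary key is stored in an attribute named remoteID (a hypothetical name; use whatever attribute holds your key). It returns a dictionary from key to existing object, so each incoming update becomes a dictionary lookup instead of a fetch request.

```objc
#import <CoreData/CoreData.h>

// Returns a dictionary mapping primary key -> existing managed object, using one fetch.
// Call from inside the context's performBlock: or performBlockAndWait:.
// "remoteID" is an assumed attribute name.
static NSDictionary *ExistingObjectsByKey(NSString *entityName,
                                          NSArray *primaryKeys,
                                          NSManagedObjectContext *context) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.predicate = [NSPredicate predicateWithFormat:@"remoteID IN %@", primaryKeys];

    NSError *error = nil;
    NSArray *matches = [context executeFetchRequest:request error:&error];
    if (matches == nil) {
        NSLog(@"Lookup fetch failed: %@", error);
        return @{};
    }

    NSMutableDictionary *byKey = [NSMutableDictionary dictionaryWithCapacity:matches.count];
    for (NSManagedObject *object in matches) {
        id key = [object valueForKey:@"remoteID"];
        if (key) byKey[key] = object;
    }
    return byKey;
}
```

To bound memory use, the keys can be processed in chunks (say, a few hundred at a time), calling this function once per chunk and saving after each one.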
You may benefit from using NSBatchUpdateRequest assuming you're targeting iOS 8+ only.
These guys have a great example of it but the TLDR is basically:
Example: Say we want to update all unread instances of MyObject to be marked as read:
NSBatchUpdateRequest *req = [[NSBatchUpdateRequest alloc] initWithEntityName:@"MyObject"];
req.predicate = [NSPredicate predicateWithFormat:@"read == %@", @(NO)];
req.propertiesToUpdate = @{
    @"read" : @(YES)
};
req.resultType = NSUpdatedObjectsCountResultType;
NSBatchUpdateResult *res = (NSBatchUpdateResult *)[context executeRequest:req error:nil];
NSLog(@"%@ objects updated", res.result);
Note the above example is taken from the aforementioned blog, I didn't write the snippet.

NSArray with NSManagedObject's relationships faults on second display in UITableView

I'm parsing a bunch of data and mapping it to Core Data NSManagedObjects. I pass these to my UITableViewController in an NSArray which I use as the dataSource.
The NSManagedObjects are often linked to other NSManagedObjects with relationships. Most of these entities that are linked via a relationship have the content I need to display (depending on the relationship). Initially the UITableView displays the content no problem. As soon as I start scrolling and the cell is re-used or if I scroll back to the same location, the cell is blank (I'm just debugging right now, so only displaying content as a string).
I'm logging the NSManagedObject and get the following before scrolling:
<NewPost: 0xd5dcc00> (entity: NewPost; id: 0xd5dcc60 <x-coredata:///NewPost/t63931035-BB67-467D-8598-CAD8563BA5DC267> ;
data: {
group = "0xd5b7a50 <x-coredata:///Group/t63931035-BB67-467D-8598-CAD8563BA5DC265>";
newPostAttributedText = "0xd5e78d0 <x-coredata:///AttributedText/t63931035-BB67-467D-8598-CAD8563BA5DC269>";
newPostSection = "0xe85cc70 <x-coredata://3DE41B33-C64E-44C4-9F86-98DF3C6AD700/PostSection/p7>";
newPostCreator = "0xd297050 <x-coredata://3DE41B33-C64E-44C4-9F86-98DF3C6AD700/Person/p9>";
})
When scrolling and showing the same object I see the following in my log:
<NewPost: 0xd5dcc00> (entity: NewPost; id: 0xe8b3cc0 <x-coredata://3DE41B33-C64E-44C4-9F86-98DF3C6AD700/NewPost/p75> ;
data: <fault>)
Why does the relationship fault when I need to re-use the cell or re-display the same data?
Thanks
You need to ensure that each managed object's context is associated with either the main thread (thread confinement) or the main queue. (Note: a managed object has a managedObjectContext property.)
If you access managed objects on the main thread, for example to render their content with UIKit, their managed object context MUST be associated with the main thread or the main queue, respectively.
It also makes sense to use a single context for an object and its related objects.
Additionally, you need to ensure that your context (in this case the "main context") is actually up to date. That means you may need to execute a fetch, which retrieves objects from the persistent store if required and also updates the set of matching objects (when using a predicate).
This is required, for example, when you have a context M (the main context, associated with the main queue) whose parent is the root context that handles the persistent store (M -> root). Suppose you then delete objects from a different context B (B -> root) and save that change to the store. The objects deleted via B may still exist in context M as "registered" managed objects, but they are faults. Execute a fetch in context M to bring it up to date.
Note: You are better off using a NSFetchedResultsController in order to handle updates.
Caution: when using MagicalRecord, you may end up with a Core Data stack where the "default context" handles the persistent store and is also associated with the main thread. Child contexts execute on a private queue and have the main context set as their parent. IMHO, this is suboptimal.
Also take a look at the SO question "NSOperation in iOS - How to handle loops and nested NSOperation call to get Images" for dealing with managed objects.
I had the same problem and I solved it by setting to NO the returnsObjectsAsFaults property in NSFetchRequest object:
NSFetchRequest * fetchRequest = [[NSFetchRequest alloc] init];
[fetchRequest setReturnsObjectsAsFaults:NO];
Try it!
I've had issues with my tables doing strange things when a cell is scrolled off screen and then back on. My usual workaround is to not reuse cells. Note that if you try doing this with very large tables, there may be some performance loss. The tables I've used this on are never more than twice the size of the screen, and I've never noticed any performance lag, but I imagine that if you are doing it with hundreds of rows you could run into trouble.
If you want to go this way, the general idea is:
Call a new method from viewDidLoad (or something similar) that creates all the cells for the whole table (on screen and off) - this method is usually pretty similar to whatever was originally being done in cellForRowAtIndexPath
Put the cells into an NSMutableArray ivar as they are made.
In cellForRowAtIndexPath, just return the cell from that array.
It's probably not the ideal solution, but it has worked for me pretty well in the past when my cells are misbehaving when they get reused.

Fetching large number of data from CoreData using two NSFetchedResultsController

I am trying to fetch 10,000 records from Core Data into a UITableView using NSFetchedResultsController, and to make it as fast as possible (since the request has a sort descriptor, it takes longer to fetch this amount of data).
My approach is to fetch 100 records with the first, main NSFetchedResultsController (the one used in the table's delegate methods) and display them in the table, while on another queue, started in viewDidAppear, an auxiliary FRC (AuxFRC) fetches all 10,000 records. When the AuxFRC fetch finishes, I assign the AuxFRC to the main FRC so that all the records are transferred, and I reload the table.
My problem is that the UITableView stays stuck at the first loaded rows until the AuxFRC finishes its fetch, even though I dispatch the performFetch, and I can't understand why this happens. If this approach is wrong, what other way is there to fetch 10,000 records and stay up to date when the data changes?
I guess the problem was the NSManagedObjectContext, which is not thread-safe; I used the same one for both fetches. I created a second context, a copy of the original one, and switched the AuxFRC to it. This solved everything.
The trick is to use the cache mechanism of FRC and coordinate the asynchronous fetch with the display thread.
All is summarised in this answer.

what is lazy fetching in cocoa binding coredata?

I watched a video tutorial on Core Data. I saw an option
Uses Lazy Fetching
for mapping an object model to an array controller.
But I did not understand: what is lazy fetching?
It means that the data is fetched (automatically) not at the time of the request, but at the time the controller asks for it. For example, consider a large table view with thousands of rows: they are not all fetched at once when the request is executed; rather, they are fetched dynamically in small batches as the user scrolls the table view.

Resources