Fast Zero result filtering with Core Data

I'm building an interface to allow users to filter a set of Photos. The data model is above. I'd like to disable any controls that would result in 0 results.
To accomplish this, I'm running a new fetch request for every control that is off/not selected each time the user makes a change. I add the data that the control represents to my NSCompoundPredicate, then remove it after I get the result. If the count of the result is 0, I disable that control.
I'm doing this all on the main thread, so in some cases there is a bit of a lag in the app. Is there a better way to do this type of filtering with less overhead? Should I run these filter fetches on their own thread? I've never done that with Core Data, and from what I've read I need a separate context for that. I'm not sure how to go about setting up the code for it.

A bit of code would help. Aside from that, here are a few suggestions.
First, on your fetch request, use countForFetchRequest:error: because it will just query the database, and return the count instead of object information.
Second, if you don't want to use threads, and the search is still too slow, you can do the initial query when the app starts. This will then enable/disable the various controls.
You can then simply catch the context notifications that tell you when data has changed, and update that information accordingly. That way you don't have to run any further queries at all. Just initialize once, and update the status as objects are added to or removed from the database.
If you want to use threads, then that's not really all that difficult.
It sounds like all you want is a thread that is just running queries. You set up a MOC...
NSManagedObjectContext *checkerMoc = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
checkerMoc.persistentStoreCoordinator = MyCurrentMoc.persistentStoreCoordinator;
Now, whenever you want to check the database...
[checkerMoc performBlock:^{
    NSFetchRequest *fetchRequest = ...
    // Do your fetch request... this block of code is running in the other thread
    [checkerMoc fetch...];
    // When the fetch request is done, do whatever you want in your UI...
    dispatch_async(dispatch_get_main_queue(), ^{
        // Now this code is running in the main thread... access your UI
        self.myControl.enabled = fetchResultCount > 0;
    });
}];
Note that you are using the same persistent store coordinator, so if the main thread tries to access the database, it will get stacked behind this request. You can also use a separate persistentStoreCoordinator for checkerMoc if this is an issue.
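As a rough Swift sketch of the two suggestions above combined (count only, on a private-queue context), something like the following could work. The function name, the entity name passed in, and the completion shape are all assumptions for illustration; only `perform` and `count(for:)` are Core Data API.

```swift
import CoreData

/// Asks a private-queue context whether `predicate` matches at least one object,
/// then reports the answer on the main queue. (Hypothetical helper.)
func checkHasResults(entityName: String,
                     predicate: NSPredicate,
                     in checkerContext: NSManagedObjectContext,
                     completion: @escaping (Bool) -> Void) {
    checkerContext.perform {
        let request = NSFetchRequest<NSManagedObject>(entityName: entityName)
        request.predicate = predicate
        // count(for:) asks the store for a count only; no objects are loaded.
        // A failed count is treated as "no results" in this sketch.
        let count = (try? checkerContext.count(for: request)) ?? 0
        DispatchQueue.main.async {
            completion(count > 0)
        }
    }
}
```

A caller would pass the compound predicate with the candidate control's condition added, and set the control's enabled state inside the completion closure.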

Related

CoreData - usage of backgroundContext

I am trying to understand the concept of a backgroundContext in CoreData. Though I have read several articles about it, I am still not sure about its purpose.
I have an app using CoreData that allows the user to create, update or delete records. The user can also fetch the data he added. I want to ensure that if there are a lot of records, fetching data will not affect the responsiveness of the UI.
So I studied a bit about backgroundContexts and implemented the following according to what I understood. I am not sure whether it is a correct solution, though. My idea is: if I fetch in the background, it cannot affect the main thread.
//I have an PersistentContainer created by Xcode.
//I create an backgroundContext.
self.backContext = (UIApplication.shared.delegate as! AppDelegate).persistentContainer.newBackgroundContext()
//Then if the user adds a new record, it's added to backContext and saved to mainContext
...
let newRecord = Record(context: self.backContext!)
...
self.backContext!.save()
self.context!.save() // mainContext
...
//If the user fetches the data I use:
self.backContext.perform {
...
}
//Since I want to show results in UI, I know these objects (from fetch) exist just in the background thread, so instead I fetch for IDs
.resultType = .managedObjectIDResultType
//Now I have IDs of fetched objects, so I restore objects in main thread using their IDs:
let object = try? self.backContext.existingObject(with: ID) as? Record
//and I can finally use fetched objects to update UI.
The question is:
Is this even correct what I am doing? (it works perfectly though)
Will this solve the problem of freezing UI if user fetches a large amount of data?
How do we use backgroundContexts correctly? Why is it not recommended to work directly with mainContext? How to prevent freezing UI while fetching big data?
One more question: if I use a FetchedResultsController, do I need to handle the problem of a freezing UI (while waiting on the first (init) fetch result)?
Of course I am ignoring that while fetching data, my context is blocked, so I cannot create a new record

If you are fetching objects to be displayed on screen, you should absolutely be using fetch requests against the main thread context. Core Data is designed for this specific use case and you should not be experiencing slowdowns or freezes because of executing fetches. If you are having problems, then profile your app in Instruments and find out where the actual slowdown is.
Background contexts are meant to be used if you are performing bulky or long-running work like processing large API responses which you've shown to be affecting main thread performance.
So I do not have to be afraid of freezing UI, even if my database will contain thousands of records? I can make fetch request with mainContext?
Yes
If I would like to do some special time consuming operations that would not be shown to UI, my code would be correct, right?
Yes, you'd normally create a background context, do work, save the background context - and then access those objects as normal from the main context.
And last but not least - why is it not recommended to work directly with mainContext when I add a new record?
I'm not sure where you've seen this recommendation, but quite a common pattern is to make a new main-queue (not background) child context to support the application workflow of adding a new object. Then if the user cancels the addition, you can just discard the editing context without needing to worry about undoing your work.
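A minimal sketch of that child-context pattern in Swift might look like this. The two function names are hypothetical; the `parent` property and the two-step save are standard Core Data behaviour. Both functions are assumed to be called on the main thread.

```swift
import CoreData

/// Creates a main-queue child context for an "add record" screen. (Hypothetical helper.)
func makeEditingContext(parent mainContext: NSManagedObjectContext) -> NSManagedObjectContext {
    let editing = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
    editing.parent = mainContext
    return editing
}

/// Commits the edit: saving the child pushes changes up to the parent only,
/// so the parent must be saved as well to reach the persistent store.
func commit(_ editing: NSManagedObjectContext) throws {
    guard editing.hasChanges else { return }
    try editing.save()
    try editing.parent?.save()
}
```

If the user cancels, the editing context is simply released without saving, and the main context never sees the half-finished object.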

Merging main and private contexts with Core Data

I'm creating temporary contexts in a private queue to asynchronously update the data I persist with Core Data:
NSManagedObjectContext *privateContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
AppDelegate *appDelegate = [[UIApplication sharedApplication] delegate];
privateContext.persistentStoreCoordinator = appDelegate.persistentStoreCoordinator;
[privateContext performBlock: ^{
// Parse files and/or call services and parse
// their responses
dispatch_async(dispatch_get_main_queue(), ^{
// Notify update to user
});
}];
Then, once I've got the new data, I need to merge the changes with my main context. I know this is a common scenario, but I'm not sure how I should proceed... This section of Apple's Core Data documentation talks about setting a merge policy, and I don't fully understand how to handle that. On the other hand, I found this link, where my scenario is described in its "Stack #2" section; what it says looks simpler, and it doesn't talk about merge policies at all...
Which is the correct or most appropriate way? I'd really appreciate an explanation and/or a simple example of how to manage this scenario.
Thanks in advance.

What you have there looks pretty good.
You are using a private queue to do your work, and it's being saved to the persistent store.
If you only have a small number of changes, then you will be fine. In that case, you want to handle the NSManagedObjectContextDidSaveNotification for your context, and merge the changes into your other context with
[context mergeChangesFromContextDidSaveNotification:notification];
However, if you are really doing a lot of changes, you probably want a separate persistent store coordinator (attached to the same store). By doing this, you can write to the store, while MOCs on the other PSC are reading. If you share the PSC with other readers, only one will get access at a time and you could cause the readers to block until your write has finished.
Also, if you are doing lots of changes, make sure you do them in small batches, inside an autoreleasepool, saving after each batch. Take a look at this similar question: Where should NSManagedObjectContext be created?
Finally, if you are doing lots of changes, it may be more efficient to just refetch your data than it will be to process all the merges. In that case, you don't need to observe the notification, and you don't need to do the merge. It's pretty easy. Note, that if you do this, you really should have a separate PSC... especially if you want to save and notify in small-ish batches.
[privateContext performBlock: ^{
// Parse files and/or call services and parse
// their responses
dispatch_async(dispatch_get_main_queue(), ^{
// Refetch the data you want... if on iOS, this is likely as simple
// as telling the fetched results controller to refetch, and
// reloading your table view (or whatever else is using the data).
});
}];
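For the small-change case described earlier in this answer, a Swift sketch of observing the private context's did-save notification and merging into the main context could look like the following. The class name `SaveMerger` is an assumption; the notification name and `mergeChanges(fromContextDidSave:)` are Core Data API.

```swift
import CoreData

/// Merges every save of `privateContext` into `mainContext`. (Hypothetical helper class.)
final class SaveMerger {
    private var token: NSObjectProtocol?

    init(from privateContext: NSManagedObjectContext,
         into mainContext: NSManagedObjectContext) {
        token = NotificationCenter.default.addObserver(
            forName: .NSManagedObjectContextDidSave,
            object: privateContext,   // only listen to this one context
            queue: nil
        ) { notification in
            // The notification arrives on the saving context's queue,
            // so hop onto the main context's queue before merging.
            mainContext.perform {
                mainContext.mergeChanges(fromContextDidSave: notification)
            }
        }
    }

    deinit {
        if let token = token {
            NotificationCenter.default.removeObserver(token)
        }
    }
}
```

Keep the `SaveMerger` instance alive for as long as the private context is in use; releasing it stops the merging.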

Deleting Core Data after X amount of days

So I have a bunch of objects in Core Data and want them to auto delete after X amount of days (this would be based off of an NSDate). I did some searching and it seems that you can only delete one core data object at a time, not a group of them, let alone ones that are based off of a certain date. I'm thinking maybe to have a loop running going through each object - but that seems like it would be very processor heavy. Any ideas on where I should be looking to do this? Thanks.

A loop deleting objects one by one is the correct approach.
Deleting objects in Core Data is extremely processor heavy. If that's a problem, then Core Data is not suitable for your project, and you should use something else. I recommend FCModel, as a light weight alternative that is very efficient.
If you are going to stick with Core Data, it's a good idea to perform large operations on a background NSOperationQueue, so the main application is not locked up while deleting the objects. You need to be very careful with Core Data across multiple threads. The approach is to have a separate managed object context for each thread, both using the same persistent store coordinator. Never share a managed object across threads; you can share the objectID, though, and use it to fetch a second copy of the same database record on the other managed object context.
Basically, your background thread creates a new context, deletes all the objects in a loop, and then (on the main thread preferably, see the documentation) saves the background thread context. This will merge your changes unless there is a conflict (both contexts modified the same object). In that scenario you have a few options; I'd just abort the entire delete operation and start again.
Apple has good documentation available for all the issues and sample code available here: https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/CoreData/Articles/cdConcurrency.html
It's a bit daunting, you need to do some serious homework, but the actual code is very simple once you've got your head around how everything works. Or just use FCModel, which is designed for fast batch operations.

It's not as processor heavy as you may think :) (of course, it depends on the amount of data)
Feel free to use a loop:
- (void)deleteAllObjects
{
    NSArray *allEntities = self.managedObjectModel.entities;
    for (NSEntityDescription *entityDescription in allEntities)
    {
        NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
        [fetchRequest setEntity:entityDescription];
        // Deleting does not need attribute values, so skip loading them
        fetchRequest.includesPropertyValues = NO;
        fetchRequest.includesSubentities = NO;
        NSError *error = nil;
        NSArray *items = [self.managedObjectContext executeFetchRequest:fetchRequest error:&error];
        for (NSManagedObject *managedObject in items) {
            [self.managedObjectContext deleteObject:managedObject];
        }
        if (![self.managedObjectContext save:&error]) {
            NSLog(@"Error occurred: %@", error);
        }
    }
}
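The loop above deletes everything. To restrict it to objects older than X days, as the question asks, a date predicate can be added to the fetch. The Swift sketch below assumes the entity has a date attribute whose name is passed in as `dateKey`; the function name and parameters are hypothetical.

```swift
import CoreData

/// Deletes objects whose date attribute is more than `days` days in the past.
/// `entityName` and `dateKey` are placeholders for your own model's names.
func deleteExpired(entityName: String,
                   dateKey: String,
                   olderThan days: Int,
                   in context: NSManagedObjectContext) {
    context.perform {
        guard let cutoff = Calendar.current.date(byAdding: .day, value: -days, to: Date()) else {
            return
        }
        let request = NSFetchRequest<NSManagedObject>(entityName: entityName)
        request.predicate = NSPredicate(format: "%K < %@", dateKey, cutoff as NSDate)
        request.includesPropertyValues = false   // attribute values are not needed to delete
        do {
            for object in try context.fetch(request) {
                context.delete(object)
            }
            if context.hasChanges {
                try context.save()
            }
        } catch {
            context.rollback()
            print("Expired-object cleanup failed: \(error)")
        }
    }
}
```

Passing a private-queue context keeps the work off the main thread; the main context then needs to merge the save or refetch.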

As others have noted, iterating over the objects is the only way to actually delete the objects in Core Data. This is one of those use cases where Core Data's approach kind of falls down, because it's just not optimized for that kind of use.
But there are ways to deal with it to avoid unwanted delays in your app, so that the user doesn't have to wait while your code chugs over a ton of delete requests.
If you have a lot of objects that need to be deleted and you don't want to have to wait until the process is complete, you can fake the initial delete at first and then later do the actual delete when it's convenient. Something like:
Add a custom boolean attribute to the entity type called something like toBeDeleted with a default value of NO.
When you have a bunch of objects to delete, set toBeDeleted to YES on all of them in a single step by using NSBatchUpdateRequest (new in iOS 8). This class is mostly undocumented, so look at the header file or at the BNR blog post about it. You'll specify the property name and the new attribute value, and Core Data will do a mass, quick update.
Make sure your fetch requests all check that toBeDeleted is NO. Now objects marked for deletion will be excluded when fetching even though they still exist.
At some point-- later on, but soon-- run some code in the background that fetches and deletes objects that have toBeDeleted set to YES.
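The marking step can be sketched in Swift roughly as follows. The function name is hypothetical and `toBeDeleted` is the attribute name suggested above; `NSBatchUpdateRequest`, `propertiesToUpdate` and `resultType` are Core Data API (iOS 8 and later, SQLite stores). Call it on the context's own queue, for example inside `perform`.

```swift
import CoreData

/// Flags all objects matching `predicate` as to-be-deleted in one store-level update.
/// Returns the number of rows updated. (Hypothetical helper.)
func markForDeletion(entityName: String,
                     matching predicate: NSPredicate,
                     in context: NSManagedObjectContext) throws -> Int {
    let request = NSBatchUpdateRequest(entityName: entityName)
    request.predicate = predicate
    request.propertiesToUpdate = ["toBeDeleted": true]
    request.resultType = .updatedObjectsCountResultType
    let result = try context.execute(request) as? NSBatchUpdateResult
    return (result?.result as? Int) ?? 0
}
```

A batch update writes straight to the store and bypasses the context, so objects already loaded in memory keep their old values until they are refreshed or refetched.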

NSFetchedResultsController feeding table view while background update of same persistent store causes deadlock

Still working on converting an app over from downloading information every time it uses or displays it, to caching it on-phone using CoreData (courtesy of MagicalRecord). This is on iOS 7
Because we don't have a data-push system set up to automatically update the phone's cached data whenever some data changes on the backend, I've been thinking over the last many months (as we worked on other aspects of the app) how to manage keeping a local copy of the data on the phone and being able to have the most up to date data in the cache.
I realized that as long as I still fetch the data every time :-( I can use the phone's CoreData backed cache of data to display and use, and just use the fetch of the data to update the on-phone database.
So I have been converting over the main data objects from being downloaded data making up a complete object, to these main data objects being light stand-in objects for CoreData objects.
Basically, each of the normal data objects in the app, instead of containing all the properties of the object internally, contains only the objectID of the underlying CoreData object and maybe the app-specific ID internally. All other properties are dynamic, read from the CoreData object and passed through (most properties are read-only, and updates are done through bulk rewriting of the Core Data from passed-in JSON).
Like this:
- (NSString *)amount
{
__block NSString *result = nil;
NSManagedObjectContext *localContext = [NSManagedObjectContext MR_newContext];
[localContext performBlockAndWait:^{
FinTransaction *transaction = (FinTransaction *)[localContext existingObjectWithID:[self objectID] error:nil];
if (nil != transaction)
{
result = [transaction.amount stringValue];
}
}];
return result;
}
Occasionally there is one that needs to be set and those look like this:
- (void)setStatus:(MyTransactionStatus)status
{
[MagicalRecord saveWithBlock:^(NSManagedObjectContext *localContext) {
FinTransaction *transaction = (FinTransaction *)[localContext existingObjectWithID:[self objectID] error:nil];
if (nil != transaction)
{
transaction.statusValue = status;
}
} completion:^(BOOL success, NSError *error){}];
}
Now, my issue is that I have a view controller that basically uses an NSFetchedResultsController to display stored data from the local phone's CoreData database in a table view. While this is happening, and the user may start to scroll through the data, the phone spins off a thread to download updates to the data and then starts updating the CoreData data store with the updated data. At that point it runs an asynchronous GCD call back on the main thread to have the fetched results controller refetch its data and tell the table view to reload.
The problem is that if a user is scrolling through the initial fetched results controller fetched data and table view load, and the background thread is updating the same Core Data objects in the background, deadlocks occur. It is not the exact same entities being fetched and rewritten (when a deadlock occurs), i.e., not that object ID 1 is being read and written, but that the same persistent data store is being used.
Every access, read or write, happens in a MR_saveWithBlock or MR_saveWithBlockAndWait (writes/updates of data) as the case may be, and a [localContext performBlock:] or [localContext performBlockAndWait:] as may be appropriate. Each separate read or write has its own NSManagedObjectContext. I have not seen any where there are stray pending changes hanging around, and the actual places it blocks and deadlocks is not always the same, but always has to do with the main thread reading from the same persistent store as the background thread is using to update the data.
The fetched results controller is being created like this:
_frController = [[NSFetchedResultsController alloc] initWithFetchRequest:fetchRequest
managedObjectContext:[NSManagedObjectContext MR_rootSavingContext]
sectionNameKeyPath:sectionKeyPath
cacheName:nil];
and then a performFetch is done.
How can I best structure this sort of action, where I need to display the existing data in a table view and update the data store in the background with new data?
While I am using MagicalRecord for most of it, I am open to comments, answers, etc with or without (straight CD) using MagicalRecord.

So the way I'd handle this is to look at having two managed object contexts each with its own persistent store coordinator. Both of the persistent store coordinators talk to the same persistent store on disk.
This approach is outlined in some detail in Session 211 from WWDC 2013 — "Core Data Performance Optimization and Debugging", which you can get to on Apple's Developer Site for WWDC 2013.
In order to use this approach with MagicalRecord, you will need to look at using the upcoming MagicalRecord 3.0 release, with the ClassicWithBackgroundCoordinatorSQLiteMagicalRecordStack (yes, that name needs work!). It implements the approach outlined in the WWDC session, although you need to be aware that there will be changes needed to your project to support MagicalRecord 3, and that it's also not quite released yet.
Essentially what you end up with is:
1 x Main Thread Context: You use this to populate your UI, and for your fetched results controllers, etc. Don't ever make changes in this context.
1 x Private Queue Context: Make all of your changes using the block-based save methods; they automatically funnel through this context and save to disk.
I hope that makes sense — definitely watch the WWDC session — they use some great animated diagrams to explain why this approach is faster (and shouldn't block the main thread as much as the approach you're using now).
I'm happy to go into more detail if you need it.
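Without MagicalRecord, the two-coordinator arrangement described in this answer can be sketched in plain Core Data roughly like this. The function name and tuple labels are assumptions; this is an illustration of the idea, not the MagicalRecord 3 stack itself.

```swift
import CoreData

/// Builds a main-queue context and a private-queue context, each with its own
/// coordinator attached to the same SQLite file. (Hypothetical helper.)
func makeContexts(model: NSManagedObjectModel, storeURL: URL) throws
    -> (main: NSManagedObjectContext, background: NSManagedObjectContext) {

    func makeCoordinator() throws -> NSPersistentStoreCoordinator {
        let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
        try coordinator.addPersistentStore(ofType: NSSQLiteStoreType,
                                           configurationName: nil,
                                           at: storeURL,
                                           options: nil)
        return coordinator
    }

    // Read side: feeds the UI and fetched results controllers.
    let main = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
    main.persistentStoreCoordinator = try makeCoordinator()

    // Write side: all imports and updates go through here.
    let background = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    background.persistentStoreCoordinator = try makeCoordinator()

    return (main, background)
}
```

Because the two contexts no longer share a coordinator, reads on the main context are not queued behind the background writes at the coordinator level. The main context still has to merge the background saves or refetch to see new data.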

Core Data: delete all objects of an entity type, ie clear a table

This has been asked before, but no solution described that is fast enough for my app needs.
In the communications protocol we have set up, the server sends down a new set of all customers every time a sync is performed. Earlier, we had been storing as a plist. Now want to use Core Data.
There can be thousands of entries. Deleting each one individually takes a long time. Is there a way to delete all rows in a particular table in Core Data?
delete from customer
This call in sqlite happens instantly. Going through each one individually in Core Data can take 30 seconds on an iPad1.
Is it reasonable to shut down Core Data, i.e. drop the persistence store and all managed object contexts, then drop into sqlite and perform the delete command against the table? No other activity is going on during this process so I don't need access to other parts of the database.

Dave DeLong is an expert at, well, just about everything, and so I feel like I'm telling Jesus how to walk on water. Granted, his post is from 2009, which was a LONG time ago.
However, the approach in the link posted by Bot is not necessarily the best way to handle large deletes.
Basically, that post suggests to fetch the object IDs, and then iterate through them, calling delete on each object.
The problem is that when you delete a single object, it has to go handle all the associated relationships as well, which could cause further fetching.
So, if you must do large scale deletes like this, I suggest adjusting your overall database so that you can isolate tables in specific core data stores. That way you can just delete the entire store, and possibly reconstruct the small bits that you want to remain. That will probably be the fastest approach.
However, if you want to delete the objects themselves, you should follow this pattern...
Do your deletes in batches, inside an autorelease pool, and be sure to pre-fetch any cascaded relationships. All these, together, will minimize the number of times you have to actually go to the database, and will, thus, decrease the amount of time it takes to perform your delete.
In the suggested approach, which comes down to...
Fetch ObjectIds of all objects to be deleted
Iterate through the list, and delete each object
If you have cascade relationships, you will encounter a lot of extra trips to the database, and IO is really slow. You want to minimize the number of times you have to visit the database.
While it may initially sound counterintuitive, you want to fetch more data than you think you want to delete. The reason is that all that data can be fetched from the database in a few IO operations.
So, on your fetch request, you want to set...
[fetchRequest setRelationshipKeyPathsForPrefetching:@[@"relationship1", @"relationship2", .... , @"relationship3"]];
where those relationships represent all the relationships that may have a cascade delete rule.
Now, when your fetch is complete, you have all the objects that are going to be deleted, plus the objects that will be deleted as a result of those objects being deleted.
If you have a complex hierarchy, you want to prefetch as much as possible ahead of time. Otherwise, when you delete an object, Core Data is going to have to go fetch each relationship individually for each object so that it can manage the cascade delete.
This will waste a TON of time, because you will do many more IO operations as a result.
Now, after your fetch has completed, then you loop through the objects, and delete them. For large deletes you can see an order of magnitude speed up.
In addition, if you have a lot of objects, break it up into multiple batches, and do it inside an auto release pool.
Finally, do this in a separate background thread, so your UI does not pend. You can use a separate MOC, connected to a persistent store coordinator, and have the main MOC handle DidSave notifications to remove the objects from its context.
While this looks like code, treat it as pseudo-code...
NSManagedObjectContext *deleteContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
// Get a new PSC for the same store
deleteContext.persistentStoreCoordinator = getInstanceOfPersistentStoreCoordinator();
// Each call to performBlock executes in its own autoreleasepool, so we don't
// need to explicitly use one if each chunk is done in a separate performBlock
__block void (^block)(void) = ^{
    NSError *error = nil;
    NSFetchRequest *fetchRequest = //
    // Only fetch the number of objects to delete this iteration
    fetchRequest.fetchLimit = NUM_ENTITIES_TO_DELETE_AT_ONCE;
    // Prefetch all the relationships
    fetchRequest.relationshipKeyPathsForPrefetching = prefetchRelationships;
    // Don't need all the properties
    fetchRequest.includesPropertyValues = NO;
    NSArray *results = [deleteContext executeFetchRequest:fetchRequest error:&error];
    if (results.count == 0) {
        // Didn't get any objects for this fetch
        if (nil == results) {
            // Handle error
        }
        return;
    }
    for (MyEntity *entity in results) {
        [deleteContext deleteObject:entity];
    }
    [deleteContext save:&error];
    [deleteContext reset];
    // Keep deleting objects until they are all gone
    [deleteContext performBlock:block];
};
[deleteContext performBlock:block];
Of course, you need to do appropriate error handling, but that's the basic idea.
Fetch in batches if you have so much data to delete that it will cripple memory.
Don't fetch all the properties.
Prefetch relationships to minimize IO operations.
Use autoreleasepool to keep memory from growing.
Prune the context.
Perform the task on a background thread.
If you have a really complex graph, make sure you prefetch all the cascaded relationships for all entities in your entire object graph.
Note, your main context will have to handle DidSave notifications to keep its context in step with the deletions.
EDIT
Thanks. Lots of good points. All well explained except, why create the
separate MOC? Any thoughts on not deleting the entire database, but
using sqlite to delete all rows from a particular table? – David
You use a separate MOC so the UI is not blocked while the long delete operation is happening. Note, that when the actual commit to the database happens, only one thread can be accessing the database, so any other access (like fetching) will block behind any updates. This is another reason to break the large delete operation into chunks. Small pieces of work will provide some chance for other MOC(s) to access the store without having to wait for the whole operation to complete.
If this causes problems, you can also implement priority queues (via dispatch_set_target_queue), but that is beyond the scope of this question.
As for using sqlite commands on the Core Data database, Apple has repeatedly said this is a bad idea, and you should not run direct SQL commands on a Core Data database file.
Finally, let me note this. In my experience, I have found that when I have a serious performance problem, it is usually a result of either poor design or improper implementation. Revisit your problem, and see if you can redesign your system somewhat to better accommodate this use case.
If you must send down all the data, perhaps query the database in a background thread and filter the new data so you break your data into three sets: objects that need modification, objects that need deletion, and objects that need to be inserted.
This way, you are only changing the database where it needs to be changed.
If the data is almost brand new every time, consider restructuring your database where these entities have their own database (I assume your database already contains multiple entities). That way you can just delete the file, and start over with a fresh database. That's fast. Now, reinserting several thousand objects is not going to be fast.
You have to manage any relationships manually, across stores. It's not difficult, but it's not automatic like relationships within the same store.
If I did this, I would first create the new database, then tear down the existing one, replace it with the new one, and then delete the old one.
If you are only manipulating your database via this batch mechanism, and you do not need object graph management, then maybe you want to consider using sqlite instead of Core Data.
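The "three sets" idea mentioned earlier in this answer can be illustrated with plain Swift sets of identifiers. The names `SyncPlan` and `makeSyncPlan` are made up for this sketch, and it assumes each record has a stable server-side ID.

```swift
/// The three groups of record IDs a sync has to act on. (Hypothetical type.)
struct SyncPlan<ID: Hashable> {
    var toInsert: Set<ID>   // on the server, not stored locally
    var toUpdate: Set<ID>   // present on both sides; candidates for modification
    var toDelete: Set<ID>   // stored locally, no longer on the server
}

func makeSyncPlan<ID: Hashable>(localIDs: Set<ID>, serverIDs: Set<ID>) -> SyncPlan<ID> {
    SyncPlan(toInsert: serverIDs.subtracting(localIDs),
             toUpdate: serverIDs.intersection(localIDs),
             toDelete: localIDs.subtracting(serverIDs))
}

let plan = makeSyncPlan(localIDs: [1, 2, 3], serverIDs: [2, 3, 4])
// plan.toInsert == [4], plan.toUpdate == [2, 3], plan.toDelete == [1]
```

The `toUpdate` set only says a record exists on both sides; comparing a timestamp or hash per record would narrow it down to the objects that actually changed.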

iOS 9 and later
Use NSBatchDeleteRequest. I tested this in the simulator on a Core Data entity with more than 400,000 instances and the delete was almost instantaneous.
// fetch all items in entity and request to delete them
let fetchRequest = NSFetchRequest<NSFetchRequestResult>(entityName: "MyEntity")
let deleteRequest = NSBatchDeleteRequest(fetchRequest: fetchRequest)
// delegate objects
let appDelegate = UIApplication.shared.delegate as! AppDelegate
let myManagedObjectContext = appDelegate.managedObjectContext
let myPersistentStoreCoordinator = appDelegate.persistentStoreCoordinator
// perform the delete
do {
    try myPersistentStoreCoordinator.execute(deleteRequest, with: myManagedObjectContext)
} catch {
    print(error)
}
Note that the answer that @Bot linked to and that @JodyHagins mentioned has also been updated to this method.
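One thing to keep in mind with NSBatchDeleteRequest is that it works directly on the store, so contexts that already hold the deleted objects are not told about it. A hedged Swift sketch of asking for the deleted object IDs and merging them back into the context follows; the function name is hypothetical, while `resultType`, `NSDeletedObjectsKey` and `mergeChanges(fromRemoteContextSave:into:)` are Core Data API (iOS 9 and later).

```swift
import CoreData

/// Batch-deletes every object of `entityName` and updates `context` to match.
/// (Hypothetical helper; call it on the context's queue.)
func batchDeleteAll(entityName: String, in context: NSManagedObjectContext) throws {
    let fetchRequest = NSFetchRequest<NSFetchRequestResult>(entityName: entityName)
    let deleteRequest = NSBatchDeleteRequest(fetchRequest: fetchRequest)
    // Ask for the IDs of the deleted rows so in-memory objects can be invalidated.
    deleteRequest.resultType = .resultTypeObjectIDs

    let result = try context.execute(deleteRequest) as? NSBatchDeleteResult
    if let deletedIDs = result?.result as? [NSManagedObjectID], !deletedIDs.isEmpty {
        let changes: [AnyHashable: Any] = [NSDeletedObjectsKey: deletedIDs]
        NSManagedObjectContext.mergeChanges(fromRemoteContextSave: changes, into: [context])
    }
}
```

Any other live contexts that display the same data can be added to the `into:` array.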

Really your only option is to remove them individually. I do this method with a ton of objects and it is pretty fast. Here is a way someone does it by only loading the managed object ID so it prevents any unnecessary overhead and makes it faster.
Core Data: Quickest way to delete all instances of an entity

Yes, it's reasonable to delete the persistent store and start from scratch. This happens fairly quickly. What you can do is remove the persistent store (using the persistent store URL) from the persistent store coordinator, and then use the URL of the persistent store to delete the database file from your directory folder. I did it using NSFileManager's removeItemAtURL.
Edit: one thing to consider: Make sure to disable/release the current NSManagedObjectContext instance, and to stop any other thread which might be doing something with a NSManagedObjectContext which is using the same persistent store. Your application will crash if a context tries to access the persistent store.
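A rough Swift sketch of resetting the store is shown below. Note that it swaps in `destroyPersistentStore(at:ofType:options:)` (iOS 9 and later) in place of the NSFileManager call used in this answer, on the assumption that a SQLite store may have sidecar files next to the main file. The function name is hypothetical.

```swift
import CoreData

/// Detaches, destroys and recreates the SQLite store at `storeURL`. (Hypothetical helper.)
/// All contexts using this store must be discarded before calling this.
func resetStore(at storeURL: URL, coordinator: NSPersistentStoreCoordinator) throws {
    // Detach the store from the coordinator if it is currently loaded.
    if let store = coordinator.persistentStore(for: storeURL) {
        try coordinator.remove(store)
    }
    // Wipe the store contents on disk.
    try coordinator.destroyPersistentStore(at: storeURL,
                                           ofType: NSSQLiteStoreType,
                                           options: nil)
    // Attach a fresh, empty store at the same location.
    try coordinator.addPersistentStore(ofType: NSSQLiteStoreType,
                                       configurationName: nil,
                                       at: storeURL,
                                       options: nil)
}
```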
