How to create snapshot of CoreData state?

How to create snapshot of CoreData state? - ios

Background story
I am developing a big iOS app. This app works under specific assumptions. The main of them is that app should work offline with internal storage which is a snapshot of last synchronized state of data saved on server. I decided to use CoreData to handle this storage. Every time app launches I check if WiFi connection is enabled and then try to synchronize storage with server. The synchronization can take about 3 minutes because of size of data.
The synchronization process consists of several stages and in each of them I:
fetch some data from the server (XML)
deserialize it
save it in Core Data
Problem
Synchronization process can be interrupted for several reasons (internet connection, server down, user leaving application, etc). This may cause data to be out-of-sync.
Let's assume that synchronization process has 5 stages and it breaks after third. It results in 3/5 of data being updated in internal storage and the rest being out of sync. I can't allow it because data are strongly connected to each other (business logic).
Goal
I don't know if it is possible but I'm thinking about implementing one solution. On start of synchronization process I would like to create snapshot (some kind of copy) of current state of Core Date and during synchronization process work on it. When synchronization process completes with success then this snapshot could overwrite current CoreData state. When synchronization interrupts then snapshot can be simply aborted. My internal storage will be secured.
Questions
How to create CoreData snapshot?
How to work with CoreData snapshot?
How to overwrite CoreDate state with snapshot?
Thanks in advice for any help. Code examples, if it is possible, will be appreciated.
EDIT 1
The size of data is too big to handle it with multiple CoreData's contexts. During synchronization I am saving current context multiple times to cleanup memory. If I do not do it, the application will crash with memory error.
I think it should be resolved with multiple NSPersistentStoreCoordinators using for example this method: link. Unfortunately, I don't know how to implement this.

You should do exactly what you said. Just create class (lets call it SyncBuffer) with methods "load", "sync" and "save".
The "load" method should read all entities from CoreData and store it in class variables.
The "sync" method should make all the synchronisation using class variables.
Finally the "save" method should save all values from class variables to CoreData - here you can even remove all data from CoreData and save brand new values from SyncBuffer.

A CoreData stack is composed at its core by three components: A context (NSManagedObjectContext) a model (NSManagedObjectModel) and the store coordinator (NSPersistentStoreCoordinator/NSPersistentStore).
What you want is to have two different contexts, that shares the same model but use two different stores. The store itself will be of the same type (i.e. an SQLite db) but use a different source file.
At this page you can see some documentation about the stack:
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/CoreData/InitializingtheCoreDataStack.html#//apple_ref/doc/uid/TP40001075-CH4-SW1
The NSPersistentContainer is a convenience class to initialise the CoreData stack.
Take the example of the initialisation of a NSPersistentContainer from the link: you can have the exact same code to initialise it, twice, but with the only difference that the two NSPersistentContainer use a different .sqlite file: i.e. you can have two properties in your app delegate called managedObjectContextForUI and managedObjectContextForSyncing that loads different .sqlite files. Then in your program you can use the context from one store to get current data to show to the user and you can use the context that use the other store with a different .sqlite if you are doing sync operations. When the sync operations are finally done you can eventually swap the two files and after clearing and reloading the NSPersistentContainer (this might be tricky, because you will want to invalidate and reload all managed objects: you are switching to an entirely new context) you can then show the newly synced data to the user and start syncing again on a new .sqlite file.

The way I understand the problem is that you wish to be able download a large "object graph". It is however so large that it cannot be loaded at once in memory, so you would have to break it in chunks and then merge it locally into to Core data.
If that is the case, I think that's not trivial. I am not sure I can think of direct solution without understanding the object relations and even then it may be really overwhelming.
An overly simplistic solution may be to generate the sqlite file on the backend then download it in chunks; it seems ugly, but it serves to separate the business logic from the sync, i.e. the sqlite file becomes the transport layer. So, I think the essence to the solution would be to find a way to physically represent the data you are syncing in a format that allows for splitting it in chunks and that can afterwards be merged into a sqlite file (if you insist on using Core data).
Please also note that as far as I know Amazon (https://aws.amazon.com/appsync/) and Realm (https://realm.io/blog/introducing-realm-mobile-platform/) provide background sync of you local database, but those are paid services and you would have to be careful not be locked in (should not depend on their libs in your model layer, instead have a translation layer).

Related

Synchronising old data with NSPersistentCloudKitContainer

I'm using NSPersistentCloudKitContainer to synchronise data between different devices with CloudKit. It works perfectly well with a new project, however when I'm using it with old projects the old data which was added with NSPersistentContainer does not synchronise.
What I would like to achieve is to synchronise old data that was added with NSPersistentContainer after changing it to NSPersistentCloudKitContainer. Is it possible?

I've found a solution that works for my Core Data database - and mine is quite complex with multiple many-to-many relationships (A surgery/anaesthesia logbook app called Somnus)
I started by creating a new attribute for all my Core Data Entities called sentToCloud and setting it to FALSE by default in the Core Data model.
On the first load for an existing user:
Fetch request using the predicate "sentToCloud == FALSE" for each Entity type
Change sentToCloud to TRUE for each Object then save the MOC
This triggers NSPersistentCloudKitContainer to start syncing
I've done this in order of 'priority' that works for my database, assuming the iCloud sync sessions match the order in which Core Data is modified. In my testing this seems to be the case:
I first sync all child (or most child-like) Entities
Then sync their parents, and so on, up the tree
I sync the Object the user interacts with last, once everything else is in place so the relationships are intact and they don't think their data is borked while we wait for NSPersistentCloudKitContainer to reconnect all the relationships
I also leave any binary data (magically turned into a CKAsset behind-the-scenes) to last as it's not the most important part of my database
My database synced successfully from my iPad to iPhone and all relationships and binary data appear correct.
Now all I need is a way to tell the user when data is syncing (and/or some sort of progress) and for them to turn it off entirely.
ADDENDUM
So I tried this again, after resetting all data on the iCloud dashboard and deleting the apps on my iPhone & iPad.
Second time around it only synced some of the data. It seems like it still has a problem dealing with large sync requests (lots of .limitExceeded CKErrors in the console).
What's frustrating is that it's not clear if it's breaking up the requests to try again or not - I don't think it is. I've left it overnight and still no further syncing, only more .limitExceeded CKErrors.
Maybe this is why they don't want to sync existing data?
Personally, I think this is silly. Sometimes users will do a batch process on their data which would involve updating many thousands of Core Data objects in one action. If this is just going to get stuck with .limitExceeded CKErrors, NSPersistentCloudKitContainer isn't going to be a very good sync solution.
They need a better way of dealing with these errors (breaking up the requests into smaller requests), plus the ability to see what's going on (and perhaps present some UI to the user).
I really need this to work because as it stands, there is no way to synchronise many-to-many Core Data relationships using CloudKit.
I just hope that they're still working on this Class and improving it.

Singletons vs Core Data

This question is not about the technical problem, but rather the approach.
I know two more or less common approaches to store the data received from the server in your app:
1) Using managers, data holders etc to store the data. They are most often some kind of singleton and are used to store the models received from the server. (E.g. - the array of the posts/places/users) Singletons are needed to be able to access the data from any screen. I think the majority of apps uses this approach.
2) Using Core Data (or maybe Realm) as in-memory storage. This approach avoids having singletons, but, I guess, it is a bit more complex (and crash risky) to maintain and support.
How do you store data and why?
P.S. Any answers would help. But big "thank you" for detailed ones, with reasons.

The reason people opt to use Core Data/Relam/Shark or any other iOS ORM is mainly for the purpose of persisting data between runs of the app.
Currently there are two ways of doing this, for single values and very small (not that I encourage it) objects you can use the UserDefaults to persist between app launches. For a approach closer to a database, infact in the case of Core Data and SharkORM, they are built on top of SQLite, you need to use an ORM.
Using a manager to store an array of a data models will only persist said models for the lifetime of the app. For example when the user force quits the app, restarts their device or in some circumstances when iOS terminates your app, all that data will be lost permanently. This is because it is stored in RAM which is volatile memory, rather than in a database on the disk itself.
Using a database layer even if you don't specifically require persistence between launches can have its advantages though; for instance SharkORM allows you to execute raw SQL queries on your objects if you don't want to use the built in powerful query builder. This can be useful to quickly pull the model you are interested in rather than iterating through a local array.
In reply to your question, how do I store data?
Well, I use a combination of all three. Say for instance I called to an API for some data which I wanted to display there and then to the user, I would use a manager instance with an array to hold the data model.
But on the flipside if I wanted to store that data for later or if I needed to execute a complex query on it, I would store it on disk using Shark.
If however I just wanted to store whether or not the user had seen my on boarding flow I would just persist a boolean value into UserDefaults.
I hope this is detailed enough for you.

CoreData isn't strictly "in-memory". You can load objects into your data model and save them into their context, then they might actually be on disk and out of main memory, and they can easily be brought back via fetch requests.
Singletons, on the other hand, do typically stay in main memory all the time until the user terminates the app. If you have larger objects that you are storing in some data structure (e.g. full resolution images when all you really needed was a thumbnail), this can be quite a resource hog.

CoreData in-memory setup with MagicalRecord 3

Hello I'm using CoreData + MagicalRecord 3 to manage the data in my app. Until then everything was working fine, but then I realize in production than my app is freezing like hell !
So I started to investigate knowing about the fact that to not stuck the UI, it's better to have a main context and a background context and save stuff in background etc...
Nevertheless I have to question due to my setup. I use CoreData in-memory store system (for the best performance) and I don't care about storing the data on disk of my app, I'm fine with a volatile model that will be destroyed when the app is killed or in background for too long. I just want to be able to find my data from any view controller and without coupling.
So I have few questions :
1) If I would use 1 unique context, what would happen if I NEVER save it to the memory store ? For instance if I MR_createEntity then I retrieve this entity from the context and update it, is it updated everywhere or do I have to save it so it can be updated ? In other term was is the interest of saving for in-memory where you don't want to persist the data forever ?
2) If I use 1 unique context that I declare being background, if I display my screen before my data is finished to saved, the screen won't be able to find and display my data right ? Unless I use NSFetchResultController right ?

1) you want to save your data even with an in memory store for a couple of reasons. First, so that you can use core data properly in the case where you might change your mind and persist your data. Second, you'll likely want to access and process some data on different threads/queues. In that case, you'll have to use Core Data's data safety mechanisms for threads/queues. The store is the lowest level at which Core Data will sync data across threads (the old way). This may be less important if you use nested contexts to sync your data (the new way). But even with nested contexts, you'll need call save in order for your changes to merge across contexts. Core Data doesn't really like it when you save to a nil store.
2) You can make and use your own context for displaying data. NSFetchedResultsController does a lot of the leg work in listening for the correct notifications and making sure you're getting very specific updates for the data you asked for in the first place. NSFRC is not always necessary, but will certainly be the easiest way to start.

Core data, what concurrency model to use?

I am developing iOS an app which will gather big amounts of data from several sources (up to tens of thousands of objects, but simple objects, no images) and save it to my own database using core data. I then analyse this data and display the results to the user.
I want to know if there is any benefit to using a Main Queue Nsmanagedobjectcontext or if it is enough that I use a private one.
I also want to know what the benefit is of having several NSManagedObjectContext or if one is enough?
The concurrency model i am using currently only has one private queue nsmanagedobjectcontext connected to a persistant store coordinator. All the data analysis is performed on the private queue and then I simply pass the analyzed data to the main queue to display it. On older devices (iPhone 4) my application can sometimes crash when too much data is being loaded (i.e. downloaded from the external databases) at the same time, is this related to my choice of concurrency model?

Your current approach sounds fine. You only need a main thread context if you want the main thread to interact with the data, and in your case you don't so that's fine.
Your memory management is effectively unrelated and is more tied to how many things you have going on at once (it sounds like one) and how many objects you try to keep in main memory at any one time (it sounds like many) instead of faulting them out to the data store. This is what you need to look at / work on. Instruments can help you see how many objects you're keeping in memory.
At least call refreshObject:mergeChanges: with NO for merge changes to fault out any objects that you aren't using.
Also, remember that you're working on a mobile device and that processing up to tens of thousands of objects is a job better handled by a server...

From UIManagedDocument to traditional Core Data stack

I created a new App using UIManagedDocument. While on my devices everything is fine, I got a lot of bad ratings, because there are problems on other devices :(
After a lot of reading and testing, I decided to go back to the traditional Core Data stack.
But what is the best way to do this with an app, that is already in the app store?
How can I build this update? What should I take care of?
Thanks,
Stefan

I think you may be better off to determine your issues with UIManagedDocument and resolve them.
However, if you want to go to plain MOC, you only have a few things to worry about. The biggest is that the UIMD stores things in a file package, and depending on your options you may have to worry about change logs.
In the end, if you want a single sqlite file, and you want to reduce the possibility of confusion, you have a class that simply opens your UIManagedDocument, and fetches each object, then replicates it in the single sqlite file for your new MOC.
Now, you should not need a different object model, so you should not have any migration issues.
Then, just delete the file package that holds the UIManagedDocument, and only use your single file sqlite store.
Basically, on startup, you try to open the UIManagedDocument. If it opens, load every object and copy it into the new database. Then delete it.
From then, you should be good to go.
Note, however, that you may now experience some UI delays because all the database IO is happening on the main UI thread. To work around this, you may need to use a separate MOC, and coordinate changes via the normal COreData notification mechanisms. There are tons of documents, examples, and tutorials on that.
EDIT
Thanks for your answer. My problem with these issues is, that I'm not
able to reproduce them. All my Devices are working fine. But I got a
lot mails, about problems like this: - duplicate entries - no data
after stoping and restarting the app - some say, that the app works
fine for some days and stops working(no new data). These are all
strange things, that don't happen on my devices. So for me the best
way is to go back to plain MOC. My DB doesn't hold many user generated
data, all the data is loaded from a webservice, so it's no problem to
delete the data and start of using a new DB. – Urkman
Duplicate entries. That one sounds like the bug related to temporary/permanent IDs. There are lots of posts about that. Here is one: Core Data could not fullfil fault for object after obtainPermanantIDs
Not saving. Sounds like you are not using the right API for UIManagedDocument. You need to make sure to not save the MOC directly, and either use an undo manager or call updateChangeCount: to notify UIManagedDocument that it has dirty data that you want to be saved. Again, lots of posts about that as well. Search for updateChangeCount.
However, you know your app best, and it may just be better and easier to use plain Core Data.
Remember, if you are doing lots of imports from the web, to use a separate MOC, and have your main MOC watch for DidSave notifications to update itself with the newly imported data.

UIManagedDocument is a special kind of document, an UIDocument subclass, that stores its data using Core Data Framework. So it combines the power of document architecture and core data capabilities.
You can read more about document based architecture from Document Based App Programming Guide for iOS and I recommend WWDC2011 Storing Documents in iCloud using iOS5 session video. I also recommend Stanford CS193P: iPad and iPhone App Development (Fall 2011) Lecture 13.
What is created when you call saveToURL:forSaveOperation:completionHandler: is an implementation detail of UIManagedDocument and UIDocument and you should not really worry or depend on it. However in current implementation a folder containing an sqlite database file is being created.
No. All entities will be contained in a single database file also more generally called: a persistent store. It is possible to use more than one persistent store, but those are more advanced use cases and UIManagedDocument currently uses one.
UIManagedDocument's context refers to a NSManagedObjectContext from underlying Core Data Framework. UIManagedDocument actually operates two of those in parallel to spin off IO operations to a background thread. When it comes to the nature of a context itself here's a quote from Core Data Programming Guide:
You can think of a managed object context as an intelligent scratch pad. When you fetch objects from a persistent store, you bring temporary copies onto the scratch pad where they form an object graph (or a collection of object graphs). You can then modify those objects however you like. Unless you actually save those changes, however, the persistent store remains unaltered.
But it really is a good idea to take a look at the lectures and other material I posted above to get a general picture of the technologies used and their potential value to you as a developer in different situations.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart