I'm writing an app that churns a great deal of data through Core Data. I clean up this data in the background after the user has exited the app. Because WAL checkpointing appears to be a major cause of my UI pauses, I would also like to force a WAL checkpoint. (Yes, I know about creating a second Core Data stack. That will also be done, but this problem will still remain. I have existing experiments using a second stack, but they haven't yet resulted in any appreciable advantage.) The Google reveals the following page, New Default Journaling Mode, which goes into a very modest discussion of how to force a checkpoint of a database before copying it elsewhere. My issue is that I would like to force the checkpoint on the live database without tearing down my whole UI. My experiments with re-adding the persistent store to the coordinator have been to no avail; they result in an infinite loop.
Clearly, checkpointing can be done without affecting my existing MOCs and PSC, because it already happens. I just want to force it at a well-known time that doesn't affect my users' happiness.
That document's description may be modest, but that's how it's done. Core Data isn't really a SQLite wrapper, and it provides very limited direct access to SQLite. Passing options when adding the persistent store is the only way in.
In short: you can't force a checkpoint on a live persistent store.
What you can do is use that method permanently: switch to rollback journal mode instead of using it only for checkpoint purposes. With the journal_mode option, you can put SQLite in a mode where checkpointing isn't needed at all. As long as you include
NSDictionary *options = @{NSSQLitePragmasOption: @{@"journal_mode": @"DELETE"}};
when adding the store, the problem doesn't exist.
If you want to retain WAL mode, you can try using other SQLite pragmas in the options dictionary to tune the checkpointing behavior. For example, the wal_autocheckpoint pragma controls how often checkpoints occur, and you might get better results by adjusting it. You still can't invoke a checkpoint on demand, but you can change the performance characteristics.
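For example (a minimal sketch; coordinator and storeURL stand in for your own stack, and the 100-page threshold is an arbitrary illustration, not a recommendation):

// Checkpoint after ~100 pages instead of SQLite's default 1000.
NSDictionary *options = @{NSSQLitePragmasOption: @{@"wal_autocheckpoint": @"100"}};
NSError *error = nil;
if (![coordinator addPersistentStoreWithType:NSSQLiteStoreType
                               configuration:nil
                                         URL:storeURL
                                     options:options
                                       error:&error]) {
    NSLog(@"Failed to add store: %@", error);
}

A lower threshold trades throughput for smaller, more frequent checkpoints, which may smooth out the pauses even though you still can't schedule them.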
Background story
I am developing a big iOS app. This app works under specific assumptions, the main one being that the app should work offline against internal storage that is a snapshot of the last synchronized state of the data saved on the server. I decided to use Core Data to handle this storage. Every time the app launches, I check whether a WiFi connection is available and then try to synchronize the storage with the server. Synchronization can take about 3 minutes because of the size of the data.
The synchronization process consists of several stages and in each of them I:
fetch some data from the server (XML)
deserialize it
save it in Core Data
Problem
The synchronization process can be interrupted for several reasons (internet connection, server down, the user leaving the application, etc.), which can leave the data out of sync.
Let's assume the synchronization process has 5 stages and it breaks after the third. The result is that 3/5 of the data in internal storage has been updated and the rest is out of sync. I can't allow that, because the data are strongly connected to each other (business logic).
Goal
I don't know if it is possible, but I'm thinking about one solution. At the start of the synchronization process I would like to create a snapshot (some kind of copy) of the current state of Core Data and work on it during synchronization. When the synchronization completes successfully, this snapshot would overwrite the current Core Data state. When synchronization is interrupted, the snapshot can simply be discarded. That way my internal storage stays safe.
Questions
How do I create a Core Data snapshot?
How do I work with the snapshot?
How do I overwrite the Core Data state with the snapshot?
Thanks in advance for any help. Code examples, if possible, will be appreciated.
EDIT 1
The data is too big to handle with multiple Core Data contexts. During synchronization I save the current context multiple times to free memory; if I don't, the application crashes with a memory error.
I think it should be resolved with multiple NSPersistentStoreCoordinators, using for example this method: link. Unfortunately, I don't know how to implement this.
You should do exactly what you said. Just create a class (let's call it SyncBuffer) with load, sync and save methods.
The load method should read all entities from Core Data and store them in instance variables.
The sync method should perform all the synchronisation using those instance variables.
Finally, the save method should write all the values from the instance variables back to Core Data; here you can even remove all data from Core Data and save the brand-new values from the SyncBuffer.
A Core Data stack is composed, at its core, of three components: a context (NSManagedObjectContext), a model (NSManagedObjectModel), and the store coordinator (NSPersistentStoreCoordinator/NSPersistentStore).
What you want is two different contexts that share the same model but use two different stores. The stores will be of the same type (i.e. an SQLite db) but use different source files.
At this page you can see some documentation about the stack:
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/CoreData/InitializingtheCoreDataStack.html#//apple_ref/doc/uid/TP40001075-CH4-SW1
NSPersistentContainer is a convenience class for initialising the Core Data stack.
Take the example initialisation of an NSPersistentContainer from the link: you can have exactly the same code to initialise it twice, the only difference being that the two NSPersistentContainers use different .sqlite files. For instance, you could have two properties in your app delegate, managedObjectContextForUI and managedObjectContextForSyncing, that load different .sqlite files. In your program you then use the context from one store to show current data to the user, and the context backed by the other .sqlite file for sync operations. When the sync operations are finally done, you can swap the two files and, after clearing and reloading the NSPersistentContainer (this might be tricky, because you will want to invalidate and reload all managed objects: you are switching to an entirely new context), show the newly synced data to the user and start syncing again on a new .sqlite file.
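A sketch of that setup, assuming a model named "Model"; the file names are arbitrary:

// Two containers sharing one model but backed by different .sqlite files.
NSPersistentContainer *uiContainer = [NSPersistentContainer persistentContainerWithName:@"Model"];
NSPersistentContainer *syncContainer = [NSPersistentContainer persistentContainerWithName:@"Model"];

NSURL *docs = [[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory
                                                     inDomains:NSUserDomainMask].firstObject;
uiContainer.persistentStoreDescriptions = @[[NSPersistentStoreDescription
    persistentStoreDescriptionWithURL:[docs URLByAppendingPathComponent:@"UI.sqlite"]]];
syncContainer.persistentStoreDescriptions = @[[NSPersistentStoreDescription
    persistentStoreDescriptionWithURL:[docs URLByAppendingPathComponent:@"Sync.sqlite"]]];

[uiContainer loadPersistentStoresWithCompletionHandler:^(NSPersistentStoreDescription *d, NSError *e) {
    if (e) { NSLog(@"UI store failed to load: %@", e); }
}];
[syncContainer loadPersistentStoresWithCompletionHandler:^(NSPersistentStoreDescription *d, NSError *e) {
    if (e) { NSLog(@"Sync store failed to load: %@", e); }
}];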
The way I understand the problem is that you want to download a large "object graph". It is so large that it cannot be loaded into memory at once, so you would have to break it into chunks and then merge it locally into Core Data.
If that is the case, I think it's not trivial. I am not sure I can think of a direct solution without understanding the object relations, and even then it may be really overwhelming.
An overly simplistic solution may be to generate the sqlite file on the backend and then download it in chunks. It seems ugly, but it serves to separate the business logic from the sync, i.e. the sqlite file becomes the transport layer. So I think the essence of the solution is to find a way to physically represent the data you are syncing in a format that allows splitting it into chunks and that can afterwards be merged into a sqlite file (if you insist on using Core Data).
Please also note that, as far as I know, Amazon (https://aws.amazon.com/appsync/) and Realm (https://realm.io/blog/introducing-realm-mobile-platform/) provide background sync of your local database, but those are paid services and you would have to be careful not to get locked in (don't depend on their libs in your model layer; have a translation layer instead).
I have a strange problem with Core Data in an iOS app where the WAL file sometimes becomes huge (~1GB). It appears other people have this problem too (e.g. Core Data sqlite-wal file gets MASSIVE (>7GB) when inserting ~5000 rows).
My initial thought is to delete the WAL file at app launch. It seems from reading the sqlite documentation on the matter that this will be fine. But does anyone know of any downsides to doing this?
I'd of course like to get to the bottom of why the WAL file is growing so big, but I can't get to the bottom of it right now and want to put in a workaround while I dig deeper into the problem.
It's worth pointing out that my Core Data database is more of a cache, so it doesn't matter if I lose data that's in the WAL. What I really need to know is: will the database be completely corrupted if I delete the WAL? My suspicion is no, otherwise the WAL wouldn't serve one of its purposes.
Couple of things:
You can certainly delete the WAL file. You will lose any committed transactions that haven't been checkpointed back to the main file. (Thus violating the "durability" part of ACID, but perhaps you don't care.)
You can control the size of the WAL file on disk with the journal_size_limit pragma (if it bothers you), and you may want to checkpoint manually more often too. See "Avoiding Excessively Large WAL Files" here: https://www.sqlite.org/wal.html
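A sketch of setting that limit through Core Data's pragma options; coordinator, storeURL and error stand in for your own stack, and the 1 MB value is an arbitrary example (the pragma takes a byte count):

// Cap the WAL file at roughly 1 MB after checkpoints.
NSDictionary *options = @{NSSQLitePragmasOption: @{@"journal_size_limit": @"1048576"}};
[coordinator addPersistentStoreWithType:NSSQLiteStoreType
                          configuration:nil
                                    URL:storeURL
                                options:options
                                  error:&error];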
I dislike all the superstitious bashing of WAL mode. WAL mode is faster, more concurrent, and much simpler, since it dispenses with all the locking-level shenanigans (and most "database is busy" problems) that go with rollback journals. WAL mode is the right choice in almost every situation. (The only place it is problematic is on flash filesystems that don't support shared memory-mapped access to files. In that case, the "unofficial" SQLITE_SHM_DIRECTORY compile directive can be used to move the .shm file to a different kind of filesystem -- e.g. tmpfs -- but this should not be a concern on iOS.)
It baffles me how many people here suggest it's safe to delete WAL files without so much as a second thought.
The documentation explicitly lists this as one of the official ways to corrupt a database. It doesn't say deleting a hot WAL may cause you to lose the most recent transactions or something benign like that. It says it may corrupt the database.
Why? Because an application may have crashed in the middle of a checkpointing operation. When this happens, the database file itself is in an invalid state unless paired with the new data contained in the WAL.
So the answer is a clear no. Don't delete WAL files.
What you can do to clear the file is call PRAGMA schema.wal_checkpoint(TRUNCATE);
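Core Data itself offers no API for issuing that pragma on an open store, so it means opening the store file with the SQLite C library directly. A sketch, assuming storePath points at the .sqlite file and that nothing else is writing to it at that moment:

#import <sqlite3.h>

sqlite3 *db = NULL;
if (sqlite3_open([storePath UTF8String], &db) == SQLITE_OK) {
    // Checkpoint the WAL into the main file and truncate it to zero length.
    sqlite3_exec(db, "PRAGMA wal_checkpoint(TRUNCATE);", NULL, NULL, NULL);
}
sqlite3_close(db);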
I have been seeing quite a few negative reports on WAL in iOS 7. I have had to disable it on several projects until I have time to explore the issues more thoroughly.
I would not delete the journal file, but you could play with the option of vacuuming the SQLite file, which will cause SQLite to "consume" the journal file. You can do this by adding NSSQLiteManualVacuumOption to the options when you add the NSPersistentStore to the NSPersistentStoreCoordinator.
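A sketch of that option in use; coordinator and storeURL stand in for your own stack:

// Request a manual vacuum when the store is added; vacuuming rebuilds the
// database file and consumes the journal along the way.
NSDictionary *options = @{NSSQLiteManualVacuumOption: @YES};
NSError *error = nil;
[coordinator addPersistentStoreWithType:NSSQLiteStoreType
                          configuration:nil
                                    URL:storeURL
                                options:options
                                  error:&error];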
If that ends up being time-consuming, then I would suggest disabling WAL. I have not seen any ill effects from disabling it (yet).
WAL mode has problems; don't use it. The problems vary, but the very large size you report is one; others include failure during migration (using NSPersistentStoreCoordinator's migratePersistentStore:) and failure during import of iCloud transaction logs. So while there are reported benefits, until these bugs are fixed it's probably unwise to use WAL mode.
And NO, you can't delete the write-ahead log, because that contains the most recent data.
Set the database to use rollback journal mode, and I think you will find you no longer have these very large files when loading lots of data.
Here is an extract which explains how WAL works. Unless you can guarantee that your app has run a checkpoint I don't see how you can delete the WAL file without running the risk of deleting committed transactions.
How WAL Works
The traditional rollback journal works by writing a copy of the original unchanged database content into a separate rollback journal file and then writing changes directly into the database file. In the event of a crash or ROLLBACK, the original content contained in the rollback journal is played back into the database file to revert the database file to its original state. The COMMIT occurs when the rollback journal is deleted.

The WAL approach inverts this. The original content is preserved in the database file and the changes are appended into a separate WAL file. A COMMIT occurs when a special record indicating a commit is appended to the WAL. Thus a COMMIT can happen without ever writing to the original database, which allows readers to continue operating from the original unaltered database while changes are simultaneously being committed into the WAL. Multiple transactions can be appended to the end of a single WAL file.
Checkpointing
Of course, one wants to eventually transfer all the transactions that are appended in the WAL file back into the original database. Moving the WAL file transactions back into the database is called a "checkpoint".

Another way to think about the difference between rollback and write-ahead log is that in the rollback-journal approach, there are two primitive operations, reading and writing, whereas with a write-ahead log there are now three primitive operations: reading, writing, and checkpointing.

By default, SQLite does a checkpoint automatically when the WAL file reaches a threshold size of 1000 pages. (The SQLITE_DEFAULT_WAL_AUTOCHECKPOINT compile-time option can be used to specify a different default.) Applications using WAL do not have to do anything in order for these checkpoints to occur. But if they want to, applications can adjust the automatic checkpoint threshold. Or they can turn off the automatic checkpoints and run checkpoints during idle moments or in a separate thread or process.
There are quite good answers in this thread, but I'm adding this one to link to Apple's official Q&A about journaling mode in iOS 7 Core Data:
https://developer.apple.com/library/ios/qa/qa1809/_index.html
They give different solutions:
To safely back up and restore a Core Data SQLite store, you can do the following:

Use the following method of the NSPersistentStoreCoordinator class, rather than file system APIs, to back up and restore the Core Data store:
- (NSPersistentStore *)migratePersistentStore:(NSPersistentStore *)store toURL:(NSURL *)URL options:(NSDictionary *)options withType:(NSString *)storeType error:(NSError **)error
Note that this is the option we recommend.
Change to rollback journaling mode when adding the store to a persistent store coordinator if you have to copy the store file. Listing 1 is the code showing how to do this:
Listing 1 Use rollback journaling mode when adding a persistent store
NSDictionary *options = @{NSSQLitePragmasOption: @{@"journal_mode": @"DELETE"}};
if (![persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType
                                              configuration:nil
                                                        URL:storeURL
                                                    options:options
                                                      error:&error])
{
    // error handling.
}
For a store that was loaded with the WAL mode, if both the main store file and the corresponding -wal file exist, using rollback journaling mode to add the store to a persistent store coordinator will force Core Data to perform a checkpoint operation, which merges the data in the -wal file to the store file. This is actually the Core Data way to perform a checkpoint operation. On the other hand, if the -wal file is not present, using this approach to add the store won't cause any exceptions, but the transactions recorded in the missing -wal file will be lost.
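A sketch of the recommended method in use; coordinator and backupURL are stand-ins for your own coordinator and destination path:

// Back up the live store. Note that after this call the coordinator's
// active store is the one at backupURL, so re-add the original store
// if you want to keep using it.
NSPersistentStore *store = coordinator.persistentStores.firstObject;
NSError *error = nil;
NSPersistentStore *backup = [coordinator migratePersistentStore:store
                                                           toURL:backupURL
                                                         options:nil
                                                        withType:NSSQLiteStoreType
                                                           error:&error];
if (!backup) {
    NSLog(@"Backup failed: %@", error);
}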
VERY IMPORTANT EDIT
If some of your users are on iOS 8.1 and you chose the first solution (the one Apple recommends), note that their many-to-many data relationships will be completely discarded. Lost. Deleted. In the entire migrated database.
This is a nasty bug apparently fixed in iOS 8.2. More info here http://mjtsai.com/blog/2014/11/22/core-data-relationships-data-loss-bug/
You should never delete the sqlite WAL file; it contains transactions that haven't been written to the actual sqlite file yet. Instead, force the database to checkpoint, which will clean up the WAL file for you.
In Core Data the best way to do this is to open the database with the DELETE journal mode pragma. This will checkpoint and then delete the WAL file for you.
NSDictionary *options = @{NSSQLitePragmasOption: @{@"journal_mode": @"DELETE"}};
[psc addPersistentStoreWithType:NSSQLiteStoreType
                  configuration:nil
                            URL:_storeURL
                        options:options
                          error:&localError];
For sanity's sake, you should ensure you only have one connection to the persistent store when you do this, i.e. only one persistent store instance in a single persistent store coordinator.
FWIW, in your particular case you may wish to use TRUNCATE or OFF for your initial database import and switch to WAL for updates; a sketch follows the link below.
https://www.sqlite.org/pragma.html#pragma_journal_mode
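A sketch of that two-phase approach, reusing psc and _storeURL from the snippet above; error handling is omitted:

// Phase 1: add the store in a throwaway rollback mode for the bulk import.
NSDictionary *importOptions = @{NSSQLitePragmasOption: @{@"journal_mode": @"TRUNCATE"}};
NSPersistentStore *store = [psc addPersistentStoreWithType:NSSQLiteStoreType
                                             configuration:nil
                                                       URL:_storeURL
                                                   options:importOptions
                                                     error:NULL];
// ... run the bulk import and save, then:
[psc removePersistentStore:store error:NULL];

// Phase 2: re-add the same file in WAL mode for everyday updates.
NSDictionary *walOptions = @{NSSQLitePragmasOption: @{@"journal_mode": @"WAL"}};
[psc addPersistentStoreWithType:NSSQLiteStoreType
                  configuration:nil
                            URL:_storeURL
                        options:walOptions
                          error:NULL];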
So, I have been using MagicalRecord to develop an iPad app, and after recently moving to an auto-migrating store I have been experiencing some issues. I need to sync my .db file from one device to another, so I need all of the data to be in the .db. But it seems that with WAL journaling mode enabled (the default for MagicalRecord auto-migrating stores), no matter how I save, it only persists the changes to the .db-wal or .db-shm files. I switched to a normal sqlite store and everything worked fine. So my question is: with WAL journaling enabled, do I need to do anything special to actually get Core Data to save to the main database, or will I just have to disable it?
Change the journal mode. You have the MagicalRecord source, after all. Change the SQLite journal mode to DELETE, and the journal file will be deleted after every transaction. Disabling journalling entirely is a really bad idea; don't do that. But using a different journal mode should be fine.
Core Data does not offer any API for manipulating the journal once the persistent store is open. SQLite is an implementation detail, and Core Data doesn't expose the internal SQLite details. The closest you can get is the options parameter when setting up the Core Data stack, which is where you can change the journal mode (and where MR changes it).
The -wal file is part of the database; you must synchronize it together with the .db file.
Alternatively, you can copy the data to the main database file by executing a checkpoint.
I am creating an app where data needs to be displayed right away from the local datastore. I fire off a background thread when the app starts to determine if iCloud is available.
I looked everywhere and can't find a solution to this: when iCloud becomes available, can I change the "options" on the persistent store to start using iCloud transactions?
I'm not sure what the proper approach is in this situation. Everything I try causes the application to crash.
Originally the iCloud check wasn't in a background thread, and the app worked fine but occasionally timed out.
You don't need to know when iCloud becomes available. You just work with your data; you don't send it directly to iCloud. iOS does that for you, so only the system knows when and how it should send the data.
No, you can't change the options on an NSPersistentStore object once it exists. You can only specify the options when adding the persistent store to the NSPersistentStoreCoordinator. The closest you could get to changing options would be to tear down the entire Core Data stack and start over with different options.
That wouldn't help, though, because:
Even if you have detected that iCloud is available (I'm guessing using NSFileManager, either via its ubiquityIdentityToken or by calling URLForUbiquityContainerIdentifier:), your call to addPersistentStoreWithType:configuration:URL:options:error: might still block for a while. If there's new data available in iCloud, it doesn't start downloading until you add the persistent store, and that method doesn't return until the download process is finished. And sometimes, iCloud just makes that method block for a while for no readily apparent reason.
If you let the user make any changes to the data while using non-iCloud options, those changes will not get automatically sent to the cloud later. Core Data only sends changes to iCloud when the data changes while iCloud is active-- which makes it generate a transaction. You'd have to load and re-save any changes the user made, or those changes would never make it to the cloud.
You have, unfortunately, hit one of the major stumbling blocks of using Core Data with iCloud. You can't make the full data store available until Core Data finishes communicating with iCloud, because your call to add the persistent store doesn't return until then. And you can't do anything to speed up that process. This is just one of the headaches you'll run into if you continue trying to use iCloud with Core Data.
Depending on the nature of your data, you might be able to use two data stores: one purely local and one synced via iCloud. You could make the purely local store available while the iCloud one tries to get its act together well enough to be useful. If you stick with one data store, though, you're stuck with the delay.
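A sketch of the two-store idea, assuming model configurations named "Local" and "Cloud" and stand-in URLs; the ubiquity key shown is the iOS 7-era Core Data iCloud option:

// One coordinator, two stores: a purely local one plus an iCloud-synced one.
[coordinator addPersistentStoreWithType:NSSQLiteStoreType
                          configuration:@"Local"
                                    URL:localURL
                                options:nil
                                  error:NULL];
[coordinator addPersistentStoreWithType:NSSQLiteStoreType
                          configuration:@"Cloud"
                                    URL:cloudURL
                                options:@{NSPersistentStoreUbiquitousContentNameKey: @"CloudStore"}
                                  error:NULL];

Splitting entities between configurations keeps the local data usable even while the iCloud store is still blocking on its first load.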
I created a new app using UIManagedDocument. While everything is fine on my devices, I got a lot of bad ratings because there are problems on other devices :(
After a lot of reading and testing, I decided to go back to the traditional Core Data stack.
But what is the best way to do this with an app, that is already in the app store?
How can I build this update? What should I take care of?
Thanks,
Stefan
I think you may be better off determining your issues with UIManagedDocument and resolving them.
However, if you want to go to a plain MOC, you only have a few things to worry about. The biggest is that UIManagedDocument stores things in a file package, and depending on your options you may have to worry about change logs.
In the end, if you want a single sqlite file and you want to reduce the possibility of confusion, write a class that simply opens your UIManagedDocument, fetches each object, and replicates it in the single sqlite file for your new MOC.
Now, you should not need a different object model, so you should not have any migration issues.
Then, just delete the file package that holds the UIManagedDocument, and only use your single file sqlite store.
Basically, on startup, you try to open the UIManagedDocument. If it opens, load every object and copy it into the new database. Then delete it.
From then, you should be good to go.
Note, however, that you may now experience some UI delays because all the database IO happens on the main UI thread. To work around this, you may need to use a separate MOC and coordinate changes via the normal Core Data notification mechanisms. There are tons of documents, examples, and tutorials on that.
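A sketch of that one-time migration. document and newContext are stand-ins for your UIManagedDocument and the new plain stack's context, "Item" is a placeholder entity, and relationships are not copied here for brevity:

[document openWithCompletionHandler:^(BOOL success) {
    if (!success) { return; }
    NSManagedObjectContext *oldContext = document.managedObjectContext;
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Item"];
    NSArray *objects = [oldContext executeFetchRequest:request error:NULL];
    for (NSManagedObject *old in objects) {
        // Copy attribute values only; relationships would need a second pass.
        NSManagedObject *copy =
            [NSEntityDescription insertNewObjectForEntityForName:@"Item"
                                          inManagedObjectContext:newContext];
        NSArray *keys = old.entity.attributesByName.allKeys;
        [copy setValuesForKeysWithDictionary:[old dictionaryWithValuesForKeys:keys]];
    }
    [newContext save:NULL];
    // The document's file package is no longer needed.
    [[NSFileManager defaultManager] removeItemAtURL:document.fileURL error:NULL];
}];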
EDIT
Thanks for your answer. My problem with these issues is that I'm not able to reproduce them. All my devices work fine, but I get a lot of mails about problems like these: duplicate entries; no data after stopping and restarting the app; some say the app works fine for some days and then stops working (no new data). These are all strange things that don't happen on my devices. So for me the best way is to go back to a plain MOC. My DB doesn't hold much user-generated data; all the data is loaded from a webservice, so it's no problem to delete the data and start with a new DB. – Urkman
Duplicate entries: that one sounds like the bug related to temporary/permanent IDs. There are lots of posts about that. Here is one: Core Data could not fulfill fault for object after obtainPermanentIDs
Not saving: sounds like you are not using the right API for UIManagedDocument. Make sure not to save the MOC directly; either use an undo manager or call updateChangeCount: to notify the UIManagedDocument that it has dirty data you want saved. Again, there are lots of posts about that as well. Search for updateChangeCount.
However, you know your app best, and it may just be better and easier to use plain Core Data.
Remember, if you are doing lots of imports from the web, to use a separate MOC and have your main MOC watch for did-save notifications so it can update itself with the newly imported data.
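A sketch of that wiring; importContext and mainContext are stand-ins for your background and main contexts:

// Merge every save made on the import context into the main context.
[[NSNotificationCenter defaultCenter]
    addObserverForName:NSManagedObjectContextDidSaveNotification
                object:importContext
                 queue:[NSOperationQueue mainQueue]
            usingBlock:^(NSNotification *note) {
                [mainContext mergeChangesFromContextDidSaveNotification:note];
            }];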
UIManagedDocument is a special kind of document, a UIDocument subclass, that stores its data using the Core Data framework. So it combines the power of the document architecture with the capabilities of Core Data.
You can read more about document based architecture from Document Based App Programming Guide for iOS and I recommend WWDC2011 Storing Documents in iCloud using iOS5 session video. I also recommend Stanford CS193P: iPad and iPhone App Development (Fall 2011) Lecture 13.
What is created when you call saveToURL:forSaveOperation:completionHandler: is an implementation detail of UIManagedDocument and UIDocument, and you should not really worry about or depend on it. In the current implementation, however, a folder containing an sqlite database file is created.
No. All entities will be contained in a single database file, also more generally called a persistent store. It is possible to use more than one persistent store, but those are more advanced use cases, and UIManagedDocument currently uses one.
UIManagedDocument's context is an NSManagedObjectContext from the underlying Core Data framework. UIManagedDocument actually operates two of them in parallel to spin off IO operations onto a background thread. As for the nature of a context itself, here's a quote from the Core Data Programming Guide:
You can think of a managed object context as an intelligent scratch pad. When you fetch objects from a persistent store, you bring temporary copies onto the scratch pad where they form an object graph (or a collection of object graphs). You can then modify those objects however you like. Unless you actually save those changes, however, the persistent store remains unaltered.
But it really is a good idea to take a look at the lectures and other material I posted above to get a general picture of the technologies used and their potential value to you as a developer in different situations.