Is it safe to delete sqlite's WAL file? - ios

I have a strange problem with Core Data in an iOS app where sometimes the WAL file becomes huge (~1GB). It appears there are other people with the problem (e.g. Core Data sqlite-wal file gets MASSIVE (>7GB) when inserting ~5000 rows).
My initial thought is to delete the WAL file at app launch. It seems from reading the sqlite documentation on the matter that this will be fine. But does anyone know of any downsides to doing this?
I'd of course like to get to the bottom of why the WAL file is growing so big, but I can't do that right now, so I want to put in a workaround while I dig deeper into the problem.
It's worth pointing out that my Core Data database is more of a cache. So it doesn't matter if I lose data that's in the WAL. What I really need to know is, will the database be completely corrupted if I delete the WAL? My suspicion is no, otherwise the WAL doesn't serve one of its purposes.

Couple of things:
You can certainly delete the WAL file. You will lose any committed transactions that haven't been checkpointed back to the main file. (Thus violating the "durability" part of ACID, but perhaps you don't care.)
You can control the size of the WAL file on disk with the journal_size_limit pragma (if it bothers you). You may want to manually checkpoint more often too. See "Avoiding Excessively Large WAL files" here: https://www.sqlite.org/wal.html
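For illustration, a minimal sketch of passing that pragma through Core Data's options dictionary (coordinator and storeURL are placeholder names; the 1 MB cap is an arbitrary example):

NSDictionary *options = @{ NSSQLitePragmasOption:
                               @{ @"journal_size_limit": @"1048576" } }; // in bytes, ~1 MB
NSError *error = nil;
if (![coordinator addPersistentStoreWithType:NSSQLiteStoreType
                               configuration:nil
                                         URL:storeURL
                                     options:options
                                       error:&error]) {
    // handle the error
}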
I dislike all the superstitious bashing of WAL mode. WAL mode is faster, more concurrent, and much simpler since it dispenses with all the locking-level shenanigans (and most "database is busy" problems) that go with rollback journals. WAL mode is the right choice in almost every situation. (The only place it is problematic is on flash filesystems that don't support shared memory-mapped access to files. In that case, the "unofficial" SQLITE_SHM_DIRECTORY compile directive can be used to move the .shm file to a different kind of filesystem -- e.g. tmpfs -- but this should not be a concern on iOS.)

It baffles me how many people here are suggesting it's safe to delete WAL files, without so much as a second thought.
The documentation explicitly lists this as one of the official ways to corrupt a database. It doesn't say deleting a hot WAL may cause you to lose most recent transactions or something benign like that. It says it may corrupt the database.
Why? Because an application may have crashed in the middle of a checkpointing operation. When this happens, the database file itself is in an invalid state unless paired with the new data contained in the WAL.
So the answer is a clear no. Don't delete WAL files.
What you can do to clear the file is run PRAGMA wal_checkpoint(TRUNCATE); (the schema. prefix shown in the documentation is an optional database name).
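If you have direct SQLite access alongside Core Data, a truncating checkpoint looks roughly like this (a sketch; dbPath is a placeholder and error handling is elided):

#import <sqlite3.h>

sqlite3 *db = NULL;
if (sqlite3_open([dbPath fileSystemRepresentation], &db) == SQLITE_OK) {
    // Copies the WAL content into the main database file,
    // then truncates the WAL to zero bytes.
    sqlite3_exec(db, "PRAGMA wal_checkpoint(TRUNCATE);", NULL, NULL, NULL);
}
sqlite3_close(db);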

I have been seeing quite a few negative reports on WAL in iOS 7. I have had to disable it on several projects until I have had time to explore the issues more thoroughly.
I would not delete the journal file but you could play with the option of vacuuming the SQLite file which will cause SQLite to "consume" the journal file. You can do this by adding the NSSQLiteManualVacuumOption as part of the options when you add the NSPersistentStore to the NSPersistentStoreCoordinator.
If that ends up being time consuming then I would suggest disabling WAL. I have not seen any ill effects to disabling it (yet).
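A minimal sketch of that vacuum option (coordinator and storeURL are placeholder names):

NSDictionary *options = @{ NSSQLiteManualVacuumOption: @YES };
NSError *error = nil;
// Vacuuming rebuilds the store file at load time, which also folds in the journal.
[coordinator addPersistentStoreWithType:NSSQLiteStoreType
                          configuration:nil
                                    URL:storeURL
                                options:options
                                  error:&error];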

WAL mode has problems; don't use it. Problems vary, but the very large size you report is one. Other problems include failure during migration (using NSPersistentStoreCoordinator's migratePersistentStore) and failure during importing of iCloud transaction logs. So while there are reported benefits, until these bugs are fixed it's probably unwise to use WAL mode.
And NO you can't delete the Write Ahead Log, because that contains the most recent data.
Set the database to use rollback journal mode and I think you will find you no longer have these very large files when loading lots of data.
Here is an extract which explains how WAL works. Unless you can guarantee that your app has run a checkpoint I don't see how you can delete the WAL file without running the risk of deleting committed transactions.
How WAL Works
The traditional rollback journal works by writing a copy of the original unchanged database content into a separate rollback journal file and then writing changes directly into the database file. In the event of a crash or ROLLBACK, the original content contained in the rollback journal is played back into the database file to revert the database file to its original state. The COMMIT occurs when the rollback journal is deleted.
The WAL approach inverts this. The original content is preserved in the database file and the changes are appended into a separate WAL file. A COMMIT occurs when a special record indicating a commit is appended to the WAL. Thus a COMMIT can happen without ever writing to the original database, which allows readers to continue operating from the original unaltered database while changes are simultaneously being committed into the WAL. Multiple transactions can be appended to the end of a single WAL file.
Checkpointing
Of course, one wants to eventually transfer all the transactions that are appended in the WAL file back into the original database. Moving the WAL file transactions back into the database is called a "checkpoint".
Another way to think about the difference between rollback and write-ahead log is that in the rollback-journal approach, there are two primitive operations, reading and writing, whereas with a write-ahead log there are now three primitive operations: reading, writing, and checkpointing.
By default, SQLite does a checkpoint automatically when the WAL file reaches a threshold size of 1000 pages. (The SQLITE_DEFAULT_WAL_AUTOCHECKPOINT compile-time option can be used to specify a different default.) Applications using WAL do not have to do anything in order for these checkpoints to occur. But if they want to, applications can adjust the automatic checkpoint threshold. Or they can turn off the automatic checkpoints and run checkpoints during idle moments or in a separate thread or process.

There are quite good answers on this thread, but I'm adding this one to link to the official Apple Q&A about journaling mode in iOS 7 Core Data:
https://developer.apple.com/library/ios/qa/qa1809/_index.html
They give different solutions:
To safely back up and restore a Core Data SQLite store, you can do the following:
Use the following method of the NSPersistentStoreCoordinator class, rather than file system APIs, to back up and restore the Core Data store:
- (NSPersistentStore *)migratePersistentStore:(NSPersistentStore *)store toURL:(NSURL *)URL options:(NSDictionary *)options withType:(NSString *)storeType error:(NSError **)error
Note that this is the option we recommend.
Change to rollback journaling mode when adding the store to a persistent store coordinator if you have to copy the store file. Listing 1 is the code showing how to do this:
Listing 1 Use rollback journaling mode when adding a persistent store
NSDictionary *options = @{ NSSQLitePragmasOption: @{ @"journal_mode": @"DELETE" } };
if (![persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType
                                              configuration:nil
                                                        URL:storeURL
                                                    options:options
                                                      error:&error])
{
    // error handling.
}
For a store that was loaded with the WAL mode, if both the main store file and the corresponding -wal file exist, using rollback journaling mode to add the store to a persistent store coordinator will force Core Data to perform a checkpoint operation, which merges the data in the -wal file to the store file. This is actually the Core Data way to perform a checkpoint operation. On the other hand, if the -wal file is not present, using this approach to add the store won't cause any exceptions, but the transactions recorded in the missing -wal file will be lost.
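For reference, a minimal sketch of the recommended migratePersistentStore: approach (coordinator, sourceStore, and backupURL are placeholder names; note that after the call the coordinator references the store at the new URL, not the original location):

NSError *error = nil;
NSDictionary *options = @{ NSSQLitePragmasOption: @{ @"journal_mode": @"DELETE" } };
NSPersistentStore *backup = [coordinator migratePersistentStore:sourceStore
                                                           toURL:backupURL
                                                         options:options
                                                        withType:NSSQLiteStoreType
                                                           error:&error];
if (backup == nil) {
    // handle the error
}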
VERY IMPORTANT EDIT
If some of your users are on iOS 8.1 and you chose the first solution (the one Apple recommends), note that their many-to-many data relationships will be completely discarded. Lost. Deleted. In the entire migrated database.
This is a nasty bug apparently fixed in iOS 8.2. More info here http://mjtsai.com/blog/2014/11/22/core-data-relationships-data-loss-bug/

You should never delete the sqlite WAL file; it contains transactions that haven't been written to the actual sqlite file yet. Instead, force the database to checkpoint, which will clean up the WAL file for you.
In CoreData the best way to do this is to open the database with the DELETE journal mode pragma. This will checkpoint and then delete the WAL file for you.
NSDictionary *options = @{ NSSQLitePragmasOption: @{ @"journal_mode": @"DELETE" } };
[psc addPersistentStoreWithType:NSSQLiteStoreType
                  configuration:nil
                            URL:_storeURL
                        options:options
                          error:&localError];
For sanity's sake you should ensure you only have one connection to the persistent store when you do this, i.e. only one persistent store instance in a single persistent store coordinator.
FWIW in your particular case you may wish to use TRUNCATE or OFF for your initial database import, and switch to WAL for updates.
https://www.sqlite.org/pragma.html#pragma_journal_mode
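A sketch of that two-phase idea, under the assumption that you can tear the store down between the import and normal use (psc and _storeURL as above; journal_mode=OFF trades crash safety for speed during the import):

NSError *error = nil;
NSDictionary *importOptions = @{ NSSQLitePragmasOption: @{ @"journal_mode": @"OFF" } };
NSPersistentStore *store = [psc addPersistentStoreWithType:NSSQLiteStoreType
                                             configuration:nil
                                                       URL:_storeURL
                                                   options:importOptions
                                                     error:&error];
// ... perform the bulk import and save ...
[psc removePersistentStore:store error:&error];
// Re-add the store in WAL mode for normal day-to-day updates.
NSDictionary *normalOptions = @{ NSSQLitePragmasOption: @{ @"journal_mode": @"WAL" } };
[psc addPersistentStoreWithType:NSSQLiteStoreType
                  configuration:nil
                            URL:_storeURL
                        options:normalOptions
                          error:&error];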

Related

Will dets perform disk readings on lookup with ram_file option?

An option, ram_file, is described in the Erlang DETS docs:
open_file(name, args)
{ram_file, boolean()} - Whether the table is to be kept in RAM. Keeping the table in RAM can sound like an anomaly, but can enhance the performance of applications that open a table, insert a set of objects, and then close the table. When the table is closed, its contents are written to the disk file. Defaults to false.
Does this mean a save to disk is performed after each insert or update? And what if I use open, then lookup, then close?
I haven't checked in the docs, but I assume this means that the VM will open the file and potentially mmap it, so it is constantly kept in memory and synchronised with the file on disk. However, a change can still end up in the cache, so it may not be written to disk immediately. If you want to be sure that all changes have been flushed to the disk, use the dets:sync/1 call on the table to force the data to be flushed, which is explicitly stated in the docs:
This also applies to tables that have been opened with flag ram_file set to true. In this case, the contents of the RAM file are flushed to disk.
It will not open or close the file after each lookup; it will keep it open until dets:close/1 is called on the given table. Opening and closing the table for each lookup, on the other hand, would be expensive, and would make using DETS rather pointless.

Force a Core Data Checkpoint?

I write an app that churns a great deal of data through Core Data. I clean up this data in the background after the user has exited the app. Because WAL checkpointing appears to be a major cause of my UI pauses, I would like to also force a WAL checkpoint. (Yes, I know about creating a second Core Data stack. That will also be done, but this problem will still remain. I have existing experiments using a second stack, but they haven't yet resulted in any appreciable advantage.)
The Google reveals the following page, New Default Journaling Mode, which goes into a very modest discussion of how to force a checkpoint of a database before copying it elsewhere. My issue is that I would like to force the checkpoint on the live database without tearing down my whole UI. My experiments re-adding the persistent store to the coordinator are to no avail; they result in an infinite loop.
Clearly, checkpointing can be done without affecting my existing MOCs and PSC because it already happens. I just want to force it at a well known time that doesn't affect my user's happiness.
That document's description may be modest, but that's how it's done. Core Data isn't really a SQLite wrapper, and it provides very limited direct access to SQLite. Passing options when adding the persistent store is the only option.
In short: you can't force a checkpoint on a live persistent store.
What you can do is use that method all the time: switch to rollback journal mode permanently, instead of only for checkpoint purposes. By using the journal_mode option, you can switch SQLite to a different mode where checkpointing isn't needed. As long as you include
NSDictionary *options = @{ NSSQLitePragmasOption: @{ @"journal_mode": @"DELETE" } };
when adding the store, the problem doesn't exist.
If you want to retain wal mode, you can try using other SQLite pragmas in the option list to tune the checkpointing behavior. For example, the wal_autocheckpoint pragma tunes how often checkpoints occur. You might be able to get better results by adjusting that. You still can't invoke a checkpoint on demand, but you'll change the performance.
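For example, a sketch of lowering the auto-checkpoint threshold through the same options mechanism (the value is in WAL pages; 100 is an arbitrary illustration, the default being 1000):

NSDictionary *options = @{ NSSQLitePragmasOption:
                               @{ @"wal_autocheckpoint": @"100" } };
// Pass `options` when adding the persistent store, as in the snippet above.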

Core Data sqlite-wal file gets MASSIVE (>7GB) when inserting ~5000 rows

I'm importing data into Core Data and find that the save operation is slow. Using the iOS simulator, I watch the sqlite-wal file grow and grow until its over 7GB in size.
I'm importing approx 5000 records with about 10 fields. This isn't a lot of data.
Each object I'm inserting has a to-one relation to various other objects (6 relations total). All of those records combined equal less than 20 fields. There are no images or any binary data or anything that I can see that would justify why the resulting size of the WAL file is so huge.
I read the sqlite docs describing the wal file and I don't see how this can happen. The source data isn't more than 50 MB.
My app is multi-threaded.
I create a managed object context in the background thread that performs the import (creates and saves the core data objects).
Without writing the code out here, has anyone encountered this? Anyone have a thought on what I should be checking? The code isn't super simple and all the parts would take time to write up here, so let's start with general ideas.
I'll credit anyone who gets me going in the right direction.
Extra info:
I've disabled the undo manager for the context as I don't need that (I think it's nil by default on iOS but I explicitly set it to nil).
I only call save after the entire loop is complete and all managed objects are in ram (ram goes up to 100 MB btw).
The loop and creation of the Core Data objects takes only 5 seconds or so. The save takes almost 3 minutes as it writes the WAL file.
It seems my comment to try using the old rollback (DELETE) journal mode rather than WAL journal mode fixed the problem. Note that there seem to be a range of problems when using WAL journal mode, including the following:
this problem
problems with database migrations when using the migratePersistentStore API
problems with lightweight migrations
Perhaps we should start a Core Data WAL problems page and get a comprehensive list and ask Apple to fix the bugs.
Note that the default mode under OS X 10.9 and iOS 7 is now WAL mode. To change this back, add the following option:
@{ NSSQLitePragmasOption : @{ @"journal_mode" : @"DELETE" } }
All changed pages of a transaction get appended to the -wal file.
If you are importing multiple records, you should, if possible, use a single transaction for the entire import.
SQLite cannot do a full WAL checkpoint while some other connection is reading the database (which might just be some statement that you forgot to close).
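In Core Data terms, one -save: call maps to one SQLite transaction, so a sketch of a single-transaction import might look like this (a private-queue context; records and the entity name Item are placeholders):

[context performBlock:^{
    for (NSDictionary *record in records) {
        NSManagedObject *object =
            [NSEntityDescription insertNewObjectForEntityForName:@"Item"
                                          inManagedObjectContext:context];
        // ... populate `object` from `record` ...
    }
    NSError *error = nil;
    [context save:&error]; // one save == one SQLite transaction
}];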

Core Data WAL mode not persisting changes to .db, only .db-wal and .db-shm

So, I have been using MagicalRecord to develop an iPad app, and recently, after moving to an auto-migrating store, I have been experiencing some issues. I need to sync my .db file from one device over to another, so I need all of the data to be in the .db. But it seems that with WAL journaling mode enabled (the default for MagicalRecord auto-migrating stores), no matter how I save, it only persists the changes to the .db-wal or .db-shm files. I switched to a normal sqlite store and everything worked fine. So, my question is: with WAL journaling enabled, do I need to do anything special to actually get Core Data to save to the main database, or will I just have to disable it?
Change the journal mode. You have the MagicalRecord source, after all. Change the SQLite journal mode to DELETE, and the journal file will be deleted after every transaction. Disabling journalling entirely is a really bad idea; don't do that. But using a different mode should be fine.
Core Data does not offer any API for manipulating the journal once the persistent store is open. SQLite is an implementation detail, and Core Data doesn't expose the internal SQLite details. The closest you can get is the options parameter when setting up the Core Data stack, which is where you can change the journal mode (and where MR changes it).
The -wal file is part of the database; you must synchronize it together with the .db file.
Alternatively, you can copy the data to the main database file by executing a checkpoint.
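If you do end up copying files directly rather than checkpointing, a sketch of copying the whole database set (dbPath and backupPath are placeholders; the -shm file is cheap to recreate but harmless to include):

NSFileManager *fm = [NSFileManager defaultManager];
for (NSString *suffix in @[ @"", @"-wal", @"-shm" ]) {
    NSString *src = [dbPath stringByAppendingString:suffix];
    NSString *dst = [backupPath stringByAppendingString:suffix];
    if ([fm fileExistsAtPath:src]) {
        [fm copyItemAtPath:src toPath:dst error:NULL];
    }
}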

What should I do after crashing when writing into a sqlite db (iOS)?

If an app crashes while writing into a sqlite db (or Core Data), sometimes the db file will be broken, after which the db may fail to open.
What I'm doing now is deleting the db file if it fails to open, and copying a new one to be used.
I'm wondering what's the BEST WAY to deal with such situation?
Due to the atomic commit nature of SQLite, you should never experience database corruption. If you are, it could be due to enabling features such as "Write Caching" within iOS or in the hard drive itself, or it could possibly even be caused by hardware failure.
SQLite maintains a journal file to rollback commits and return the database to a consistent state in the event of a power failure or other abrupt shutdown. If corruption occurs, it means that the OS responded to SQLite stating a write operation had completed when in actuality, it wasn't physically committed to the media yet. Please ensure Write Caching is disabled when using it in your App. For more information, please see the SQLite Atomic Commit reference.
Otherwise, the common method people seem to use to "repair" a SQLite DB is to .dump the DB file into another, like so:
echo ".dump" | sqlite3 old.db | sqlite3 new.db
Hope this helps...
