The problem:
For some time now I have been using my own cache system built on NSFileManager. Normally the data I receive is JSON and I just save the dictionary directly into the cache (in the Documents folder). When I need it back, I just go and get it. Sometimes, when I feel it's better, I also keep an NSDictionary in the root folder with keys/values mapping to the path of a given resource. For example:
A resource about the weather in Geneva on 17/02/2013: I would have a key called GE_17_02_2013, and the value would be the path to the NSDictionary with the information.
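A minimal sketch of that scheme in Swift, for illustration only. The `FileCache` class, its method names and the `index.json` file are hypothetical, and it assumes the payloads are JSON-compatible dictionaries:

```swift
import Foundation

/// Hypothetical file-backed cache: an index maps resource keys to file names.
final class FileCache {
    private let directory: URL
    private let indexURL: URL
    private var index: [String: String]   // resource key -> file name

    init(directory: URL) throws {
        let indexURL = directory.appendingPathComponent("index.json")
        try FileManager.default.createDirectory(at: directory, withIntermediateDirectories: true)
        // Reload a previously saved index, or start with an empty one.
        if let data = try? Data(contentsOf: indexURL),
           let saved = try? JSONDecoder().decode([String: String].self, from: data) {
            self.index = saved
        } else {
            self.index = [:]
        }
        self.directory = directory
        self.indexURL = indexURL
    }

    /// Writes the payload to its own file and records the file name in the index.
    func store(_ payload: [String: Any], forKey key: String) throws {
        let fileName = key + ".json"
        let data = try JSONSerialization.data(withJSONObject: payload)
        try data.write(to: directory.appendingPathComponent(fileName), options: .atomic)
        index[key] = fileName
        try JSONEncoder().encode(index).write(to: indexURL, options: .atomic)
    }

    /// Looks the key up in the index and reads the dictionary back from disk.
    func payload(forKey key: String) -> [String: Any]? {
        guard let fileName = index[key],
              let data = try? Data(contentsOf: directory.appendingPathComponent(fileName)) else {
            return nil
        }
        return (try? JSONSerialization.jsonObject(with: data)) as? [String: Any]
    }
}
```

Replacing a whole feed then amounts to calling `store(_:forKey:)` again with the same key, which overwrites the old file.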
Normally I don't need to do any complex queries. But, somehow, and from what I have been reading, when you have a lot of data, you should stick with Core Data. In my case, I normally have a lot of data, but I never actually felt the application going down, or suffering in terms of performance. So my questions are:
In this case, where sometimes (the weather thing was just an example) I need to just remove all the data (a Twitter feed, for example) and replace it with a completely new stream of data, is Core Data worth it? I think removing all the data and inserting (populating) it is heavier than just storing the NSDictionary and replacing the old one.
Sometimes it would involve images, text files, etc., and NSFileManager does it perfectly, so what advantages could Core Data bring in these cases?
P.S.: I just saw this post, where this kind of question is asked and numbers show which one is actually faster. Still, I would also like an empirical answer.
Core Data is worth using in every scenario you described. In fact, if an app stores more than preferences, you should probably use Core Data. Here are some reasons, among which, you'll find answers to your own problems:
it is definitely faster than the filesystem, even if you wipe out everything and write it again as you describe (so you don't benefit too much from caching). This is basically because you can coalesce your operations and only access the store when needed. So if you read, write and read again, you only need to save once; the rest is done in memory, which is, needless to say, very fast.
everything is versioned and you can migrate from one version to another easily (while keeping the content the user has on the device)
80% of your model operations come free. Like, when something changes, you can override the willSave managed object method and notify your controllers.
using the cascade delete rule makes it trivial to delete even very complex object structures
while it is a bad idea to keep images in the database, you can still keep them on the filesystem and have Core Data delete them when the managed object that represents them is deleted (for example by overriding prepareForDeletion)
it is flexible, in fact so flexible that you could migrate your project from using the local filesystem to using a server with very few modifications by writing a custom data store.
the core data designer basically creates the model objects for you. You don't need to create your own model classes (which you would have to if using the filesystem)
In this case ... is Core Data worth it?
Yes, to the extent that you need something more centrally managed than trying to draw up your own file-system schema. Core Data, and its lower-level cousin SQL, are still the best choice for persistence that we have right now. Besides, the performance hit of using NSKeyed(Un)Archiver to keep serializing/deserializing a dictionary over and over again becomes increasingly noticeable with larger datasets.
I think removing all the data, and inserting (populating) it, is heavier
than just storing the NSDictionary and replacing the old one.
Absolutely, yes. But, you don't have to think about cache turnover like that. If you imagine Core Data as a static model, you can use a cache layer to ferry data in and out of the store. Need that resource about the weather? Check the cache layer. If it's not in there, make the cache run a fetch request. Need to turn over the whole cache? Have the cache empty itself then run a request to mark every entity with some kind of flag to show they are invalid. The expensive deletion you're worrying about can be done by a background process when you see that all your new data has been safely interned in the cache.
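To make the cache-layer idea concrete, here is a rough sketch. `ResourceCache` and its `loadFromStore` closure are hypothetical names; in a real app the closure would wrap a Core Data fetch request, and the deferred background deletion is left out:

```swift
import Foundation

/// Hypothetical cache layer sitting in front of a slower store.
final class ResourceCache {
    private var memory: [String: [String: Any]] = [:]
    /// Stand-in for "run a fetch request against the store".
    private let loadFromStore: (String) -> [String: Any]?

    init(loadFromStore: @escaping (String) -> [String: Any]?) {
        self.loadFromStore = loadFromStore
    }

    /// Checks the in-memory cache first and falls back to the store on a miss.
    func resource(forKey key: String) -> [String: Any]? {
        if let hit = memory[key] {
            return hit
        }
        guard let loaded = loadFromStore(key) else { return nil }
        memory[key] = loaded
        return loaded
    }

    /// Turns over the whole cache; stale rows in the store can be cleaned up later.
    func invalidateAll() {
        memory.removeAll()
    }
}
```

Callers only ever talk to the cache, so the expensive store work stays behind one seam and can be moved to the background without touching them.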
Sometimes it would involve images, text files, etc., and
NSFileManager does it perfectly, so what advantages could Core Data
bring in these cases?
Unfortunately, not many. For blobs of data (which is essentially how Core Data treats such content), storing to and fetching from Core Data can quickly get costly. They can also take up noticeably more space on disk if they aren't compressed (which further decreases performance). If you need a faster alternative, use a store more suited to the task, like Tokyo Cabinet or LevelDB, and use the entities in the Core Data store as a kind of stand-in that would, say, contain the key to the resource in one of those key-value stores.
Related
This question is not about the technical problem, but rather the approach.
I know two more or less common approaches to store the data received from the server in your app:
1) Using managers, data holders, etc. to store the data. They are most often some kind of singleton and are used to store the models received from the server (e.g. the array of posts/places/users). Singletons are needed to be able to access the data from any screen. I think the majority of apps use this approach.
2) Using Core Data (or maybe Realm) as in-memory storage. This approach avoids singletons but, I guess, is a bit more complex (and crash-prone) to maintain and support.
How do you store data and why?
P.S. Any answers would help. But big "thank you" for detailed ones, with reasons.
The reason people opt to use Core Data/Realm/SharkORM or any other iOS ORM is mainly to persist data between runs of the app.
Currently there are two ways of doing this. For single values and very small objects (not that I encourage it) you can use UserDefaults to persist between app launches. For an approach closer to a database, you need to use an ORM; in fact, Core Data and SharkORM are built on top of SQLite.
Using a manager to store an array of data models will only keep those models for the lifetime of the app process. For example, when the user force quits the app, restarts their device, or in some circumstances when iOS terminates your app, all that data is lost permanently. This is because it is stored in RAM, which is volatile memory, rather than in a database on disk.
Using a database layer even if you don't specifically require persistence between launches can have its advantages though; for instance SharkORM allows you to execute raw SQL queries on your objects if you don't want to use the built in powerful query builder. This can be useful to quickly pull the model you are interested in rather than iterating through a local array.
In reply to your question, how do I store data?
Well, I use a combination of all three. Say, for instance, I called an API for some data that I wanted to display to the user there and then; I would use a manager instance with an array to hold the data models.
But on the flipside if I wanted to store that data for later or if I needed to execute a complex query on it, I would store it on disk using Shark.
If, however, I just wanted to store whether or not the user had seen my onboarding flow, I would just persist a boolean value in UserDefaults.
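That last case is small enough to show in full. A sketch, where the key name `hasSeenOnboarding` is just an assumption:

```swift
import Foundation

let defaults = UserDefaults.standard
let onboardingKey = "hasSeenOnboarding"   // hypothetical key name

// bool(forKey:) returns false when the key has never been set.
if !defaults.bool(forKey: onboardingKey) {
    // First launch: present the onboarding flow here, then remember that it was shown.
    defaults.set(true, forKey: onboardingKey)
}
```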
I hope this is detailed enough for you.
CoreData isn't strictly "in-memory". You can load objects into your data model and save them into their context, then they might actually be on disk and out of main memory, and they can easily be brought back via fetch requests.
Singletons, on the other hand, do typically stay in main memory all the time until the user terminates the app. If you have larger objects that you are storing in some data structure (e.g. full resolution images when all you really needed was a thumbnail), this can be quite a resource hog.
I have an app with a lot of data (including NSMutableArrays, NSNumbers, various custom classes) that uses NSCoding protocol presently. However, I would like to implement an incremental saving system, to save time during the "saving process". The loading time is not important.
Is there any existing container that checks its members for "dirty" and only updates those values when writing to file; or better yet, a protocol that can be implemented to do the same; or any other simple, available way of doing this?
For a large amount of data it's better to change the data model to Core Data. Otherwise, you may want to save changes after specific events; a worse solution is to use an NSTimer to save all the data at whatever interval you choose.
I am somewhat new to Core Data and have a general question.
In my current project, users can access data reported by various sensors in each county of my state. Each sensor is represented in a table view which gathers its data from a web service call. Calling the web service could take some time since this app may be used in rural areas with slow wireless connectivity. Furthermore, users will typically only need data from one or two of the state's 55 counties. Each county could have anywhere from 15 to 500 items returned by the web service. Since the sensor names and locations change rarely, I would like the app to cache the data from the web service call to make gathering the list of sensors locations faster (and offer a refresh button for cases where something has changed). The app already uses Core Data to store bookmarked sensor locations, so it is already set up in the app.
My issue is whether to use Core Data to cache the list of sensors, or to use a SQLite data store. Since there is already a data model in place, I could simply add another entity to the model. However, I am concerned about whether this would introduce unnecessary overhead, or maybe none at all.
Being new to Core Data, it appears that all that is really happening is that objects are serialized and their properties added as fields in a SQLite DB managed by Core Data. If this is the case, it seems there really would not be any overhead from using the Core Data store already in place.
Can anyone help clear this up for me? Thanks!
it appears that all that is really happening is that objects are
serialized and their properties added as fields in a SQLite DB
managed by Core Data
You are right about that. Core Data does a lot more, but that's the basic functionality (if you tell it to use a SQLite store, which is what most people do).
As for the number of records you want to store in Core Data, that shouldn't be a problem. I'm working on a Core Data App right now that also stores over 20,000 records in Core Data and I still get very fast fetch times e.g. for auto completion while typing.
Core Data definitely adds some overhead, but if you only have a few entities and relationships and are not creating/modifying objects in more than one context, it is negligible.
Being new to Core Data, it appears that all that is really happening is that objects are serialized and their properties added as fields in a SQLite DB managed by Core Data. If this is the case, it seems there really would not be any overhead from using the Core Data store already in place.
That's not always the case. Core Data hides its storage implementation from the developer. It is often a SQLite database, but it can also be a different kind of store (a binary or in-memory store, for example). If you need a comprehensive guide to Core Data, I recommend this objc.io article.
As @CouchDeveloper noted, Core Data is a disk I/O- and CPU-bound process. If you notice performance hits, move the work to a background thread (yes, this is a pretty big headache), but it will always be faster than the average network.
I've got an application that stores products in a Core Data file. These products include images as "Transformable" data.
Now I tried adding some attributes using lightweight migration. When I tested this with a small database it worked well, but when I use a really large one of nearly 500 MB the application usually crashes because of low memory. Does anybody know how to solve this problem?
Thanks in advance!
You'll have to use one of the other migration options. The automatic lightweight migration process is really convenient to use. But it has the drawback that it loads the entire data store into memory at once. Two copies, really, one for before migration and one for after.
First, can any of this data be re-created or re-downloaded? If so, you might be able to use a custom mapping model from the old version to the new one. With a custom mapping model you can indicate that some attributes don't get migrated, which reduces memory issues by throwing out that data. Then when migration is complete, recreate or re-download that data.
If that's not the case... Apple suggests a multiple pass technique using multiple mapping models. If you have multiple entity types that contribute to the large data store size, it might help. Basically you end up migrating different entity types in different passes, so you avoid the overhead of loading everything at once.
If that is not the case then (e.g. the bloat is all from instances of the same entity type), well, it's time to write your own custom migration code. This will involve setting up two Core Data stacks, one with the existing data and one with the new model. Run through the existing data store, creating new objects in the new store. If you do this in batches you'll be able to keep memory under control. The general approach would be:
1) Create new instances in the new model and copy attributes only. You can't set up relationships yet because related objects might not exist in the new data store. Keep a mutable dictionary mapping NSManagedObjectIDs from the old store to the new one, for use in the next step. To keep memory use low:
   a) As soon as you have created a destination store object, free up the memory for the source object by using refreshObject:mergeChanges: with NO for the second argument.
   b) Every 10 instances (or 50, or whatever) save changes on the destination managed object context and then reset it. The interval is a balancing act-- do it too often and you'll slow down unnecessarily, do it too rarely and memory use rises.
2) Do a second pass where you set up relationships in the destination store. For each source object:
   a) Find the corresponding destination object, using the object ID map you created.
   b) Run through the source object's relationships. For each one, find the corresponding destination object, also using the object ID map.
   c) Set the destination object's relationship based on the result.
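The two passes above could look roughly like this in Swift. This is a sketch, not Apple's migration API: the `copyStore` function and its parameters are hypothetical, both contexts are assumed to be set up elsewhere (one per stack) and used on their own queue, entity names are assumed to be unchanged between the two models, and to-many relationships are assumed to be unordered.

```swift
import CoreData

/// Two-pass, batched copy from an old store to a new one (illustrative only).
func copyStore(entityNames: [String],
               from source: NSManagedObjectContext,
               to destination: NSManagedObjectContext,
               batchSize: Int = 50) throws {
    var idMap: [NSManagedObjectID: NSManagedObjectID] = [:]   // old ID -> new ID

    // Pass 1: copy attributes only, saving and resetting every `batchSize` objects.
    for entityName in entityNames {
        let request = NSFetchRequest<NSManagedObjectID>(entityName: entityName)
        request.resultType = .managedObjectIDResultType   // IDs only, no object data
        request.includesSubentities = false
        let sourceIDs = try source.fetch(request)
        var pending: [(NSManagedObjectID, NSManagedObject)] = []

        for (i, oldID) in sourceIDs.enumerated() {
            let old = try source.existingObject(with: oldID)
            let new = NSEntityDescription.insertNewObject(forEntityName: entityName,
                                                          into: destination)
            // Copy only the attributes that exist in both models.
            for name in new.entity.attributesByName.keys
            where old.entity.attributesByName[name] != nil {
                new.setValue(old.value(forKey: name), forKey: name)
            }
            pending.append((oldID, new))
            source.refresh(old, mergeChanges: false)   // turn the source object back into a fault

            if pending.count == batchSize || i == sourceIDs.count - 1 {
                try destination.save()                 // object IDs become permanent on save
                for (savedID, saved) in pending { idMap[savedID] = saved.objectID }
                pending.removeAll()
                destination.reset()
            }
        }
    }

    // Pass 2: set up relationships using the ID map.
    var processed = 0
    for (oldID, newID) in idMap {
        let old = try source.existingObject(with: oldID)
        let new = try destination.existingObject(with: newID)
        for (name, relationship) in old.entity.relationshipsByName
        where new.entity.relationshipsByName[name] != nil {
            if relationship.isToMany {
                let related = (old.value(forKey: name) as? Set<NSManagedObject>) ?? []
                let mapped = related
                    .compactMap { idMap[$0.objectID] }
                    .map { destination.object(with: $0) }
                new.setValue(Set(mapped), forKey: name)
            } else if let related = old.value(forKey: name) as? NSManagedObject,
                      let mappedID = idMap[related.objectID] {
                new.setValue(destination.object(with: mappedID), forKey: name)
            }
        }
        source.refresh(old, mergeChanges: false)
        processed += 1
        if processed % batchSize == 0 || processed == idMap.count {
            try destination.save()
            destination.reset()
        }
    }
}
```

The ID map itself stays in memory for the whole run, but object IDs are small compared with the image blobs that cause the problem here.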
While you are at it consider why your data store is so big. Are you storing a bunch of binary data blobs in the data store? If so, make sure you're using the "Allows external storage" option in the new model.
I'm going through the Apress book Pro Core Data and it says the following:
...local caching of remote data can benefit from in-memory persistent
stores.
I fail to see how caching the data in an in-memory persistent store is any more useful than simply having your app's root view controller hang on to the data. Can someone elaborate more fully on the kinds of situations where an in-memory persistent store might be useful?
Your question indicates a misunderstanding of MVC. You've asked "why would it be faster for the model to cache data rather than a controller." Controllers don't hold data at all, so it doesn't matter how fast it would or wouldn't be. The model holds data. And in a Core Data app, the model is tied to a persistent store.
The fact that persistent stores can be in memory makes coding extremely convenient, since callers don't have to worry about how the data is stored. In your example, callers would need to behave differently (deal with different classes) for data stored in a local store versus a remote store. Core Data abstracts that away, making it easy to move your store wherever you want it.
The benefit of using Core Data with an in-memory store, as compared to simply rolling your own non-Core Data class hierarchy, is that you benefit from all the other features of Core Data that are not related to persisting data. These include change tracking and undo support, relationship maintenance and change propagation, automatic validation, integration with standard UI components (e.g. NSFetchedResultsController), KVC/KVO, etc.
...local caching of remote data
The key word here is remote data, so you may not want to keep that data around between application launches. In this case an in-memory store makes sense.
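For reference, setting up such a store takes only a few lines. A minimal sketch: the one-entity model (`CachedPost` with a `title` attribute) is made up for illustration and built in code, whereas an app would normally load its own model file.

```swift
import CoreData

// Hypothetical one-entity model built in code.
let title = NSAttributeDescription()
title.name = "title"
title.attributeType = .stringAttributeType
title.isOptional = true

let entity = NSEntityDescription()
entity.name = "CachedPost"
entity.properties = [title]

let model = NSManagedObjectModel()
model.entities = [entity]

let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
// NSInMemoryStoreType keeps everything in RAM; nothing survives a relaunch.
_ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                       configurationName: nil, at: nil, options: nil)

let context = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
context.persistentStoreCoordinator = coordinator

// From here on the code is the same as with an on-disk store.
let post = NSEntityDescription.insertNewObject(forEntityName: "CachedPost", into: context)
post.setValue("Hello", forKey: "title")
try context.save()

let request = NSFetchRequest<NSManagedObject>(entityName: "CachedPost")
let cached = try context.fetch(request)
```

Switching to a persistent cache later would mean changing only the store type and URL passed to `addPersistentStore`; the insert, save and fetch code stays as it is.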