What shouldn't I store in core data? - ios

So this is more of an application design question. But I think it can be 'answered' and not just discussed. :)
I'm using RestKit for an application we're building. It obviously makes it super easy to put stuff into either straight objects or core data objects.
In the specific instance I'm dealing with, we have comments, much like comments on a facebook post.
Now, the nicest thing about storing these comments in core data is that with NSFRC I can sort them super easily and deal with updating/inserting automatically into the right spots in the timeline. But there are a couple of sticking points there as well.
For instance, with infinite loading, I now have to manage loading the comments in between the newest comments and the older stored ones. (Maybe the first time I grabbed 25, but there have been 100 new comments since then. So I retrieve the latest 25 first, then need an auto-load cell between the new comments and the old ones until I run into those, then have to paginate any after.)
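The gap check described above can be sketched roughly like this. It is only an illustration under assumptions, not RestKit or Core Data API: the `Comment` struct and the `needsGapCell` function are hypothetical names, and comments are assumed to carry a unique id and a creation date.

```swift
import Foundation

// Hypothetical minimal comment model; a real app would use its own entity.
struct Comment: Equatable {
    let id: Int
    let createdAt: Date
}

/// Returns true when a freshly fetched page does not reach back to the
/// newest comment already stored, so an auto-load cell belongs between them.
func needsGapCell(fetchedPage: [Comment], stored: [Comment]) -> Bool {
    guard let oldestFetched = fetchedPage.map(\.createdAt).min(),
          let newestStored = stored.map(\.createdAt).max() else {
        return false // nothing fetched or nothing stored: no gap to show
    }
    // If the page overlaps the stored comments, the timeline is contiguous.
    let storedIDs = Set(stored.map(\.id))
    if fetchedPage.contains(where: { storedIDs.contains($0.id) }) {
        return false
    }
    // No overlap and the page is entirely newer: comments may be missing.
    return oldestFetched > newestStored
}
```

Each time the auto-load cell fetches another page, the same check can be run again; once a page overlaps the stored comments, the cell can be removed.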
Aside from that, you are also storing potentially thousands of comments in core data. Maybe it's not a big deal for quite a long time, but eventually you might want to start cleaning up old comments with a GCD task.
So what are the leading thoughts on what to store in core data and what to keep as transient objects? (Maybe storing those in a cache like NSCache or the new Tumblr cache, https://github.com/tumblr/TMCache.)
Edit
Ok maybe I should clarify a little here. I get the purpose of Core Data... for persisting across app restarts and having an object graph with relationships. I make plenty of use of it. I guess what I'm wondering about here is the grey areas where I would like things to persist for the sake of not always having to wait for a network call, and offline availability.
But much like stories and comments on facebook, there is always going to be a constant stream of new ones coming in, and you don't necessarily care about 300 comments on an old post. Someone could come back to view comments on their 'post' quite a few times, or someone may just be browsing 'posts' and comments casually and never come back to them.
So I'm just trying to consider the strategy for something like this where you have potentially lots of entities (comments) coming down from a service. Sometimes people will want to view them several times (their own 'post') and sometimes they are just browsing through. When trying to see how others do this, it seems some stuff it all into core data, some (like Facebook) seem to store 25-50 most recent in the db, and any beyond that are transient (they probably are clearing out older stories and comments regularly too.)
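One way to picture the "keep only the most recent N per post" strategy is the sketch below. It is plain Swift rather than Core Data code, and `StoredComment` and `partitionForPruning` are hypothetical names; the limit of 25-50 is only the guess made in the question, not a known Facebook setting.

```swift
import Foundation

// Hypothetical flattened view of a persisted comment.
struct StoredComment {
    let id: Int
    let postID: Int
    let createdAt: Date
}

/// Splits comments into those to keep (the `limit` newest per post) and
/// those that could be deleted from the persistent store.
func partitionForPruning(_ comments: [StoredComment], keepingNewest limit: Int)
    -> (keep: [StoredComment], prune: [StoredComment]) {
    let safeLimit = max(limit, 0) // prefix/dropFirst trap on negative counts
    var keep: [StoredComment] = []
    var prune: [StoredComment] = []
    let byPost = Dictionary(grouping: comments, by: { $0.postID })
    for (_, group) in byPost {
        let newestFirst = group.sorted { $0.createdAt > $1.createdAt }
        keep.append(contentsOf: newestFirst.prefix(safeLimit))
        prune.append(contentsOf: newestFirst.dropFirst(safeLimit))
    }
    return (keep, prune)
}
```

In an app, the `prune` list would be what a periodic background cleanup deletes; anything pruned can still be refetched from the service when someone scrolls that far.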

Core Data is not designed to be used as "dumb data storage", but rather object persistence. So, anything that you want to persist between uses of your app should go into Core Data.
If you are using Core Data properly, it will take care of all of your caching for you as well.
EDIT:
For anything that is going to change too often for your taste, or that you just don't want to store permanently, NSCache may be a better option. If you don't think your user will look at it again tomorrow, leave those bits out of your persistence. (IMHO)
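As a rough sketch of that idea, transient comments could sit behind a small NSCache wrapper like the one below. `CommentList` and `TransientCommentCache` are hypothetical names made up for illustration, and the `load` closure stands in for whatever network call the app really makes.

```swift
import Foundation

// NSCache stores objects, so the cached value is wrapped in a class.
final class CommentList {
    let bodies: [String]
    init(bodies: [String]) { self.bodies = bodies }
}

final class TransientCommentCache {
    private let cache = NSCache<NSNumber, CommentList>()

    init(maxPosts: Int) {
        // countLimit is not a strict limit; NSCache may evict at other times too.
        cache.countLimit = maxPosts
    }

    /// Returns cached comments for a post, or calls `load` and caches the result.
    func comments(forPost postID: Int, load: () -> [String]) -> [String] {
        let key = NSNumber(value: postID)
        if let hit = cache.object(forKey: key) {
            return hit.bodies
        }
        let fresh = CommentList(bodies: load())
        cache.setObject(fresh, forKey: key)
        return fresh.bodies
    }
}
```

Because NSCache may drop entries whenever memory is tight, callers should always be prepared for `load` to run again; nothing stored this way survives an app restart.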

Create a second repository. Either select a time period that counts as 'recent' or provide a preference for it. Periodically look at the primary repository, find objects that are now older than the recent range, and move those objects to the second repository.
Then provide users the means to search in just recent or all.
If all they want are recent values the searches should be faster, and nothing is lost.
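A minimal in-memory sketch of this two-repository idea follows. `Record` and `TwoTierStore` are hypothetical names, and arrays stand in for what would really be two persistent stores.

```swift
import Foundation

struct Record {
    let id: Int
    let createdAt: Date
}

struct TwoTierStore {
    var recent: [Record] = []
    var archive: [Record] = []

    /// Moves everything older than `recentWindow` seconds from `recent` to `archive`.
    mutating func archiveOldRecords(now: Date, recentWindow: TimeInterval) {
        let cutoff = now.addingTimeInterval(-recentWindow)
        archive.append(contentsOf: recent.filter { $0.createdAt < cutoff })
        recent.removeAll { $0.createdAt < cutoff }
    }

    /// Searches only recent records, or both tiers when `includeArchive` is true.
    func search(includeArchive: Bool, matching predicate: (Record) -> Bool) -> [Record] {
        let scope = includeArchive ? recent + archive : recent
        return scope.filter(predicate)
    }
}
```

The recent-only search touches fewer records, while the archive keeps everything reachable for users who ask for it.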

Related

Core Data ordeal

A few weeks ago, I decided to learn Core Data for my new project and apply it to my entire model. There was a steep learning curve, but eventually I got familiar with the stack and I'm now rather comfortable with at least the basic concepts and the few common pitfalls such as thread concurrency.
I have to say, the first few weeks after getting comfortable were pretty amazing. NSFetchedResultsController gives me a good way to communicate between my model and my controllers. However, the more I use Core Data, the more annoying it gets.
As a concrete example, my app fetches a few pieces of data from my server (the posts) which appear in a feed. Each post has an owner, of class User, which I also fetch from the server. Now, Core Data has been great for managing the relationship between a post and a user. The relationship is updated automatically and getting the post's owner is as simple as calling post.owner. However, there are also inconveniences:
1. Core Data forces objects to the disk that I do not want forced to the disk. This is probably the main issue. With the posts, I do not want them to be forced to disk, and would rather make calls to the server again. Why? Because the more posts I store persistently, the more housekeeping there is to do. A post can be edited, deleted, flagged, etc... and keeping those posts locally means having to plan updates.
2. Having to constantly worry about concurrency of contexts, objects and the likes. I wrote an object factory that always returns objects on the right thread and the right context, but even then bugs occur here and there, which quickly becomes frustrating.
3. Decreased performance. Perhaps the least important one at this point, going from cached objects to Core Data has taken a (barely noticeable) toll on the performance of my application (most notably the feed).
So what are your recommendations regarding Core Data? Would you suggest a different approach to Core Data?
I was thinking of a hybrid caching + Core Data approach where I store the information I will actually use many times (such as users) persistently and then use the RAM for things like posts, or simply create posts without an NSManagedObjectContext. Input welcome!
Core Data forces objects to the disk that I do not want forced to the disk.
It does no such thing. If you don't want to save your Post objects to the persistent store, don't put them in Core Data and don't make them managed objects. Your User object can have a posts property even if the Post object is not managed by Core Data. Managed objects can have properties of any type, not only references to other managed objects.
Having to constantly worry about concurrency of contexts, objects and the likes.
Concurrency is complex no matter how you model your data. It's a fundamentally complex problem. You're encountering it with Core Data because you're using Core Data. If you use something else, you'll deal with it there.
Decreased performance.
"Product" menu --> "Analyze" for static analysis, and "Product" menu --> "Profile" to run Instruments and find out why. There's no reason this should happen, and you have the tools to discover what's actually going on.

Is it good practice to store items retrieved from a web service as Core Data objects?

I've heard of iOS developers retrieving items from a RESTful web service and storing them as Core Data objects right away. I can see why that may be useful if you want to save or cache these items so the user can see them later (e.g. Facebook feed), but are there any other reasons to do so? I have items in my web service that are invalid within an hour, so caching is out of the question. If it's a good practice to do so, why?
For me, there are 2 reasons to store data locally:
Better UX: first show the old content, then update in the background, and refresh your application UI when new content is available.
Work offline whenever online mode is impossible.
Even if your items are invalid within an hour, if you do not cache them locally, your application has to call the web service every time to retrieve them, and that takes time.
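The "show old content first, then refresh" flow can be sketched as a small decision function. This is an illustration under assumptions: `CachedItems`, `DisplayPlan`, and `plan(for:now:lifetime:)` are hypothetical names, and the one-hour lifetime comes from the question.

```swift
import Foundation

struct CachedItems {
    let items: [String]
    let fetchedAt: Date
}

enum DisplayPlan: Equatable {
    case showCached(refreshInBackground: Bool)
    case fetchThenShow
}

/// Decides what to show first, given an optional cache entry and a lifetime.
func plan(for cached: CachedItems?, now: Date, lifetime: TimeInterval) -> DisplayPlan {
    guard let cached = cached else {
        return .fetchThenShow // nothing local yet: the user has to wait once
    }
    let age = now.timeIntervalSince(cached.fetchedAt)
    // Expired data is still shown immediately, but a refresh is requested.
    return .showCached(refreshInBackground: age >= lifetime)
}
```

With `lifetime` set to 3600 seconds, items older than an hour are displayed at once and replaced when the background update finishes; whether showing expired items is acceptable depends on the app.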
Caching almost never hurts and CoreData is a very nice way to cache data which comes in as a pile of similar records.
I am one of those devs you mentioned who store almost anything using CoreData. Because I do, a lot of useful code and self-made frameworks have accumulated over time, which makes working with CoreData and RESTful APIs a breeze. And if connecting an API to CoreData is just a matter of a few lines of code, there really isn't any reason not to.
While I cannot share my libraries, I'd strongly recommend taking a look at RestKit, which does pretty much the same - mapping a RESTful api to CoreData. And if you're not used to CoreData yet, fear not. It is a very powerful tool and getting used to it is definitely worth the while!

Keeping Core Data Objects in multiple stores

I'm developing an iOS application using Core Data. I want to have the persistent store located in a shared location, such as a network drive, so that multiple users can work on the data (at different times i.e. concurrency is not part of the question).
But I also want to offer the ability to work on the data "offline", i.e. by keeping a local persistent store on the iPad. So far, I read that I could do this to some degree by using the persistent store coordinator's migration function, but this seems to imply the old store is then invalidated. Furthermore, I don't necessarily want to move the complete store "offline", but just a part of it: going with the simple "company department" example that Apple offers, I want users to be able to check out one department, along with all the employees associated with that department (and all the attributes associated with each employee). Then, the users can work on the department data locally on their iPad and, some time later, synchronize those changes back to the server's persistent store.
So, what I need is to copy a core data object from one store to another, along with all objects referenced through relationships. And this copy process needs to also ensure that if an object already exists in the target persistent store, that it's overwritten rather than a new object added to the store (I am already giving each object a UID for another reason, so I might be able to re-use the UID).
From all I've seen so far, it looks like there is no simple way to synchronize or copy Core Data persistent stores, is that a fair assessment?
So would I really need to write a piece of code that does the following:
1. retrieve object "A" through a MOC
2. retrieve all objects, across all entities, that have a relationship to object "A"
3. instantiate a new MOC for the target persistent store
4. for each object retrieved, check the target store if the object exists
5. if the object exists, overwrite it with the attributes from the object retrieved in steps 1 & 2
6. if the object doesn't exist, create it and set all attributes as per object retrieved in steps 1 & 2
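The steps above can be sketched in plain Swift, with dictionaries keyed by UID standing in for the two persistent stores. This is not Core Data API: `Snapshot` and `checkOut` are hypothetical names, and the sketch follows relationships outward from the root object transitively, which is one possible reading of step 2.

```swift
import Foundation

// Hypothetical flattened stand-in for a managed object: a UID, attribute
// values, and the UIDs of the objects it is related to.
struct Snapshot: Equatable {
    let uid: String
    var attributes: [String: String]
    var relatedUIDs: [String]
}

/// Copies `rootUID` and everything reachable through relationships from
/// `source` into `target`, overwriting objects whose UID already exists.
func checkOut(rootUID: String,
              from source: [String: Snapshot],
              into target: inout [String: Snapshot]) {
    var pending = [rootUID]
    var visited = Set<String>()
    while let uid = pending.popLast() {
        // Skip objects already copied (relationships can form cycles)
        // and UIDs that do not exist in the source store.
        guard visited.insert(uid).inserted, let object = source[uid] else { continue }
        target[uid] = object // insert or overwrite, keyed by UID
        pending.append(contentsOf: object.relatedUIDs)
    }
}
```

Synchronizing changes back would run the same copy in the other direction, and that is where conflict handling, which this sketch ignores entirely, becomes the hard part.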
While it's not the most complicated thing in the world to do, I would've still thought that this requirement for "online / offline editing" is common enough for some standard functionality to be available for synchronizing parts of persistent stores?
Your points of view are greatly appreciated,
thanks,
da_h-man
I was just half-kidding with the comment above. You really are describing a pretty hard problem - it's very difficult to nail this sort of synchronization, and there's seldom, in any development environment, going to be a turn-key solution that will "just work". I think your pseudo-code description above is a pretty accurate description of what you'll need to do. Although some of the work of traversing the relationships and checking for existing objects can be generalized, you're talking about some potentially complicated exception handling situations - for example, if you are updating an object and only 1 out of 5 related objects is somehow out of date, do you throw away the update or apply part of it? You say "concurrency" is not a part of the question, but if multiple users can "check out" objects at the same time, unless you plan to have a locking mechanism on those, you would start having conflicts when trying to make updates.
Something to check into are the new features in Core Data for leveraging iCloud - I doubt that's going to help with your problem, but it's generally related.
Since you want to be out on the network with your data, another thing to consider is whether Core Data is the right fit to your problem in general. Since Core Data is very much a technology designed to support the UI and MVC pattern in general, if your data needs are not especially bound to the UI, you might consider another type of DB solution.
If you are in fact leveraging Core Data in significant ways beyond just modeling, in terms of driving your UI, and you want to stick with it, I think you are correct in your analysis: you're going to have to roll your own solution. I think it will be a non-trivial thing to build and test.
An option to consider is CouchDB and an iOS implementation called TouchDB. It would mean adopting more of a document-oriented (JSON) approach to your problem, which may in fact be suitable, based on what you've described.
From what I've seen so far, I reckon the best approach is RestKit. It offers a Core Data wrapper that uses JSON to move data between remote and local stores. I haven't fully tried it yet, but from what the documentation reads, it sounds quite powerful and ideally suited for my needs.
You should definitely check these things:
Parse.com - cloud based data store
PFIncrementalStore https://github.com/sbonami/PFIncrementalStore - subclass of NSIncrementalStore which allows your Persistent Store Coordinator to store data both locally and remotely (on Parse Cloud) at the same time
All of this is well documented. Also, Parse.com is going to release an iOS local datastore SDK http://blog.parse.com/2014/04/30/take-your-app-offline-with-parse-local-datastore/ which is going to help keep your data synced.

Most Efficient Way to Populate UITableView

I am developing an app which consists of a UINavigationController and UITableViews, there will be many items (50+) at the root view of the nav controller and maybe 30 rows in each of the detail views.
What is the most efficient way to populate the lists? Core Data or Plists?
There is scope within the specification to push updates to the lists on a monthly basis, so they always stay current. Would this affect the choice, which method is easier to bulk update?
Thanks
I would choose Core Data.
It is relatively easy to use, and it gives you more flexibility if the app needs to grow. Core Data can be backed by SQLite, and thus can be quite performant. Bulk updates are also manageable.
Core Data is by far the best, especially since you want to be able to make updates to this data later on
Regarding updates. I wouldn't 'push' these out but rather have the app poll for them, perhaps on launch, then fetch anything new in the background.
Edit: Also, with Core Data and an NSFetchedResultsController it is very easy to smoothly animate new records into a UITableView as they are added to the data store in the background.
Imho, I would try to keep things simple, following the good old KISS principle.
In your current case, it seems that you just need to display read-only data, so all you need is the data (say a file, in plist format, or xml, or json, or csv, or whatever). Just parse the file, populate your business objects, and add them to an array. Use that array for your master and detail view. No need for core data here (assuming that by 50+ you don't mean 50 - 50'000, because in that case, core data's memory management would help ;-)
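A minimal sketch of the plist route, assuming a hypothetical `Item` type whose property names match the keys in the file:

```swift
import Foundation

// Hypothetical business object; the keys must match those in the plist.
struct Item: Codable {
    let title: String
    let details: [String]
}

/// Decodes an array of items from property-list data (XML or binary).
func loadItems(fromPlist data: Data) throws -> [Item] {
    try PropertyListDecoder().decode([Item].self, from: data)
}
```

In an app, the data would typically be read from a bundled file with `Data(contentsOf:)`, and the resulting array would back the root table view, with each item's `details` feeding its detail view.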
If in the future you need to handle updates, you will either update the whole list, thus in fact just replace the old file (simple), or do incremental changes. I would only recommend to consider to start using core data in the latter case.
I'm personally using core data in a couple of projects, and I love it. But I wouldn't recommend it just because it's there; after all, it brings overhead and complexity. If you want to use core data, you'll need to invest some time to understand its concepts. Don't underestimate that: there's a lot of stuff to read and understand, and probably a couple of WTF moments (just look for core data questions here on SO).
Just to be clear: I don't want to talk you out of using core data, I'm just asking as your mother probably would: do you really need it?

Separate Object over multiple models or encode in JSON?

Sorry if the question sounds so weird, but I don't really know how else to put it.
Essentially, my application will have a bunch of objects. Each object has some kind of post/comment structure. The unique thing, though, is that it is more or less static, so I figured it would make no sense to put every single post and comment into my database, because that would cause more database load. Instead, I was thinking about storing the JSON representation of the post with its comments, thus causing only one database access per object. I would then render the JSON object in the controller or view or something. Is this a valid solution?
No!
You lose all ability to query that data, for no benefit unless you are at massive scale. The database's job is to pull that stuff out for you efficiently, and if you create the proper indexes and implement the proper caching strategies, you shouldn't have any issues with database load. You want to replace all the goodness of the Rails ORM with your own decidedly less useful version in the interest of a speed gain, waaay before you need it.
What if later you want to do a most popular comments sidebar widget? Or you want to page through the comments, regardless of the post they are associated with, in a table for moderation? What if you want your data to be searchable?
Don't sacrifice your ability to easily query and manipulate the data for premature optimization.
Though it sounds like a good idea, I don't think it will work in the long run; think of what is going to happen when you have many comments on your posts. You will have to get the long string from the database, add the new comment to it, and then write it back to the database. This will be very inefficient compared to just inserting one more comment in the table.
Also, just think what is going to happen, if at some point, you will have to give user the option to update the comment. Getting the particular comment from that long string and then update it will be a nightmare, don't you think?
In general you want to use JSON and the like as a bit of a last resort. Storing JSON in the db makes sense if your information isn't necessarily known ahead of time. It is not a substitute for proper data modelling and is not a win performance-wise.
To give you an idea where I am looking at using it in a project, in LedgerSMB we want to be able to have consultants track additional information on some db objects. Because we don't know what it will be in advance JSON makes a lot of sense. We don't expect to be searching on the data or support searches on the data but if we did that could be arranged using plv8js.

Resources