Core Data ordeal - ios

A few weeks ago, I decided to learn Core Data for my new project and apply it to my entire model. There was a steep learning curve, but eventually I got familiar with the stack and I'm now rather comfortable with at least the basic concepts and the few common pitfalls such as thread concurrency.
I have to say, the first few weeks after getting comfortable where pretty amazing. NSFetchedResultsController give you a good way to communicate between my model and my controllers. However the more I use Core Data, the more annoying it gets.
As a concrete example, my app fetches a few pieces of data from my server (the posts) which appear in a feed. Each post has an owner, of class User, which I also fetch from the server. Now, Core Data has been great for managing the realtionship between a post and a user. The relationship is updated automatically and getting the post's origin is as simple as calling post.owner. However, there are also inconveniences:
1.Core Data forces objects to the disk that I do not want forced to the disk. This is probably the main issue. With the posts, I do not want them to be forced to disk, and would rather make calls to the server again. Why? Because the more posts I store persistently, the more housekeeping there is to do. A post can be edited, deleted, flagged, etc... and keeping those posts locally means having to plan updates.
2.Having to constantly worry about concurrency of contexts, objects and the likes. I wrote an object factory that always returns objects on the right thread and the right context, but even then bugs occur here and there, which quickly becomes frustrating.
3.Decreased performance. Perhaps the least important one at this point, going from cached objects to Core Data has taken a (barely noticeable) toll on the performance of my application (most notably the feed).
So what are your recommendations regarding Core Data? Would you suggest a different approach to Core Data?
I was thinking of a hybrid caching + Core Data where I store the information I will actually use many times (such as users) persistently and then use the RAM for things like posts, or simply creating posts without an NSManagedContext. Input welcome!

Core Data forces objects to the disk that I do not want forced to the disk.
It does no such thing. If you don't want to save your Post objects to the persistent store, don't put them in Core Data and don't make them managed objects. Your User object can have a posts property even if the Post object is not managed by Core Data. Managed objects can have properties of any type, not only to other managed objects.
Having to constantly worry about concurrency of contexts, objects and the likes.
Concurrency is complex no matter how you model your data. It's a fundamentally complex problem. You're encountering it with Core Data because you're using Core Data. If you use something else, you'll deal with it there.
Decreased performance.
"Product" menu --> "Analyze" and run Instruments to find out why. There's no reason this should happen, and you have the tools to discover what's actually going on.

Related

managed objects vs. business objects

I'm trying to figure out how to use Core Data in my App. I already have in mind what the object graph would be like at runtime:
An Account object owns a TransactionList object.
A TransactionList object contains all the transactions of the account. Rather than being a flat list, it organizes transactions per day. So it contains a list of DailyTransactions objects sorted by date.
A DailyTransactions contains a list of Transaction objects which occur in a single day.
At first I thought Core Data was an ORM so I thought I might just need two tables: Account table and Transaction table which contained all transactions and set up the above object graph (i.e., organizing transactions per date and generating DailyTransactions objects, etc.) using application code at run time.
When I started to learn Core Data, however, I realized Core Data was more of an object graph manager than an ORM. So I'm thinking about using Core Data to implement above runtime object relationship directly (it's not clear to me what's the benefit but I believe Core Data must have some features that will be helpful).
So I'm thinking about a data model in Core Data like the following:
Acount <--> TransactionList -->> DailyTransactions -->> Transaction
Since I'm still learning Core Data, I'm not able to verify the design yet. I suppose this is the right way to use Core Data. But doesn't this put too many implementation details, instead of raw data, in persistent store? The issue with saving implementation details, I think, is that they are far more complex than raw data and they may contain duplicate data. To put it in another way, what exactly does the "data" in data model means, raw data or any useful runtime objects?
An alternative approach is to use Core Data as ORM by defining a data model like:
Account <-->> Transactions
and setting up the runtime object graph using application code. This leads to more complex application code but simpler database design (I understand user doesn't need to deal with database directly when using Core Data, but still it's good to have a simpler system). That said, I doubt this is not the right way to use Cord Data.
A more general question. I did little database programming before, but I had the impression that there was usually a business object layer above plain old data object layer in server side programming framework like J2EE. In those architectures, objects that encapsulate application business are not same as the objects loaded from database. It seems that's not the case with Core Data?
Thanks for any explanations or suggestions in advance.
(Note: the example above is an simplification. A transaction like transfer involves two accounts. I ignore that detail for simplification.)
Now that I read more about the Core Data, I'll try to answer my own question since no one did it. I hope this may help other people who have the same confusion as I did. Note the answer is based on my current (limited) understanding.
1. Core Data is an object graph manager for data to be persistently stored
There are a lot articles on the net emphasizing that Core Data manages object graph and it's not an ORM or database. While they might be technically correct, they unfortunately cause confusion to beginner like me. In my opinion, it's equally important to point out that objects managed by Core Data are not arbitrary runtime objects but those that are suitable for being saved in database. By suitable it means all these objects conform to principles of database schema design.
So, what' a proper data model is very much a database design question (it's important to point out this because most articles try to ask their readers to forget about database).
For example, in the account and transactions example I gave above, while I'd like to organize transactions per day (e,g., putting them in a two-level list, first by date, then by transaction timestamp) at runtime. But the best practice in database design is to save all transactions in a single table and generating the two-level list at runtime using application code (I believe so).
So the data model in Core Data should be like:
Account <->> Transaction
The question left is where I can add the code to generate the runtime structure (e.g., two-level list) I'd like to have. I think it's to extend Account class.
2. Constraints of Core Data
The fact that Core Data is designed to work with database (see 1) explains why it has some constraints on the data model design (i.e., attribute can't be of an arbitrary type, etc.).
While I don't see anyone mentioned this on the net, personally I think relationship in Core Data is quite limited. It can't be of a custom type (e.g, class) but has to be a variable (to-one) or an array (to-many) at run time. That makes it far less expressive. Note: I guess it's so due to some technical reason. I just hope it could be a class and hence more flexible.
For example, in my App I actually have complex logic between Account and its Transaction and want to encapsulate it into a single class. So I'm thinking to introduce an entity to represent the relationship explicitly:
Account <->> AccountTranstionMap <-> Transaction
I know it's odd to do this in Core Data. I'll see how it works and update the answer when I finish my app. If someone knows a better way to not do this, please let me know!
3. Benefits of Core Data
If one is writing a simple App, (for example, an App that data modal change are driven by user and hence occurs in sequence and don't have asynchronous data change from iCloud), I think it's OK to ignore all the discussions about object graph vs ORM, etc. and just use the basic features of Core Data.
From the documents I have read so far (there are still a lot I haven't finished), the benefits of Core Data includes automatic mutual reference establishment and clean up, live and automatically updated relationship property value, undo, etc. But if your App is not complex, it might be easier to implement these features using application code.
That said, it's interesting to learn a new technology which has limitation but at the same time can be very powerful in more complex situations. BTW, just curious, is there similar framework like Core Data on other platforms (either open source or commercial)? I don't think I read about similar things before.
I'll leave the question open for other answers and comments :) I'll update my answer when I have more practical experience with Core Data.

UIManagedDocument vs NSManagedObject: Performance

I'm trying to write a iOS note taking app that is blazingly fast for a large number of notes and that syncs without ever blocking the UI. (Don't worry, it's just a learning project, I know there are a billion note apps for iOS). I have decided to use Core Data (mostly because of the excellent posts by Brent Simmons about Vesper). I know UIManagedDocument can do async reads and writes and has a lot of functionality built in, so I'm wondering if there is any information on which would be faster for a fairly simple notes app. I can't really find a lot of information about people using UIManagedDocuments for anything other than a centralized, basically singleton, persistent store. Is it suitable for 1000s of documents? Would it be faster or slower than just a database of NSManagedObjects? It seems like most information I can find about Core Data is oriented towards people using NSManagedObject, so any information about UIManagedDocuments being used in production apps would be really helpful. At this point, the only thing I can think of is to just write the whole app both ways, load 10,000 notes into it, and see what happens.
Update
To clarify, I'm not learning iOS development and Objective-C, the "learning project" mostly means that I've never used Core Data and would like to learn how to write a really performant Core Data application.
UIManagedDocument is designed/intended for document based applications. One UIManagedDocument instance per document. If you are not building a document based application then you should not be using UIManagedDocument.
Everything that people like about UIManagedDocument can be accomplished with very little effort using the Core Data stack directly. UIManagedDocument abstracts you away from what your persistence layer is doing. Something you really do not want.
If you want a high performance Core Data application you do not want to be using UIManagedDocument. You will run into issues with it. It will do things at random times and cause performance issues.
You are far, far, better off learning the framework properly.
In the case of Vesper, those are not documents; they are too small. Think of documents as Word files, or Excel files. Large complicated data structures that are 100% isolated from each other.
Also, whether you use a UIManagedDocument or not, you will be using NSManagedObject instances. NSManagedObject, NSManagedObjectContext, NSPersistentStoreCoordinator are all foundational objects in Core Data. UIManagedDocument is just an abstraction layer on top.
Finally, Core Data is not a database. Thinking of it that way will get you into a jam. Core Data is an object model that can persist to disk and one of the persistence formats happens to be SQLite.
Update (Running Into Problems)
UIManagedDocument is an abstraction on top of Core Data. To use UIManagedDocument you actually need to learn more about Core Data than if you just used the primary Core Data stack.
UIManagedDocument uses a parent/child context internally. Don't understand that yet? See the point above. What it also means is that your requests for it to save are "taken under advisement" as opposed to being saved right then and there. This can lead to unexpected results if you don't understand the point of it or don't want it to save when it feels like it.
UIManagedDocument uses asynchronous saves and at most you can request that it save. Doesn't mean it is going to save now, nor does it mean you can easily stop and wait for the save to complete. You need to trust that it will complete it. In addition, it may decide to save at an inopportune moment.
When you start looking for performance gains with Core Data you tend to want to build the stack in a very specific way to maximize the benefit to your application. That is application dependent and with the abstracts in UIManagedDocument you get limited very quickly.
Even in a situation where I was building a document based application I would still not use UIManagedDocument. Just to much behind the curtain.
Performance is likely to come down to how the rest of your code is implemented, and not necessarily the difference between a UIManagedDocument or a NSManagedObject.
Think of UIManagedDocument as a specific niche implementation of CoreData, that already has the parts of CoreData you want baked into it's structure to save you (the developer) a little bit of time writing code. It's been purpose built for handling UIDocuments, and multi-threading.
Under the hood, it's likely that UIManagedDocument is using CoreData as well as you would (assuming you know what you're doing), but it's theoretically possible you could cut some corners by knowing the exact and inane details of your implementation.
If you're new to CoreData, or GCD and NSOperationQueue, then you'll likely save a ton of developer time by leveraging UIManagedDocument instead of rolling your own.
A very apt analogy would be to using a NSFetchedResultsController to run your UITableView, rather then rolling your own CoreData and UITableView implementation.
If you're new to objective-C, I'd recommend you bang out something functional with UIManagedDocument at first. Later you get lost in the weeds of dispatch_asynch() and NSFetchRequest, and pick up a few milliseconds of performance here and there.
Cheers!
Just store a plain text file on the disk for each note.
Keep a separate database (using sqlite or perhaps just -[NSDictionary writeToFile:atomically]) to store metadata, such as the last user modification date and the last server sync date for each note.
Periodically talk to the server, asking it for a list of stuff that has changed since the last time you did a sync, and sending any data on your end that has changed in the same time period.
This should be perfectly fast as long as you have less than a million note files. Do all network and filesystem operations on an NSOperationQueue.

What shouldn't I store in core data?

So this is more of an application design question. But I think it can be 'answered' and not just discussed. :)
I'm using RestKit for an application we're building. It obviously makes it super easy to put stuff into either straight objects or core data objects.
In the specific instance I'm dealing with, we have comments, much like comments on a facebook post.
Now, the nicest thing about storing these comments in core data is that with NSFRC I can sort them super easily and deal with updating/inserting automatically into the right spots into the timeline. But there's a couple sticking points there as well.
For instance with infinite loading, I now have to manage loading the comments in between the new most recent comments and the old stored comments. (Maybe the first time I grabbed 25, but there has been 100 new comments since then. So I retrieve the latest 25 first, then have to have an auto load cell in between the new comments and the old until I run into those, then have to paginate any after.
Aside from that, then you are also storing potentially thousands of comments in core data. Maybe it's not a big deal for quite a long time, but eventually you might want to start cleaning up old comments with a GCD task.
So what are the leading thoughts on what to store in core data, and what to keep as transient objects. (Maybe storing those in a cache like NSCache or the new Tumblr cache https://github.com/tumblr/TMCache).
Edit
Ok maybe I should clarify a little here. I get the purpose of Core Data... for persisting across app restarts and having an object graph with relationships. I make plenty of use of it. I guess what I'm wondering about here is the grey areas where I would like things to persist for the sake of not always having to wait for a network call, and offline availability.
But much like stories and comments on facebook, there are always going to be a constant stream of new ones coming in, and you don't necessarily care about 300 comments on an old post. Someone could come back to view comments on their 'post' quite a few times, or someone may just be browsing 'posts' and comments casually, and never coming back to them.
So I'm just trying to consider the strategy for something like this where you have potentially lots of entities (comments) coming down from a service. Sometimes people will want to view them several times (their own 'post') and sometimes they are just browsing through. When trying to see how others do this, it seems some stuff it all into core data, some (like Facebook) seem to store 25-50 most recent in the db, and any beyond that are transient (they probably are clearing out older stories and comments regularly too.)
Core Data is not designed to be used as "dumb data storage", but rather object persistence. So, anything that you want to persist between uses of your app should go into Core Data.
If you are using Core Data properly, it will take care of all of your caching for you as well.
EDIT:
Anything that is going to change too often for your taste or that you just don't want to permanently store, NSCache may be a better option. If you don't think your user will look at it again tomorrow, leave those bits out of your persistence. (IMHO)
Create a scond repository. Either select a time period that is known as 'recent' or provide a preference for such. Periodically look at the primary repository and find objects now older than the recent range, and move those objects to the scond repository.
Then provide users the means to search in just recent or all.
If all they want are recent values the searches should be faster, and nothing is lost.

Core Data requirement in a server side json app

I am trying to build an app for my website. I was under the impression that all I had to do was get and put json from the server to my phone. But then today I came across Core Data ( I am new to iOS).
So my question is, that to make a faster iOS app, is it a normal practice to fetch json data and save it as core data? Would apps like Facebook, Twitter follow this approach? or is just fetching and parsing json is a normal practice, and core data is not needed.
I am sorry if the question is dumb.
It is normal to retrieve data from a server (XML or JSON) and keep it in memory, if the memory foot print is reasonable. If you are talking about hundreds upon thousands of rows from a database, then persistent storage, with a dedicated data model(s) is probably the best choice; you can read it when needed.
If your needs are such that a complex data model is needed, one-to-many and/or many-to-many relationships, then consider Core Data (or SQLite directly).
You define your needs first, then try to define the data model that fits your needs (custom objects or maybe just a few instances of NSDictionary), then decide how that data needs to persist and how you plan on interacting with that data.
A few starting points:
Core Data Overview - Shoud help you decide if you should use it
RestKit - Just a suggestion
Tutorial on Data Persistence
Good luck.
I remember facing this anomaly not so long ago.
As already pointed out in some of the comments, it depends on your needs.
Not all data are eligible to be saved in core data after retrieving it from the web side. You might have integrity issues with that. To do the checks for large chunks of data might have even severe overheads. But if you feel that certain data are not likely to change very often then you can employ this technique for some portions.
If you decide to stick with Request/Fetch data, be sure you process the requests using NSOperation, GCD or NSThread, in order to avoid UI freezes.
Although, they are used for same purpose, they all have advantages and disadvantage, plz check out this topic on NSOperation vs Grand Central Dispatch
I hope this helps.

Keeping Core Data Objects in multiple stores

I'm developing an iOS application using Core Data. I want to have the persistent store located in a shared location, such as a network drive, so that multiple users can work on the data (at different times i.e. concurrency is not part of the question).
But I also want to offer the ability to work on the data "offline", i.e. by keeping a local persistent store on the iPad. So far, I read that I could do this to some degree by using the persistent store coordinator's migration function, but this seems to imply the old store is then invalidated. Furthermore, I don't necessarily want to move the complete store "offline", but just a part of it: going with the simple "company department" example that Apple offers, I want users to be able to check out one department, along with all the employees associated with that department (and all the attributes associated with each employee). Then, the users can work on the department data locally on their iPad and, some time later, synchronize those changes back to the server's persistent store.
So, what I need is to copy a core data object from one store to another, along with all objects referenced through relationships. And this copy process needs to also ensure that if an object already exists in the target persistent store, that it's overwritten rather than a new object added to the store (I am already giving each object a UID for another reason, so I might be able to re-use the UID).
From all I've seen so far, it looks like there is no simple way to synchronize or copy Core Data persistent stores, is that a fair assessment?
So would I really need to write a piece of code that does the following:
retrieve object "A" through a MOC
retrieve all objects, across all entities, that have a relationship to object "A"
instantiate a new MOC for the target persistent store
for each object retrieved, check the target store if the object exists
if the object exists, overwrite it with the attributes from the object retrieved in steps 1 & 2
if the object doesn't exist, create it and set all attributes as per object retrieved in steps 1 & 2
While it's not the most complicated thing in the world to do, I would've still thought that this requirement for "online / offline editing" is common enough for some standard functionality be available for synchronizing parts of persistent stores?
Your point of views greatly appreciated,
thanks,
da_h-man
I was just half-kidding with the comment above. You really are describing a pretty hard problem - it's very difficult to nail this sort of synchronization, and there's seldom, in any development environment, going to be a turn-key solution that will "just work". I think your pseudo-code description above is a pretty accurate description of what you'll need to do. Although some of the work of traversing the relationships and checking for existing objects can be generalized, you're talking about some potentially complicated exception handling situations - for example, if updating an object, and only 1 out 5 related objects is somehow out of date, do you throw away the update or apply part of it? You say "concurrency" is not a part of the question, but if multiple users can "check out" objects at the same time, unless you plan to have a locking mechanism on those, you would start having conflicts when trying to make updates.
Something to check into are the new features in Core Data for leveraging iCloud - I doubt that's going to help with your problem, but it's generally related.
Since you want to be out on the network with your data, another thing to consider is whether Core Data is the right fit to your problem in general. Since Core Data is very much a technology designed to support the UI and MVC pattern in general, if your data needs are not especially bound to the UI, you might consider another type of DB solution.
If you are in fact leveraging Core Data in significant ways beyond just modeling, in terms of driving your UI, and you want to stick with it, I think you are correct in your analysis: you're going to have to roll your own solution. I think it will be a non-trivial thing to build and test.
An option to consider is CouchDB and an iOS implementation called TouchDB. It would mean adopting more of a document-oriented (JSON) approach to your problem, which may in fact be suitable, based on what you've described.
From what I've seen so far, I reckon the best approach is RestKit. It offers a Core Data wrapper that uses JSON to move data between remote and local stores. I haven't fully tried it yet, but from what the documentation reads, it sounds quite powerful and ideally suited for my needs.
You definetly should check these things:
Parse.com - cloud based data store
PFIncrementalStore https://github.com/sbonami/PFIncrementalStore - subclass of NSIncrementalStore which allows your Persistent Store Coordinator to store data both locally and remotely (on Parse Cloud) at the same time
All this stuff are well-documented. Also Parse.com is going to release iOS local datastore SDK http://blog.parse.com/2014/04/30/take-your-app-offline-with-parse-local-datastore/ wich is going to help keep your data synced.

Resources