Where can I find concrete examples of advanced Core Data concurrency? By advanced I mean operations on contexts and NSManagedObjects that run simultaneously on two or more threads, where every thread can both read and change objects. Each thread saves its context and listens for changes made on other threads, everyone merges changes properly, nothing crashes, there are no inconsistency exceptions, and everything is done as it should be.
I have read Apple's official documentation on Core Data concurrency; now I'm looking for code examples, tutorials, books, or at least some more detailed information on how to handle this type of scenario.
There is a really nice blog post from Cocoanetics:
Multi Context Core Data
I have also created a GitHub repo for the async saving example:
Multi Context Core Data GitHub
A few weeks ago, I decided to learn Core Data for my new project and apply it to my entire model. There was a steep learning curve, but eventually I got familiar with the stack and I'm now rather comfortable with at least the basic concepts and the few common pitfalls such as thread concurrency.
I have to say, the first few weeks after getting comfortable were pretty amazing. NSFetchedResultsController gives me a good way to communicate between my model and my controllers. However, the more I use Core Data, the more annoying it gets.
As a concrete example, my app fetches a few pieces of data from my server (the posts) which appear in a feed. Each post has an owner, of class User, which I also fetch from the server. Now, Core Data has been great for managing the relationship between a post and a user. The relationship is updated automatically and getting the post's owner is as simple as calling post.owner. However, there are also inconveniences:
1. Core Data forces objects to the disk that I do not want forced to the disk. This is probably the main issue. With the posts, I do not want them to be forced to disk, and would rather make calls to the server again. Why? Because the more posts I store persistently, the more housekeeping there is to do. A post can be edited, deleted, flagged, etc., and keeping those posts locally means having to plan updates.
2. Having to constantly worry about concurrency of contexts, objects and the like. I wrote an object factory that always returns objects on the right thread and the right context, but even then bugs occur here and there, which quickly becomes frustrating.
3. Decreased performance. Perhaps the least important one at this point: going from cached objects to Core Data has taken a (barely noticeable) toll on the performance of my application (most notably the feed).
So what are your recommendations regarding Core Data? Would you suggest a different approach to Core Data?
I was thinking of a hybrid caching + Core Data approach where I store the information I will actually use many times (such as users) persistently and then use the RAM for things like posts, or simply create posts without an NSManagedObjectContext. Input welcome!
Core Data forces objects to the disk that I do not want forced to the disk.
It does no such thing. If you don't want to save your Post objects to the persistent store, don't put them in Core Data and don't make them managed objects. Your User object can have a posts property even if the Post object is not managed by Core Data. Managed objects can have properties of any type, not only relationships to other managed objects.
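As a rough sketch of that idea (the class layout here is an assumption for illustration, not the asker's actual model): Post is a plain NSObject that never enters a context, and User is a managed object whose posts property is backed by an ordinary instance variable instead of a model attribute. The name attribute is assumed to exist in the data model.

```objc
#import <CoreData/CoreData.h>

// Plain, unmanaged model object. It is never inserted into a context,
// so Core Data never writes it to the persistent store.
@interface Post : NSObject
@property (nonatomic, copy) NSString *text;
@end

@implementation Post
@end

// Managed object. "name" is assumed to be an attribute in the model;
// "posts" is NOT in the model and lives only in memory.
@interface User : NSManagedObject
@property (nonatomic, copy) NSString *name;   // persisted by Core Data
@property (nonatomic, copy) NSArray *posts;   // array of Post, in-memory only
@end

@implementation User
@dynamic name;               // accessors are supplied by Core Data at runtime
@synthesize posts = _posts;  // ordinary ivar-backed property
@end
```

One caveat with this approach: the in-memory posts array lives only as long as that particular User instance does, so it has to be repopulated (for example from the server) if the context releases the object and later re-creates it. Declaring a transient attribute in the model is another option if you want Core Data to track the property without saving it.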
Having to constantly worry about concurrency of contexts, objects and the likes.
Concurrency is complex no matter how you model your data. It's a fundamentally complex problem. You're encountering it with Core Data because you're using Core Data. If you use something else, you'll deal with it there.
Decreased performance.
Use "Product" menu --> "Profile" to run Instruments and find out why. There's no reason this should happen, and you have the tools to discover what's actually going on.
The only (and most recent) results I'm finding about best practices are here: https://developer.apple.com/library/ios/documentation/Cocoa/Conceptual/CoreData/Articles/cdConcurrency.html
However, the very top of the page says,
"Important: Best practices for concurrency with Core Data have changed dramatically since this document was written; please note that this chapter does not represent current recommendations."
Where can I find more current documentation for concurrency with Core Data?
The best discussion is now under 'Concurrency' within the NSManagedObjectContext documentation.
My summary:
Thread confinement is still required. The big changes introduced by iOS 5/OS X v10.7 were that contexts can now have other contexts as parents and can manage their own serial queues.
Changes are automatically migrated from children to parent upon a save. That's now what save means. Only if your parent is the persistent store are you actually committing to disk.
So all that stuff about synchronising by notifications is what Apple doesn't want you to follow. All of those mechanisms are still available but Apple has pulled the most common patterns directly into the framework.
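To make the "save pushes changes to the parent" point concrete, here is a minimal sketch. The function name is hypothetical, and it assumes the child was created with NSPrivateQueueConcurrencyType and that its parent is a queue-based context attached directly to the persistent store coordinator.

```objc
#import <CoreData/CoreData.h>

// Saves a child context, then its parent.
// Assumptions: `child` uses NSPrivateQueueConcurrencyType, and
// child.parentContext is a queue-based context that sits directly on the
// persistent store coordinator.
static void SaveChildThenParent(NSManagedObjectContext *child) {
    [child performBlock:^{
        NSError *error = nil;
        // This only pushes the child's changes up into the parent context.
        // Nothing has been written to disk at this point.
        if (![child save:&error]) {
            NSLog(@"Child save failed: %@", error);
            return;
        }

        NSManagedObjectContext *parent = child.parentContext;
        [parent performBlock:^{
            NSError *parentError = nil;
            // The parent has no parent of its own, so this save commits
            // the changes to the persistent store.
            if (parent.hasChanges && ![parent save:&parentError]) {
                NSLog(@"Parent save failed: %@", parentError);
            }
        }];
    }];
}
```

Each context is only touched inside its own performBlock:, which is what keeps the two saves on the right queues.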
The Core Data Programming Guide has been updated for iOS 9/OS X El Capitan. See https://developer.apple.com/library/prerelease/ios/documentation/Cocoa/Conceptual/CoreData/Concurrency.html.
I also found WWDC 2014 session 225 What's new in Core Data (at 22:50) very helpful in understanding both current and historic concurrency methods.
Thread confinement has been obsoleted. You can see this in the header for NSManagedObjectContext:
NSConfinementConcurrencyType = 0x00, /* this option is obsolete and not recommended for new code. */
When a context is created with -init, it calls the initializer -initWithConcurrencyType: with the argument NSConfinementConcurrencyType. This is the threading model described in the Core Data Programming Guide section on concurrency, which has been obsolete and not recommended for some time. In the words of one Core Data engineer, "It just didn't work."
Unfortunately that Core Data Programming Guide has not been updated to describe the current recommended best practices for concurrency and other advancements.
But hey, at least it's not telling you to use locking!
The Incremental Store Programming Guide has been updated recently. It describes how to implement an NSIncrementalStore, and in doing so does a very good job of explaining some of Core Data's internals. For example, it describes what a fault is and how faults are fired far better than the Core Data programming guide ever did. The Core Data release notes for the past several years have included some updated information about best practices, and there are several tech notes that are relevant to Core Data.
The best information in the last several years has been the yearly "What's New In Core Data" sessions at WWDC. For concurrency, you should check out these WWDC sessions:
WWDC 2011 What's new in Core Data on MacOS X. The MacOS X session was a little more detailed than the iOS session.
and
WWDC 2012 Core Data Best Practices
I would encourage you to use the "Feedback" button on the Core Data Programming Guide pages, or file a radar bug asking for the documentation to be updated.
The guide that I've been following is here: http://www.cocoanetics.com/2012/07/multi-context-coredata/
Even though it was written in 2012, it still seems to match what I've seen currently being used. It helps you set up a main parent context with NSMainQueueConcurrencyType, multiple background contexts with NSPrivateQueueConcurrencyType, and also a background writer context that has the persistent store.
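For readers who want to see the shape of that setup, here is a minimal sketch of the three-level stack. It is not the code from the linked article; the class name CoreDataStack and its method names are made up for illustration, and the coordinator is assumed to be fully configured with a store already added.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper that wires up writer -> main -> worker contexts.
@interface CoreDataStack : NSObject
@property (nonatomic, strong, readonly) NSManagedObjectContext *writerContext;
@property (nonatomic, strong, readonly) NSManagedObjectContext *mainContext;
- (instancetype)initWithCoordinator:(NSPersistentStoreCoordinator *)coordinator;
- (NSManagedObjectContext *)newWorkerContext;
@end

@implementation CoreDataStack

- (instancetype)initWithCoordinator:(NSPersistentStoreCoordinator *)coordinator {
    self = [super init];
    if (self) {
        // Background writer: the only context attached to the store, so
        // disk I/O happens on its private queue.
        _writerContext = [[NSManagedObjectContext alloc]
            initWithConcurrencyType:NSPrivateQueueConcurrencyType];
        _writerContext.persistentStoreCoordinator = coordinator;

        // Main-queue context for the UI, child of the writer.
        _mainContext = [[NSManagedObjectContext alloc]
            initWithConcurrencyType:NSMainQueueConcurrencyType];
        _mainContext.parentContext = _writerContext;
    }
    return self;
}

// Each background job gets its own private-queue child of the main context.
- (NSManagedObjectContext *)newWorkerContext {
    NSManagedObjectContext *worker = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    worker.parentContext = self.mainContext;
    return worker;
}

@end
```

With this layout a worker save lands in the main context, a main context save lands in the writer, and only the writer's save reaches the disk.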
Xcode 6 and Yosemite also seem to have an improvement in Core Data debugging: http://oleb.net/blog/2014/06/core-data-concurrency-debugging/. This was useful in understanding how to use performBlock: in the right places.
I'm trying to write an iOS note taking app that is blazingly fast for a large number of notes and that syncs without ever blocking the UI. (Don't worry, it's just a learning project, I know there are a billion note apps for iOS). I have decided to use Core Data (mostly because of the excellent posts by Brent Simmons about Vesper). I know UIManagedDocument can do async reads and writes and has a lot of functionality built in, so I'm wondering if there is any information on which would be faster for a fairly simple notes app. I can't really find a lot of information about people using UIManagedDocuments for anything other than a centralized, basically singleton, persistent store. Is it suitable for 1000s of documents? Would it be faster or slower than just a database of NSManagedObjects? It seems like most information I can find about Core Data is oriented towards people using NSManagedObject, so any information about UIManagedDocuments being used in production apps would be really helpful. At this point, the only thing I can think of is to just write the whole app both ways, load 10,000 notes into it, and see what happens.
Update
To clarify, I'm not learning iOS development and Objective-C, the "learning project" mostly means that I've never used Core Data and would like to learn how to write a really performant Core Data application.
UIManagedDocument is designed/intended for document based applications. One UIManagedDocument instance per document. If you are not building a document based application then you should not be using UIManagedDocument.
Everything that people like about UIManagedDocument can be accomplished with very little effort using the Core Data stack directly. UIManagedDocument abstracts you away from what your persistence layer is doing. Something you really do not want.
If you want a high performance Core Data application you do not want to be using UIManagedDocument. You will run into issues with it. It will do things at random times and cause performance issues.
You are far, far, better off learning the framework properly.
In the case of Vesper, those are not documents; they are too small. Think of documents as Word files, or Excel files. Large complicated data structures that are 100% isolated from each other.
Also, whether you use a UIManagedDocument or not, you will be using NSManagedObject instances. NSManagedObject, NSManagedObjectContext, NSPersistentStoreCoordinator are all foundational objects in Core Data. UIManagedDocument is just an abstraction layer on top.
Finally, Core Data is not a database. Thinking of it that way will get you into a jam. Core Data is an object model that can persist to disk and one of the persistence formats happens to be SQLite.
Update (Running Into Problems)
UIManagedDocument is an abstraction on top of Core Data. To use UIManagedDocument you actually need to learn more about Core Data than if you just used the primary Core Data stack.
UIManagedDocument uses a parent/child context internally. Don't understand that yet? See the point above. What it also means is that your requests for it to save are "taken under advisement" as opposed to being saved right then and there. This can lead to unexpected results if you don't understand the point of it or don't want it to save when it feels like it.
UIManagedDocument uses asynchronous saves and at most you can request that it save. Doesn't mean it is going to save now, nor does it mean you can easily stop and wait for the save to complete. You need to trust that it will complete it. In addition, it may decide to save at an inopportune moment.
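To illustrate what "requesting" a save looks like, here is a small sketch using the UIDocument API that UIManagedDocument inherits. The two function names are hypothetical, and the document is assumed to be already open and used from the main thread.

```objc
#import <UIKit/UIKit.h>
#import <CoreData/CoreData.h>

// Marks the document as changed and leaves the timing to autosave.
// The save happens whenever the document decides to perform it.
static void HintThatDocumentChanged(UIManagedDocument *document) {
    [document updateChangeCount:UIDocumentChangeDone];
}

// Asks for a save explicitly. The call returns immediately; the save runs
// asynchronously and the handler reports the outcome some time later.
static void RequestDocumentSave(UIManagedDocument *document) {
    [document saveToURL:document.fileURL
       forSaveOperation:UIDocumentSaveForOverwriting
      completionHandler:^(BOOL success) {
          NSLog(@"Document save %@", success ? @"finished" : @"failed");
      }];
}
```

In both cases the calling code does not block until the data is on disk, which is the behaviour described above.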
When you start looking for performance gains with Core Data you tend to want to build the stack in a very specific way to maximize the benefit to your application. That is application dependent, and with the abstractions in UIManagedDocument you get limited very quickly.
Even in a situation where I was building a document based application I would still not use UIManagedDocument. Just too much behind the curtain.
Performance is likely to come down to how the rest of your code is implemented, and not necessarily the difference between a UIManagedDocument or a NSManagedObject.
Think of UIManagedDocument as a specific niche implementation of Core Data that already has the parts of Core Data you want baked into its structure to save you (the developer) a little bit of time writing code. It's been purpose built for handling UIDocuments and multi-threading.
Under the hood, it's likely that UIManagedDocument is using Core Data as well as you would (assuming you know what you're doing), but it's theoretically possible you could cut some corners by knowing the exact and inane details of your implementation.
If you're new to Core Data, or GCD and NSOperationQueue, then you'll likely save a ton of developer time by leveraging UIManagedDocument instead of rolling your own.
A very apt analogy would be using an NSFetchedResultsController to run your UITableView, rather than rolling your own Core Data and UITableView implementation.
If you're new to Objective-C, I'd recommend you bang out something functional with UIManagedDocument at first. Later you can get lost in the weeds of dispatch_async() and NSFetchRequest, and pick up a few milliseconds of performance here and there.
Cheers!
Just store a plain text file on the disk for each note.
Keep a separate database (using SQLite or perhaps just -[NSDictionary writeToFile:atomically:]) to store metadata, such as the last user modification date and the last server sync date for each note.
Periodically talk to the server, asking it for a list of stuff that has changed since the last time you did a sync, and sending any data on your end that has changed in the same time period.
This should be perfectly fast as long as you have fewer than a million note files. Do all network and filesystem operations on an NSOperationQueue.
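A minimal sketch of the file-per-note idea, under some assumptions that are mine and not part of the answer: the function name, the "metadata.plist" file name, and the "modified" key are all made up, and ioQueue is assumed to be a serial queue (maxConcurrentOperationCount = 1) so that two saves never rewrite the metadata file at the same time.

```objc
#import <Foundation/Foundation.h>

// Writes one note to "<noteID>.txt" and records its modification date in a
// small metadata plist, all off the main thread.
// Assumption: `ioQueue` is serial, so the read-modify-write of the metadata
// file below cannot interleave with another save.
static void SaveNote(NSString *noteID,
                     NSString *text,
                     NSURL *directory,
                     NSOperationQueue *ioQueue) {
    [ioQueue addOperationWithBlock:^{
        NSString *fileName = [noteID stringByAppendingPathExtension:@"txt"];
        NSURL *noteURL = [directory URLByAppendingPathComponent:fileName];

        NSError *error = nil;
        if (![text writeToURL:noteURL
                   atomically:YES
                     encoding:NSUTF8StringEncoding
                        error:&error]) {
            NSLog(@"Could not write note %@: %@", noteID, error);
            return;
        }

        // Load the existing metadata, or start fresh if there is none yet.
        NSURL *metaURL = [directory URLByAppendingPathComponent:@"metadata.plist"];
        NSMutableDictionary *metadata =
            [NSMutableDictionary dictionaryWithContentsOfURL:metaURL]
                ?: [NSMutableDictionary dictionary];

        metadata[noteID] = @{ @"modified" : [NSDate date] };
        if (![metadata writeToURL:metaURL atomically:YES]) {
            NSLog(@"Could not write metadata for note %@", noteID);
        }
    }];
}
```

A sync pass could then compare the stored dates against the last server sync date to decide which notes to upload.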
I have just begun learning Core Data. When it comes to multithreading, some blogs say that we should use child contexts (by creating a context and setting its parent) and just invoke the performBlock: method. However, some other blogs say that we should avoid this approach since it has introduced many bugs.
I have just begun developing an application that manipulates a large database, and the project manager voted for Core Data (instead of SQLite).
Could anyone please give me some directions? Should I use the child contexts strategy (introduced in iOS 5), or is there a better way to perform multithreading with Core Data?
Thanks.
Should I use the child contexts strategy (introduced in iOS 5),
or is there a better way to perform multithreading with Core Data?
In addition to the concept you mentioned, Managed Object Contexts have built-in concurrency support without parent contexts (see https://developer.apple.com/library/ios/releasenotes/DataManagement/RN-CoreData/index.html).
If you create one using initWithConcurrencyType:, you can use performBlock: and performBlockAndWait: and the threading will be handled for you, assuming you follow the basic patterns outlined in the link above. The parent/child context approach can help you with synchronization.
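As a rough sketch of those two calls (the function names and the idea of passing an entity name are assumptions for illustration; the coordinator is assumed to be already set up):

```objc
#import <CoreData/CoreData.h>

// Synchronous use: performBlockAndWait: runs the block on the context's own
// queue and returns only after it has finished, so a result can be handed back.
static NSUInteger CountObjects(NSPersistentStoreCoordinator *coordinator,
                               NSString *entityName) {
    NSManagedObjectContext *context = [[NSManagedObjectContext alloc]
        initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    context.persistentStoreCoordinator = coordinator;

    __block NSUInteger count = 0;
    [context performBlockAndWait:^{
        NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
        NSError *error = nil;
        NSUInteger result = [context countForFetchRequest:request error:&error];
        if (result == NSNotFound) {
            NSLog(@"Count failed: %@", error);
        } else {
            count = result;
        }
    }];
    return count;
}

// Asynchronous use: performBlock: enqueues the work and returns immediately.
// The context and its objects are only touched inside the block.
static void InsertObjectAsync(NSManagedObjectContext *context, NSString *entityName) {
    [context performBlock:^{
        [NSEntityDescription insertNewObjectForEntityForName:entityName
                                      inManagedObjectContext:context];
        NSError *error = nil;
        if (![context save:&error]) {
            NSLog(@"Save failed: %@", error);
        }
    }];
}
```

The rule of thumb in both cases is the same: never use a queue-based context, or objects fetched from it, outside of one of its perform blocks.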
There's also an NSOperation-based approach outlined here: http://www.objc.io/issue-2/common-background-practices.html. I personally wouldn't use it, because the built-in APIs are sufficient, but the article is very well written and should give you a good idea of what's going on.
How you implement this depends on the needs of your app.
some other blogs say that we should avoid this approach since it has
introduced many bugs.
I would ignore them, and focus on writing clean code for yourself. There are plenty of apps that use multithreading + Core Data without bugs.
I am trying to build an app for my website. I was under the impression that all I had to do was get and put JSON between the server and my phone. But then today I came across Core Data (I am new to iOS).
So my question is: to make a faster iOS app, is it normal practice to fetch JSON data and save it in Core Data? Would apps like Facebook and Twitter follow this approach? Or is just fetching and parsing JSON the normal practice, with Core Data not needed?
I am sorry if the question is dumb.
It is normal to retrieve data from a server (XML or JSON) and keep it in memory if the memory footprint is reasonable. If you are talking about hundreds upon thousands of rows from a database, then persistent storage with a dedicated data model is probably the best choice; you can read it when needed.
If your needs are such that a complex data model is needed, one-to-many and/or many-to-many relationships, then consider Core Data (or SQLite directly).
You define your needs first, then try to define the data model that fits your needs (custom objects or maybe just a few instances of NSDictionary), then decide how that data needs to persist and how you plan on interacting with that data.
A few starting points:
Core Data Overview - Should help you decide if you should use it
RestKit - Just a suggestion
Tutorial on Data Persistence
Good luck.
I remember facing this anomaly not so long ago.
As already pointed out in some of the comments, it depends on your needs.
Not all data is suitable for saving in Core Data after retrieving it from the server. You might have integrity issues with that, and running the checks on large chunks of data might add severe overhead. But if you feel that certain data is not likely to change very often, then you can employ this technique for those portions.
If you decide to stick with Request/Fetch data, be sure you process the requests using NSOperation, GCD or NSThread, in order to avoid UI freezes.
Although they are used for the same purpose, they all have advantages and disadvantages; please check out this topic on NSOperation vs Grand Central Dispatch
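For the GCD variant, a minimal sketch might look like this. The function name and completion signature are hypothetical, and the synchronous NSData download is used only to keep the example short; a real app would more likely use NSURLSession.

```objc
#import <Foundation/Foundation.h>

// Downloads and parses JSON on a background queue, then reports the result
// on the main queue so the caller can update the UI safely.
static void FetchJSON(NSURL *url, void (^completion)(id json, NSError *error)) {
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSError *error = nil;
        id json = nil;

        // Blocking download; acceptable here because this is not the main thread.
        NSData *data = [NSData dataWithContentsOfURL:url options:0 error:&error];
        if (data) {
            json = [NSJSONSerialization JSONObjectWithData:data options:0 error:&error];
        }

        // Hop back to the main queue before handing the result to the caller.
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(json, json ? nil : error);
        });
    });
}
```

The parsed result is typically an NSDictionary or NSArray, which can be kept in memory as is or mapped onto whatever model you decide on.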
I hope this helps.