Core Data — Find-or-Create Efficiently

Core Data — Find-or-Create Efficiently - ios

According to Apple's documentation(link)—
There are many situations where you may need to find existing objects
(objects already saved in a store) for a set of discrete input values.
A simple solution is to create a loop, then for each value in turn
execute a fetch to determine whether there is a matching persisted
object and so on. This pattern does not scale well. If you profile
your application with this pattern, you typically find the fetch to be
one of the more expensive operations in the loop (compared to just
iterating over a collection of items). Even worse, this pattern turns
an O(n) problem into an O(n^2) problem.
It is much more efficient—when possible—to create all the managed
objects in a single pass, and then fix up any relationships in a
second pass. For example, if you import data that you know does not
contain any duplicates (say because your initial data set is empty),
you can just create managed objects to represent your data and not do
any searches at all. Or if you import "flat" data with no
relationships, you can create managed objects for the entire set and
weed out (delete) any duplicates before save using a single large IN
predicate.
Question 1: Considering that my data I'm importing doesn't have any relationships, how do I implement what is described in the last line.
If you do need to follow a find-or-create pattern—say because you're
importing heterogeneous data where relationship information is mixed
in with attribute information—you can optimize how you find existing
objects by reducing to a minimum the number of fetches you execute.
How to accomplish this depends on the amount of reference data you
have to work with. If you are importing 100 potential new objects, and
only have 2000 in your database, fetching all of the existing and
caching them may not represent a significant penalty (especially if
you have to perform the operation more than once). However, if you
have 100,000 items in your database, the memory pressure of keeping
those cached may be prohibitive.
You can use a combination of an IN predicate and sorting to reduce
your use of Core Data to a single fetch request.
Example code:
// Get the names to parse in sorted order.
NSArray *employeeIDs = [[listOfIDsAsString componentsSeparatedByString:#"\n"]
sortedArrayUsingSelector: #selector(compare:)];
// create the fetch request to get all Employees matching the IDs
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
[fetchRequest setEntity:
[NSEntityDescription entityForName:#"Employee" inManagedObjectContext:aMOC]];
[fetchRequest setPredicate: [NSPredicate predicateWithFormat: #"(employeeID IN %#)", employeeIDs]];
// Make sure the results are sorted as well.
[fetchRequest setSortDescriptors:
#[ [[NSSortDescriptor alloc] initWithKey: #"employeeID" ascending:YES] ]];
// Execute the fetch.
NSError *error;
NSArray *employeesMatchingNames = [aMOC executeFetchRequest:fetchRequest error:&error];
You end up with two sorted arrays—one with the employee IDs passed
into the fetch request, and one with the managed objects that matched
them. To process them, you walk the sorted lists following these
steps:
Get the next ID and Employee. If the ID doesn't match the Employee ID,
create a new Employee for that ID. Get the next Employee: if the IDs
match, move to the next ID and Employee.
Question 2: In the above example, I get two sorted arrays as described above. Considering the worst case scenario where all the objects that are to be inserted are present in the store, I don't see anyway that I can solve the problem in O(n) time. Apple describes the two steps as above but that is an O(n^2) job. For any kth element in the input array, there might or might not exist an element that matches it in the first k elements in the output array. So in the worst case, the complexity will be O(nC2) = O(n^2).
So, what I believe Apple is doing is making sure that fetch only processes once even though there are O(n^2) checks required. If so, then I'll go with this; but is there any other way of doing this efficiently.
Please understand, that I don't want to fetch again and again - fetch once for an input array of size 100 identifiers.

Ad. 1 The fact of having relationships isn't important here. This explanation only says that if you download your data from e.g. a remote server and your items have some IDs, then you can fetch them all from the persistent store in one request, instead of fetching each object in a separate request.
Ad. 2
Apple describes the two steps as above but that is an O(n^2) job.
It's not. Please read these lines carefully:
To process them, you walk the sorted lists following these steps:
Get the next ID and Employee. If the ID doesn't match the Employee ID,
create a new Employee for that ID. Get the next Employee: if the IDs
match, move to the next ID and Employee.
You walk the arrays/lists simultaneously, so you never have to make this check: "there might or might not exist an element that matches it in the first k elements in the output array." You don't need to check previous elements as they're sorted and they certainly won't contain the object you're interested in.

If anyone is looking for the original Apple documentation there is snapshot here:
http://web.archive.org/web/20150908024050/https://developer.apple.com/library/mac/documentation/cocoa/conceptual/coredata/articles/cdimporting.html

Related

Core Data sectionNameKeyPath with Relationship Attribute Performance Issue

I have a Core Data Model with three entities:
Person, Group, Photo with relationships between them as follows:
Person <<-----------> Group (one to many relationship)
Person <-------------> Photo (one to one)
When I perform a fetch using the NSFetchedResultsController in a UITableView, I want to group in sections the Person objects using the Group's entity name attribute.
For that, I use sectionNameKeyPath:#"group.name".
The problem is that when I'm using the attribute from the Group relationship, the NSFetchedResultsController fetches everything upfront in small batches of 20 (I have setFetchBatchSize: 20) instead of fetching batches while I'm scrolling the tableView.
If I use an attribute from the Person entity (like sectionNameKeyPath:#"name") to create sections everything works OK: the NSFetchResultsController loads small batches of 20 objects as I scroll.
The code I use to instantiate the NSFetchedResultsController:
- (NSFetchedResultsController *)fetchedResultsController {
if (_fetchedResultsController) {
return _fetchedResultsController;
}
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
NSEntityDescription *entity = [NSEntityDescription entityForName:[Person description]
inManagedObjectContext:self.managedObjectContext];
[fetchRequest setEntity:entity];
// Specify how the fetched objects should be sorted
NSSortDescriptor *groupSortDescriptor = [[NSSortDescriptor alloc] initWithKey:#"group.name"
ascending:YES];
NSSortDescriptor *personSortDescriptor = [[NSSortDescriptor alloc] initWithKey:#"birthName"
ascending:YES
selector:#selector(localizedStandardCompare:)];
[fetchRequest setSortDescriptors:[NSArray arrayWithObjects:groupSortDescriptor, personSortDescriptor, nil]];
[fetchRequest setRelationshipKeyPathsForPrefetching:#[#"group", #"photo"]];
[fetchRequest setFetchBatchSize:20];
NSError *error = nil;
NSArray *fetchedObjects = [self.managedObjectContext executeFetchRequest:fetchRequest error:&error];
if (fetchedObjects == nil) {
NSLog(#"Error Fetching: %#", error);
}
_fetchedResultsController = [[NSFetchedResultsController alloc] initWithFetchRequest:fetchRequest
managedObjectContext:self.managedObjectContext sectionNameKeyPath:#"group.name" cacheName:#"masterCache"];
_fetchedResultsController.delegate = self;
return _fetchedResultsController;
}
This is what I get in Instruments if I create sections based on "group.name" without any interaction with the App's UI:
And this is what I get (with a bit of scrolling on UITableView) if sectionNameKeyPath is nil:
Please, can anyone help me out on this issue?
EDIT 1:
It seems that I get inconsistent results from the simulator and Instruments: when I've asked this question, the app was starting in the simulator in about 10 seconds (by Time Profiler) using the above code.
But today, using the same code as above, the app starts in the simulator in 900ms even if it makes a temporary upfront fetch for all the objects and it's not blocking the UI.
I've attached some fresh screenshots:
EDIT 2:
I reset the simulator and the results are intriguing: after performing an import operation and quitting the app the first run looked like this:
After a bit of scrolling:
Now this is what happens on a second run:
After the fifth run:
EDIT 3:
Running the app the seventh time and eight time, I get this:

This is your stated objective: "I need the Person objects to be grouped in sections by the relationship entity Group, name attribute and the NSFetchResultsController to perform fetches in small batches as I scroll and not upfront as it is doing now."
The answer is a little complicated, primarily because of how an NSFetchedResultsController builds sections, and how that affects the fetching behavior.
TL;DR; To change this behavior, you would need to change how NSFetchedResultsController builds sections.
What is happening?
When an NSFetchedResultsController is given a fetch request with pagination (fetchLimit and/or fetchBatchSize), several things happen.
If no sectionNameKeyPath is specified, it does exactly what you expect. The fetch returns an proxy array of results, with "real" objects for the first fetchBathSize number of items. So for example if you have setFetchBatchSize to 2, and your predicate matches 10 items in the store, the results contain the first two objects. The other objects will be fetched separately as they are accessed. This provides a smooth paginated response experience.
However, when a sectionNameKeyPath is specified, the fetched results controller has to do a bit more. To compute the sections it needs to access that key path on all the objects in the results. It enumerates the 10 items in the results in our example. The first two have already been fetched. The other 8 will be fetched during enumeration to get the key path value needed to build the section information. If you have a lot of results for your fetch request, this can be very inefficient. There are a number of public bugs concerning this functionality:
NSFetchedResultsController initially takes too long to set up sections
NSFetchedResultsController ignores fetchLimit property
NSFetchedResultsController, Table Index, and Batched Fetch Performance Issue
... And several others. When you think about it, this makes sense. To build the NSFetchedResultsSectionInfo objects requires the fetched results controller to see every value in the results for the sectionNameKeyPath, aggregate them to the unique union of values, and use that information to create the correct number of NSFetchedResultsSectionInfo objects, set the name and index title, know how many objects in the results a section contains, etc. To handle the general use case there is no way around this. With that in mind, your Instruments traces may make a lot more sense.
How can you change this?
You can attempt to build your own NSFetchedResultsController that provides an alternative strategy for building NSFetchedResultsSectionInfo objects, but you may run into some of the same problems. For example, if you are using the existing fetchedObjects functionality to access members of the fetch results, you will encounter the same behavior when accessing objects that are faults. Your implementation would need a strategy for dealing with this (it's doable, but very dependant on your needs and requirements).
Oh god no. What about some kind of temporary hack that just makes it perform a little better but doesn't fix the problem?
Altering your data model will not change the above behavior, but can change the performance impact slightly. Batch updates will not have any significant effect on this behavior, and in fact will not play nicely with a fetched results controller. It may be much more useful to you, however, to instead set the relationshipKeyPathsForPrefetching to include your "group" relationship, which may improve the fetching and faulting behavior significantly. Another strategy may be to perform another fetch to batch fault these objects before you attempt to use the fetched results controller, which will populate the various levels of Core Data in-memory caches in a more efficient manner.
The NSFetchedResultsController cache is primarily for section information. This prevents the sections from having to be completely recalculated on each change (in the best case), but can actually make the initial fetch to build the sections take much longer. You will have to experiment to see if the cache is worthwhile for your use case.
If your primary concern is that these Core Data operations are blocking user interaction, you can offload them from the main thread. NSFetchedResultsController can be used on a private queue (background) context, which will prevent Core Data operations from blocking the UI.

Based on my experience a way to achieve your goal is to denormalize your model. In particular, you could add a group attribute in your Person entity and use that attribute as sectionNameKeyPath. So, when you create a Person you should also pass the group it belongs to.
This denormalization process is correct since it allows you to avoid fetching of related Group objects since not necessary. A cons could be that if you change the name of a group, all the persons associated with that name must change, on the contrary you can have incorrect values.
The key aspect here is the following. You need to have in mind that Core Data is not a
relational database. The model should not designed as a database schema, where normalization could take place, but it should be designed from the perspective of how the data are presented and used in the user interface.
Edit 1
I cannot understand your comment, could you explain better?
What I've found very intriguing though is that even if the app is
performing a full upfront fetch in the simulator, the app loads in
900ms (with 5000 objects) on the device despite the simulator where it
loads much slower.
Anyway, I would be interested in knowing details about your Photo entity. If you pre-fetch photo the overall execution could be influenced.
Do you need to pre-fetch a Photo within your table view? Are they thumbs (small photos)? Or normal images? Do you take advantage of External Storage Flag?
Adding an additional attribute (say group) to the Person entity could not be a problem. Updating the value of that attribute when the name of a Group object changes it's not a problem if you perform it in background. In addition, starting from iOS 8 you have available a batch update as described in Core Data Batch Updates.

After almost a year since I've posted this question, I've finally found the culprits that enable this behaviour (which slightly changed in Xcode 6):
Regarding the inconsistent fetch times: I was using a cache and at the time I was back and forth with opening, closing and resetting the simulator.
Regarding the fact that everything was fetched upfront in small batches without scrolling (in Xcode 6's Core Data Instruments that's not the case anymore - now it's one, big fetch which takes entire seconds):
It seems that setFetchBatchSize does not work correctly with parent/child contexts. The issue was reported back in 2012 and it seems that it's still there http://openradar.appspot.com/11235622.
To overcome this issue, I created another independent context with an NSMainQueueConcurrencyType and set its persistence coordinator to be the same that my other contexts are using.
More about issue #2 here: https://stackoverflow.com/a/11470560/1641848

CoreData getting objects based on a distinct property

I've had a problem for a while and I have hacked together a solution but I am revisiting it in the hopes of finding a real solution. Unfortunately that is not happening. In Core Data I've got a bunch of RSS articles. The user can subscribe to individual channels within a single feed. The problem is that some feed providers post the exact same article in multiple channels of the same feed. So the user ends up getting 2+ versions of the same article. I want to keep all articles in case the user unsubscribes from a channel that contains one copy but stays subscribed to another channel with a duplicate, but I only want to show a single article in the list of available articles.
To identify duplicates, I create a hash value of the article text content and store it as a property on the Article entity in Core Data (text_hash). My original thinking was that I would be able to craft a fetch request that could get the articles based on a unique match on this property, something like an SQL query. That turns out not to be the case (I was just learning Core Data at the time).
So to hack up a solution, I fetch all the articles, I make an empty set, i enumerate the fetch results, checking if the hash is in the set. If it is, I ignore it, if it isn't, i add it to the set and I add the article id to an array. When I'm finished, I create a predicate based on the article ids and do another fetch.
This seems really wasteful and clumsy, not only am i fetching twice and enumerating the results, since the final predicate is based on the individual article ids, I have to re-run it every time I add a new article.
It works for now but I am going to work on a new version of this app and I would like to make this better if at all possible. Any help is appreciated, thanks!

You could use propertiesToGroupBy like so:
NSFetchRequest *fr = [NSFetchRequest fetchRequestWithEntityName:#"Article"];
fr.propertiesToGroupBy = #[#"text_hash"];
fr.resultType = NSDictionaryResultType;
NSArray *articles = [ctx executeFetchRequest:fr error:nil];

Core-data one to many relationship loses his relationship after fetching new objects [duplicate]

This question already has an answer here:
Core data relationship lost after fetching more objects into the entities
(1 answer)
Closed 9 years ago.
I've asked this question before. But i'm opening a new one because I have some other insights now. First of all this is how my core data model looks like.
Now when I fetch my first appointments into my model. Everything works oké. But the problem comes when I load up new appointments. Then the previous appointments location relation goes to NULL. The strange things is that the location relationship only works with the appointments that are last loaded in.
I'm using restkit for mapping my JSON into my core-data model. And this is how I made the relationship.
[locationMapping addPropertyMapping:[RKRelationshipMapping relationshipMappingFromKeyPath:#"appointments" toKeyPath:#"appointments" withMapping:appointmentMapping]];
Can anybody help me with this problem ?

First of all, your model is horrible (no offense). You should create LabelData, Data and VerplichtData entities. These should have to-one relationships to Location / Appointment. Location and Appointment should have to-many relationships to LabelData, Data and VerplichtData.
You should probably follow Mundis advice and not use rest kit, it will probably make debugging lots easier. Apple has a pretty decent strategy for importing data in a smart way (i.e. fast and without duplication). Here a copy-paste from the docs in case the link dies:
Implementing Find-or-Create Efficiently
A common technique when importing data is to follow a "find-or-create" pattern, where you set up some data from which to create a managed object, determine whether the managed object already exists, and create it if it does not.
There are many situations where you may need to find existing objects (objects already saved in a store) for a set of discrete input values. A simple solution is to create a loop, then for each value in turn execute a fetch to determine whether there is a matching persisted object and so on. This pattern does not scale well. If you profile your application with this pattern, you typically find the fetch to be one of the more expensive operations in the loop (compared to just iterating over a collection of items). Even worse, this pattern turns an O(n) problem into an O(n^2) problem.
It is much more efficient—when possible—to create all the managed objects in a single pass, and then fix up any relationships in a second pass. For example, if you import data that you know does not contain any duplicates (say because your initial data set is empty), you can just create managed objects to represent your data and not do any searches at all. Or if you import "flat" data with no relationships, you can create managed objects for the entire set and weed out (delete) any duplicates before save using a single large IN predicate.
If you do need to follow a find-or-create pattern—say because you're importing heterogeneous data where relationship information is mixed in with attribute information—you can optimize how you find existing objects by reducing to a minimum the number of fetches you execute. How to accomplish this depends on the amount of reference data you have to work with. If you are importing 100 potential new objects, and only have 2000 in your database, fetching all of the existing and caching them may not represent a significant penalty (especially if you have to perform the operation more than once). However, if you have 100,000 items in your database, the memory pressure of keeping those cached may be prohibitive.
You can use a combination of an IN predicate and sorting to reduce your use of Core Data to a single fetch request. Suppose, for example, you want to take a list of employee IDs (as strings) and create Employee records for all those not already in the database. Consider this code, where Employee is an entity with a name attribute, and listOfIDsAsString is the list of IDs for which you want to add objects if they do not already exist in a store.
First, separate and sort the IDs (strings) of interest.
// get the names to parse in sorted order
NSArray *employeeIDs = [[listOfIDsAsString componentsSeparatedByString:#"\n"]
sortedArrayUsingSelector: #selector(compare:)];
Next, create a predicate using IN with the array of name strings, and a sort descriptor which ensures the results are returned with the same sorting as the array of name strings. (The IN is equivalent to an SQL IN operation, where the left-hand side must appear in the collection specified by the right-hand side.)
// Create the fetch request to get all Employees matching the IDs.
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
[fetchRequest setEntity:
[NSEntityDescription entityForName:#"Employee" inManagedObjectContext:aMOC]];
[fetchRequest setPredicate: [NSPredicate predicateWithFormat:#"(employeeID IN %#)", employeeIDs]];
// make sure the results are sorted as well
[fetchRequest setSortDescriptors:
#[[[NSSortDescriptor alloc] initWithKey: #"employeeID" ascending:YES]]];
Finally, execute the fetch.
NSError *error;
NSArray *employeesMatchingNames = [aMOC executeFetchRequest:fetchRequest error:&error];
You end up with two sorted arrays—one with the employee IDs passed into the fetch request, and one with the managed objects that matched them. To process them, you walk the sorted lists following these steps:
Get the next ID and Employee. If the ID doesn't match the Employee ID, create a new Employee for that ID.
Get the next Employee: if the IDs match, move to the next ID and Employee.
Regardless of how many IDs you pass in, you only execute a single fetch, and the rest is just walking the result set.
The listing below shows the complete code for the example in the previous section.
// Get the names to parse in sorted order.
NSArray *employeeIDs = [[listOfIDsAsString componentsSeparatedByString:#"\n"]
sortedArrayUsingSelector: #selector(compare:)];
// create the fetch request to get all Employees matching the IDs
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
[fetchRequest setEntity:
[NSEntityDescription entityForName:#"Employee" inManagedObjectContext:aMOC]];
[fetchRequest setPredicate: [NSPredicate predicateWithFormat: #"(employeeID IN %#)", employeeIDs]];
// Make sure the results are sorted as well.
[fetchRequest setSortDescriptors:
#[ [[NSSortDescriptor alloc] initWithKey: #"employeeID" ascending:YES] ]];
// Execute the fetch.
NSError *error;
NSArray *employeesMatchingNames = [aMOC executeFetchRequest:fetchRequest error:&error];

Correct way to accessing attributes of a class modeled in core data

Let's say a Recipe object has an NSSet of one or more Ingredients, and that the same relationship is modeled in core data.
Given a recipe, what id the correct way to access its ingredients?
In this example it seems natural to use recipe.ingredients, but I could equally use an NSFetchRequest for Ingredient entities with an NSPredicate to match by recipe.
Now let's say I want only the ingredients that are 'collected'. This is less clear cut to me - should I use a fetch request for ingredients with a predicate restricting by recipe and collected state? Or loop through recipe.ingredients?
At the other end of the scale, perhaps I need only ingredients from this recipe that also appear in other recipes. Now, the fetch request seems more appealing.
What is the correct general approach? Or is it a case by case scenario? I am interested in the impact on:
Consitancy
Readability
Performance
Robustness (for example, it is easy to make an error in a fetch request that the compiler cannot catch).

Let's go through these in order.
Getting the ingredients for a specific Recipe, when you already have a reference: Use recipe.ingredients every time.
Getting the ingredients for a specific Recipe that have a specific value (e.g. a Boolean flag value): Easiest is probably to start with recipe.ingredients as above and then use something like NSSet's objectsPassingTest to filter them. Most elegant is to set up a fetched property on Recipe that just returns these ingredients with no extra code (the syntax may not be immediately obvious, see a previous answer I wrote for details). These two probably perform about equally. Least appealing is a fetch request.
Getting ingredients that appear in multiple recipe instances: Probably a fetch request for the Ingredient entity where the predicate is something like recipe in %#, and the %# is replaced by a list of Recipe instances.

Some basic info:
*memory operations are ~100-1000 times faster then disk operations.
*A fetch request execution is always a trip to the store (disk), and so, degrade performance.
In your case, you have a "small" set of objects that need to be queried for information.
simply iterating over them using the recipe.ingredients set would fault them one by one, each access will be a trip to the store (fault resolution).
In this case, use prefetching (either in the request, set the setRelationshipKeyPathsForPrefetching: to prefetch the ingredients relationship or execute a fetch request that fetch the set with the appropriate predicate).
If you need specific data only, then use the fetch request approach to retrieve only the data you need.
if you intend to repeatedly access the relationship for queries and info, just fetch the entire set by using prefetching, and query in-memory.
My point is:
Think of the approach that minimize your disk access (in any case you need at least 1 access).
If your data is too large to fit in memory, or to be queried in memory, perform a fetch to get only the data you need.
Now:
1.Consistency - Pick a method you find comfortable and stick with it (i use prefetching)
2.Readability - Using a property is much more readable then executing a query, however it is less efficient if not using prefetching.
3.Performance - Disk access degrade performance, but is unavoidable in some situations
4.Robustness - A fetch request show that you know what is best for your data usage. use it wisely.
To make sure you are minimising disk access, turn SQLite debug on
(-com.apple.CoreData.SQLDebug)
Edit:
Faulting behaviour

In this example it seems natural to use recipe.ingredients, but I
could equally use an NSFetchRequest for Ingredient entities with an
NSPredicate to match by recipe.
Why would you do the latter when you can do the former? You already have the recipe, and it already has a set of ingredients, so there's no need to look at all the ingredients and filter out just those that are related to the recipe that you already have.
Now let's say I want only the ingredients that are 'collected'. This
is less clear cut to me - should I use a fetch request for ingredients
with a predicate restricting by recipe and collected state? Or loop
through recipe.ingredients?
Apply the predicate to the recipe's ingredients:
NSPredicate *isCollected = [NSPredicate predicateWithFormat:#"collected == YES"];
NSSet *collectedIngredients = [recipe.ingredients filteredSetUsingPredicate:isCollected];
At the other end of the scale, perhaps I need only ingredients from
this recipe that also appear in other recipes. Now, the fetch request
seems more appealing.
Again, using a fetch request here seems wasteful because you already have easy access to the set of ingredients that could be in the final result, and that's potentially a much smaller set than the set of all ingredients. Use the same approach as above, but change the predicate to test the recipes associated with each ingredient. Something like:
NSPredicate *p = [NSPredicate predicateWithFormat:#"recipes > 1"];
NSSet *i = [recipe.ingredients filteredSetUsingPredicate:p];
What is the correct general approach?
Fetch requests are a good way to search through all instances of a given entity. You're always going to start with a fetch request to get some objects to work with. But when the objects you want are somehow related to an object that you already have you can (and should) use those relationships to get what you want.

Core Data: What should I use as a Sort Descriptor with ordered set (ios5) in a many to many relationship

- I have an Item Entity and a Tag Entity.
- Items can have multiple Tags and Tags can be linked to multiple Items (many to many relationship).
- The relationship is an "Ordered Relationship" (using Ordered relationship in IOS5) both ways.
I want to fetch all child tags for a given item
Im using the following fetch request:
NSFetchRequest* request = [NSFetchRequest fetchRequestWithEntityName:#"Item"];
// Fetch all items that have a given tag
Tag* myTag = ....;
request.predicate = [NSPredicate predicateWithFormat:#"ANY tag == %#", myTag];
// This is my attempt to get the correct sort ordering (it crashes)
NSSortDescriptor *sortDescriptor = [NSSortDescriptor sortDescriptorWithKey:#"tag"
ascending:YES];
request.sortDescriptors = #[sortDescriptor];
The above sort descriptor returns data (which I assume is in some order) but then crashes:
[_NSFaultingMutableOrderedSet compare:]: unrecognized selector sent to instance 0x10b058f0
*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason:
'-[_NSFaultingMutableOrderedSet compare:]: unrecognised selector sent to instance 0x10b058f0'
If I give an empty sort descriptors array then I dont get a crash but the result is not ordered.
How do I correctly implement the sort descriptor in a many to many relationship using the ordered relationship feature in ios5?

I think you may be misapprehending what an ordered relationship means in Core Data. As per this question, "ordered" doesn't mean "with regards to some property" - instead, "ordered" means "preserves the user's order" (or another specific order that would otherwise seem arbitrary). Some examples:
An ordered relationship would be good for steps in a recipe (the example from Apple's Core Data Release Notes) - the steps couldn't necessarily be ordered with respect to any of their properties (e.g. their text), and would have to maintain an "index" or "order" property on their own if Core Data didn't do the ordering.
An ordered relationship would not be good for a list of names - the user could choose to sort by first or last names, for example, and the collection of names has no real intrinsic ordering internally. The only thing that matters is display order - the data itself doesn't need to maintain that order. (The rule of thumb in Apple's docs is that if a user could conceivably choose between sort specifiers, the relationship shouldn't be ordered in Core Data.)
In your case, you try to apply a sort descriptor to the set of ordered sets (the values for the tags property) that you're fetching. This is the cause of your crash - the NSOrderedSet subclass (_NSFaultingMutableOrderedSet, in your case) doesn't implement the -compare: method that the sort descriptor would use to sort them, and so you get that exception.
I think a good solution for you would be to not use a sort descriptor at all; instead, if you want to change the preserved ordering for display to the user, get the NSOrderedSet value for tags on some Item, then get an array out of it with your own sort using -sortedArrayUsingComparator:. This will keep the data in the right order in Core Data but let you reorder it yourself programmatically.
At a higher level, you may also want to consider whether or not you really need order in your relationships. It incurs a significant performance penalty, and all you really get back is the original order in which tags were added to items (or vice versa). If you intend to change that order every time you display the tags, then you're taking that performance hit for no real benefit - make your relationships unordered, then either continue sorting programmatically or use a sort descriptor on an aggregate key, such as tags.#count, as suggested in the comments.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart