CoreData Sum Performance - ios

I have a theoretical question about Core Data and its sum function.
I tried summing the values in a Core Data table in three ways.
1. Fetch everything and sum it with a key-value collection operator:
NSArray *array1 = [self getAll:self.managedObjectContext];
int sum = [[array1 valueForKeyPath:@"@sum.sum"] intValue];
2. Fetch everything and sum it in a for loop:
int sum2 = 0;
NSArray *array2 = [self getAll:self.managedObjectContext];
for (Test *t in array2) {
    sum2 = sum2 + [t.sum intValue];
}
3. Let Core Data sum it:
NSArray *array = [self getAllGroupe:self.managedObjectContext];
NSDictionary *i = [array objectAtIndex:0];
id j = [i objectForKey:@"sum"];
- (NSArray *)getAllGroupe:(NSManagedObjectContext *)Context {
    NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
    NSEntityDescription *entity = [NSEntityDescription entityForName:@"Test" inManagedObjectContext:Context];
    // Ask the store for a single aggregate value instead of managed objects.
    NSExpressionDescription *ex = [[NSExpressionDescription alloc] init];
    [ex setExpression:[NSExpression expressionWithFormat:@"@sum.sum"]];
    [ex setExpressionResultType:NSDecimalAttributeType];
    [ex setName:@"sum"];
    [fetchRequest setPropertiesToFetch:[NSArray arrayWithObjects:ex, nil]];
    [fetchRequest setResultType:NSDictionaryResultType];
    NSError *error = nil;
    [fetchRequest setEntity:entity];
    NSArray *fetchedObjects = [Context executeFetchRequest:fetchRequest error:&error];
    return fetchedObjects;
}
Surprisingly, the first way was the slowest (19.2 s for 1,000,000 records), the second way was faster (3.54 s for 1,000,000 records), and the third way was the fastest (0.3 s for 1,000,000 records).
Why is this?
If I understand correctly, even Core Data needs to go through all 1,000,000 records and sum them. Is it faster because it uses more cores if they are available?

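No. Core Data doesn't do the summing on its own; it delegates that to its backing SQLite database, which is optimized for exactly this kind of work.
Basically, Core Data sends something like SELECT SUM(sum) FROM table; to its database, and the sum is computed there. The first two approaches, by contrast, have to load all 1,000,000 rows into memory as managed objects before any summing starts.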
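As a rough sketch of what the third approach boils down to, the helper below builds a request that asks the store for one aggregate row rather than a million objects. It uses the function form sum: instead of the format string from the question. The helper name MakeSumRequest and the result key "total" are made up for illustration, and the snippet only builds the request; executing it still needs a managed object context, and the claim about SQL applies only when the store is SQLite.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper (not from the original posts): builds a fetch request
// that returns a single dictionary holding the sum of `attributeName`
// over all rows of `entityName`.
static NSFetchRequest *MakeSumRequest(NSString *entityName, NSString *attributeName) {
    NSExpression *keyPath = [NSExpression expressionForKeyPath:attributeName];
    NSExpression *sumExpr = [NSExpression expressionForFunction:@"sum:"
                                                      arguments:@[keyPath]];

    NSExpressionDescription *sumDesc = [[NSExpressionDescription alloc] init];
    sumDesc.name = @"total";  // key under which the sum appears in the result
    sumDesc.expression = sumExpr;
    sumDesc.expressionResultType = NSDecimalAttributeType;

    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.resultType = NSDictionaryResultType;  // dictionaries, not managed objects
    request.propertiesToFetch = @[sumDesc];
    return request;
}
```

Executing MakeSumRequest(@"Test", @"sum") with executeFetchRequest:error: should yield an array with one dictionary, and the sum would be read from its "total" key.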

Related

Group By for Core Data Results

I am using Core Data. In general, I have a Game, a Game Phase, and points scored for different types of actions (let's say pointsA and pointsB).
Each game consists of two phases, so each player has a points total per phase and a total per game (phase 1 + phase 2).
My Score Entity in Core Data has:
Player (Relationship to player),
Game (Relationship to game),
Phase (attribute),
PointsA (attribute),
PointsB (attribute).
So each player has a record for a Score in a Phase in a Game.
To fetch all points for a given player aggregated by game (the SQL equivalent of "GROUP BY"), I managed to use this code, and it works:
CODE:
NSError *error = nil;
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
fetchRequest.entity = [NSEntityDescription entityForName:@"Score" inManagedObjectContext:self.managedObjectContext];
fetchRequest.predicate = [NSPredicate predicateWithFormat:@"player == %@", _currentPlayer];
NSExpressionDescription *ex = [[NSExpressionDescription alloc] init];
[ex setExpression:[NSExpression expressionWithFormat:@"@sum.pointsA"]];
[ex setExpressionResultType:NSDecimalAttributeType];
[ex setName:@"pointsA"];
NSExpressionDescription *ex2 = [[NSExpressionDescription alloc] init];
[ex2 setExpression:[NSExpression expressionWithFormat:@"@sum.pointsB"]];
[ex2 setExpressionResultType:NSDecimalAttributeType];
[ex2 setName:@"pointsB"];
[fetchRequest setPropertiesToFetch:[NSArray arrayWithObjects:@"Game", ex, ex2, nil]];
[fetchRequest setPropertiesToGroupBy:[NSArray arrayWithObjects:@"Game", nil]];
[fetchRequest setResultType:NSDictionaryResultType];
results = [self.managedObjectContext executeFetchRequest:fetchRequest error:&error];
My question is: suppose I have many more point types, such as pointsC, pointsD, etc. (say, scores for many more kinds of actions). Do I have to use a separate NSExpressionDescription (ex and ex2 above) for each of them?
Is Core Data really this long-winded? Is there a quicker way?
I am relatively new to Core Data.
For those who are wondering how to parse through the results set:
for (id Res in results) {
    NSLog(@"pointsA: %@ ", Res[@"pointsA"]);
    NSLog(@"pointsB: %@ ", [Res valueForKey:@"pointsB"]);
    // both work
}
OK, answering my own question: I guess a more elegant way would be to feed in an array of items with the Core Data attribute names, which can then be as long as you want:
(NSArray*)items contains pointsA, pointsB, pointsC, pointsD, all the way to z and beyond if you so require.
///Code///
- (void)fetchRes:(NSArray *)items
{
    NSError *error = nil;
    NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
    fetchRequest.entity = [NSEntityDescription entityForName:@"Score" inManagedObjectContext:self.managedObjectContext];
    fetchRequest.predicate = [NSPredicate predicateWithFormat:@"player == %@", _currentPlayer];
    // Start with the grouping key, then add one sum expression per attribute name.
    NSMutableArray *propsArray = [[NSMutableArray alloc] initWithObjects:@"Game", nil];
    for (int i = 0; i < items.count; i++)
    {
        NSString *desc = [[NSString alloc] initWithFormat:@"@sum.%@", items[i]];
        NSExpressionDescription *ex3 = [[NSExpressionDescription alloc] init];
        [ex3 setExpression:[NSExpression expressionWithFormat:desc]];
        [ex3 setExpressionResultType:NSDecimalAttributeType];
        [ex3 setName:items[i]];
        [propsArray insertObject:ex3 atIndex:i + 1];
    }
    [fetchRequest setPropertiesToFetch:propsArray];
    [fetchRequest setPropertiesToGroupBy:[NSArray arrayWithObjects:@"Game", nil]];
    [fetchRequest setResultType:NSDictionaryResultType];
    results = [self.managedObjectContext executeFetchRequest:fetchRequest error:&error];
}
Not sure what the performance / memory consequences are ?

Is it possible to use a group by count in the havingPredicate for a CoreData fetch (for dupe detection)?

For reference, the problem I'm trying to solve is efficiently finding and removing duplicates in a table that could have a lot of entries.
The table I am working with is called PersistedDay and has a dayString attribute (it's a string. :-P). There are more columns that aren't relevant to this question. I'd like to find any PersistedDay objects that have duplicates.
In SQL, this is one of the efficient ways you can do that (FYI, I can do this query on the CoreData backing SQLite DB):
SELECT ZDAYSTRING FROM ZPERSISTEDDAY GROUP BY ZDAYSTRING HAVING COUNT(ZDAYSTRING) > 1;
This returns ONLY the dayStrings that have duplicates and you can then get all of the fields for those objects by querying using the resulting day strings (you can use it as a sub query to do it all in one request).
NSFetchRequest seems to have all of the required pieces to do this too, but it doesn't quite seem to work. Here's what I tried to do:
NSManagedObjectContext *context = [self managedObjectContext];
NSFetchRequest *request = [[NSFetchRequest alloc] init];
NSEntityDescription *entity = [NSEntityDescription entityForName:@"PersistedDay" inManagedObjectContext:context];
[request setEntity:entity];
NSPropertyDescription *dayStringProperty = entity.propertiesByName[@"dayString"];
request.propertiesToFetch = @[dayStringProperty];
request.propertiesToGroupBy = @[dayStringProperty];
request.havingPredicate = [NSPredicate predicateWithFormat:@"dayString.@count > 1"];
request.resultType = NSDictionaryResultType;
NSArray *results = [context executeFetchRequest:request error:NULL];
That doesn't work. :-P If I try that, I get the error "Unsupported function expression count:(dayString)" when executing the fetch. I don't think the dayString in "dayString.@count" even matters in the code above, but I put it in for clarity (SQL COUNT just operates on the grouped rows).
So, my question is: is this possible and, if so, what is the syntax to do it? I couldn't find anything in the CoreData docs to indicate how to do this.
I found one similar SO post, which I unfortunately can't find again, about running a count in a having clause (I don't think there was a group by). But the poster gave up and did it a different way after not finding a solution. I'm hoping this question is more explicit, so maybe someone has an answer. :)
For reference, this is what I am doing for now that DOES work, but requires returning almost all the rows since there are very few duplicates in most cases:
NSManagedObjectContext *context = [self managedObjectContext];
NSFetchRequest *request = [[NSFetchRequest alloc] init];
NSEntityDescription *entity = [NSEntityDescription entityForName:@"PersistedDay" inManagedObjectContext:context];
[request setEntity:entity];
NSPropertyDescription *dayStringProperty = entity.propertiesByName[@"dayString"];
// Get the count of dayString...
NSExpression *keyPathExpression = [NSExpression expressionForKeyPath:@"dayString"]; // Does not really matter
NSExpression *countExpression = [NSExpression expressionForFunction:@"count:" arguments:[NSArray arrayWithObject:keyPathExpression]];
NSExpressionDescription *expressionDescription = [[NSExpressionDescription alloc] init];
[expressionDescription setName:@"dayStringCount"];
[expressionDescription setExpression:countExpression];
[expressionDescription setExpressionResultType:NSInteger32AttributeType];
request.propertiesToFetch = @[dayStringProperty, expressionDescription];
request.propertiesToGroupBy = @[dayStringProperty];
request.resultType = NSDictionaryResultType;
NSArray *results = [context executeFetchRequest:request error:NULL];
I then have to loop over the result and only return the results that have dayStringCount > 1. Which is what the having clause should do. :-P
NOTE: I know CoreData isn't SQL. :) Just would like to know if I can do the equivalent type of operation with the same efficiency as SQL.
Yes, it is possible. You cannot reference the count as a key path; however, you can reference it as a variable, just like in SQL. In my example, I have cities created with duplicate names.
let fetchRequest = NSFetchRequest(entityName: "City")
let nameExpr = NSExpression(forKeyPath: "name")
let countExpr = NSExpressionDescription()
let countVariableExpr = NSExpression(forVariable: "count")
countExpr.name = "count"
countExpr.expression = NSExpression(forFunction: "count:", arguments: [ nameExpr ])
countExpr.expressionResultType = .Integer64AttributeType
fetchRequest.resultType = .DictionaryResultType
fetchRequest.sortDescriptors = [ NSSortDescriptor(key: "name", ascending: true) ]
fetchRequest.propertiesToGroupBy = [ cityEntity.propertiesByName["name"]! ]
fetchRequest.propertiesToFetch = [ cityEntity.propertiesByName["name"]!, countExpr ]
// filter out group result and return only groups that have duplicates
fetchRequest.havingPredicate = NSPredicate(format: "%@ > 1", countVariableExpr)
Complete playground file at:
https://gist.github.com/pronebird/cca9777af004e9c91f9cd36c23cc821c
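For readers working in Objective-C like the question, here is a hedged transliteration of the Swift answer. The helper name MakeDuplicateRequest is invented for illustration, and attribute names are passed as strings instead of property descriptions. It only builds the request and has not been checked against a live store here; whether the having clause is accepted depends on the store being SQLite, as in the answer.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper mirroring the Swift answer: group rows of `entityName`
// by `attributeName` and keep only the groups that contain more than one row.
static NSFetchRequest *MakeDuplicateRequest(NSString *entityName, NSString *attributeName) {
    NSExpression *keyPath = [NSExpression expressionForKeyPath:attributeName];

    NSExpressionDescription *countDesc = [[NSExpressionDescription alloc] init];
    countDesc.name = @"count";
    countDesc.expression = [NSExpression expressionForFunction:@"count:"
                                                     arguments:@[keyPath]];
    countDesc.expressionResultType = NSInteger64AttributeType;

    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.resultType = NSDictionaryResultType;  // grouping requires dictionary results
    request.propertiesToGroupBy = @[attributeName];
    request.propertiesToFetch = @[attributeName, countDesc];

    // Refer to the aggregate through a variable named after the expression
    // description, not through a key path.
    NSExpression *countVariable = [NSExpression expressionForVariable:@"count"];
    request.havingPredicate = [NSPredicate predicateWithFormat:@"%@ > 1", countVariable];
    return request;
}
```

For the question's model this would be called as MakeDuplicateRequest(@"PersistedDay", @"dayString"), and each resulting dictionary would carry the duplicated dayString plus its count.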
Best I can come up with is:
NSError *error = nil;
NSManagedObjectContext *context = self.managedObjectContext;
NSEntityDescription *entity = [NSEntityDescription entityForName:@"Event" inManagedObjectContext:context];
// Construct a count group field
NSExpressionDescription *count = [NSExpressionDescription new];
count.name = @"count";
count.expression = [NSExpression expressionWithFormat:@"count:(value)"];
count.expressionResultType = NSInteger64AttributeType;
// Get list of all "value" fields (only)
NSPropertyDescription *value = [entity propertiesByName][@"value"];
NSFetchRequest *request = [[NSFetchRequest alloc] initWithEntityName:@"Event"];
request.propertiesToFetch = @[value, count];
request.propertiesToGroupBy = @[value];
request.resultType = NSDictionaryResultType;
NSArray *values = [context executeFetchRequest:request error:&error];
// Filter count > 1
values = [values filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"count > 1"]];
// slice to get just the values
values = [values valueForKeyPath:@"value"];
But that's not really much different from what you're using.
The best way to find duplicates in Core Data depends on your data. According to Efficiently Importing Data, and assuming that you have to import fewer than 1000 PersistedDays, I suggest this solution:
NSFetchRequest *fetchRequest = [NSFetchRequest new];
[fetchRequest setEntity:[NSEntityDescription entityForName:@"PersistedDay" inManagedObjectContext:myMOC]];
[fetchRequest setSortDescriptors:@[[NSSortDescriptor sortDescriptorWithKey:@"dayString" ascending:NO]]];
NSArray *persistedDays = [myMOC executeFetchRequest:fetchRequest error:nil];
// Walk backwards comparing neighbours; starting at count (not count - 1)
// avoids unsigned underflow when the array is empty.
for (NSUInteger i = persistedDays.count; i > 1; --i) {
    PersistedDay *currentDay = persistedDays[i - 1];
    PersistedDay *nextDay = persistedDays[i - 2];
    if ([currentDay.dayString isEqualToString:nextDay.dayString]) {
        /* Do stuff/delete with currentDay */
    }
}
To speed this up, you can index dayString in Core Data.
You can also reduce the data set if you remember a timestamp or date of the last duplicate clean-up:
[fetchRequest setPredicate:[NSPredicate predicateWithFormat:@"importDate > %@", lastDuplicateCleanUp]];
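The neighbour comparison in this answer works because sorting puts equal values next to each other, so one pass is enough. A minimal sketch of that idea on plain strings, independent of Core Data (the helper name is hypothetical, and the input is assumed to be sorted already):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: given strings sorted so that equal values are adjacent,
// return the indexes of every element that repeats its predecessor.
static NSIndexSet *DuplicateIndexesInSortedStrings(NSArray<NSString *> *sorted) {
    NSMutableIndexSet *duplicates = [NSMutableIndexSet indexSet];
    // Start at 1 so there is always a predecessor; an empty or
    // single-element array simply skips the loop.
    for (NSUInteger i = 1; i < sorted.count; i++) {
        if ([sorted[i] isEqualToString:sorted[i - 1]]) {
            [duplicates addIndex:i];
        }
    }
    return duplicates;
}
```

With managed objects, the same loop would compare the dayString of neighbouring PersistedDay instances and collect the objects to delete instead of indexes.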

Validate ManagedObject

I have a SearchStringTooltip managed object with the property @dynamic tooltipText; (an NSString).
I need to add new tooltips to the database dynamically, but only unique values (case-insensitive).
More than 100 can arrive at a time, and every time I check each one for uniqueness.
It looks like:
if (newTooltips.count == 0)
    return;
NSEntityDescription *entity = [NSEntityDescription entityForName:@"SearchStringTooltip" inManagedObjectContext:self.moc];
NSFetchRequest *request = [NSFetchRequest new];
[request setEntity:entity];
for (NSString *name in newTooltips) {
    // "=" is used here; LIKE would give the case-insensitive match I need, but it takes 2-3 times as long.
    [request setPredicate:[NSPredicate predicateWithFormat:@"tooltipText = %@", name]];
    // One count fetch per name: this is the very expensive part.
    NSInteger count = [self.moc countForFetchRequest:request error:nil];
    if (count > 0) {
        continue;
    }
    DBSearchStringTooltip *tooltip = [NSEntityDescription insertNewObjectForEntityForName:@"SearchStringTooltip" inManagedObjectContext:self.moc];
    tooltip.tooltipText = name;
}
How can I do this more cheaply in terms of memory? There can be more than 10,000 tooltips to check for uniqueness, and I have to check them all.
Quick answer, and I am sure it is more efficient. Use this code snippet instead of your loop:
[request setPredicate:[NSPredicate predicateWithFormat:@"tooltipText IN %@", newTooltips]];
[request setResultType:NSDictionaryResultType];
[request setPropertiesToFetch:@[@"tooltipText"]];
NSArray *result = [self.moc executeFetchRequest:request error:nil];
NSArray *existingTooltipNames = [result valueForKey:@"tooltipText"];
NSMutableOrderedSet *itemsToAdd = [NSMutableOrderedSet orderedSetWithArray:newTooltips];
[itemsToAdd minusSet:[NSSet setWithArray:existingTooltipNames]];
for (NSString *name in itemsToAdd) {
    DBSearchStringTooltip *tooltip = [NSEntityDescription insertNewObjectForEntityForName:@"SearchStringTooltip" inManagedObjectContext:self.moc];
    tooltip.tooltipText = name;
}
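Note that the set subtraction in this snippet compares strings exactly, while the question asks for case-insensitive uniqueness. One possible in-memory variant is sketched below. It assumes the existing tooltipText values have already been fetched into an array; the helper name NamesToInsert is made up, and lowercasing is used as a simple stand-in for whatever comparison rule the app really needs.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: return the candidates that are not already stored,
// comparing case-insensitively and keeping the first spelling of each name.
static NSArray<NSString *> *NamesToInsert(NSArray<NSString *> *candidates,
                                          NSArray<NSString *> *existing) {
    // Lowercased keys of everything seen so far (stored or already accepted).
    NSMutableSet<NSString *> *seen = [NSMutableSet set];
    for (NSString *name in existing) {
        [seen addObject:name.lowercaseString];
    }
    NSMutableArray<NSString *> *result = [NSMutableArray array];
    for (NSString *name in candidates) {
        NSString *key = name.lowercaseString;
        if (![seen containsObject:key]) {
            [seen addObject:key];    // also de-duplicates within the new batch
            [result addObject:name];
        }
    }
    return result;
}
```

Each name in the returned array would then be inserted as in the loop above. Set lookups keep this at roughly one pass over both arrays, rather than one fetch per name.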
You may try to fetch only the distinct tooltipText values by setting the fetch request's result type to NSDictionaryResultType and calling setReturnsDistinctResults:YES, then search the results in memory. This could be much faster than 10k fetches, since every fetch hits the disk.
You may find more info here.

Core Data equivalent for sqlite query

I use Core Data for an iPhone app.
There is "Flight" entity with a "start" and "duration" property.
The flights are listed on a paginated view, where I need to sum the duration per page and the duration rollup sum.
In native sqlite following solution works:
select sum(pg.zduration) from (select zduration,zstart from zflight order by zstart limit %i,%i) as pg",offset,limit
So on first page, with a page size of 5, I get duration sum and same rollup duration with offset=0 and limit=5.
On second page, I get the duration sum with offset=5 and limit=5. The rollup sum with offset=0 and limit=10.
And so on..
Now the Question:
How would I solve that with Core Data, NSExpression, NSExpressionDescription and NSFetchRequest instead of sqlite? Of course, I would not like to load all flight objects in memory...
So far I am able to calculate the duration for all flights:
NSFetchRequest *request = [[NSFetchRequest alloc] init];
NSEntityDescription *entity = [NSEntityDescription entityForName:@"Flight" inManagedObjectContext:self.managedObjectContext];
[request setEntity:entity];
[request setResultType:NSDictionaryResultType];
NSSortDescriptor *startSortDescriptor = [[NSSortDescriptor alloc] initWithKey:@"start" ascending:YES];
NSArray *sortDescriptors = [[NSArray alloc] initWithObjects:startSortDescriptor, nil];
[request setSortDescriptors:sortDescriptors];
request.fetchOffset = onPage * pageSize; // does not help, because offset and limit are applied to the result
request.fetchLimit = pageSize; // does not help, because offset and limit are applied to the result
NSExpression *keyPathExpression = [NSExpression expressionForKeyPath:@"duration"];
NSExpression *sumExpression = [NSExpression expressionForFunction:@"sum:" arguments:[NSArray arrayWithObject:keyPathExpression]];
// Create an expression description using the sumExpression and returning an integer.
NSExpressionDescription *expressionDescription1 = [[NSExpressionDescription alloc] init];
[expressionDescription1 setName:@"durationSum"];
[expressionDescription1 setExpression:sumExpression];
[expressionDescription1 setExpressionResultType:NSInteger64AttributeType];
[request setPropertiesToFetch:[NSArray arrayWithObjects:expressionDescription1, nil]];
// Execute the fetch.
NSError *error = nil;
NSArray *objects = [self.managedObjectContext executeFetchRequest:request error:&error];
if (error != nil) {
    [NSException raise:@"Sum Page Duration failed" format:@"%@ Error:%@", [[error userInfo] valueForKey:@"reason"], error];
}
if (objects != nil && [objects count] > 0) {
    return (NSNumber *)[[objects objectAtIndex:0] valueForKey:@"durationSum"];
}
return nil; // no rows: no sum to report
As you said, the limit and offset set on the fetch request are applied to the result and NSExpression won't work well in this case. You could operate on the returned objects, after they've been offset and limited by the fetch request, using a collection operator rather than NSExpression, e.g.:
NSNumber *durationSum = [objects valueForKeyPath:@"@sum.duration"];
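To make the page sum and the rollup sum concrete, here is a small sketch that applies the collection operator to a slice of already-fetched rows. It assumes the rows were fetched as dictionaries (or objects) exposing a duration key, sorted by start, and without the sum: expression; the helper name DurationSum is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: sum the "duration" key over rows [offset, offset + limit),
// clamping the range to the bounds of the array.
static NSNumber *DurationSum(NSArray<NSDictionary *> *rows, NSUInteger offset, NSUInteger limit) {
    if (offset >= rows.count) {
        return @0;  // nothing on this page
    }
    NSUInteger length = MIN(limit, rows.count - offset);
    NSArray *page = [rows subarrayWithRange:NSMakeRange(offset, length)];
    // Key-value coding collection operator: sums `duration` across the slice.
    return [page valueForKeyPath:@"@sum.duration"];
}
```

With a page size of 5, the second page's sum would be DurationSum(rows, 5, 5) and its rollup sum DurationSum(rows, 0, 10), matching the offsets and limits described in the question. The trade-off is that the rows for the rollup range are held in memory, although as dictionaries rather than full managed objects.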

Core Data - Fetch doesn't deliver the same result

So I'm adding a ListItem to a ListName (there is a one-to-many relationship set up) in class A:
ListItem *newItem = [NSEntityDescription insertNewObjectForEntityForName:@"ListItem" inManagedObjectContext:self.context];
// setting some attributes...
[listName addListItemsObject:newItem];
[self.context save:&error];
After that, class B is called via a delegate method.
There I want to get the data out of Core Data, but if I fetch all ListNames, the ListItems are not up to date (for example, only 5 items instead of 6). If I fetch all ListItems, they are all there (6 out of 6).
What is wrong with my code? I need to get all ListNames, though.
NSError *error = nil;
NSFetchRequest *req = [[NSFetchRequest alloc] init];
if (context == nil)
    NSLog(@"context is nil");
NSEntityDescription *descr = [NSEntityDescription entityForName:@"ListName" inManagedObjectContext:self.context];
[req setEntity:descr];
NSSortDescriptor *sort = [[NSSortDescriptor alloc] initWithKey:@"lastModified" ascending:NO];
[req setSortDescriptors:[NSArray arrayWithObject:sort]];
NSArray *results = [self.context executeFetchRequest:req error:&error];
self.listNames = [results mutableCopy];
if ([results count] > 0) {
    ListName *test = [results objectAtIndex:0];
    [test.listItems count];
    // -count returns NSUInteger, so log it as unsigned long rather than %i
    NSLog(@"item count on list %lu", (unsigned long)[test.listItems count]);
    // wrong result
    NSFetchRequest *newReq = [[NSFetchRequest alloc] init];
    NSEntityDescription *descr = [NSEntityDescription entityForName:@"ListItem" inManagedObjectContext:self.context];
    [newReq setEntity:descr];
    NSArray *results2 = [self.context executeFetchRequest:newReq error:&error];
    NSLog(@"item count on items %lu", (unsigned long)[results2 count]);
    // right result
}
Given your data model and code, there is no reason that the count of ListItem objects in both places has to be the same, because the counts are of two different sets of objects that do not necessarily overlap.
The first count is given by this code:
ListName *test = [results objectAtIndex:0];
[test.listItems count];
… which returns the count of ListItem objects in the relationship of a single, particular and unique ListName object. You may have one ListName object or you might have hundreds, each of which could have an arbitrary number of related ListItem objects. This code will only count those related to the first ListName object returned.
The second count is given by:
NSFetchRequest *newReq = [[NSFetchRequest alloc] init];
NSEntityDescription *descr = [NSEntityDescription entityForName:@"ListItem" inManagedObjectContext:self.context];
[newReq setEntity:descr];
NSArray *results2 = [self.context executeFetchRequest:newReq error:&error];
NSLog(@"item count on items %lu", (unsigned long)[results2 count]);
… which returns an unfiltered array containing every instance of ListItem in the persistent store, regardless of what relationships they have.
There is no particular reason to expect the first count to ever agree with the second, because it will only do so when (1) you have a single ListName object in the store and (2) every existing ListItem object is in that ListName's listItems relationship.
Make sure not to confuse entities and managed objects.
BTW, you should almost always use reciprocal relationships, e.g. if you have ListName.listItems you should have a reciprocal ListItem.listName.
A simple reset has helped
