Breeze: When child entities have been deleted by someone else, they still appear after reloading the parent

We have a breeze client solution in which we show parent entities with lists of their children. We do hard deletes on some child entities. Now when the user is the one doing the deletes, there is no problem, but when someone else does, there seems to be no way to invalidate the children already loaded in cache. We do a new query with the parent and expanding to children, but breeze attaches all the other children it has already heard of, even if the database did not return them.
My question: shouldn't breeze realize we are loading through expand and thus completely remove all children from cache before loading back the results from the db? How else can we accomplish this if that is not the case?
Thank you

Yup, that's a really good point.
Deletion is simply a horrible complication to every data management effort. This is true no matter whether you use Breeze or not. It just causes heartache up and down the line. Which is why I recommend soft deletes instead of hard deletes.
But you don't care what I think ... so I will continue.
Let me be straight about this. There is no easy way for you to implement a cache cleanup scheme properly. I'm going to describe how we might do it (with some details neglected I'm sure) and you'll see why it is difficult and, in perverse cases, fruitless.
Of course the most efficient, brute force approach is to blow away the cache before querying. You might as well not have caching if you do that but I thought I'd mention it.
The "Detached" entity problem
Before I continue, remember that the technique I just mentioned, and indeed every possible solution, is useless if your UI (or anything else) is holding references to the entities that you want to remove.
Oh, you'll remove them from cache alright. But whatever is holding references to them now will continue to have a reference to an entity object which is in a "Detached" state - a ghost. Making sure that doesn't happen is your responsibility; Breeze can't know and couldn't do anything about it if it did know.
Second attempt
A second, less blunt approach (suggested by Jay) is to:
1. apply the query to the cache first
2. iterate over the results and, for each one,
   a. detach every child entity along the "expand" paths
   b. detach that top-level entity
Now when the query succeeds, you have a clear road for it to fill the cache.
Here is a simple example of the code as it relates to a query of TodoLists and their TodoItems:
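var query = breeze.EntityQuery.from('TodoLists').expand('TodoItems');
// Run the query against the cache only (no server round trip).
var inCache = manager.executeQueryLocally(query);
// Add each cached TodoList's children to the list.
inCache.slice().forEach(function(e) {
    inCache = inCache.concat(e.TodoItems);
});
// Detach parents and children alike.
inCache.slice().forEach(function(e) {
    manager.detachEntity(e);
});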
There are at least four problems with this approach:
Every queried entity is a ghost. If your UI is displaying any of the queried entities, it will be displaying ghosts. This is true even when the entity was not touched on the server at all (99% of the time). Too bad. You have to repaint the entire page.
You may be able to do that. But in many respects this technique is almost as impractical as the first. It means that every view is in a potentially invalid state after any query takes place anywhere.
Detaching an entity has side-effects. All other entities that depend on the one you detached are instantly (a) changed and (b) orphaned. There is no easy recovery from this, as explained in the "orphans" section below.
This technique wipes out all pending changes among the entities that you are querying. We'll see how to deal with that shortly.
If the query fails for some reason (lost connection?), you've got nothing to show. Unless you remember what you removed ... in which case you could put those entities back in cache if the query fails.
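The "remember what you removed" idea from the last point might be sketched as follows. This is only a sketch: requeryWithRestore is a hypothetical helper name, and it assumes the manager exposes Breeze's detachEntity, attachEntity and executeQuery. Re-attaching does not faithfully restore pending changes, so treat it as a fallback for display purposes.

```javascript
// Sketch only: `manager` is assumed to expose Breeze's detachEntity,
// attachEntity and executeQuery; `requeryWithRestore` is a hypothetical helper.
function requeryWithRestore(manager, query, inCache) {
    // Remember each entity and the state it had before we detach it.
    var removed = inCache.map(function (e) {
        return { entity: e, state: e.entityAspect.entityState };
    });
    removed.forEach(function (r) {
        manager.detachEntity(r.entity);
    });

    return manager.executeQuery(query).then(undefined, function (error) {
        // The query failed: put the detached entities back so there is
        // still something to show, then let the caller see the error.
        removed.forEach(function (r) {
            manager.attachEntity(r.entity, r.state);
        });
        throw error;
    });
}
```

The caller would pass in the inCache list (parents plus children) built from the local query and handle the rejected promise as usual.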
Why mention a technique that may have limited practical value? Because it is a step along the way to approach #3, which could work.
Attempt #3 - this might actually work
The approach I'm about to describe is often referred to as "Mark and Sweep".
Run the query locally and calculate the inCache list of entities as just described. This time, do not remove those entities from cache. We WILL remove the entities that remain in this list after the query succeeds ... but not just yet.
If the query's MergeOption is "PreserveChanges" (which it is by default), remove every entity from the inCache list (not from the manager's cache!) that has pending changes. We do this because such entities must stay in cache no matter what the state of the entity on the server. That's what "PreserveChanges" means.
We could have done this in our second approach to avoid removing entities with unsaved changes.
Subscribe to the EntityManager.entityChanged event. In your handler, remove the "entity that changed" from the inCache list because the fact that this entity was returned by the query and merged into the cache tells you it still exists on the server. Here is some code for that:
var handlerId = manager.entityChanged.subscribe(trackQueryResults);
function trackQueryResults(changeArgs) {
    var action = changeArgs.entityAction;
    if (action === breeze.EntityAction.AttachOnQuery ||
        action === breeze.EntityAction.MergeOnQuery) {
        var ix = inCache.indexOf(changeArgs.entity);
        if (ix > -1) {
            inCache.splice(ix, 1);
        }
    }
}
If the query fails, forget all of this.
If the query succeeds:
unsubscribe: manager.entityChanged.unsubscribe(handlerId);
subscribe with an orphan detection handler:
var handlerId = manager.entityChanged.subscribe(orphanDetector);
function orphanDetector(changeArgs) {
    var action = changeArgs.entityAction;
    if (action === breeze.EntityAction.PropertyChange) {
        var orphan = changeArgs.entity;
        // do something about this orphan
    }
}
detach every entity that remains in the inCache list:
inCache.slice().forEach(function(e) {
    manager.detachEntity(e);
});
unsubscribe the orphan detection handler.
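To see the whole "Mark and Sweep" flow in one place, here is a minimal sketch on a toy cache. It does not use the Breeze API at all: the Map-based cache, the hasChanges flag and the markAndSweep function are illustrative assumptions that stand in for the EntityManager, pending changes and the query merge.

```javascript
// Toy model: `cache` is a Map of id -> entity, and `hasChanges` marks
// unsaved edits. These names are illustrative, not Breeze API.
function markAndSweep(cache, candidates, serverRows) {
    // Mark: cached candidates without pending changes may be swept later.
    const marked = new Set();
    for (const entity of candidates) {
        if (!entity.hasChanges) {
            marked.add(entity.id);
        }
    }

    // Merge: anything the server returned still exists, so unmark it.
    for (const row of serverRows) {
        marked.delete(row.id);
        const existing = cache.get(row.id);
        if (!existing) {
            cache.set(row.id, Object.assign({ hasChanges: false }, row));
        } else if (!existing.hasChanges) {
            Object.assign(existing, row); // keep object identity for the UI
        }
    }

    // Sweep: whatever is still marked was not returned, so remove it.
    const swept = [];
    for (const id of marked) {
        swept.push(cache.get(id));
        cache.delete(id);
    }
    return swept;
}

const cache = new Map([
    [1, { id: 1, title: "Buy milk", hasChanges: false }],
    [2, { id: 2, title: "Walk dog", hasChanges: false }],
    [3, { id: 3, title: "Draft", hasChanges: true }]
]);
const swept = markAndSweep(cache, Array.from(cache.values()),
    [{ id: 1, title: "Buy oat milk" }, { id: 4, title: "New item" }]);
```

In this toy run the server no longer returns item 2, so it is the only entity swept; item 3 survives because it has unsaved changes, which mirrors the "PreserveChanges" rule described above.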
Orphan Detector?
Detaching an entity can have side-effects. Suppose we have Products and every product has a Color. Some other user hates "red". She deletes some of the red products and changes the rest to "blue". Then she deletes the "red" Color.
You know nothing about this and innocently re-query the Colors. The "red" color is gone and your cleanup process detaches it from cache. Instantly every Product in cache is modified. Breeze doesn't know what the new Color should be so it sets the FK, Product.colorId, to zero for every formerly "red" product.
There is no Color with id=0 so all of these products are in an invalid state (violating referential integrity constraint). They have no Color parent. They are orphans.
Two questions: how do you know this happened to you and what do you do?
Detection
Breeze updates the affected products when you detach the "red" color.
You could listen for a PropertyChanged event raised during the detach process. That's what I did in my code sample. In theory (and I think "in fact"), the only thing that could trigger the PropertyChanged event during the detach process is the "orphan" side-effect.
What do you do?
leave the orphan in an invalid, modified state?
revert to the equally invalid former colorId for the deleted "red" color?
refresh the orphan to get its new color state (or discover that it was deleted)?
There is no good answer. You have your pick of evils with the first two options. I'd probably go with the second as it seems least disruptive. This would leave the products in "Unchanged" state, pointing to a non-existent Color.
It's not much worse than when you query for the latest products and one of them refers to a new Color ("banana") that you don't have in cache.
The "refresh" option seems technically the best. It is unwieldy. It could easily cascade into a long chain of asynchronous queries that could take a long time to finish.
The perfect solution escapes our grasp.
What about the ghosts?
Oh right ... your UI could still be displaying the (fewer) entities that you detached because you believe they were deleted on the server. You've got to remove these "ghosts" from the UI.
I'm sure you can figure out how to remove them. But you have to learn what they are first.
You could iterate over every entity that you are displaying and see if it is in a "Detached" state. YUCK!
Better I think if the cleanup mechanism published a (custom?) event with the list of entities you detached during cleanup ... and that list is inCache. Your subscriber(s) then know which entities have to be removed from the display ... and can respond appropriately.
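A minimal sketch of such a custom event, assuming a hand-rolled publish/subscribe channel (createCleanupChannel is a hypothetical helper, not part of Breeze). The cleanup code would publish the list of entities it detached, and each view would drop them from whatever it displays.

```javascript
// Hypothetical helper: a minimal publish/subscribe channel for cleanup results.
function createCleanupChannel() {
    const handlers = [];
    return {
        subscribe(handler) {
            handlers.push(handler);
            // Return a function that removes this handler again.
            return function unsubscribe() {
                const ix = handlers.indexOf(handler);
                if (ix > -1) {
                    handlers.splice(ix, 1);
                }
            };
        },
        publish(detachedEntities) {
            // Hand each subscriber its own copy of the list.
            handlers.slice().forEach(function (handler) {
                handler(detachedEntities.slice());
            });
        }
    };
}

// Usage: a view keeps its displayed list free of ghosts.
const channel = createCleanupChannel();
const milk = { id: 1 }, ghost = { id: 2 };
let displayed = [milk, ghost];
const stop = channel.subscribe(function (detached) {
    displayed = displayed.filter(function (e) {
        return detached.indexOf(e) === -1;
    });
});
channel.publish([ghost]); // after the sweep, pass the detached entities
stop();
```

Publishing the detached list once per cleanup avoids having every view scan its entities for the "Detached" state.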
Whew! I'm sure I've forgotten something. But now you understand the dimensions of the problem.
What about server notification?
That has real possibilities. If you can arrange for the server to notify the client when any entity has been deleted, that information can be shared across your UI and you can take steps to remove the deadwood.

It's a valid point, but for now we don't ever remove entities from the local cache as a result of a query. But this is a reasonable request, so please add it to the Breeze User Voice: https://breezejs.uservoice.com/forums/173093-breeze-feature-suggestions
In the meantime, you can always create a method that removes the related entities from the cache before the query executes and have the query (with expand) add them back.

Related

How to optimize performance of Results change listeners in Realm (Swift) with a deep hierarchy?

We're using Realm (Swift binding currently in version 3.12.0) from the earliest days in our project. In some early versions before 1.0 Realm provided change listeners for Results without actually giving changeSets.
We used this a lot in order to find out if a specific Results list changed.
Later the Realm team replaced this API with methods that provide changeSets. We had to switch, and we now misuse this API just to find out whether anything in a specific List changed (inserts, deletions, modifications).
Together with RxSwift we wrote our own implementation of Results change listening which looks like this:
public var observable: Observable<Base> {
    return Observable.create { observer in
        let token = self.base.observe { changes in
            if case .update = changes {
                observer.onNext(self.base)
            }
        }
        observer.onNext(self.base)
        return Disposables.create(with: {
            observer.onCompleted()
            token.invalidate()
        })
    }
}
When we now want to have consecutive updates on a list we subscribe like so:
someRealm.objects(SomeObject.self).filter(<some filter>).rx.observable
.subscribe(<subscription code that gets called on every update>)
//dispose code missing
We wrote the extension on RealmCollection so that we can subscribe to List type as well.
The concept is equal to RxRealm's approach.
So now in our App we have a lot of filtered lists/results that we are subscribing to.
As the amount of data grows, we notice significant delays between writing something to the DB and seeing the change on screen.
For example:
Let's say we have a Car Realm Object class with some properties and some 1-to-n and some 1-to-1 relationships. One of the properties is a Bool, namely isDriving.
Now we have a lot of cars stored in the DB and a bunch of change listeners with different filters listening to changes of the cars collection (collection observers listening for changeSets in order to find out if the list was changed).
If I take one car of some list and set the property of isDriving from false to true (important: we do writes in the background) ideally the change listener fires fast and I have the nearly immediate correct response to my write on the main thread.
Added with edit on 2019-06-19:
Let's make the scenario still a little more real:
Let's change something down the hierarchy, let's say the tire manufacturer's name. Let's say a Car has a List<Tire>, a Tire has a Manufacturer and a Manufacturer has a name.
Now we're still listening to Results collection changes with some more or less complex filters applied.
Then we're changing the name of a Manufacturer which is connected to one of the tires which are connected to one of the cars which is in that filtered list.
Can this still be fast?
Obviously when the length of results/lists where change listeners are attached to gets longer Realm's internal change listener takes longer to calculate the differences and fires later.
So after a write we see the changes - in worst case - much later.
In our case this is not acceptable. So we are thinking through different scenarios.
One scenario would be to not use .observe on lists/results anymore and switch to Realm.observe which fires every time anything did change in the realm, which is not ideal, but it is fast because the change calculation process is skipped.
My question is: What can I do to solve this whole dilemma and make our app fast again?
The crucial thing is the threading stuff. We're always writing in the background due to our design. So the writes themselves should be very fast, but then that stuff needs to synchronize to the other threads where Realms are open.
In my understanding that happens after the change detection for all Results has run through, is that right?
So when I read on another thread, the data is only fresh after the thread sync, which happens after all notifications have been sent out. I am not currently sure whether the sync happens before that, which would be even better; I have not tested it yet.

Fix unnecessary copy of NSManagedObject

I'm sorry the title may mislead you, since I'm not so good at English. Let me describe my problem as below (You may skip to the TL;DR version at the bottom of this question).
In Core Data, I designed a Product entity. In the app, I download products from a server. It returns a JSON string, which I parse and then save to Core Data.
After some time has passed, I search for a product on that server again and have some interaction with the server. I'll call the online product XProduct. This product may not exist in Core Data, and I also don't want to save it to Core Data since it may not belong to this system (it comes from another warehouse, not my current warehouse).
Assume this XProduct has the same properties as Product but does not belong to Core Data. The previous developer designed another object, XProduct, and copied everything (the code) from Product. Wow. The other difference between the two is that XProduct has some methods to interact with the server, like: - (void)updateStock:(NSInteger)qty;
Now, if I want to update the Product properties, I have to update XProduct as well. And I have to handle the two separately, like:
id product = anArrayContainsProducts[indexPath.row];
if ([product isKindOfClass:[XProduct class]]) {
// Some stuff with the xproduct
}
else {
// Probably the same display to the cell.
}
TL;DR
Basically, I want to create a scenario like this:
1. Get data from the server.
2. Check whether it exists in Core Data.
3. 2 == true => add to array (may also update some data from the server).
4. 2 == false => create an object (with the same structure as the NSManagedObject) from the JSON dictionary => add to array.
The object created in step 4 will never exist in Core Data.
Questions
1. How can I create an NSManagedObject without adding it to an NSManagedObjectContext, and make sure the app still runs fine?
2. If 1 is not recommended, please suggest a better approach. I really don't like duplicating so much code like that.
Update
I was thinking about inheritance (XProduct : Product) but it still make XProduct the subclass of NSManagedObject, so I don't think that is a good approach.
There are a couple of possibilities that might work.
One is just to create the managed objects but not insert them into a context. When you create a managed object, the context argument is allowed to be nil. For example, call the designated initializer init(entity:insertIntoManagedObjectContext:) with a nil context (the NSEntityDescription convenience method insertNewObjectForEntityForName(_:inManagedObjectContext:) requires a real context). That gives you an instance of the managed object that's not going to be saved. It has the same lifetime as any other object.
Another is to use a second Core Data stack for these objects, with an in-memory persistent store. If you use NSInMemoryStoreType when adding the persistent store (instead of NSSQLiteStoreType), you get a complete, working Core Data stack. Except that when you save changes, they only get saved in memory. It's not really persistent, since it disappears when the app exits, but aside from that it's exactly the same as any other Core Data stack.
I'd probably use the second approach, especially if these objects have any relationships, but either should work.

What's the point of self.managedObjectContext == nil in NSManagedObject prepareForDeletion?

I have a Reminder entity that needs to update its date property whenever a certain entity B is deleted. I've spent some days coding thinking I could do some useful things in my managed object subclass on deletion time. I tried
- (void)willSave
{
    if (self.isDeleted) {
        // use self.managedObjectContext
    }
}
The context was nil. Relationships were also torn down there. Fair enough.
So... I started writing cumbersome code for prepareForDeletion to circumvent the fact that the object hadn't been deleted yet, but then Core Data throws self.managedObjectContext == nil in my face. The documentation says that this is where I do stuff "before relationships are torn down". So what is the point in self.managedObjectContext == nil if self.relationshipA.managedObjectContext is accessible (as the docs suggest)? And more importantly, why does my not yet deleted object not have its context?
I read a comment here regarding that problem
its not 'fault' as much as it is a 'disown', the context has disowned your object (he was deleted and save was committed to the database) and so your object was disowned. don't save in methods that are changing and object as the save should probably be committed/saved after the operation anyway. – Dan Shelly May 21 at 19:05
My code was:
[moc deleteObject:obj];
[moc save:NULL];
When I removed the save operation my self.managedObjectContext existed in prepareForDeletion. That is, until auto-save, when it was nil again. Probably because the parent context also deleted it, followed by a save by the UIManagedDocument.
I'm starting to think that my only options are to make a custom delete method (that works until Core Data cascades a deletion, in which case it won't be called), or make a new class that listens to NSManagedObjectContextDidSaveNotification.
Update:
The user wants to keep in touch with a person, and wants to be reminded after a certain interval (stored in ContactWish) if no contact has been made. What I'm trying to accomplish is that when the latest ContactOccasion for a certain person is deleted, the corresponding occasion->person->wish->reminder gets updated (using the interval).
Since this is a learning experience for me I wanted to find out the right way (one that works with cascade deletion etc.) and not just call for an update manually from every place in my code where I do [MOContext deleteObject:occasion]. Suggestions are welcome.
(the reminder entity has also been prepared for more manual use)
Would it not be much more logical to have the Reminder entity manage its date property? It could "listen" (maybe via changedValues:) to its relationship entities being deleted and perform the update.
This seems more consistent, as the B entity should not really be concerned with the logic of the Reminder entity updates.
Edit
Pursuant to the discussion below and based on my opinion that you cannot load up the database cascade delete model too much with update logic:
Rather than react to a deletion you can introduce an attribute that you set and listen to in order to do the changes.
I really do not see how relying on core data delete mechanisms is easier or more elegant than just writing your own "deleteOccasion" method that handles this logic.

What is the difference between these two ways to refresh the DbContext?

I am using EF 4.4 and I would like to update many entities, but another user may modify some of the entities that the first user is modifying, so I get a concurrency exception. Another case is that the first user tries to add many new records and another user has added some of them in the meantime, so I get an exception saying that some of the records already exist (unique constraint).
I would like to ensure that the first user can finish his operation and add only the records that do not exist yet (all his entities except the ones already added by the second user).
To do that, I need to update the entities in my DbContext, and I see at least two options.
First, in the catch when I capture the update exception, I can do:
ex.Entries.Single().Reload();
The second option is:
myContext.Entry<MyTable>(instance).Reload();
I guess that the second option only refreshes the entity that I use as parameter, but if the problem is that I need to refresh many entities, how can I do that?
What does the first option, Single().Reload(), really do?
When you do
ex.Entries.Single().Reload();
you are sure that the offending entity is refreshed. What it does is take the one and only (Single) entity from DbUpdateConcurrencyException.Entries that could not be saved to the database (in case of a concurrency exception this is always exactly one).
When you do
myContext.Entry(instance).Reload();
You are not sure that you refresh the right entity unless you know that only one entity had changes before SaveChanges was called. If you save an entity with child entities any one of them can cause a concurrency problem.
In EF 6.x (6.1.3), below code will let you find all the changes; the way you asked in your question!
try
{
    var listOfRefreshedObj = db.ChangeTracker.Entries().Select(x => x.Entity).ToList();
    var objContext = ((IObjectContextAdapter)db).ObjectContext;
    objContext.Refresh(System.Data.Entity.Core.Objects.RefreshMode.ClientWins, listOfRefreshedObj);
    await db.Entry(<yourentity>).ReloadAsync();
    return Content(HttpStatusCode.<code>, "<outputmessage>");
}
catch (Exception e)
{
    return Content(HttpStatusCode.<code>, "<exception>");
}
Explanation:
Query Entries in the ChangeTracker and store them in a list
var listOfRefreshedObj = db.ChangeTracker.Entries().Select(x => x.Entity).ToList();
The next step is to refresh the context. In some cases (the row has been removed, etc.), this will throw an exception which you can catch. RefreshMode.ClientWins tells EF to keep the client's property changes and send them to the database on the next update. In some cases, you might want to show the users the changes and let them decide. See the RefreshMode enumeration documentation and the ObjectContext.Refresh method example.
objContext.Refresh(System.Data.Entity.Core.Objects.RefreshMode.ClientWins, listOfRefreshedObj);
You're probably doing this whole thing after you receive DbUpdateConcurrencyException anyways!

Locking before save with fixed concurrencymode

I'm learning about concurrency in conjunction with EF4.0 and have a question about the locking pattern used.
Say I configure a fixed concurrency mode on a version number property.
Now say I fetch a record (entity) from the database (context) and edit some property. The version gets incremented when SaveChanges is called on its context. If the current database (context) version matches the version of the original record (entity) the save continues, otherwise an OptimisticConcurrencyException gets thrown by EF.
Now, my point of interest is the following: between the check of the versions there's always a small period of time, however small, it is there. So in theory someone else could've just updated the record between the comparison and the actual save, thus possibly corrupting the data.
How does this get solved? It feels as if the problem just gets pushed forward.
There is no period of time between checking versions and updating record because the database command looks like:
UPDATE SomeTable
SET SomeColumn = 'SomeValue'
WHERE Id = @Id AND Version = @OldVersion
SELECT @@ROWCOUNT
The check and update is one atomic operation. Rowcount will return 0 if no record with Id = @Id and Version = @OldVersion exists and that zero is translated to the exception.
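To illustrate the "check and update in one step" idea outside of SQL, here is a small sketch on an in-memory table. Everything in it is an assumption made for illustration: the Map-based table, the function names, and the choice to bump the version inside the update. Single-threaded JavaScript simply stands in for the atomicity that the database statement provides.

```javascript
class ConcurrencyError extends Error {}

// Toy table: Map of id -> row. `updateIfVersionMatches` plays the role of the
// UPDATE ... WHERE Id = ... AND Version = ... statement and returns the row count.
function updateIfVersionMatches(table, id, oldVersion, changes) {
    const row = table.get(id);
    if (!row || row.version !== oldVersion) {
        return 0; // no row matched the WHERE clause
    }
    // Assumption: the version is bumped as part of the same update.
    Object.assign(row, changes, { version: oldVersion + 1 });
    return 1;
}

// A row count of zero is translated into an exception, as described above.
function saveChanges(table, id, oldVersion, changes) {
    if (updateIfVersionMatches(table, id, oldVersion, changes) === 0) {
        throw new ConcurrencyError("Row " + id + " was changed by someone else");
    }
}

const table = new Map([[7, { name: "Widget", version: 1 }]]);
saveChanges(table, 7, 1, { name: "Widget v2" }); // first user wins
let conflict = null;
try {
    saveChanges(table, 7, 1, { name: "Gadget" }); // second user still holds version 1
} catch (err) {
    conflict = err;
}
```

The second save fails because the version it read no longer matches, so the first user's change is not overwritten.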
This can be (and probably is) solved using locking hints.
For SQL Server, EF can query (SELECT) from the database WITH UPDLOCK.
This tells the Database Engine that you want to read one or several records, and that nobody else can change those records until you perform an update thereafter.
If you want to see this for yourself, check out the Sql Server Profiler which will show you the queries in real-time.
Hope that helps.
CAVEAT: I can't say for sure that this is the way EF handles this scenario because I haven't checked myself but, certainly if you were going to do it yourself, this is one way to do it.

Resources