How to optimize performance of Results change listeners in Realm (Swift) with a deep hierarchy?

How to optimize performance of Results change listeners in Realm (Swift) with a deep hierarchy? - ios

We're using Realm (Swift binding currently in version 3.12.0) from the earliest days in our project. In some early versions before 1.0 Realm provided change listeners for Results without actually giving changeSets.
We used this a lot in order to find out if a specific Results list changed.
Later the guys at Realm exchanged this API with changeSet providing methods. We had to switch and are now mistreating this API just in order to find out if anything in a specific List changed (inserts, deletions, modifications).
Together with RxSwift we wrote our own implementation of Results change listening which looks like this:
public var observable: Observable<Base> {
return Observable.create { observer in
let token = self.base.observe { changes in
if case .update = changes {
observer.onNext(self.base)
}
}
observer.onNext(self.base)
return Disposables.create(with: {
observer.onCompleted()
token.invalidate()
})
}
}
When we now want to have consecutive updates on a list we subscribe like so:
someRealm.objects(SomeObject.self).filter(<some filter>).rx.observable
.subscribe(<subscription code that gets called on every update>)
//dispose code missing
We wrote the extension on RealmCollection so that we can subscribe to List type as well.
The concept is equal to RxRealm's approach.
So now in our App we have a lot of filtered lists/results that we are subscribing to.
When data gets more and more we notice significant performance losses when it comes to seeing a change visually after writing something into the DB.
For example:
Let's say we have a Car Realm Object class with some properties and some 1-to-n and some 1-to-1 relationships. One of the properties is a Bool, namely isDriving.
Now we have a lot of cars stored in the DB and bunch of change listeners with different filters listing to changes of the cars collection (collection observers listening for changeSets in order to find out if the list was changed).
If I take one car of some list and set the property of isDriving from false to true (important: we do writes in the background) ideally the change listener fires fast and I have the nearly immediate correct response to my write on the main thread.
Added with edit on 2019-06-19:
Let's make the scenario still a little more real:
Let's change something down the hierarchy, let's say the tires manufacturer's name. Let's say a Car has a List<Tire>, a Tire has a Manufacturer and a Manufacturer has aname.
Now we're still listing toResultscollection changes with some more or less complex filters applied.
Then we're changing the name of aManufacturer` which is connected to one of the tires which are connected to one of the cars which is in that filtered list.
Can this still be fast?
Obviously when the length of results/lists where change listeners are attached to gets longer Realm's internal change listener takes longer to calculate the differences and fires later.
So after a write we see the changes - in worst case - much later.
In our case this is not acceptable. So we are thinking through different scenarios.
One scenario would be to not use .observe on lists/results anymore and switch to Realm.observe which fires every time anything did change in the realm, which is not ideal, but it is fast because the change calculation process is skipped.
My question is: What can I do to solve this whole dilemma and make our app fast again?
The crucial thing is the threading stuff. We're always writing in the background due to our design. So the writes itself should be very fast, but then that stuff needs to synchronize to the other threads where Realms are open.
In my understanding that happens after the change detection for all Results has run through, is that right?
So when I read on another thread, the data is only fresh after the thread sync, which happens after all notifications were sent out. But I am not sure currently if the sync happens before, that would be more awesome, did not test it by now.

Related

Prevent redundant operations in RxSwift

I'm starting my adventure with RxSwift, having small experience with React in js already. I think that my problem is common, but I'm not sure how to describe it in concise abstract way, so instead I will describe it on the example.
I'm building iOS app showing some charts. The part of interest consist of ChartAreaController, ChartInfoController, both embedded in ChartController. First controller is the area showing some graph(based on rx chartData property), and the second one among others will have a slider for user to restrict show x-value (rx selectedXRange property) which is restricted to be between some min and max. The min/max value is defined by the current chart data.
Behavior when slider change updates chart is defined in ChartController:
override func viewDidLoad() {
super.viewDidLoad()
(...)
chartInfoController.selectedXRange.asObservable()
.subscribe(onNext: { [unowned self] selectedXRange in
(...)
let chartData = self.filterChartData(from: self.rawChartData, in: selectedXRange)
self.chartAreaController.chartData.accept(chartData)
}).disposed(by: disposeBag)
The filterChartData() method just filters out data that is not in the range, but for the sake of the argument we can assume it is very costly and I don't want it to run twice when it is not necessary.
When user changes the chart he or she wants to show, the new data arrives from server (again ChartController):
private func handleNewData(_ rawChartData: ChartData) {
self.rawChartData = rawChartData
guard let allowedXRange = rawChartData.xRange() else { return }
let selectedXRange = chartInfoController.selectedXRange.value
let newSelectedXRange = calculateSelectedXRange(currentSelectedDays: selectedDaysRange, availableDaysRange: daysRange)
let chartData = filterChartData(from: rawChartData, in: selectedXRange)
self.chartInfoController.allowedXRange = allowedXRange //this line is not crucial
self.chartInfoController.selectedXRange.accept(newSelectedXRange)
self.chartAreaController.chartData.accept(rawChartData)
}
So upon new chart data arrival it may be the case that the currently selected xRange must be trimmed because of the new min/max values of the data. So the side effect of the method will be changing the selectedXRange and in turn running the subscription I pasted earlier. So when new data arrives the chartData is updated twice and I don't want it to happen.
Of course I can comment out last line of the handleNewData() method, but I don't like it very much, since main reason for existence of the handleNewData() is to set chartData, and with the line commented out it's goal would be achieved because of the side effect of the method (which is updating the slider). Not acceptable.
To chartData I added throttle anyways, because fast moving slider will result in many updates and this solves my problem partially(chartData updated only once). But as you may remember the filterChartData() method is costly, and this part will still be running twice.
So the one question is, if my general layout of tackling the problem is OK, or should it be handled way different? At this point I came to conclusion that I'm looking for some way of temporary disabling particular subscription on selectedXRange (without damaging other subscriptions to that variable). Temporary meaning:
(...)
//disable subscription
self.chartInfoController.selectedXRange.accept(newSelectedXRange)
self.chartAreaController.chartData.accept(rawChartData)
//enable subscription
(...)
This seem legit to me, since ChartController as an owner of the subscription and changer of the values may want to disable the subscription whenever it suits him(it?).
Does RxSwift support something like this? If not, then I think I can achieve it myself e.g. via bool property in ChartController, or via adding the subscription to separate disposeBag, which I would dispose and then recreate the subscription. But if it's good thing to do? For example bool solution may be prone to be ill handled when there is some error, and dispose/recreate may be somehow costly, and it may be the case that disposal was not intended to be used like that.
Is there a better practice to handle such situations? As I said I think the problem is common so I hope there is a canonical solution to it :) Thanks for any answer, sorry for the lengthy post.

So the one question is, if my general layout of tackling the problem is OK, or should it be handled way different?
A properly written UI input element observable will only fire when the user makes a change to the UI, not when the program makes a change. For example:
textField.rx.text.orEmpty.subscribe(onNext: { print($0) }) will only print a value when the user types in the textField, not when you call textField.text = "foo" or from a binding .bind(to: textfield.rx.text).
If you wrote the ChartInfoController, I suggest you modify it to work the way the other UI elements do. If you didn't write it, submit an issue to the developer/maintainer.
Does RxSwift support something like [temporarily disabling particular subscription]?
It depends on what you mean by "temporarily disabling". It doesn't support silently unsubscribing and resubscribing but there are plenty of operators that will filter out some events they receive while passing others along. For example filter, throttle, debounce, ignoreElements... There's a lot of them that do that.
Is there a better practice to handle such situations?
Then best solution is mentioned above.

When We have multiple subscriptions to the same Observable, it will re-execute for each subscription.
To stop re-execute for each subscription. RxSwift has several operators for this: share(), replay(), replayAll(), shareReplay(), publish(), and even shareReplayLatestWhileConnected().
read more at (RxSwift: share vs replay vs shareReplay)

GameplayKit - Confusion about sending messages between components

I'm diving into GameplayKit with Spritekit and from what I gather, you subclass GKEntity and then start adding GKComponents to that entity. The entity would more or less just be a bag of components that fill some function.
The part I'm confused about is communication between components. How do you keep them decoupled. For example, lets say I have a HealthComponent class, and I add that component to a PlayerEntity, and an EnemyEntity. I also have a HealthBarComponent but I only want a health bar to appear above the player. When the player takes damage, that information needs to be updated in the HealthBarComponent.
So how should that information be sent? I see there is a class called GKComponentSystem in the documentation. I'm not 100% on how that should be used.
Another question is.. when the player's health reaches zero he should regenerate while the enemy should stay dead. When the player runs out of lives, the game ends.
The health system for both an enemy and player would be roughly the same, but the events around death would be totally different for each. I'm not following how to use a component system while keeping the unique behavior of each entity.
some pseudo-code would be great

It looks like this framework works a bit differently than others I've seen, in that systems only work on a single type of component rather than on entities with groups of component types.
In the GamePlay-Kit framework you can either loop through and update your entities manually (which in turns updates each entities components) or create a class which inherits from GKComponentSystem. Then when you update the system it updates all the components you have added to it so long as their class type matches the type you initialized the system with
To handle your health bar issue I would say make a HealthBarComponent which, during its update, retrieves the HealthComponent from its owner entity, reads the health value & renders itself every frame. But you only add a HealthBarComponent to your player entity.
You can retrieve the owner entity from a component and vice versa (see GKComponent.entity & GKEntity.components) so you could update your health bar like this:
/*Note: I'm not an ios developer, this is pseudocode*/
HealthBarComponent.update(){
int healthValue = self.entity.components.get(HealthComponent).value;
//Now render your health bar accordingly or hold onto this for later
}
To handle the player death issue I would think your best bet would be to have two different types of Health components (PlayerHealth & EnemyHealth) with two different corresponding systems. That or have a single Health component but 2 separate 'Death' components. Yeah it seems redundant but I'm having a hard time thinking of a better way inside of this framework. In my experience you either spend your time worrying about keeping everything completely decoupled and reusable or you build a game :)

Components can access their entities via the self.entity property. From there, you can query other components to pass data to via the entity componentForClass property:
guard let moveComponent = entity?.componentForClass(MoveComponent.self) else {
fatalError("A MovementComponent's entity must have a MoveComponent")
}
The iOS Games by Tutorials book has an excellent tutorial that covers GameplayKit entities and components.

Notifications are nice for this kind of problem. There's no direct object coupling at all, and it's very extensible if more than one object ends up needing to know (e.g. your health bar component, a higher-level game object that determines game over, maybe even enemy/ally AI to behave differently when health is low).
You could have a notification named "playerHealthChanged" and have your health-bar component and other objects register independently to respond to that event. (You likely need to let the HealthComponent instance know whether it should post this notification, so only the player posts - perhaps it could use isKind(of:) on its entity, or just have a bool field to enable posting, set to true for the player instance).
I usually put all notification name definitions in a single module so any class can access them:
let kPlayerHealthChangedNotification = Notification.Name("playerHealthChanged")
Here's the way the component would post the notification (you could pass an object other than self if desired):
NotificationCenter.default.post(name: kPlayerHealthChangedNotification, object:self)
Then the objects that care about player health changes can register like so in their init code - create a handler function, than add self as an observer for the notification:
#objc func onPlayerHealthChanged(_ notification:Notification) {
// do whatever is needed; notification.object
// has the object from the post
}
// put this in a setup method - init or similar:
NotificationCenter.default.addObserver(self,
selector: #selector(onPlayerHealthChanged(_:)),
name: kPlayerHealthChangedNotification
object:nil)
Here's the documentation: https://developer.apple.com/documentation/foundation/notifications

Breeze: When child entities have been deleted by someone else, they still appear after reloading the parent

We have a breeze client solution in which we show parent entities with lists of their children. We do hard deletes on some child entities. Now when the user is the one doing the deletes, there is no problem, but when someone else does, there seems to be no way to invalidate the children already loaded in cache. We do a new query with the parent and expanding to children, but breeze attaches all the other children it has already heard of, even if the database did not return them.
My question: shouldn't breeze realize we are loading through expand and thus completely remove all children from cache before loading back the results from the db? How else can we accomplish this if that is not the case?
Thank you

Yup, that's a really good point.
Deletion is simply a horrible complication to every data management effort. This is true no matter whether you use Breeze or not. It just causes heartache up and down the line. Which is why I recommend soft deletes instead of hard deletes.
But you don't care what I think ... so I will continue.
Let me be straight about this. There is no easy way for you to implement a cache cleanup scheme properly. I'm going to describe how we might do it (with some details neglected I'm sure) and you'll see why it is difficult and, in perverse cases, fruitless.
Of course the most efficient, brute force approach is to blow away the cache before querying. You might as well not have caching if you do that but I thought I'd mention it.
The "Detached" entity problem
Before I continue, remember the technique I just mentioned and indeed all possible solutions are useless if your UI (or anything else) is holding references to the entities that you want to remove.
Oh, you'll remove them from cache alright. But whatever is holding references to them now will continue to have a reference to an entity object which is in a "Detached" state - a ghost. Making sure that doesn't happen is your responsibility; Breeze can't know and couldn't do anything about it if it did know.
Second attempt
A second, less blunt approach (suggested by Jay) is to
apply the query to the cache first
iterate over the results and for each one
detach every child entity along the "expand" paths.
detach that top level entity
Now when the query succeeds, you have a clear road for it to fill the cache.
Here is a simple example of the code as it relates to a query of TodoLists and their TodoItems:
var query = breeze.EntityQuery.from('TodoLists').expand('TodoItems');
var inCache = manager.executeQueryLocally(query);
inCache.slice().forEach(function(e) {
inCache = inCache.concat(e.TodoItems);
});
inCache.slice().forEach(function(e) {
manager.detachEntity(e);
});
There are at least four problems with this approach:
Every queried entity is a ghost. If your UI is displaying any of the queried entities, it will be displaying ghosts. This is true even when the entity was not touched on the server at all (99% of the time). Too bad. You have to repaint the entire page.
You may be able to do that. But in many respects this technique is almost as impractical as the first. It means that ever view is in a potentially invalid state after any query takes place anywhere.
Detaching an entity has side-effects. All other entities that depend on the one you detached are instantly (a) changed and (b) orphaned. There is no easy recovery from this, as explained in the "orphans" section below.
This technique wipes out all pending changes among the entities that you are querying. We'll see how to deal with that shortly.
If the query fails for some reason (lost connection?), you've got nothing to show. Unless you remember what you removed ... in which case you could put those entities back in cache if the query fails.
Why mention a technique that may have limited practical value? Because it is a step along the way to approach #3 that could work
Attempt #3 - this might actually work
The approach I'm about to describe is often referred to as "Mark and Sweep".
Run the query locally and calculate theinCache list of entities as just described. This time, do not remove those entities from cache. We WILL remove the entities that remain in this list after the query succeeds ... but not just yet.
If the query's MergeOption is "PreserveChanges" (which it is by default), remove every entity from the inCache list (not from the manager's cache!) that has pending changes. We do this because such entities must stay in cache no matter what the state of the entity on the server. That's what "PreserveChanges" means.
We could have done this in our second approach to avoid removing entities with unsaved changes.
Subscribe to the EntityManager.entityChanged event. In your handler, remove the "entity that changed" from the inCache list because the fact that this entity was returned by the query and merged into the cache tells you it still exists on the server. Here is some code for that:
var handlerId = manager.entityChanged.subscribe(trackQueryResults);
function trackQueryResults(changeArgs) {
var action = changeArgs.entityAction;
if (action === breeze.EntityAction.AttachOnQuery ||
action === breeze.EntityAction.MergeOnQuery) {
var ix = inCache.indexOf(changeArgs.entity);
if (ix > -1) {
inCache.splice(ix, 1);
}
}
}
If the query fails, forget all of this
If the query succeeds
unsubscribe: manager.entityChanged.unsubscribe(handlerId);
subscribe with orphan detection handler
var handlerId = manager.entityChanged.subscribe(orphanDetector);
function orphanDetector(changeArgs) {
var action = changeArgs.entityAction;
if (action === breeze.EntityAction.PropertyChange) {
var orphan = changeArgs.entity;
// do something about this orphan
}
}
detach every entity that remains in the inCache list.
inCache.slice().forEach(function(e) {
manager.detachEntity(e);
});
unsubscribe the orphan detection handler
Orphan Detector?
Detaching an entity can have side-effects. Suppose we have Products and every product has a Color. Some other user hates "red". She deletes some of the red products and changes the rest to "blue". Then she deletes the "red" Color.
You know nothing about this and innocently re-query the Colors. The "red" color is gone and your cleanup process detaches it from cache. Instantly every Product in cache is modified. Breeze doesn't know what the new Color should be so it sets the FK, Product.colorId, to zero for every formerly "red" product.
There is no Color with id=0 so all of these products are in an invalid state (violating referential integrity constraint). They have no Color parent. They are orphans.
Two questions: how do you know this happened to you and what do your do?
Detection
Breeze updates the affected products when you detach the "red" color.
You could listen for a PropertyChanged event raised during the detach process. That's what I did in my code sample. In theory (and I think "in fact"), the only thing that could trigger the PropertyChanged event during the detach process is the "orphan" side-effect.
What do you do?
leave the orphan in an invalid, modified state?
revert to the equally invalid former colorId for the deleted "red" color?
refresh the orphan to get its new color state (or discover that it was deleted)?
There is no good answer. You have your pick of evils with the first two options. I'd probably go with the second as it seems least disruptive. This would leave the products in "Unchanged" state, pointing to a non-existent Color.
It's not much worse then when you query for the latest products and one of them refers to a new Color ("banana") that you don't have in cache.
The "refresh" option seems technically the best. It is unwieldy. It could easily cascade into a long chain of asynchronous queries that could take a long time to finish.
The perfect solution escapes our grasp.
What about the ghosts?
Oh right ... your UI could still be displaying the (fewer) entities that you detached because you believe they were deleted on the server. You've got to remove these "ghosts" from the UI.
I'm sure you can figure out how to remove them. But you have to learn what they are first.
You could iterate over every entity that you are displaying and see if it is in a "Detached" state. YUCK!
Better I think if the cleanup mechanism published a (custom?) event with the list of entities you detached during cleanup ... and that list is inCache. Your subscriber(s) then know which entities have to be removed from the display ... and can respond appropriately.
Whew! I'm sure I've forgotten something. But now you understand the dimensions of the problem.
What about server notification?
That has real possibilities. If you can arrange for the server to notify the client when any entity has been deleted, that information can be shared across your UI and you can take steps to remove the deadwood.

It's a valid point but for now we don't ever remove entities from the local cache as a result of a query. But.. this is a reasonable request, so please add this to the breeze User Voice. https://breezejs.uservoice.com/forums/173093-breeze-feature-suggestions
In the meantime, you can always create a method that removes the related entities from the cache before the query executes and have the query (with expand) add them back.

Breeze entity manager "hasChanges" property - how do I find out what is triggering the "true" state?

I have the following event handler in my datacontext:
manager.hasChangesChanged.subscribe(function (eventArgs) {
hasChanges(eventArgs.hasChanges);
});
and in Chrome I've set a break point on the "haschanges(eventArg.haschanges);" line.
The moment I load my app and the process of fetching data begins, this breakpoint is hit. It then proceeds to be repeatedly hit and the "hasChanges" property varies between "true" and "false" many times.
I know from further debug breakpoints that a simple query that "expands" a related table via its navigation property triggers a visit to my "hasChangesChanged" event handler.
What I don't know - as the "eventArgs" is so big and complex - is exactly which of my 5 or so related entities being retrieved is triggering the "true" on the "hasChanges" property. Is there a property within the eventArgs I can inspect to determine which current entity has caused the trip to the hasChangesChanged event handler?
I'm puzzled about why any of what I'm doing is setting "hasChanges" to true as all I do in the first instance is retrieve data. As far as I'm aware, nothing is changed whatsoever at the point the entity manager is convinced that something has changed.
To elaborate, my app prefetches lots of data used for a tree structure at the point where it is sitting waiting for first input from the user. As the user has not had an opportunity of touching anything in the app by this point, why would breeze think that any of the entities concerned have been changed when they've simply been read in from the database?

Use the EntityManager.entityChanged event if you want fine grained information about what has changed. This event gives much more detail but is fired much more often.
http://www.breezejs.com/sites/all/apidocs/classes/EntityManager.html

Is it OK to open a DB4o file for query, insert, update multiple times?

This is the way I am thinking of using DB4o. When I need to query, I would open the file, read and close:
using (IObjectContainer db = Db4oFactory.OpenFile(Db4oFactory.NewConfiguration(), YapFileName))
{
try
{
List<Pilot> pilots = db.Query<Pilot>().ToList<Pilot>();
}
finally
{
try { db.Close(); }
catch (Exception) { };
}
}
At some later time, when I need to insert, then
using (IObjectContainer db = Db4oFactory.OpenFile(Db4oFactory.NewConfiguration(), YapFileName))
{
try
{
Pilot pilot1 = new Pilot("Michael Schumacher", 100);
db.Store(pilot1);
}
finally
{
try { db.Close(); }
catch (Exception) { };
}
}
In this way, I thought I will keep the file more tidy by only having it open when needed, and have it closed most of the time. But I keep getting InvalidCastException
Unable to cast object of type 'Db4objects.Db4o.Reflect.Generic.GenericObject' to type 'Pilot'
What's the correct way to use DB4o?

No, it's not a good idea to work this way. db4o ObjectContainers are intended to be kept open all the time your application runs. A couple of reasons:
db4o maintains a reference system to identify persistent objects, so it can do updates when you call #store() on an object that is already stored (instead of storing new objects) . This reference system is closed when you close the ObjectContainer, so updates won't work.
Class Metadata would have to be read from the database file every time you reopen it. db4o would also have to analyze the structure of all persistent classes again, when they are used. While both operations are quite fast, you probably don't want this overhead every time you store a single object.
db4o has very efficient caches for class and field indexes and for the database file itself. If you close and reopen the file, you take no advantage of them.
The way you have set up your code there could be failures when you work with multiple threads. What if two threads would want to open the database file at exactly the same time? db4o database files can be opened only once. It is possible to run multiple transactions and multiple threads against the same open instance and you can also use Client/Server mode if you need multiple transactions.
Later on you may like to try Transparent Activation and Transparent Persistence. Transparent Activation lazily loads object members when they are first accessed. Transparent Persistence automatically stores all objects that were modified in a transaction. For Transparent Activation (TA) and Transparent Persistence (TP) to work you certainly have to keep the ObjectContainer open.
You don't need to worry about constantly having an open database file. One of the key targets of db4o is embedded use in (mobile) devices. That's why we have written db4o in such a way that you can turn your machine off at any time without risking database corruption, even if the file is still open.
Possible reasons why you are getting a GenericObject back instead of a Pilot object:
This can happen when the assembly name of the assembly that contains the Pilot object has changed between two runs, either because you let VisualStudio autogenerate the name or because you changed it by hand.
Maybe "db4o" is part of your assembly name? One of the recent builds was too agressive at filtering out internal classes. This has been fixed quite some time ago. You may like to download and try the latest release, "development" or "production" should both be fine.
In a presentation I once did I have once seen really weird symptoms when db4o ObjectContainers were opened in a "using" block. You probably want to work without that anyway and keep the db4o ObjectContainer open all the time.

It is ok to reopen the database multiple times. The problem would be performance and loosing the "identity". Also you can't keep a reference to a result of a query and try to iterate it after closing the db (based on you code, looks like you want to do that).
GenericObjects are instantiated when the class cannot be found.
Can you provide a full, minimalist, sample that fails for you?
Also, which db4o version are you using?
Best

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart