How should I use a storage provider with Orleans?

I'm a newbie to Orleans. I'd like to know how I can use the grain storage feature in Orleans. Should I use it like a message queue? Does it store my state temporarily, and does it keep the data available even if exceptions are thrown or the server crashes?
Thanks!

Grains that extend the Grain&lt;T&gt; class and are annotated with a [StorageProvider] attribute will write their current state to the specified provider when you call base.WriteStateAsync().
If the grain is deactivated for any reason (including a server crash), then upon reactivation the grain will be initialized with the state that was last saved.
I like to think of it as a cache rather than a queue. Hope that helps, and, like the previous poster said, read the documentation; it's useful.
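To make that concrete, here is a minimal sketch of a persistent grain. The state class, grain interface, and provider name "MyStore" are invented for illustration; the [StorageProvider] attribute, Grain&lt;T&gt; base class, and WriteStateAsync() call are the Orleans API described above.

```csharp
using System.Threading.Tasks;
using Orleans;
using Orleans.Providers;

// Hypothetical state class: plain serializable data.
public class GreetingState
{
    public string LastGreeting { get; set; }
}

public interface IGreetingGrain : IGrainWithIntegerKey
{
    Task SetGreeting(string text);
    Task<string> GetGreeting();
}

// "MyStore" must match the name of a provider configured on the silo.
[StorageProvider(ProviderName = "MyStore")]
public class GreetingGrain : Grain<GreetingState>, IGreetingGrain
{
    public async Task SetGreeting(string text)
    {
        State.LastGreeting = text;
        await WriteStateAsync(); // persist current state to "MyStore"
    }

    // No explicit read needed: State was loaded when the grain activated.
    public Task<string> GetGreeting() => Task.FromResult(State.LastGreeting);
}
```

If the silo crashes after WriteStateAsync() completes, the next activation of the same grain identity starts with the saved GreetingState.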

I have written a couple of articles to guide you step by step through getting used to the Storage Provider API and setting up your persistence store:
Introduction to Grain Persistence with Microsoft Orleans
Orleans Grain Persistence with the ADO .NET Storage Provider
Basically, Orleans gives you a very simple API (covered in more detail in the first article above):
Your grain will inherit from Grain&lt;T&gt;, where T is your own class containing the state that you want to persist. The State property of Grain&lt;T&gt; lets you read and modify that state. The remaining async methods let you save changes to the persistence store, read the state back, or clear it. You typically don't need to read the state explicitly; that is done automatically when the grain is activated.
There are no message queues involved. When you call one of these three methods, it will use the underlying storage provider to talk to whatever database you are using. This may fail due to store-specific errors (e.g. deadlocks) or due to an InconsistentStateException, which is the result of a failed optimistic concurrency check.
Whatever storage provider you decide to use (e.g. SQL Server, Azure Table Storage, in-memory, etc.) must be configured via either XML config or code, and given a name. This name is then used in a [StorageProvider] attribute on the grain class; this way, the grain knows which storage provider to use when doing its persistence work (you can have several in your system).
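For example, with the classic XML configuration, an in-memory provider named "MyStore" (the name here is just an example) would be registered in the silo configuration roughly like this:

```xml
<OrleansConfiguration xmlns="urn:orleans">
  <Globals>
    <StorageProviders>
      <!-- Name="MyStore" is what [StorageProvider(ProviderName = "MyStore")] refers to -->
      <Provider Type="Orleans.Storage.MemoryStorage" Name="MyStore" />
    </StorageProviders>
  </Globals>
</OrleansConfiguration>
```

Swap the Type for the SQL Server or Azure Table Storage provider in a real deployment; MemoryStorage loses data when the silo stops, so it is only suitable for development.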
The details of how all this is done are a bit lengthy to include here (which is why I wrote articles on the subject). You can find more information about this either in my articles linked above, or the Grain Persistence documentation.

Related

Syncing of memory and database objects upon changes in objects in memory

I am currently implementing a web application in .NET Core (C#) using Entity Framework. While working on the project I have encountered quite a few challenges, but I will start with the ones I think are most important. My questions are as follows:
Instead of frequently loading data from the database, I keep a set of static objects that mirror the data in the database. However, it is tedious and error-prone to ensure that any changes, i.e., adding/deleting/modifying objects, are saved to the database in real time. Is there any good example or advice I can refer to in order to improve my approach?
Another thing is that the value of some objects' properties will change on the fly according to the value of other objects' properties. It's something like a spreadsheet, where a cell's value changes automatically if the value in the cell its formula refers to changes. I do not have a solution for this yet, so I'd appreciate any example I can refer to. But this will add another layer of complexity to syncing the changes of the in-memory objects to the database.
At the moment, I am unsure whether there is a better approach. I'd appreciate any help. Thanks!
Basically, you're facing a problem that's called eventual consistency. Something changes, and two or more systems need to be made aware of it at the same time. The problem here is that both changes need to be applied in order to consider the operation successful. If either one fails, you need to know.
In your case, I would use Azure Service Bus. You can create queues and put messages on a queue; an Azure Function would handle these queue messages. You would create two queues, one for database updates and one for the in-memory update (I think changing this to a cache service may be something to think of). The advantage of these queues is that you can easily drop messages on them from anywhere. Because you mentioned the object is going to evolve, you may need to update these objects either in the database or in memory (cache).
Once you've done that, I'd create a topic with two subscriptions: one forwarding messages to Queue 1, and the other to Queue 2. This will solve your primary problem. In case an object changes, just send it to the topic. Both changes (database and memory) will be executed automagically.
The only problem you have now is that you mentioned you wanted to update the database in real time. With this scenario, you're going to have to give that up.
Also, you need to make sure you have proper alerts in place for the queues, so in case you miss a message, or your functions don't handle it well enough, you'll receive an alert and can check and correct the errors.
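As a sketch of the publish side using the Azure.Messaging.ServiceBus client library: the topic name "object-changes" and the ObjectChanged payload are assumptions for illustration; the topic's two subscriptions would forward copies to the database queue and the cache queue.

```csharp
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

// Hypothetical change notification; both subscribers receive a copy.
public record ObjectChanged(int Id, string Property, string NewValue);

public class ChangePublisher
{
    private readonly ServiceBusSender _sender;

    public ChangePublisher(string connectionString)
    {
        var client = new ServiceBusClient(connectionString);
        // "object-changes" is an assumed topic name with two subscriptions:
        // one forwarding to the database queue, one to the cache queue.
        _sender = client.CreateSender("object-changes");
    }

    public async Task PublishAsync(ObjectChanged change)
    {
        var message = new ServiceBusMessage(JsonSerializer.Serialize(change));
        await _sender.SendMessageAsync(message); // the topic fans out to both queues
    }
}
```

Each queue is then drained by its own Azure Function, which keeps the two consumers independent: a failure updating the cache does not block the database update, and vice versa.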
I totally agree with #nineedm's answer, but there are also other solutions.
If you introduce a cache, you will always face the cache invalidation problem: you have to mark the cache as invalid when the data changes. Sometimes this is easy, depending on the nature of the cached data and how often the data changes.
If you have just a single application, MemoryCache can be enough, with properly specified expiration options.
If there is a cluster, you have to look at distributed cache solutions, for example Redis. There is an MS article about that: Distributed caching in ASP.NET Core.
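For the single-application case, a minimal sketch with Microsoft.Extensions.Caching.Memory (the key name and expiration values are arbitrary examples):

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

var cache = new MemoryCache(new MemoryCacheOptions());

// Cache a value with both absolute and sliding expiration.
cache.Set("product:42", "Widget", new MemoryCacheEntryOptions
{
    AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10),
    SlidingExpiration = TimeSpan.FromMinutes(2)
});

// On read, fall back to the database when the entry has expired.
if (!cache.TryGetValue("product:42", out string name))
{
    // name = LoadFromDatabase(42);  // hypothetical reload
}

// When the data changes, invalidate the entry explicitly.
cache.Remove("product:42");
```

The explicit Remove on write is the invalidation step mentioned above; the expirations are a safety net for changes you fail to propagate.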

How to handle SAP Kapsel Offline app OData conflicts properly?

I built an app that is able to store OData offline by using the SAP Kapsel plugins.
More or less it's the same as generated by Web IDE, or similar to the apps in this example: https://blogs.sap.com/2017/01/24/getting-started-with-kapsel-part-10-offline-odatasp13/
Now I am at the point of checking the error resolution potential. I created a sync conflict (changing data on the server after the offline database was stored, then changing something in the app and starting a flush).
As mentioned in the documentation, I can see the error in the ErrorArchive and can also see some details. But what I am missing is the "current" data in the backend database.
In the error details I can only see the data on the device, not the data changed on the server.
For example:
Device is loading some names into the offline store
Device is offline
User A is changing some names
User B is changing one of these names directly online
User A is online again and starts a sync
User A is now informed about the entity that was changed, BUT:
not about the content user B entered;
I just see the "offline" data.
Is there a solution to see the "current" and the "offline" one in a kind of compare view?
Please also note that the server communication is done by the Kapsel plugin and not with normal AJAX calls. That could be an alternative, but I am wondering whether there is a smarter way supported by the API.
Meanwhile I figured out how to load the online data (manually).
This can be done by switching the HTTP handler back to the normal one:
sap.OData.removeHttpClient();
sap.OData.applyHttpClient();
Anyhow, this does not look like a proper solution, and I also have an issue with the conflict log itself: it must be deleted before any refresh can be applied.
I could not find any proper documentation for that. ETag handling is also hardly described in the SAPUI5 and SAP Kapsel documentation.
This question is a really tricky one, due to its implications. I understand that you are simulating a synchronization error due to concurrent modification, and want to know if there is a way for the client to obtain the "current" server state in order to give the user a means to compare the local and server state.
First, let me give you the short answer: No, there is no way for the client to see the current server state "for reference" via the Offline APIs when there are synchronization errors. Doing an online query as outlined above might work, but it certainly is a bad idea.
Now for the longer answer, which explains why this is not necessarily a defect and why I said there are quite some implications to the answer.
Types of Synchronization Errors
We distinguish a number of synchronization errors, and in this context, we are clearly dealing with business-related issues. There are two subtypes here: Those that the user can correct, e.g. validation errors, and those that are issues in the business process itself.
If the user violates the input range, e.g. by putting a negative price for a product, the server would reply with the corresponding message: "-1 is not a valid input value for 'Price'". You, as a developer, can display such messages to the user from the error archive, and the ensuing fix is indeed a very easy one.
Now when we talk about concurrent modification, things get really, really nasty. In fact, I like to say that in this case there is an issue with the business process, because on one hand we allow data to get out of sync, and on the other hand the process allows multiple users to manipulate the same piece of information. How all relevant users should be notified and synchronize is no longer just a technical detail, but in fact a new business process. There just is no way to generically decide how to handle this case. In most cases, it would involve back-office experts who need to decide how the changes should be merged.
A Better Solution
Angstrom pointed out that there is no way to manipulate ETags on the client side, and you should in fact not even think about it. ETags work like version numbers in optimistic locking scenarios, and changing the ETag basically means "Just overwrite what's on the server". This is a no-go in serious scenarios.
An acceptable workaround would be the following:
Make sure the server returns verbose error messages so that the user can see what happened and what caused the conflict.
If that does not help, refresh the data. This will get you an updated ETag, and merge the local changes into the "current" server state, but only locally. "Merging" really means that local changes always overwrite remote changes.
The user now has another opportunity to review the data and can submit it again.
A Good Solution
Better is not necessarily good, so here is what you should really do: never let concurrent modification happen, because it is really expensive to handle. This implies that it is not the developer who should address this issue; the business needs to change the process.
The right question to ask is, "When you replicate data in a distributed system, why do you allow it to be modified concurrently at all?" Typically stakeholders will not like this kind of question, and the appropriate reaction is to work out a conflict resolution process together with them. Only then will they realize how expensive fixing that kind of desynchronization is, and more often than not they will see that adjusting the process is way cheaper than insisting on yet another back-office process to fix the issues it causes. Even if they insist that there is a need for this concurrent modification, they will now understand that it is not your task to sort this out and that they need to invest in a conflict resolution process.
TL;DR
There is no way to compare the client state with the current server state on the client, but you can do a refresh to retain the local changes and get an updated ETag. The real solution, however, is to rework the business process, because this is no longer a purely technical issue.
The default solution is that SMP or HCPms detects errors via ETags. On the client side there is no API to manipulate ETags in case of conflicts. A potential solution to implement a kind of diff view on the device would work like this:
Show the errors
Cache the errors (maybe only in memory?)
Delete the errors
Do a refresh of the database
Build a diff view with the current data and the cached errors
The idea with
sap.OData.removeHttpClient();
sap.OData.applyHttpClient();
could also work but could be very tricky and may introduce side effects.
Maybe some requests are triggered against the "wrong" backend.

How to handle split-brain?

I have read in the Orleans FAQ when split-brain can happen, but I don't understand what bad things can happen and how to handle it properly.
FAQ says something vague like:
You just need to consider the rare possibility of having two instances of an actor while writing your application.
But how exactly should I take this into account, and what can happen if I don't?
Orleans Paper (http://research.microsoft.com/pubs/210931/Orleans-MSR-TR-2014-41.pdf) says this:
application can rely on external persistent storage to provide stronger data consistency
But I don't understand what this means.
Suppose split-brain happened. Now I have two instances of one grain. When I send a few messages, they could be received by these two different instances (or can there be even more?). Suppose each instance had the same state prior to receiving these messages. Now, after processing them, they have different states.
How should they persist their states? There could be a conflict.
When the other instances are destroyed and only one remains, what will happen to the states of the destroyed instances? Will it be as if the messages they processed were never processed? Then the client state and server state could be desynchronized, IIUC.
I see this (split-brain) as a big problem, and I don't understand why there is so little attention paid to it.
Orleans leverages the consistency guarantees of the storage provider. When you call this.WriteStateAsync() from a grain, the storage provider ensures that the grain has seen all previous writes. If it has not, an exception is thrown. You can catch that exception and either call DeactivateOnIdle() and rethrow, or call ReadStateAsync() and retry. So if you have two activations of a grain during a split-brain scenario, whichever one calls WriteStateAsync() first prevents the other from writing state without first having read the most up-to-date state.
Update: Starting in Orleans v1.5.0, a grain which allows an InconsistentStateException to be thrown back to the caller will automatically be deactivated when the currently executing calls complete. A grain can catch and handle the exception to avoid automatic deactivation.
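A sketch of the pattern described above (grain, interface, and state names are invented; the InconsistentStateException, DeactivateOnIdle(), and ReadStateAsync() calls are the Orleans API):

```csharp
using System.Threading.Tasks;
using Orleans;
using Orleans.Storage;

public class CounterState
{
    public int Value { get; set; }
}

public interface ICounterGrain : IGrainWithIntegerKey
{
    Task Increment();
}

public class CounterGrain : Grain<CounterState>, ICounterGrain
{
    public async Task Increment()
    {
        State.Value++;
        try
        {
            await WriteStateAsync();
        }
        catch (InconsistentStateException)
        {
            // Another activation (e.g. a split-brain twin) wrote first;
            // our in-memory state is now stale.
            // Option 1: give this activation up and let the caller retry.
            DeactivateOnIdle();
            throw;
            // Option 2 (alternative): await ReadStateAsync() to load the
            // latest persisted state, re-apply the change, and retry the write.
        }
    }
}
```

Either way, the stale activation never silently overwrites the winning write, which is what "relying on external persistent storage for stronger consistency" means in the paper.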

How to handle this concurrency scenario with NHibernate + asp.net mvc?

The context: a web application written in ASP.NET MVC + NHibernate. This is a card game where players play at the same time, so their actions might modify the same field of an entity at the same time (they all do x = x + 1 on a field). It seems pretty classical, but I don't know how to handle it.
Needless to say, I can't present a popup to the user saying "The entity has been modified by another player. Merge or cancel?". When you consider that this is tied to playing a card, I can't interfere like that; my application has to handle this scenario internally. Since the field is in an entity class and each session has its own instance of the entity, I can't simply take a CLR lock. Does this mean I should use pessimistic concurrency, so that each web request acting on this entity is queued until a player finishes his action? In practical terms, does it mean that each PlayCard request should use a lock?
Please, don't send me to NH doc about concurrency or alike. I'm after the technique that should be used in this case, not how to implement it in NH.
Thanks
It may make sense, depending on your business logic, to try second-level caching. This may be a good fit depending on the length of the game and how it is played. Since the second-level cache exists at the session factory level, the session factory will have to be managed according to the lifetime of the game. An NH session can be created per request, but being spawned by a session factory configured for second-level caching means data of interest is cached across all sessions. The advantage of using the second-level cache is that you can configure it on a class-by-class basis, caching only the entities you require. It also provides a variety of concurrency strategies depending on the cache provider. Even though this may shift the concurrency issue from the DB level to the NH session, it may give you a better option for dealing with your situation. There are gotchas to using it, but its suitability all depends on your business logic.
You can try to apply optimistic locking in this way:
The DB entity will have a column tracking the entity version (nhibernate.info link).
If you get a "stale version" exception while saving an entity (= it was modified by another user), reload the entity and try again. Then send the updated value to the client.
As I understand it, your back end receives a request from the client, then opens a session, makes some changes, and updates entities before closing the session. In this case no thread will hold an entity in memory for too long, and optimistic locking conflicts shouldn't happen too often.
This way you can avoid having many locked threads waiting for an operation to complete.
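A sketch of that reload-and-retry loop with NHibernate (entity and property names are invented; it assumes the entity mapping declares a version column so NHibernate throws StaleObjectStateException on a conflicting commit):

```csharp
using System;
using NHibernate;

// Assumed entity whose mapping includes a <version> property.
public class GameEntity
{
    public virtual int Id { get; protected set; }
    public virtual int Version { get; protected set; } // managed by NHibernate
    public virtual int Counter { get; set; }
}

public static class GameUpdater
{
    public static void IncrementCounter(ISessionFactory factory, int gameId, int maxRetries = 3)
    {
        for (int attempt = 0; attempt < maxRetries; attempt++)
        {
            using (var session = factory.OpenSession())
            using (var tx = session.BeginTransaction())
            {
                try
                {
                    var game = session.Get<GameEntity>(gameId); // fresh load each attempt
                    game.Counter++;                             // the "x = x + 1" action
                    tx.Commit();                                // version checked here
                    return;
                }
                catch (StaleObjectStateException)
                {
                    tx.Rollback(); // another player committed first; loop reloads and retries
                }
            }
        }
        throw new InvalidOperationException("Update failed after retries.");
    }
}
```

Because the increment is re-applied to a freshly loaded entity on each retry, concurrent "x = x + 1" actions from different players accumulate instead of overwriting each other, with no popup shown to the user.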
On the other hand, if you expect retries to happen too often, you can try SELECT FOR UPDATE locking when loading your entity (using LockMode.Upgrade in the NH Get method), although I found a thread that discourages using this with SQL Server: SO link.
In general, the solution depends on the logic of the game and whether you can resolve concurrency conflicts in your code without showing messages to users. I'd also make the UI update itself with the latest data often enough to avoid players acting on an obsolete game situation and then being surprised by the outcome.

Session Management in TWebModule

I am using a TWebModule with Apache. If I understand correctly, Apache will spawn another instance of my TWebModule object if all previously created objects are busy processing requests. Is this correct?
I have created my own session object and a TStringList to store the sessions. The StringList is created in the initialization section at the bottom of the source file holding the TWebModule object. I am finding that initialization can be called multiple times (presumably when Apache has to spawn another process).
Is there a way I could have a global "Sessions" TStringList to hold all of my session objects? Or is the "safe", proper method to store session information in a database and retrieve it based on a cookie for each request?
The reason I want this is to cut down on database access and instead hold session information in memory.
Thanks.
As Stijn suggested, using separate storage to hold the session data really is the best way to go. Even better is to write your application so that the state lives in the web browser by design. This will greatly increase your ability to scale the application to thousands or tens of thousands of concurrent users with much less hardware.
Intraweb is a great option, but it suffers from the scale issue in the sense that more concurrent users, even idle ones, require more hardware to support them. It is far better to design, from the outset, a server that runs as internally stateless as possible. Of course, if you have a fixed number of users and don't expect any growth, then this is less of an issue.
That's odd. If initialization sections get called more than once, it might be because the DLL is loaded in separate process spaces. One option I can think of is to check whether the "Sessions" object already exists when you create it in initialization. If the DLL really is loaded into separate processes, this will not help, and then I suggest writing a central session-storage process and using inter-process communication from within your TWebModule (there are a few methods: messages, named pipes, COM...).
Intraweb in application mode really handles session management and database access very smoothly, and scales well. I've commented on it previously. While this doesn't directly answer the question you asked, when I faced the same issues Intraweb solved them for me.
