db4o Client See Changes from Another Client

I'm running a db4o server with multiple clients accessing it. I just ran into the issue of one client not seeing the changes from another client. From my research on the web, it looks like there are basically two ways to solve it.
1: Call Refresh() on the object (from http://www.gamlor.info/wordpress/2009/11/db4o-client-server-and-concurrency/):
const int activationDepth = 4;
client2.Ext().Refresh(objFromClient2, activationDepth);
2: Instead of caching the IObjectContainer, open a new IObjectContainer for every DB request.
Is that right?
Granted, #1 is more efficient, but is it really realistic to have to specify which objects to refresh? I mean, when a DB is involved, every time a client accesses it, it should get the latest information. That's why I'm leaning towards #2. Plus, I don't have major efficiency concerns.
So, am I right that those are the two approaches? Or is there another?
And, wait a sec... what happens when your object goes out of scope? On a timer, I call a method that gets an object from the DB server. That method instantiates the object locally, and once it returns, the object goes out of scope, so it isn't around to refresh. And when I call the DB again, I still don't see the changes from the other client. In this case, it seems like the only option is to open a new IObjectContainer. No?
** Edit **
I thought I'd post some code using the solution I finally decided to use. Since there were some serious complexities with using a new IObjectContainer for every call, I'm simply going to do a Refresh() in every method that accesses the DB (see Refresh() line below). Since I've encapsulated my DB access into logic classes, I can make sure to do the Refresh() there, every time. I just tested this and it seems to be working.
Note: The Database variable below is the db4o IObjectContainer.
public static ApplicationServer GetByName(string serverName)
{
    ApplicationServer appServer = (from ApplicationServer server in Database
                                   where server.Name.ToUpperInvariant() == serverName.ToUpperInvariant()
                                   select server).FirstOrDefault();

    // Refresh so this container sees changes committed by other clients.
    if (appServer != null)
    {
        Database.Ext().Refresh(appServer, 10);
    }

    return appServer;
}

1) As you said, the major problem with this is that you usually don't really know which objects to refresh.
You can use the committed event to refresh objects as soon as any client has committed; db4o will distribute that event to the other clients. Note that this also costs some network traffic and time to deliver the events, so there is still a window during which your objects have stale state.
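For illustration, here is a minimal sketch of that event-based refresh using the db4o .NET events API; the activation depth of 2 and the null handling are my own assumptions:

using Db4objects.Db4o;
using Db4objects.Db4o.Events;
using Db4objects.Db4o.Ext;

// Register once per client container; db4o also raises Committed for commits made by other clients.
IEventRegistry events = EventRegistryFactory.ForObjectContainer(client);
events.Committed += (sender, args) =>
{
    foreach (IObjectInfo info in args.Updated)
    {
        object updatedObject = info.GetObject();
        // GetObject() can return null if this client never loaded the object locally.
        if (updatedObject != null)
        {
            client.Ext().Refresh(updatedObject, 2);
        }
    }
};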
2) It is actually the cleanest method, but not for every DB request. Use an object container per logical unit of work: any operation which forms one 'atomic' unit of work in your business operations.
Anyway, in general: db4o was never built with the client/server scenario as its first priority, and it shows in concurrent scenarios. You cannot avoid working with stale (and even inconsistent) object state, and there are no concurrency control options (except the low-level semaphores).
My recommendation: use a client container per unit of work. Be aware that even then you might get stale data, which can then lead to an inconsistent view & update. When there are rarely any contentions & races in your application scenario and you can tolerate a mistake once in a while, this is fine. However, if you really need to ensure correctness, then I recommend using a database which has a better concurrency story =(
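To make the 'per unit of work' idea concrete, here is a rough sketch against the question's ApplicationServer class; the host, port and credentials are placeholders, and the db4o LINQ provider is assumed to come from Db4objects.Db4o.Linq:

using System.Linq;
using Db4objects.Db4o;
using Db4objects.Db4o.CS;
using Db4objects.Db4o.Linq;

public static void RenameServer(string serverName, string newName)
{
    // One fresh client container per unit of work: open, do the work, commit, close.
    using (IObjectContainer db = Db4oClientServer.OpenClient("localhost", 8732, "user", "password"))
    {
        ApplicationServer server = (from ApplicationServer s in db
                                    where s.Name.ToUpperInvariant() == serverName.ToUpperInvariant()
                                    select s).FirstOrDefault();
        if (server != null)
        {
            server.Name = newName;
            db.Store(server);
        }
        db.Commit();
    } // Closing the container means the next unit of work sees the latest committed state.
}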

Related

Syncing of memory and database objects upon changes in objects in memory

I am currently implementing a web application in .NET Core (C#) using Entity Framework. While working on the project, I encountered quite a few challenges, but I will start with the ones I think are most important. My questions are as follows:
Instead of frequently loading data from the database, I keep a set of static objects that mirror the data in the database. However, it is tedious and error-prone to ensure that any changes, i.e., adding/deleting/modifying objects, are saved to the database in real time. Is there any good example or advice I can refer to in order to improve my approach?
Another thing is that the values of some objects' properties change on the fly according to the values of other objects' properties, something like a spreadsheet where a cell's value changes automatically when the cell its formula refers to changes. I do not have a solution for this yet and would appreciate any example I can refer to. This will also add another layer of complexity to syncing the in-memory objects back to the database.
At the moment, I am unsure if there is any better approach. Appreciate if anyone can help. Thanks!
Basically, you're facing a problem that's called eventual consistency. Something changes, and two or more systems need to become aware of it at the same time. The problem here is that both changes need to be applied in order to consider the operation successful. If either one fails, you need to know.
In your case, I would use the Azure Service Bus. You can create queues and put messages on a queue. An Azure Function would handle these queue messages. You would create two queues, one for database updates and one for the in-memory update (I think changing this to a cache service may be something to think of). Now the advantage of these queues is that you can easily drop messages on them from anywhere. Because you mentioned the object is going to evolve, you may need to update these objects either in the database or in memory (cache).
Once you've done that, I'd create a topic, with two subscriptions. One forwarding messages to Queue 1, and the other to Queue 2. This will solve your primary problem. In case an object changes, just send it to the topic. Both changes (database and memory) will be executed automagically.
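For instance, a minimal sketch of publishing a change to such a topic with the Azure.Messaging.ServiceBus SDK; the topic name, subscription layout and payload are made up:

using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static async Task PublishChangeAsync(object changedEntity)
{
    // Hypothetical "object-changed" topic with two subscriptions:
    // one feeds the database writer, the other feeds the in-memory/cache updater.
    await using var client = new ServiceBusClient("<service-bus-connection-string>");
    ServiceBusSender sender = client.CreateSender("object-changed");
    await sender.SendMessageAsync(new ServiceBusMessage(JsonSerializer.Serialize(changedEntity)));
}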
The only problem you have now is that you mentioned you wanted to update the database in real time. With this scenario, you're going to have to give that up.
Also, you need to make sure you have proper alerts in place for the queues, so that in case you miss a message, or your functions don't handle it well enough, you'll receive an alert and can check and correct the errors.
I totally agree with #nineedm's answer, but there are also other solutions.
If you introduce a cache, you will always face the cache invalidation problem - you have to mark the cache as invalid when the data changes. Sometimes this is easy, depending on the nature of the cached data and how often it changes.
If you have just a single application, MemoryCache can be enough, with properly specified expiration options.
If there is a cluster, you have to look at distributed cache solutions, for example Redis. There is an MS article about that: Distributed caching in ASP.NET Core.
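For the single-application case, here is a minimal sketch using ASP.NET Core's IMemoryCache with an absolute expiration; the key format, the five-minute lifetime and the Customer type are purely illustrative:

using System;
using Microsoft.Extensions.Caching.Memory;

public record Customer(int Id, string Name);   // stand-in for whatever entity is mirrored

public class CustomerReader
{
    private readonly IMemoryCache _cache;

    public CustomerReader(IMemoryCache cache) => _cache = cache;   // registered via services.AddMemoryCache()

    public Customer GetCustomer(int id, Func<int, Customer> loadFromDatabase)
    {
        return _cache.GetOrCreate($"customer:{id}", entry =>
        {
            // The entry is evicted after 5 minutes, which bounds how stale the in-memory copy can get.
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
            return loadFromDatabase(id);
        });
    }
}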

breeze memory management - pattern / practice?

I have an old SL4/ria app, which I am looking to replace with breeze. I have a question about memory use and caching. My app loads lists of Jobs (a typical user would have access to about 1,000 of these jobs). Additionally, there are quite a few lookup entity types. I want to make sure these are cached well client-side, but updated per session. When a user opens a job, it loads many more related entities (anywhere from 200 - 800 additional entities) which compose multiple matrix-style views for the jobs. A user can view the list of jobs, or navigate to view 1 job at a time.
I feel that I should be concerned with memory management, especially not knowing how browsers might deal with this. Originally I felt this should all be one EntityManager and I would detachEntities when the user navigates away from a job, but I'm thinking this might benefit from multiple managers grouped by intended lifetime. Or perhaps I should create a new dataservice & EntityManager each time the user navigates to a new hash '/#/' area, since the comments on clear() seem to indicate that this would be faster? If I did this, I suppose I would be using pub/sub to notify other viewmodels of changes to entities? This seems complex and defeats some of the benefits of Breeze as the shared context.
Any tips or thoughts about this would be greatly appreciated.
I think I understand the question. I think I would use a multi-manager approach:
Lookups Manager - holds once-per session reference (lookup) entities
JobsView Manager - "readonly" list of Jobs in support of the JobsView
JobEditor Manager - One per edit session.
The Lookups Manager maintains the canonical copy of reference entities. You can fill it once with a single call to server (see docs for how). This Lookups Manager will Breeze-export these reference entities to other managers which Breeze-import them as they are created. I am assuming that, while numerous and diverse, the total memory footprint of reference entities is pretty low ... low enough that you can afford to have more than one copy in multiple managers. There are more complicated solutions if that is NOT so. But let that be for now.
The JobsView Manager has the necessary reference entities for its display. If you only displayed a projection of the Jobs, it would not have Jobs in cache. You might have an array and key map instead. Let's keep it simple and assume that it has all the Jobs but not their related entities.
You never save changes with this manager! When editing or creating a Job, your app always fires up a "Job Editor" view with its own VM and JobEditor Manager. Again, you import the reference entities you need and, when editing an existing Job, you import the Job too.
I would take this approach anyway ... not just because of memory concerns. I like isolating my edit sessions into sandboxes. Eases cancellation. Gives me a clean way to store pending changes in browser storage so that the user won't lose his/her work if the app/browser goes down. Opens the door to editing several Jobs at the same time ... without worrying about mutually dependent entities-with-changes. It's a proven pattern that we've used forever in SL apps and should apply as well in JS apps.
When a Job edit succeeds, you have to tell the local client world about it. Lots of ways to do that. If the ONLY place that needs to know is the JobsView, you can hardcode a backchannel into the app. If you want to be more clever, you can have a central singleton service that raises events specifically about Job saving. The JobsView and each new JobEditor communicate with this service. And if you want to be hip, you use an in-process "Event Aggregator" (your pub/sub) for this purpose. I'd probably be using Durandal for this app anyway, and it has an event aggregator in the box.
Honestly, it's not that complicated to use and importing/exporting entities among managers is a ... ahem ... breeze. Well worth it compared to refreshing the Jobs List every time you return to it (although you'll want a "refresh button" too because OTHER users could be adding/changing those Jobs). You retain plenty of Breeze benefits: querying, validation, change-tracking, batch saves, entity navigation (those reference lists work "for free" in Breeze).
As a refinement, I don't know that I would automatically destroy the JobEditor view/viewmodel/manager when I returned to the JobsView. In my experience, people often return to the same Job that they just left. I might hold on to a view so you could go back and forth quickly. But now I'm getting tricky.

Database resiliency

I'm designing an application that relies heavily on a database. I need my application to be resilient to short losses of connectivity to the database (the network going down for a few seconds, for example). What are the usual patterns people use for this kind of problem? Is there something I can do in the database access layer to gracefully handle a small glitch in the network connection to the DB? (I'm using Hibernate + Oracle JDBC + a DBCP pool.)
I'll assume you have hidden every database access behind a DAO or something similar.
Now create wrappers around these DAOs that try to call them and, in case of an exception, wait a second and retry. Of course this will cause the application to 'hang' during a DB outage, but it will come back to life when the database becomes available.
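The question is Hibernate/Java, but the wrapper idea is language-agnostic; a minimal sketch of such a retrying decorator (shown in C# to match the other snippets on this page, with made-up names) could be:

using System;
using System.Threading;

public static class ResilientDao
{
    // Wraps any DAO call; retries on failure, sleeping between attempts.
    public static T Execute<T>(Func<T> daoCall, int maxAttempts = 5, int delayMs = 1000)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return daoCall();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Ideally catch only connectivity-related exceptions here.
                // The caller blocks ("hangs") for the duration of the outage.
                Thread.Sleep(delayMs);
            }
        }
    }
}

// Usage: var order = ResilientDao.Execute(() => orderDao.FindById(42));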
If this is not acceptable, you'll have to move the cut closer to the UI layer. Consider the following approach:
1. The user causes a request.
2. Wrap all the request information in a message and put it in the queue.
3. Return to the user, telling him that his request will get processed in a short time.
4. A worker registered on the queue will process the request, retrying when database problems exist.
Note that you are now deep in concurrency land. So you must handle things like requests referencing an entity which already got deleted.
Read up on 'eventual consistency'
Since you are using hibernate, you'll have to deal with lazy loading. An interruption in connectivity will kill your session, so for you it might be best not to use lazy loading at all, but work with detached objects.

LINQ to SQL throws exceptions when stress tested

I have this web app that is running ASP .NET MVC 1.0 with LINQ 2 SQL
I'm noticing a very strange problem with LINQ 2 SQL throwing exceptions (mainly Specified cast invalid or Sequence contains more than one element) when under a certain amount of load.
The bigger problem is, I'm not talking about real heavy/professional stress testing... Basically what I'm doing is opening Firefox and Chrome and holding down F5 for ten seconds in each (I call this poor man's stress testing) - lo and behold, the web app throws these exceptions randomly for the next two to five minutes. If I restart the app from IIS7 (or restart WebDev if under Visual Studio) then immediately all is back to normal. Like nothing happened.
At first I suspected the way I handle the DataContext; maybe I'm supposed to dispose it at every Application_End in Global.asax, but that didn't change anything.
Right now I have a single public static DataContext object used by all requests. I'm not disposing it or re-creating it. Is that the right way to do it? Am I supposed to dispose it? When exactly should I dispose it?
There are several things that happen on every request - for example, in every page, the User object (for the current user) is loaded from the database and "LastSeen" attribute is updated to DateTime.Now. Other things (like Tag Cloud for example) are cached.
Any ideas why this is happening?
The DataContext class is not thread-safe - you need to create a new one for each operation. See this article by Rick Strahl (Linq to SQL DataContext Lifetime Management).
You should dispose the DataContext after every bunch of queries, and use it like this:
using (MyDataContext dc = new MyDataContext())
{
    var x = dc.Table.Single(a => a.Id == 3);
    // do some more related stuff, but make sure your connection won't be open too long
}
Don't have one static DataContext used by every request. Would you use the same Connection object in plain ADO.NET for every request?!
See also http://blog.codeville.net/2007/11/29/linq-to-sql-the-multi-tier-story/
- DataContext isn't thread-safe, as far as I know
- You lose isolation and cannot control when SubmitChanges() is called - concurrent requests will interfere with one another
- Memory leaks are pretty likely
Create a DC for every operation like Rob suggests, OR use an IoC container to have a DC shared per request.
DO NOT DISPOSE DCs - they are designed to be lightweight. Disposing is not only unnecessary, it can encourage bad practices and maybe other threading issues down the road that are even harder to track down.
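A container-free sketch of the per-request variant (MyDataContext is the question's context; stashing it in HttpContext.Items is one common way to get one DC per request without an IoC container, and the key name here is arbitrary):

using System.Web;

public static class RequestDataContext
{
    private const string Key = "__requestDataContext";

    // One DataContext per HTTP request: every call within the same request reuses it,
    // and a new request gets a fresh instance, so concurrent requests never share one.
    public static MyDataContext Current
    {
        get
        {
            var items = HttpContext.Current.Items;
            if (items[Key] == null)
            {
                items[Key] = new MyDataContext();
            }
            return (MyDataContext)items[Key];
        }
    }
}

In line with the advice above, nothing disposes the context explicitly; it simply goes out of reach when the request ends.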

Storing Data In Memory: Session vs Cache vs Static

A bit of backstory: I am working on a web application that requires quite a bit of time to prep/crunch data before giving it to the user to edit/manipulate. The data request takes ~15-20 secs to complete and a couple of seconds to process. Once there, the user can manipulate values on the fly. Any manipulation of values will require the data to be reprocessed completely.
Update: To avoid confusion, I am only making the data call 1 time (the 15 sec hit) and then wanting to keep the results in memory so that I will not have to call it again until the user is 100% done working with it. So, the first pull will take a while, but, using Ajax, I am going to hit the in-memory data to constantly update and keep the response time to around 2 secs or so (I hope).
In order to make this efficient, I am moving the initial data into memory and using Ajax calls back to the server so that I can reduce the processing time needed to handle the recalculation that occurs with this user's updates.
Here is my question: with performance in mind, what would be the best way to store this data, assuming that only one user will be working with this data at any given moment?
Also, the user could potentially be working in this process for a few hours. When the user is working w/ the data, I will need some kind of failsafe to save the user's current data (either in a db or in a serialized binary file) should their session be interrupted in some way. In other words, I will need a solution that has an appropriate hook to allow me to dump out the memory object's data in the case that the user gets disconnected / distracted for too long.
So far, here are my musings:
Session State - Pros: Locked to one user. Has the Session End event, which will meet my failsafe requirements. Cons: Slowest perf of my current options. It is sometimes tricky to ensure the Session End event fires properly.
Caching - Pros: Good Perf. Has access to dependencies which could be a bonus later down the line but not really useful in current scope. Cons: No easy failsafe step other than a write based on time intervals. Global in scope - will have to ensure that users do not collide w/ each other's work.
Static - Pros: Best perf. Easiest to maintain, as I can directly leverage my current class structures. Cons: No easy failsafe step other than a write based on time intervals. Global in scope - will have to ensure that users do not collide w/ each other's work.
Does anyone have any suggestions / comments on which option I should choose?
Thanks!
Update: Forgot to mention, I am using VB.Net, Asp.Net, and Sql Server 2005 to perform this task.
I'll vote for secret option #4: use the database for this. If you're talking about a 20+ second turnaround time on the data, you are not going to gain anything by trying to do this in-memory, given the limitations of the options you presented. You might as well set this up in the database (give it a table of its own, or even a separate database if the requirements are that large).
I'd go with the caching method for storing the data across page loads. You can name the cache key you store the data under to avoid conflicts.
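As a small illustration of keying the cache per user so sessions don't collide (shown in C# although the question is VB.NET; the WorkingSetCache and CrunchedData names and the expiration are stand-ins):

using System;
using System.Web;
using System.Web.Caching;

public class CrunchedData { /* the prepped dataset the user is editing */ }

public static class WorkingSetCache
{
    private static string KeyFor(string userName)
    {
        return "workingset:" + userName;   // per-user key avoids collisions between sessions
    }

    public static void Store(string userName, CrunchedData data)
    {
        // Absolute expiration well past a long edit session; no dependencies, no sliding window.
        HttpRuntime.Cache.Insert(KeyFor(userName), data, null,
            DateTime.UtcNow.AddHours(4), Cache.NoSlidingExpiration);
    }

    public static CrunchedData Retrieve(string userName)
    {
        return HttpRuntime.Cache.Get(KeyFor(userName)) as CrunchedData;
    }
}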
For tracking user-made changes, I'd go with a more old-school approach: append to a text file each time the user makes a change and then sweep that file at intervals to save changes back to DB. If you name the files based on the user/account or some other session-unique indicator then there's no issue with conflict and the app (or some other support app, which might be a better idea in general) can sweep through all such files and update the DB even if the session is over.
The first part of this can be adjusted to stagger the write-out more: save changes to Session, then write that to the file at intervals, then sweep the file at larger intervals. You can tune it for performance and choose what level of potential user-change loss is acceptable.
Use the Session, but don't rely on it.
Simply, let the user "name" the dataset, and make a point of actively persisting it for the user, either automatically, or through something as simple as a "save" button.
You cannot rely on the session simply because it is (typically) tied to the user's browser instance. If they accidentally close the browser (click the X button, their PC crashes, etc.), then they lose all of their work. Which would be nasty.
Once the user has that kind of control over the "persistent" state of the data, you can rely on the Session to keep it in memory and leverage that as a cache.
I think you've pretty much just answered your question with the pros/cons. But if you are looking for some peer validation, my vote is for the Session. Although the performance is slower (do you know by how much slower?), your processing is going to take a long time regardless. Do you think the user will know the difference between 15 seconds and 17 seconds? Both are "forever" in web terms, so go with the one that seems easiest to implement.
Perhaps a bit off topic, but I'd recommend putting those long processing calls in asynchronous pages (not to be confused with AJAX's asynchronous calls).
Take a look at this article and ping me back if it doesn't make sense.
http://msdn.microsoft.com/en-us/magazine/cc163725.aspx
I suggest to create a copy of the data in a new database table (let's call it EDIT) as you send the initial results to the user. If performance is an issue, do this in a background thread.
As the user edits the data, update the table (also in a background thread if performance becomes an issue). If you have to use threads, you must make sure that the first thread is finished before you start updating the rows.
This allows a user to walk away, come back, even restart the browser and commit whenever she feels satisfied with the result.
One possible alternative to what the others mentioned, is to store the data on the client.
Assuming the dataset is not too large and the code that manipulates it can run client-side, you could store the data as an XML data island or a JSON object. This data could then be manipulated/processed and handled entirely client-side with no round trips to the server. If you need to persist this data back to the server, the end result could be posted via an AJAX call or a standard postback.
If this does not work with your requirements I'd go with just storing it on the SQL server as the other comment suggested.
