Caching user data to avoid excess database trips - asp.net-mvc

After creating a proof of concept for an ASP.NET MVC site and making sure the appropriate separation of concerns was in place, I noticed that I was making a lot of expensive redundant database calls for information about the current user.
Being historically a desktop and services person, my first thought was to cache the db results in some statics. It didn't take much searching to see that doing this would persist the current user's data across the whole AppDomain for all users.
Next I thought of using HttpContext.Current. However, if you put stuff here when a user is logged out, then when they log in your cached data will be out of date. I could update this every time login/logout occurs but I can't tell if this feels right. In the absence of other ideas, this is where I'm leaning.
What is a lightweight way to accurately cache user details and avoid having to make tons of database calls?

If the information you want to cache is per-user and only while they are active, then Session is the right place.
http://msdn.microsoft.com/en-us/library/system.web.sessionstate.httpsessionstate.aspx

What you're looking for is System.Web.Caching.Cache
http://msdn.microsoft.com/en-us/library/system.web.caching.cache.aspx
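
To make the idea concrete, here is a minimal, language-neutral sketch in Python of a cache with per-user keys and an expiration time. It is not the System.Web.Caching.Cache API; the class name `UserCache`, the `user:<id>` key format and the `loader` callback are assumptions made for illustration, and the sketch is not thread-safe.

```python
import time
from typing import Any, Callable, Dict, Tuple


class UserCache:
    """In-memory cache with one entry per user and absolute expiration.

    Hypothetical stand-in for an application-wide cache; not thread-safe.
    """

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, user_id: str, loader: Callable[[str], Any]) -> Any:
        """Return the cached value, calling loader only on a miss or expiry."""
        key = f"user:{user_id}"  # per-user key avoids sharing data across users
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(user_id)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, user_id: str) -> None:
        """Drop the entry, e.g. on login, logout, or a profile update."""
        self._entries.pop(f"user:{user_id}", None)
```

Calling `invalidate` at login and logout addresses the stale-data concern raised in the question, while the expiration bounds how long an entry can live if an invalidation is missed.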

ASP.NET session-state management is good for some situations, but under heavy load it tends to create performance bottlenecks. Read more about it here:
http://msdn.microsoft.com/en-us/magazine/dd942840.aspx
http://esj.com/articles/2009/03/17/optimize-scalability-asp-net.aspx
The solution to avoid these bottlenecks is distributed caching. There are many free distributed caching solutions on the market, such as Memcached or NCache Express.
I don't know much about Memcached, but I've used NCache Express by Alachisoft; it lets you use ASP.NET caching without requiring any code change.

Related

Cache solution for a news feed, based on objective information?

I need some suggestions of what works well for caching an updatable news feed.
Please, no "fanboy" answers. I'm not looking for subjective opinions on which system is "best", just suggestions of technologies that fit the requirements below. So please share what you have used in the real world, even if you prefer some other solution.
I have a rails based news feed (Neo4j database), and while performance is good, I would like to cache it so that servers don't get bogged down serving live feeds.
REQUIREMENTS:
EASY FRAGMENT UPDATES: I'd like to easily update parts of a user's news feed in the cache based upon specific triggers. For example, when a user edits their status update, I don't want to regenerate the user's entire news feed in the cache; I just want to update that one "fragment", or section if you will, of that user's feed. And I don't want to jump through hoops to do so.
DELETION: If someone deletes an activity, I just want to remove that activity from their news feed before the system eventually refreshes the entire feed for that user.
EASY RETRIEVAL: I'd like to retrieve the cache in such a way that the rails controller/models can easily read them and hand them off to views without any modification of the views.
PERSISTENCE: If I need to reboot the cache, it should reload the cache from disk, which means it should save cached entries to disk.
SPEED: Given that it must be able to update fragments of cached news feeds, there is going to be a performance hit of some sort. But I need speed.
What cache technologies provide such capabilities? Will Redis, MongoDB, or Memcached fit these requirements? What other options are there (CouchDB, Tokyo Cabinet, etc.)?
In the spirit of Stack Overflow, I'm not asking for subjective opinions on what you like better and why; I'm just asking for possible candidate systems that you may have actually used in production to accomplish caching and updating a cached news feed (or anything similar).
Since it is mainly an opinion-based topic, this answer will be subjective. But I will try anyway to remain factual.
The first point to notice is that your requirements tend to be mutually exclusive. As we say in France, you want the butter, the money for the butter, and the wife of the farmer (ok, this is probably a lousy translation).
For example, to support easy fragment updates and proper deletion, you will need some kind of data structures in the cache. I have zero knowledge about Rails, but I guess it will have an impact on the data access patterns and the definitions of controllers/models. In other words, it will add complexity to data retrieval. You need speed, but at the same time you also require persistency and non-trivial data access patterns. Well, you cannot get everything at the same time; you will have to make choices and prioritize these requirements.
My second point is that a cache is only useful when there is a significant difference in terms of performance between the cache and the underlying storage engine. Since you already use a NoSQL engine which is rather efficient (Neo4j), you need to consider only engines which are truly designed for raw performance (i.e. low-latency stores): memcached, redis, couchbase, aerospike, to name well-established open-source products. If you feel a bit more adventurous, you can also consider other projects like tarantool or hyperdex.
There are a number of commercial products as well, but I'm not sure they provide a Ruby client (TIBCO ActiveSpaces, Gigaspaces, Red-Hat Infinispan, etc ...)
Other NoSQL engines (MongoDB, Cassandra, CouchDB, etc ...) have other interesting properties, but they will not beat these solutions at raw performance for a mixed r/w workload. Here, I'm only talking about raw performance (i.e. low latency at high throughput), not scalability.
Actually, memcached can be excluded because it does not support persistency. I would say you can probably implement what you want with Redis, Couchbase or Aerospike, but Aerospike 3 does not yet seem to have an officially supported Ruby client.
Supporting multiple data access paths (i.e. consistent indexing data structures) will be easier with Redis and Aerospike than with Couchbase. High availability will be easier with Couchbase or Aerospike than with Redis. Implementing cache behavior will be easier with Redis and Couchbase than with Aerospike.
Some general advice:
make sure you really have a performance or a scalability issue with Neo4j before adding the complexity of an extra layer. Complexity is like toothpaste: once it is out of the tube, you cannot put it back.
data access patterns should be listed at design time, and must be backed by corresponding data structures in the chosen engine.
the hardware footprint must be considered as well. If you have only a couple of boxes, pick a lightweight solution like Redis.
with persistency, you also need to consider HA. What happens if the caching layer is lost? Actually, I would say that for a cache, HA may be more important than persistency.
Finally, you also need to define the exact cache semantics you want (update behavior, invalidation behavior, cache miss management, TTL policy if any, etc.). The 3 NoSQL engines I have listed provide some tools to help implement the various strategies, but none of them supports an off-the-shelf strategy. Implementing one will require some coding.
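
To illustrate why fragment updates and deletions call for data structures in the cache, here is a minimal sketch in Python that uses plain in-memory dictionaries as a stand-in for a real store. The class and method names are hypothetical. In an engine such as Redis, the same shape could plausibly be mapped to a hash (activity id to fragment) plus a list or sorted set (ordering) per user, but that mapping is an assumption, not something tested here.

```python
from typing import Dict, List, Optional, Tuple


class FeedCache:
    """Per-user feed kept as an ordered list of ids plus a map id -> fragment."""

    def __init__(self) -> None:
        self._order: Dict[str, List[str]] = {}
        self._fragments: Dict[str, Dict[str, str]] = {}

    def store_feed(self, user_id: str, activities: List[Tuple[str, str]]) -> None:
        """Replace the whole feed (the 'full refresh' case)."""
        self._order[user_id] = [activity_id for activity_id, _ in activities]
        self._fragments[user_id] = dict(activities)

    def update_fragment(self, user_id: str, activity_id: str, fragment: str) -> bool:
        """Overwrite one fragment without regenerating the rest of the feed."""
        fragments = self._fragments.get(user_id)
        if fragments is None or activity_id not in fragments:
            return False  # not cached: the caller decides whether to rebuild
        fragments[activity_id] = fragment
        return True

    def delete_activity(self, user_id: str, activity_id: str) -> None:
        """Remove one activity from both the map and the ordering."""
        fragments = self._fragments.get(user_id)
        if fragments is None or activity_id not in fragments:
            return
        del fragments[activity_id]
        self._order[user_id].remove(activity_id)

    def read_feed(self, user_id: str) -> Optional[List[str]]:
        """Return fragments in feed order, or None on a cache miss."""
        if user_id not in self._order:
            return None
        fragments = self._fragments[user_id]
        return [fragments[activity_id] for activity_id in self._order[user_id]]
```

The cost mentioned above is visible even in this toy version: every read has to join two structures, and every write has to keep them consistent.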

Is there a best practice and recommended alternative to Session variables in MVC

Okay, so first off, before anyone decides that this is a "duplicate" question: I have reviewed most of the posts on SO regarding similar questions, but even taking all of them together I am still in something of a dilemma, because there is no definitive, or maybe I should say unanimous, agreement on this.
I can however say that I have (based on the posts) conclusively determined that the answer is based on the scope of the requirement. But even with consideration of this, the opinions seem too diverse for me to make a decision as to how I should handle this.
My immediate requirement is that I need to persist variable data from 1 controller across many views. More specifically, I have a controller and corresponding view that handles shopping cart item counts and I would like to persist that data across multiple views. I am thinking that the _layout view is the most logical choice for this.
Now I have successfully accomplished this task by assigning the value to a Session variable which is retrieved from my _layout view; so even when the user navigates anywhere within the site, the number of items in the Shopping Cart will persist until they either leave the site or complete the checkout, in which case the variable will be cleared in code.
The posts I've read seemed biased either toward staying away from Session variables in favor of Cookies and storing data in a database, or toward stating that, for the purpose for which I propose to use them, Session variables are perfectly fine.
The other thing I've read suggests that Session variables can potentially impede overall performance if there is high traffic on the site since the information is stored on the server.
I personally cannot justify storing this type of information in a database and subsequently hitting the database as I'd imagine that this could also affect site performance and seems a bit overkill for storage of temporary data. TempData, ViewData and ViewBag do not work in persisting the data so they are not logical choices for the requirement IMO.
If there is another well suited alternative to the Session variable (which is working for me) I would like to know what it is.
Two posts that give seemingly contradictory recommendations leave me a bit confused.
Cons: Is it a good practice to avoid using Session State in ASP.NET MVC? If yes, why and how?
Pros: Still ok to use Session variables in ASP.NET mvc, or is there a better alternative for some things (like a cart)
It seems that this question (although presented in many different variations) has no definitive answer that I can find.
If there is a more preferable way to accomplish this without overkill, then that is the answer I'm in search of.
I read somewhere the use of MVC filters in tandem with the Global.asax application start section as well, but this does not seem appropriate for variables set at the controller level as much as perhaps, static variables.
Can someone maybe squash (for lack of a better word) the many diverse opinions on the topic and maybe provide a more definitive answer to the question? I'm sure the diverse opinions have their place and I'm not attempting to discredit them. But having a definitive and possibly unanimous answer would be better; then I could sort through the other posts to determine what is best for my application.
Of course, if this question has no definitive answer; just tell me that and I'll attempt to derive my own answer from the other posts.
Thanks
===========================================================
UPDATED RESPONSE TO ANSWERS PROVIDED
Caching and Cookies seem to be the general preference in the responses; however, I've also noted the statement that caching is not an ideal candidate for use across multiple web servers because synchronization can be a potential issue.
Giving credit to Tim, it's stated that Database storage is optimized and users have the option to return at a later time and continue where they left off.
That is an excellent point, but looking at the probabilities, it is reasonable to assume that some users will not return, leaving unnecessary data in the database.
So keeping the DB optimized and clean (which "to me" is of equal relevance) would require implementing a maintenance task to automatically expire those records after a set threshold of time. Although a maintenance task is not out of the question, I still think this adds a bit more work simply for the purpose of serving as temporary storage.
Nonetheless, I do respect Tim's recommendation and believe it deserves merit on countering my initial opinion to a degree; that a database would not seem to be a viable option for storing Temporary data; so I think the compromise would be to store the data in a database (given the scenario of a Shopping Cart or similar) perhaps after a checkout. This way as you previously stated, the data may be persistently tracked upon subsequent visits so you have a record of transactions. But more importantly, it would be data of those transactions having real relevance to persist to the database.
It was also stated that Session is faster than a database, though not without caveats, which can to some degree be mitigated by other mechanisms such as the SessionStateBehavior attribute, to give just one example.
BUT... I think Erik kind of drove the point home with the Dunning-Kruger Effect. Although, from the content and explanations of the answers given here, I seriously doubt the expertise of any of the individuals who have responded is in any way questionable. Nonetheless, I tend to agree that expecting a unanimous opinion may be somewhat unreasonable on my part.
What I was more specifically looking for was a general consensus on a technique that would comfortably accommodate a diverse number of scenarios. In other words, something that would accommodate not only my particular scenario but also scale to larger environments with potentially heavier traffic. This way a change in the programming would be either avoided altogether or minimal at most.
==================================================
Summary based on the feedback:
Session variables seem to accommodate smaller scenarios where applicable, but they have some potential for persistence concerns, among other notable drawbacks, as stated very thoroughly by Erik. So this option obviously will not fit a scalable model.
Caching is preferable to Session variables, but again not necessarily the "best" scalable option due to, among other things, the potential synchronization complexities in web server farm environments, as previously pointed out. But an option nonetheless.
Database storage is scalable, but for the purpose of temporary, volatile storage it is probably not the most elegant option from a database perspective, as it would require periodic cleanup. Having had a strong foundation in database concepts earlier in my career, I realize this is probably not something many developers will agree with. Using the database for this purpose may suffice from a web programmer's perspective; however, from the perspective of the DAL and DB development, this (to me) has the potential to mandate an additional DB task to keep the backend efficient.
Cookies seem to be a nice option having the combined "desirable" elements of Session variables and caching.
==================================================
CONCLUSION
Based on the answers, I think COOKIES and CACHING are generally well-rounded proposals for best practice across the board, and potentially the best candidates for scalability of those presented, in combination with database storage when continued persistence is required after the fact.
The ultimate choice between the 2 would seem to be based on the amount and type of data requiring storage (e.g. sensitive vs non-sensitive and whether or not there is any concern that the client may alter the data on their end); in addition to special considerations for COOKIES in the fact that they may be disabled by the clients.
Obviously, there is no one-size-fits-all solution, as clearly pointed out in the answers provided; but in terms of scalability, I may be wrong, but these seem to be the BEST choices available.
Because all the responses are good, I'm going to credit all the posts as useful and accept Erik's answer as a well-rounded, scalable overall solution. I wish I could select more than one accepted answer, as I believe Tim's response was also very well laid out and concise.
Gupta's response was good also, but I wanted more elaboration of the proposed answer and not a repeat of previous posts.
Thanks Guys!
You will never get unanimous opinion on anything in any large group of people. That's just human nature. Part of that stems from the Dunning-Kruger Effect, which states that the less someone knows about a subject, the more likely they are to overvalue their expertise in that subject. In other words, lots of people think they know something, but only because they don't know they don't know it. Part of it is simply that people have different experiences, and some have found no problems with session, while others have in various situations, or vice versa...
So, to back up your research, which suggests that the answer depends heavily on the requirements, we need to understand what your requirements are. If this is to be a high traffic site, with load balanced servers in a web farm, then stay as far away from session as you can. Sure, it's possible to share session in various ways in a server farm environment (session server, distributed cache server, etc.), but avoiding session will almost always be faster if you can help it.
If your site runs on a single server and is unlikely to ever grow beyond that, and your traffic is relatively low, then session may be a useful option. However, you should always be aware that session is unreliable storage, and can disappear on you at any time. If the app pool is recycled, session is gone. If an uncaught exception bubbles up to the worker process, the session may be gone. If IIS thinks there's not enough memory, your session may be gone, regardless of any timeout values configured. You also can't always get reliable notification that a session has ended, since terminated sessions do not fire the Session_End event.
Another issue is that access to Session is serialized. In other words, ASP.NET prevents more than one thread from writing to the session at a time, and it often does this by locking the session while a request is running, unless that request has opted out of writable session locking. This can cause severe problems in some cases, and merely poor performance in others. You can mitigate this by marking various methods with a read-only session attribute if you aren't going to be modifying the session in that method.
Ultimately, if you do choose to use session, then try to only use it for small, short lived things if at all possible, and if not possible then build in a way to "regenerate" the data if the session is lost. For instance, using your number of items in cart example, you could write a method that first checks to see if the value is there, and if not it goes out and loads it from the database. Always use this method to access the variable, rather than accessing it directly from session... this way, if the session is lost it will just reload it.
However, having said this... For the number of items in a cart, I would generally prefer to use a cookie, since cookies get passed to the page on every load anyway, and this is a small, discrete unit of data. I generally prefer Session for sensitive data that you want to prevent the user from being able to change; the number of items in the cart simply doesn't fit that rule.
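
The "regenerate if the session is lost" accessor described above can be sketched as follows. This is a language-neutral illustration in Python, not ASP.NET code: the session is modeled as a plain mapping, and the key name `CART_COUNT_KEY` and the `load_count_from_db` callback are hypothetical.

```python
from typing import Callable, MutableMapping

CART_COUNT_KEY = "cart_item_count"  # hypothetical session key


def get_cart_count(session: MutableMapping,
                   load_count_from_db: Callable[[], int]) -> int:
    """Return the cart count, reloading it if the session no longer has it.

    All callers go through this function instead of reading the session
    directly, so a lost session only costs one extra database call.
    """
    count = session.get(CART_COUNT_KEY)
    if count is None:
        count = load_count_from_db()
        session[CART_COUNT_KEY] = count
    return count
```

Code that changes the cart would either update the stored count or delete the key so that the next read reloads it.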
When
Databases are highly optimized. A simple value like a shopping cart count is a good candidate for caching by the database and (hopefully) cheap to compute outright. It may be a non-issue.
However, if you have ruled out other mechanisms, small, user-by-user values are viable candidates for session.
Cache is fine for site-wide values, or user-specific values with unique keys. However, synchronizing caches across multiple web servers can be difficult. Out of process session state will stay synchronized because it is stored in a single location (database or a state server).
Of course, there are many 3rd party caching alternatives with various options to keep them synchronized.
Regardless of where the count is temporarily stored, I'm of the opinion that shopping carts themselves should be stored in the database so that users have the option to return later and continue where they left off.
Performance
If you use out of process session state (e.g. in a load balanced environment and/or to make session more durable), it will hit a database or call an out of process service, but the call is relatively cheap unless you are serializing large object graphs.
Session is loaded once per request. Subsequent read access is very fast.
Writing to session can be detrimental to performance, even when there is no load. Why? Most modern applications use asynchronous calls, and when multiple async calls hit an HTTP handler (page, controller, etc.) that reads/writes session, ASP.Net will lock the session to serialize access. To avoid this, you can decorate your controllers with [SessionState( SessionStateBehavior.ReadOnly )]
Design
Now I have successfully accomplished this task by assigning the value
to a Session variable which is retrieved from my _layout view;
This seems like mixing concerns, i.e. having the view aware of the underlying storage mechanism. From a purist standpoint, I would set this value on a view model or at least put it in the ViewBag. From a practical standpoint, one or two values retrieved in this manner probably won't hurt anything, but beware of letting it grow much further.
I read somewhere the use of MVC filters in tandem with the Global.asax
application start section as well, but this does not seem appropriate
for variables set at the controller level as much as perhaps, static
variables.
Static variables have perfectly legitimate uses, but you must understand them thoroughly or risk serious problems.
See my answers pertaining to static variables in ASP.Net:
does aspx provide special treatment for c# static variables
Static fields vs Session variables
Session alternatives from a different perspective:
When you keep something in session, it breaks the primary rule in ASP.NET MVC. You can use the following options as alternatives to session.
Objects stored in ASP.NET (MVC) session are boxed and unboxed, which puts a little load on the server. Try these ideas:
Caching: a List or other large data is a better fit for caching than for session. You control when it expires, independently of the user session.
If your app depends on JSON/Ajax data, then you can use some of the client-side storage provided by HTML5 (like WebSQL or IndexedDB). It does not use cookies, so you can save some workload on the server.

what is the best way for caching frequently changing status information?

My project deals with a client/server structure where the clients periodically provide status information via a SOAP interface. Every request (1 per minute) contains a complex structure of status data.
Status information is used by many views, and instead of fetching the information from the database each time, I store the data in a synchronized list.
Are there better caching techniques in grails? Are synchronized lists a good solution?
This seems more like a generalized question, so I'll provide some generalized thoughts from my own experience.
Are there better caching techniques in grails? Are synchronized lists a good solution?
There may be several layers of cache depending on what you're dealing with. I don't believe bare-bones grails itself caches anything with regard to your question; however, there are configurable options and plugins that allow you to cache everything from queries, domain classes, service calls, page fragments, images, css and just about everything else. Not to mention that your database and other layers may have their own cache options.
Having said that, I would avoid using your own caching techniques unless you're dealing with a very specific issue where you know you can perform better than a more generic approach like a second level cache (i.e. EHCache).
If you do roll your own cache, you'll want to be aware of everything else that might be caching the same content as well. Caching a cached object from a cached query is a tough one to debug.
If performance is your concern, you should always do some benchmarking before you change anything. To truly get the best performance out of anything you'll need to understand how it works. Grails, hibernate and spring work together on performance, and this isn't anything I can put in a few sentences, but there are plugins that can help you understand what is going on behind the scenes, like JavaMelody.
Lastly, if you already built something that works and everyone's happy don't break it. :)
Probably a properly scoped service may help:
http://grails.org/doc/2.0.x/guide/services.html#scopedServices
Maybe a "session"-scoped service may be the thing you're looking for.
You may want to take a look at the built-in caching techniques: http://grails.org/doc/2.0.x/ref/Database%20Mapping/cache.html
A more detailed way is described here: http://grails.org/doc/2.0.x/guide/single.html#caching
Depending on what you want to cache, you may want to use Caching instances (to cache everything of that instance) or Caching Queries (where you only cache the result of one query)
As you can see in the second link, the config lets you use EhCache as cache manager.

Is Using Db4o For Web Sites a judicious choice?

Is using Db4o as a backend datastore for a Web site (ASP.NET MVC) a judicious choice as an alternative to MS SQL Server ?
The main issue with DB4o is: Can you cut your object net in some useful manner? If not, then you'll keep too many objects in RAM for too long and your performance will suffer.
For example, in SQL, you can create a cursor and then easily traverse a huge set of results. You can also query for a small set of columns while DB4o always loads the whole objects (and its references and the references of the references). With DB4o, you must make sure that DB4o doesn't try to pull in all objects from the DB at once.
You'll also need to get used to querying your "DB" by filling out example objects, which feels weird in the beginning.
That depends on what kind of site you're creating, the traffic you're expecting, etc. Are you going to handle a million requests a second, or 100 a minute? Does your domain justify using an Object Database? Do you really need it?
In general, most sites are not heavy hitters, so they might not require all the scale-out functionality (I believe, and this is only a belief, that traditional RDBMSs have been tested and designed to handle extreme loads, whereas Object DBs might not have been given the same attention).
So then the question is: does your domain justify this? You're going to base a core piece of your site on a technology that you will not find a lot of experts in. So how do you handle staff turnover? Are you willing to take on the cost associated with training all current and future employees on this?

How can I implement Caching Strategy in my Asp.net Mvc With linq2sql repository?

I don't know if I should use the httpcontext caching or the Enterprise Library Caching Application Block. Also, what is the best pattern for the caching strategy when deleting or updating an entity that is part of a cached list?
Should I remove the whole list from the cache or only remove the item from the cached list?
If I update an entity, should I remove the list from the cache or update the entity in it?
Having done some testing with both I did a full review of the caching application block in the context of our code and blogged my experience with it. It's very simple to use and powerful enough for our needs. I would recommend it, my results were blogged here.
In your position I would use the Repository pattern to maintain my cache; it works well for database datasets and should work equally well for your own cache. If you're not familiar with the repository pattern, check out this post from Stephen Walther. I would tend to disagree with the previous answer, however, and take out only the items you need to modify, leaving the rest untouched. This will allow you to expire items from the cache independently of the whole list should you so wish.
There are several approaches to implementing caching, httpcontext being the easiest one, but it's not necessarily the worst. Take a look at memcached or MS Velocity, both of which can be used as backends for the ASP.NET Cache. Memcached especially has a reputation for doing a really good job.
As for caching policy: you have to decide what works best for you. I personally would remove the complete list from the cache upon update/delete rather than trying to find out whether the entity is in the list, since that takes a non-trivial amount of time and you need to take concurrency issues into account (locking the list, since somebody might do an update/delete of another entity).
Sometimes it does make sense to update an entity in place (if you have a complete object with all data you need), sometimes it's a waste of time, because due to some state change the entity should move somewhere else (e.g. when you sort by LastChangedDate etc.)
Don't forget to optimize your DB code too so that it does not take too much time to refresh the flushed list.
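
The two policies discussed above (drop the whole cached list on a write, or update the entity in place under a lock) can be compared in a small sketch. This is a language-neutral illustration in Python, not linq2sql or ASP.NET code; `CachedRepository`, `load_all` and the `"id"` field are hypothetical names chosen for the example.

```python
import threading
from typing import Callable, List, Optional


class CachedRepository:
    """Repository that caches one list and supports both update policies."""

    def __init__(self, load_all: Callable[[], List[dict]]) -> None:
        self._load_all = load_all          # stands in for the database query
        self._lock = threading.Lock()      # guards the cached list
        self._cached: Optional[List[dict]] = None

    def get_all(self) -> List[dict]:
        """Return the list, reloading it from the source on a cache miss."""
        with self._lock:
            if self._cached is None:
                self._cached = self._load_all()
            return list(self._cached)      # copy so callers cannot mutate the cache

    def invalidate(self) -> None:
        """Policy 1: drop the whole list after any update or delete."""
        with self._lock:
            self._cached = None

    def update_in_place(self, entity: dict) -> bool:
        """Policy 2: replace one entity in the cached list, if it is there."""
        with self._lock:
            if self._cached is None:
                return False
            for index, item in enumerate(self._cached):
                if item["id"] == entity["id"]:
                    self._cached[index] = entity
                    return True
            return False
```

`invalidate` is simpler and always correct, at the price of a full reload on the next read. `update_in_place` avoids the reload but needs the linear search and the lock, and it does not re-sort the list if the change affects ordering.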
just use [OutputCache(Duration=10, VaryByParam="none")]
on every action or even controller you want to cache.
from http://www.asp.net/mvc/tutorials/older-versions/controllers-and-routing/improving-performance-with-output-caching-cs
