ASP.NET MVC, autocomplete textbox, caching?

Using ASP.NET MVC, I've implemented an autocomplete textbox using an approach very similar to Ben Scheirman's implementation, as shown here: http://flux88.com/blog/jquery-auto-complete-text-box-with-asp-net-mvc/
What I haven't been able to figure out is whether it's a good idea to cache the data for the autocomplete textbox, so that there isn't a round trip to the database on every keystroke.
If caching is preferred, can you point me in the direction of how to implement caching for this purpose?

You have a couple of things to ask yourself:
Is the data I'm pulling back dynamic?
If not, how often do I expect this call to occur?
If the answers are 1) not really and 2) frequently, you should cache it.
I don't know how your data access is set up, but I simply throw my data into cache objects like so:
public IQueryable<Category> FindAllCategories()
{
    // Return the cached list if we already have one.
    if (HttpContext.Current.Cache["AllCategories"] != null)
        return (IQueryable<Category>)HttpContext.Current.Cache["AllCategories"];

    // Materialize the query with ToList() before caching; caching the raw
    // IQueryable would cache the query, not the results, and every
    // enumeration would still hit the database.
    IQueryable<Category> allCats = (from c in db.Categories
                                    orderby c.Name
                                    select c).ToList().AsQueryable();

    // Cache with a 30-minute sliding expiration.
    HttpContext.Current.Cache.Add(
        "AllCategories",
        allCats,
        null,
        System.Web.Caching.Cache.NoAbsoluteExpiration,
        new TimeSpan(0, 0, 30, 0, 0),
        System.Web.Caching.CacheItemPriority.Default,
        null);

    return allCats;
}
This is an example of one of my repository queries, based on LINQ to SQL. It first checks the cache; if the entry exists there, it returns it. If not, it goes to the database, materializes the results, and caches them with a sliding expiration.
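To tie this back to the autocomplete question: a controller action can then filter the cached list on every keystroke instead of querying the database each time. A minimal sketch, assuming a hypothetical CategoryRepository that exposes the method above (on MVC 2 and later, JsonRequestBehavior.AllowGet is needed for GET requests):

public class CategoryController : Controller
{
    private readonly CategoryRepository _repository = new CategoryRepository();

    // Hit by the autocomplete plugin on each keystroke; after the first
    // call, the repository serves the category list straight from cache.
    public ActionResult Suggest(string term)
    {
        var matches = _repository.FindAllCategories()
            .Where(c => c.Name.StartsWith(term, StringComparison.OrdinalIgnoreCase))
            .Select(c => c.Name)
            .Take(10)
            .ToList();

        return Json(matches, JsonRequestBehavior.AllowGet);
    }
}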

You sure can cache your result, using the OutputCache attribute like:
[OutputCache(Duration=60, VaryByParam="searchTerm")]
ASP.NET will handle the rest.
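In context, the attribute would sit on the action that serves the autocomplete results. A sketch (the action name, the searchTerm parameter, and the GetMatches call are assumptions, not from the original post):

[OutputCache(Duration = 60, VaryByParam = "searchTerm")]
public ActionResult Search(string searchTerm)
{
    // The rendered result is cached for 60 seconds per distinct searchTerm,
    // so repeated requests for the same prefix never reach the database.
    var results = GetMatches(searchTerm); // hypothetical data-access call
    return Json(results, JsonRequestBehavior.AllowGet);
}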

I think caching in this case would require more work than simply storing every request. You'd want to focus more on the terms being searched than individual keys. You'd have to keep track of what terms are more popular and cache combinations of characters that make up those terms. I don't think simply caching every single request is going to get you any performance boost. You're just going to have stale data in your cache.

Well, how will caching in ASP.NET prevent server round trips? You'll still have server round trips; at best you won't have to hit the database if you cache. If you want to prevent server round trips, you need to cache on the client side.
While this is quite easily possible with JavaScript (store your data in a variable and check that variable for relevant data before querying the server again), I don't know of a ready-made tool that does this for you.
I do recommend you consider caching to prevent round trips. In fact, I have half a mind to implement JavaScript caching in one of my own websites after reading this.

Related

Is there any alternative to ASP.NET MVC OutputCache without sliding expiration

I have a public static/singleton class with IsDataModified(), which is affected by changes in the database, files, type of user, API, etc.; it processes immediately and just returns a bool.
The frequency of modification of the output data varies extremely, from a minute to months, so I won't use sliding expiration; instead I let the duration be MAX or infinite.
But what I'm looking for is:
Request by browser
MVC filter checks whether the cache is missing or IsDataModified()
If so, update the cache and return it
Else return the existing cache
I tried extending OutputCache and setting the duration to a very large number, but once the page is cached the filters are no longer triggered.
Basically, I do not want the specified duration to be the deciding factor for when the cache expires; rather, IsDataModified() should be the deciding factor.
One approach I can think of is to create a simple filter and use the output cache or a similar object from code, but I could not find OutputCacheAttribute giving me a cached ViewResult.
Is this possible? Please suggest.
So I have implemented a solution built on top of Redis (memcache is a lot messier). I use an open source Redis output cache provider which basically creates a key corresponding to the URL of the page. Whenever the underlying data is changed for one of the pages, I remove the values from Redis whose keys match some pattern. (My data sort of has a hierarchy, so I delete the cache for more items when it is a piece of data from the parent that is updated.)
Using a similar approach of deleting the cached page when the data is updated would probably also work for you. On a side note, I am thinking of changing my process so that a background service re-creates the page when data is updated and replaces the cache, so that the first users don't get a slow response after the page is removed from the cache.
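A minimal sketch of that pattern-based invalidation, assuming the StackExchange.Redis client (the connection string and key pattern are placeholders):

using StackExchange.Redis;

public class PageCacheInvalidator
{
    private readonly ConnectionMultiplexer _redis =
        ConnectionMultiplexer.Connect("localhost:6379");

    // Deletes every cached page whose key matches the pattern, e.g.
    // "/products/42*" to invalidate a parent page and all of its children.
    public void InvalidatePages(string pattern)
    {
        IDatabase db = _redis.GetDatabase();
        foreach (RedisKey key in _redis.GetServer("localhost", 6379).Keys(pattern: pattern))
        {
            db.KeyDelete(key);
        }
    }
}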

Caching lookups in MVC, default expiration?

I am caching lookup data in my MVC application; I have the following code:
// GET: Audit grants for a given audit
public JsonResult GetAuditGrants(int auditID)
{
    AuditDAL ad = new AuditDAL();

    // Include the auditID in the cache key; a single "AuditGrants" entry
    // would serve the first audit's data for every subsequent audit.
    string cacheKey = "AuditGrants_" + auditID;
    if (System.Web.HttpContext.Current.Cache[cacheKey] == null)
    {
        System.Web.HttpContext.Current.Cache[cacheKey] = ad.GetAuditIssueGrants(auditID);
    }

    var types = (IEnumerable<Grant>)System.Web.HttpContext.Current.Cache[cacheKey];
    return this.Json(types.ToList());
}
If no expiration is set, when does the data in the cache expire by default? Also, is it recommended to store the expiration settings in the web.config for consistency across the lookup data in my app?
To answer your first question, we can consult MSDN. According to its documentation, adding an object using the Item property (or indexer) is equivalent to calling the Insert method, whose documentation states:
The object added to the cache using this overload of the Insert method
is inserted with no file or cache dependencies, a priority of Default,
a sliding expiration value of NoSlidingExpiration, and an absolute
expiration value of NoAbsoluteExpiration.
Your second question is really pretty application-specific. The best practice is to profile your application. If your application is experiencing a ton of cache misses and your cache stays small, then you might want to extend the sliding expiration window by using one of the Add or Insert overloads that give you that control. In that case, storing your selected parameters in the app settings seems like a good idea.
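For example, a config-driven sliding expiration could look like this sketch (the CacheSlidingMinutes appSettings key is an assumption):

using System;
using System.Configuration;
using System.Web;
using System.Web.Caching;

public static class LookupCache
{
    public static void Store(string key, object value)
    {
        // Read the sliding window from web.config so it can be tuned without
        // recompiling, e.g. <add key="CacheSlidingMinutes" value="20" />
        int minutes = int.Parse(ConfigurationManager.AppSettings["CacheSlidingMinutes"]);

        HttpContext.Current.Cache.Insert(
            key,
            value,
            null,                       // no dependencies
            Cache.NoAbsoluteExpiration, // expire by inactivity only
            TimeSpan.FromMinutes(minutes));
    }
}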
One thing to remember about this cache, however: it is per-app domain. If you have multiple web frontends, or even an IIS server configured to launch more than one worker process for your app, then you may not be getting the most out of your caching strategy. In that case, you might need to use something that can offer persistence to multiple instances of your app. We use Redis, but there are many other options.

ASP.NET MVC NHibernate issue

I am experiencing some bizarre problems with NHibernate within my MVC web application.
There isn't one consistent error; I keep getting loads of random ones:
Transaction not successfully started
New request is not allowed to start because it should come with valid transaction descriptor
Unexpected row count: -1; expected: 1
To give a little context to the setup, I am using Ninject to inject the sessions and other NHibernate-related objects. Currently I am using RequestScope, though I have also tried SingletonScope. I have a large and complicated data model, which is read out as a whole but persisted back in separate parts, as these can all be edited and saved individually.
An example would be a Customer object, which contains an address object, a contact object, a friends object, previous orders, etc.
So the whole object is read out, mapped to the UI domain models, and then displayed in different partials within the page. Each partial can be updated individually via AJAX, so you may update one section or all of them together. It seems to give me problems mainly when I try to persist them all together (so 2-4 simultaneous AJAX requests persisting chunks of the model).
Now I have integration tests that work fine; they test the persistence and retrieval of entities, both as a whole and individually, and all pass. In the web app, however, it just seems to keep throwing random exceptions, and originally refused to persist outside of the NHibernate cache. I found a way around this by wrapping most units of work within transactions, which got the data persisting but started adding new errors to the mix.
Originally I was thinking of just scrapping NHibernate from the project, as although I really want its persistence/caching layer, it just didn't seem to be flexible enough for my domain. That seems odd, as I have used it before without much problem, although it doesn't like 1-1 mappings.
So has anyone else had flaky transaction/NHibernate issues like this within an ASP.NET MVC app? I know this may be a bit vague, as the errors don't point to one thing and it doesn't always error, so it's like stabbing in the dark, but I am out of ideas so any help would be great!
-- Update --
I cannot post all relevant code as the project is huge, but the transaction bit looks like:
using (var transaction = sessionManager.Session.BeginTransaction(IsolationLevel.ReadUncommitted))
{
    try
    {
        // Do unit of work
        transaction.Commit();
    }
    catch (Exception)
    {
        transaction.Rollback();
        throw;
    }
}
Some of the main problems I have had on this project have stemmed from:
There are some 1-1 relationships with composite keys, but logically it makes sense.
The NHibernate domain entities go through a mapping layer to become the UI domain entities, and vice versa when saving. The problem here is that, with the 1-1 mappings, when persisting the example Address I have to make a surrogate Customer object with the correct Id and then merge.
There is a LOT of AJAX that deals with chunks of the overall model (I talk as if there is one single model, but there are quite a few top-level models, just one that is most important).
Some notes that may help. I use Windsor, but I imagine the concepts are the same. It sounds like there may be a combination of things.
The SessionFactory should be created as a singleton, and the session should be per web request. Something like:
Bind<ISessionFactory>()
    .ToProvider<SessionFactoryBuilder>()
    .InSingletonScope();

Bind<ISession>()
    .ToMethod(context => context.Kernel.Get<ISessionFactory>().OpenSession())
    .InRequestScope();
Be careful of keeping transactions open for too long; keep them as short-lived as possible to avoid deadlocks.
Check that your queries are running as expected by using a tool like NHProf. Often people load up too much of the graph, which impacts performance and can create deadlocks.
Check your mappings for things like not.lazyload(), see if you actually need the additional data in the queries, and keep the results returned to a minimum. Check your queries' execution plans and ensure adequate indexes are in place.
I have had issues with MVC 3 action filters being cached, which meant transactions were not always started but would still attempt to be closed, causing issues. I moved all my transaction commits into ActionResults in the controllers to keep each transaction as short as possible and close to the action (see the sketch after this list).
Check your cascades in your mappings and keep the updates to a minimum.
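That last point might look something like the following sketch, reusing the sessionManager from the question; the action, model, and UpdateAddress call are placeholders:

public ActionResult SaveAddress(AddressModel model)
{
    // Open the transaction only around the actual persistence work and
    // commit immediately; nothing else happens inside it.
    using (var transaction = sessionManager.Session.BeginTransaction())
    {
        try
        {
            UpdateAddress(model); // hypothetical unit of work
            transaction.Commit();
        }
        catch
        {
            transaction.Rollback();
            throw;
        }
    }
    return Json(new { success = true });
}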

Where / How to fit Solr into ASP.net MVC app (using nHibernate / Repository Pattern)

I'm currently in the middle of building a reasonably large question/answer based application (kind of like stackoverflow / answerbag.com).
We're using SQL (Azure) and nHibernate for data access and MVC for the UI app.
So far, the schema is roughly along the lines of the stackoverflow db in the sense that we have a single Post table (contains both questions / answers)
We're probably going to use something along the lines of the following repository interface:
public interface IPostRepository
{
    void PutPost(Post post);
    void PutPosts(IEnumerable<Post> posts);
    void ChangePostStatus(string postID, PostStatus status);
    void DeleteArtefact(string postId, string artefactKey);
    void AddArtefact(string postId, string artefactKey);
    void AddTag(string postId, string tagValue);
    void RemoveTag(string postId, string tagValue);
    void MarkPostAsAccepted(string id);
    void UnmarkPostAsAccepted(string id);
    IQueryable<Post> FindAll();
    IQueryable<Post> FindPostsByStatus(PostStatus postStatus);
    IQueryable<Post> FindPostsByPostType(PostType postType);
    IQueryable<Post> FindPostsByStatusAndPostType(PostStatus postStatus, PostType postType);
    IQueryable<Post> FindPostsByNumberOfReplies(int numberOfReplies);
    IQueryable<Post> FindPostsByTag(string tag);
}
My question is:
Where / how would I fit Solr into this for better querying of these "Posts"?
(I'll be using SolrNet for the actual communication with Solr.)
Ideally, I'd be using the SQL db as merely a persistent store.
The bulk of the above IQueryable operations would move into some kind of SolrFinder class (or something like that).
The Body property is the one that currently causes the problems - it's fairly large, and slows down queries in SQL.
My main problem is that if someone "updates" a post - adds a new tag, for example - then that whole post will need re-indexing.
Obviously, doing this will require a query like this:
"SELECT * FROM POST WHERE ID = xyz"
This will, of course, be very slow.
SolrNet has an NHibernate facility, but I believe this would give the same result as above?
I thought of a way around this, which I'd like your views on:
Adding the ID to a queue (Amazon SQS or something - I like the ease of use with this)
Having a service (or a bunch of services) somewhere that performs the above-mentioned query, constructs the document, and re-adds it to Solr (sketched below)
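As a rough sketch of what that service's core could look like, assuming SolrNet's ISolrOperations<T>, a hypothetical PostDocument mapped type, a hypothetical MapToDocument helper, and a string ID property on Post:

using System.Collections.Generic;
using System.Linq;
using SolrNet;

public class ReindexService
{
    private readonly ISolrOperations<PostDocument> _solr;
    private readonly IPostRepository _posts;

    public ReindexService(ISolrOperations<PostDocument> solr, IPostRepository posts)
    {
        _solr = solr;
        _posts = posts;
    }

    // Called with the IDs pulled off the queue: run the expensive SELECT
    // once per changed post, rebuild its Solr document, and re-add it.
    public void Reindex(IEnumerable<string> postIds)
    {
        foreach (string id in postIds)
        {
            Post post = _posts.FindAll().Single(p => p.ID == id);
            _solr.Add(MapToDocument(post)); // hypothetical entity-to-document mapping
        }
        _solr.Commit();
    }
}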
Another problem I'm having with my design:
Where should the "re-indexing" method(s) be called from?
The MVC controller? Or should I have a "PostService"-type class that wraps the instance of IPostRepository?
Any pointers would be gratefully received on this one!
On the e-commerce site that I work for, we use Solr to provide fast faceting and searching of the product catalog. (In non-Solr geek terms, this means the "ATI Cards (34), NVIDIA (23), Intel (5)" style of navigation links that you can use to drill-down through product catalogs on sites like Zappos, Amazon, NewEgg, and Lowe's.)
This is because Solr is designed to do this kind of thing fast and well, and trying to do this kind of thing efficiently in a traditional relational database is, well, not going to happen, unless you want to start adding and removing indexes on the fly and go full EAV, which is just cough Magento cough stupid. So our SQL Server database is the "authoritative" data store, and the Solr indexes are read-only "projections" of that data.
You're with me so far because it sounds like you are in a similar situation. The next step is determining whether or not it is OK that the data in the Solr index may be slightly stale. You've probably accepted the fact that it will be somewhat stale, but the next decisions are
How stale is too stale?
When do I value speed or querying features over staleness?
For example, I have what I call the "Worker", which is a Windows service that uses Quartz.NET to execute C# IJob implementations periodically. Every 3 hours, one of these jobs that gets executed is the RefreshSolrIndexesJob, and all that job does is ping an HttpWebRequest over to http://solr.example.com/dataimport?command=full-import. This is because we use Solr's built-in DataImportHandler to actually suck in the data from the SQL database; the job just has to "touch" that URL periodically to make the sync work. Because the DataImportHandler commits the changes periodically, this is all effectively running in the background, transparent to the users of the Web site.
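For illustration, that job can be as small as the following sketch, assuming Quartz.NET's IJob interface (the exact Execute signature varies between Quartz.NET versions):

using System.Net;
using Quartz;

public class RefreshSolrIndexesJob : IJob
{
    public void Execute(IJobExecutionContext context)
    {
        // "Touching" the DataImportHandler URL makes Solr pull fresh data
        // from the SQL database; the import itself runs inside Solr.
        var request = WebRequest.Create(
            "http://solr.example.com/dataimport?command=full-import");
        using (request.GetResponse())
        {
        }
    }
}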
This does mean that information in the product catalog can be up to 3 hours stale. A user might click a link for "Medium In Stock (3)" on the catalog page (since this kind of faceted data is generated by querying SOLR) but then see on the product detail page that no mediums are in stock (since on this page, the quantity information is one of the few things not cached and queried directly against the database). This is annoying, but generally rare in our particular scenario (we are a reasonably small business and not that high-traffic), and it will be fixed up in 3 hours anyway when we rebuild the whole index again from scratch, so we have accepted this as a reasonable trade-off.
If you can accept this degree of "staleness", then this background worker process is a good way to go. You could take the "rebuild the whole thing every few hours" approach, or your repository could insert the ID into a table, say, dbo.IdentitiesOfStuffThatNeedsUpdatingInSolr, and then a background process can periodically scan through that table and update only those documents in Solr if rebuilding the entire index from scratch periodically is not reasonable given the size or complexity of your data set.
A third approach is to have your repository spawn a background thread that updates the Solr index for the current document more or less at the same time as the save, so the data is only stale for a few seconds:
class MyRepository
{
    void Save(Post post)
    {
        // The following method runs on the current thread:
        SaveThePostInTheSqlDatabaseSynchronously(post);

        // The following method spawns a new thread, task,
        // QueueUserWorkItem, whatever floats our boat this week,
        // and so returns immediately:
        UpdateTheDocumentInTheSolrIndexAsynchronously(post);
    }
}
But if this explodes for some reason, you might miss updates in Solr, so it's still a good idea to have Solr do a periodic "blow it all away and refresh", or have a reaper background Worker-type service that checks for out-of-date data in Solr every once in a blue moon.
As for querying this data from Solr, there are a few approaches you could take. One is to hide the fact that Solr exists entirely via the methods of the Repository. I personally don't recommend this because chances are your Solr schema is going to be shamelessly tailored to the UI that will be accessing that data; we've already made the decision to use Solr to provide easy faceting, sorting, and fast display of information, so we might as well use it to its fullest extent. This means making it explicit in code when we mean to access Solr and when we mean to access the up-to-date, non-cached database object.
In my case, I end up using NHibernate to do the CRUD access (loading an ItemGroup, futzing with its pricing rules, and then saving it back), forgoing the repository pattern because I don't typically see its value when NHibernate and its mappings are already abstracting the database. (This is a personal choice.)
But when querying on the data, I know pretty well if I'm using it for catalog-oriented purposes (I care about speed and querying) or for displaying in a table on a back-end administrative application (I care about currency). For querying on the Web site, I have an interface called ICatalogSearchQuery. It has a Search() method that accepts a SearchRequest where I define some parameters--selected facets, search terms, page number, number of items per page, etc.--and gives back a SearchResult--remaining facets, number of results, the results on this page, etc. Pretty boring stuff.
Where it gets interesting is that the implementation of that ICatalogSearchQuery is using a list of ICatalogSearchStrategy implementations underneath. The default strategy, the SolrCatalogSearchStrategy, hits SOLR directly via a plain old-fashioned HttpWebRequest and parses the XML in the HttpWebResponse (which is much easier to use, IMHO, than some of the SOLR client libraries, though they may have gotten better since I last looked at them over a year ago). If that strategy throws an exception or vomits for some reason, then the DatabaseCatalogSearchStrategy hits the SQL database directly--although it ignores some parameters of the SearchRequest, like faceting or advanced text searching, since that is inefficient to do there and is the whole reason we are using Solr in the first place. The idea is that usually SOLR is answering my search requests quickly in full-featured glory, but if something blows up and SOLR goes down, then the catalog pages of the site can still function in "reduced-functionality mode" by hitting the database with a limited feature set directly. (Since we have made it explicit in code that this is a search, that strategy can take some liberties in ignoring some of the search parameters without worrying about affecting clients too severely.)
Key takeaway: What is important is that the decision to perform a query against a possibly-stale data store versus the authoritative data store has been made explicit--if I want fast, possibly stale data with advanced search features, I use ICatalogSearchQuery. If I want slow, up-to-date data with the insert/update/delete capability, I use NHibernate's named queries (or a repository in your case). And if I make a change in the SQL database, I know that the out-of-process Worker service will update Solr eventually, making things eventually consistent. (And if something was really important, I could broadcast an event or ping the SOLR store directly, telling it to update, possibly in a background thread if I had to.)
Hope that gives you some insight.
We use Solr to query a large product database: around 1 million products and 30 stores.
What we did is use triggers on the product and stock tables in our SQL Server.
Each time a row is changed, it flags the product to be reindexed, and a Windows service grabs these products and posts them to Solr every 10 seconds (with a limit of 100 products per batch).
It's super efficient - almost real-time info for the stock.
If you have a big text field (your 'body' field), then yes, re-index in the background. The solutions you mentioned (a queue or a periodic background service) will do.
MVC controllers should be oblivious to this process.
I noticed you have IQueryables in your repository interface. SolrNet does not currently have a LINQ provider. Anyway, if those operations are all you're going to do with Solr (i.e. no faceting), you might want to consider using Lucene.Net instead, which does have a LINQ provider.

Where should caching occur in an ASP.NET MVC application?

I need to cache some data using System.Web.Caching.Cache. Not sure if it matters, but the data does not come from a database; it comes from a plethora of custom objects.
ASP.NET MVC is fairly new to me, and I'm wondering where it makes sense for this caching to occur:
Model or Controller?
At some level it makes sense to cache in the model, but I don't necessarily know the implications of doing so (if any). If caching were done at the controller level, would that affect all requests, or just the current HttpContext?
So... where should application data caching be done, and what's a good way of actually doing it?
Update
Thanks for the great answers! I'm still trying to work out where it makes the most sense to cache given different scenarios. If one is caching the entire page, then keeping it in the view makes sense, but where do you draw the line when it's not the entire page?
I think it ultimately depends on what you are caching. If you want to cache the result of rendered pages, that is tightly coupled to the HTTP nature of the request and would suggest an ActionFilter-level caching mechanism.
If, on the other hand, you want to cache the data that drives the pages themselves, then you should consider model level caching. In this case, the controller doesn't care when the data was generated, it just performs the logic operations on the data and prepares it for viewing. Another argument for model level caching is if you have other dependencies on the model data that are not attached to your Http context.
For example, I have a web app where most of my model is abstracted into a completely different project. This is because there will be a second web app that uses the same backing, AND there's a chance we might have a non-web-based app using the same data as well. Much of my data comes from web services, which can be performance killers, so I have model-level caching that the controllers and views know absolutely nothing about.
I don't know the answer to your question, but Jeff Atwood talks about how the SO team did caching using the MVC framework for stackoverflow.com on a recent Hanselminutes show that might help you out:
http://www.hanselminutes.com/default.aspx?showID=152
Quick Answer
I would start with CONTROLLER caching via the OutputCache attribute, and later add model caching if required. It's quicker to implement and has instant results.
Detailed Answer (because I like the sound of my voice)
Here's an example.
[OutputCache(Duration = 60, VaryByParam = "None")]
public ActionResult CacheDemo()
{
    return View();
}
This means that when a user hits the site (within the cache conditions defined in the attribute), there's less work to be done. If there's only model caching, then even though the logic (and most likely the DB hit) is cached, the web server still has to render the page. Why do that when the rendered result will always be the same?
So start with output caching, then move on to model caching as you performance-test your site.
Output caching is also a lot simpler to start out with. You don't have to worry about web-farm distributed caching problems (if you are part of a farm) or about the caching provider for the model.
Advanced Caching Techniques
You can also apply donut caching -> cache only part of the UI page :) Check it out!
I would choose caching at the model level. (In general, the advice seems to be to minimize business logic at the controller level and move as much as possible into model classes.)
How about doing it like this:
I have some entries in the model, represented by the class Entry, and a source of entries (a database, or 'a plethora of custom objects'). In the model I make an interface for retrieving entries:
public interface IEntryHandler
{
    IEnumerable<Entry> GetEntries();
}
In the model I have an actual implementation of IEntryHandler, where the entries are read from and written to the cache:
public class EntryHandler : IEntryHandler
{
    public IEnumerable<Entry> GetEntries()
    {
        // Check if the objects are in the cache:
        var entries = (List<Entry>)HttpRuntime.Cache["Entries"];
        if (entries == null)
        {
            // There were no entries in the cache, so we read them from the
            // source (the database or 'plethora of custom objects'):
            entries = LoadEntriesFromSource(); // placeholder for the actual lookup

            // Save the retrieved entries to the cache for later use:
            HttpRuntime.Cache.Insert("Entries", entries);
        }
        return entries;
    }
}
The controller would then call the IEntryHandler:
public class HomeController : Controller
{
    private IEntryHandler _entryHandler;

    // The default constructor, using cache and database/custom objects
    public HomeController()
        : this(new EntryHandler())
    {
    }

    // This constructor allows us to unit test the controller
    // by writing a test class that implements IEntryHandler
    // but does not affect cache or entries in the database/custom objects
    public HomeController(IEntryHandler entryHandler)
    {
        _entryHandler = entryHandler;
    }

    // This controller action returns a list of entries to the view:
    public ActionResult Index()
    {
        return View(_entryHandler.GetEntries());
    }
}
This way it is possible to unit test the controller without touching real cache/database/custom objects.
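As a sketch of such a test (NUnit, the FakeEntryHandler, and a parameterless Entry constructor are assumptions for illustration):

using System.Collections.Generic;
using System.Linq;
using System.Web.Mvc;
using NUnit.Framework;

// A stand-in that never touches the cache or the database.
public class FakeEntryHandler : IEntryHandler
{
    public IEnumerable<Entry> GetEntries()
    {
        return new List<Entry> { new Entry(), new Entry() };
    }
}

[TestFixture]
public class HomeControllerTests
{
    [Test]
    public void Index_ReturnsEntriesFromHandler()
    {
        var controller = new HomeController(new FakeEntryHandler());

        var result = (ViewResult)controller.Index();
        var model = (IEnumerable<Entry>)result.ViewData.Model;

        Assert.AreEqual(2, model.Count());
    }
}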
I think the caching should somehow be related to the model. The controller shouldn't care about the data; its responsibility is to map the data, regardless of where it comes from, to the views.
Also try to think about why you need to cache. Do you want to save processing, data transmission, or something else? This will help you decide where exactly you need your caching layer.
It all depends on how expensive the operation is. If you have complicated queries, then it might make sense to cache the data at the controller level so that the query is not executed again (until the cache expires).
Keep in mind that caching is a very complicated topic. There are many different places that you can store your cache:
Akamai / CDN caching
Browser caching
In-Memory application caching
.NET's Cache object
Page directive
Distributed cache (memcached)
