How well does HttpRuntime.Cache scale? - asp.net-mvc

We are thinking of using HttpRuntime.Cache for storing data accessed frequently by all users, but wanted to know what are the performance implications of using HttpRuntime.Cache? Are the contents of the cache transported in every http request and response? How much information can be reasonably stored in there?

What are the performance implications of using HttpRuntime.Cache?
Normally, Cache is stored in server's memory, unless server are configured as Web Farm or Web Garden. As the result, accessing to Cache is really fast compare is Database.
Are the contents of the cache transported in every http request and
response?
No.
How much information can be reasonably stored in there?
Virtually, there is no limit. However, you only want to cache the information - you often needed and do not change very often. In addition, you do not want to cache images and files - Cache is not meant for that.

Related

When fetching data using an api, is it best to store that data on another database, or is it best to keep fetching that data whenever you need it? [duplicate]

This question already has an answer here:
Caching calls to an external API in a rails app
(1 answer)
Closed 6 years ago.
I'm using the TMDB api in order to fetch information such as film titles and release years, but am wondering whether I need to create an extra database in order to store all this information locally, rather than keep having to use the api to get the info? For example, should I create a film model and call:
film.title
and by doing so accessing a local database with the title stored on it, or do I call:
Tmdb::Movie.detail(550).title
and by doing so making another call to the api?
Having dealt with a large Rails application that made service calls to about a dozen other applications, caching is your best bet. The problem with the database solution is keeping it up to date. The problem with making the calls every time is that it's too slow. There is a middle ground. For this you want Ruby on Rails Low Level Caching:
Sometimes you need to cache a particular value or query result instead of caching view fragments. Rails' caching mechanism works great for storing any kind of information.
The most efficient way to implement low-level caching is using the Rails.cache.fetch method. This method does both reading and writing to the cache. When passed only a single argument, the key is fetched and value from the cache is returned. If a block is passed, the result of the block will be cached to the given key and the result is returned.
An example that is pertinent to your use case:
class TmdbService
def self.movie_details(id)
Rails.cache.fetch("tmdb.movie.details.#{id}", expires_in: 4.hours) do
Tmdb::Movie.detail id
end
end
You can then configure your Rails application to use memcached or the database for the cache, it doesn't matter. The point is you want this cached data to expire at some point to ensure you are getting up-to-date information.
This is a big decision to make. If the amount of data you get through the API is not huge you can store all of it in your database. This way you will get the data much faster and your application will work even when the API is down.
If the amount of data you get is huge and you don't have sources to store all the data, you should at least store the most important data in your database as cache.
If you do not store any data on you own you are dependent on the source of data and it can have downtime.
Problem with storing data on your side is when the data change and you need to synchronize. In that case it is still good to store data on your side as cache to get results faster and synchronize the data periodically.
Calls to a local database are way faster than calls to external APIs. I would expect a local database to return within a few milliseconds, whereas an API will probably take hundreds of milliseconds. And local calls are less likely effected by network issues or downtimes.
Therefore I would always cache the result of an API call in a local database and occasionally updated the local version with a newer version from the API.
But in the end it depends on your requirement: Do you need real-time or is a cached version okay? How often do you need that data and how often is is updated? How fast is the API and is latency an issue? Does the API have a rate limit (a maximum number of request per time)?

OutputCache being invalidated

I've noticed that every time we alter the response cookies through:
HttpContext.Response.Cookies.Add(myCookie)
Header becomes:
Cache-Control: public, no-cache="Set-Cookie"
and Output Cache is invalidated.
It is very annoying and I was wondering if any one noticed similiar issues while output caching.
You could always switch to using a server-side caching model, such as System.Web.Caching.Cache or System.Runtime.Caching.MemoryCache, which would share caching of objects between users while still allowing communication with the browser.
Frankly, this server-side is the first caching model I have used. I only recently started using output caching and I find it very limited by comparison. Its only advantages are that it caches the page on the client side under certain scenarios and that it caches content rather than the data that generates the content (saving some CPU cycles). Its main disadvantage is that you have to disable it under certain conditions, such as during authentication or writing cookies. You never have to disable server-side caching - not even for application pool recycles - because it doesn't hinder communication with the browser.
For the best of both worlds, you could combine both approaches so whatever backend process that you don't want executed multiple times provide cached data when the view is generated. Then you would have client-side caching in most cases, and would rely on the server side caching when updating cookies. It could take more memory to use this approach, but that tradeoff might be worth it in your case.

Can I download and parse a large file with RestKit?

I have to download thousands or millions of hotposts from a web service and store them locally in core data. The json response or file is about 20 or 30 MB, so download will take time. I guess mapping and store it in core data will also take time time.
Can I do it in restkit? or has been designed just for reasonable size responses?
I see I can track progress when downloading a large file, even I see I can know when mapping starts or finishes: http://restkit.org/api/latest/Protocols/RKMapperOperationDelegate.html
Probably I can also encapsulate the core data operation to avoid blocking the UI.
What do you think? Do you think is this feasible? Or should I select a more manual approach? I would like to know your opinion. Thanks in advance.
Your problem is not encapsulation or threading, it's memory usage.
For a start, thousands or millions of 'hot posts' are likely to cause you issues on a mobile device. You should usually be using a web service that allows you to obtain a filtered set of content. If you don't have that already, consider creating it (possibly by uploading the data to a service like Parse.com).
RestKit isn't designed to use a streaming parser, so the full JSON will need to be deserialised into memory before it can be processed. You can try it, but I suspect the mobile device will be unhappy if the JSON is 20 / 30 MB.
So, create a nice web service or use a streaming parser and process the results yourself (which, could technically be done using RestKit mapping operations).

safe and performing way to save site wide variable

I am building a rails app, have a site wide counter variable (counter) that is used (read and write) by different pages, basically many pages can cause the counter increment, what is a multi-thread-safe way to store this variable so that
1) it is thread-safe, I may have many concurrent user access might read and write this variable
2) high performing, I originally thought about persist this variable in DB, but wondering is there a better way given there can be high volume of request and I don't want to make this DB query the bottleneck of my app...
suggestions?
It depends whether it has to be perfectly accurate. Assuming not, you can store it in memcached and sync to the database occasionally. If memcached crashes, it expires (shouldn't happen if configured properly), you have to shutdown, etc., reload it from the database on startup.
You could also look at membase. I haven't tried it, but to simplify, it's a distributed memcached server that automatically persists to disk.
For better performance and accuracy, you could look at a sharded approach.
Well you need persistence, so you have to store it in the Database/Session/File, AFAIK.

Storing Data In Memory: Session vs Cache vs Static

A bit of backstory: I am working on an web application that requires quite a bit of time to prep / crunch data before giving it to the user to edit / manipulate. The data request task ~ 15 / 20 secs to complete and a couple secs to process. Once there, the user can manipulate vaules on the fly. Any manipulation of values will require the data to be reprocessed completely.
Update: To avoid confusion, I am only making the data call 1 time (the 15 sec hit) and then wanting to keep the results in memory so that I will not have to call it again until the user is 100% done working with it. So, the first pull will take a while, but, using Ajax, I am going to hit the in-memory data to constantly update and keep the response time to around 2 secs or so (I hope).
In order to make this efficient, I am moving the intial data into memory and using Ajax calls back to the server so that I can reduce processing time to handle the recalculation that occurs w/ this user's updates.
Here is my question, with performance in mind, what would be the best way to storing this data, assuming that only 1 user will be working w/ this data at any given moment.
Also, the user could potentially be working in this process for a few hours. When the user is working w/ the data, I will need some kind of failsafe to save the user's current data (either in a db or in a serialized binary file) should their session be interrupted in some way. In other words, I will need a solution that has an appropriate hook to allow me to dump out the memory object's data in the case that the user gets disconnected / distracted for too long.
So far, here are my musings:
Session State - Pros: Locked to one user. Has the Session End event which will meet my failsafe requirements. Cons: Slowest perf of the my current options. The Session End event is sometimes tricky to ensure it fires properly.
Caching - Pros: Good Perf. Has access to dependencies which could be a bonus later down the line but not really useful in current scope. Cons: No easy failsafe step other than a write based on time intervals. Global in scope - will have to ensure that users do not collide w/ each other's work.
Static - Pros: Best Perf. Easies to maintain as I can directly leverage my current class structures. Cons: No easy failsafe step other than a write based on time intervals. Global in scope - will have to ensure that users do not collide w/ each other's work.
Does anyone have any suggestions / comments on what I option I should choose?
Thanks!
Update: Forgot to mention, I am using VB.Net, Asp.Net, and Sql Server 2005 to perform this task.
I'll vote for secret option #4: use the database for this. If you're talking about a 20+ second turnaround time on the data, you are not going to gain anything by trying to do this in-memory, given the limitations of the options you presented. You might as well set this up in the database (give it a table of its own, or even a separate database if the requirements are that large).
I'd go with the caching method of for storing the data across any page loads. You can name the cache you want to store the data in to avoid conflicts.
For tracking user-made changes, I'd go with a more old-school approach: append to a text file each time the user makes a change and then sweep that file at intervals to save changes back to DB. If you name the files based on the user/account or some other session-unique indicator then there's no issue with conflict and the app (or some other support app, which might be a better idea in general) can sweep through all such files and update the DB even if the session is over.
The first part of this can be adjusted to stagger the write out more: save changes to Session, then write that to file at intervals, then sweep the file at larger intervals. you can tune it to performance and choose what level of possible user-change loss will be possible.
Use the Session, but don't rely on it.
Simply, let the user "name" the dataset, and make a point of actively persisting it for the user, either automatically, or through something as simple as a "save" button.
You can not rely on the session simply because it is (typically) tied to the users browser instance. If they accidentally close the browser (click the X button, their PC crashes, etc.), then they lose all of their work. Which would be nasty.
Once the user has that kind of control over the "persistent" state of the data, you can rely on the Session to keep it in memory and leverage that as a cache.
I think you've pretty much just answered your question with the pros/cons. But if you are looking for some peer validation, my vote is for the Session. Although the performance is slower (do you know by how much slower?), your processing is going to take a long time regardless. Do you think the user will know the difference between 15 seconds and 17 seconds? Both are "forever" in web terms, so go with the one that seems easiest to implement.
perhaps a bit off topic. I'd recommend putting those long processing calls in asynchronous (not to be confused with AJAX's asynchronous) pages.
Take a look at this article and ping me back if it doesn't make sense.
http://msdn.microsoft.com/en-us/magazine/cc163725.aspx
I suggest to create a copy of the data in a new database table (let's call it EDIT) as you send the initial results to the user. If performance is an issue, do this in a background thread.
As the user edits the data, update the table (also in a background thread if performance becomes an issue). If you have to use threads, you must make sure that the first thread is finished before you start updating the rows.
This allows a user to walk away, come back, even restart the browser and commit whenever she feels satisfied with the result.
One possible alternative to what the others mentioned, is to store the data on the client.
Assuming the dataset is not too large, and the code that manipulates it can be handled client side. You could store the data as an XML data island or JSON object. This data could then be manipulated/processed and handled all client side with no round trips to the server. If you need to persist this data back to the server the end resulting data could be posted via an AJAX or standard postback.
If this does not work with your requirements I'd go with just storing it on the SQL server as the other comment suggested.

Resources