Load Balancing in ASP.NET MVC Web Application. What can/can't be done?

I'm in the middle of developing a web application and have been asked whether it will work behind a load balancer. My initial reaction is yes, since there is no state tracked between requests anywhere in the system. However, there is some application-specific state loaded on app start (mainly configuration settings from the database).
This data is all read-only. Is it sufficient to rely on the normal cache-dependency mechanisms to manage it and invalidate these objects across all the applications in the cluster, or would I have to move to a shared cache system like AppFabric to ensure reliability/consistency?
With diagnostics enabled, I've got numerous logging calls using EventSource.Write and an out-of-process logger picking these up. I assume in this case I'd need one logger installed on each of the servers in the cluster to pick up the events each one triggers. I'm not too fussed about that, but what is a good way to identify which server in the cluster serviced a request?

If you initialize the data on each server separately and it is read-only, there's no problem. The separate applications will each have their own copy.
Yes, you'd need a logger on each instance. To identify the server, you could include the server's IP address in the log; that way you can track which server handled each request (provided you have static IPs, but I assume you do).
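For the "which server answered this?" part, one lightweight sketch (classic ASP.NET in Global.asax; the X-Served-By header name is my own choice, not a standard):

    // Global.asax.cs -- stamp every response with the machine that served it.
    protected void Application_EndRequest(object sender, EventArgs e)
    {
        Response.AppendHeader("X-Served-By", Environment.MachineName);
    }

The same Environment.MachineName value can be written into your EventSource.Write payloads, so each out-of-process log entry already names the server that produced it.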

Related

MassTransit in ASP.NET MVC site?

I'd like to decouple a number of business objects that my website uses to support user actions.
My website is a SaaS/B2B site and I do not anticipate needing "mega scale". My primary issue is a need to decouple business objects from each other, and to perform occasional longer-running operations asynchronously, outside the threads that handle user traffic.
Having said that, I really do not want a separate set of servers to process my messages, and would prefer the web servers to just host MassTransit (or other bus software) internally, in memory. Assured message delivery is also not yet my most important concern. I plan to "outsource" a number of supporting business actions to the bus so that they do not pollute my main business services/objects.
Is this possible? Will the loopback transport do for now, or do I need full RabbitMQ? Will RabbitMQ require me to install yet another set of servers to host it?
TIA
Loopback is just for testing. Installing RabbitMQ is the right path. You don't NEED separate servers for it, but I would suggest them: if you offload work to a bus, you don't really want that work contending for resources with the website. That said, you can run RabbitMQ locally without any issue. If message volume is low, so is RabbitMQ's resource usage. When you reach higher volumes, I/O can become a problem with RabbitMQ (or any MQ).
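For what it's worth, hosting the bus in-process on the web servers against a local broker looks roughly like this (a sketch using MassTransit's RabbitMQ transport; the queue name and consumer class are placeholders, and the exact configuration API varies between MassTransit versions):

    // Bus hosted inside the web application, talking to RabbitMQ on localhost.
    var busControl = Bus.Factory.CreateUsingRabbitMq(cfg =>
    {
        cfg.Host("localhost", "/", h =>
        {
            h.Username("guest");
            h.Password("guest");
        });

        // Long-running work is consumed off the web request threads.
        cfg.ReceiveEndpoint("background-work", e =>
        {
            e.Consumer<LongRunningJobConsumer>();   // hypothetical consumer class
        });
    });

    await busControl.StartAsync();   // start on app start, stop on app shutdown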

How to run two grails apps on the same machine and have them not share a rabbitMQ

I have a grails app running with a single rabbit node. It is great. I want to fire up the same app a second time on the same machine on a different port. Currently, both apps answer jobs from both apps. I want their rabbits to be independent. What is the easiest way to ensure that each app only responds to the messages it sends? Multiple rabbit queues?
You can provide a virtualhost entry in the grails configuration:
rabbitmq.connectionfactory.virtualHost — the name of the virtual host to connect to
Define two different vhosts in RabbitMQ, and each grails app will have their very own configured area to use. Messages sent through one vhost will only be available on that vhost, effectively separating the two grails apps without having to change queue setup or other internal parts of each app - just the configuration of the connection.
Remember that access control is performed on a per vhost basis, so you'll have to give your user access to each vhost in rabbitmq.
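Concretely, that means creating a vhost per app on the broker and pointing each app's connection factory at its own vhost (names and credentials below are examples):

    # On the RabbitMQ host: one vhost per app, with matching permissions.
    rabbitmqctl add_vhost /appA
    rabbitmqctl set_permissions -p /appA appA ".*" ".*" ".*"

    // grails-app/conf/Config.groovy for app A
    rabbitmq {
        connectionfactory {
            username    = 'appA'
            password    = 'secret'
            hostname    = 'localhost'
            virtualHost = '/appA'
        }
    }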
As @fiskfisk said, multiple vhosts are an option, and would work particularly well if you have a complex set of queues, exchanges, and bindings. There are some downsides to using a new vhost for the second application, including duplicated access-control management and some minor performance overhead.
If you have a fairly simple queue/exchange/binding setup, I would suggest pointing the second app at a queue with a different name, or giving your app the ability to be configured at runtime either to use a different queue or to leverage RabbitMQ's topic-based routing and have each app flag its messages with an app-specific prefix (or something similar).
One advantage of using topic routing to differentiate apps is that you can easily dip into the full stream of messages and do other things with that stream that you didn't foresee initially, including things like archival logging or audit logging, as well as other metrics collection or analysis.
tl;dr:
For long-term flexibility, have each instance of your application send messages to queues based on topic routing (see the sketch below).
For quick-and-dirty / get-it-working-yesterday, use a separate vhost for each instance of your application.
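A sketch of the topic-routing idea using the plain RabbitMQ Java client from Groovy (exchange, queue, and prefix names are all illustrative):

    import com.rabbitmq.client.ConnectionFactory

    def channel = new ConnectionFactory(host: 'localhost')
            .newConnection()
            .createChannel()

    // Both apps share one topic exchange...
    channel.exchangeDeclare('app-events', 'topic', true)

    // ...but app A only binds to (and therefore only receives) its own prefix.
    channel.queueDeclare('appA-jobs', true, false, false, null)
    channel.queueBind('appA-jobs', 'app-events', 'appA.#')

    // App A flags everything it publishes with that prefix.
    channel.basicPublish('app-events', 'appA.job.created', null, 'payload'.bytes)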

Horizontal scaling of JSF 2.0 application

Given that JavaServer Faces is inherently stateful on the server side, what methods are recommended for horizontally scaling a JSF 2.0 application?
If an application runs multiple JSF servers, I can imagine the following scenarios:
1. Sticky sessions: send all requests matching a given session to the same server.
Question: what technology is commonly used to achieve this?
Problem: a server failure loses its sessions... and it generally seems like a fragile architecture, especially when starting fresh (not trying to scale an existing application).
2. State (session) replication: replicate JSF state on all JSF servers in the cluster.
Question: what technology is commonly used to achieve this?
Problem: does not scale; with full replication, the cluster's effective session capacity equals the memory of its smallest server.
3. Instruct JSF (via configuration) to store its state on an external resource (e.g. another server running a very fast in-memory database), then access that resource from the JSF servers when application state is needed.
Question: is this possible?
4. Instruct JSF (via configuration) to be stateless.
Question: is this possible? (see the sketch after this list)
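On points 3 and 4: JSF can at least be told to keep view state off the servers entirely via client-side state saving, which serializes the view state into the rendered page. A minimal web.xml sketch using the standard JSF context parameter:

    <!-- web.xml: store JSF view state on the client instead of the server -->
    <context-param>
        <param-name>javax.faces.STATE_SAVING_METHOD</param-name>
        <param-value>client</param-value>
    </context-param>

JSF 2.2 later added truly stateless views via <f:view transient="true">, but that is per-view and not available in JSF 2.0.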
[EDIT]
Updated in response to Ravi's suggestion of Sticky Sessions
This can be achieved by configuring your load balancer in sticky-session mode.
That way, all subsequent requests within a session are sent to the same application server.
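With nginx in front, the simplest form of stickiness is ip_hash, which pins each client IP to one backend (hostnames below are placeholders):

    upstream jsf_cluster {
        ip_hash;
        server app1.internal:8080;
        server app2.internal:8080;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://jsf_cluster;
        }
    }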
Here's an idea from Jelastic PaaS:
Pair up application instances in two-server clusters, and apply session replication between the two instances within each cluster. Then you can add as many two-instance clusters as you want and load-balance requests between clusters, with each session sticking to the cluster it originated from. Within a cluster, requests can be load-balanced between the instances.
This way there is some degree of failover safety: if one instance in a cluster fails, the other takes over with the same session state. On the other hand, the memory impact is not as severe as with full replication.
In short, it is a combination of 1. and 2. from your question. Of course, there can be more than two instances in each cluster if availability is of greater concern.
Link to Jelastic docs I lifted the idea from: http://jelastic.com/docs/session-replication.
Disclaimer: I don't actually know how to configure this with JSF2, and have no affiliation with Jelastic. Just liked the idea and thought it might help.
What about session replication with "buddy" semantics?
With one buddy, effective memory is halved (every server holds the session data of two servers), which is a lot better than having to hold the data of each and every server in the cluster.
Buddy replication also reduces bandwidth overhead.

Why are some websites spread across www2, www3 sub-domains whilst others manage scaling without it?

I know it has to do with having a variety of load-balancing servers, but why do some sites use differently named "www" subdomains (www2.somesite.com, www3.somesite.com, etc.), whereas others can be perfectly massive without doing this, i.e. with all traffic going to www.hugesite.com?
Does it indicate certain architectural decisions / have a specific purpose? Can it be avoided, or is it a limitation of having the site scale a certain way?
www[n] is an easy way to add more servers to cope with load, since you can load-balance very easily between them: with www[n] you just redirect the request to the appropriate server and forget about subsequent requests, because the client then deals with www1 or www2, etc. directly. Adding more servers is simple, but the balancer itself keeps no state about subsequent requests.
The alternative is for the load balancer to maintain a pool of backend nodes "behind the scenes". It keeps track of which node each user has been allocated to, normally by using session cookies. Effectively, it maintains a big in-memory hash map of session IDs to backend nodes, and delegates each request from a user's browser to that user's node. It's more complex to set up, but more powerful in the long run.
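As a sketch of that cookie-pinning approach, haproxy can insert a cookie naming the backend node so each browser keeps returning to the node it was allocated to (node names and addresses are placeholders):

    frontend www
        bind *:80
        default_backend app_nodes

    backend app_nodes
        balance roundrobin
        # SERVERID remembers which node this browser was allocated to
        cookie SERVERID insert indirect nocache
        server node1 10.0.0.11:8080 check cookie node1
        server node2 10.0.0.12:8080 check cookie node2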
More info here:
http://en.wikipedia.org/wiki/Load_balancing_%28computing%29

Best practice for rate limiting users of a REST API?

I am putting together a REST API, and as I'm unsure how it will scale or what the demand for it will be, I'd like to be able to rate-limit use of it, as well as temporarily refuse requests when the box is over capacity or in some kind of slashdotted scenario.
I'd also like to be able to gracefully bring the service down temporarily (while giving clients results that indicate the main service is offline for a bit) when/if I need to scale the service by adding more capacity.
Are there any best practices for this kind of thing? The implementation is Rails with MySQL.
This is all done with an outer web server that listens to the world (I recommend nginx or lighttpd).
Regarding rate limits: nginx can limit requests, e.g. to 50 requests/minute per IP; everything over that gets a 503 page, which you can customize.
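That nginx limiting is done with limit_req (the zone name, error page, and upstream below are placeholders):

    # http context: one bucket per client IP, at roughly 50 requests/minute
    limit_req_zone $binary_remote_addr zone=api:10m rate=50r/m;

    server {
        location / {
            # allow small bursts; everything beyond is rejected with 503
            limit_req zone=api burst=10;
            error_page 503 /over_capacity.html;    # the customizable page
            proxy_pass http://app_servers;         # upstream defined elsewhere
        }
    }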
Regarding expected temporary downtime: in the Rails world this is done via a special maintenance.html page. There is usually some automation that creates or symlinks that file when the Rails app servers go down. I'd recommend relying not on the file's presence, but on the actual availability of the app server.
But really, you can start/stop services without losing any connections at all. That is, you can run a separate instance of the app server on a different UNIX socket or IP port and have the balancer (nginx/lighty/haproxy) use that new instance too. Then you shut down the old instance and all clients are served by the new one alone; no connections are lost. Of course, this scenario is not always possible; it depends on the type of change you introduced in the new version.
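With nginx as the balancer, that switchover is just an upstream edit plus a graceful reload, which does not drop in-flight connections (socket paths are illustrative):

    upstream app {
        # new instance, added first
        server unix:/tmp/app_v2.sock;
        # old instance: remove it and reload again once v2 is confirmed healthy
        server unix:/tmp/app_v1.sock;
    }

    # apply each config change gracefully, without dropping connections:
    #   nginx -s reload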
haproxy is a balancer-only solution; it can balance requests to the app servers in your farm extremely efficiently.
For a fairly big service, you end up with something like:
api.domain resolving via round-robin DNS to N balancers;
each balancer proxying requests to M web servers for static content and P app servers for dynamic content. (Then again, your REST API doesn't have static files, does it?)
For a fairly small service (under 2K requests/second), all the balancing can be done inside one or two web servers.
Good answers already. If you don't want to implement the limiter yourself, there are also solutions like 3scale (http://www.3scale.net), which does rate limiting, analytics, etc. for APIs. It works using a plugin (there is a Ruby API plugin) that hooks into the 3scale architecture. You can also use it via Varnish, with Varnish acting as a rate-limiting proxy.
I'd recommend implementing the rate limits outside of your application, since otherwise high traffic will still have the effect of killing your app. One good solution is to implement them as part of your Apache proxy, with something like mod_evasive.
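A mod_evasive sketch for Apache (these are mod_evasive's standard directives; the thresholds are illustrative, not recommendations):

    <IfModule mod_evasive20.c>
        # max requests for the same page per DOSPageInterval seconds
        DOSPageCount        5
        DOSPageInterval     1
        # max total requests per client per DOSSiteInterval seconds
        DOSSiteCount        50
        DOSSiteInterval     1
        # how long (seconds) an offending IP stays blocked
        DOSBlockingPeriod   10
    </IfModule>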
