Given that JavaServer Faces is inherently stateful on the server side, what methods are recommended for horizontally scaling a JSF 2.0 application?
If an application runs multiple JSF servers, I can imagine the following scenarios:
Sticky Sessions: send all requests matching a given session to the same server.
Question: what technology is commonly used to achieve this?
Problem: server failure results in lost sessions... and it generally seems like a fragile architecture, especially when starting fresh (i.e. not trying to scale an existing application).
State (Session) Replication: replicate JSF state on all JSF servers in cluster
Question: what technology is commonly used to achieve this?
Problem: does not scale; with full replication the cluster can hold no more session state than the smallest server's memory.
Instruct JSF (via configuration) to store its state on an external resource (e.g. another server running a very fast in-memory database), then have the JSF servers read that resource whenever application state is needed? (One shape this can take is sketched after this list.)
Question: is this possible?
Instruct JSF (via configuration) to be stateless?
Question: is this possible?
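For what it's worth, one concrete shape option 3 can take is to externalise the HTTP session itself (which is where JSF keeps its server-side view state) into a shared store. A minimal sketch, assuming Spring Session backed by a Redis instance; the library, the store and the defaults here are my own illustration rather than anything the JSF spec mandates, and a memcached-backed session manager in the servlet container would achieve the same effect:

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.session.data.redis.config.annotation.web.http.EnableRedisHttpSession;

// Sketch only: sessions (and with them JSF's server-side view state) live in Redis,
// so any node in the cluster can serve any request belonging to the session.
@Configuration
@EnableRedisHttpSession
public class SessionConfig {

    @Bean
    public LettuceConnectionFactory connectionFactory() {
        // Assumes a Redis instance on localhost:6379; point this at your shared store.
        return new LettuceConnectionFactory();
    }
}
```

You would also need to register the Spring Session filter (for example by extending AbstractHttpSessionApplicationInitializer) so it wraps HttpSession before the FacesServlet sees it.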
[EDIT]
Updated in response to Ravi's suggestion of Sticky Sessions
This can be achieved by configuring your load balancer in sticky-session mode, so that all subsequent requests from a given session are sent to the same application server.
Here's an idea from Jelastic PaaS:
Pair up application instances in two-server clusters and apply session replication between the two instances within each cluster. You can then add as many two-instance clusters as you want and load balance requests between clusters, with each session sticking to the cluster it originated from. Within a cluster, requests can be load balanced between the two instances.
This way there is some degree of fail safety: if one instance in a cluster fails, the other takes over with the same session state. On the other hand, the memory impact is not as severe as with full replication.
In short, it is a combination of options 1 and 2 from your question. Of course, there can be more than two instances in each cluster if availability is a greater concern.
Link to Jelastic docs I lifted the idea from: http://jelastic.com/docs/session-replication.
Disclaimer: I don't actually know how to configure this with JSF2, and have no affiliation with Jelastic. Just liked the idea and thought it might help.
What about session replication with "buddy" semantics?
With one buddy, each server's usable memory is effectively halved (every server holds its own session data plus that of one other server), which is a lot better than having to hold the data of every server in the cluster.
Buddy replication also reduces bandwidth overhead.
I'm working to port some data access to DynamoDB in a high-traffic app. A bit of background: the app collects a very high volume of data, and some specific tables were causing performance issues in a traditional DB. With a bit of redesign and some changes to the data layout, we have been able to make them fit the DynamoDB niche nicely.
My question is around the use/creation of the client object. The SDK docs suggest it is better to create one client and share it amongst multiple threads, so in my repository implementation I have the client defined as a lazy singleton. This means it will be created once and all requests will share the same client (currently around 4000 requests per minute, but likely to grow massively as we come out of beta and start promoting the product).
Does anyone have any experience of making the AWS SDK scale?
Thanks
Sam
In some SDKs, when you create one client and share it with multiple threads, only one thread can use the client at any point in time.
On the other hand, creating a separate client for every thread will definitely slow things down.
So I would suggest a middle approach here:
Increase the HTTP connection pool size, so that more concurrent connections can be created.
Then share the client object across threads, as in the sketch below.
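The question doesn't say which SDK is in use; purely as a hedged illustration of both points using the Java SDK (the names and pool size are just examples), a lazily created shared client with an enlarged connection pool might look like this:

```java
import com.amazonaws.ClientConfiguration;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;

// Sketch only: one thread-safe client shared by all threads, created lazily,
// with the HTTP connection pool raised above the SDK default so concurrent
// requests are not starved for connections.
public final class DynamoClientHolder {

    private static class Holder {
        static final AmazonDynamoDBClient CLIENT = createClient();
    }

    private static AmazonDynamoDBClient createClient() {
        ClientConfiguration cfg = new ClientConfiguration();
        cfg.setMaxConnections(200);           // default is 50 in the Java SDK
        return new AmazonDynamoDBClient(cfg); // credentials come from the default provider chain
    }

    public static AmazonDynamoDBClient get() {
        return Holder.CLIENT;
    }

    private DynamoClientHolder() {
    }
}
```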
Batch operations can also be used with the .NET AWS SDK:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/BatchOperationsORM.html
I'm in the middle of developing a web application and have been asked whether it will work with a load balancer. My initial reaction is yes, since there is no state tracked between requests anywhere in the system. However, there is some application-specific state loaded on app start (mainly configuration settings from the database).
This data is all read-only. Is it sufficient to rely on the normal cache dependency mechanisms to manage this and invalidate these objects across all the applications in the cluster, or would I have to move to a shared cache system like AppFabric to ensure reliability/consistency?
With diagnostics enabled, I've got numerous logging calls using EventSource.Write and an out-of-process logger picking these up. I assume in this case I'd need one logger installed on each of the servers in the cluster to pick up the events each one triggers. I'm not too fussed about that, but what is a good way to identify which server in the cluster serviced the request?
If you initialize the data on each server separately and it is read-only, there's no problem. The separate applications will each have their own copy.
Yes, you'd need a logger on each instance. To identify the server, you could include the server's IP address in the log entries; that way you can track which server handled the request (provided you have static IPs, but I assume you do).
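The stack in the question is .NET, so the exact APIs will differ, but the idea is just to resolve the machine's identity once and stamp it onto every log entry. A language-agnostic sketch (written in Java here, with made-up names):

```java
import java.net.InetAddress;
import java.util.logging.Logger;

// Sketch only: resolve the node's identity once and prefix every log line with it,
// so entries from different servers in the cluster can be told apart.
public class NodeAwareLogger {

    private static final Logger LOG = Logger.getLogger(NodeAwareLogger.class.getName());
    private static final String NODE = resolveNode();

    private static String resolveNode() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return local.getHostName() + "/" + local.getHostAddress();
        } catch (Exception e) {
            return "unknown-node";
        }
    }

    public static void info(String message) {
        LOG.info("[" + NODE + "] " + message);
    }
}
```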
The Couchbase 2.0 manual describes network partitioning as a potential issue.
http://docs.couchbase.org/couchbase-manual-2.0/couchbase-architecture.html#couchbase-architecture-failover-automatic-considerations
But I didn't see how (or whether) Couchbase 2.0 deals with such issues on the datastore side.
My question is: how is CAS implemented in a cluster, and how do CAS operations deal with the split-brain problem? Is there a cluster-wide lock? Is it last-writer-wins?
The same question was asked on our Google Groups list: http://groups.google.com/group/couchbase/browse_thread/thread/e0d543d9b17f9c77
It's down at the bottom of the thread, in posts starting on Aug 30.
Perry
Membase and Couchbase Server 2.0 partition data. For each piece of data (vbucket) there is always a single server that is the source of truth.
The good side of this is that it's always strictly consistent; there's no need to design for conflict resolution, etc.
But when a node goes down, you simply lose access to a subset of your data. You can do a failover, in which case replicas are promoted to masters for the vbuckets that were lost, thus recovering access to those vbuckets. Note that losing some recent mutations is unavoidable in that case due to replication lag. Also, failover is a manual operation (although recent versions have a very carefully implemented and limited autofailover).
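So CAS does not need a cluster-wide lock: the compare-and-swap is checked by the single node that owns the key's vbucket. A hedged sketch with the 1.x Couchbase Java client (connection details are placeholders):

```java
import java.net.URI;
import java.util.Arrays;

import com.couchbase.client.CouchbaseClient;
import net.spy.memcached.CASResponse;
import net.spy.memcached.CASValue;

// Sketch only: read a value together with its CAS identifier, then attempt a
// compare-and-swap. The swap succeeds only if nobody else modified the key
// since the read.
public class CasExample {

    public static void main(String[] args) throws Exception {
        CouchbaseClient client = new CouchbaseClient(
                Arrays.asList(URI.create("http://127.0.0.1:8091/pools")), "default", "");

        client.set("counter", 0, "1").get();               // seed a value
        CASValue<Object> current = client.gets("counter"); // value plus its CAS id
        CASResponse result = client.cas("counter", current.getCas(), "2");

        System.out.println(result); // OK if unmodified since the read, EXISTS otherwise
        client.shutdown();
    }
}
```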
I know it has to do with having a variety of load-balancing servers, but why do some sites make use of differently named "www" subdomains (www2.somesite.com, www3.somesite.com, etc.) whereas others can be perfectly massive without doing this, i.e. all traffic goes to www.hugesite.com?
Does it indicate certain architectural decisions / have a specific purpose? Can it be avoided or is it a limitation of having the site scale a certain way?
www[n] is an easy way to add more servers to cope with load, since you can load balance very easily between the various servers: with www[n] you can just redirect the request to the appropriate server and forget about subsequent requests, because the client then deals directly with www1 or www2, etc. Adding more servers is simple, but it's non-persistent in terms of subsequent requests.
The alternative is for the load balancer to maintain a pool of backend nodes that are kept "behind the scenes". It keeps track of which node the user has been allocated to, normally by using session cookies. It just maintains a big in-memory hashmap (effectively) of session IDs to backend nodes, delegating each request from a user's browser to the allocated backend node. It's more complex to set up, but more powerful in the long run, as sketched below.
More info here:
http://en.wikipedia.org/wiki/Load_balancing_%28computing%29
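As a toy illustration of that in-memory session-to-backend mapping (not how any particular balancer is actually implemented), assuming the session ID comes from a cookie:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Sketch only: the first request in a session is assigned a backend at random;
// every later request carrying the same session ID is routed to the same backend.
public class StickyBalancer {

    private final List<String> backends; // e.g. ["10.0.0.1:8080", "10.0.0.2:8080"]
    private final Map<String, String> sessionToBackend = new ConcurrentHashMap<>();

    public StickyBalancer(List<String> backends) {
        this.backends = backends;
    }

    public String route(String sessionId) {
        return sessionToBackend.computeIfAbsent(sessionId,
                id -> backends.get(ThreadLocalRandom.current().nextInt(backends.size())));
    }
}
```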
I am putting together a REST API and, as I'm unsure how it will scale or what the demand for it will be, I'd like to be able to rate-limit use of it as well as temporarily refuse requests when the box is over capacity or in some kind of slashdotting scenario.
I'd also like to be able to gracefully bring the service down temporarily (while giving clients results that indicate the main service is offline for a bit) when/if I need to scale the service by adding more capacity.
Are there any best practices for this kind of thing? The implementation is Rails with MySQL.
This is all done with an outer web server that listens to the world (I recommend nginx or lighttpd).
Regarding rate limits, nginx is able to limit requests, e.g. to 50 req/minute per IP; everything over that gets a 503 page, which you can customize.
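Conceptually that per-IP limit is just a counter per client per time window; in practice you would configure it in nginx (the limit_req module) rather than write it yourself, but as a hedged sketch of the mechanics:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: fixed-window counter per client IP. Real setups usually do this
// in the front-end web server rather than in application code.
public class RateLimiter {

    private static final int LIMIT_PER_MINUTE = 50;

    private static final class Window {
        long windowStart = System.currentTimeMillis();
        int count = 0;
    }

    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    /** Returns true if the request is allowed, false if the caller should be sent a 503. */
    public boolean allow(String clientIp) {
        Window w = windows.computeIfAbsent(clientIp, ip -> new Window());
        synchronized (w) {
            long now = System.currentTimeMillis();
            if (now - w.windowStart >= 60_000) { // start a fresh one-minute window
                w.windowStart = now;
                w.count = 0;
            }
            return ++w.count <= LIMIT_PER_MINUTE;
        }
    }
}
```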
Regarding expected temporary downtime, in the Rails world this is done via a special maintenance.html page. There is some kind of automation that creates or symlinks that file when the Rails app servers go down. I'd recommend relying not on the file's presence, but on the actual availability of the app server.
But really you are able to start/stop services without losing any connections at all. That is, you can run a separate instance of the app server on a different UNIX socket/IP port and have the balancer (nginx/lighty/haproxy) use that new instance too. Then you shut down the old instance and all clients are served by the new one only. No connections are lost. Of course, this scenario is not always possible; it depends on the type of change you introduced in the new version.
haproxy is a balancer-only solution. It can balance requests to the app servers in your farm extremely efficiently.
For a fairly big service you end up with something like:
api.domain resolving (round-robin) to N balancers
each balancer proxying requests to M web servers for static content and P app servers for dynamic content. Oh well, your REST API doesn't have static files, does it?
For a fairly small service (under 2K rps) all the balancing is done inside one or two web servers.
Good answers already; if you don't want to implement the limiter yourself, there are also solutions like 3scale (http://www.3scale.net), which does rate limiting, analytics, etc. for APIs. It works using a plugin (there is a Ruby API plugin) which hooks into the 3scale architecture. You can also use it via Varnish and have Varnish act as a rate-limiting proxy.
I'd recommend implementing the rate limits outside of your application, since otherwise high traffic will still have the effect of killing your app. One good solution is to implement it as part of your Apache proxy, with something like mod_evasive.