Can anyone tell me, whats the benefit of having more number of replicas of a shard other than fault torrance? Does having more replicas enable to distribute the load among available replicas and leader thus improve query response time ?
You are right. More replicas means load distribution as well as failover.
You divide data into shards when single server cannot cope with it, and you add new replicas when the existing server(s) cannot cope with QPS (queries per second) rate.
Related
I'm using Locust to load test my web servers. I'm running Locust in distributed mode. The worker nodes are written in Java, and use the Locust/Java port using locust4j. The master node and the worker nodes are containerized, our orchestrator is Kubernetes. When I want to spin up more workers, I'm doing it from there.
The problem that I'm running into is that no matter how many users I add, or worker nodes I add, I can't seem to generate more than ~8000 RPM. This is confirmed by the Locust web frontend, as well as the metrics I'm collecting from my web server.
Does anyone have any ideas why this is happening?
I've attached an image of timings I've collected. The snapshots are from running the load test for 60 seconds, I'm timing it from a stopwatch.
The usual culprit in these kinds of situations is your servers can't handle more than that. In my experience, the behavior you'll see client side as the servers get overwhelmed is you'll start to see a slow but steady increase in response times. This is one big reason why Locust includes those in the metrics it shows you.
Based on what I'm seeing in your screenshots, this is most likely the case for you. You have some very low minimum times but your average, median, and 90%iles are a lot higher than your minimums; your maximums are very significantly higher than those. Without seeing your charts I can't know for sure but that's a big red flag.
For more things to look out for, check out this question in the FAQ (especially see the list of server stats to investigate):
https://github.com/locustio/locust/wiki/FAQ#increase-my-request-raterps
I have the following situation:
Two times a day for about 1h we receive a huge inflow in messages which are currently running through RabbitMQ. The current Rabbit cluster with 3 nodes can't handle the spikes, otherwise runs smoothly. It's currently setup on pure EC2 instances. The instance type is currenty t3.medium, which is very low, unless for the other 22h per day, where we receive ~5 msg/s. It's also setup currently has ha-mode=all.
After a rather lengthy and revealing read in the rabbitmq docs, I decided to just try and setup an ECS EC2 Cluster and scale out when cpu load rises. So, create a service on it and add that service to the service discovery. For example discovery.rabbitmq. If there are three instances then all of them would run on the same name, but it would resolve to all three IPs. Joining the cluster would work based on this:
That would be the rabbitmq.conf part:
cluster_formation.peer_discovery_backend = dns
# the backend can also be specified using its module name
# cluster_formation.peer_discovery_backend = rabbit_peer_discovery_dns
cluster_formation.dns.hostname = discovery.rabbitmq
I use a policy ha_mode=exact with 2 replicas.
Our exchanges and queues are created manually upfront for reasons I cannot discuss any further, but that's a given. They can't be removed and they won't be re-created on the fly. We have 3 exchanges with each 4 queues.
So, the idea: during times of high load - add more instances, during times of no load, run with three instances (or even less).
The setup with scale-out/in works fine, until I started to use the benchmarking tool and discovered that queues are always created on one single node which becomes the queue master. Which is fine considering the benchmarking tool is connected to one single node. Problem is, after scale-in/out, also our manually created queues are not moved to other nodes. This is also in line with what I read on the rabbit 3.8 release page:
One of the pain points of performing a rolling upgrade to the servers of a RabbitMQ cluster was that queue masters would end up concentrated on one or two servers. The new rebalance command will automatically rebalance masters across the cluster.
Here's the problems I ran into, I'm seeking some advise:
If I interpret the docs correctly, scaling out wouldn't help at all, because those nodes would sit there idling until someone would manually call rabbitmq-queues rebalance all.
What would be the preferred way of scaling out?
How made ban more than one container on a node with use: Docker, Swarm, Compose?
For example I have 5 nodes and I want deploy 3 replicas some service and I want that this replicas will be on different nodes.
Docker's swarm mode currently defaults to an HA scheduling strategy, so there's nothing needed to get your service spread across multiple nodes as long as there are multiple available nodes to schedule the task on. Constraints, memory restrictions, and outages may impact the available nodes that is has to schedule a task on. The HA scheduler first searches for candidate nodes that have the fewest instances of a task (typically 0 unless you have more replicas than nodes), and then it tie breaks the resulting list by favoring nodes with fewer tasks.
For further control to spread out a task on multiple nodes, I'd recommend looking into the placement preferences added in recent versions. I don't believe this has made it into stacks and the compose file yet, but I expect that's only a matter of time until it does (new swarm mode features are first introduced to the docker service command line and later added to higher level abstractions). The placement preferences allow you to label your nodes to indicate a higher level topology, e.g. common racks, switches, or datacenters. And the placement preferences enable topology aware scheduling that can spread the tasks out among nodes with unique values of the desired label. So if you have two separate racks with 5 nodes each, without the topology aware scheduling everything could be scheduled on a single rack, and with the topology aware scheduling, it could put half of your tasks on each rack.
I'm struggling to understand the idea of replica instances in Docker Swarm Mode. I've read that it's a feature that helps with high availability.
However, Docker automatically starts up a new task on a different node if one node goes down even with 1 replica defined for the service, which also provides high availability.
So what's the advantage of having 3 replica instances rather than 1 for an arbitrary service? My assumption was that with more replicas, Docker spends less time creating a new instance on another node in the event of failure, which aids performance. Is this correct?
What Makes a System Highly Available?
One of the goals of high availability is to eliminate single points of
failure in your infrastructure. A single point of failure is a
component of your technology stack that would cause a service
interruption if it became unavailable.
Let's take your example of a replica that consists of a single instance. Now let's suppose there is a failure. Docker Swarm will notice that the service failed and restart it. The service restarts, but a restart isn't instant. Let's say the restart takes 5 seconds. For those 5 seconds your service is unavailable. Single point of failure.
What if you had a replica that consists of 3 instances. Now when one of them fails (no service is perfect), Docker Swarm will notice that one of the instances is unavailable and create a new one. During that time you still have 2 healthy instances serving requests. To a user of your service, it appears as if there was no down time. This component is no longer a single point of failure.
ROMANARMY answer is very good and i just wanted to mention that the replicas can be on different nodes, so if one of your servers goes down(become unavailable) the container(replica) on the other server can be run without problem.
I wish to know is there any way that I can create the threads on other nodes without starting the process on the nodes.
For example :- lets say I have cluster of 5 nodes I am running an application on node1. Which creates 5 threads on I want the threads not to be created in the same system but across the cluster lets say 1 node 1 thread type.
Is there any way this can be done or is it more depends on the Load Scheduler and does openMP do something like that?
if there is any ambiguity in the question plz let me know I will clarify it.
Short answer - not simply. Threads share a processes' address space, and so therefore it is extremely difficult to relocate them across cluster nodes. And, if it is possible (systems do exist which support this) then getting them to maintain a consistent state introduces a lot of synchronization and communication overhead which impacts on performance.
In short, if you're distributing an application across a cluster, stick with multiple processes and choose a suitable communication mechanism.
generally, leave threads to vm or engine to avoid very inert locks, focus app or transport, if one, create time (200 hz=5ms heuristic), if 2, repaint, good pattern: eventdrive