What are the default queue size and max concurrent connections/requests supported by Kestrel? - kestrel-http-server

I am reading that Kestrel theoretically supports unlimited concurrent connections.
Is there a default value for the maximum number of concurrent connections and requests?
If that value is exceeded, is there some kind of queue, and if so, what is the queue size for connections/requests?

Is there a default value for the maximum number of concurrent connections?
No, the default is unlimited.
If the value is exceeded, is there some kind of queue, and if so, what is the queue size?
If a limit is configured, Kestrel will start to reject connections once the limit is reached; it won't queue them.
I assume you meant to ask about connections and not requests? For limiting and queueing requests there is https://www.nuget.org/packages/Microsoft.AspNetCore.ConcurrencyLimiter/
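For reference, the connection-level limits live on KestrelServerOptions.Limits and default to null, i.e. unlimited. A minimal sketch, assuming ASP.NET Core minimal hosting; the numbers are illustrative, not recommendations:

var builder = WebApplication.CreateBuilder(args);
builder.WebHost.ConfigureKestrel(options =>
{
    // null (the default) means unlimited concurrent connections
    options.Limits.MaxConcurrentConnections = 100;
    // upgraded connections (e.g. WebSockets) are counted separately
    options.Limits.MaxConcurrentUpgradedConnections = 100;
});
// request-level limiting/queueing via the ConcurrencyLimiter package linked above
builder.Services.AddQueuePolicy(options =>
{
    options.MaxConcurrentRequests = 10; // requests processed concurrently
    options.RequestQueueLimit = 25;     // waiting requests beyond this are rejected
});
var app = builder.Build();
app.UseConcurrencyLimiter();
app.MapGet("/", () => "hello");
app.Run();

Once the connection limit is reached, new connections are refused rather than queued, matching the behaviour described above; the request queue, by contrast, has an explicit size (RequestQueueLimit).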

Related

Implement queue priority in RabbitMQ

I'm using RabbitMQ in my application. There are 2 queues for different kinds of requests. On the consumer side I've set up a few listeners that listen to these 2 queues. Whenever a message comes in, these listeners start processing it. I need to make sure that messages in one queue are treated as higher priority than messages in the other queue.
The problem is that AMQP queues are FIFO queues. If a low-priority message arrives before the high-priority message, it will be consumed first.
I need to implement this in such a way that if any listener becomes free, it serves the high-priority queue first, irrespective of the arrival time of the message in the high-priority queue.
For example, there are 3 requests in the low-priority queue and 2 listeners at the consumer end. Since there is no request in the high-priority queue, both listeners take messages from the low-priority queue and process them. Meanwhile, another message gets queued in the high-priority queue. At this point there is one request in the high-priority queue and one in the low-priority queue. In my current implementation, the available listener consumes the message from the low-priority queue first.
I'd like to know how I can set the queue priority such that a listener always processes messages in the high-priority queue first.
Please help me.
FYI: the publisher is a Grails application and the consumer is a Java app; on the Grails end, I'm using the rabbitmq plugin to publish the messages.
Thanks!
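One hedged workaround (a sketch, not a feature of the rabbitmq plugin itself): since each AMQP queue is FIFO, a consumer that becomes free can poll the high-priority queue first with basicGet and fall back to the low-priority queue only when it is empty. The queue names below are placeholders, and polling trades a little latency (the sleep) for strict priority. Alternatively, if a single queue is acceptable, RabbitMQ 3.5+ supports per-message priorities via the x-max-priority queue argument.

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.GetResponse;

public class PriorityAwareWorker {
    // queue names are placeholders for illustration
    private static final String HIGH = "high.priority.queue";
    private static final String LOW = "low.priority.queue";

    public static void main(String[] args) throws Exception {
        Connection conn = new ConnectionFactory().newConnection();
        Channel channel = conn.createChannel();
        while (true) {
            // always drain the high-priority queue before touching the low one
            GetResponse msg = channel.basicGet(HIGH, false);
            if (msg == null) {
                msg = channel.basicGet(LOW, false); // fall back to low priority
            }
            if (msg == null) {
                Thread.sleep(200); // both queues empty; back off briefly
                continue;
            }
            process(msg.getBody());
            channel.basicAck(msg.getEnvelope().getDeliveryTag(), false);
        }
    }

    private static void process(byte[] body) {
        System.out.println("processing: " + new String(body));
    }
}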

How to determine concurrency (threads) while using shoryuken for background jobs?

In my Ruby on Rails application, I'm using Shoryuken for background processing. I have many SQS queues (6-7) in my application. One of the queues has 2000-3000 jobs, and it takes around 3 hours for the worker to process these 2-3k jobs with the default concurrency of 25. What factors should we consider when deciding to increase the concurrency (the number of threads processing jobs)? Please comment if anything is unclear in the question.
Concurrency defaults to 25, but can be changed by altering your shoryuken.yml configuration (see below) or by adding the concurrency argument like so: shoryuken -c {desiredCount}
concurrency: 25 # Update with your desired value.
delay: 25 # The delay in seconds to pause a queue when it's empty. Default 0
queues:
- [high_priority, 6]
- [default, 2]
- [low_priority, 1]
You will need to test for the optimal value, as you'll run into I/O and CPU bottlenecks as the number of concurrent threads rises. Once you've reached the optimal value for your instance(s), you'll need to either increase the number of instances running this job or upgrade the instance(s).
If the bottleneck is instead your DB or another resource, you'll need to adjust accordingly. (Not likely to be the case, but included for thoroughness' sake.)
EDIT: Optimizing Performance
In response to your question about optimizing the thread count: the quickest/best way to determine the optimal concurrency value is to change the concurrency and measure real-world throughput. There are other approaches, but the golden rule for performance is always to measure in a live production environment. Synthetic benchmarks are only helpful to the extent that they mirror real-world performance. (See also: premature optimization.)
This is a case where you can easily end up overthinking things (then again, overthinking is a perennial problem in development). Just measure with the appropriate metrics (CPU utilization, memory utilization, number of jobs completed per minute), and change the number of threads until you either maximize throughput or run into a bottleneck.
If your tasks are CPU-bound, you'll see your CPU utilization max out. If your tasks are I/O-bound, you'll see that beyond some point an increase in concurrent threads does not translate into an increase in throughput, even though your CPU utilization fails to rise.
An I/O bottleneck can happen when any of the resources you're reading or writing can't keep up with your CPU demands. This includes system resources (memory, disk space), your database (DB CPU utilization, read/write limits), and any other APIs you're connecting to. Network capacity is also a theoretical bottleneck, but if it were your limit you'd be operating at a scale where you'd already have hired a specialist for it. Because there are so many different ways for this to happen, the only real way to figure out where the bottlenecks are is to have monitoring in place.
Re: a formula, the short answer is that there's no single formula you can use here. The long answer is probably that one exists, but you'd arrive at the optimum value in the course of collecting all the values you'd need to calculate it.
EDIT 2: Concurrency, Latency, and Throughput
I realized I forgot one more piece of advice. When you're working with background tasks that users are not waiting on, throughput (jobs per unit of time) is the only thing you want to optimize; do not optimize for individual job time. This also means you cannot profile the current (and presumably unconstrained) performance and get useful data, because bottlenecks/constraints are target-dependent. The constraints that apply to throughput will NOT be the same as the constraints that apply to individual task time.
(Technically speaking, your concurrency setting is your current constraint.)
The three main factors are:
Number of cores
Type of job - I/O-bound or CPU-bound
Whether another application or process is running on the server
Ideally, for a CPU-bound task, keep the number of threads equal to the number of CPU cores.
An I/O-bound task requires benchmarking and calculating the wait time for an I/O operation, after which you can decide on the optimal value. As a rough estimate, if you have 4 cores, then for an I/O-bound task you should keep at most 8 threads.
If your Rails app is running on the same server, you will need to leave some cores for it.
Increasing the number of threads will not improve performance if your system cannot support them.
Refer to: http://baddotrobot.com/blog/2013/06/01/optimum-number-of-threads/

What should pool size be, w/ relation to workers, processes, and threads?

Is the following correct?
A dyno/worker and process are the same thing.
Puma, the webserver, can fork off multiple dynos/workers/processes for the application.
At a lower level of abstraction, each process has multiple threads (at least in a multi-threaded webserver like Puma?). In Rails 5, this is set to 5 threads by default (/config/puma.rb). (As an aside, what determines a good thread count?)
Each of these threads can handle and respond to one request at a time, and if it interacts with the database it must have a connection to read and write.
Each thread has a maximum of one connection to the database at any time, and the pool (/config/database.yml) is simply a cache of connections to the database (so threads can pick up connections more quickly?).
So, the highest value the pool would ever need to be is the number of processes (dynos) times the number of threads per process, while remaining less than or equal to the database's max connections?
Is the pool size setting per dyno/worker/process, or for the whole application? If per process, then the pool should be the same size as the number of threads in one process, not times all of them?
Thanks!
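For reference, the pool setting in config/database.yml is per process, so the common pattern is to tie it to the per-process thread count. A minimal sketch of the conventional pairing (the ENV names follow Rails' generated defaults; the numbers are illustrative):

# config/puma.rb
threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }
threads threads_count, threads_count
workers ENV.fetch("WEB_CONCURRENCY") { 2 }

# config/database.yml
production:
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

Total connections then approach workers times threads (here 2 x 5 = 10), and that product is what must stay at or below the database's connection limit.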

When is it appropriate to increase the async-thread size from zero?

I have been reading the documentation trying to understand when it makes sense to increase the async-thread pool size via the +A N switch.
I am perfectly prepared to benchmark, but I was wondering if there were a rule-of-thumb for when one ought to suspect that growing the pool size from 0 to N (or N to N+M) would be helpful.
Thanks
The BEAM runs Erlang code in special threads it calls schedulers. By default it will start a scheduler for every core in your processor. This can be controlled at start-up time, for instance if you don't want to run Erlang on all cores but want to "reserve" some for other things. Normally, when you do a file I/O operation it is run in a scheduler, and as file I/O operations are relatively slow they block that scheduler while they are running, which can affect the real-time properties. Normally you don't do that much file I/O, so it is not a problem.
The asynchronous thread pool consists of OS threads that are used for I/O operations. Normally the pool is empty, but if you use the +A option at start-up the BEAM will create extra threads for this pool. These threads will then be used only for file I/O operations, which means the scheduler threads no longer block waiting for file I/O and the real-time properties improve. Of course this has a cost, as OS threads aren't free. The two kinds don't mix: scheduler threads are just scheduler threads, and async threads are just async threads.
If you are writing linked-in drivers for ports, these can also use the async thread pool, but you have to detect whether it has been started yourself.
How many you need is very much up to your application. By default none are started. Like #demeshchuk, I have also heard that Riak likes to have a large async thread pool, as it opens many files. My only advice is to try it and measure; as with all optimisation, that's the only way to know.
By default, the number of scheduler threads in a running Erlang VM is equal to the number of logical processor cores (if you are using SMP, of course).
From my experience, increasing the +A parameter may give some performance improvement when you have many simultaneous file I/O operations. I doubt that increasing +A would improve overall process performance, though, since the BEAM's scheduler is extremely fast and optimized.
Speaking of exact numbers: that totally depends on your application, I think. Say, in the case of Riak, where the maximum number of open files is more or less predictable, you can set +A to that maximum, or several times less if it's way too big (by default it's 64, BTW). If your application contains, say, millions of files and you serve them to web clients, that's another story; most likely you'll want to run some benchmarks with your own code in your own environment.
Finally, I believe I've never seen +A set to more than a hundred. That doesn't mean you can't set it higher, but there's likely no point in it.
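For concreteness, the pool size is set when the VM starts; a minimal sketch (64 is just the Riak-style figure mentioned above, not a recommendation):

# start the BEAM with 64 async threads dedicated to file I/O
erl +A 64

In an OTP release the same flag typically goes in vm.args.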

Rails app connection pool size, avoiding max pool size issues

I am running a JRuby on Rails application. I see a lot of this randomly in my logs:
The max pool size is currently 5; consider increasing it
I understand I can increase the max pool size in my configuration to address this. What I'm really trying to understand is what the optimal number should be. I want to avoid contention issues for connections, but clearly setting this number to something obnoxiously large will not work either.
Is there a general protocol to follow to determine your app's optimal pool size setting?
From here,
The optimum size of a thread pool depends on the number of processors available and the nature of the tasks on the work queue. On an N-processor system for a work queue that will hold entirely compute-bound tasks, you will generally achieve maximum CPU utilization with a thread pool of N or N+1 threads.
For tasks that may wait for I/O to complete -- for example, a task that reads an HTTP request from a socket -- you will want to increase the pool size beyond the number of available processors, because not all threads will be working at all times. Using profiling, you can estimate the ratio of waiting time (WT) to service time (ST) for a typical request. If we call this ratio WT/ST, for an N-processor system, you'll want to have approximately N*(1+WT/ST) threads to keep the processors fully utilized.
Processor utilization is not the only consideration in tuning the thread pool size. As the thread pool grows, you may encounter the limitations of the scheduler, available memory, or other system resources, such as the number of sockets, open file handles, or database connections.
So profile your application: if your threads are mostly CPU-bound, set the thread pool size to the number of cores, or the number of cores + 1. If you are spending most of your time waiting for database calls to complete, experiment with a fairly large number of threads and see how the application performs.
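As a worked example of the N*(1+WT/ST) rule quoted above (a sketch; the wait and service times are made-up numbers, so profile your own app for real ones):

public class PoolSizeEstimate {
    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors(); // say N = 4
        double waitTime = 50.0;    // ms per request blocked on I/O (assumed)
        double serviceTime = 10.0; // ms per request of actual CPU work (assumed)
        long size = Math.round(n * (1 + waitTime / serviceTime));
        System.out.println("suggested pool size: " + size); // 4 * (1 + 5) = 24
    }
}

With N = 4 and WT/ST = 5 this suggests roughly 24 threads; the same arithmetic applies to a connection pool that has to keep those threads fed.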
