[TIL-4] Little’s law, Tuning the thread pool size
A while back, the company I work for was preparing for an event expected to bring 10x our usual incoming traffic, and my Tech Lead assigned me to run stress tests against our application services.
We needed the tests to work out how far to scale out our services and to find where the bottlenecks were.
During the stress tests, we found that our application services had problems with the thread pool, so we decided to tune the thread pool size.
If we follow “measure, don’t assume”, we first need to look at the technology in question, ask which measurements make sense, and work out how to instrument our framework to collect them. We also need to bring some mathematics to the table.
I asked my Tech Lead whether there was a formula or rule of thumb for how many threads the pool needs. He pointed me to Little’s law, and we worked through the calculation together to find a suitable size for our thread pool.
Afterwards, I dug deeper into Little’s law. Here’s what I found from exploring sources on the internet.
Thread Pool
Before we go through Little’s law, it’s a good idea to answer the question “What is a thread pool?”
In web applications, the thread pool size determines the number of concurrent requests that can be handled at any given time.
If a web application receives more concurrent requests than the thread pool size, the excess requests are either queued or rejected.
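To make the queued-or-rejected behaviour concrete, here is a minimal sketch of the admission decision. The function name `admit` and the `queue_capacity` parameter are my own illustrative choices, not something taken from a specific web server; real servers expose this through their own settings.

```python
def admit(requests_in_system: int, pool_size: int, queue_capacity: int) -> str:
    """Decide what happens to a new request (hypothetical model).

    requests_in_system counts requests currently running or waiting.
    """
    if requests_in_system < pool_size:
        return "run"      # a thread is free, handle the request immediately
    if requests_in_system < pool_size + queue_capacity:
        return "queue"    # all threads are busy, but the queue has room
    return "reject"       # threads and queue are both full


# Assumed setup: 20 threads and room for 5 waiting requests.
print(admit(3, pool_size=20, queue_capacity=5))   # run
print(admit(22, pool_size=20, queue_capacity=5))  # queue
print(admit(25, pool_size=20, queue_capacity=5))  # reject
```

With a queue capacity of 0, every request beyond the pool size is rejected straight away.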
Little’s law
In queueing theory, a discipline within the mathematical theory of probability, Little’s result, theorem, lemma, law, or formula is a theorem by John Little which states that the long-term average number L of customers in a stationary system is equal to the long-term average effective arrival rate λ multiplied by the average time W that a customer spends in the system.
Expressed algebraically, the law is
L = λW
In the context of web applications, we could say:
The average number of busy threads in the system (Threads) is equal to the average request arrival rate (Requests per sec) multiplied by the average response time (Response time).
Threads = Number of threads
Requests per sec = Number of web requests arriving per second (in steady state, this is also the number the system must process per second)
Response time = Time taken to process one web request, in seconds
Threads = Requests per sec x Response time
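The formula is easy to wrap in a small helper. This is only a sketch: the function name `threads_needed` is hypothetical, the figures passed to it are made-up illustrations rather than measurements from our services, and I take the response time in milliseconds so the inputs can stay whole numbers. The result is rounded up because you can't run a fraction of a thread.

```python
import math


def threads_needed(requests_per_sec: int, response_time_ms: int) -> int:
    """Little's law: Threads = Requests per sec x Response time (in seconds)."""
    average_busy_threads = requests_per_sec * response_time_ms / 1000
    return math.ceil(average_busy_threads)  # round up to a whole thread


# Illustrative figures only.
print(threads_needed(50, 300))  # 50 * 0.3 = 15 threads
print(threads_needed(7, 250))   # 7 * 0.25 = 1.75, rounded up to 2 threads
```

Keep in mind that the law works with averages, so the result is a starting point for stress testing rather than a final answer.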
Example
Requests arrive at a worker at a rate of 10 requests/second. Each request takes 2 seconds to finish. How many threads are required?
λ = 10 rps, W = 2 seconds
L = 10 * 2
L = 20
Therefore, in the case above we need 20 threads to keep up with the arrival rate.
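One way to convince yourself of that result is to count in-flight requests directly. The sketch below is a deliberately simplified model of my own: arrivals are perfectly evenly spaced, every request takes exactly 2 seconds, and there are always enough threads so nothing waits. The function name `in_flight_at` is hypothetical.

```python
def in_flight_at(probe_ms: int, arrival_gap_ms: int, service_ms: int,
                 total_requests: int) -> int:
    """Count requests that have arrived but not yet finished at probe_ms."""
    count = 0
    for i in range(total_requests):
        start = i * arrival_gap_ms   # request i arrives here
        end = start + service_ms     # and finishes here
        if start <= probe_ms < end:
            count += 1
    return count


# 10 requests/second means one arrival every 100 ms; each takes 2000 ms.
steady_state = in_flight_at(50_050, arrival_gap_ms=100, service_ms=2_000,
                            total_requests=1_000)
warm_up = in_flight_at(550, arrival_gap_ms=100, service_ms=2_000,
                       total_requests=1_000)

print(steady_state)  # 20, matching L = 10 * 2
print(warm_up)       # 6, the system has not reached steady state yet
```

The warm-up figure is a reminder that Little’s law describes long-term averages in a stable system, not the first moments after traffic starts.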