What are Virtual Threads?
How to create Platform Threads vs. Virtual Threads vs. Carrier Threads
Using the classic Thread APIs, all of the following approaches return a platform thread, which is mapped 1:1 to an OS thread:
1) Using the java.lang.Thread class to create a thread
1.1)
Thread t = new Thread(() -> {
System.out.println("Running on: " + Thread.currentThread().getName());
});
t.start();
1.2) Extending the Thread class
public class MyThread extends Thread {
    @Override
    public void run() {
        System.out.println("Thread t1");
    }

    public static void main(String[] args) {
        MyThread t1 = new MyThread();
        t1.start();
        System.out.println("Main thread finished");
    }
}
2) Using ExecutorService
As creating platform threads is expensive, we cannot just create thousands of them. To get around this problem, Java provides thread pools, in which a defined pool of threads can be reused across multiple tasks.
2.1)
ExecutorService executorService = Executors.newFixedThreadPool(8);
executorService.submit(() -> System.out.println("Hi"));
Threads are created lazily by the thread pool, on demand: when a task is submitted, a thread is created to execute it. When the first task is submitted, the first thread is created. If the next task is submitted after the first task has completed, the same existing thread is reused. If the 2nd task is submitted before the 1st has completed, a new thread is created to execute it.
So if 8 tasks are submitted concurrently, 8 threads are created.
If 9 tasks are submitted concurrently, still only 8 threads are created, because the core and maximum pool size defined via Executors.newFixedThreadPool(8) is 8; the 9th task is put in the queue (a blocking queue is used by the fixed thread pool). Once one of the 8 threads is freed, it picks up the 9th task.
But what happens if multiple running threads become idle at the same time? Who picks the 9th task from the queue?
This is why the fixed thread pool uses a blocking queue. If it were an ordinary, non-thread-safe queue, the idle threads would be in a race condition: all of them might see the queue as non-empty and try to execute the same task, which could result in tasks being executed multiple times, exceptions, or corrupted state. With a blocking queue, removing a task is atomic, so each task is handed to exactly one thread, and a thread that finds the queue empty simply blocks until a task arrives.
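To make this concrete, here is a minimal sketch (not the pool's internals) showing that a thread-safe blocking queue hands each element to exactly one of several competing consumer threads; poll(), like take(), removes atomically:

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class BlockingQueueDemo {
    public static void main(String[] args) throws Exception {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 100; i++) queue.put(i);

        // Two "worker" threads compete for tasks; each poll() removes
        // atomically, so every task goes to exactly one thread.
        Set<Integer> consumed = ConcurrentHashMap.newKeySet();
        Runnable worker = () -> {
            Integer task;
            while ((task = queue.poll()) != null) {
                consumed.add(task);
            }
        };
        Thread t1 = new Thread(worker);
        Thread t2 = new Thread(worker);
        t1.start(); t2.start();
        t1.join(); t2.join();

        // All 100 tasks consumed, none duplicated, none lost.
        System.out.println("Consumed " + consumed.size() + " distinct tasks");
    }
}
```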
Fixed thread pool is ideal to use for:
- Long-running or blocking tasks (I/O tasks like reading from/writing to a database or files, or making HTTP calls, e.g., to a REST API).
- Bounded concurrency, when we don't want to overwhelm the CPU or memory. At any time, at most n threads will be in the pool, so 8 in the example above.
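The bounded-concurrency guarantee can be observed directly. A sketch, assuming a pool of 2 and 6 concurrent tasks; the observed concurrency never exceeds the pool size:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedPoolDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger maxObserved = new AtomicInteger();

        // Submit 6 tasks; only 2 can ever run at the same time,
        // the other 4 wait in the pool's blocking queue.
        for (int i = 0; i < 6; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                maxObserved.accumulateAndGet(now, Math::max);
                try { Thread.sleep(100); } catch (InterruptedException e) { }
                running.decrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("Max concurrent tasks: " + maxObserved.get());
    }
}
```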
2.2)
ExecutorService executorService = Executors.newSingleThreadExecutor();
executorService.submit(() -> System.out.println("Hi"));
Here too, the thread is created lazily by the thread pool when a task is submitted. However, only one thread is ever created. When more than one task is submitted, the remaining tasks are put in the blocking queue. Once the thread has completed the currently executing task, it picks the next task from the front of the queue (FIFO), so tasks are processed sequentially in the order they were submitted.
Single thread pool is ideal to use for:
- Sequential execution of tasks
- Decoupling submission from execution
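The FIFO ordering guarantee can be demonstrated with a short sketch: tasks submitted from the main thread complete in exactly their submission order, because a single worker drains the queue sequentially:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SingleThreadDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<Integer> order = new CopyOnWriteArrayList<>();

        // All 5 tasks run on the one worker thread, in submission order.
        for (int i = 0; i < 5; i++) {
            final int id = i;
            executor.submit(() -> order.add(id));
        }
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(order); // submission order preserved
    }
}
```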
2.3)
ExecutorService executorService = Executors.newCachedThreadPool();
executorService.submit(() -> System.out.println("Hi"));
Threads are created on demand as tasks are submitted. A cached thread pool uses a SynchronousQueue, a handoff queue in which each insert operation must wait for a corresponding remove operation by another thread.
When the 1st task is submitted, a new thread is created to execute it. When the 2nd task is submitted, if the first thread has become idle after finishing the first task, it executes the 2nd task; otherwise a new thread is created.
For n tasks submitted concurrently, n threads are created. Since these are platform threads mapped to OS threads, and OS threads are a limited resource, we cannot create an unbounded number of threads via the cached thread pool; therefore, threads that stay idle for more than 60 seconds are terminated.
No core pool size is maintained (it is 0), and the maximum pool size is theoretically Integer.MAX_VALUE.
Cached thread pool is ideal to use for:
- Short lived asynchronous tasks.
- Tasks coming intermittently or in burst.
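The "n concurrent tasks, n threads" behavior can be seen with a sketch that forces three tasks to be concurrent (each waits until all three are running), so the cached pool must spin up a distinct thread for each:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CachedPoolDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        int tasks = 3;
        CountDownLatch allRunning = new CountDownLatch(tasks);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();

        // Each task blocks until all 3 are running, so no thread can be
        // reused; the pool is forced to create one thread per task.
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                threadNames.add(Thread.currentThread().getName());
                allRunning.countDown();
                try { allRunning.await(); } catch (InterruptedException e) { }
            });
        }
        allRunning.await();
        pool.shutdown();
        System.out.println("Threads used: " + threadNames.size());
    }
}
```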
2.4)
ScheduledExecutorService scheduledExecutorService = Executors.newScheduledThreadPool(8);
ScheduledFuture<?> future = scheduledExecutorService.schedule(() -> System.out.println("Hi"), 1000, TimeUnit.MILLISECONDS);
Threads in a scheduled thread pool are created lazily on demand, similar to a fixed thread pool. When the first task is scheduled, it is added to the delay queue and the 1st thread is created to execute it; the task runs after the given delay. If a 2nd task is scheduled after the 1st completes, the same existing thread is reused. If a 2nd task is scheduled while the 1st is still running, a new thread is created (up to the core pool size) to execute it. A core pool of 8 threads is kept in the pool even when idle.
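A small sketch showing the delay in action; schedule() returns a ScheduledFuture, and the pool guarantees the task runs no earlier than the requested delay:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class ScheduledDemo {
    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        long start = System.nanoTime();

        // schedule() returns a ScheduledFuture; get() blocks until the task has run.
        ScheduledFuture<Long> future = scheduler.schedule(
                () -> (System.nanoTime() - start) / 1_000_000, // elapsed ms at execution time
                200, TimeUnit.MILLISECONDS);

        long elapsedMs = future.get();
        System.out.println("Task ran after ~" + elapsedMs + " ms");
        scheduler.shutdown();
    }
}
```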
3) Using CompletableFuture
CompletableFuture.supplyAsync(() -> dbCall(), executor);
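Note that supplyAsync takes a Supplier, so the work must be passed as a lambda or method reference, not as the already-computed result of dbCall(). A self-contained sketch, where dbCall is a hypothetical stand-in for a blocking database query:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletableFutureDemo {
    // Hypothetical stand-in for a blocking call such as a database query.
    static String dbCall() {
        return "row-42";
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);

        // The Supplier runs on one of the executor's threads, not the caller's.
        String result = CompletableFuture
                .supplyAsync(CompletableFutureDemo::dbCall, executor)
                .thenApply(row -> "processed:" + row)
                .join();

        System.out.println(result);
        executor.shutdown();
    }
}
```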
4) Using new Thread.Builder API
4.1
Thread.ofPlatform().start(() -> System.out.println("Platform Thread"));
4.2
ThreadFactory factory = Thread.ofPlatform().factory();
Thread t = factory.newThread(() -> System.out.println("task"));
t.start();
Virtual Threads:
And here is how we can create virtual threads:
1. Using Thread.Builder API
1.1
Thread.ofVirtual()
.name("virtual thread-", 0)
.start(() -> System.out.println("Task"));
1.2
Thread.ofVirtual()
.factory()
.newThread(() -> System.out.println("Task"))
.start();
1.3
Thread.startVirtualThread(() ->
System.out.println("Creating and starting Virtual thread"));
2. Using ExecutorService
2.1
ThreadFactory factory = Thread.ofVirtual().factory();
ExecutorService executor = Executors.newThreadPerTaskExecutor(factory);
2.2
try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
executor.submit(() -> System.out.println("task"));
}
3. Using Structured Concurrency
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
Subtask<String> subtask1 = scope.fork(() -> callServiceA());
Subtask<String> subtask2 = scope.fork(() -> callServiceB());
scope.join();
return subtask1.get() + subtask2.get();
}
How async APIs like CompletableFuture help with scaling, and the issues with them
Why Virtual Threads? What problem do they solve?
The traditional thread-per-request model has the following limitations:
- Each incoming HTTP request is executed by a platform thread backed by an OS thread. When the request reaches a blocking operation, such as writing to a database, it blocks that platform/OS thread. Since each request is handled by a single thread, the thread sits idle waiting for the blocking operation(s) to finish, so it cannot utilize the CPU, resulting in inefficient CPU usage.
Consider Tomcat: with its default thread pool size of 200, it can typically handle at most 200 concurrent requests, which means 200 platform threads. Any requests beyond 200 are either queued or rejected. And if these requests involve a lot of blocking operations, like database calls, writing to files, or making HTTP calls to other services, most of these threads will sit idle for quite some time waiting for the blocking operations to complete.
Also, from a memory perspective, for 200 requests at roughly 2 MB of stack per platform thread:
Memory used: 200 * 2 MB = 400 MB
That is not much. So we are not even hitting memory limits, but with this model the CPU is heavily underutilized. If we want to scale the application, we need to add more servers, which we call horizontal scaling, and horizontal scaling comes at the cost of buying more hardware.
To solve this, asynchronous programming constructs like CompletableFuture were introduced in Java 8. They ensure the thread handling an incoming request is freed up immediately to handle other requests, while the blocking operations are performed on separate thread(s). The problem with CompletableFuture, however, was (and is) that developers need to learn a large API, business logic gets buried in async plumbing, and the resulting code is harder to read and harder to debug.
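A hedged sketch of the readability problem; fetchUser and fetchOrders are hypothetical service calls. Two lines of plain blocking logic ("fetch a user, then fetch their orders") become a chain of combinators:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncChainDemo {
    // Hypothetical stand-ins for blocking service calls.
    static String fetchUser(int id) { return "user-" + id; }
    static String fetchOrders(String user) { return user + ":orders"; }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);

        // The business logic (call A, then call B) is spread across
        // supplyAsync / thenCompose / exceptionally lambdas.
        String result = CompletableFuture
                .supplyAsync(() -> fetchUser(1), executor)
                .thenCompose(user ->
                        CompletableFuture.supplyAsync(() -> fetchOrders(user), executor))
                .exceptionally(ex -> "fallback")
                .join();

        System.out.println(result);
        executor.shutdown();
    }
}
```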
Hence the Java designers, with Project Loom, introduced virtual threads.
Virtual threads make blocking cheap and they are also cheap to create.
Now, when Tomcat is configured to use a VirtualThreadPerTaskExecutor, the server assigns each incoming request a virtual thread instead of a platform thread. The JVM mounts this virtual thread onto a carrier thread.
When a blocking call happens, the JVM saves the stack of the virtual thread to the heap and unmounts it from the carrier thread. The carrier thread is now free, another virtual thread can be mounted onto it, and hence another request can be handled. This results in efficient use of the CPU and increased throughput, i.e., more requests handled per second.
So with virtual threads, if we want to scale, we can first scale vertically, since the same machine can handle more concurrent requests per second. Eventually we may still add server(s) horizontally as needed, for example to add more CPU cores, or for high availability, fault isolation, etc.
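The "cheap to create, cheap to block" claim can be sketched as below (requires Java 21+). Spawning 10,000 platform threads would strain the OS, but 10,000 virtual threads that all block in sleep complete comfortably, because blocking only parks the virtual thread and frees its carrier:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadScaleDemo {
    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();

        // One virtual thread per task; sleep() parks the virtual thread
        // and releases its carrier thread for other virtual threads.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    try { Thread.sleep(10); } catch (InterruptedException e) { }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish

        System.out.println("Completed: " + completed.get());
    }
}
```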




