Threads, @Async, @Transactional, and Virtual Threads: What Actually Happens Inside a Spring Boot Backend
A webhook fires. One HTTP request comes in.
Ten seconds later, half the app is returning 503s.
The bug is not in the webhook.
This is the article I wish someone had made me read years ago. It’s about four concepts that almost every Spring Boot developer has used — @Async, @Transactional, thread pools, and the connection pool — without quite understanding how they interact. And it’s about a fifth concept, virtual threads, that arrived in Java 21 and quietly changed some of the trade-offs without changing others.
Ten questions, in the order any working developer should ask them. Each one builds on the last. By the end, the “why is my app pretending to be down when only one endpoint is slow” feeling should have a name.
1. The four building blocks, briefly
Before anything else, four things that get conflated all the time:
- Thread — a single worker that runs your code, one line at a time. When an HTTP request lands, one thread from the server’s pool picks it up and runs it top to bottom.
- Thread pool — a bounded set of workers. Tasks are submitted; the pool assigns them to free workers, queues them if none are free, and (depending on configuration) rejects or spawns more if the queue also fills.
- Connection pool — a bounded set of database connections. In a Spring Boot 3.x app, this is HikariCP by default, sized at 10 connections unless you override it. To do anything against the DB, code has to borrow one connection from this pool, use it, and return it.
- Transaction — a “do all of this, or none of it” DB session. In JDBC terms, a transaction is a state on a specific connection: you call `setAutoCommit(false)`, you run statements, then you call `commit()` or `rollback()`. Holding an open transaction means holding a connection borrowed from the pool.
That last sentence is the whole reason this article exists. Transactions and connections are linked at the hip. Every time you have an open transaction, one of your ten connections is checked out. Everything downstream is a consequence of that fact.
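To make the link between a transaction and a connection concrete, here is a minimal plain-JDBC sketch with no Spring involved. The class name `JdbcTransactionSketch`, the `SqlWork` interface, and the `inTransaction` helper are illustrative names, not part of any library; only `DataSource` and `Connection` are real JDK types.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

final class JdbcTransactionSketch {

    /** Work to run on the borrowed connection (hypothetical helper type). */
    @FunctionalInterface
    interface SqlWork<T> {
        T run(Connection connection) throws SQLException;
    }

    /** Runs the work in one transaction, on one connection borrowed from the pool. */
    static <T> T inTransaction(DataSource pool, SqlWork<T> work) throws SQLException {
        try (Connection connection = pool.getConnection()) { // borrow: one fewer free connection
            connection.setAutoCommit(false);                 // the transaction lives on THIS connection
            try {
                T result = work.run(connection);             // every statement must use this connection
                connection.commit();
                return result;
            } catch (SQLException | RuntimeException e) {
                connection.rollback();
                throw e;
            }
        } // close() hands the connection back to the pool
    }
}
```

The connection is checked out for the whole of `work.run(...)`, whatever that work does. That is the property the rest of the article keeps coming back to.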
2. What does @Async actually do?
You annotate a method with @Async. You call it. It returns immediately. Some other thread runs the body.
Underneath, this is done by an AOP proxy. Spring’s default advice mode for @Async is proxy-based (confirmed in the current Spring Framework reference docs), which means Spring wraps your bean in a subclass or interface implementation. When you call the @Async method, you don’t call your method directly — you call the proxy’s version, which does something like this:
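```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Simplified: Spring's actual proxy is more sophisticated
public class MyBean$Proxy extends MyBean {

    private final Executor executor;

    MyBean$Proxy(Executor executor) {
        this.executor = executor;
    }

    @Override
    public CompletableFuture<String> sendEmail(String to) {
        return CompletableFuture
                .supplyAsync(() -> super.sendEmail(to), executor) // real method, on someone else's thread
                .thenCompose(future -> future);                   // flatten the nested future
    }
}
```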
The caller thread returns as soon as the task is submitted. The body runs on the executor’s thread, whenever the executor is free to schedule it.
Two consequences worth naming immediately:
- `@Async` never works when called from inside the same class. The proxy only intercepts external calls. `this.sendEmail(...)` bypasses the proxy and runs synchronously. If you’ve ever wondered why your `@Async` method silently blocked the caller, this is almost always the reason.
- `@Async` is not the only way to run something on another thread. You can inject an `ExecutorService` and submit tasks directly. You can push a message onto a queue (RabbitMQ, SQS, Kafka) and let a consumer pick it up. You can wire a `TaskExecutor` into a Spring Batch `JobLauncher` (which is a common pattern for triggering long-running jobs off webhook threads). `@Async` is the annotation-flavoured, Spring-managed version of “throw it on another thread.” The others are less magical and, arguably, easier to reason about.
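The self-call trap can be reproduced without Spring. The sketch below uses a JDK dynamic proxy as a stand-in for Spring's proxy; `Mailer`, `RealMailer`, and `SelfCallDemo` are hypothetical names, and the handler treats `send` as if it were the `@Async` method. It illustrates the mechanism only and is not how Spring implements it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

interface Mailer {
    String send();            // stands in for the @Async method
    String sendViaSelfCall(); // ordinary method that calls this.send()
}

final class RealMailer implements Mailer {
    @Override public String send() { return Thread.currentThread().getName(); }
    @Override public String sendViaSelfCall() { return this.send(); } // never touches the proxy
}

final class SelfCallDemo {

    /** Returns the thread names that ran {external call, self-call}. */
    static String[] run() {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> new Thread(r, "async-worker"));
        try {
            Mailer target = new RealMailer();
            InvocationHandler handler = (proxy, method, args) -> {
                if (method.getName().equals("send")) {
                    // Intercepted: run the real method on the worker thread.
                    // (We wait for the result here only so the demo can report it.)
                    Callable<Object> call = () -> method.invoke(target, args);
                    return worker.submit(call).get();
                }
                return method.invoke(target, args); // not intercepted: caller's thread
            };
            Mailer mailer = (Mailer) Proxy.newProxyInstance(
                    Mailer.class.getClassLoader(), new Class<?>[] { Mailer.class }, handler);
            return new String[] { mailer.send(), mailer.sendViaSelfCall() };
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] threads = run();
        System.out.println("external call ran on: " + threads[0]); // async-worker
        System.out.println("self-call ran on:     " + threads[1]); // the caller's own thread
    }
}
```

The second call reaches `send()` through `this`, so the handler never sees it and the body runs on the caller's thread.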
3. Which executor does @Async use by default?
This one catches almost everyone. The precise answer depends on what you’re running:
- In plain Spring Framework (no Spring Boot), the fallback for `@Async` is `SimpleAsyncTaskExecutor` — it creates a brand new thread for every task, runs the task, then discards the thread. No pool. No cap. No reuse.
- In Spring Boot (which is what most of us are actually using), `TaskExecutionAutoConfiguration` provides a bean named `applicationTaskExecutor` — a `ThreadPoolTaskExecutor` with sensible-sounding but load-hostile defaults: core pool size 8, unbounded max pool size, unbounded queue capacity. `@Async` uses this one by default.
- In Spring Boot with `spring.threads.virtual.enabled=true` (available from Spring Boot 3.2 onwards), the `applicationTaskExecutor` switches to a `SimpleAsyncTaskExecutor` backed by virtual threads — one virtual thread per task, but since virtual threads are cheap (see section 10), that’s a very different trade-off than the platform-thread case.
The failure mode is the same across all three defaults, and so is the practical rule: none of them applies back-pressure. Spring Framework’s SimpleAsyncTaskExecutor never rejects because it never queues — it just spawns more platform threads. Spring Boot’s autoconfigured ThreadPoolTaskExecutor never rejects because its queue is unbounded — it just piles up tasks. The virtual-thread variant never rejects either; it starts another virtual thread per task. All three leave you with the same problem: under load, everything gets slower and slower until the JVM tips over.
That’s fine for a demo, or for occasional low-volume async work. It is not fine for a production backend receiving hundreds of concurrent requests. Two concrete failure modes:
- Thread exhaustion (platform threads only — not a concern with virtual threads). Each platform thread costs ~1 MB of stack space. A burst of 5,000 requests to your `@Async`’d method spawns thousands of threads that fight every other pool in the JVM for CPU scheduling and memory.
- No back-pressure (all default configurations). Because the queue is either non-existent or unbounded, tasks are never rejected. When the system can’t keep up, memory pressure grows, GC pressure grows, latency climbs, and eventually something snaps.
The practical rule: always define an explicit executor for @Async. A ThreadPoolTaskExecutor bean with a core size, max size, and a bounded queue. Something like:
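```java
@Bean(name = "emailNotificationExecutor")
public Executor emailNotificationExecutor() {
    var pool = new ThreadPoolTaskExecutor();
    pool.setCorePoolSize(2);                   // threads kept alive for this workload
    pool.setMaxPoolSize(4);                    // extra threads are added only once the queue is full
    pool.setQueueCapacity(250);                // bounded: beyond this, submissions are rejected
    pool.setThreadNamePrefix("email-notify-"); // shows up in thread dumps and logs
    pool.initialize();
    return pool;
}
```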
Then @Async("emailNotificationExecutor") on the method. Now you know exactly how many threads can be running that workload at once, and you know what happens when the queue fills (the pool rejects, and you can decide what to do about it).
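What “the pool rejects” looks like can be seen with the JDK's own `ThreadPoolExecutor`, which is what Spring's `ThreadPoolTaskExecutor` wraps. The sketch below uses deliberately tiny limits (1 core thread, 2 max, a queue of 2) so the behaviour is easy to count. The class name `BoundedPoolDemo` is made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPoolDemo {

    /** Submits taskCount blocking tasks and returns {accepted, rejected}. */
    static int[] submitBurst(int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2));
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        int rejected = 0;
        try {
            for (int i = 0; i < taskCount; i++) {
                try {
                    pool.execute(() -> awaitQuietly(release)); // each task blocks until released
                    accepted++;
                } catch (RejectedExecutionException e) {
                    rejected++; // default AbortPolicy: the submitter finds out immediately
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return new int[] { accepted, rejected };
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] result = submitBurst(6);
        System.out.println("accepted=" + result[0] + ", rejected=" + result[1]); // accepted=4, rejected=2
    }
}
```

With six blocking tasks, one occupies the core thread, two sit in the queue, one gets the extra (max) thread, and the last two are rejected. Note the order: the queue fills before any thread beyond the core size is created. In Spring the same rejection surfaces as a `TaskRejectedException`.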
Rule of thumb: one named executor per workload. Not because you can’t share, but because sharing means two unrelated workloads can starve each other for threads under load. Naming them means you can reason about “how many of X are running” independently of every other workload.
Spring Boot 3.2+ also offers global virtual threads for @Async via spring.threads.virtual.enabled=true. That changes the picture a lot — see section 10.
4. What does @Transactional actually do?
Also an AOP proxy. The pattern is exactly the same shape as @Async — Spring wraps your bean in a proxy that intercepts calls to @Transactional methods and wraps them in transaction lifecycle code:
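```java
// Simplified: the real transaction interceptor is more sophisticated
public class MyBean$Proxy extends MyBean {

    private final PlatformTransactionManager txManager;

    MyBean$Proxy(PlatformTransactionManager txManager) {
        this.txManager = txManager;
    }

    @Override
    public User createUser(String email) {
        // Default definition: REQUIRED propagation, default isolation
        TransactionStatus tx = txManager.getTransaction(new DefaultTransactionDefinition());
        try {
            User result = super.createUser(email); // real method
            txManager.commit(tx);
            return result;
        } catch (RuntimeException | Error e) {
            txManager.rollback(tx); // by default only unchecked exceptions trigger rollback
            throw e;
        }
    }
}
```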
Same “proxy calls the real method” mechanism as @Async. Same “self-calls don’t work” consequence — this.createUser(...) from within the same class skips the proxy, so the annotation is ignored and the method runs in whatever transaction context (if any) the caller already had.
The interesting part is what txManager.getTransaction(...) actually does. In a typical Spring configuration with DataSourceTransactionManager (plain JDBC; JpaTransactionManager behaves similarly for JPA), it:
- Checks whether there’s already a transaction open (see section 5 for how it checks).
- If not, borrows a connection from the connection pool.
- Calls `connection.setAutoCommit(false)` on it.
- Stores that connection in a thread-local so the rest of the code in this call chain can find it.
That last step is the piece that ties everything else in this article together.
5. The thread-local at the heart of everything
Spring stores the “current transaction” — really, the connection with an open transaction on it — in a class called TransactionSynchronizationManager, which internally uses ThreadLocal variables. When your code (or Hibernate, or Spring Data JPA) needs “the current connection to run this query on,” it asks TransactionSynchronizationManager.getResource(...), which reaches into the current thread’s ThreadLocal and returns whatever’s there.
The consequence: the transaction “belongs to” whichever thread opened it, and no other thread can see it.
Not “can’t see it because of security” — literally can’t see it, because the ThreadLocal is scoped to a single thread. Every thread has its own copy of the ThreadLocal storage. Thread A’s transaction is invisible to thread B, even if they’re in the same process, even if they’re accessing the same bean.
This is the single most important sentence in this whole article. Everything downstream is a consequence of it.
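The claim is easy to check with a bare `ThreadLocal`. In this sketch, `CURRENT_TX` is a hypothetical stand-in for the slot where Spring keeps the bound connection. It is not Spring's actual field.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalDemo {

    // Stand-in for the per-thread slot that holds "the current transaction".
    private static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    /** Returns what {the calling thread, a worker thread} each read from the same ThreadLocal. */
    static String[] whoSeesTheTransaction() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        CURRENT_TX.set("tx-on-connection-7"); // "open a transaction" on the calling thread
        try {
            String seenByCaller = CURRENT_TX.get();
            Callable<String> read = CURRENT_TX::get;
            String seenByWorker = worker.submit(read).get(); // same variable, different thread
            return new String[] { seenByCaller, seenByWorker };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            CURRENT_TX.remove();
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = whoSeesTheTransaction();
        System.out.println("caller sees: " + seen[0]); // tx-on-connection-7
        System.out.println("worker sees: " + seen[1]); // null
    }
}
```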
6. What happens when you cross a thread boundary while in a transaction?
Simple answer: nothing follows.
If a @Transactional method calls emailNotificationExecutor.submit(() -> saveAuditLog(...)), the code inside that lambda runs on a different thread. That different thread has its own empty ThreadLocal. It has no idea a transaction is open. Any DB write it does either:
- Runs with auto-commit — each `INSERT` commits immediately, ignoring the outer transaction’s intent.
- Opens its own new transaction, on its own borrowed connection, committing independently.
Neither one participates in the outer transaction. If the outer transaction rolls back, the async work is not undone. If the outer transaction commits, the async work committed at whatever time it chose.
This is the same trap @Async sets. The moment your method hits an @Async boundary, the annotation you put on the caller stops applying inside the method. Two threads, two ThreadLocals, two independent transaction stories.
There’s a related trap: @Transactional on an @Async method. That combination is usually fine — Spring’s proxy chain applies both — but only for external callers of the async method. Inside the async method’s body, if it fires an event or submits another task to yet another executor, the transaction stops there again.
Practical mental model: every time a task hops threads, treat the transaction context as if it evaporates. If you need it, you have to open a fresh transaction on the new thread. Spring will never do this for you across a thread boundary.
7. The seven propagation levels — quick reference
Since a lot of this article is about “what if I’m already in a transaction when this method is called,” it’s worth naming all seven of the propagation options @Transactional(propagation = ...) supports. Most of the time you never touch this — REQUIRED (the default) is what you want. The others exist for the specific moments when it isn’t.
| Propagation | If a transaction is already open | If none is open |
|---|---|---|
| `REQUIRED` (default) | Join it | Start a new one |
| `REQUIRES_NEW` | Suspend it, run this method in a brand-new independent transaction, resume the old one when done | Start a new one |
| `NOT_SUPPORTED` | Suspend it, run this method with no transaction, resume the old one when done | Run with no transaction |
| `SUPPORTS` | Join it | Run with no transaction |
| `MANDATORY` | Join it | Throw `IllegalTransactionStateException` |
| `NEVER` | Throw | Run with no transaction |
| `NESTED` | Create a savepoint inside the current transaction; rollback here rolls back to the savepoint | Start a new one |
Two of these come up often in real applications:
- `REQUIRES_NEW` — used when you need an inner operation to commit independently of the outer, even if the outer rolls back. Classic case: writing an audit log for a failed transaction. If the outer rolls back and you’d joined its transaction, the audit log would roll back too, losing exactly the information you wanted to keep. `REQUIRES_NEW` gives the audit log its own transaction, on its own connection, that commits separately.
- `NOT_SUPPORTED` — used when you deliberately want to not be in a transaction. The most common reason is that the method inside is going to launch something that manages its own transactions (Spring Batch is the archetypal case), and you don’t want an outer transaction holding a connection hostage for the entire duration of that inner work.
The Hibernate teaching post on this site — “save, saveAndFlush, and REQUIRES_NEW” — goes deeper on REQUIRES_NEW specifically. This section is meant as a lookup table, not the full mental model.
One important detail: propagation changes require the same AOP proxy dance as everything else. Self-calls don’t apply propagation. If method A on a bean calls method B on the same bean, method B’s `@Transactional(propagation = Propagation.REQUIRES_NEW)` is ignored, because the call didn’t go through the proxy.
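To make the `REQUIRED` and `REQUIRES_NEW` rows concrete, here is a toy model of join versus suspend-and-resume. It is a sketch under loud assumptions: `PropagationToy` is invented for illustration, the “database” is an in-memory list, and a real `REQUIRES_NEW` also borrows a second connection, which this model does not show.

```java
import java.util.ArrayList;
import java.util.List;

final class PropagationToy {

    final List<String> committed = new ArrayList<>();                      // rows that reached the "database"
    private final ThreadLocal<List<String>> current = new ThreadLocal<>(); // pending rows of the open transaction

    void write(String row) {
        List<String> pending = current.get();
        if (pending != null) {
            pending.add(row);   // inside a transaction: visible only after commit
        } else {
            committed.add(row); // no transaction: auto-commit
        }
    }

    void required(Runnable body) {
        if (current.get() != null) {
            body.run();                // join the transaction that is already open
        } else {
            runInNewTransaction(body); // none open: start one
        }
    }

    void requiresNew(Runnable body) {
        List<String> suspended = current.get();
        current.remove();              // suspend the outer transaction, if any
        try {
            runInNewTransaction(body); // commits or rolls back on its own
        } finally {
            if (suspended != null) {
                current.set(suspended); // resume the outer transaction
            }
        }
    }

    private void runInNewTransaction(Runnable body) {
        List<String> pending = new ArrayList<>();
        current.set(pending);
        try {
            body.run();
            committed.addAll(pending); // commit
        } finally {
            current.remove();          // on an exception nothing was added: rollback
        }
    }

    static List<String> auditSurvivesRollback() {
        PropagationToy db = new PropagationToy();
        try {
            db.required(() -> {
                db.write("order");
                db.requiresNew(() -> db.write("audit: order attempted"));
                db.required(() -> db.write("order line")); // joins the outer transaction
                throw new IllegalStateException("payment failed");
            });
        } catch (IllegalStateException expected) {
            // the outer transaction rolled back
        }
        return db.committed; // only the audit row survives
    }
}
```

Calling `PropagationToy.auditSurvivesRollback()` returns a list containing only the audit row. The order and the order line joined the outer transaction and were discarded with it.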
8. Connection pools and the “held across slow I/O” antipattern
This is the part where all the previous sections converge into a specific production disaster.
Consider a method like this, which reads reasonably at first:
```java
@Transactional
public void processInvoice(Long invoiceId) {
    var invoice = invoiceRepo.findById(invoiceId).orElseThrow();
    var pdf = pdfDownloader.downloadFrom(invoice.getSourceUrl()); // 3 seconds
    var ocrResult = ocrClient.extractText(pdf);                   // 2 seconds
    invoice.setExtractedText(ocrResult.text());
    invoiceRepo.save(invoice);
    auditService.recordProcessed(invoice);
}
```
The @Transactional opens a transaction at method entry and commits at method exit. That means the connection is borrowed at the top and returned at the bottom.
Now count how long the connection is actually held: the entire method runtime — about 5 seconds. Now count how long that connection is actually being used for DB work: one read, one update, one audit insert — maybe 50 milliseconds all told. The remaining ~4,950 milliseconds are the connection sitting there completely idle, waiting for the PDF to download and the OCR to finish.
This is what “held across slow I/O” means. Your app has ten connections. If four invoices come in at the same moment, four connections are held for five seconds each doing nothing DB-related. Every other endpoint in the entire app — the login endpoint, the health check, the admin dashboard — now competes for six connections instead of ten. Add six more concurrent invoice uploads and you’re at zero connections free for anything else. Every other request queues up waiting for a connection. The load balancer starts timing out at 30 seconds and returns 503s.
From the outside, the whole app looks down. From the inside, nothing crashed — one endpoint just happens to hold all ten phone lines for five seconds each, and everyone else is waiting on the phone.
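The arithmetic behind this is worth writing down. A connection that is held for `h` seconds per transaction can serve at most `1/h` transactions per second, so the pool-wide ceiling is the pool size divided by the hold time. A minimal sketch follows; `PoolThroughput` is an illustrative name, and the figures are the ones used in this section.

```java
final class PoolThroughput {

    /** Upper bound on transactions per second for a pool whose connections are each held holdMillis. */
    static double maxPerSecond(int poolSize, long holdMillis) {
        return poolSize * 1000.0 / holdMillis;
    }

    public static void main(String[] args) {
        System.out.println(maxPerSecond(10, 5_000)); // 2.0   (connection held across the slow I/O)
        System.out.println(maxPerSecond(10, 50));    // 200.0 (connection held only for the DB work)
    }
}
```

With ten connections held for five seconds each, the whole application tops out at two invoice-style transactions per second. Holding each connection for 50 ms raises the ceiling a hundredfold without adding a single connection.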
The three fixes that address different parts of this:
- Fix 1: Move the slow I/O outside the transaction. Do the download and OCR first, on a fresh thread with no transaction. Then open a short `@Transactional` block only for the DB writes. Now the connection is held for ~50ms per invoice instead of ~5s.
- Fix 2: Move the whole thing off the request thread. Even before you shorten the DB window, if the request thread is the one running this five-second job, that request thread is unavailable to answer any other HTTP request. Use `@Async` (with an explicit bounded executor) so the request thread returns immediately.
- Fix 3: Cap concurrency with a bounded executor. If a burst of 500 invoices lands at once, and each one is going to hold a connection for five seconds, you don’t want 500 of them running simultaneously — you’d need 500 connections. A bounded executor with, say, three workers and a queue means at most three run at once. The other 497 sit in the queue holding zero connections until it’s their turn.
None of these are alternatives — they’re layered. Fix 3 caps how many concurrent jobs can try to grab connections. Fix 2 stops the request thread from being one of the workers. Fix 1 shortens how long each running worker actually holds a connection.
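The effect of the first fix (slow I/O outside the transaction) can be shown with a toy pool that only counts checked-out connections. Everything in this sketch is hypothetical scaffolding: `CountingPool`, `slowIo`, and the example URL stand in for the real pool, the download plus OCR, and the invoice's source URL.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class HoldWindowDemo {

    /** Toy pool (hypothetical): it only counts how many connections are checked out. */
    static final class CountingPool {
        private final AtomicInteger checkedOut = new AtomicInteger();

        <T> T inTransaction(Supplier<T> dbWork) {
            checkedOut.incrementAndGet();     // borrow when the transaction opens
            try {
                return dbWork.get();
            } finally {
                checkedOut.decrementAndGet(); // give back on commit or rollback
            }
        }

        int checkedOut() {
            return checkedOut.get();
        }
    }

    /** Stand-in for download + OCR; returns how many connections are held while it runs. */
    static int slowIo(CountingPool pool, String sourceUrl) {
        return pool.checkedOut();
    }

    /** Original shape: the slow I/O sits inside the transaction. */
    static int ioInsideTransaction(CountingPool pool) {
        return pool.inTransaction(() -> {
            String url = "https://example.invalid/invoice.pdf"; // pretend findById(...) returned this
            int heldDuringIo = slowIo(pool, url);
            // save(...) and the audit insert would run here
            return heldDuringIo;
        });
    }

    /** Restructured: short read, then I/O with nothing held, then a short write. */
    static int ioOutsideTransaction(CountingPool pool) {
        String url = pool.inTransaction(() -> "https://example.invalid/invoice.pdf"); // short read
        int heldDuringIo = slowIo(pool, url);                                          // no connection held
        pool.inTransaction(() -> "saved");                                             // short write
        return heldDuringIo;
    }

    public static void main(String[] args) {
        CountingPool pool = new CountingPool();
        System.out.println("held during I/O, inside:  " + ioInsideTransaction(pool));  // 1
        System.out.println("held during I/O, outside: " + ioOutsideTransaction(pool)); // 0
    }
}
```

One trade-off to keep in mind: the restructured version uses two short transactions, so the read and the write are no longer atomic with each other. If the row can change in between, the write step has to re-check whatever it depends on.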
One subtlety worth naming when you do Fix 2 (moving work off the request thread): you have to decide where the boundary falls between “commit to doing this work” and “actually do the work.” If your webhook accepts a request, returns 200 immediately, and only then submits the actual work to an executor, you’ve introduced a silent-loss risk. If that submission fails (the queue is full and the rejection policy fires, a transient DB blip hits while writing whatever “we accepted this” record you keep, or the process is killed), the caller has already received its 200 and won’t retry. In webhook systems that rely on the sender’s redelivery (Google Pub/Sub, most vendor webhooks, most message queues), a 200 response is a contract that says “we’ve persisted the fact that we owe you this work.” Breaking that contract means dropped events.
The correct pattern is to split the two phases explicitly:
- On the request thread, synchronously: do the bookkeeping that records “we’ve committed to this work.” Usually one small DB write — enough that if the process crashes right now, you can recover. Return 5xx if this step fails, so the sender retries.
- On the pool thread, asynchronously: do the actual slow work, guided by the record you just persisted.
Spring Batch’s TaskExecutorJobLauncher illustrates most of this pattern in Spring’s own libraries — with one subtlety that’s worth naming, because it teaches a deeper lesson about async recovery than the “clean” version does.
Its run() performs the launch bookkeeping (dedup check plus writing a JobExecution row that officially records “this job has started”) synchronously on the caller’s thread. Pre-execution validation errors (JobExecutionAlreadyRunningException, JobInstanceAlreadyCompleteException, JobRestartException) and DB errors during that bookkeeping propagate as exceptions. The caller sees them, can return 5xx, and the sender retries. That’s the “sync bookkeeping means sync error surface” half.
The subtlety: if the executor itself rejects the task (queue full, thread pool exhausted), TaskExecutorJobLauncher catches that internally, updates the JobExecution row to FAILED, and returns normally. The caller sees no exception. Overflow doesn’t cause the sender to retry — it produces a persisted FAILED row, and recovery moves from “the sender retries” to “our own scheduled sweep retries later.” Spring Batch users typically pair this with a scheduler polling FAILED job executions on some interval.
That difference is the more general lesson worth internalising. Each failure mode in your sync/async split needs its own recovery mechanism, and you have to be explicit about which is which. Bookkeeping failures on the request thread → the sender retries via HTTP redelivery. Executor overflow after bookkeeping succeeded → some scheduled sweep retries via the persisted FAILED state. In-flight execution failures inside the async work → whatever retry your job layer provides internally. There’s no single async pattern that hands you a “no lost events, always retried” guarantee for free — you have to design a recovery answer for each of the three, and if any one of them has no answer, that’s your silent-loss failure mode.
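The three recovery paths can be sketched in a few lines of plain Java. This is an illustration of the pattern, not Spring Batch's code: `AcceptThenDispatch`, its `Status` enum, and the in-memory `ledger` (standing in for a durable table) are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

final class AcceptThenDispatch {

    enum Status { ACCEPTED, DONE, FAILED }

    // Stand-in for a durable table; a real system would write to the database here.
    private final Map<String, Status> ledger = new ConcurrentHashMap<>();
    private final Executor workers;

    AcceptThenDispatch(Executor workers) {
        this.workers = workers;
    }

    /**
     * Phase 1 runs on the caller's thread. If the ledger write throws, the exception
     * propagates, and the HTTP layer can answer 5xx so the sender redelivers.
     */
    void accept(String eventId, Runnable slowWork) {
        ledger.put(eventId, Status.ACCEPTED);           // persist the promise first
        try {
            workers.execute(() -> {                     // phase 2: the actual work
                try {
                    slowWork.run();
                    ledger.put(eventId, Status.DONE);
                } catch (RuntimeException e) {
                    ledger.put(eventId, Status.FAILED); // in-flight failure: the job layer's retry
                }
            });
        } catch (RejectedExecutionException overflow) {
            ledger.put(eventId, Status.FAILED);         // overflow: left for a scheduled sweep
        }
    }

    Status statusOf(String eventId) {
        return ledger.get(eventId);
    }
}
```

In the overflow case `accept` returns normally, just as the launcher does: the caller sees success, and the only trace of the problem is the persisted `FAILED` row that a sweep has to pick up.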
Every DB-touching method in a real backend has to answer three orthogonal questions: which pool of threads am I on, am I inside a transaction (which means: am I holding a connection right now), and how long is that connection actually held vs how much of that time is real DB work. Most production concurrency bugs come from a change that silently answers one of those three wrong. And every “move to async” refactor has to answer a fourth: at what point have we persisted the promise, and is the response we return to the caller earlier or later than that point?
9. One workload per pool
A related discipline that saves a lot of debugging: define one named executor per workload, and never share them.
The reason isn’t ideological. It’s operational. Consider what happens when two workloads share a pool:
- Workload A: email notifications. Fires ~10/minute, each takes 200ms.
- Workload B: nightly report generation. Fires 3 times a day, each takes 45 seconds.
If both use the same 4-thread pool, then for as long as B is running (45 seconds or more per report), A has 4 minus (however many B is using) threads available. If B is using all four (which it likely is — it’s CPU-bound work with no I/O breaks), A queues up. Emails get delayed. Someone eventually calls in wondering why the password reset email took eight minutes.
Separate pools mean A and B can never starve each other. With a four-thread pool for emails and a two-thread pool for reports, they operate independently. Under a burst on either, only that workload’s users are affected.
The specific naming — emailNotificationExecutor, reportGenerationExecutor — also makes stack traces and thread dumps far easier to interpret. When you see a thread named email-notify-3, you know instantly what work is happening on it.
The gotcha to name explicitly: defining a bean doesn’t wire it to anything. The emailNotificationExecutor bean runs work only because some @Async("emailNotificationExecutor") annotation, somewhere, points at it. If you define the bean and nothing references it, the pool just exists with zero utilisation. Naming pools and pointing workloads at them are two separate acts.
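The starvation scenario from earlier in this section can be reproduced with plain `java.util.concurrent` executors. The sketch shrinks each pool to a single thread so the effect is unambiguous; `PoolIsolationDemo` and the 500 ms wait are choices made for the demo, not recommendations.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class PoolIsolationDemo {

    /** Returns true if a quick "email" task finishes while a long "report" task is still running. */
    static boolean emailFinishesDuringReport(boolean sharedPool) {
        ExecutorService reports = Executors.newFixedThreadPool(1);
        ExecutorService emails = sharedPool ? reports : Executors.newFixedThreadPool(1);
        CountDownLatch reportStarted = new CountDownLatch(1);
        CountDownLatch reportMayFinish = new CountDownLatch(1);
        try {
            reports.execute(() -> {
                reportStarted.countDown();
                try {
                    reportMayFinish.await(); // the long-running report
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            reportStarted.await();
            Future<?> email = emails.submit(() -> { /* send the email */ });
            try {
                email.get(500, TimeUnit.MILLISECONDS);
                return true;  // the email ran alongside the report
            } catch (TimeoutException starved) {
                return false; // the email is stuck in the queue behind the report
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            reportMayFinish.countDown();
            reports.shutdown();
            emails.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool:    " + emailFinishesDuringReport(true));  // false
        System.out.println("separate pools: " + emailFinishesDuringReport(false)); // true
    }
}
```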
10. What virtual threads change (and don’t)
Java 21 shipped virtual threads — an implementation of threads that are cheap to create, cheap to block, and managed by the JDK rather than the OS. A single JVM can happily run millions of virtual threads. Under the hood, virtual threads run on a small pool of carrier OS threads (typically a ForkJoinPool), and when a virtual thread blocks on I/O, the JDK unmounts it from its carrier, lets the carrier pick up another virtual thread, and remounts the blocked one when its I/O completes.
The pitch, as sold, is: “stop worrying about thread pool sizing — every task gets its own thread, the JVM figures out how to schedule them, and blocking on I/O is now free.”
For code that spends most of its time blocked on I/O (webhooks, REST clients calling other services, database work), this is largely true and it does simplify a lot of code. Spring Boot 3.2 added spring.threads.virtual.enabled=true, which routes several places — Tomcat request threads, @Async, @Scheduled — onto virtual threads if you turn it on.
What virtual threads do not change:
- The connection pool is still capped. You can have a million virtual threads all trying to do DB work, but if HikariCP has ten connections, only ten can talk to the DB at once. The other 999,990 will block waiting to borrow a connection. Virtual threads make blocking on connection acquisition cheap — but they don’t make it free of contention. The rate at which you can do actual DB work is still bounded by the connection pool.
- The “connection held across slow I/O” antipattern is still fatal. In fact it’s worse — because virtual threads make it easier to have 10,000 concurrent tasks in flight, if each of them is holding a connection open for 5 seconds while doing external I/O, you’ll run through your connection pool even faster than before.
- The transaction ThreadLocal still doesn’t cross thread boundaries. Virtual threads are threads. `TransactionSynchronizationManager` uses a `ThreadLocal`, and each virtual thread gets its own copy. Firing off an `@Async` from inside a `@Transactional` method still loses the transaction, regardless of whether the executor runs platform threads or virtual threads.
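The first point in this list can be checked with a semaphore standing in for the connection pool. The sketch is executor-agnostic so that it compiles on older JDKs too: `main` uses a cached platform-thread pool, and on JDK 21+ you could pass `Executors.newVirtualThreadPerTaskExecutor()` instead. `ConnectionCapDemo` and the 2 ms “query” are illustrative assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ConnectionCapDemo {

    /** Runs the given number of "queries" and returns the peak number of connections in use. */
    static int peakConnectionsInUse(ExecutorService threads, int tasks, int poolSize) {
        Semaphore connections = new Semaphore(poolSize); // stand-in for the connection pool
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            threads.execute(() -> {
                try {
                    connections.acquire();               // blocks while every connection is out
                    try {
                        peak.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                        Thread.sleep(2);                 // the "query"
                    } finally {
                        inUse.decrementAndGet();
                        connections.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        threads.shutdown();
        try {
            if (!threads.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        int peak = peakConnectionsInUse(Executors.newCachedThreadPool(), 500, 10);
        System.out.println("peak connections in use: " + peak); // never more than 10
    }
}
```

However many threads are started, the peak never exceeds the pool size. Cheaper threads change how many tasks can wait, not how many can talk to the database.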
Virtual threads had one significant historical footgun: pinning. If a virtual thread was inside a synchronized block, or calling a native method, it could not be unmounted from its carrier thread. It “pinned” the carrier, and the JVM’s scheduler couldn’t reuse the carrier for other work. Under load, this could reduce the effective parallelism dramatically. Since many JDBC drivers and older libraries used synchronized internally, this made virtual threads risky for exactly the workloads (DB-heavy) they were most attractive for.
That footgun was fixed in JDK 24 (March 2025) via JEP 491, which is marked Closed / Delivered in the OpenJDK issue tracker. On JDK 24+, virtual threads inside synchronized blocks unmount and remount correctly, so the pinning problem largely goes away. Native method pinning remains, but is much rarer in practice.
Practical takeaways for using virtual threads with Spring Boot:
- If you’re on JDK 21-23, be cautious. Test under realistic load before flipping `spring.threads.virtual.enabled=true` for DB-heavy workloads.
- If you’re on JDK 24 or later, the synchronized-block pinning issue is resolved. Virtual threads are a much safer default.
- Virtual threads don’t change the transaction / connection pool story at all. The mental model in sections 5-8 still applies verbatim.
- The one thing virtual threads meaningfully change: you can stop over-thinking thread pool sizing for I/O-bound work. But you still have to think hard about connection pool sizing and transaction scope.
The oversimplified summary: virtual threads make it cheaper to have many threads waiting. They do nothing to make it cheaper to do actual DB work concurrently. The connection pool remains the bottleneck; the transaction still binds to a single thread’s ThreadLocal.
Pulling it all back together
The mental model that fixed Spring concurrency for me, in one paragraph:
Every method annotated with @Async runs on some executor’s thread — pick the executor explicitly, size the pool for the workload, never share pools between unrelated workloads. Every method annotated with @Transactional opens a transaction on a connection borrowed from the connection pool, holds that connection for the entire method’s runtime (including any slow I/O trapped inside), and stores the transaction context in a ThreadLocal scoped to the current thread. Crossing any thread boundary — @Async, a raw executor, a scheduled task, an event listener — loses the transaction, because ThreadLocals don’t follow threads. Under load, the failure mode that will bring your app down before any other is “the connection pool exhausts because too many transactions are holding connections idle across slow external calls,” and the fix layers are: move the slow work off the request thread, cap concurrency with a bounded executor, and shorten the window during which each transaction actually holds a connection. Virtual threads (JDK 21+, ergonomically better from JDK 24+) make thread pool sizing less critical for I/O-bound workloads but change none of the connection-pool or transaction mental model.
That’s the whole shape. There’s plenty more surface area — programmatic transaction management, structured concurrency (JEP 505 delivered in JDK 25, still in preview as of that release, with further iteration planned via JEP 525), reactive alternatives, message-queue-based decoupling — but every one of those is built on the same five concepts: threads, thread pools, connection pools, transactions, and the AOP proxies that wrap them.
If you take one thing away, take this: every @Transactional method is renting a phone line to the database for the entire duration of the method call. Anything else the method does while it holds that line — HTTP calls, file uploads, external OCR, Thread.sleep, waiting on a countdown latch — is time the line is checked out and unavailable to anyone else. When you decide where to put @Transactional, you’re deciding how long that phone line is held. That decision, more than any other single choice in a Spring Boot backend, decides whether your app scales.



