Reader Promise
By the end of this chapter, a beginner should be able to explain the difference between an OS thread, an SPDK reactor, and an spdk_thread; trace a message sent by spdk_thread_send_msg(); understand how pollers run; and diagnose wrong-thread assertions, blocked reactors, leaked pollers, and thread-exit hangs.
This is one of the most important chapters in the book. Many SPDK bugs are not algorithm bugs. They are ownership bugs: code runs on the wrong spdk_thread, blocks a reactor, keeps an io_channel too long, or forgets to unregister a poller.
Mental Model
Use this vocabulary precisely:
- OS thread: the kernel-scheduled execution context, usually pinned to a CPU core by DPDK/SPDK.
- Reactor: SPDK's per-core event loop object. A reactor owns a list of lightweight SPDK threads and event queues.
- spdk_thread: a lightweight cooperative context. It has message queues, pollers, io_channels, stats, and a cpumask.
- Message: a function pointer plus context enqueued to an spdk_thread.
- Poller: a callback that runs repeatedly on an spdk_thread, either every loop or on a timer.
Prose diagram:
CPU core 3
  OS thread "reactor_3"
    reactor object for lcore 3
      spdk_thread "app_thread"
        message ring
        active pollers
        timed pollers
        io_channels
      spdk_thread "nvmf_tgt_poll_group_3"
        message ring
        active pollers
        io_channels
The OS thread runs the reactor loop. The reactor polls each spdk_thread. Each spdk_thread drains messages and runs pollers.
Source Anchors
- include/spdk_internal/event.h: struct spdk_reactor, spdk_reactors_init(), spdk_reactors_start(), spdk_reactors_stop()
- lib/event/reactor.c: spdk_reactors_init(), reactor_construct(), spdk_reactors_start(), reactor_run(), _reactor_run(), reactor_post_process_lw_thread(), spdk_reactors_stop()
- include/spdk/thread.h: spdk_thread_create(), spdk_thread_poll(), spdk_thread_send_msg(), spdk_for_each_thread(), spdk_poller_register(), spdk_poller_unregister(), spdk_thread_exit()
- lib/thread/thread.c: struct spdk_thread, spdk_thread_create(), spdk_set_thread(), spdk_get_thread(), spdk_thread_poll(), thread_poll(), msg_queue_run_batch(), spdk_thread_send_msg(), poller_register(), thread_execute_poller(), thread_execute_timed_poller(), spdk_poller_unregister(), spdk_for_each_thread(), spdk_thread_exit(), thread_exit()
- lib/event/app_rpc.c: rpc_framework_get_reactors(), _rpc_framework_get_reactors()
Reactor Initialization
lib/event/reactor.c:spdk_reactors_init() creates the event framework's reactor state.
It:
- creates g_spdk_event_mempool
- allocates the g_reactors array aligned to 64 bytes
- initializes the thread library with spdk_thread_lib_init_ext()
- constructs a reactor for each env core
- records the scheduling reactor
- sets reactor state to initialized
The thread library call is crucial. Reactors cannot run spdk_thread objects until the thread library exists, because spdk_thread uses message mempools, message rings, poller queues, and io_channel registries.
Reactor Start
lib/event/reactor.c:spdk_reactors_start() sets the reactor state to running, launches a reactor OS thread on every selected core except the current core, and then runs the current core's reactor inline.
That last detail explains why spdk_app_start() blocks: the main OS thread becomes a reactor runner until shutdown.
lib/event/reactor.c:reactor_run() is the long-running loop. It:
- names the POSIX thread reactor_<lcore>
- registers trace ownership
- repeatedly runs either interrupt mode handling or _reactor_run()
- periodically performs scheduler work if enabled
- exits when reactor state changes
- drains and destroys remaining spdk_thread objects
What _reactor_run() Does
lib/event/reactor.c:_reactor_run() is the normal polling loop body.
It:
- Runs a batch of reactor events.
- If the reactor has no SPDK threads, accounts idle time and returns.
- For each lightweight thread on the reactor:
  - gets the spdk_thread
  - calls spdk_thread_poll(thread, 0, reactor->tsc_last)
  - updates reactor busy or idle time based on return code
  - post-processes the lightweight thread
The important point: a reactor does not call arbitrary module code directly. It calls spdk_thread_poll(), and the thread runs messages and pollers.
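The loop body described above can be pictured with a small toy model. This sketch is not SPDK code: every toy_* type and function is a hypothetical stand-in, and counting one busy or idle poll per thread is a simplification of the reactor's tick-based accounting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these are NOT SPDK's structs. */
struct toy_lw_thread {
    int pending_work;            /* units of work this thread still has */
};

struct toy_reactor {
    struct toy_lw_thread *threads;
    size_t thread_count;
    uint64_t busy_polls;
    uint64_t idle_polls;
};

/* Stand-in for spdk_thread_poll(): > 0 means work was done, 0 means idle. */
static int toy_lw_thread_poll(struct toy_lw_thread *t)
{
    if (t->pending_work > 0) {
        t->pending_work--;
        return 1;
    }
    return 0;
}

/* One pass of the loop body: poll each thread, then account busy or idle. */
static void toy_reactor_run_once(struct toy_reactor *r)
{
    assert(r != NULL);
    if (r->thread_count == 0) {
        r->idle_polls++;         /* no threads: account idle and return */
        return;
    }
    for (size_t i = 0; i < r->thread_count; i++) {
        if (toy_lw_thread_poll(&r->threads[i]) > 0) {
            r->busy_polls++;
        } else {
            r->idle_polls++;
        }
    }
}
```

Note that the toy reactor only ever calls the thread poll function. It never reaches into a thread to run a specific callback, which mirrors the layering the section describes.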
spdk_thread Structure
lib/thread/thread.c:struct spdk_thread contains:
- active pollers queue
- timed pollers tree
- paused pollers queue
- message ring
- local message cache
- critical message slot
- io_channel tree
- cpumask
- state
- lock count
- interrupt-mode state
- trace ID
- user context
This is why spdk_thread is more than "a callback queue." It is the unit of SPDK ownership for pollers, messages, and per-thread device resources.
Creating An spdk_thread
lib/thread/thread.c:spdk_thread_create():
- allocates cache-line-aligned memory
- copies or initializes the cpumask
- initializes io_channel and poller containers
- creates a message ring
- fills a local message cache from g_spdk_msg_mempool if possible
- assigns a name and trace ID
- assigns a monotonic thread ID
- inserts the thread into the global thread list
- calls the reactor thread-op hook so the event framework can schedule it
- marks the thread running
- records the first created thread as the app thread
The event framework creates the app thread in lib/event/app.c:spdk_app_start(). Other modules create their own SPDK threads when they need separate lightweight contexts.
Messages
spdk_thread_send_msg(thread, fn, ctx) is the standard cross-thread handoff.
lib/thread/thread.c:spdk_thread_send_msg():
- Checks that the target thread is not exited.
- Tries to take a message object from the sender's local cache.
- Falls back to g_spdk_msg_mempool.
- Stores fn and ctx.
- Enqueues the message to the target thread's message ring.
- Sends a notification if needed.
The function is asynchronous. It does not call fn. It only queues the work.
The message will run when the target thread is polled by its reactor and msg_queue_run_batch() drains messages inside thread_poll().
Beginner rule:
If you need code to run on a different spdk_thread, send a message. Do not call the function directly unless the function explicitly allows it.
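The asynchronous handoff can be illustrated with a toy mailbox. This is a sketch under loud assumptions: toy_mailbox, toy_send_msg(), and toy_mailbox_poll() are hypothetical names, the fixed-size array stands in for the message ring, and returning -1 on a full ring is a toy choice that says nothing about how SPDK handles that case.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a fixed-size FIFO standing in for a thread's message ring. */
typedef void (*toy_msg_fn)(void *ctx);

struct toy_msg {
    toy_msg_fn fn;
    void *ctx;
};

#define TOY_RING_SIZE 8

struct toy_mailbox {
    struct toy_msg ring[TOY_RING_SIZE];
    size_t head;    /* next slot to dequeue */
    size_t count;   /* number of queued messages */
};

/* Queue fn(ctx) for later. Never calls fn. Returns 0, or -1 if full. */
static int toy_send_msg(struct toy_mailbox *target, toy_msg_fn fn, void *ctx)
{
    assert(target != NULL && fn != NULL);
    if (target->count == TOY_RING_SIZE) {
        return -1;
    }
    size_t tail = (target->head + target->count) % TOY_RING_SIZE;
    target->ring[tail].fn = fn;
    target->ring[tail].ctx = ctx;
    target->count++;
    return 0;
}

/* Drain queued messages in FIFO order. Returns how many ran. */
static int toy_mailbox_poll(struct toy_mailbox *box)
{
    int ran = 0;
    while (box->count > 0) {
        struct toy_msg m = box->ring[box->head];
        box->head = (box->head + 1) % TOY_RING_SIZE;
        box->count--;
        m.fn(m.ctx);
        ran++;
    }
    return ran;
}

/* Example handler: increments the int that ctx points to. */
static void toy_increment(void *ctx)
{
    (*(int *)ctx)++;
}
```

After toy_send_msg(&box, toy_increment, &counter) returns, counter is unchanged. It only changes once toy_mailbox_poll(&box) runs, which is the property to keep in mind when reading real callers of spdk_thread_send_msg().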
Pollers
A poller is a callback registered on the current spdk_thread.
lib/thread/thread.c:poller_register() requires spdk_get_thread() to be non-NULL. It allocates a struct spdk_poller, names it, records the callback and argument, assigns a per-thread poller ID, converts the period from microseconds to ticks, initializes interrupt support if needed, and inserts it into either:
- active pollers, if period is zero
- timed pollers, if period is nonzero
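The microseconds-to-ticks conversion is simple arithmetic, and a worked sketch makes the units concrete. The helper below is hypothetical and is not copied from poller_register(); it shows one overflow-conscious way to do the conversion, with the tick rate passed in as a parameter rather than read from the environment.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_USEC_PER_SEC 1000000ULL

/* Convert a period in microseconds to ticks at ticks_hz ticks per second.
 * Whole seconds and the sub-second remainder are converted separately so
 * the intermediate product stays small. */
static uint64_t toy_usec_to_ticks(uint64_t period_us, uint64_t ticks_hz)
{
    assert(ticks_hz != 0);
    uint64_t whole_sec = period_us / TOY_USEC_PER_SEC;
    uint64_t rem_us = period_us % TOY_USEC_PER_SEC;

    return ticks_hz * whole_sec + (ticks_hz * rem_us) / TOY_USEC_PER_SEC;
}
```

With an assumed 2.4 GHz tick rate, a 1000 microsecond period becomes 2,400,000 ticks. A period of zero converts to zero ticks, which matches the rule that a zero period means an active poller rather than a timed one.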
Public wrappers:
- include/spdk/thread.h:spdk_poller_register()
- include/spdk/thread.h:spdk_poller_register_named()
- include/spdk/thread.h:SPDK_POLLER_REGISTER()
Poller return values matter:
- 0 means idle.
- Positive means busy.
- Negative is allowed for some debug/reporting paths but does not mean "unregister me."
The reactor and thread stats use idle and busy return values to track work.
How A Poller Runs
Inside lib/thread/thread.c:thread_poll():
- A critical message runs first if present.
- A batch of regular messages is drained.
- Active pollers are executed.
- Post-poller handlers run if registered.
- Timed pollers whose deadline has passed are executed.
Active pollers are round-robin by queue movement. Timed pollers live in an RB tree keyed by next run time.
thread_execute_poller() and thread_execute_timed_poller() both assert that thread->lock_count == 0 after the callback. This is the source of lock-count asserts when code holds an SPDK spinlock across a point where SPDK expects cooperative progress.
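The ordering inside one poll pass can be made visible with a toy that logs one letter per step. Everything here is a hypothetical simplification: counters replace the real queues, a single timed poller replaces the RB tree, post-poller handlers are omitted, and rescheduling the timed poller relative to the current tick is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: counters stand in for the real queues. */
struct toy_poll_state {
    int has_critical_msg;
    int queued_msgs;
    int active_pollers;
    uint64_t timed_deadline;   /* next run tick of one timed poller */
    uint64_t timed_period;     /* 0 means no timed poller is registered */
    char log[64];              /* one letter appended per executed step */
};

static void toy_log_step(struct toy_poll_state *s, char step)
{
    size_t n = strlen(s->log);
    if (n + 1 < sizeof(s->log)) {
        s->log[n] = step;
        s->log[n + 1] = '\0';
    }
}

/* One poll pass in the order the section lists. Returns callbacks run. */
static int toy_poll_pass(struct toy_poll_state *s, uint64_t now)
{
    int ran = 0;

    assert(s != NULL);
    if (s->has_critical_msg) {               /* critical message first */
        s->has_critical_msg = 0;
        toy_log_step(s, 'C');
        ran++;
    }
    while (s->queued_msgs > 0) {             /* then regular messages */
        s->queued_msgs--;
        toy_log_step(s, 'M');
        ran++;
    }
    for (int i = 0; i < s->active_pollers; i++) {   /* then active pollers */
        toy_log_step(s, 'A');
        ran++;
    }
    if (s->timed_period != 0 && now >= s->timed_deadline) {
        toy_log_step(s, 'T');                /* timed poller, only if due */
        s->timed_deadline = now + s->timed_period;
        ran++;
    }
    return ran;
}
```

A timed poller with a deadline of tick 100 does not appear in the log for a pass at tick 90. It runs on the first pass at or after tick 100, which is why a timed poller's period is a lower bound rather than an exact schedule.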
The No-Blocking Rule
A reactor is a cooperative event loop. If a poller blocks, that OS thread stops polling every other spdk_thread assigned to that reactor.
Do not:
- sleep in a poller
- perform blocking filesystem I/O in a hot callback
- wait synchronously for an RPC response from the same framework
- hold locks across callbacks that may pump SPDK threads
- busy-loop inside a poller instead of returning and letting the reactor continue
Use:
- messages for ownership handoff
- pollers for repeated progress
- async callbacks for completion
- NOMEM or retry queues for resource pressure
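One way to follow the rule is to give a poller a bounded amount of work per call and keep progress in its context. The sketch below is illustrative only: toy_copy_job, TOY_CHUNK, and toy_copy_poller() are hypothetical names, and the callback is shaped like a poller (takes a context pointer, returns busy or idle) without being registered with SPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical job state carried between poller invocations. */
struct toy_copy_job {
    const char *src;
    char *dst;
    size_t len;
    size_t done;
};

#define TOY_CHUNK 4   /* bytes of work allowed per invocation */

/* Poller-style callback: copies at most TOY_CHUNK bytes and returns.
 * Returns 1 (busy) if it made progress, 0 (idle) once the job is done. */
static int toy_copy_poller(void *arg)
{
    struct toy_copy_job *job = arg;

    assert(job != NULL && job->done <= job->len);
    size_t remaining = job->len - job->done;
    if (remaining == 0) {
        return 0;
    }
    size_t n = remaining < TOY_CHUNK ? remaining : TOY_CHUNK;
    memcpy(job->dst + job->done, job->src + job->done, n);
    job->done += n;
    return 1;
}
```

A 10-byte job finishes after three calls instead of one long loop, so other threads on the same reactor get polled between chunks. In real code the poller would also need to be unregistered explicitly once the job completes.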
Wrong-Thread Assertions
SPDK APIs often require that operations happen on the same spdk_thread that owns the object.
lib/thread/thread.c:wrong_thread() logs the function, object name, current thread, and expected thread, then asserts.
Common causes:
- unregistering a poller from a different thread than the one that registered it
- putting an io_channel from the wrong thread
- calling module-specific functions on a callback thread rather than the resource owner thread
- mixing OS thread identity with spdk_thread identity
Misconception to kill:
"I am on the same CPU core, so I am on the right SPDK thread." Not necessarily. Ownership is spdk_thread, not just core.
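The same-core misconception can be shown with a toy ownership check. All names here are hypothetical, the global pointer stands in for the thread-local current spdk_thread, and the check returns 0 where the text says SPDK's wrong_thread() would log and assert.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: identity is the pointer, not the core number. */
struct toy_owner_thread {
    const char *name;
    int core;
};

struct toy_resource {
    const char *name;
    const struct toy_owner_thread *owner;
};

/* Stand-in for the thread-local "current spdk_thread" pointer. */
static const struct toy_owner_thread *g_toy_current;

static void toy_set_current(const struct toy_owner_thread *t)
{
    g_toy_current = t;
}

/* Returns 1 if the caller runs on the owning thread, 0 otherwise. */
static int toy_on_owner_thread(const struct toy_resource *res)
{
    assert(res != NULL);
    return g_toy_current == res->owner;
}
```

With two toy threads that both report core 3, a resource owned by one still fails the check when the other is current. The comparison never looks at the core field.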
Thread Exit
lib/thread/thread.c:spdk_thread_exit() marks a thread as exiting. It does not instantly free the thread.
lib/thread/thread.c:thread_exit() waits until:
- message ring is empty
- no spdk_for_each_thread() or spdk_for_each_channel() operations are outstanding
- active pollers are unregistered
- timed pollers are unregistered
- paused pollers are gone
- io_channels are released
- pending io_device unregisters are complete
Only then does the state become exited. lib/event/reactor.c:reactor_post_process_lw_thread() sees an exited and idle thread, removes it from the reactor, and destroys it.
If shutdown hangs, inspect the thread for remaining messages, pollers, io_channels, or outstanding foreach operations.
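When debugging a hang, it helps to treat the exit conditions as a checklist. The sketch below is a hypothetical model, not the logic of thread_exit(): the struct fields are made-up counters, and the check order simply follows the list above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy counters for the things that can keep a thread from exiting. */
struct toy_exit_state {
    size_t queued_msgs;
    size_t for_each_outstanding;
    size_t active_pollers;
    size_t timed_pollers;
    size_t paused_pollers;
    size_t io_channels;
    size_t pending_io_device_unregisters;
};

enum toy_exit_blocker {
    TOY_EXIT_OK = 0,
    TOY_BLOCKED_MESSAGES,
    TOY_BLOCKED_FOR_EACH,
    TOY_BLOCKED_ACTIVE_POLLERS,
    TOY_BLOCKED_TIMED_POLLERS,
    TOY_BLOCKED_PAUSED_POLLERS,
    TOY_BLOCKED_IO_CHANNELS,
    TOY_BLOCKED_IO_DEVICE_UNREGISTER
};

/* Returns the first blocker found, or TOY_EXIT_OK if none remain. */
static enum toy_exit_blocker toy_exit_check(const struct toy_exit_state *s)
{
    assert(s != NULL);
    if (s->queued_msgs) return TOY_BLOCKED_MESSAGES;
    if (s->for_each_outstanding) return TOY_BLOCKED_FOR_EACH;
    if (s->active_pollers) return TOY_BLOCKED_ACTIVE_POLLERS;
    if (s->timed_pollers) return TOY_BLOCKED_TIMED_POLLERS;
    if (s->paused_pollers) return TOY_BLOCKED_PAUSED_POLLERS;
    if (s->io_channels) return TOY_BLOCKED_IO_CHANNELS;
    if (s->pending_io_device_unregisters) return TOY_BLOCKED_IO_DEVICE_UNREGISTER;
    return TOY_EXIT_OK;
}
```

A thread with one leftover timed poller and two io_channels reports the timed poller first, then the io_channels once the poller is gone, and becomes eligible to exit only when every counter is zero.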
Interrupt Mode
SPDK's classic model is polling. This tree also supports interrupt mode. In reactor code, reactor_run() chooses reactor_interrupt_run() when reactor->in_interrupt is true. In thread code, spdk_thread_poll() waits on the thread fd group when the thread is in interrupt mode.
For beginners, the important distinction:
- Poll mode repeatedly calls pollers, which gives low latency at the cost of high CPU use.
- Interrupt mode waits on file descriptors where supported, reducing CPU but adding complexity.
Do not assume every poller or device path has the same interrupt-mode behavior.
Edge Cases And Failure Modes
- Message mempool exhaustion: spdk_thread_send_msg() aborts if it cannot allocate a message.
- Message ring enqueue failure: aborts.
- Target thread exited: sending a message aborts.
- Poller registered outside any spdk_thread: assert path.
- Poller unregistered from the wrong thread: wrong-thread assert.
- Poller callback blocks: reactor stalls.
- Poller callback returns busy forever: stats show busy even if no useful work happens.
- Thread exit with active pollers: exit waits and logs.
- Thread exit with io_channels: exit waits and logs.
- Reactor shutdown with non-app running threads: logs that spdk_thread_exit() was not called.
Misconceptions To Kill
- "spdk_thread is a pthread." It is not. It is a lightweight SPDK context run by a reactor.
- "Messages run immediately." They run later when the target thread polls.
- "Pollers are background threads." They are callbacks on an spdk_thread.
- "A timed poller runs exactly at its period." It runs when the thread is polled and its deadline has passed.
- "Blocking only hurts my poller." Blocking hurts the whole reactor OS thread.
- "Returning -1 from a poller unregisters it." Unregistration is explicit.
Diskengine Relevance
Diskengine integrations tend to cross boundaries: an external controller sends RPCs, SPDK translates them into bdev or transport work, and completions come back asynchronously. Bugs appear when a control path assumes synchronous behavior.
When reading diskengine-facing SPDK code, always annotate:
- callback owner thread
- resource owner thread
- whether a function sends a message
- whether a function registers a poller
- where completion is delivered
That habit prevents most wrong-thread misunderstandings.
Prose Diagram: Message Delivery
Imagine a message as a sealed envelope:
- Sender writes function pointer and context into the envelope.
- Sender drops it into the target thread's mailbox.
- Reactor eventually visits that target thread.
- spdk_thread_poll() opens a batch of envelopes.
- Each function runs on the target thread.
The sender does not wait by the mailbox.
Source Reading Exercise
Read the loop from reactor to poller:
- lib/event/reactor.c:spdk_reactors_start()
- lib/event/reactor.c:reactor_run()
- lib/event/reactor.c:_reactor_run()
- lib/thread/thread.c:spdk_thread_poll()
- lib/thread/thread.c:thread_poll()
- lib/thread/thread.c:thread_execute_poller()
- lib/thread/thread.c:thread_execute_timed_poller()
Then read the message path:
- lib/thread/thread.c:spdk_thread_send_msg()
- lib/thread/thread.c:msg_queue_run_batch()
- lib/thread/thread.c:thread_poll()
Questions:
- Where does TLS spdk_thread get set?
- What happens before active pollers run?
- How does SPDK decide busy vs idle?
- What causes a thread to be destroyed?
Operational Lab
Use RPC and logs:
- Start an SPDK target with a small reactor mask.
- Call framework_get_reactors.
- Identify reactors, their threads, busy ticks, idle ticks, and interrupt state.
- Add or enable a component that registers a poller.
- Call framework_get_reactors again and observe thread/poller changes.
Source-only variation:
- Pick one module that calls spdk_thread_send_msg() and trace why it needs to cross ownership boundaries.
Self-Check
- What is the difference between an OS thread, a reactor, and an spdk_thread?
- Why does spdk_thread_send_msg() not call the function directly?
- Where are active pollers stored?
- Where are timed pollers stored?
- Why must pollers avoid blocking?
- What conditions must be satisfied before an spdk_thread exits?
- Why can being on the same CPU core still be the wrong SPDK thread?
References
- Local source: include/spdk_internal/event.h
- Local source: lib/event/reactor.c
- Local source: include/spdk/thread.h
- Local source: lib/thread/thread.c
- Local source: lib/event/app_rpc.c