The problem with reading SPDK like normal application code
If you read SPDK as if it were a request/response web server, it will hurt. A web handler often receives a request, calls functions, returns a response, and the stack unwinds. SPDK code often receives an event, allocates a context object, submits work, returns immediately, and finishes later in a callback that may run on the same spdk_thread or a different one. The call stack is not the story. The state machine is the story.
This is why the code is full of small structs named ctx, req, task, iter, cb_arg, and *_ctx. Those objects are the synthetic stack frames for async C. They carry the variables that a blocking implementation would have kept on the stack.
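A minimal sketch of the "synthetic stack frame" idea, in plain C with no SPDK headers. Every name here (toy_device, toy_submit, toy_device_poll, copy_ctx, copy_start) is hypothetical and invented for illustration; none of it is SPDK API. The point is only that the loop counter and result pointer, which a blocking version would keep as locals, have to live in a heap context that survives each return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical completion callback type (not an SPDK typedef). */
typedef void (*toy_done_fn)(void *cb_arg, int status);

/* Toy device that holds at most one pending operation. */
struct toy_device {
	toy_done_fn pending_fn;
	void *pending_arg;
};

/* Queue one operation and return at once; 0 means "accepted", not "done". */
static int toy_submit(struct toy_device *dev, toy_done_fn fn, void *cb_arg)
{
	if (dev->pending_fn != NULL) {
		return -EBUSY;
	}
	dev->pending_fn = fn;
	dev->pending_arg = cb_arg;
	return 0;
}

/* Complete the pending operation, if any. Returns 1 if a callback ran. */
static int toy_device_poll(struct toy_device *dev)
{
	toy_done_fn fn = dev->pending_fn;

	if (fn == NULL) {
		return 0;
	}
	dev->pending_fn = NULL; /* clear first so the callback may resubmit */
	fn(dev->pending_arg, 0);
	return 1;
}

/* The synthetic stack frame: state that must outlive each return. */
struct copy_ctx {
	struct toy_device *dev;
	int steps_left; /* a loop counter in a blocking version */
	int *result;    /* where the caller wants the final status */
};

static void copy_step_done(void *cb_arg, int status)
{
	struct copy_ctx *ctx = cb_arg;

	if (status == 0 && --ctx->steps_left > 0) {
		status = toy_submit(ctx->dev, copy_step_done, ctx);
		if (status == 0) {
			return; /* more work in flight; ctx stays alive */
		}
	}
	*ctx->result = status; /* terminal path: report, then free the frame */
	free(ctx);
}

static int copy_start(struct toy_device *dev, int steps, int *result)
{
	struct copy_ctx *ctx;
	int rc;

	assert(steps > 0);
	ctx = calloc(1, sizeof(*ctx));
	if (ctx == NULL) {
		return -ENOMEM;
	}
	ctx->dev = dev;
	ctx->steps_left = steps;
	ctx->result = result;
	rc = toy_submit(dev, copy_step_done, ctx);
	if (rc != 0) {
		free(ctx); /* no callback will ever run, so free here */
	}
	return rc;
}
```

Calling copy_start(&dev, 3, &result) returns before any step has run; the result is only written after toy_device_poll has been called three times. Note the two places that free ctx: the terminal callback path and the failed-submission path.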
The five-question reading loop
Every time you enter a new SPDK function, ask these questions.
1. What object is this about?
Examples:
- struct spdk_bdev
- struct spdk_bdev_io
- struct spdk_bdev_desc
- struct spdk_io_channel
- struct spdk_thread
- struct spdk_poller
- struct spdk_nvme_ctrlr
- struct spdk_nvme_qpair
- struct spdk_lvol
- struct spdk_blob
- struct spdk_nvmf_request
- struct spdk_nvmf_qpair
Do not start with helpers. Start with the object. Find its struct definition. Read the fields. Look for embedded TAILQ_ENTRY, RB_ENTRY, reference counts, state enums, callbacks, and owner pointers.
2. Who owns it?
Ownership in SPDK usually means one or more of:
- The current spdk_thread owns access to mutable state.
- A module owns a bdev it registered.
- A descriptor owns an open claim.
- A channel owns per-thread resources.
- A controller owns qpairs and namespaces.
- A request owns child IOs until completion.
- A callback context owns heap memory until the terminal callback frees it.
Ownership bugs are the source of many hard failures. A wrong-thread channel put is not a style issue. It is a correctness issue.
3. Is this synchronous or asynchronous?
SPDK functions often return an integer that only tells you whether submission succeeded. The actual operation completes later. Examples:
- bdev IO submission returns before IO completion.
- lvol create/delete/resize uses callbacks.
- blobstore metadata operations use callbacks.
- subsystem initialization is async and must call the next init step.
- config replay sends JSON-RPC requests and waits for responses through a poller.
If a function takes a callback and callback argument, assume the return value is not the final result unless the documentation explicitly says otherwise.
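The difference between the return value and the callback status can be sketched with a toy queue. This is an illustrative model only: toy_queue, toy_io_submit, toy_queue_poll, and the will_fail knob are hypothetical names, not SPDK functions. It shows three distinct outcomes: submission rejected, submission accepted but the operation fails, and submission accepted and the operation succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_QUEUE_DEPTH 2

/* Hypothetical completion callback type (not an SPDK typedef). */
typedef void (*toy_io_cb)(void *cb_arg, int status);

struct toy_io {
	toy_io_cb cb;
	void *cb_arg;
	int will_fail; /* fault-injection knob for the example */
};

struct toy_queue {
	struct toy_io slots[TOY_QUEUE_DEPTH];
	size_t count;
};

/*
 * The return value covers submission only:
 * 0 = queued (the callback will run later), -ENOMEM = queue full
 * (the callback will never run).
 */
static int toy_io_submit(struct toy_queue *q, int will_fail,
			 toy_io_cb cb, void *cb_arg)
{
	assert(cb != NULL);
	if (q->count == TOY_QUEUE_DEPTH) {
		return -ENOMEM;
	}
	q->slots[q->count++] = (struct toy_io){ cb, cb_arg, will_fail };
	return 0;
}

/*
 * Completion pass: the outcome of each queued operation is reported here,
 * through its callback. Callbacks must not resubmit in this simple model.
 * Returns the number of operations completed.
 */
static size_t toy_queue_poll(struct toy_queue *q)
{
	size_t n = q->count;

	for (size_t i = 0; i < n; i++) {
		struct toy_io *io = &q->slots[i];

		io->cb(io->cb_arg, io->will_fail ? -EIO : 0);
	}
	q->count = 0;
	return n;
}

/* Example callback: store the operation status where the caller can see it. */
static void record_status(void *cb_arg, int status)
{
	*(int *)cb_arg = status;
}
```

A caller that checks only the return value of toy_io_submit would treat the -EIO operation as a success. A caller that waits for a callback after a -ENOMEM return would wait forever.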
4. Which thread must run the next step?
SPDK avoids locks by moving work to the owner thread. That means you must trace messages:
spdk_thread_send_msg(thread, fn, ctx);
This does not call fn immediately. It enqueues a message for the target spdk_thread, and fn runs when that thread is next polled. The only exception is a helper that explicitly executes inline when the target is the current thread.
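The enqueue-then-poll behaviour can be modelled in a few lines. This is a toy, single-process sketch: toy_thread, toy_thread_send_msg, and toy_thread_poll are hypothetical stand-ins, and the fixed-size ring is an assumption made for brevity, not a description of how SPDK stores messages.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MSG_MAX 8 /* power of two, so index wraparound stays consistent */

typedef void (*toy_msg_fn)(void *ctx);

struct toy_msg {
	toy_msg_fn fn;
	void *ctx;
};

struct toy_thread {
	struct toy_msg ring[TOY_MSG_MAX];
	size_t head; /* next message to run */
	size_t tail; /* next free slot */
};

/* Enqueue only. fn does not run until the target thread is polled. */
static int toy_thread_send_msg(struct toy_thread *t, toy_msg_fn fn, void *ctx)
{
	assert(fn != NULL);
	if (t->tail - t->head == TOY_MSG_MAX) {
		return -ENOMEM;
	}
	t->ring[t->tail++ % TOY_MSG_MAX] = (struct toy_msg){ fn, ctx };
	return 0;
}

/*
 * Run the messages that were queued before this call started. Messages sent
 * by a handler during the poll wait for the next poll. Returns the number
 * of messages executed.
 */
static int toy_thread_poll(struct toy_thread *t)
{
	size_t end = t->tail;
	int ran = 0;

	while (t->head != end) {
		struct toy_msg m = t->ring[t->head++ % TOY_MSG_MAX];

		m.fn(m.ctx);
		ran++;
	}
	return ran;
}

/* State owned by one thread; other threads change it only via messages. */
struct counter {
	int value;
};

static void counter_inc(void *ctx)
{
	((struct counter *)ctx)->value++;
}
```

After toy_thread_send_msg(&owner, counter_inc, &c) returns 0, c.value is still unchanged. It changes only when toy_thread_poll(&owner) runs, which is the step to look for when tracing where the next stage of an operation executes.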
5. Who completes and who frees?
Every async path needs a terminal event. Find it.
For bdev IO, the terminal event is usually:
spdk_bdev_io_complete(bdev_io, status);
For JSON-RPC, it may be:
spdk_jsonrpc_end_result(request, w);
For subsystem init:
spdk_subsystem_init_next(rc);
For channel iteration:
spdk_for_each_channel_continue(i, status);
When debugging a hang, ask: which required completion was never called?
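A toy model of channel iteration shows how one missing completion stalls the whole operation. The names (toy_iter, toy_for_each_channel, toy_for_each_channel_continue) are hypothetical, and this sketch advances inline on one stack, whereas the real iteration hops between threads via messages. The control-flow contract is the part being illustrated: the per-channel step must always hand control back.

```c
#include <assert.h>
#include <stddef.h>

struct toy_iter;
typedef void (*toy_channel_fn)(struct toy_iter *i);
typedef void (*toy_iter_done_fn)(void *ctx, int status);

struct toy_iter {
	int *channels;         /* one int of per-channel state */
	size_t count;
	size_t cur;            /* index of the channel being visited */
	toy_channel_fn fn;     /* per-channel step */
	toy_iter_done_fn done; /* terminal callback */
	void *ctx;
};

/* Called by the per-channel step to move on, or to finish the iteration. */
static void toy_for_each_channel_continue(struct toy_iter *i, int status)
{
	assert(i->cur < i->count);
	i->cur++;
	if (status != 0 || i->cur == i->count) {
		i->done(i->ctx, status);
		return;
	}
	i->fn(i);
}

static void toy_for_each_channel(struct toy_iter *i)
{
	i->cur = 0;
	if (i->count == 0) {
		i->done(i->ctx, 0);
		return;
	}
	i->fn(i);
}

/* Well-behaved step: do the work, then hand control back. */
static void quiesce_channel(struct toy_iter *i)
{
	i->channels[i->cur] = 0;
	toy_for_each_channel_continue(i, 0);
}

/* Buggy step: one early-return path forgets to continue. */
static void quiesce_channel_buggy(struct toy_iter *i)
{
	if (i->channels[i->cur] < 0) {
		return; /* BUG: the iteration stops here and done never runs */
	}
	i->channels[i->cur] = 0;
	toy_for_each_channel_continue(i, 0);
}

/* Terminal callback: record 1 on success, -1 on failure. */
static void mark_done(void *ctx, int status)
{
	*(int *)ctx = (status == 0) ? 1 : -1;
}
```

With quiesce_channel, every channel is visited and mark_done fires. With quiesce_channel_buggy and a negative value in the second channel, the third channel is never visited and mark_done never runs, so anything waiting on that terminal callback waits forever.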
Why edge cases matter more than happy path
Storage systems spend most of their complexity on edge cases:
- device disappears
- controller resets
- queue is full
- CQ has no free slots
- allocation returns -ENOMEM
- bdev is being removed
- descriptor is still open
- lvolstore metadata is not loaded yet
- duplicate name after restart
- reset is draining outstanding IO
- channel destruction is deferred
- config replay issues an RPC in the wrong phase
A happy-path-only tutorial is actively dangerous because it teaches the wrong shape. SPDK code is built around preserving invariants when these edge cases happen.
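As one small illustration of preserving invariants on a failure path, here is a sketch of staged cleanup labels for the "allocation returns -ENOMEM" case. The struct and functions (toy_bdev_ctx, toy_bdev_create, toy_bdev_destroy) and the fail_step fault-injection parameter are hypothetical, not taken from any SPDK module. The invariant being kept is simple: on any failure, nothing is leaked and the output pointer is NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct toy_bdev_ctx {
	char *name;
	void *buf;
};

/*
 * Build a context in three steps. fail_step (1..3) forces that step to fail
 * so the unwind paths can be exercised; 0 means no injected failure.
 * On failure, everything acquired so far is released and *out stays NULL.
 */
static int toy_bdev_create(const char *name, int fail_step,
			   struct toy_bdev_ctx **out)
{
	struct toy_bdev_ctx *ctx;
	int rc;

	assert(name != NULL && out != NULL);
	*out = NULL;

	ctx = calloc(1, sizeof(*ctx));
	if (ctx == NULL || fail_step == 1) {
		rc = -ENOMEM;
		goto err_ctx;
	}
	ctx->name = malloc(strlen(name) + 1);
	if (ctx->name == NULL || fail_step == 2) {
		rc = -ENOMEM;
		goto err_name;
	}
	strcpy(ctx->name, name);
	ctx->buf = malloc(4096);
	if (ctx->buf == NULL || fail_step == 3) {
		rc = -ENOMEM;
		goto err_buf;
	}
	*out = ctx;
	return 0;

	/* Labels run in reverse order of acquisition; free(NULL) is a no-op. */
err_buf:
	free(ctx->buf);
err_name:
	free(ctx->name);
err_ctx:
	free(ctx);
	return rc;
}

static void toy_bdev_destroy(struct toy_bdev_ctx *ctx)
{
	free(ctx->buf);
	free(ctx->name);
	free(ctx);
}
```

When reading a real module, the equivalent exercise is to pick each failure point and check that the label it jumps to releases exactly what had been acquired by then.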
The source tour method
When you read a file, do this:
- Find the public entry points.
- Find the state structs.
- Find the registration macro.
- Find the callback types.
- Find the completion function.
- Find error cleanup labels.
- Find reset/remove/shutdown paths.
- Find tests.
For example, in a bdev module:
- public RPC handler in *_rpc.c
- module registration with SPDK_BDEV_MODULE_REGISTER
- bdev function table with submit_request
- channel create/destroy
- base bdev event callback if virtual
- spdk_bdev_io_complete
- destruct callback
Labs
Lab 1: classify a function
Open lib/thread/thread.c and find spdk_thread_poll. Answer:
- What object does it operate on?
- Does it block?
- What callbacks can it execute?
- What does its return value mean?
- What state does it temporarily set?
Lab 2: classify a diskengine RPC wrapper
Open /home/lolwierd/Projects/excloud/diskengine/diskengine/internal/spdkclient/wrappers.go. Pick BdevNvmeAttachController.
- What JSON-RPC method is sent?
- What does the Go wrapper consider success?
- Does success mean the resulting bdev is fully examined?
- Where in SPDK is that RPC handler registered?
Lab 3: find the terminal callback
Open module/bdev/null/bdev_null.c. Find the path for a read. Where does the module eventually call spdk_bdev_io_complete? What would happen if it forgot?
Self-check
- Why is a callback context object like a manual stack frame?
- Why is "who frees this" as important as "what does this do"?
- What is the difference between submission success and operation success?
- Why can a missing spdk_for_each_channel_continue hang a system?