The Reader Contract: How To Learn SPDK Without Drowning

The problem with reading SPDK like normal application code

If you read SPDK as if it were a request/response web server, it will hurt. A web handler often receives a request, calls functions, returns a response, and the stack unwinds. SPDK code often receives an event, allocates a context object, submits work, returns immediately, and finishes later in a callback that may run on the same spdk_thread or a different one. The call stack is not the story. The state machine is the story.

This is why the code is full of small structs named ctx, req, task, iter, cb_arg, and *_ctx. Those objects are the synthetic stack frames for async C. They carry the variables that a blocking implementation would have kept on the stack.
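
The sketch below is a toy model of that idea, not SPDK code. Every name in it (fake_submit, fake_poll, two_step_ctx, and so on) is hypothetical. It shows a two-step operation whose "local variables" live in a heap-allocated context, because the function that started the operation has already returned by the time each step completes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy model; none of these names are SPDK APIs. */
typedef void (*done_fn)(void *cb_arg, int status);

/* One pending completion, standing in for a device queue of depth 1. */
static done_fn g_pending_fn;
static void *g_pending_arg;

/* "Submit" work: remember the callback and return immediately. */
static int fake_submit(done_fn fn, void *cb_arg)
{
	if (g_pending_fn != NULL) {
		return -1; /* queue full: submission failed, fn will never run */
	}
	g_pending_fn = fn;
	g_pending_arg = cb_arg;
	return 0;
}

/* "Poll": run the pending completion, if any. */
static void fake_poll(void)
{
	done_fn fn = g_pending_fn;
	void *arg = g_pending_arg;

	g_pending_fn = NULL;
	if (fn != NULL) {
		fn(arg, 0);
	}
}

/* The context holds what a blocking version would keep in locals. */
struct two_step_ctx {
	int steps_left; /* loop counter that would live on the stack */
	int *result;    /* where the caller wants the final answer */
};

static void two_step_done(void *cb_arg, int status)
{
	struct two_step_ctx *ctx = cb_arg;

	if (status == 0 && --ctx->steps_left > 0) {
		/* Resume the "function" at its next step. */
		if (fake_submit(two_step_done, ctx) == 0) {
			return;
		}
		status = -1;
	}
	*ctx->result = status == 0 ? 1 : -1; /* terminal step */
	free(ctx);                           /* the manual frame is popped here */
}

/* Returns 0 if the operation was started, not if it finished. */
static int two_step_start(int *result)
{
	struct two_step_ctx *ctx = calloc(1, sizeof(*ctx));

	if (ctx == NULL) {
		return -1;
	}
	ctx->steps_left = 2;
	ctx->result = result;
	*result = 0; /* 0 means "still in flight" */
	if (fake_submit(two_step_done, ctx) != 0) {
		free(ctx);
		return -1;
	}
	return 0;
}
```

After two_step_start returns 0, the result stays 0 through the first fake_poll call and becomes 1 only after the second. The context is freed exactly once, in the terminal branch of the callback.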

The five-question reading loop

Every time you enter a new SPDK function, ask these questions.

1. What object is this about?

Examples:

struct spdk_bdev
struct spdk_bdev_io
struct spdk_bdev_desc
struct spdk_io_channel
struct spdk_thread
struct spdk_poller
struct spdk_nvme_ctrlr
struct spdk_nvme_qpair
struct spdk_lvol
struct spdk_blob
struct spdk_nvmf_request
struct spdk_nvmf_qpair

Do not start with helpers. Start with the object. Find its struct definition. Read the fields. Look for embedded TAILQ_ENTRY, RB_ENTRY, reference counts, state enums, callbacks, and owner pointers.
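
As a reading aid, here is a hypothetical struct annotated the way you might annotate a real one. toy_dev and its helpers are invented for illustration and do not correspond to any SPDK type. The sketch assumes a platform that provides the BSD <sys/queue.h> macros. Each field comment is the question that field should prompt.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical object; not an SPDK type. */
enum toy_state {
	TOY_STATE_INIT,
	TOY_STATE_READY,
	TOY_STATE_REMOVING,
};

typedef void (*toy_remove_cb)(void *cb_arg);

struct toy_owner; /* opaque here; stands in for a thread or parent object */

struct toy_dev {
	const char *name;
	enum toy_state state;      /* state enum: which transitions are legal? */
	int ref;                   /* reference count: who takes and drops it? */
	struct toy_owner *owner;   /* owner pointer: who may mutate this? */
	toy_remove_cb remove_cb;   /* callback: fired at the terminal event */
	void *remove_cb_arg;
	TAILQ_ENTRY(toy_dev) link; /* embedded list entry: which list holds it? */
};

TAILQ_HEAD(toy_dev_list, toy_dev);

/* Drop one reference; finish a pending removal when the last one goes away. */
static void toy_dev_put(struct toy_dev_list *list, struct toy_dev *dev)
{
	assert(dev->ref > 0);
	if (--dev->ref == 0 && dev->state == TOY_STATE_REMOVING) {
		TAILQ_REMOVE(list, dev, link);
		if (dev->remove_cb != NULL) {
			dev->remove_cb(dev->remove_cb_arg);
		}
	}
}

/* Request removal; it only completes once no references remain. */
static void toy_dev_remove(struct toy_dev_list *list, struct toy_dev *dev,
			   toy_remove_cb cb, void *cb_arg)
{
	dev->state = TOY_STATE_REMOVING;
	dev->remove_cb = cb;
	dev->remove_cb_arg = cb_arg;
	dev->ref++;             /* hold a reference across the call */
	toy_dev_put(list, dev); /* drops it; fires cb if it was the last */
}

/* Example callback: counts how many times removal completed. */
static void toy_count_cb(void *cb_arg)
{
	(*(int *)cb_arg)++;
}
```

The fields explain the behavior before you read any helper: removal is requested in one place and completed in another, and the reference count decides when.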

2. Who owns it?

Ownership in SPDK usually means one or more of:

The current spdk_thread owns access to mutable state.
A module owns a bdev it registered.
A descriptor owns one open of a bdev, and the bdev cannot finish unregistering until that descriptor is closed.
A channel owns per-thread resources.
A controller owns qpairs and namespaces.
A request owns child IOs until completion.
A callback context owns heap memory until the terminal callback frees it.

Ownership bugs are the source of many hard failures. A wrong-thread channel put is not a style issue. It is a correctness issue.

3. Is this synchronous or asynchronous?

SPDK functions often return an integer that only tells you whether submission succeeded. The actual operation completes later. Examples:

bdev IO submission returns before IO completion.
lvol create/delete/resize uses callbacks.
blobstore metadata operations use callbacks.
subsystem initialization is async and must call the next init step.
config replay sends JSON-RPC requests and waits for responses through a poller.

If a function takes a callback and callback argument, assume the return value is not the final result unless the documentation explicitly says otherwise.

4. Which thread must run the next step?

SPDK avoids locks by moving work to the owner thread. That means you must trace messages:

spdk_thread_send_msg(thread, fn, ctx);

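This does not call fn immediately. It enqueues a message on the target spdk_thread, even when the target is the calling thread, and fn runs the next time that thread is polled. Only a helper that explicitly executes inline on the current thread, such as spdk_thread_exec_msg, skips the queue. The call also returns an int: a nonzero value means the message was not queued and fn will never run.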
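
To make the enqueue-then-poll behavior concrete, here is a toy message ring. It is a hypothetical model, not the SPDK implementation: toy_thread, toy_thread_send_msg, and toy_thread_poll are invented names, and the real library uses its own ring and message pool. The sketch shows only that sending a message and running it are two separate moments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of per-thread message passing; not SPDK code. */
typedef void (*toy_msg_fn)(void *ctx);

#define TOY_RING_SIZE 8

struct toy_thread {
	toy_msg_fn fn[TOY_RING_SIZE];
	void *ctx[TOY_RING_SIZE];
	unsigned head; /* next slot to run */
	unsigned tail; /* next slot to fill */
};

/* Enqueue only. Returns 0 on success, -1 if the ring is full. */
static int toy_thread_send_msg(struct toy_thread *t, toy_msg_fn fn, void *ctx)
{
	if (t->tail - t->head == TOY_RING_SIZE) {
		return -1;
	}
	t->fn[t->tail % TOY_RING_SIZE] = fn;
	t->ctx[t->tail % TOY_RING_SIZE] = ctx;
	t->tail++;
	return 0;
}

/* Run the messages that were queued before this poll began. Returns how many ran. */
static int toy_thread_poll(struct toy_thread *t)
{
	unsigned stop = t->tail;
	int ran = 0;

	while (t->head != stop) {
		unsigned slot = t->head % TOY_RING_SIZE;

		t->head++;
		t->fn[slot](t->ctx[slot]);
		ran++;
	}
	return ran;
}

/* Example message handler. */
static void toy_increment(void *ctx)
{
	(*(int *)ctx)++;
}
```

If you send toy_increment with a pointer to a counter, the counter is unchanged until toy_thread_poll runs. When tracing real code, the equivalent question is which reactor polls the target thread, and when.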

5. Who completes and who frees?

Every async path needs a terminal event. Find it.

For bdev IO, the terminal event is usually:

spdk_bdev_io_complete(bdev_io, status);

For JSON-RPC, it may be:

spdk_jsonrpc_end_result(request, w);

For subsystem init:

spdk_subsystem_init_next(rc);

For channel iteration:

spdk_for_each_channel_continue(i, status);

When debugging a hang, ask: which required completion was never called?
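
A minimal sketch of why a missing "continue" call hangs an iteration. The iterator is hypothetical (toy_iter and its helpers are not SPDK names), and it runs each step inline on one thread, whereas a real channel walk hops between threads by message. The contract is the same: each step must hand control back, or the terminal callback never fires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy iterator; it only imitates the shape of a for-each-channel walk. */
struct toy_iter;
typedef void (*toy_step_fn)(struct toy_iter *i);
typedef void (*toy_done_fn)(struct toy_iter *i, int status);

struct toy_iter {
	int cur;          /* index of the channel being visited */
	int count;        /* total number of channels */
	toy_step_fn step; /* called once per channel */
	toy_done_fn done; /* terminal callback, called exactly once */
	int visited;
	int finished;
	int final_status;
};

/* Every step must call this, or the walk stops and done never fires. */
static void toy_iter_continue(struct toy_iter *i, int status)
{
	i->cur++;
	if (status != 0 || i->cur >= i->count) {
		i->done(i, status);
		return;
	}
	i->step(i);
}

static void toy_iter_start(struct toy_iter *i)
{
	i->cur = 0;
	if (i->count == 0) {
		i->done(i, 0);
		return;
	}
	i->step(i);
}

static void good_step(struct toy_iter *i)
{
	i->visited++;
	toy_iter_continue(i, 0);
}

/* Bug: returns without continuing, so the iteration never completes. */
static void forgetful_step(struct toy_iter *i)
{
	i->visited++;
}

static void record_done(struct toy_iter *i, int status)
{
	i->finished = 1;
	i->final_status = status;
}
```

With good_step, three channels are visited and record_done runs once. With forgetful_step, one channel is visited and nothing else ever happens: no error, no log line, just an operation that never finishes.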

Why edge cases matter more than happy path

Storage systems spend most of their complexity on edge cases:

device disappears
controller resets
queue is full
CQ has no free slots
allocation returns -ENOMEM
bdev is being removed
descriptor is still open
lvolstore metadata is not loaded yet
duplicate name after restart
reset is draining outstanding IO
channel destruction is deferred
config replay issues an RPC in the wrong phase

A happy-path-only tutorial is actively dangerous because it teaches the wrong shape. SPDK code is built around preserving invariants when these edge cases happen.
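
One of the edge cases above, a full queue reported as -ENOMEM at submission, can be sketched with a toy device. Everything here is hypothetical (toy_queue, toy_submit_or_wait, and the two-slot limit are invented), and the sketch does not use SPDK's own retry mechanisms. It shows the invariant worth preserving: a request that cannot be submitted is parked and retried, never silently dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy device with a fixed number of request slots. */
#define TOY_SLOTS 2

struct toy_queue {
	int in_flight; /* requests submitted but not completed */
	int waiting;   /* requests parked because submission failed */
	int completed;
};

/* Returns 0 if accepted, -ENOMEM if no slot is free. */
static int toy_submit(struct toy_queue *q)
{
	if (q->in_flight == TOY_SLOTS) {
		return -ENOMEM;
	}
	q->in_flight++;
	return 0;
}

/* Submit, or park the request for a later retry instead of dropping it. */
static void toy_submit_or_wait(struct toy_queue *q)
{
	int rc = toy_submit(q);

	if (rc == -ENOMEM) {
		q->waiting++;
		return;
	}
	assert(rc == 0);
}

/* Complete one request, then retry one parked request with the freed slot. */
static void toy_complete_one(struct toy_queue *q)
{
	assert(q->in_flight > 0);
	q->in_flight--;
	q->completed++;
	if (q->waiting > 0) {
		q->waiting--;
		toy_submit_or_wait(q);
	}
}
```

Submitting three requests leaves two in flight and one waiting. Each completion frees a slot and pulls the waiter in, so all three eventually complete.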

The source tour method

When you read a file, do this:

Find the public entry points.
Find the state structs.
Find the registration macro.
Find the callback types.
Find the completion function.
Find error cleanup labels.
Find reset/remove/shutdown paths.
Find tests.
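
For the "error cleanup labels" step, this is the general C shape to look for. It is a generic sketch with hypothetical names (toy_target, toy_target_create) and an artificial fail_at parameter that exists only so each failure branch can be exercised. Labels unwind in the reverse order of acquisition, so reading them bottom-up tells you what the function had acquired at each failure point.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object; the point is the unwind order, not the fields. */
struct toy_target {
	char *name;
	void *buf;
};

/* fail_at forces a failure at step 1 or 2; 0 means no forced failure. */
static int toy_target_create(const char *name, size_t buf_size, int fail_at,
			     struct toy_target **out)
{
	struct toy_target *t;
	int rc;

	t = calloc(1, sizeof(*t));
	if (t == NULL) {
		return -ENOMEM;
	}

	/* Step 1: copy the name. */
	t->name = fail_at == 1 ? NULL : malloc(strlen(name) + 1);
	if (t->name == NULL) {
		rc = -ENOMEM;
		goto err_free_target;
	}
	strcpy(t->name, name);

	/* Step 2: allocate the data buffer. */
	t->buf = fail_at == 2 ? NULL : malloc(buf_size);
	if (t->buf == NULL) {
		rc = -ENOMEM;
		goto err_free_name;
	}

	*out = t;
	return 0;

err_free_name:
	free(t->name);
err_free_target:
	free(t);
	return rc;
}

/* Normal teardown releases resources in the same reverse order. */
static void toy_target_destroy(struct toy_target *t)
{
	free(t->buf);
	free(t->name);
	free(t);
}
```

In an async function the same unwinding is often split across callbacks, which is why the cleanup labels and the terminal callback should be read together.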

For example, in a bdev module:

public RPC handler in *_rpc.c
module registration with SPDK_BDEV_MODULE_REGISTER
bdev function table with submit_request
channel create/destroy
base bdev event callback if virtual
spdk_bdev_io_complete
destruct callback

Labs

Lab 1: classify a function

Open lib/thread/thread.c and find spdk_thread_poll. Answer:

What object does it operate on?
Does it block?
What callbacks can it execute?
What does its return value mean?
What state does it temporarily set?

Lab 2: classify a diskengine RPC wrapper

Open /home/lolwierd/Projects/excloud/diskengine/diskengine/internal/spdkclient/wrappers.go. Pick BdevNvmeAttachController.

What JSON-RPC method is sent?
What does the Go wrapper consider success?
Does success mean the resulting bdev is fully examined?
Where in SPDK is that RPC handler registered?

Lab 3: find the terminal callback

Open module/bdev/null/bdev_null.c. Find the path for a read. Where does the module eventually call spdk_bdev_io_complete? What would happen if it forgot?

Self-check

Why is a callback context object like a manual stack frame?
Why is "who frees this" as important as "what does this do"?
What is the difference between submission success and operation success?
Why can a missing spdk_for_each_channel_continue hang a system?