This is a book-length, first-principles course for understanding SPDK deeply enough to read the C source, debug production failures, and extend the system. It starts below SPDK at NAND, NVMe, PCIe, DMA, VFIO, and hugepages, then climbs through reactors, bdev, lvol, RAID, NVMe-oF, vhost/vfio-user, JSON-RPC, and excloud diskengine.
The current build is intentionally structured for expansion: polished front matter lives under `content/chapters`, long-form draft chapters live under `drafts`, and the blueprint lives at `BOOK_BLUEPRINT.md`.
Part 0: How To Use This Book
Part 1: Storage Hardware From Zero
Chapter 3: What A Block Device Promises
A block device is the lie that makes storage programming possible. It tells software: "Give me a logical block address and a length, and I will read or write that range." It...
Chapter 4: NAND Flash And SSD Internals
SPDK programmers do not usually program NAND directly. They program bdevs, NVMe namespaces, queue pairs, and DMA buffers. But SSD behavior leaks through every abstraction:...
Chapter 5: NVMe SSDs As Queue Machines
NVMe is not "a faster disk command set." It is a queue protocol designed for many-core hosts and parallel SSD controllers. The host and controller communicate mostly through...
Chapter 6: PCIe, MMIO, DMA, IOMMU, VFIO, And Hugepages
SPDK is fast because it puts userspace code close to hardware. That means SPDK applications must understand what the kernel normally hides: PCIe discovery, BAR mapping, MMIO...
Part 2: Why SPDK Exists
Chapter 7: Linux Storage Path vs SPDK Path
A first-principles comparison of the normal Linux storage stack and SPDK's userspace, polling, DMA-first storage path.
Chapter 8: DPDK EAL And SPDK Env
By the end of this chapter, a beginner should be able to explain why an SPDK process starts by initializing an "environment" before it initializes storage subsystems, why that...
Part 3: SPDK Execution Model
Chapter 9: App Startup And Subsystems
By the end of this chapter, a beginner should be able to trace an SPDK event-framework application from `spdk_app_start()` to the user start callback, explain why subsystem...
Chapter 10: Reactors, `spdk_thread`, Messages, Pollers
By the end of this chapter, a beginner should be able to explain the difference between an OS thread, an SPDK reactor, and an `spdk_thread`; trace a message sent by...
Chapter 11: `io_device` And `io_channel`
By the end of this chapter, a beginner should be able to explain why SPDK has `io_device` and `io_channel`, how channels provide per-thread resources, why...
Chapter 12: Memory, Iobuf, Mempools, Zero Copy
By the end of this chapter, a beginner should be able to explain SPDK's memory categories, why DMA-safe allocation is different from ordinary allocation, how mempools and iobuf...
Part 4: bdev, The Central Abstraction
Chapter 13: bdev Object Model
By the end of this chapter you should be able to look at a block device in SPDK and answer five practical questions:
Chapter 14: bdev I/O Path In Detail
By the end of this chapter you should be able to trace a single write from the public bdev API to the module `submit_request()` callback, then trace the completion back to the...
Chapter 15: Writing A bdev Module
By the end of this chapter you should be able to sketch a small bdev module and know where the hard parts are. You will know the difference between a physical bdev module and a...
Chapter 16: QoS, Reset, Remove, Hotplug, Events
By the end of this chapter you should be able to explain the non-happy-path machinery around bdev I/O: rate limits, reset, unregister, hotremove, media-management events, and...
Chapter 17: NVMe Initiator Library
By the end of this chapter you should be able to explain SPDK's NVMe initiator as a queue-machine library. You should be able to find the source paths for probe, connect,...
Chapter 18: NVMe bdev Module
By the end of this chapter you should be able to trace `bdev_nvme_attach_controller` from JSON-RPC to NVMe connect to namespace bdev registration, then trace a read or write...
Part 5: Concrete SPDK Storage Layers
Chapter 19: Blobstore
Blobstore is SPDK's small storage engine for "large named chunks of blocks" called blobs. If a normal block device gives you one flat address range, blobstore turns that range...
Chapter 20: lvol And lvol bdev
lvol is SPDK's logical-volume layer on top of blobstore. A logical volume store, or lvolstore, is a blobstore plus lvol-specific metadata. An lvol is a blob plus lvol-specific...
Chapter 21: RAID And Virtual bdev Stacking
RAID in SPDK is a virtual bdev module. It takes several base bdevs and exposes one logical bdev. The RAID bdev's `submit_request` function maps logical offsets to base-device...
Part 6: Network And VM Transports
Chapter 22: NVMe-oF Target
After this chapter, the reader should be able to explain how SPDK exposes a local bdev as a remote NVMe namespace. They should know the difference between a target, transport,...
Chapter 23: NVMe-oF Initiator Through `bdev_nvme`
This chapter explains how SPDK acts as an NVMe initiator and then exposes connected namespaces as SPDK bdevs. For diskengine, this is the core of baremetal mode: compute-side...
Chapter 24: vhost-blk And QEMU Exposure
This chapter explains how SPDK exposes a bdev to a VM through vhost-blk. By the end, the reader should know what the vhost-user socket represents, how QEMU virtio-blk requests...
Chapter 25: vfio-user NVMe Exposure
This chapter explains vfio-user as a way to expose an emulated NVMe controller through a Unix socket, with guest-visible PCI/NVMe semantics and SPDK bdev-backed storage. The...
Part 7: Control Plane
Chapter 26: JSON-RPC And Configuration
This chapter teaches the SPDK control plane from the point of view of a new operator.
Chapter 27: Config Save, Replay, And `--wait-for-rpc`
How SPDK turns JSON-RPC state into reproducible configuration, why startup can pause for control-plane replay, and what goes wrong in diskengine restore loops.
Chapter 28: Observability And Debug Tools
This chapter teaches how to look inside a running SPDK application without first attaching a debugger.
Part 8: excloud diskengine
Chapter 29: diskengine Storage Node Mode
This chapter explains diskengine storage-node mode as a set of reconciliation loops around SPDK. The reader should understand how local NVMe devices are discovered, bound for...
Chapter 30: diskengine Baremetal Mode
This chapter explains diskengine baremetal mode as the compute-side reconciler. The reader should understand how it attaches remote NVMe-oF lvol namespaces, builds and heals...
Chapter 31: Complete VM Write To SSD Walkthrough
This chapter follows one guest write from a VM down to a physical SSD and then follows completion back up. It ties together vhost-blk, RAID, NVMe-oF initiator, NVMe-oF target,...
Part 9: Debugging And Extension
Chapter 32: Failure Mode Taxonomy
This chapter gives a structured method for debugging SPDK failures.
Chapter 33: Debugging Playbooks
Concrete symptom-driven playbooks for common SPDK and diskengine failures: missing volumes, RAID configure loops, reconnect storms, guest I/O hangs, replay failures, high latency, and NOMEM.
Chapter 34: Extending SPDK Source
This chapter gives practical extension projects for learning SPDK by changing it in small, controlled ways.
Deep Dive: Reading SPDK C Source
This chapter gives you a practical method for reading SPDK C source as a beginner.
Chapter 35: Build, Test, Run, Contribute
This chapter teaches how to build SPDK, choose the right tests, and prepare a contribution.
Appendix: Reader Tools
Appendix: Blobstore, lvol, RAID Deep Dives
Chapter: bdev_examine And Virtual bdev Stacking
SPDK bdev modules do not only create physical bdevs. Many modules create virtual bdevs on top of other bdevs. A virtual bdev module needs a way to notice that a base bdev...
Chapter: Snapshots, Clones, Resize, Delete, And Stacking Edge Cases
This chapter is the "what breaks in production" companion to the blobstore, lvol, bdev examine, and RAID chapters. It focuses on operations that appear simple from RPC names...