Storage Primitives
Data has gravity. Moving it is expensive. Understanding the physical medium—spinning rust vs. flash chips—determines your database choices.
The Storage Hierarchy
Sequential vs Random Access
Sequential
Reading blocks of data one after another, in the order they are laid out on disk. No seeking.
- ✓ Predictable pre-fetching
- ✓ Extremely fast on both HDD & SSD
- ✓ Used by Kafka logs, LSM Trees
Random
Jumping to arbitrary locations on disk. On an HDD, every jump requires a seek.
- ⚠ Slow on HDD (physical arm movement)
- ~ Okay on SSD (but still slower than sequential access)
- ⚠ Used by B-Trees, Hash Indexes
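Rule of Thumb: Access pattern can matter as much as the medium. Sequential throughput on an HDD can approach or exceed small-block random throughput on some SSDs, so lay data out for sequential access where possible.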
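The difference between the two access patterns can be measured with a small experiment. The sketch below is an illustration, not a rigorous benchmark: the helper names `read_blocks` and `compare_access_patterns` are hypothetical, and the 4096-byte block size is an assumption. It reads the same set of blocks once in file order and once in shuffled order, and reports bytes read and elapsed seconds for each.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096  # assumed block size; real devices and file systems vary


def read_blocks(path, offsets):
    """Read one block at each offset and return the total bytes read."""
    total = 0
    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)
            total += len(f.read(BLOCK_SIZE))
    return total


def compare_access_patterns(num_blocks=2048, seed=0):
    """Time identical block reads in sequential order and in shuffled order."""
    # Create a scratch file of num_blocks blocks filled with random bytes.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(BLOCK_SIZE * num_blocks))
        path = tmp.name
    try:
        sequential = [i * BLOCK_SIZE for i in range(num_blocks)]
        shuffled = sequential[:]
        random.Random(seed).shuffle(shuffled)

        results = {}
        for name, offsets in (("sequential", sequential), ("random", shuffled)):
            start = time.perf_counter()
            total = read_blocks(path, offsets)
            results[name] = (total, time.perf_counter() - start)
        return results
    finally:
        os.remove(path)


if __name__ == "__main__":
    for name, (total, seconds) in compare_access_patterns().items():
        print(f"{name}: {total} bytes in {seconds:.4f}s")
```

A freshly written small file is usually served from the operating system's page cache, so both runs may look similar. Seeing the device's real behavior generally needs a file larger than RAM or a tool that bypasses the cache.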
HDD vs SSD Mechanics
HDD (Hard Disk Drive)
Spinning magnetic platters with a mechanical arm.
- Seek Time: Time to move the arm to the target track. ~10ms.
- Rotational Latency: Waiting for the sector to spin under the head. On average half a revolution, about 4ms at 7200 RPM.
- Best for: Archival data, Sequential logs.
SSD (Solid State Drive)
NAND Flash memory. No moving parts.
- IOPS: Can handle 100x or more the random IOPS of an HDD.
- Wear Leveling: Flash cells degrade after a limited number of writes, so the controller spreads writes evenly across cells.
- Best for: Databases, Boot drives, Application servers.
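The HDD figures above translate directly into a ceiling on random IOPS. The following is a back-of-the-envelope sketch: the function names are hypothetical, the 7200 RPM spindle speed is an assumed typical value, and transfer time and queuing are ignored.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in ms."""
    return 0.5 * 60_000 / rpm


def hdd_random_iops(seek_ms, rpm):
    """Approximate random IOPS, ignoring transfer time and queuing."""
    return 1000 / (seek_ms + rotational_latency_ms(rpm))


if __name__ == "__main__":
    # Assumed drive: ~10 ms seek (from the notes above) at 7200 RPM.
    print(f"{rotational_latency_ms(7200):.2f} ms")   # about 4.17 ms
    print(f"{hdd_random_iops(10, 7200):.0f} IOPS")   # about 71 IOPS
```

With each random read costing roughly 14ms of mechanical delay, a single drive manages only tens of random operations per second. That is why the gap to an SSD, which has no arm or platter to wait for, is so large.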
Durability & WAL
Write-Ahead Logging (WAL)
Writing to disk is slow. Writing to memory is volatile. How do databases survive a crash without being slow?
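Append every change to a sequential log on disk before applying it in memory, and acknowledge the write only once the log entry is durable. If a crash occurs, replay the WAL to reconstruct the in-memory state.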
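The log-then-apply idea can be sketched as a toy key-value store. This is a minimal illustration under stated assumptions, not how any particular database implements its WAL: the `WalStore` class and its one-JSON-record-per-line format are hypothetical, and real systems add checksums, log truncation, and batching of fsync calls.

```python
import json
import os


class WalStore:
    """Toy key-value store: log each write to disk, then update memory."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._replay()
        self.log = open(path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild in-memory state from the log, dropping a torn last record."""
        if not os.path.exists(self.path):
            return
        valid_bytes = 0
        with open(self.path, "rb") as f:
            for line in f:
                if not line.endswith(b"\n"):
                    break  # incomplete record from a crash mid-write
                record = json.loads(line)
                self.data[record["key"]] = record["value"]
                valid_bytes += len(line)
        # Cut off any torn tail so new records append to a clean log.
        os.truncate(self.path, valid_bytes)

    def put(self, key, value):
        # 1. Append to the log and force it to stable storage.
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Only then apply the change to volatile memory.
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def close(self):
        self.log.close()
```

Creating a new `WalStore` on the same path after a crash replays the log and recovers every acknowledged write. The log is only ever appended to, so the durable step is a sequential write rather than a random one.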
B-Trees vs LSM Trees
B-Trees
Standard in Postgres, MySQL.
- Optimized for: Read-heavy workloads.
- Cons: A lookup can cost a disk seek for each uncached page on the path to the leaf. Writes are random I/O.
- "Updates in-place. Highly consistent but slow for massive writes."
LSM Trees
Used by Cassandra, RocksDB, LevelDB.
- Optimized for: Write-heavy workloads.
- Cons: Compaction overhead. Slower reads, since a lookup may have to check several segments.
- "Append-only. Leverages sequential writes for speed."
Architect's Summary: Use B-Trees for predictable read latency and read-heavy workloads (Postgres). Use LSM Trees for extreme write volume (Cassandra).
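The LSM trade-off (cheap writes, costlier reads, periodic compaction) shows up even in a toy model. The sketch below is a simplification under stated assumptions: `TinyLSM` is a hypothetical class, in-memory sorted lists stand in for on-disk segment files, and deletes are not supported.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable memtable plus immutable sorted segments."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.segments = []  # oldest first; each is a sorted list of (key, value)

    def put(self, key, value):
        # Writes only touch memory; segments are never modified in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one new sorted segment."""
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads check the memtable, then segments from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None

    def compact(self):
        """Merge all segments into one, keeping the newest value per key."""
        merged = {}
        for segment in self.segments:  # oldest first, so newer values win
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []
```

A `get` may scan several segments before finding a key, which is the read penalty listed above. `compact` reduces that cost again by rewriting data, which is the compaction overhead. A B-Tree avoids both by updating a page in place, at the price of random writes.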