Storage Primitives

Data has gravity. Moving it is expensive. Understanding the physical medium—spinning rust vs. flash chips—determines your database choices.

The Storage Hierarchy

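  • Registers: < 1 ns latency, bytes of capacity
  • L1/L2 Cache: ~10 ns latency, KB to MB of capacity
  • Main Memory (RAM): ~100 ns latency, GB of capacity
  • SSD (Flash): ~100 µs latency, TB of capacity
  • HDD (Magnetic): ~10 ms latency, PB of capacity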
"Faster storage is smaller and more expensive. Slower storage is massive and cheap. Caching is just moving data up this pyramid."

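The caching idea can be put into numbers. The sketch below is a rough model, not a benchmark: it reuses the order-of-magnitude latencies from the hierarchy, and the function name `effective_latency_ns` and the hit ratios are illustrative assumptions.

```python
# Latencies in nanoseconds, taken from the hierarchy above (orders of magnitude only).
LATENCY_NS = {
    "cache": 10,
    "ram": 100,
    "ssd": 100_000,      # ~100 µs
    "hdd": 10_000_000,   # ~10 ms
}


def effective_latency_ns(hit_ratio: float, fast: str, slow: str) -> float:
    """Average access time when `hit_ratio` of requests are served by the
    `fast` tier and the remaining misses fall through to the `slow` tier."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * LATENCY_NS[fast] + (1.0 - hit_ratio) * LATENCY_NS[slow]


if __name__ == "__main__":
    # Hypothetical hit ratios for a RAM cache in front of an SSD.
    for ratio in (0.0, 0.9, 0.99):
        ns = effective_latency_ns(ratio, "ram", "ssd")
        print(f"hit ratio {ratio:.2f}: ~{ns / 1000:.1f} µs per access")
```

Because the slow tier is about 1000x slower, the misses dominate the average: even a small miss rate keeps the effective latency far above the fast tier's.
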
Sequential vs Random Access

Sequential

Reading adjacent blocks one after another. No seeking between reads.

  • Predictable pre-fetching
  • Extremely fast on both HDD & SSD
  • Used by Kafka logs, LSM Trees

Random

Jumping to arbitrary locations on disk. Requires seeking.

  • Slow on HDD (physical arm movement)
  • Acceptable on SSD (no mechanical seek, but still slower than sequential)
  • Used by B-Trees, Hash Indexes

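Rule of Thumb: Access pattern matters as much as the medium. Sequential reads on an HDD can outpace small random reads on an SSD issued one at a time, although modern SSDs close much of that gap by serving many requests in parallel.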
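A back-of-the-envelope model makes the comparison concrete. Everything here is an assumption for illustration: the device figures (10 ms and 150 MB/s for the HDD, 0.1 ms and 500 MB/s for the SSD), the 4 KB request size, and the restriction to one outstanding request at a time. Real SSDs serve many requests in parallel, which this sketch ignores.

```python
import math


def read_time_ms(total_bytes, request_bytes, latency_ms, bandwidth_mb_s, sequential):
    """Estimated time to read `total_bytes`, one request at a time (queue depth 1)."""
    transfer_ms = total_bytes / (bandwidth_mb_s * 1_000_000) * 1000
    # Sequential: pay the positioning latency once. Random: pay it on every request.
    requests = 1 if sequential else math.ceil(total_bytes / request_bytes)
    return requests * latency_ms + transfer_ms


# Assumed device figures (illustrative, not measurements).
HDD = {"latency_ms": 10.0, "bandwidth_mb_s": 150}
SSD = {"latency_ms": 0.1, "bandwidth_mb_s": 500}


def estimate(device, sequential, total_bytes=100_000_000, request_bytes=4096):
    """Estimated milliseconds to read 100 MB in 4 KB requests from `device`."""
    return read_time_ms(total_bytes, request_bytes, device["latency_ms"],
                        device["bandwidth_mb_s"], sequential)


if __name__ == "__main__":
    for name, device in (("HDD", HDD), ("SSD", SSD)):
        for sequential in (True, False):
            pattern = "sequential" if sequential else "random"
            print(f"{name} {pattern}: ~{estimate(device, sequential):,.0f} ms")
```

Under these assumptions the sequential HDD read finishes ahead of the random SSD read, and random access on the HDD is slower than everything else by orders of magnitude.
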

HDD vs SSD Mechanics

HDD (Hard Disk Drive)

Spinning magnetic platters with a mechanical arm.

  • Seek Time: Moving the arm (latency). ~10ms.
  • Rotational Latency: Waiting for the sector to spin under the head.
  • Best for: Archival data, Sequential logs.

SSD (Solid State Drive)

NAND Flash memory. No moving parts.

  • IOPS: Handles roughly 100x or more the random IOPS of an HDD.
  • Wear Leveling: Flash cells survive a limited number of write/erase cycles, so the controller spreads writes evenly across cells.
  • Best for: Databases, Boot drives, Application servers.
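The HDD numbers follow from simple mechanics. As a worked example, assuming a 7200 RPM drive (the spindle speed is an assumption, not stated above) and the ~10 ms seek time quoted earlier, the average rotational latency is half a revolution, and the random IOPS ceiling is the inverse of seek plus rotation. Transfer time is ignored.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """On average the platter must turn half a revolution before the
    requested sector arrives under the head."""
    ms_per_revolution = 60_000 / rpm
    return 0.5 * ms_per_revolution


def hdd_random_iops(seek_ms: float, rpm: int) -> float:
    """Upper bound on random operations per second for a single drive."""
    per_operation_ms = seek_ms + avg_rotational_latency_ms(rpm)
    return 1000 / per_operation_ms


if __name__ == "__main__":
    rpm = 7200  # assumed spindle speed
    print(f"rotational latency: {avg_rotational_latency_ms(rpm):.2f} ms")
    print(f"random IOPS ceiling: {hdd_random_iops(10.0, rpm):.0f}")
```

A ceiling of well under a hundred random operations per second is why an SSD with no mechanical positioning can be orders of magnitude ahead on random workloads.
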

Durability & WAL

Write-Ahead Logging (WAL)

Writing to disk is slow. Writing to memory is volatile. How do databases survive a crash without being slow?

1. Append to Log: Sequential write to disk. Fast. (The WAL)
2. Update Memory: Modify the in-memory structure (B-Tree/Memtable).
3. Async Flush: Later, write memory to disk data files.

If a crash occurs, replay the WAL to reconstruct the in-memory state.
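A minimal sketch of steps 1 and 2 plus crash replay. `TinyKV`, the JSON-lines log format, and the file name are hypothetical choices for illustration, not how any particular database lays out its log. Step 3 (flushing to data files and trimming the log) is left out.

```python
import json
import os
import tempfile


class TinyKV:
    """Hypothetical in-memory key-value store protected by a write-ahead log."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.data = {}
        self._replay()
        self.wal = open(wal_path, "ab")

    def _replay(self):
        """Rebuild the in-memory state from the log after a restart."""
        if not os.path.exists(self.wal_path):
            return
        valid_bytes = 0
        with open(self.wal_path, "rb") as f:
            for raw in f:
                if not raw.endswith(b"\n"):
                    break  # torn final record from a crash mid-write
                try:
                    record = json.loads(raw)
                except ValueError:
                    break  # corrupt record: stop replaying here
                self.data[record["key"]] = record["value"]
                valid_bytes += len(raw)
        # Drop any torn tail so new appends start on a clean record boundary.
        os.truncate(self.wal_path, valid_bytes)

    def put(self, key, value):
        # Step 1: append to the log and force it to stable storage.
        line = json.dumps({"key": key, "value": value}) + "\n"
        self.wal.write(line.encode("utf-8"))
        self.wal.flush()
        os.fsync(self.wal.fileno())
        # Step 2: only now update the in-memory structure.
        self.data[key] = value

    def close(self):
        self.wal.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    store = TinyKV(path)
    store.put("user:1", "alice")
    store.close()
    # A fresh instance stands in for the process restarting after a crash.
    print(TinyKV(path).data)
```

Calling `os.fsync` on every write is the slow but safe choice. Batching several records per sync is a common way to trade a little durability for throughput.
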

B-Trees vs LSM Trees

B-Trees

Standard in Postgres, MySQL.

  • Optimized for: Read-heavy workloads.
  • Cons: A lookup may need one disk read per tree level that is not cached. Updates are random writes.
  • "Updates in place. Predictable read performance, but slow under massive write volume."
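How many page reads a lookup costs depends on the tree's height. A rough calculation, assuming every page holds the same number of entries (the fanout of 100 is an illustrative assumption; real page layouts vary):

```python
def btree_levels(num_keys: int, fanout: int) -> int:
    """Levels needed to index `num_keys` when every page holds `fanout` entries.
    Each level is one page read on an uncached lookup."""
    if num_keys < 1 or fanout < 2:
        raise ValueError("need at least one key and a fanout of 2 or more")
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    for keys in (1_000, 1_000_000, 1_000_000_000):
        print(f"{keys:>13,} keys -> {btree_levels(keys, 100)} levels")
```

The height grows logarithmically, so even very large tables stay a handful of levels deep. In practice the upper levels are often cached in memory, leaving only the last level or two to hit disk.
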

LSM Trees

Cassandra, RocksDB, LevelDB.

  • Optimized for: Write-heavy workloads.
  • Cons: Background compaction overhead. Reads may have to check several files, so they are slower.
  • "Append-only. Leverages sequential writes for speed."
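The trade-off can be seen in a toy model. `TinyLSM` is a hypothetical in-memory sketch: Python lists stand in for the sorted on-disk files, and deletes, the write-ahead log, and leveled compaction are omitted.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # newest first; each run is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch memory; nothing on "disk" is modified in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as one new sorted run (a sequential write)."""
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Reads may have to probe every run, newest to oldest.
        for run in self.runs:
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        """Merge all runs into one, keeping only the newest value per key."""
        merged = {}
        for run in reversed(self.runs):  # oldest first, so newer values win
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


if __name__ == "__main__":
    db = TinyLSM(memtable_limit=2)
    for key, value in (("a", 1), ("b", 2), ("a", 3), ("c", 4)):
        db.put(key, value)
    print(len(db.runs), db.get("a"))
    db.compact()
    print(len(db.runs), db.get("a"))
```

Writes never modify an existing run, which is what keeps them sequential. The cost shows up in `get`, which may probe several runs until compaction merges them.
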

Architect's Summary: Use B-Trees for stability and read speed (Postgres). Use LSM Trees for extreme write volume (Cassandra).