
Design a URL Shortener

Think tinyurl.com or bit.ly. It seems simple: map a long string to a short string. But how do you generate 100 billion unique short strings without collisions?

Requirements

Functional

  1. Given a Long URL, return a unique Short URL.
  2. Clicking the Short URL redirects to the Long URL.
  3. Short URLs should be as short as possible.
  4. Optional: custom aliases (e.g., bit.ly/my-cool-link).

Non-Functional

  1. High Availability: redirects must not fail.
  2. Low Latency: redirection must be instant.
  3. Read Heavy: 100:1 read/write ratio.

API Design

POST /api/v1/shorten
Request: { "longUrl": "https://google.com" }
Response: { "shortUrl": "https://tiny.url/AbCd12" }
GET /api/v1/{short_alias}
Response: 301 Redirect to Long URL

301 vs 302 Redirect?
301 (Permanent) = Browser caches it. Less load on server, but no analytics.
302 (Temporary) = Always hits server. Good for analytics, higher load.

Database Schema


We need to store billions of records. No relationships needed. NoSQL (DynamoDB, Cassandra) or Key-Value (Riak) is best.

UrlMappingTable

  short_key    varchar(7)      (PK)
  long_url     varchar(2048)
  created_at   timestamp
  user_id      uuid            (optional)

Hash Function vs ID Generation

Approach 1: Hashing

MD5(long_url) = 206b...837 (128-bit). Take first 7 chars?

Problem: Collisions

If two users shorten the same URL, they get the same hash (sometimes acceptable, but not if each user needs their own link). Worse, different URLs can collide once you truncate the digest to 7 characters.
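A quick birthday-bound estimate makes the truncation problem concrete (the approximation formula is standard; the key counts are assumptions for illustration):

```python
import math

# Birthday-bound approximation: probability of at least one collision
# when n random keys are drawn from a space of size s.
def collision_probability(n: int, s: int) -> float:
    return 1 - math.exp(-n * n / (2 * s))

hex7 = 16 ** 7      # 7 hex chars of an MD5 digest: ~268M values
b62_7 = 62 ** 7     # 7 Base62 chars: ~3.5 trillion values

# With just 1M URLs, truncating to 7 hex chars collides almost surely,
# and even 7 Base62 chars of a hash carry a double-digit collision risk.
print(collision_probability(1_000_000, hex7))
print(collision_probability(1_000_000, b62_7))
```

This is why truncated hashing needs a collision-resolution path (probe, re-salt, retry), while the ID-based approach below sidesteps collisions entirely.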

Approach 2: Unique ID + Base62

Generate a unique integer ID (1, 2, ... 100000). Convert to Base62 ([a-z], [A-Z], [0-9]).

ID: 100 → "1C"
ID: 1000000000 → "15FTGg"
Why it works:

Guaranteed unique (no collisions to resolve), and Base62 yields the shortest possible string for a given ID.
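The Base62 conversion is a few lines; this sketch uses the digit order 0-9, a-z, A-Z, which matches the examples above:

```python
# Base62 alphabet: 0-9 (values 0-9), a-z (10-35), A-Z (36-61).
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    """Convert a non-negative integer ID into its Base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)       # peel off least-significant digit
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most-significant digit first

print(base62_encode(100))           # → "1C"
print(base62_encode(1_000_000_000))  # → "15FTGg"
```

Note the alphabet order is a convention: a different ordering produces different strings for the same ID, so it must be fixed once and never changed.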

Generating Unique IDs


Distributed ID Generator

A single database `AUTO_INCREMENT` won't scale. We need a distributed way to get unique numbers.

Snowflake

Twitter's algorithm. Uses timestamp + worker ID + sequence. Fast & sortable.
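A minimal sketch of a Snowflake-style generator, using Twitter's published bit layout (41-bit timestamp, 10-bit worker ID, 12-bit sequence); the custom epoch and class names are assumptions:

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # custom epoch (assumption, chosen arbitrarily)

class Snowflake:
    """Snowflake-style ID: 41 bits of ms since epoch | 10-bit worker | 12-bit seq."""

    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024          # worker ID must fit in 10 bits
        self.worker_id = worker_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000) - EPOCH_MS
            if now == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF   # 12-bit sequence counter
            else:                                    # new millisecond:
                self.seq = 0                         # reset the sequence
                self.last_ms = now
            return (now << 22) | (self.worker_id << 12) | self.seq
```

Production implementations also wait for the next millisecond when the 12-bit sequence overflows and guard against clocks moving backwards; both are omitted here for brevity.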

Key Generation Service (KGS)

Pre-generate 1M keys. Load them into memory. Hand them out. Fast but complex state.
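The KGS idea can be sketched as a pre-built key pool (the deterministic enumeration and the class name are illustrative; a real KGS generates keys offline and persists used/unused sets so keys survive restarts):

```python
import itertools

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def pregenerate(count: int, length: int = 7) -> list[str]:
    # Deterministic enumeration for the sketch; a real KGS would generate
    # and shuffle keys offline, then persist them.
    combos = itertools.product(ALPHABET, repeat=length)
    return ["".join(c) for c in itertools.islice(combos, count)]

class KGS:
    """Hands out pre-generated keys; removing a key marks it as used."""

    def __init__(self, pool: list[str]):
        self.unused = list(pool)

    def get_key(self) -> str:
        return self.unused.pop()

kgs = KGS(pregenerate(1000))
```

The "complex state" cost shows up exactly here: the used/unused split must be durable and safe under concurrent hand-outs, which the in-memory list above does not provide.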

DB Range

Server 1 takes [1 – 1M]. Server 2 takes [1M+1 – 2M]. Simple.
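The range approach reduces to a single central counter that hands each app server an exclusive block; this sketch uses an in-memory counter standing in for a database row, and all names are illustrative:

```python
BLOCK = 1_000_000  # IDs handed out per claim

class CentralCounter:
    """Stand-in for a DB row incremented atomically on each claim."""

    def __init__(self):
        self.next_start = 1

    def claim_block(self) -> range:
        start = self.next_start
        self.next_start += BLOCK
        return range(start, start + BLOCK)  # half-open: [start, start + 1M)

counter = CentralCounter()
server1 = counter.claim_block()  # IDs 1 .. 1,000,000
server2 = counter.claim_block()  # IDs 1,000,001 .. 2,000,000
```

Each server then increments locally within its block, only returning to the central counter when the block is exhausted, so contention on the shared counter is one claim per million IDs.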

Scaling Reads & Writes

Since reads are 100x writes, we prioritize read speed.

User → Cache (Redis, LRU eviction: ~20% of URLs serve ~80% of traffic) → Database (persistent store)

The server checks the cache first; on a miss, it reads the database and populates the cache. Cached reads return in ~1 ms.