
Back of the Envelope Estimation

As Jeff Dean puts it, back-of-the-envelope calculations are "estimates you create using a combination of thought experiments and common performance numbers to get a good feel for which designs will meet your requirements."

Power of two

To get the calculations right, it is important to know the data volume units expressed as powers of 2.

| Power | Approximate Value | Full Name | Short Name |
| --- | --- | --- | --- |
| 2^10 | 1 thousand | 1 kilobyte | 1 KB |
| 2^20 | 1 million | 1 megabyte | 1 MB |
| 2^30 | 1 billion | 1 gigabyte | 1 GB |
| 2^40 | 1 trillion | 1 terabyte | 1 TB |
| 2^50 | 1 quadrillion | 1 petabyte | 1 PB |
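Because each unit is a power of two, the table is easy to sanity-check in a few lines. A minimal sketch in Python (the dictionary and its labels are illustrative, not part of any API):

```python
# Each unit is 2^(10*n) bytes; the exact value drifts further from the
# "1 thousand / 1 million / ..." decimal approximation as n grows.
units = {"KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40, "PB": 2**50}

for name, value in units.items():
    print(f"1 {name} = {value:,} bytes")
# 1 KB = 1,024 bytes ... 1 PB = 1,125,899,906,842,624 bytes
```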

Latency numbers

Approximate latency numbers (based on modern systems):

| Operation | Latency | Notes |
| --- | --- | --- |
| L1 cache reference | ~1 ns | Fastest memory access |
| Branch mispredict | ~5 ns | CPU pipeline stall |
| L2 cache reference | ~7 ns | Still very fast |
| Mutex lock/unlock | ~25 ns | Synchronization overhead |
| Main memory reference | ~100 ns | 100x slower than L1 |
| Compress 1 KB with Zippy | ~3 μs (3,000 ns) | Snappy compression |
| Send 2 KB over 1 Gbps network | ~20 μs (20,000 ns) | Network bandwidth limit |
| Read 1 MB sequentially from network | ~10 ms | Varies with network quality |
| Read 1 MB sequentially from disk | ~1 ms (SSD) to ~20 ms (HDD) | SSD is ~20x faster |

Key takeaways for distributed systems:

  • Memory hierarchy matters: L1 -> L2 -> RAM shows 100x jumps in latency
  • Network is expensive: even sending 2 KB over a fast network (~20 μs) is ~200x slower than a main memory reference (~100 ns)
  • Sequential disk reads: SSDs make a huge difference (20x improvement over HDDs)
  • Cache locality: Keeping data in L1/L2 cache can dramatically improve performance
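The takeaways above can be checked directly against the table. A small sketch, assuming the approximate numbers from the table (the dictionary keys and the `slowdown` helper are illustrative names, not from the original):

```python
# Approximate latencies in nanoseconds, taken from the table above.
LATENCY_NS = {
    "l1_cache_ref": 1,
    "l2_cache_ref": 7,
    "main_memory_ref": 100,
    "send_2kb_network": 20_000,
    "read_1mb_ssd": 1_000_000,
    "read_1mb_network": 10_000_000,
}

def slowdown(op: str, baseline: str) -> float:
    """How many times slower `op` is than `baseline`."""
    return LATENCY_NS[op] / LATENCY_NS[baseline]

print(slowdown("main_memory_ref", "l1_cache_ref"))      # 100.0: RAM vs L1
print(slowdown("send_2kb_network", "main_memory_ref"))  # 200.0: network send vs RAM
```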

Availability numbers

High availability is the ability of a system to remain continuously operational for a desirably long period of time. It is usually measured in nines: the more nines, the better.

| Availability % | Downtime per day | Downtime per year |
| --- | --- | --- |
| 99% (2 nines) | 14.40 minutes | 3.65 days |
| 99.9% (3 nines) | 1.44 minutes | 8.77 hours |
| 99.99% (4 nines) | 8.64 seconds | 52.60 minutes |
| 99.999% (5 nines) | 864.00 milliseconds | 5.26 minutes |
| 99.9999% (6 nines) | 86.40 milliseconds | 31.56 seconds |
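The downtime figures follow directly from the availability percentage. A minimal sketch, assuming a 365-day year (the annual figures in the table appear to use 365.25 days, so they differ slightly):

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # 365-day year assumed

def downtime_seconds(availability_pct: float) -> tuple[float, float]:
    """Return (downtime per day, downtime per year) in seconds."""
    unavailable = 1 - availability_pct / 100
    return unavailable * SECONDS_PER_DAY, unavailable * SECONDS_PER_YEAR

day, year = downtime_seconds(99.99)  # four nines
print(f"{day:.2f} s/day, {year / 60:.2f} min/year")  # 8.64 s/day, 52.56 min/year
```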

Example

Design Twitter’s QPS and storage system

  1. Make assumptions
  • 500M daily active users (DAU)
  • Each user posts 2 tweets per day on average
  • Average tweet size: 300 characters
  • 20% of users post, 80% just read
  • 10% of tweets have media (images/videos)
  • Data is stored for 5 years
  2. Calculate tweets per day
  • Active posters: 500M * 20% = 100M users
  • Tweets per day = 100M * 2 = 200M tweets/day
  • Tweet QPS: 200M / (24 hours × 3600 s) = ~2300
  • Peak QPS: 3 × QPS = ~7000
  3. Calculate tweet storage
  • Metadata per tweet
    • User ID: 8 bytes
    • Tweet ID: 8 bytes
    • Timestamp: 8 bytes
    • Likes/retweets counts: 8 bytes
    • Other metadata: ~32 bytes
    • Total: ~64 bytes
  • Tweet
    • 300 characters × ~2 bytes per character (UTF-8 average) = 600 bytes
  • Total: (64 + 600) * 200M = 132.8 GB/day
  4. Calculate media storage
  • Tweets with media: 200M * 10% = 20M/day
  • Average media size: 200 KB (compressed image)
  • Media storage: 20M * 200 KB = 4 TB/day
  5. Total daily storage
  • 132.8 GB/day + 4 TB/day = 4.13 TB/day
  6. Calculate 5-year storage
  • Storage: 4.13 TB/day × 365 days × 5 years = ~7.5 PB
  • Add 30% for replication/backups: total ≈ 10 PB
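The whole worked example can be reproduced in a few lines. A sketch of the arithmetic above (variable names are illustrative; like the example itself, it uses decimal units, 1 TB = 10^12 bytes):

```python
# Step 1: assumptions
DAU = 500_000_000            # daily active users
TWEETS_PER_POSTER = 2
POSTER_RATIO = 0.20          # 20% post, 80% just read
MEDIA_RATIO = 0.10           # 10% of tweets carry media
TWEET_BYTES = 64 + 600       # metadata + 300 chars * ~2 bytes
MEDIA_BYTES = 200_000        # ~200 KB compressed image
PEAK_FACTOR = 3
YEARS = 5
REPLICATION = 1.3            # +30% for replication/backups

# Steps 2-3: write rate
tweets_per_day = DAU * POSTER_RATIO * TWEETS_PER_POSTER     # 200M
qps = tweets_per_day / (24 * 3600)                          # ~2300
peak_qps = PEAK_FACTOR * qps                                # ~7000

# Steps 4-6: storage
text_per_day = tweets_per_day * TWEET_BYTES                 # ~132.8 GB
media_per_day = tweets_per_day * MEDIA_RATIO * MEDIA_BYTES  # ~4 TB
daily_bytes = text_per_day + media_per_day                  # ~4.13 TB
total_pb = daily_bytes * 365 * YEARS * REPLICATION / 1e15   # ~10 PB

print(f"QPS ~{qps:.0f}, peak ~{peak_qps:.0f}, 5-year storage ~{total_pb:.1f} PB")
```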