Cassandra
A distributed log that happens to look like a database. Cassandra trades query flexibility for always-on availability, massive write throughput, and linear scalability across datacenters.
Architecture & The Ring
Masterless ring topology, gossip protocol, coordinators, vnodes, and snitches — the foundation that makes Cassandra different.
Replication & Consistency
Replication factor, strategies, tunable consistency levels, and the R + W > RF formula.
Data Model & CQL
Partition keys, clustering columns, CQL vs SQL, and the query-first design philosophy.
Write & Read Path
Commit log, memtable, SSTables, bloom filters, and why writes are fast but reads are complex.
Compaction & Operations
STCS, LCS, TWCS, tombstones, gc_grace_seconds, nodetool, and cluster management.
Multi-DC & Performance
Multi-datacenter deployments, LOCAL_QUORUM, active-active writes, and production tuning.
Security, Backup & ScyllaDB
Authentication, encryption, snapshots, disaster recovery, and when to consider ScyllaDB instead.
Why Cassandra?
Cassandra builds on ideas from two foundational papers: Amazon's Dynamo (partitioning, replication) and Google's Bigtable (data model, SSTables). The result is a system where a write succeeds as long as enough replicas are reachable for the chosen consistency level, reads are fast for access patterns the tables were designed around, and there is no single point of failure.
- ✓ Masterless architecture: no single point of failure; every node plays the same role.
- ✓ Linear scalability: adding nodes increases throughput roughly in proportion.
- ✓ Tunable consistency: choose the trade-off between availability and consistency per operation.
- ✓ Multi-datacenter native: active-active writes across regions with LOCAL_QUORUM.
- ✓ Write-optimized: the commit log + memtable path keeps disk writes sequential, so a large cluster can sustain millions of writes per second.
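The tunable-consistency trade-off comes down to the R + W > RF rule mentioned above: if the number of replicas that acknowledge a write (W) plus the number consulted on a read (R) exceeds the replication factor (RF), every read set overlaps every write set. The following is a minimal sketch of that arithmetic only. The function names `quorum` and `reads_overlap_writes` are hypothetical helpers written for illustration, not part of Cassandra or any driver, and the sketch ignores failure handling, read repair, and per-datacenter levels such as LOCAL_QUORUM.

```python
def quorum(rf: int) -> int:
    """Replicas required for QUORUM: a strict majority of rf."""
    return rf // 2 + 1


def reads_overlap_writes(r: int, w: int, rf: int) -> bool:
    """Return True when R + W > RF, i.e. any read set shares at least
    one replica with any write set (hypothetical helper)."""
    if not (1 <= r <= rf and 1 <= w <= rf):
        raise ValueError("r and w must each be between 1 and rf")
    return r + w > rf


if __name__ == "__main__":
    rf = 3  # assumed replication factor for the illustration
    levels = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}

    # Print every write/read combination and whether the sets must overlap.
    for write_name, w in levels.items():
        for read_name, r in levels.items():
            verdict = "overlap" if reads_overlap_writes(r, w, rf) else "may be stale"
            print(f"write={write_name:<6} read={read_name:<6} -> {verdict}")
```

With RF = 3, QUORUM is 2 replicas, so QUORUM writes plus QUORUM reads satisfy the rule (2 + 2 > 3), while ONE plus ONE does not (1 + 1 ≤ 3) and favors availability and latency instead.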