SQL/NoSQL Public Diary: Memtable and SSTable in Cassandra

What are Memtable and SSTable in Cassandra?

Cassandra processes data at several stages on the write path, starting with the immediate logging of a write and ending with a write of data to disk:

1. Logging data in the commit log
2. Writing data to the memtable
3. Flushing data from the memtable
4. Storing data on disk in SSTables
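
The four stages above can be sketched as a toy model in Python. This is a minimal illustration, not Cassandra's implementation: the ToyNode class, its flush_threshold parameter (counted in partitions rather than bytes), and the in-memory lists standing in for files are all assumptions made for the example.

```python
class ToyNode:
    """Toy model of the four write-path stages; not Cassandra's actual code."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical limit, in partitions
        self.commit_log = []   # stage 1: append-only record of every write
        self.memtable = {}     # stage 2: in-memory partitions, looked up by key
        self.sstables = []     # stage 4: immutable, sorted "files"

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. log the write first
        self.memtable[key] = value             # 2. then update the memtable
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                       # 3. flush once the limit is reached

    def flush(self):
        # 4. write the partitions out in sorted key order as an immutable table
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log = []  # flushed data no longer needs the log


node = ToyNode(flush_threshold=3)
for key, value in [("k3", "c"), ("k1", "a"), ("k2", "b"), ("k4", "d")]:
    node.write(key, value)
```

After these four writes, the first three keys sit in one sorted SSTable, while the fourth is still held only in the memtable and the commit log.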

Memtable:

When a write occurs, Cassandra stores the data in a memory structure called a memtable, and to provide configurable durability, it also appends the write to the commit log on disk. The commit log receives every write made to a Cassandra node, and these durable writes survive even if power fails on a node. The memtable is a write-back cache of data partitions that Cassandra looks up by key. The memtable stores writes in sorted order until it reaches a configurable limit, and is then flushed.
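
One way to picture a memtable that keeps writes in key order is a sorted key list next to a lookup map. The SortedMemtable class below is a hypothetical sketch for illustration; Cassandra's real memtable is a concurrent structure and differs in detail.

```python
import bisect


class SortedMemtable:
    """Keeps partition keys sorted so a flush can walk them in order."""

    def __init__(self):
        self._keys = []    # partition keys, always kept sorted
        self._values = {}  # key -> most recent value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)  # insert the key at its sorted position
        self._values[key] = value           # a later write replaces the earlier one

    def get(self, key):
        return self._values.get(key)

    def items(self):
        # Already in key order, so no extra sort is needed at flush time.
        return [(k, self._values[k]) for k in self._keys]


table = SortedMemtable()
table.put("k2", "b")
table.put("k1", "a")
table.put("k2", "b2")
```

Lookups by key go through the map, while items() returns the partitions in the order a flush would write them.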

When the memtable contents exceed a configurable threshold, or the commit log space exceeds commitlog_total_space_in_mb, the memtable data, which includes indexes, is put in a queue to be flushed to disk. To flush the data, Cassandra sorts memtables by partition key and then writes the data to disk sequentially. The process is very fast because it involves only a commit log append and a sequential write.

Data in the commit log is purged after its corresponding data in the memtable is flushed to an SSTable. The commit log exists to recover the data in the memtable in the event of a hardware failure.
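
To illustrate why the commit log matters for recovery, the sketch below replays logged writes to rebuild a lost memtable. The replay function and the tuple-based log format are assumptions made for the example, not Cassandra's on-disk format.

```python
def replay(commit_log):
    """Rebuild a memtable (here a plain dict) from logged writes, oldest first."""
    memtable = {}
    for key, value in commit_log:
        memtable[key] = value  # later writes to the same key overwrite earlier ones
    return memtable


# Writes that were logged but not yet flushed when the node went down.
commit_log = [("k1", "a"), ("k2", "b"), ("k1", "a2")]
recovered = replay(commit_log)
```

Because every write reaches the log before it is acknowledged, replaying the log in order restores the same state the memtable held before the failure.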

SSTable:

A sorted string table (SSTable) is an immutable data file to which Cassandra writes a memtable. Cassandra flushes all the data in the memtables to SSTables once the memtables reach a threshold value. Because SSTables are immutable, each flush creates a new file instead of updating an existing one. Consequently, a partition is typically stored across multiple SSTable files.
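
Because one partition can be spread over several SSTables, a read has to combine the fragments. The following is a minimal sketch, assuming each SSTable is a dictionary of partitions and each cell carries a write timestamp so that the newest cell wins. The read_partition name and the data layout are hypothetical.

```python
def read_partition(key, sstables):
    """Merge the fragments of one partition found in several SSTables.

    Each SSTable is modelled as {partition_key: {column: (timestamp, value)}}.
    For every column, the cell with the highest timestamp wins.
    """
    merged = {}
    for table in sstables:
        for column, cell in table.get(key, {}).items():
            if column not in merged or cell[0] > merged[column][0]:
                merged[column] = cell
    return {column: value for column, (_, value) in merged.items()}


older = {"user42": {"name": (1, "Ann"), "city": (1, "Oslo")}}
newer = {"user42": {"city": (2, "Rome")}}
row = read_partition("user42", [older, newer])
```

Here the name column comes from the older file and the city column from the newer one, which is why a single read may need to touch more than one SSTable.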

For each SSTable, Cassandra creates these structures:

Data (Data.db)

The SSTable data

Primary Index (Index.db)

Index of the row keys with pointers to their positions in the data file

Bloom filter (Filter.db)

A structure stored in memory that checks whether row data might exist in the SSTable before Cassandra accesses the SSTable on disk
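
As a rough illustration of how a Bloom filter answers "definitely not here" or "possibly here", consider the sketch below. The BloomFilter class, its sizes, and the use of SHA-256 are assumptions chosen for readability; Cassandra's own filter uses a different hash function and layout.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means the key was never added; True means "maybe".
        return all(self.bits[pos] for pos in self._positions(key))


bloom = BloomFilter()
for partition_key in ["k1", "k2", "k3"]:
    bloom.add(partition_key)
```

When might_contain returns False, a read can skip that SSTable entirely; a True result still requires checking the index and data file.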

Compression Information (CompressionInfo.db)

A file holding information about uncompressed data length, chunk offsets and other compression information

Statistics (Statistics.db)

Statistical metadata about the content of the SSTable

Digest (Digest.crc32, Digest.adler32, Digest.sha1)

A file holding a checksum of the data file (CRC32, Adler-32, or SHA-1, as indicated by the file extension)

CRC (CRC.db)

A file holding the CRC32 for chunks in an uncompressed file.
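
The difference between a whole-file digest and per-chunk CRCs can be sketched with Python's standard zlib module. The function names and the chunk size below are illustrative assumptions and do not reflect how Cassandra lays out these files.

```python
import zlib


def file_digest(data):
    """One Adler-32 checksum covering the whole data file."""
    return zlib.adler32(data)


def chunk_crcs(data, chunk_size):
    """One CRC32 per fixed-size chunk, so corruption can be located."""
    return [
        zlib.crc32(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


data = b"123456789" * 2
digest = file_digest(data)
crcs = chunk_crcs(data, chunk_size=9)
```

A single digest can only say that the file as a whole changed, while per-chunk values narrow a mismatch down to one chunk.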

SSTable Index Summary (Summary.db)

A sample of the partition index stored in memory
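
A hedged sketch of how a sampled summary can narrow a lookup in the full partition index is shown below. The build_summary and lookup functions, the sample_every parameter, and the offsets are all made up for illustration.

```python
import bisect


def build_summary(index, sample_every):
    """Keep every sample_every-th entry of the sorted partition index."""
    return [(key, i) for i, (key, _) in enumerate(index) if i % sample_every == 0]


def lookup(key, index, summary, sample_every):
    """Use the in-memory summary to pick a slice of the index, then scan it."""
    sampled_keys = [k for k, _ in summary]
    slot = bisect.bisect_right(sampled_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first indexed key
    start = summary[slot][1]
    for k, offset in index[start:start + sample_every]:
        if k == key:
            return offset  # position of the partition in the data file
    return None


# (partition key, offset in Data.db), sorted by key; values are hypothetical.
index = [("a", 0), ("c", 40), ("f", 95), ("k", 130), ("p", 210), ("t", 260)]
summary = build_summary(index, sample_every=2)
```

Only the small summary needs to stay in memory; it tells the reader which short stretch of the on-disk index to scan.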

SSTable Table of Contents (TOC.txt)

A file that lists all the components of the SSTable

Secondary Index (SI_.*.db)

Built-in secondary index. Multiple secondary indexes may exist per SSTable.

References: DataStax Docs


Sunday, September 1, 2019
