Friday, July 24, 2020

Log Buffer In Cassandra


Do we have anything like a log buffer in memory related to the commit log file?

I started learning Cassandra a few months ago and was confused by the term "log buffer" in Cassandra. I went through several links and documents, but they only talked about the commit log. If you search for a Cassandra architecture diagram, it says that when a write occurs, Cassandra writes to the memtable (in memory) and to the commit log (on disk).

So I was curious about the log buffer and what it does. If it lives in memory, what happens during a sudden failure?

I tried one more time to look for the answers, and this time I found them :)

Summary of this blog: Yes, 😊 we have a log buffer in Cassandra, but it lives in the OS, because Cassandra doesn't keep a separate log buffer of its own the way SQL Server does. Whether a write is acknowledged only after it has been fsynced to disk is tunable.

The setting that controls this is commitlog_sync (batch or periodic).

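Batch: writes are not acknowledged until they have been fsynced to disk.

Periodic: writes are acknowledged right away and the commit log is fsynced on a timer. If you come from a SQL background like me, you can think of this as delayed durability in SQL Server. This is the mode set in the cassandra.yaml that ships with Cassandra.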

==============================================================
Details below:

When you read about the Cassandra commit log in detail, this is where you can get confused, especially if you are a beginner like me 😊 You will come across the term "log buffer" without finding much information about it. The reference to a "commit log buffer in memory" is talking about the OS buffer cache, not a memory structure in Cassandra. If you know a little Java, you can also check this in the code. There is no separate in-memory structure for the commit log; instead, the mutation is serialized and written to a file-backed buffer.
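To make "file-backed buffer" a bit more concrete, here is a minimal Python sketch. It is not Cassandra's code (Cassandra is written in Java); the names SEGMENT_SIZE, append_mutation and demo are made up for illustration, and the record layout (a 4-byte length followed by the payload) is only an assumption. The point is that writing into a memory-mapped file puts the bytes in the OS page cache, and a separate flush call is what asks the OS to push them to disk.

```python
import mmap
import os
import tempfile

SEGMENT_SIZE = 4096  # hypothetical segment size, far smaller than a real one


def append_mutation(buf, offset, payload):
    """Write a length-prefixed record into the mapped buffer; return the next offset."""
    record = len(payload).to_bytes(4, "big") + payload
    buf[offset:offset + len(record)] = record
    return offset + len(record)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "segment.log")
    with open(path, "wb") as f:
        f.truncate(SEGMENT_SIZE)  # pre-allocate the file that backs the buffer

    with open(path, "r+b") as f:
        buf = mmap.mmap(f.fileno(), SEGMENT_SIZE)
        end = append_mutation(buf, 0, b"INSERT k1=v1")
        # At this point the bytes sit in the OS page cache; nothing guarantees
        # that they have reached the physical disk yet.
        buf.flush()  # msync: ask the OS to write the dirty pages to the file
        buf.close()

    # Read the record back through the ordinary file API.
    with open(path, "rb") as f:
        return f.read(end)
```

Calling demo() returns the raw record bytes as they are stored in the file. If the process died between append_mutation and flush, whether the record survives would depend on the OS, which is exactly the gap that commitlog_sync is about.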

Cassandra comes with two strategies for managing fsync on the commit log: batch and periodic.

What the Cassandra documentation says:

Commitlogs are an append only log of all mutations local to a Cassandra node. Any data written to Cassandra will first be written to a commit log before being written to a memtable. This provides durability in the case of unexpected shutdown. On startup, any mutations in the commit log will be applied.
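The replay idea from the documentation can be sketched in a few lines of Python. This is a toy under stated assumptions, not Cassandra's format: the "key=value" line layout, the in-memory BytesIO standing in for the commit log file, and the function names append, replay and demo_replay are all hypothetical.

```python
import io


def append(log, key, value):
    """Append one mutation to the end of the log; earlier records are never changed."""
    log.write(f"{key}={value}\n".encode("utf-8"))


def replay(log_bytes):
    """Rebuild a memtable-like dict by applying every logged mutation in order."""
    memtable = {}
    for line in log_bytes.decode("utf-8").splitlines():
        key, _, value = line.partition("=")
        memtable[key] = value  # later mutations overwrite earlier ones
    return memtable


def demo_replay():
    log = io.BytesIO()  # stands in for the on-disk commit log file
    append(log, "k1", "v1")
    append(log, "k2", "v2")
    append(log, "k1", "v3")
    # Pretend the node restarted: the memtable is gone, only the log remains.
    return replay(log.getvalue())
```

Here demo_replay() rebuilds the state with k1 set to v3 and k2 set to v2, because the mutations are applied in the order they were logged.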

 commitlog_sync: may be either “periodic” or “batch.”

Batch: In batch mode, Cassandra won’t ack writes until the commit log has been fsynced to disk. It will wait “commitlog_sync_batch_window_in_ms” milliseconds between fsyncs. This window should be kept short because the writer threads will be unable to do extra work while waiting. You may need to increase concurrent_writes for the same reason.

commitlog_sync_batch_window_in_ms: Time to wait between “batch” fsyncs
Default Value: 2

Periodic: In periodic mode, writes are immediately ack’ed, and the CommitLog is simply synced every “commitlog_sync_period_in_ms” milliseconds.

commitlog_sync_period_in_ms: Time to wait between “periodic” fsyncs
Default Value: 10000
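The difference between the two modes is easiest to see by asking which acknowledged writes would be lost in a crash. Below is a small Python model of that; ToyCommitLog and demo_sync_modes are hypothetical names, lists stand in for the OS cache and the disk, and the batch mode here syncs on every write instead of grouping writes inside a window, so treat it as a simplification rather than how Cassandra implements it.

```python
class ToyCommitLog:
    """Toy model of commitlog_sync: tracks what is acked versus what is fsynced."""

    def __init__(self, mode):
        if mode not in ("batch", "periodic"):
            raise ValueError("mode must be 'batch' or 'periodic'")
        self.mode = mode
        self.buffered = []  # written but not yet fsynced (OS cache stand-in)
        self.durable = []   # fsynced to disk
        self.acked = []     # acknowledged to the client

    def sync(self):
        """Stand-in for fsync: everything buffered becomes durable."""
        self.durable.extend(self.buffered)
        self.buffered.clear()

    def write(self, mutation):
        self.buffered.append(mutation)
        if self.mode == "batch":
            self.sync()  # batch: fsync first, acknowledge afterwards
        self.acked.append(mutation)  # periodic: acknowledge without waiting

    def lost_on_crash(self):
        """Acknowledged writes that would not survive a sudden failure."""
        return [m for m in self.acked if m not in self.durable]


def demo_sync_modes():
    batch = ToyCommitLog("batch")
    periodic = ToyCommitLog("periodic")
    for log in (batch, periodic):
        log.write("w1")
    periodic.sync()  # the periodic timer fires once
    for log in (batch, periodic):
        log.write("w2")
        log.write("w3")
    # Crash now, before the next periodic sync.
    return batch.lost_on_crash(), periodic.lost_on_crash()
```

In this model the batch log loses nothing, while the periodic log loses w2 and w3, the writes that were acknowledged after the last sync.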


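The default in the cassandra.yaml that ships with Cassandra is periodic, which means a node can lose up to commitlog_sync_period_in_ms worth of acknowledged writes in a sudden failure. If a node must not acknowledge a write before it has been fsynced, set commitlog_sync to batch.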


Thank You For Reading this Blog
