Key Cassandra metrics
By monitoring Cassandra performance, you can identify bottlenecks, slowdowns, or resource limitations and address them in a timely manner.
Node metrics
When inspecting Cassandra nodes for performance issues, the following metrics are the most
helpful in determining the root cause:
- The count of errors and warnings in the logs.
- Free disk space and garbage collection metrics.
- CPU metrics, such as system_io_wait and user_wait.
- Disk metrics, such as disk queue length and throughput.
- Network metrics, such as latency.
Cassandra metrics
Monitor the following Cassandra metrics for troubleshooting and fault prevention:
Metric | Details | Threshold | Additional information |
---|---|---|---|
SSTable count | The nodetool cfstats command provides the SSTable count. The ./nodetool cfstats \| grep "SSTable count" \| awk '{print $3}' \| sort -n command provides the SSTable counts in sorted order. | Less than 30 | |
Partition size | The nodetool tablehistograms command provides the partition sizes. 2 GB is the maximum value. However, any value that approaches this maximum indicates an issue. | Less than 10 MB | The SizeTieredCompactionStrategy strategy requires at least 50% free disk space to allow C* to write data during compaction. Use the LeveledCompactionStrategy strategy only if 90% of requests are reads. |
Nodetool tpstats | This command provides details for dropped mutations or messages, that is, data that is stored in memory but not yet saved to disk. | 50 mutations or messages. If the data is still not saved after you run the nodetool flush command, there is an issue with the data model. | |
Node status | Run the nodetool status command to check the cluster status. | | |
Compaction rate | The rate at which compaction merges SSTables on disk. The default throughput limit is 16 MB/sec. | | |
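The SSTable count threshold from the table can be checked with a small script. The sketch below reuses the grep, awk, and sort pipeline from the table and compares the largest count against the limit of 30. The function names max_sstable_count and sstable_count_ok are hypothetical helpers, and the functions read command output from standard input so that they can be tried without a running cluster.

```shell
#!/usr/bin/env bash

# Print the largest "SSTable count" value found in cfstats-style output on stdin.
max_sstable_count() {
  grep "SSTable count" | awk '{print $3}' | sort -n | tail -n 1
}

# Succeed (exit status 0) when the largest SSTable count is below the threshold.
# The threshold defaults to 30, the value given in the table.
sstable_count_ok() {
  local threshold="${1:-30}"
  local max
  max="$(max_sstable_count)"
  [ "${max:-0}" -lt "$threshold" ]
}
```

A possible usage is nodetool cfstats | sstable_count_ok 30 || echo "SSTable count is too high".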
Useful commands
The following Cassandra commands are the most useful for keeping the cluster
healthy:
- nodetool flush
- Writes data from memtables to SSTables in the file system. Run this command if the nodetool tpstats command reports high counts in the thread pools, for example, many dropped mutations or messages.
- nodetool cleanup
- Removes unwanted data, that is, the data that is no longer owned by the node. Run this command after a new node joins the cluster and after data redistribution.
- nodetool repair
- Repairs one or more nodes in a cluster and provides options for restricting repair to a set of nodes. The following additional repair modes are available with the nodetool repair command:
  - incremental – Separates repaired data from data that still needs repair. Examines all SSTables but repairs only the damaged ones.
  - full – Examines and repairs all SSTables, irrespective of whether an SSTable is damaged.
  - seq – Sequential repair. Puts less load on the cluster during repair and takes more time.
  - par – Parallel repair. Puts more load on the cluster during repair and takes less time.
- nodetool bootstrap
- Checks the status of adding a new node to the cluster. Run the nodetool cleanup command on each of the existing nodes to remove unwanted data from them. In addition, in the cassandra.yaml file, set the auto_bootstrap setting to false to prevent automatic token transfer as soon as you add a node. To start the transfer manually, run the nodetool bootstrap resume command.
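The tpstats check that precedes a nodetool flush can also be scripted. The following sketch sums the dropped counts in tpstats-style output. It assumes that the dropped messages are listed after a header line that starts with "Message type" and that the second column holds the dropped count; verify this against the output of your Cassandra version. The function name dropped_total is a hypothetical helper.

```shell
#!/usr/bin/env bash

# Sum the second column of every row that follows the "Message type" header
# in tpstats-style output read from stdin. Thread pool rows that appear
# before the header are ignored.
dropped_total() {
  awk '
    /^Message type/ { in_dropped = 1; next }
    in_dropped && $2 ~ /^[0-9]+$/ { total += $2 }
    END { print total + 0 }
  '
}
```

For example, nodetool tpstats | dropped_total prints a single number. If that number exceeds the threshold of 50 mutations or messages, you might run nodetool flush and then repeat the check.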