Operating system metrics on Cassandra nodes

Updated on July 5, 2022

This content applies only to On-premises and Client-managed cloud environments

Detect problems with Cassandra nodes by analyzing operating system (OS) metrics.

vmstat
Identifies IO bottlenecks.
In the following example, the wait-io (wa) value is higher than ideal and is likely contributing to poor read/write latencies. Capturing the output of this command over a period of high latency shows whether the node is IO bound and whether that is a likely cause of the latencies.

root@ip-10-123-5-62:/usr/local/tomcat# vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
 2  4      0 264572  32008 15463144    0    0   740    792     0     0  6  1 91  2  0
 2  3      0 309336  32116 15421616    0    0 55351 109323 59250 89396 13  2 72 13  0
 2  2      0 241636  32212 15487008    0    0 57742  50110 61974 89405 13  2 78  7  0
 2  0      0 230800  32632 15498648    0    0 63669  11770 64727 98502 15  3 80  2  0
 3  2      0 270736  32736 15456960    0    0 64370  94056 62870 94746 13  3 75  9  0
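To capture vmstat output during a period of high latency for later review, you can run something like the following; the sampling interval, count, and output file path are illustrative choices, not fixed requirements:

# Sample every 5 seconds for 10 minutes and save the output for later analysis.
vmstat 5 120 | tee /tmp/vmstat-$(hostname)-$(date +%F-%H%M).log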
netstat -anp | grep 9042
Shows if network buffers are building up.
The second and third columns in the output (Recv-Q and Send-Q) show the number of bytes queued in the TCP receive and send buffers. Consistently large values in these columns indicate that either the local Cassandra node or the client cannot keep up with the network traffic. See the following sample output:

root@ip-10-123-5-62:/usr/local/tomcat# netstat -anp | grep 9042
tcp 0 0 10.123.5.62:9042 0.0.0.0:* LISTEN 475/java
tcp 0 0 10.123.5.62:9042 10.123.5.58:36826 ESTABLISHED 475/java
tcp 0 0 10.123.5.62:9042 10.123.5.19:54058 ESTABLISHED 475/java
tcp 0 138 10.123.5.62:9042 10.123.5.36:38972 ESTABLISHED 475/java
tcp 0 0 10.123.5.62:9042 10.123.5.75:50436 ESTABLISHED 475/java
tcp 0 0 10.123.5.62:9042 10.123.5.23:46142 ESTABLISHED 475/java
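As a rough filter on this output, the following one-liner keeps only the port 9042 connections that currently have bytes queued; it is a sketch that assumes the standard netstat column layout, where Recv-Q and Send-Q are the second and third fields:

# List only the 9042 connections with non-empty receive or send queues.
netstat -anp | grep 9042 | awk '$2 > 0 || $3 > 0'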
Log files
Shows the reasons why Cassandra stopped working on the node. The relevant files are usually located in the /var/log/* directory.
In some cases the process might have been killed by the OS to prevent a larger system failure caused by a lack of resources. A common case is running out of memory, which is indicated by an OOM killer message in the logs.
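To check whether the OS killed the Cassandra process for lack of memory, you can search the kernel log for OOM killer messages, for example with the commands below; the exact log file paths vary by distribution, so treat them as examples:

# Search the kernel ring buffer for OOM killer activity.
dmesg -T | grep -i -E 'out of memory|oom'
# Search persisted system logs (paths differ between distributions).
grep -i oom /var/log/syslog /var/log/messages 2>/dev/null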
