Stream service node status information

Monitor the performance of each Stream node that is part of the Stream service by checking the metrics on the Stream tab. Detailed status information about a node is helpful when you need to troubleshoot the node.

  • Node ID - The identification number of the node in the cluster.

  • Disk usage - The disk space used by the Stream records on this node.

  • Free disk space - The free disk space that is allocated to this node.

  • Partition
    • Total - The number of partitions in Stream data sets that use the Stream service.

    • Under-replicated - The number of partitions that are not duplicated on multiple nodes. For example, under-replication can occur when a Stream node fails and processing of the Stream records continues on a single node.
      Note: When you notice under-replicated partitions, check the status of your Stream nodes and troubleshoot them.
    • Offline - The number of partitions that are not processed on any Stream node. Partitions go offline when all of the Stream nodes that process them fail.
      Note: When you notice offline partitions, check the status of your Stream nodes and troubleshoot them.
    • Leaders - The number of leaders that handle all of the read and write requests for partitions. For more information, see the Apache Kafka documentation.

  • Incoming byte rate - The rate of incoming bytes over specified periods of time and the overall mean value.

  • Outgoing byte rate - The rate of outgoing bytes over specified periods of time and the overall mean value.

  • Incoming message rate - The rate of incoming records over specified periods of time and the overall mean value.

  • Processors
    • Network processors idle time - The average fraction of time that the network processors are idle.

    • Request handler threads idle time - The average fraction of time that the request handler threads are idle.

    Note: The idle time can have a value between 0 and 1, where 0 means that the processor is 100% busy and 1 means that the processor is 100% free. An idle time lower than 0.3 means that the processor is more than 70% busy, and a warning is displayed on the Stream tab. Check what is causing such a high demand on the processor and consider adding Stream nodes.
  • Metrics
    • Replication max lag - The maximum lag between the replicas of the same partition on different nodes.

    • Is controller - When the value is 1, this node is the active controller in the cluster. A cluster can have only one active controller.

    For more information about the node metrics, see the Apache Kafka documentation.
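
The thresholds described above can be turned into a simple check. The following Python sketch is illustrative only: the NodeStatus class, its field names, and the helper functions are hypothetical and are not part of the Stream service or of any Kafka API. It assumes that you have copied the values from the Stream tab into such a structure by hand or with your own tooling. It converts an idle-time fraction to a busy percentage, applies the 0.3 warning threshold from the note, flags under-replicated and offline partitions, and lists the nodes that report themselves as the active controller.

```python
from dataclasses import dataclass
from typing import List

# From the note above: an idle time below 0.3 means more than 70% busy.
IDLE_WARNING_THRESHOLD = 0.3


@dataclass
class NodeStatus:
    """Hypothetical snapshot of the values shown for one Stream node."""
    node_id: int
    network_idle: float           # fraction between 0 and 1
    request_handler_idle: float   # fraction between 0 and 1
    under_replicated: int         # number of under-replicated partitions
    offline: int                  # number of offline partitions
    is_controller: int            # 1 if this node is the active controller


def busy_percent(idle_fraction: float) -> float:
    """Convert an idle-time fraction (0..1) to a busy percentage (0..100)."""
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle time must be between 0 and 1")
    return round((1.0 - idle_fraction) * 100.0, 1)


def node_warnings(status: NodeStatus) -> List[str]:
    """Return human-readable warnings for a single node snapshot."""
    warnings = []
    checks = (
        ("network processors", status.network_idle),
        ("request handler threads", status.request_handler_idle),
    )
    for label, idle in checks:
        if idle < IDLE_WARNING_THRESHOLD:
            warnings.append(f"{label} are {busy_percent(idle)}% busy")
    if status.under_replicated > 0:
        warnings.append(f"{status.under_replicated} under-replicated partitions")
    if status.offline > 0:
        warnings.append(f"{status.offline} offline partitions")
    return warnings


def active_controllers(nodes: List[NodeStatus]) -> List[int]:
    """Return the IDs of nodes whose 'Is controller' value is 1."""
    return [n.node_id for n in nodes if n.is_controller == 1]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    cluster = [
        NodeStatus(1, 0.25, 0.80, 2, 0, 1),
        NodeStatus(2, 0.90, 0.90, 0, 0, 0),
    ]
    for node in cluster:
        print(node.node_id, node_warnings(node))
    print("active controllers:", active_controllers(cluster))
```

If active_controllers returns anything other than exactly one node ID for the whole cluster, treat that as a reason to investigate, because a cluster can have only one active controller.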