Configuring compaction settings for SSTables (deprecated)

Updated on May 17, 2024

This content applies only to On-premises and Client-managed cloud environments

Maintain the good health of the Cassandra cluster by tuning compaction throughput for write-intensive workloads.

Cassandra might write multiple versions of a row to different SSTables. Often, each version has a unique set of columns that Cassandra stores with a different timestamp. As a result, the SSTables grow in size, and retrieving a complete row of data might require reading from an increasing number of SSTables. To keep the cluster healthy, Cassandra periodically merges SSTables and discards old data through compaction.
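The merge described above can be pictured with a small sketch. This is a deliberately simplified model, not Cassandra's actual implementation: it assumes each SSTable is a plain dictionary of rows, that each column carries a numeric timestamp, and it ignores deletes (tombstones) entirely. The `compact` function and the sample data are hypothetical.

```python
# Simplified model of compaction; NOT how Cassandra is implemented internally.
# Assumption: each SSTable maps a row key to {column: (timestamp, value)}.

def compact(sstables):
    """Merge SSTables, keeping only the newest version of every column."""
    merged = {}
    for sstable in sstables:
        for row_key, columns in sstable.items():
            row = merged.setdefault(row_key, {})
            for column, (timestamp, value) in columns.items():
                # Keep a cell only if it is newer than the one already seen.
                if column not in row or timestamp > row[column][0]:
                    row[column] = (timestamp, value)
    return merged


# Two versions of the same row, written to different SSTables.
older = {"customer-1": {"name": (100, "Ann"), "city": (100, "Oslo")}}
newer = {"customer-1": {"city": (200, "Bergen"), "email": (200, "ann@example.com")}}

compacted = compact([older, newer])
print(compacted["customer-1"])
```

After the merge, a single lookup returns the complete row, and the outdated `city` value is no longer stored.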

Note: By default, Pega Platform provides a compaction throughput of 16 MB per second for Cassandra 2.1.20, and 1024 MB per second for Cassandra 3.11.3 (with 8 concurrent compactors). For highly write-intensive workloads, you can increase the default compaction throughput to at least 256 MB per second.

Note: Starting in Pega Platform version 8.6, the use of an internal Cassandra database is deprecated. On-premises and client-managed cloud systems that have been updated from earlier versions of Pega Platform can continue to use Cassandra in embedded mode. However, to ensure future compatibility, do not create any new environments using embedded Cassandra.

  1. For every Decision Data Store (DDS) node, add the following dynamic system settings.
    1. In the Pega-Engine ruleset, set the number of concurrent compactors by adding the prconfig/dnode/yaml/concurrent_compactors/default property with a value equal to the number of CPU cores.
    2. In the Pega-Engine ruleset, configure the compaction throughput by adding the prconfig/dnode/yaml/compaction_throughput_mb_per_sec/default property with the following value: 256.

      For more information, see Configuring dynamic system settings.

      Determining the most appropriate compaction throughput setting is an iterative process. You can use the nodetool utility to adjust the compaction throughput for one node at a time, without restarting the node. Changes made with nodetool are temporary and are reverted when the node restarts. For more information about the nodetool commands for compaction throughput, see the Apache Cassandra documentation.

  2. Restart all DDS nodes.
    For more information, see Managing decision management nodes.
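As a rough illustration of the steps above, the following sketch assembles the two dynamic system settings and the command line for a temporary, per-node change. The setting names and the value 256 come from this procedure. The helper names are hypothetical, and the sketch assumes that the script runs on the DDS node (so that the local CPU count is the relevant one) and that the Cassandra nodetool utility, with its `setcompactionthroughput` command, is available on that node. The sketch only builds the values and does not create the settings or run any command.

```python
import os

# Names taken from the steps above.
OWNING_RULESET = "Pega-Engine"
COMPACTORS_SETTING = "prconfig/dnode/yaml/concurrent_compactors/default"
THROUGHPUT_SETTING = "prconfig/dnode/yaml/compaction_throughput_mb_per_sec/default"


def dds_compaction_settings(cpu_cores, throughput_mb_per_sec=256):
    """Hypothetical helper: return the two settings as {name: value}."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    if throughput_mb_per_sec < 0:
        raise ValueError("throughput_mb_per_sec must not be negative")
    return {
        COMPACTORS_SETTING: str(cpu_cores),
        THROUGHPUT_SETTING: str(throughput_mb_per_sec),
    }


def nodetool_set_throughput(throughput_mb_per_sec, host=None):
    """Hypothetical helper: build the nodetool argument list for one node.

    A change made this way is lost when the node restarts.
    """
    command = ["nodetool"]
    if host:
        command += ["-h", host]
    command += ["setcompactionthroughput", str(throughput_mb_per_sec)]
    return command


# Assumption: the local CPU count matches the DDS node's CPU count.
cores = os.cpu_count() or 1
for name, value in dds_compaction_settings(cores).items():
    print(f"{OWNING_RULESET}: {name} = {value}")

print(" ".join(nodetool_set_throughput(256)))
```

Because finding the right throughput is iterative, one approach is to try candidate values with nodetool on a single node first, and then persist the chosen value in the dynamic system setting before restarting the DDS nodes.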