By default, when Pega Platform connects to Cassandra, the DataStax token aware policy routes requests to Cassandra nodes. The goal of that policy is to always route requests to nodes that hold the requested data, which reduces the amount of Cassandra-to-Cassandra network activity through the following actions:
- Calculating the token for the request by computing a Murmur3 hash of the partition key for the data that is read or written.
- Determining the list of candidate nodes for the request by grouping the nodes whose token range contains the calculated token.
- Choosing one of the nodes in the list to which to send the request, with the local data center as the priority.
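The three routing actions above can be illustrated with a minimal sketch. It is not the DataStax driver code. The `Node` class, the function names, and the replica placement (walking the ring clockwise, without virtual nodes) are simplifying assumptions, and a SHA-256 digest stands in for the Murmur3 hash so that the example needs only the standard library.

```python
import bisect
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    name: str
    datacenter: str
    token: int  # upper bound of the token range that this node owns


def token_for(partition_key: str) -> int:
    """Map a partition key to a signed 64-bit token (stand-in for Murmur3)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def replicas_for(token: int, ring: List[Node], replication_factor: int) -> List[Node]:
    """Start at the first node whose token is >= the request token, then walk clockwise."""
    ordered = sorted(ring, key=lambda node: node.token)
    start = bisect.bisect_left([node.token for node in ordered], token) % len(ordered)
    count = min(replication_factor, len(ordered))
    return [ordered[(start + i) % len(ordered)] for i in range(count)]


def choose_coordinator(replicas: List[Node], local_dc: str) -> Node:
    """Prefer a replica in the local data center; otherwise use the first replica."""
    local = [node for node in replicas if node.datacenter == local_dc]
    return (local or replicas)[0]


# Hypothetical three-node ring spread over two data centers.
ring = [Node("n1", "dc1", -100), Node("n2", "dc2", 0), Node("n3", "dc1", 100)]
replicas = replicas_for(token_for("customer-42"), ring, replication_factor=2)
coordinator = choose_coordinator(replicas, local_dc="dc1")
```

Because the coordinator already holds a replica of the data, it can answer without forwarding the request to another Cassandra node, which is the traffic reduction that the policy aims for.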
- Enable the token range partitioner by setting the relevant dynamic system setting to com.pega.dsm.dnode.impl.dataset.cassandra.TokenRangePartitioner.
  When the DDS data set browse operation is part of a data flow, the DDS data set breaks up the retrieved data into chunks, so that the chunk requests can be spread across the batch data flow nodes. By default, these chunks are defined as evenly split token ranges that do not take into account where the data resides. In a large cluster, a single token range might require data from multiple nodes. By configuring this dynamic system setting, you can ensure that no chunk range query requires data from more than one Cassandra node.
- Enable the extended token aware policy by setting the relevant dynamic system setting to true.
  When a Cassandra range query runs, the extended token aware policy selects a token from the token range to determine the Cassandra node to which to send the request. This routing is effective when the token range partitioner is configured.
- Enable the additional latency aware routing policy by setting the relevant dynamic system setting to true.
  In Cassandra clusters, individual node performance might vary significantly because of internal operations that add load on a node (for example, repair or compaction). The latency aware routing policy is an additional DataStax client mechanism that can be loaded on top of the token aware policy to route queries away from slower nodes.
- Optional: To tune the additional latency aware routing policy, configure the following dynamic system settings:
  - Specify when the policy excludes a slow node from queries by setting the prconfig/dnode/cassandra_latency_aware_policy/exclusion_threshold/default dynamic system setting to a number that represents how many times slower than the fastest node a node must be to get excluded.
    For example, if you set the exclusion threshold to 3, the policy excludes the nodes that are more than 3 times slower than the fastest node.
  - Specify how the weight of older latencies decreases over time by setting the prconfig/dnode/cassandra_latency_aware_policy/scale/default dynamic system setting to a number of milliseconds.
  - Specify how long the policy can exclude a node before retrying a query by setting the prconfig/dnode/cassandra_latency_aware_policy/retry_period/default dynamic system setting to a number of seconds.
  - Specify how often the minimum average latency is recomputed by setting the prconfig/dnode/cassandra_latency_aware_policy/update_rate/default dynamic system setting to a number of milliseconds.
  - Specify the minimum number of measurements per host to consider for the latency aware policy by setting the prconfig/dnode/cassandra_latency_aware_policy/min_measure/default dynamic system setting.
  For more information, see the Apache Cassandra documentation.
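To illustrate why the token range partitioner and the extended token aware policy work together, the following sketch compares evenly split chunks with chunks that are cut at node boundaries. It is a simplified model, not the Pega implementation. The function names are hypothetical, each node owns a single token range (no virtual nodes), and routing by the end token of a chunk is an assumption about which token the policy selects.

```python
import bisect

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def even_chunks(chunk_count):
    """Split the full token range into (start, end] chunks of equal size."""
    span = (MAX_TOKEN - MIN_TOKEN) // chunk_count
    bounds = [MIN_TOKEN + i * span for i in range(chunk_count)] + [MAX_TOKEN]
    return list(zip(bounds[:-1], bounds[1:]))


def owner_of(token, ring):
    """ring is a list of (node_token, node_name) sorted by token.

    A node owns the range (previous_node_token, node_token]; tokens above
    the highest node token wrap around to the first node.
    """
    tokens = [node_token for node_token, _ in ring]
    return ring[bisect.bisect_left(tokens, token) % len(ring)][1]


def split_at_node_boundaries(chunk, ring):
    """Cut one (start, end] chunk at every node token that it crosses."""
    start, end = chunk
    cuts = [node_token for node_token, _ in ring if start < node_token < end]
    bounds = [start] + cuts + [end]
    return list(zip(bounds[:-1], bounds[1:]))


# Hypothetical ring: an evenly split chunk can span several owners,
# but every piece produced by split_at_node_boundaries has exactly one.
ring = [(-100, "a"), (0, "b"), (100, "c")]
pieces = split_at_node_boundaries((-150, 50), ring)
routing = [(piece, owner_of(piece[1], ring)) for piece in pieces]
```

In this model, any token inside a piece resolves to the same owner as the end token of the piece, so a range query for that piece can be sent directly to the one node that holds its data.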
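The exclusion threshold and the minimum number of measurements can be illustrated with a short sketch. It is a simplified model, not the DataStax driver implementation. The `HostLatency` class and the `excluded_hosts` function are hypothetical names, and the sketch assumes that hosts with too few measurements are neither excluded nor used to determine the fastest node.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class HostLatency:
    average_ms: float  # current average latency for the host
    measurements: int  # number of latency samples collected so far


def excluded_hosts(stats: Dict[str, HostLatency],
                   exclusion_threshold: float,
                   min_measure: int) -> Set[str]:
    """Return hosts that are more than exclusion_threshold times slower than the fastest host."""
    # Only hosts with enough samples take part in the comparison.
    qualified = {host: s for host, s in stats.items() if s.measurements >= min_measure}
    if not qualified:
        return set()
    fastest = min(s.average_ms for s in qualified.values())
    return {host for host, s in qualified.items()
            if s.average_ms > exclusion_threshold * fastest}


# Hypothetical measurements for four nodes.
stats = {
    "n1": HostLatency(average_ms=2.0, measurements=100),
    "n2": HostLatency(average_ms=5.0, measurements=100),
    "n3": HostLatency(average_ms=9.0, measurements=100),
    "n4": HostLatency(average_ms=50.0, measurements=3),
}
print(sorted(excluded_hosts(stats, exclusion_threshold=3, min_measure=50)))  # ['n3']
```

With a threshold of 3 and a fastest average of 2 ms, the cutoff is 6 ms, so only n3 is excluded. In this sketch, n4 is slow but has too few measurements to be judged.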