Achieve high performance in terms of data replication and consistency by
estimating the optimal database size to run a Cassandra cluster.
Note: If your Cassandra database is managed by a service provider, engage with that
provider in the sizing process.
Before you begin: Obtain the sizing calculation tool by sending an email to
- On a production system on which you want to run a Cassandra cluster, select at
least three nodes.
Note: You can run multiple nodes on the same server provided that each node has
a different IP address.
- In the sizing calculation tool, in the fields highlighted in red, provide the
required information about record size for each of the following decision management services:
- In the DDS_Data_Sizing tab, provide information
about Decision Data Store (DDS), such as the number of records and the
average record key size.
- In the Delayed_Learning_Sizing tab, provide
information about delayed learning of adaptive models, such as the number
of decisions per minute and the average record key size.
- In the VBD_Sizing tab, provide information about
business monitoring and reporting, such as the number of dimensions and
- In the Model_Response_Sizing tab, provide
information about collecting the responses to your adaptive models, such
as the number of incoming responses in 24 hours.
- Calculate the required database size for your Cassandra cluster by summing up
the values of the Total required disk space fields from
- Ensure that you have enough disk space to run the DDS data sets: divide the
database size that you calculated in step 3 by the number of available nodes, and verify that the
resulting size of each node does not exceed 50% of the database size.
- If you use the cluster for simulations and data flow runs, increase processing
speed by adding nodes to the cluster.
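
The arithmetic in the calculation and disk space steps can be sketched in a few lines of Python. This is only an illustration: the function name `size_cluster` and the per-tab figures are hypothetical placeholders, not output of the sizing calculation tool, and the sketch assumes that data is divided evenly across nodes, as the steps describe.

```python
def size_cluster(tab_totals_gb, node_count):
    """Sum the per-tab totals and check each node's share against the 50% limit.

    tab_totals_gb: mapping of tab name to its "Total required disk space" (GB).
    node_count: number of available Cassandra nodes.
    """
    if node_count < 3:
        # The procedure asks for at least three nodes.
        raise ValueError("select at least three nodes")
    # Required database size: the sum of the totals from the tabs.
    database_size_gb = sum(tab_totals_gb.values())
    # Size per node: the database size divided by the number of nodes.
    per_node_gb = database_size_gb / node_count
    # Each node must not exceed 50% of the database size.
    within_limit = per_node_gb <= 0.5 * database_size_gb
    return database_size_gb, per_node_gb, within_limit


# Placeholder values (assumptions); take the real ones from the sizing tool.
tab_totals_gb = {
    "DDS_Data_Sizing": 120.0,
    "Delayed_Learning_Sizing": 30.0,
    "VBD_Sizing": 45.0,
    "Model_Response_Sizing": 15.0,
}

database_size_gb, per_node_gb, within_limit = size_cluster(tab_totals_gb, node_count=3)
print(f"Database size: {database_size_gb:.1f} GB")
print(f"Per-node size: {per_node_gb:.1f} GB, within limit: {within_limit}")
```

Rerunning the function with a larger `node_count` shows how adding nodes reduces the size that each node has to hold.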