On-premises clients can restart an unavailable node by performing a node
recovery procedure. This task is not supported for nodes running in Pega Cloud
Services environments or client-managed cloud environments, where the deployment
automatically manages any required node recovery.
- Decommission the failed node:
  - In the header of Dev Studio, click .
  - Select the service with the failed node by clicking the corresponding
    tab.
  - For the failed node, in the Execute list, select Decommission.
- Fix the root cause of the failure. For example, replace the failed hardware
  parts or the entire node.
- Recover the data:
  - If the data was previously owned by the failed node and is available on
    replica nodes, delete the Cassandra commit log and data folders.
  - If the data was previously owned by the failed node and is not available
    on any replica node, perform data recovery from a backup file.
- Restart the node and add it back to the applicable service.
- Run the nodetool repair operation.
- Remove unused key ranges by running the nodetool cleanup operation on all
  decision management nodes.
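
The last two steps of the procedure can be scripted. The following is a minimal sketch, not an official Pega utility. It assumes that `nodetool` is on the PATH, that repair is run against the recovered node, and that every decision management node is reachable over JMX through nodetool's `-h` option. The host names and the `build_commands` and `run_commands` helpers are hypothetical. The script defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List, Sequence


def build_commands(recovered_host: str, all_hosts: Sequence[str],
                   nodetool: str = "nodetool") -> List[List[str]]:
    """Build the nodetool invocations in the order the procedure lists them."""
    # Step 1: repair the recovered node (assumption: repair targets this node).
    commands = [[nodetool, "-h", recovered_host, "repair"]]
    # Step 2: remove unused key ranges on every decision management node.
    for host in all_hosts:
        commands.append([nodetool, "-h", host, "cleanup"])
    return commands


def run_commands(commands: Sequence[Sequence[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so cleanup is not attempted if the repair fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical host names; replace with the nodes of your own cluster.
    hosts = ["dds-node-1", "dds-node-2", "dds-node-3"]
    run_commands(build_commands("dds-node-2", hosts), dry_run=True)
```

Keeping command construction separate from execution makes it possible to review the exact commands before running anything against the cluster. Pass `dry_run=False` only after the node has been restarted and added back to the service.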