Defining partition keys for stream data sets

You can define a set of partition keys in a Data Set rule of type Stream to use the Pega-provided load balancer to test how data flow processing is distributed across Data Flow service nodes in a multinode Pega Decision Management environment. For example, you can test whether the intended number and type of partitions negatively affect the processing of a Data Flow rule that references an event strategy.

Use this feature only for testing purposes, in application environments where the Production level system setting is set to 1, 2, or 3, and in scenarios where your custom load balancer for stream data sets is unavailable or busy.

If you change the Production level system setting to 4 or 5, any data set of type Stream that has at least one partition key defined continues to process data, but the processing is no longer distributed across multiple nodes.

An active Data Flow rule that references a Data Set rule of type Stream and has at least one partition key defined continues processing when nodes are added to or removed from the cluster, for example, as a result of node failure or an intentional change in the node topology. However, any data that was not yet processed on the failed or disconnected node is lost.
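
The following Python sketch illustrates the general idea of key-based distribution described above. It is not the Pega load balancer: the hashing scheme, the `assign_node` function, the node names, and the choice to send all records to the first node at production levels 4 and 5 are assumptions made only for illustration.

```python
import hashlib

# Production levels at which, per the text above, partition keys distribute processing.
TEST_LEVELS = {1, 2, 3}


def assign_node(key_value: str, nodes: list[str], production_level: int) -> str:
    """Pick a node for a record from its partition key value (illustrative only)."""
    if not nodes:
        raise ValueError("at least one node is required")
    if production_level not in TEST_LEVELS:
        # Levels 4 and 5: data is still processed, but not spread across nodes.
        # Using the first node here is an assumption of this sketch.
        return nodes[0]
    # A stable hash, so the same key value always maps to the same node.
    digest = hashlib.sha256(key_value.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


# Hypothetical node names and partition key values.
nodes = ["node-a", "node-b", "node-c"]
customers = [f"customer-{i}" for i in range(12)]
placement = {c: assign_node(c, nodes, production_level=2) for c in customers}
```

In this sketch, records that share a partition key value always land on the same node, which is what lets per-key processing such as an event strategy see all events for that key in one place.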

  1. Create a Data Set rule of type Stream. Use this data set only for testing purposes.
    1. In Designer Studio, click + Create > Data Model > Data Set.

    2. Specify the data set Label and Identifier.

    3. From the Type drop-down list, select Stream.

    4. Specify the ruleset, Applies To class, and ruleset version of the data set.

    5. Click Create and open.

  2. On the Stream tab, in the Partition key(s) section, configure any number of partition keys:
    1. Click Add key.
    2. In the Key field, press the Down Arrow key and select a property to use as a partition key. The available properties are defined in the Applies To class of the data set.

      Note: If the stream data set is the source that feeds event data to an Event Strategy rule, you can define only a single partition key for that data set. That partition key must be the same as the event key that is defined in the Real-Time Data shape on the Event Strategy form. Otherwise, the Data Flow rule fails to run.

    3. Repeat substeps 1 and 2 to add more partition keys.
  3. Click Save.
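
The event strategy constraint in step 2 can be expressed as a small pre-flight check. The sketch below is a hypothetical helper written in Python, not part of any Pega API. The function name, its parameters, and the property names are illustrative, and treating an empty key list as valid is an assumption.

```python
def validate_partition_keys(partition_keys, feeds_event_strategy, event_key=None):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    # The constraint applies only when the data set feeds an Event Strategy rule
    # and at least one partition key is defined (assumption: no keys is allowed).
    if feeds_event_strategy and partition_keys:
        if len(partition_keys) != 1:
            problems.append("only a single partition key is allowed")
        elif partition_keys[0] != event_key:
            problems.append("partition key must match the event key")
    return problems


# Hypothetical property names from the Applies To class.
ok = validate_partition_keys([".CustomerID"], True, event_key=".CustomerID")
too_many = validate_partition_keys([".CustomerID", ".Region"], True, event_key=".CustomerID")
mismatch = validate_partition_keys([".Region"], True, event_key=".CustomerID")
```

Here `ok` is an empty list, while `too_many` and `mismatch` each contain one message describing the configuration that would cause the data flow run to fail.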