Creating a data flow

Updated on July 5, 2022

Create a data flow to process and move data between data sources. Customize your data flow by adding data flow shapes and by referencing other business rules to do more complex data operations. For example, a simple data flow can move data from a single data set, apply a filter, and save the results in a different data set. More complex data flows can be sourced by other data flows, can apply strategies for data processing, and open a case or trigger an activity as the final outcome of the data flow.
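
The pipeline described above can be sketched conceptually. This is not Pega code; it is a minimal illustration of the source, filter, and destination roles that the shapes in a data flow play, using invented record fields:

```python
# Conceptual sketch only (not Pega code): a simple data flow moves records
# from a source, through a filter shape, into a destination.

def source():
    """Primary data source: yields records from a data set (sample data)."""
    yield {"id": 1, "region": "EU", "spend": 120}
    yield {"id": 2, "region": "US", "spend": 40}
    yield {"id": 3, "region": "EU", "spend": 75}

def filter_shape(records, predicate):
    """Filter shape: keeps only the records that match the predicate."""
    return (r for r in records if predicate(r))

def destination(records):
    """Destination: collects the results, e.g. into another data set."""
    return list(records)

# Run the flow: keep only EU records.
result = destination(filter_shape(source(), lambda r: r["region"] == "EU"))
print(result)  # records with id 1 and 3
```

In a real data flow the same roles are configured on the Source, Filter, and Destination shapes rather than written as code.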

  1. In the header of Dev Studio, click Create > Data Model > Data Flow.
  2. In the Create Data Flow tab, create the rule that stores the data flow:
    1. On the Create form, enter values in the fields to define the context of the flow.
    2. In the Label field, describe the purpose of the data flow.
    3. Optional: To change the default identifier for the data flow, click Edit, enter a meaningful name, and then click OK.
    4. In the Apply to field, press the Down arrow key, and then select the class that defines the scope of the flow.
      The class controls which rules the data flow can use, and which rules can call the data flow.
    5. In the Add to ruleset field, select the name and version of the ruleset that stores the data flow.
    6. Click Create and open.
  3. In the Edit Data flow tab, double-click the Source shape.
  4. In the Source configurations window, in the Source list, define a primary data source for the data flow by selecting one of the following options:
    • To receive data from an activity or from a data flow with a destination that refers to your data flow, select Abstract.
    • To receive data from a different data flow, select Data flow. Ensure that the data flow that you select has an abstract destination defined.
    • To receive data from a data set, select Data set. If you select a streaming data set, such as Kafka, Kinesis, or Stream, in the Read options section, define a read option for the data flow:
      • To read the records that already exist in the data set, as well as any new records, select Read existing and new records.
      • To discard existing records and read only the records that arrive after the run enters the In-progress status, select Only read new records.

      For more information on data set types, see Types of Data Set rules.

      For more information on the correlation between Read options and data flow run management for streaming data sets, see The use of streaming data sets in data flows.

    • To retrieve and sort information from the PegaRULES database, an external database, or an Elasticsearch index, select Report definition.
    Note: Secondary sources appear in the Data Flow tab when you start combining and merging data. Secondary sources can originate from a data set, data flow, or report definition.
  5. In the Source configurations window, click Submit.
  6. Optional: To facilitate data processing, transform the data that comes from the data source, for example by filtering or merging records.
  7. Optional: To apply advanced data processing on data that comes from the data source, call other rule types from the data flow, for example a strategy or a data transform.
  8. In the Edit Data flow tab, double-click the Destination shape.
  9. In the Destination configurations window, in the Destination list, define the output point of the data flow by selecting one of the following options:
    • If you want other data flows to use your data flow as their source, select Abstract.
    • If you want an activity to use the output data from your data flow, select Activity.
    • If you want to start a case as the result of a completed data flow, select Case. The created case contains the output data from your data flow.
    • If you want to send output data to a different data flow, select Data flow. Ensure that the data flow that you select has an abstract source defined.
    • To save the output data into a data set, select Data set.
      Note: Do not save data into Monte Carlo, Stream, or social media data sets.

      For more information, see Data Set rule form - Completing Data Set tab.

  10. If you selected a Database Table data set, in the Save options section, choose how you want to save records into the data set:
    • To insert new records without updating any existing records, select Only insert new records.
    • To insert new records and update existing records, select Insert new and overwrite existing records.
      Note: By default, this option uses the SQL merge statement to write records, which generally ensures faster processing. However, for older Postgres versions, this logic may fail. Additionally, in some scenarios, the merge statement may not perform as well as the delete and insert logic that was used in earlier Pega Platform versions. If you experience any such issues, you can revert to using the older logic (delete and insert) by creating the decision/datasets/db/useMergeStatementForUpdates dynamic system setting and setting it to false. For more information, see Failure or performance issues when saving records to the database.
    Figure: A Database Table data set configured as a data flow destination, with the Customer Table data set selected and the option to insert new and overwrite existing records enabled.
  11. In the Destination configurations window, click Submit.
  12. In the Edit Data flow tab, click Save.
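
The two save options in step 10 differ in how they treat records whose key already exists in the destination. The sketch below is not Pega code; it is a hedged, in-memory illustration of insert-only versus insert-and-overwrite (upsert, the behavior the SQL merge statement provides), using a dictionary keyed by an assumed `id` field:

```python
# Conceptual sketch only (not Pega code): the two Database Table save options.

def save_insert_only(table, records, key="id"):
    """Only insert new records: existing rows are left untouched."""
    for r in records:
        if r[key] not in table:
            table[r[key]] = r

def save_insert_or_overwrite(table, records, key="id"):
    """Insert new and overwrite existing records: upsert semantics."""
    for r in records:
        table[r[key]] = r

# Start with one existing record.
table = {1: {"id": 1, "name": "old"}}

save_insert_only(table, [{"id": 1, "name": "new"}, {"id": 2, "name": "b"}])
print(table[1]["name"])  # "old" - the existing record was kept

save_insert_or_overwrite(table, [{"id": 1, "name": "new"}])
print(table[1]["name"])  # "new" - the existing record was overwritten
```

Choosing between the options is therefore a question of whether the data flow should ever modify rows that are already present in the destination table.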
