Scaling decisions and managing data

Updated on February 18, 2020

Data flows are scalable, resilient data pipelines that you can use to ingest, process, and move data from one or more sources to one or more destinations. Each data flow consists of components that transform the data in the pipeline and enrich processing with event strategies, decision strategies, and text analysis. The components run concurrently, handling data from the source through to the destination.

Data Flow definition

You can create a data flow instance from the Data Model category: select its source, add components, and select its destination. The components that you can use depend on the type of data flow that you want to build. A simple data flow can move data from a single source data set, apply a filter, and save the results in a single destination data set, as in the sketch below. More complex data flows can be sourced by other data flows, combine the source data with secondary data sources, apply strategies for data processing, and open a case or trigger an activity.
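To make the simple case concrete, the following Java sketch mimics the source-filter-destination shape of such a pipeline in memory. It is a conceptual illustration only, not Pega's implementation: the Customer record, the age filter, and the in-memory lists standing in for data sets are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class SimpleDataFlowSketch {
    // Illustrative record standing in for entries in a source data set.
    record Customer(String name, int age) {}

    public static void main(String[] args) {
        // Source data set: a hypothetical in-memory stand-in.
        List<Customer> source = List.of(
                new Customer("Ann", 34),
                new Customer("Ben", 17),
                new Customer("Cal", 52));

        // Filter component: keep only customers aged 18 or over.
        Predicate<Customer> adultsOnly = c -> c.age() >= 18;

        // Destination data set receives the filtered results.
        List<Customer> destination = new ArrayList<>();
        for (Customer c : source) {
            if (adultsOnly.test(c)) {
                destination.add(c);
            }
        }
        System.out.println(destination); // prints the two adult customers
    }
}
```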

Figure: A data flow that calls a strategy to determine the next best action and writes results to the Strategy Result class

For more information, see Types of data flows.

To run a data flow on all service nodes, you can configure the Data Flow service from the Services landing page. Add more nodes to the Data Flow service to scale data processing. A data flow that is run through the Data Flows landing page uses the checked-in instance of the data flow and the referenced rules. 

Figure: Configuring the Data Flow service

You can run a data flow on the current node, in the context of the current operator, by clicking Actions > Run in the rule form. Because the operator context includes checked-out rules, running the data flow from the rule form lets you test your local changes.

For more information, see Configuring the Data Flow service.

Data Flow management

On the Data Flows landing page, you can run and manage batch, real-time, single-case, and external data flows.

Figure: An active data flow on the Data Flows landing page

For more information, see Data Flows landing page.

Advanced features of data flows

By running data flows, you can implement advanced use cases, process data programmatically, and train your adaptive models.

For more information, see Configuring advanced features of data flows.

API methods for running data flows

Apart from using the standard UI-driven process of running and managing data flows, you can configure certain operations to run automatically through API methods. When you know how to create and configure Pega Platform activities, you can run data flows programmatically by using the DataFlow-Execute method. For example, you can configure an activity to start a data flow at a specified time.
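As a rough illustration of that pattern, the sketch below stands in for an activity step that uses the DataFlow-Execute method to start a run at a scheduled time. Only the method name DataFlow-Execute comes from this page; the dataFlowExecute helper, its parameters, and the CustomerIngest data flow name are hypothetical stand-ins, since the real step is configured in the activity rule form rather than written as Java source.

```java
import java.time.LocalTime;

public class ScheduledDataFlowRun {

    // Hypothetical stand-in for an activity step that invokes the
    // DataFlow-Execute method; the parameter names below are
    // illustrative assumptions, not Pega's actual signature.
    static void dataFlowExecute(String dataFlowName, String operation) {
        System.out.printf("DataFlow-Execute: %s (%s)%n",
                dataFlowName, operation);
    }

    public static void main(String[] args) {
        // Example from the text: start a data flow at a specified time.
        // A real activity would be triggered by a scheduler; this simple
        // check only illustrates the time-based condition.
        LocalTime scheduled = LocalTime.of(2, 0); // assumed 02:00 run time
        if (!LocalTime.now().isBefore(scheduled)) {
            dataFlowExecute("CustomerIngest", "Start"); // hypothetical names
        }
    }
}
```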
