Processing data with data flows
Data flows are scalable and resilient data pipelines that you can use to ingest, process, and move data from one or more sources to one or more destinations.
Each data flow consists of components that transform data in the pipeline and enrich data processing with event strategies, strategies, and text analysis. The components run concurrently to handle data starting from the source and ending at the destination.
- Creating a data flow
Create a data flow to process and move data between data sources. Customize your data flow by adding data flow shapes and by referencing other business rules to perform more complex data operations. For example, a simple data flow can move data from a single data set, apply a filter, and save the results in a different data set. More complex data flows can use other data flows as a source, apply strategies for data processing, and open a case or trigger an activity as the final outcome.
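For instance, a simple data flow of this kind might be laid out as follows. The data set names and the filter condition are hypothetical and only illustrate the sequence of shapes:

  Source:        Customer (data set)
    → Filter:      .Age >= 18
    → Destination: EligibleCustomers (data set)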
- Making decisions in data flow runs
Create batch runs for your data flows to make simultaneous decisions for large groups of customers. You can also create a batch run for data flows with a non-streamable primary input.
- Managing data flow runs
Control record processing in your application by starting, stopping, or restarting data flows. Monitor data flow status to better understand data flow performance.
- Data flow run limits
A large number of data flow runs that are active at the same time can deplete your system resources. To ensure the efficient processing of data flows, you can configure dynamic system settings to limit the number of concurrent active data flow runs for a node type.
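As a sketch, you typically define such a limit as a dynamic system setting record. The setting purpose, owning ruleset, and value below are assumptions for illustration only; confirm the exact setting names for your Pega Platform version:

  Owning ruleset:  Pega-DecisionEngine             (typical owner for decisioning settings; verify)
  Setting purpose: dataflow/batch/maxActiveRuns    (assumed name for the batch node type)
  Value:           5                               (at most five batch data flow runs active at once)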
- Data flow run priorities
Use the data flow service to protect the system against running out of resources. The data flow service can automatically queue and execute data flow runs according to configured priorities that indicate the relative importance of each run. For example, you can set the priority of an important data flow run to High to ensure that it runs first.
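For illustration, if three runs are queued while the system is saturated, the service executes them in priority order (the run names are hypothetical):

  Queued:          Run A (High), Run B (Low), Run C (Medium)
  Execution order: Run A → Run C → Run B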
- Data flow methods
Data flows can be run, monitored, and managed through a rule-based API. Data-Decision-DDFRunOptions is the container class for the API rules and provides the properties required to programmatically configure data flow runs. Additionally, the DataFlow-Execute method allows you to perform a number of operations that depend on the design of the data flow that you invoke.
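A minimal sketch of one plausible activity pattern follows. The step sequence and the data flow name (ProcessCustomers) are assumptions, not a definitive recipe; consult the DataFlow-Execute method documentation for the exact parameters that your operation requires:

  Step 1  Page-New          Create a runOptions page of class Data-Decision-DDFRunOptions.
  Step 2  Property-Set      Set the run configuration properties on runOptions, such as the name of the data flow to run.
  Step 3  DataFlow-Execute  Invoke the ProcessCustomers data flow with the configured options.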
- Data flow limitations
The following limitations apply to the configuration of data flows in Pega Platform. Ensure that you understand how these limitations affect your data flows so that you can configure them correctly and avoid errors.