Creating an external data flow run

You can specify where to run external data flows, and manage and monitor their runs, on the External processing tab of the Data Flows landing page. External data flows run in an external environment (data set) that is referenced by a Hadoop record on Pega Platform.

Before you can create an external data flow run, you must:

  • Create a Hadoop record that references the external data set on which you want to run the data flow.
  • Create an external data flow rule that you want to run on an external data set.

To specify where to run an external data flow:

  1. In the header of Dev Studio, click Configure > Decisioning > Decisions > Data Flows > External Processing.
  2. Click New.
  3. On the form that opens, provide details about where to run the external data flow:
    • Applies to – The class on which the external data flow is defined.
    • Access group – An instance of the Data-Admin-Operator-AccessGroup rule.
    • External data flow – The name of the external data flow rule that you want to use for external processing.
    • Hadoop – The Data-Admin-Hadoop record instance where you want to run the data flow. This field is auto-populated with the Hadoop record that is configured as the source for the selected external data flow rule.
      Note: You can configure multiple instances of a Hadoop record that point to the same external data set but have different run-time settings.
  4. Click Create. The run object is created and listed on the External processing tab.
  5. Optional: In the External Data Flow Run window that is displayed, click Start to run the external data flow. This window also shows the details of the run.
    Note: Depending on the current status of the external data flow, you can also stop or restart it from this window or from the External processing tab of the Data Flows landing page.
  6. Optional: On the External processing tab, click a run object to monitor its status in the External Data Flow Run window.
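
The relationship between the form fields in step 3 can be pictured with a small model. The following Python sketch is illustrative only and is not a Pega API: the class names, the create_run helper, and all sample values (rule name, class name, access group, Hadoop record names) are hypothetical. It shows only that the Hadoop field defaults to the record configured as the source of the selected external data flow rule, and that a different Hadoop record pointing to the same data set can be chosen instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExternalDataFlowRule:
    """Hypothetical stand-in for an external data flow rule."""
    name: str
    applies_to: str      # class on which the rule is defined
    source_hadoop: str   # Hadoop record configured as the rule's source


@dataclass
class ExternalDataFlowRun:
    """Hypothetical stand-in for the run object created in step 4."""
    applies_to: str
    access_group: str
    external_data_flow: str
    hadoop: str


def create_run(rule: ExternalDataFlowRule,
               access_group: str,
               hadoop: Optional[str] = None) -> ExternalDataFlowRun:
    # Mirror the form's behavior: when no Hadoop record is chosen,
    # use the one configured as the source of the selected rule.
    return ExternalDataFlowRun(
        applies_to=rule.applies_to,
        access_group=access_group,
        external_data_flow=rule.name,
        hadoop=hadoop if hadoop is not None else rule.source_hadoop,
    )


# All values below are invented for illustration.
rule = ExternalDataFlowRule(
    name="ScoreCustomers",
    applies_to="MyOrg-Data-Customer",
    source_hadoop="HadoopDefault",
)

default_run = create_run(rule, access_group="MyApp:Administrators")
# A second Hadoop record that points to the same data set but has
# different run-time settings can be selected explicitly.
tuned_run = create_run(rule, access_group="MyApp:Administrators",
                       hadoop="HadoopLargeBatch")
```

In this sketch, default_run picks up the Hadoop record from the rule, while tuned_run overrides it, which corresponds to the note in step 3 about multiple Hadoop records with different run-time settings.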