BIX performance
Understanding the performance trade-offs of the BIX options will help you to maximize the throughput of data extracts using BIX. In addition, you may need to perform some tuning in the Pega Platform database.
Performance trade-offs
The three most important factors that determine the throughput of the BIX extraction process (that is, how many instances can be extracted from a class per hour) are:
- The number of instances extracted. Use the options on the Filter Criteria tab of the Extract rule form to limit extraction to just those instances needed. In particular, use the incremental extraction option (the Use Last Updated Time as Start option on the Filter Criteria tab) to limit the instances extracted to those that have been created or changed since the last time the Extract rule was run:
- If you select the incremental extraction option, a filter condition is automatically included during an extract that restricts extraction to class instances where the pxCommitDateTime property value is greater than or equal to the last date and time when the extract ran. pxCommitDateTime is set automatically by the Pega Platform database for each class instance whenever that instance is changed in the database, regardless of how the data was created or changed (interactively by an end user, programmatically by an activity, through an import, or by any other means).
Note: The pxCommitDateTime property may sometimes need to be added as a database column to the class table. For example, it is not added automatically on upgrades. An extract that uses the incremental extraction option fails if this column does not exist in the class table.
- As an alternative to using the incremental extraction option, you can manually specify a filter condition that compares the symbolic value Last Extraction Time to any optimized DateTime property in the class. At runtime, this value is replaced with the date and time when the Extract rule last ran. If you do so, index the database column for that property for better performance.
- Also note that when incremental extraction is performed, class instances with NULL values of the pxCommitDateTime property are skipped. When you use the -c command line option to extract from child classes, child classes whose class tables do not include the pxCommitDateTime property are also skipped.
- When extraction must complete within a small time window (for example, all instances must be extracted within 15 minutes to accommodate scheduled downstream processes), you may want to run multiple extracts from the same class in parallel. You can do this by using different filter criteria on each run to select distinct sets of class instances. For example, use filter criteria that select data for different product lines or geographies.
- The output format selected. XML is much faster than CSV output or writing directly to a destination database, since data in embedded page lists and page groups does not need to be normalized as it is written.
- The number of properties extracted, and how deeply embedded those properties are. The more deeply nested the properties are in the class data model, the slower performance will be.
For more details on these last two factors, see the details of the BIX performance benchmark.
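The incremental filter and the parallel-run approach described in this section can be modeled with a short Python sketch. This is not BIX or Pega Platform code: the `Instance` type, the `region` property, the function names, and the sample dates are all hypothetical and exist only to illustrate the selection semantics (commit time greater than or equal to the last extract time, NULL commit times skipped, and disjoint filter criteria for parallel runs).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Instance:
    """Hypothetical stand-in for one class instance (not a Pega API type)."""
    key: str
    region: str                      # hypothetical property used to split runs
    commit_time: Optional[datetime]  # models pxCommitDateTime; None models NULL


def incremental_filter(instances, last_extract_time):
    """Keep instances committed at or after the last extract time.

    Instances whose commit time is None (NULL) are skipped.
    """
    return [
        inst for inst in instances
        if inst.commit_time is not None
        and inst.commit_time >= last_extract_time
    ]


def partition_by(instances, prop):
    """Group instances by a property value so that each group could be
    selected by its own filter criteria in a separate, parallel run."""
    groups = {}
    for inst in instances:
        groups.setdefault(getattr(inst, prop), []).append(inst)
    return groups


# Hypothetical sample data.
last_run = datetime(2024, 1, 10, 2, 0, 0)
instances = [
    Instance("W-1", "EMEA", datetime(2024, 1, 9, 23, 0, 0)),   # before last run
    Instance("W-2", "EMEA", datetime(2024, 1, 10, 2, 0, 0)),   # exactly at last run
    Instance("W-3", "APAC", datetime(2024, 1, 10, 8, 30, 0)),  # after last run
    Instance("W-4", "APAC", None),                             # NULL commit time
]

selected = incremental_filter(instances, last_run)
runs = partition_by(selected, "region")

selected_keys = [inst.key for inst in selected]
run_keys = {region: [i.key for i in group] for region, group in runs.items()}

print(selected_keys)  # ['W-2', 'W-3']
print(run_keys)       # {'EMEA': ['W-2'], 'APAC': ['W-3']}
```

In this model, W-2 is selected because the comparison is inclusive, and W-4 is never selected because its commit time is NULL. Because the groups are disjoint, no instance would be extracted by more than one parallel run.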