

 

Preparing data

Updated on July 5, 2022

The Data preparation step begins when you connect to a database or upload your data from a data set or a CSV file.

By default, the columns in the data source are used as predictors, but you can define their roles later. For more information, see Defining the predictor role.

The data is needed to create a statistically relevant sample of customer details that can be split into development, validation, and test samples. Customer data in the development sample is used to develop predictive models. Data in the validation and test samples is used to validate and test model accuracy.
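The idea of splitting customer records into the three samples can be sketched outside the product as follows. This is a minimal illustration, not how Prediction Studio performs the split: the function name `split_samples`, the 60/20/20 default percentages, and the fixed random seed are all assumptions made for the example.

```python
import random


def split_samples(records, dev_pct=60, val_pct=20, seed=42):
    """Shuffle records and split them into development, validation, and test samples.

    dev_pct and val_pct are whole-number percentages (assumed defaults);
    whatever remains after those two samples becomes the test sample.
    """
    if dev_pct <= 0 or val_pct < 0 or dev_pct + val_pct >= 100:
        raise ValueError("dev_pct and val_pct must leave room for a test sample")

    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a repeatable split

    n_dev = len(shuffled) * dev_pct // 100
    n_val = len(shuffled) * val_pct // 100

    return {
        "development": shuffled[:n_dev],
        "validation": shuffled[n_dev:n_dev + n_val],
        "test": shuffled[n_dev + n_val:],
    }


# Hypothetical customer IDs standing in for full customer records.
samples = split_samples(range(100))
```

Shuffling before slicing keeps each sample representative of the whole data set, and every record lands in exactly one sample.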

The data source contains information about customers and their previous behavior. It should contain one record per customer, with every record in the same structure. Ideally, data is present for all fields and all customers, but in most circumstances some missing data can be tolerated.
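Before uploading, the requirements above (one record per customer, a consistent structure, and limited missing data) can be checked with a short script. The sketch below is only an illustration under stated assumptions: the field name `customer_id`, the sample records, and the helper `check_records` are hypothetical and not part of Prediction Studio.

```python
from collections import Counter


def check_records(records, id_field="customer_id"):
    """Return duplicate customer IDs, indexes of rows with a different
    structure than the first row, and the share of missing values per field.

    id_field is an assumed column name; adjust it to your data source.
    """
    # More than one record for the same customer ID breaks "one record per customer".
    id_counts = Counter(rec.get(id_field) for rec in records)
    duplicate_ids = [cid for cid, n in id_counts.items() if n > 1]

    # Treat the first record's fields as the expected structure.
    expected = set(records[0]) if records else set()
    inconsistent_rows = [i for i, rec in enumerate(records) if set(rec) != expected]

    # A value counts as missing when the field is absent, None, or an empty string.
    missing_rate = {
        field: sum(1 for rec in records if rec.get(field) in (None, "")) / len(records)
        for field in expected
    }
    return duplicate_ids, inconsistent_rows, missing_rate


# Hypothetical sample data for illustration only.
records = [
    {"customer_id": 1, "age": 30, "churned": "no"},
    {"customer_id": 2, "age": None, "churned": "yes"},
    {"customer_id": 2, "age": 41, "churned": "no"},
    {"customer_id": 3, "age": 25},
]
duplicate_ids, inconsistent_rows, missing_rate = check_records(records)
```

Here the second and third records share a customer ID, the last record lacks the `churned` field, and one `age` value is empty, so all three checks report a finding.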

Based on your model selection and outcome field categorization, Prediction Studio generates data that you can view in the Graphical view tab and Tabular view tab. For more information, see Defining an outcome.
