A sample is a subset of historical data that you extract by applying a selection or sampling method to the data source. Constructing a sample helps you create development, validation, and test data sets for analysis and modeling.
- In the sample construction step, in the workspace, from the weight field drop-down list, click an available weight field.
Typically, a weight field is available when you sample the data before using it in the Prediction Studio portal. If you do not specify the field, each case counts as one.
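To illustrate what a weight field means for case counts, the following minimal Python sketch sums weights when a weight field is given and otherwise counts each record as one case. It is not Prediction Studio code; the function name `effective_case_count` and the field name `w` are hypothetical and chosen only for this example.

```python
def effective_case_count(records, weight_field=None):
    """Return the number of cases that the records represent.

    Without a weight field, every record counts as one case.
    With a weight field, every record counts as its weight.
    """
    if weight_field is None:
        return float(len(records))
    return float(sum(r[weight_field] for r in records))


# Hypothetical pre-sampled data: the first two records each stand for 10 cases.
records = [
    {"id": 1, "w": 10},
    {"id": 2, "w": 10},
    {"id": 3, "w": 1},
]

print(effective_case_count(records))       # 3.0
print(effective_case_count(records, "w"))  # 21.0
```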
- In the Select the fields to sample grid, specify the fields you want to include in the sample:
- In the Type column, select a field type from the drop-down list. Select the Not used type for fields that you want to exclude from the sample.
- Optional: In the Description column, enter a field definition.
- Optional: In the User defined field, type a new name for a field.
- Select a sampling method:
If you want to sample a simple proportion of cases, select the Uniform sampling option.
This method fills the sample table with a random selection of records from the source. The probability of selection is set to achieve the specified percentage or number of cases.
If you want to sample a different proportion of each value for the selected field (stratum) that represents the behavior to be predicted, perform the following actions:
- Select the Stratified sampling option.
- From the Stratum field drop-down list, select the field you want to sample.
- In the table with stratum values, in the Ratio column, set the proportion of population cases to source records.
- In the Sample percentage column, enter the percentage of records that you want to sample.
This method fills the sample table with a random selection of records from each class of the stratum field.
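The difference between the two sampling methods can be sketched in a few lines of Python. This is only a conceptual illustration under simple assumptions, not how Prediction Studio implements sampling: each record is kept independently with a fixed probability, the Ratio column is not modeled, and the names `uniform_sample`, `stratified_sample`, and the `outcome` field are hypothetical.

```python
import random


def uniform_sample(records, percentage, rng):
    """Keep every record with the same probability (percentage / 100)."""
    p = percentage / 100.0
    return [r for r in records if rng.random() < p]


def stratified_sample(records, stratum_field, percentages, rng):
    """Keep each record with the probability set for its stratum value.

    Stratum values that are missing from `percentages` are sampled at 0%.
    """
    sample = []
    for r in records:
        p = percentages.get(r[stratum_field], 0.0) / 100.0
        if rng.random() < p:
            sample.append(r)
    return sample


# Hypothetical source: 1000 records, of which 10% have the rare outcome.
rng = random.Random(42)
records = [
    {"id": i, "outcome": "accept" if i % 10 == 0 else "reject"}
    for i in range(1000)
]

# Uniform: roughly 20% of all records, whatever their outcome.
uniform = uniform_sample(records, 20, rng)

# Stratified: keep every "accept" record but only about 10% of "reject".
stratified = stratified_sample(
    records, "outcome", {"accept": 100, "reject": 10}, rng
)
```

In this sketch, stratified sampling keeps all records of the rare class while thinning the common class, which is the usual reason to choose it when the behavior to be predicted is unevenly distributed.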
- In the section for dividing the sample into sets, define the sample percentage that you want to use for development, validation, and testing:
- To divide cases among the sets, select the Setting percentages for each set option.
- To divide cases that are available for the field, select the User defined field option.
- Optional: Select a field from the data source to assign the records with the same value to one
- Confirm the sample construction by clicking Next.
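The two ways of dividing cases among the development, validation, and test sets can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the product's implementation: the helper names `assign_sets` and `assign_sets_by_field`, and the `split` field, are hypothetical.

```python
import random


def assign_sets(records, percentages, rng):
    """Assign each record to one set at random, in proportion to the percentages."""
    names = list(percentages)
    weights = [percentages[name] for name in names]
    sets = {name: [] for name in names}
    for r in records:
        chosen = rng.choices(names, weights=weights, k=1)[0]
        sets[chosen].append(r)
    return sets


def assign_sets_by_field(records, field):
    """Group records by a field whose value already names the set."""
    sets = {}
    for r in records:
        sets.setdefault(r[field], []).append(r)
    return sets


# Hypothetical data with a user-defined field that names the target set.
records = [
    {"id": i, "split": ("development", "validation", "test")[i % 3]}
    for i in range(300)
]

# Option 1: set a percentage for each set.
by_percentage = assign_sets(
    records,
    {"development": 60, "validation": 20, "test": 20},
    random.Random(7),
)

# Option 2: let a user-defined field decide the set.
by_field = assign_sets_by_field(records, "split")
```

With percentages, set sizes are only approximately proportional because assignment is random; with a user-defined field, the split is fully determined by the data.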