In the Sample construction step, split the data into the set that is used to train the model and the set that is used to test the model's accuracy.
Select the User-defined sampling based on 'Type' column check box to assign to the testing sample only the records whose Type field in the uploaded file is set to Test. Use this option if you have specific sentences that you want to test for accuracy with every model generation.
Select the Uniform sampling check box to manually specify the percentage of records that are randomly assigned to the training sample.
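The two sampling options can be illustrated with a minimal Python sketch. This is not the product's implementation. It assumes that each record is a dictionary with a Type key, that records not marked Test go to the training sample, and that the Type comparison is exact. The function names split_by_type and split_uniform and the seed parameter are hypothetical and exist only for this illustration.

```python
import random


def split_by_type(records, type_field="Type", test_value="Test"):
    """User-defined sampling: only records marked Test go to the testing sample.

    Assumption: every other record is assigned to the training sample.
    """
    training = [r for r in records if r.get(type_field) != test_value]
    testing = [r for r in records if r.get(type_field) == test_value]
    return training, testing


def split_uniform(records, training_percent, seed=None):
    """Uniform sampling: a random share of records goes to the training sample.

    training_percent is the percentage (0-100) assigned to training;
    the remaining records form the testing sample.
    """
    if not 0 <= training_percent <= 100:
        raise ValueError("training_percent must be between 0 and 100")
    rng = random.Random(seed)  # seed is optional, for repeatable splits
    shuffled = list(records)   # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    cutoff = round(len(shuffled) * training_percent / 100)
    return shuffled[:cutoff], shuffled[cutoff:]


if __name__ == "__main__":
    rows = [
        {"Text": "Where is my invoice?", "Type": "Test"},
        {"Text": "I cannot log in.", "Type": "Train"},
        {"Text": "Please cancel my order.", "Type": "Train"},
    ]
    train, test = split_by_type(rows)
    print(len(train), "training records,", len(test), "testing records")
```

With user-defined sampling, the same records are tested on every run, which makes accuracy figures comparable across model generations. With uniform sampling, the split changes from run to run unless the random selection is fixed.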
Examples of issues that can be found include the following items:
Improperly formatted columns or missing values
The categories from the taxonomy that do not have a match in the training and testing sample
The categories from the training and testing sample that do not have a match in the taxonomy
Note: To increase the quality of the model, correct any missing values, file formatting problems, inconsistencies between the taxonomy and the training and testing sample, and any other issues.
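Before you upload a file, you can run similar checks on your own data. The following Python sketch is an illustration only, not the product's validation logic. It assumes that records are dictionaries and that the column names Text and Category apply to your file. Those names, and the function name find_sample_issues, are hypothetical.

```python
def find_sample_issues(records, taxonomy_categories,
                       required_fields=("Text", "Category"),
                       category_field="Category"):
    """Report missing values and taxonomy/sample category mismatches."""
    # Indexes of rows in which any required field is absent or blank.
    rows_with_missing_values = [
        index for index, record in enumerate(records)
        if any(not str(record.get(field) or "").strip()
               for field in required_fields)
    ]

    # Categories that actually occur in the training and testing sample.
    sample_categories = {
        str(record[category_field]).strip()
        for record in records
        if str(record.get(category_field) or "").strip()
    }
    taxonomy = set(taxonomy_categories)

    return {
        "rows_with_missing_values": rows_with_missing_values,
        # In the taxonomy, but never used in the sample.
        "taxonomy_only": sorted(taxonomy - sample_categories),
        # Used in the sample, but not defined in the taxonomy.
        "sample_only": sorted(sample_categories - taxonomy),
    }


if __name__ == "__main__":
    sample = [
        {"Text": "Where is my invoice?", "Category": "Billing"},
        {"Text": "", "Category": "Support"},
        {"Text": "Do you ship abroad?", "Category": "Shipping"},
    ]
    print(find_sample_issues(sample, ["Billing", "Support", "Sales"]))
```

Each key in the result corresponds to one kind of issue in the list above, so an empty list for every key suggests that the file and the taxonomy are consistent under these assumptions.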