- Ensure that the system locale uses UTF-8 encoding.
- Specify a repository for text analytics models. For more information, see Specifying a database for Prediction Studio records.
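One quick way to check the UTF-8 prerequisite is to ask a scripting runtime which encoding it picks up from the current locale. The following is a minimal sketch, not a Pega utility. It assumes that Python is available on the host and that the encoding reported for the Python process matches the locale that the application server runs under, which might not hold if the server is started with a different environment. The helper name `is_utf8_encoding` is hypothetical.

```python
import locale


def is_utf8_encoding(name: str) -> bool:
    """Return True if an encoding name denotes UTF-8 (for example 'UTF-8' or 'utf8')."""
    normalized = name.replace("-", "").replace("_", "").lower()
    return normalized == "utf8"


if __name__ == "__main__":
    # Encoding that Python derives from the locale settings of this process.
    # Assumption: the application server inherits the same locale.
    preferred = locale.getpreferredencoding(False)
    status = "OK" if is_utf8_encoding(preferred) else "not UTF-8"
    print(f"Preferred encoding: {preferred} ({status})")
```

If the check reports a different encoding, adjust the locale settings of the operating system before you build the model.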
- Setting up a machine learning topic model
Start building a topic model based on machine learning by specifying the model name, language, and corresponding ruleset.
- Uploading data for training and testing of the topic model
Upload sample records to train the model and to test whether the model assigns the topics correctly.
- Defining the training and testing samples for topic detection
Split the uploaded data into a set for training the model and a set for testing the model accuracy.
- Reviewing the taxonomy for machine learning topic detection
Verify the correctness of the taxonomy of topics that Prediction Studio generated from the training data. If you updated an older version of a model, the taxonomy might include topics from that version. Clean up your model by deleting topics that have no training data, and improve the model's predictions by adding keywords.
- Training and testing the topic model
Select the algorithms that Prediction Studio uses to build the model, and then start the building process.
- Reviewing the topic model
Review the created model by analyzing the results of testing against the provided training data.
- Saving the topic model
Save the model to use it as part of the Pega Platform text analytics feature. You can also download a file that contains the model that you created.
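To make the idea of training and testing samples more concrete, the following sketch shows a simple random split of labeled records. It is an illustration of the general concept only. Prediction Studio performs this step in its own interface, and this code does not reproduce how it does so. The function name `split_records`, the 20 percent test fraction, and the sample records are all assumptions made for the example.

```python
import random


def split_records(records, test_fraction=0.2, seed=42):
    """Shuffle labeled records and split them into (training, testing) lists.

    Illustrative only: the fraction and seed are arbitrary assumptions.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Reserve at least one record for testing when any records exist.
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


if __name__ == "__main__":
    # Hypothetical (text, topic) pairs standing in for uploaded sample records.
    sample = [(f"sample text {i}", "billing" if i % 2 else "support") for i in range(10)]
    training, testing = split_records(sample)
    print(f"{len(training)} training records, {len(testing)} testing records")
```

Keeping the testing records out of training is what allows the test results to indicate how the model handles text that it has not seen before.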