

This documentation site is for previous versions. Visit our new documentation site for current releases.
 

Reviewing the taxonomy for machine learning topic detection

Updated on May 17, 2024

Verify the correctness of the taxonomy of topics that Prediction Studio generated from the training data. If you updated an older version of a model, the taxonomy might include topics from that version. Clean up your model by deleting topics that have no training data, and improve the model's predictions by adding keywords.

Keywords influence the behavior of a machine learning model, but they are not exact rules. The Should, Must, and And words act as positive features for matching a text to a topic, and the Not words act as negative features. The training and testing data have the greatest impact on your machine learning model; keywords have a smaller impact.
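To illustrate the difference between a feature and a rule, the following minimal sketch treats keyword matches as small adjustments to a score that the trained model already produced. It does not show how Prediction Studio works internally; the `TopicKeywords` class, the `keyword_features` and `adjusted_score` functions, and the `weight` value are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TopicKeywords:
    """Hypothetical container for the four keyword categories of one topic."""
    should: set = field(default_factory=set)
    must: set = field(default_factory=set)
    and_words: set = field(default_factory=set)
    not_words: set = field(default_factory=set)


def keyword_features(text: str, kw: TopicKeywords) -> dict:
    """Count how many keywords of each category occur in the text."""
    tokens = set(text.lower().split())
    return {
        "should_hits": len(tokens & kw.should),
        "must_hits": len(tokens & kw.must),
        "and_hits": len(tokens & kw.and_words),
        "not_hits": len(tokens & kw.not_words),
    }


def adjusted_score(model_score: float, features: dict, weight: float = 0.05) -> float:
    """Nudge the model score up for positive hits and down for Not hits.

    The weight is an assumed value, kept small so that the score learned
    from the training data dominates the result.
    """
    positive = features["should_hits"] + features["must_hits"] + features["and_hits"]
    negative = features["not_hits"]
    score = model_score + weight * (positive - negative)
    return max(0.0, min(1.0, score))  # clamp to the [0, 1] range


if __name__ == "__main__":
    kw = TopicKeywords(
        should={"phone", "telephone", "mobile"},
        and_words={"call"},
        not_words={"email"},
    )
    features = keyword_features("Please call my mobile phone", kw)
    print(features)
    print(adjusted_score(0.6, features))
```

In this sketch, a text that contains a Not word is still scored for the topic. The match only lowers the score, which is the sense in which keywords are features rather than exact rules.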

You cannot add topics in this step. If you want to add topics, go back to the Source selection step. For more information, see Uploading data for training and testing of the topic model.

  1. In the Taxonomy review wizard step, review the taxonomy details, and then expand the taxonomy to view the topics.
    The hierarchy of the taxonomy is used to group topics. Do not add training data or keywords to grouping topics.
  2. Review the summary of training and test data for individual topics by selecting the topics in the list.
  3. Optional: To add positive or negative features for matching a text to a topic, add keywords to the topic:
    1. Select the topic, and then click the Manage keywords tab.
    2. In the Keywords section, enter keywords to influence the model's predictions.
      Keywords can be words or phrases. You can enter several keywords in each category.
      For example:
      Should words
      phone, telephone, mobile
      And words
      call
  4. Optional: To delete topics that do not contain any training data, select a topic, and then click Delete.
    Topics without any training data might appear in the taxonomy when you start with a keyword-based model and then update it to a machine learning model. If the training data for the new model covers fewer topics than the original keyword-based model, only the topics that the training data covers are trained, and the remaining topics have no training data.
  5. Click Next.
What to do next: Select the algorithms that Prediction Studio uses to build the model, and then start the building process. For more information, see Training and testing the topic model.
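Before you open the wizard, you might want to check your own training file for topics that would end up without training data. The following minimal sketch assumes that you keep the taxonomy as a mapping of each topic to its parent and the training data as (text, topic) pairs. The function name, the data layout, and the sample topics are hypothetical and are not part of Prediction Studio.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def topics_without_training_data(
    parents: Dict[str, Optional[str]],
    training_rows: List[Tuple[str, str]],
) -> List[str]:
    """Return topics that have no training rows and are not grouping topics.

    parents maps every topic to its parent topic (None for a root).
    Grouping topics, which are the parent of at least one other topic,
    are skipped because they are not expected to hold training data.
    """
    counts = Counter(topic for _text, topic in training_rows)
    grouping = {parent for parent in parents.values() if parent is not None}
    return sorted(
        topic
        for topic in parents
        if topic not in grouping and counts[topic] == 0
    )


if __name__ == "__main__":
    # Hypothetical taxonomy carried over from an older keyword-based model.
    parents = {
        "Support": None,
        "Support > Phone": "Support",
        "Support > Billing": "Support",
        "Support > Legacy": "Support",
    }
    # Hypothetical training data that covers only two of the three topics.
    training_rows = [
        ("My mobile keeps dropping calls", "Support > Phone"),
        ("I was charged twice this month", "Support > Billing"),
    ]
    print(topics_without_training_data(parents, training_rows))
```

The topics that this check returns are candidates for deletion in step 4 of the procedure above.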
