Text analytics accuracy measures

Models predict an outcome, which might or might not match the actual outcome. When you create a sentiment or classification model, you can use the following measures to examine its performance.

True positives

The total number of texts for which the model predicted this outcome correctly, that is, the predicted outcome matches the actual outcome.

Actual count

The total number of texts whose actual (expected) outcome is this outcome.

Predicted count

The total number of times when the model predicted a text to belong to this outcome.
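The three counts above can be derived from two parallel lists of labels. The following is a minimal sketch, not the product's implementation; the function name `outcome_counts` and the sample labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def outcome_counts(actual, predicted):
    """Return true positives, actual count, and predicted count per outcome.

    `actual` and `predicted` are parallel lists: item i of each list is the
    expected and the predicted outcome for the same text.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    # A true positive for an outcome is a text whose predicted outcome
    # equals its actual outcome.
    true_positives = Counter(a for a, p in zip(actual, predicted) if a == p)
    actual_count = Counter(actual)
    predicted_count = Counter(predicted)

    outcomes = sorted(set(actual) | set(predicted))
    return {
        outcome: {
            "true_positives": true_positives[outcome],
            "actual_count": actual_count[outcome],
            "predicted_count": predicted_count[outcome],
        }
        for outcome in outcomes
    }


# Hypothetical sample data: six texts with expected and predicted sentiment.
actual = ["positive", "positive", "negative", "neutral", "negative", "positive"]
predicted = ["positive", "negative", "negative", "neutral", "positive", "positive"]

counts = outcome_counts(actual, predicted)
for outcome, values in counts.items():
    print(outcome, values)
```

In this sample, the "positive" outcome has 2 true positives, an actual count of 3, and a predicted count of 3.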


Precision

The fraction of predicted instances that are correct. Precision measures the exactness of a classifier. Higher precision means fewer false positives, while lower precision means more false positives. Precision is often at odds with recall: an easy way to improve precision is to make the model predict an outcome only when it is very confident, which tends to decrease recall.

The following formula is used to determine the precision of a classifier: precision = true positives / predicted count
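As a worked example of the precision formula, the sketch below computes the value for one outcome. The function name `precision` and the sample counts are hypothetical. Returning 0.0 when the predicted count is zero is an assumption made here to avoid a division by zero, not documented behavior.

```python
def precision(true_positives, predicted_count):
    """precision = true positives / predicted count."""
    if predicted_count == 0:
        # Assumption: an outcome that was never predicted gets precision 0.0.
        return 0.0
    return true_positives / predicted_count


# Hypothetical counts: the model predicted the outcome for 50 texts,
# and 40 of those predictions were correct.
print(precision(40, 50))  # 0.8
```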


Recall

The fraction of actual instances that are predicted correctly. Recall measures the completeness, or sensitivity, of a classifier. Higher recall means fewer false negatives, while lower recall means more false negatives. Improving recall can often decrease precision because it becomes harder to be precise as the model assigns the outcome to more texts.

The following formula is used to determine the recall of a classifier: recall = true positives / actual count
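As a worked example of the recall formula, the sketch below computes the value for one outcome. The function name `recall` and the sample counts are hypothetical. Returning 0.0 when the actual count is zero is an assumption made here to avoid a division by zero.

```python
def recall(true_positives, actual_count):
    """recall = true positives / actual count."""
    if actual_count == 0:
        # Assumption: an outcome with no actual instances gets recall 0.0.
        return 0.0
    return true_positives / actual_count


# Hypothetical counts: 80 texts actually have the outcome,
# and the model found 40 of them.
print(recall(40, 80))  # 0.5
```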


F-measure

Precision and recall can be combined to produce a single metric known as the F-measure, or F-score. The version used here is the balanced harmonic mean of precision and recall, which gives both measures equal weight.

The following formula is used to determine the F-score of a classifier: F-score = 2 * precision * recall / (precision + recall)
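As a worked example of the F-score formula, the sketch below combines a precision and a recall value. The function name `f_score` and the sample values are hypothetical. Returning 0.0 when both inputs are zero is an assumption made here to avoid a division by zero.

```python
def f_score(precision, recall):
    """F-score = 2 * precision * recall / (precision + recall)."""
    if precision + recall == 0:
        # Assumption: when precision and recall are both zero, report 0.0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values: precision 0.8 and recall 0.5.
score = f_score(0.8, 0.5)
print(round(score, 3))  # 0.615
```

Because the harmonic mean is pulled toward the smaller of the two values, the result (about 0.615) is lower than the arithmetic mean of 0.8 and 0.5 (0.65).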