Text Analytics in Pega 7.3
Pega Platform provides the following techniques that you can use to process and structure text data from the Facebook, Twitter, and YouTube social media platforms:
- Sentiment Analysis – Detect and analyze the feelings (attitudes, emotions, opinions) that characterize a unit of text.
- Classification Analysis – Assign one or more classes or categories to a text sample to make that text easier to manage and sort.
- Entity Extraction – Extract named entities from text data and assign them to predefined categories, such as names of organizations, locations, people, quantities, or values.
- Intent Analysis – Determine a user's intent in social media posts, comments, messages, emails, and so on, to find out whether that user is likely to subscribe to your services or buy your products.
To enable text analysis in your application, configure and customize the underlying infrastructure in the form of data sets, text analyzers, and data flows.
Try text analytics with NLP Sample
Use the NLP Sample application to explore the natural language processing capabilities of Pega Platform. The application is available as an archive that you can download and install. The application includes the following components:
- The NLP Sample portal.
- A set of rules that constitute the text analytics infrastructure in your application, including a sample text analyzer that supports sentiment, classification, and entity extraction analysis.
- A collection of taxonomies that demonstrate various use cases for classification analysis, for example, telecom, banking, customer service, and automobile.
You can use the provided rules and text samples for text analysis, or you can configure your own rules and models to explore text classification, sentiment analysis, topic detection, and entity extraction.
Exploring text analytics with the NLP Sample application
Configure a text analyzer
Configure Text Analyzer rules to process content that is extracted from social media (Twitter, Facebook, and YouTube), emails, chat-bot messages, databases, REST APIs, customer support tickets, and so on. Use a variety of tools, such as lexicons, taxonomies, and machine learning models, to customize the sentiment, classification, entity extraction, and intent analysis that you want to apply to the text content that interests you.
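To make the roles of lexicons and taxonomies more concrete, the following is a minimal, generic Python sketch of lexicon-based sentiment scoring and keyword-based taxonomy classification. It is not Pega code and does not reflect how Text Analyzer rules are implemented; the lexicon entries, taxonomy categories, and function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical sentiment lexicon: word -> polarity score (illustrative only).
SENTIMENT_LEXICON = {
    "great": 1, "love": 1, "fast": 1,
    "slow": -1, "broken": -1, "terrible": -1,
}

# Hypothetical taxonomy: category -> keywords that signal the category.
TAXONOMY = {
    "billing": {"invoice", "charge", "refund"},
    "network": {"signal", "coverage", "outage"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment(text):
    """Sum the lexicon scores of all tokens and map the total to a label."""
    score = sum(SENTIMENT_LEXICON.get(token, 0) for token in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify(text):
    """Return every taxonomy category that shares a keyword with the text."""
    tokens = set(tokenize(text))
    return sorted(cat for cat, keywords in TAXONOMY.items() if tokens & keywords)


sample = "The coverage is terrible and my invoice is broken"
print(sentiment(sample))  # negative
print(classify(sample))   # ['billing', 'network']
```

In practice, a lexicon or taxonomy of this kind is usually combined with a trained model, because plain keyword matching misses negation, sarcasm, and synonyms.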
Build machine learning models
Become a data scientist and use Pega Platform to employ machine learning in your application. Use the Decision Analytics work area to generate custom models for sentiment and classification analysis. Using a specialized wizard for model creation, you can perform the following actions:
- Upload the resources that are required to generate the models (for example, a corpus of documents, such as emails or tweets, that has already been classified as having positive, neutral, or negative sentiment, for use as a training sample for sentiment model generation).
- Define the model details and the algorithm that the application uses to train the model: maximum entropy (MaxEnt), Naïve Bayes, or support vector machine (SVM). Depending on your approach and your business goal, you might want to use a specific algorithm type. For example, you can use the SVM algorithm for large sets of training data.
- Review the model configuration. Use a variety of measures, such as F-score, precision, and recall, to determine the accuracy of the model. You can also test the generated model against any number of test samples.
- Export the generated model or upload it to your application for use in text analyzers.
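As a rough illustration of what training on a labeled corpus involves, the following Python sketch implements a small multinomial Naïve Bayes classifier with add-one smoothing. It is a generic textbook version, not the implementation that the model creation wizard uses; the training sentences, labels, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total_docs)  # log prior
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are skipped
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-labeled training corpus.
training = [
    ("love this phone", "positive"),
    ("great service love it", "positive"),
    ("terrible service", "negative"),
    ("hate this slow phone", "negative"),
]
model = train_naive_bayes(training)
print(predict(model, "love great phone"))  # positive
print(predict(model, "terrible slow"))     # negative
```

A real training corpus needs far more than four documents; the point of the sketch is only to show how labeled text turns into word statistics that a model can use for prediction.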
By uploading machine learning models as part of text analyzer rules, you can enhance the accuracy with which text analyzer rules detect sentiment or classify text.
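The precision, recall, and F-score measures mentioned in the review step can be computed from a list of actual and predicted labels. The following Python sketch uses the standard definitions for a single class; the function name and the sample labels are hypothetical, and the exact way Pega Platform aggregates these measures across classes is not assumed here.

```python
def precision_recall_f1(actual, predicted, target):
    """Compute precision, recall, and F1 for one target class."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == target and p == target)
    false_pos = sum(1 for a, p in pairs if a != target and p == target)
    false_neg = sum(1 for a, p in pairs if a == target and p != target)

    # Precision: share of predicted-target items that are really the target.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of real target items that the model found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical test sample: labels assigned by people versus by a model.
actual = ["positive", "positive", "negative", "neutral", "positive", "negative"]
predicted = ["positive", "negative", "negative", "positive", "positive", "negative"]

precision, recall, f1 = precision_recall_f1(actual, predicted, "positive")
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
```

A model with high precision but low recall rarely mislabels text as the target class but misses many real cases; the F-score balances the two into a single number.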
Create and configure data sets for text content
Retrieve the text content that interests you from a variety of sources. You can retrieve text content from Facebook, Twitter, and YouTube social media platforms, emails, or databases, and analyze it in your application. Additionally, by using stream data sets, you can extract instant messages from the WhatsApp service and access blog posts through the Webhose.io platform.
Tutorial: Analyzing WhatsApp content in NLP Sample in real time
Tutorial: Analyzing content from Webhose.io in NLP Sample in real time
Customize metadata retrieval
Optionally, if you are analyzing text content from Facebook or Twitter, you can customize your social media data set to retrieve additional metadata, such as user verification information, profile pictures, icons, or other information that is relevant to achieving your business goal. You can configure the metadata retrieval criteria on the Social Media Metadata landing page that is available in applications that have access to the Pega-NLP ruleset.
Combine and process
Arrange the text analyzer and other rules that you created into a processing pattern of a data flow or a process flow.
Data flows offer a flexible solution for combining all your data points into a processing pattern that has a source from which the input data is taken and a destination to which the results are saved. Between the source and destination, you can apply various processing instructions in the form of different shapes. In a data flow that is designed for text analytics of Facebook, Twitter, or YouTube, you can reference a social media data set as the source, apply processing instructions in the form of text analyzers, and save the results into a database, an activity, or a JSON file. You can also enrich your data flow with additional shapes, such as filters, for example, to process only negative content or only the content from the most influential users.
You can also reference a text analyzer rule in a process flow by using the Utility shape to analyze the text content of emails or customer support tickets.
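The source, shape, and destination pattern of a data flow can be pictured as a chain of small processing steps. The following Python sketch models that pattern with generators; it is a conceptual analogy only, not Pega data flow configuration, and every function name, record field, and word list in it is hypothetical.

```python
import json


def source(records):
    """Stand-in for a data set used as the data flow source."""
    yield from records


def analyze(records, negative_words):
    """Stand-in for a text analyzer shape: tags each record with a sentiment."""
    for record in records:
        words = set(record["text"].lower().split())
        tagged = dict(record)  # copy so that the input record is not modified
        tagged["sentiment"] = "negative" if words & negative_words else "other"
        yield tagged


def keep_negative(records):
    """Stand-in for a filter shape: passes only negative content."""
    return (record for record in records if record["sentiment"] == "negative")


def destination(records):
    """Stand-in for a destination: serializes each result as a JSON line."""
    return [json.dumps(record, sort_keys=True) for record in records]


# Hypothetical input posts.
posts = [
    {"author": "a", "text": "Service was slow today"},
    {"author": "b", "text": "Thanks for the quick reply"},
]

results = destination(keep_negative(analyze(source(posts), {"slow", "broken"})))
for line in results:
    print(line)
```

Because each step consumes and produces a stream of records, steps can be added, removed, or reordered without changing the others, which is the same flexibility that shapes provide in a data flow.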