Links may not function; however, this content may be relevant to outdated versions of the product.
Analyzing text-based content with the NLP Sample application in Pega 7.2
The NLP Sample is a reference application that showcases the text analytics capabilities of the Pega 7 Platform. The NLP Sample application includes the NLP Portal from which you can analyze text-based content, including news feeds, emails, and posts on social media streams such as Facebook, Twitter, and YouTube. This type of information can provide strategic insights and influence enterprise decisions. The NLP Sample application and NLP Portal are included in a special RAP script that you download and import through Designer Studio. This tutorial explains how to install the NLP Sample application and NLP Portal, and analyze the text-based content of tweets, Facebook posts, or YouTube metadata.
- Installing the NLP Sample application
- Testing the text analysis functionality
- Creating rules that support text analysis of social media data
- Analyzing records from social media
- Analyzing text analysis reports
Installing the NLP Sample application
From Designer Studio, import the RAP script that contains the NLP Sample and NLP Portal.
- Download the NLPSample_7.2.zip archive.
- Log in to the Pega 7 Platform as an administrator.
- Click .
- In the Import wizard, browse your directory and select NLPSample_7.2.zip.
- As you complete the Import wizard, accept the default settings.
- After the import is completed, add the NLPSample:Administrators access group:
- In the Operator. menu, select
- In the Application Access section, click the Add item icon.
- In the Access Group field, enter NLPSample:Administrators.
- Optional: Click the NLPSample:Administrators access group to make the NLP Sample your default application.
- Log off and log in again with the same credentials.
- In the menu, verify that your application is NLP Sample.
- In the menu, verify that you can access the NLP Portal.
Testing the text analysis functionality
From the NLP Portal, you can test the Pega 7 Platform text analysis functionality without having to complete any prerequisite tasks. For the testing, you can use the example text analysis model or any of your custom models.
Providing input for testing text analysis models
- Open the NLP Portal from Designer Studio by clicking . The portal opens in the browser as a separate tab.
- In the NLP Portal, click Try Text Analysis.
- Select a text analysis model:
- Use example model - Select this option to use the default text analysis model. This option allows you to test the text analysis functionality without the need to configure your own models.
- Select custom model - Select this option to use one of your existing text analysis models.
- In the Applies To Class field, select the class that the text analysis model applies to.
- In the Select Model field, enter the name of the model that you want to use for testing.
- Provide a text sample:
- To use one of your existing text samples:
- Select Input text.
- Paste the text sample into the input field.
- To import a text sample from an external URL:
- Select Input URL.
- In the URL field, specify the address of the text sample, for example, an online article.
- Click Load Content. The Content field is populated with the text obtained from the URL address.
You can edit the imported text.
- Click Try It to start the analysis process.
- Analyze the results of the test run in the Run result section. The text analysis is divided into the following statistics:
- Overall sentiment - The polarity of the opinion expressed in the document as a whole. The overall sentiment of the document can be positive, neutral, or negative.
- Text Sentiment - The color-coded sentiment analysis at the sentence level. The following highlight colors identify the sentiment of each sentence:
- Green - Positive
- Gray - Neutral
- Red - Negative
- Categories - The predefined classes or categories assigned to a document or sentence to make the content easier to manage and sort. The categories are defined in the taxonomy that is part of the text analysis model.
- Entities - The keywords identified in the analyzed sample that belong to predefined categories such as names, organizations, locations, monetary values, and so on.
- Topics - Brand names or their synonyms defined in the text analysis model, for example, Apple, iPhone, Samsung, and so on.
- Features - The terms that are implicitly detected by the text analysis model as the talking point. In the sentence "I am not fond of their webpage as I cannot find anything there," the feature is the term webpage. Currently, feature identification is not supported.
Test run results
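To make the shape of a test run result easier to picture, the following Python sketch models the statistics listed above as plain data classes. It is only an illustration: the class names, field names, and sample values are hypothetical and are not part of the Pega 7 Platform. Only the color mapping and the list of statistics come from this tutorial.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Highlight colors as listed under "Text Sentiment" in this tutorial.
SENTIMENT_COLORS = {"positive": "green", "neutral": "gray", "negative": "red"}


@dataclass
class SentenceResult:
    """Hypothetical container for sentence-level sentiment."""
    text: str
    sentiment: str  # "positive", "neutral", or "negative"

    @property
    def highlight(self) -> str:
        # Look up the highlight color used for this sentence's sentiment.
        return SENTIMENT_COLORS[self.sentiment]


@dataclass
class RunResult:
    """Hypothetical container for one test run (names are assumptions)."""
    overall_sentiment: str
    sentences: List[SentenceResult] = field(default_factory=list)
    categories: List[str] = field(default_factory=list)
    entities: Dict[str, List[str]] = field(default_factory=dict)  # entity type -> keywords
    topics: List[str] = field(default_factory=list)


# Illustrative values only; a real run depends on the selected model.
result = RunResult(
    overall_sentiment="negative",
    sentences=[
        SentenceResult("The phone arrived on time.", "neutral"),
        SentenceResult(
            "I am not fond of their webpage as I cannot find anything there.",
            "negative",
        ),
    ],
    categories=["Customer Service"],
    entities={"organization": ["Samsung"]},
    topics=["Samsung"],
)

for sentence in result.sentences:
    print(f"[{sentence.highlight}] {sentence.text}")
```

Features are left out of the sketch because, as noted above, feature identification is not currently supported.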
Creating rules that support text analysis of social media data
Before you can analyze the text-based content of tweets, Facebook posts, or YouTube metadata, you must complete the following procedures:
- Create and configure an instance of the data set rule that allows you to connect with the Twitter API, Facebook API, or YouTube Data API.
You can also use one of the sample instances of the Data Set rule (Facebook DS, Twitter DS, or YouTube DS) that are delivered in the NLPSample.jar file, but you must change its access details. The data sets do not work with the sample access details and can cause errors.
- Create and configure an instance of the free text model rule.
- Create an instance of the data flow rule to reference the free text model rule and to process the records handled by the data set.
For detailed steps, see the following tutorials:
- Analyzing the text-based content posted on Facebook
- Analyzing the text-based content posted on Twitter
- Analyzing the metadata of YouTube videos
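The relationship between the three rules can be pictured as a small pipeline: the data set supplies records, the data flow moves each record through the text model, and the model attaches the analysis. The Python sketch below is a conceptual stand-in only. All function names and sample records are hypothetical, the keyword lookup is not how a free text model rule works, and no Pega, Twitter, Facebook, or YouTube API is called.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]


def sample_data_set() -> Iterator[Record]:
    """Stand-in for a data set rule: yields records from a source."""
    yield {"source": "Twitter", "text": "I love the new release"}
    yield {"source": "Facebook", "text": "The update is terrible"}


def toy_text_model(text: str) -> str:
    """Stand-in for a free text model rule: a keyword lookup, not real NLP."""
    words = set(text.lower().split())
    if words & {"love", "great"}:
        return "positive"
    if words & {"terrible", "bad"}:
        return "negative"
    return "neutral"


def run_data_flow(records: Iterable[Record],
                  model: Callable[[str], str]) -> List[Record]:
    """Stand-in for a data flow rule: passes each record through the model."""
    return [{**record, "sentiment": model(record["text"])} for record in records]


processed = run_data_flow(sample_data_set(), toy_text_model)
for record in processed:
    print(record["source"], record["sentiment"])
```

Keeping the source, the model, and the flow as separate pieces mirrors the separation of rules described above: the same flow can be pointed at a different data set or a different model without changing the other parts.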
Analyzing records from social media
You can view text analytics statistics for each processed record. For the records to appear in the NLP Portal, activate the data flow that you created for text analysis and keep it active until it processes some records.
- Open the NLP Portal from Designer Studio by clicking . The portal opens in the browser as a separate tab.
- In the NLP Portal, click View Analyzed Records. The list of processed records is displayed. Each record is defined by a text excerpt, identified sentiment, language, source, data set, and time when the record was processed.
- To analyze a record:
- Double-click a record. The Record Details window displays the results of the test run for that record.
- Optional: In the Record Details window, click Previous or Next to switch between records in the list.
List of analyzed records
Analyzing text analysis reports
On the View Reports page, you can examine statistics about the content from social media that was processed by the NLP Sample application. The accumulated data from all processed records is presented here in different charts that show the sentiment types, categories, and record types classified according to various criteria.
- Open the NLP Portal from Designer Studio by clicking . The portal opens in the browser as a separate tab.
- In the NLP Portal, click View Reports. The page with text analysis reports opens. You can view the following types of reports:
- Overall Sentiment Split - Shows the sentiment distribution for all processed records.
- Volume By Source - Shows the number of records per source processed on each day of the month.
- Sentiment Split Per Source - Shows the number of records with a particular sentiment processed per source.
- Volume By Category - Shows all categories identified by the text analytics model and the number of records (divided by type) for each category.
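Each report listed above is essentially a count of processed records grouped by one or two attributes. The following Python sketch shows that idea with `collections.Counter`. The record fields, sample values, and the `build_reports` function are hypothetical and do not reflect how the NLP Portal stores or queries its data. The sketch also counts Volume By Category per category only, leaving out the further division by record type.

```python
from collections import Counter
from datetime import date
from typing import Dict, List


def build_reports(records: List[dict]) -> Dict[str, Counter]:
    """Group hypothetical processed records the way the reports describe."""
    return {
        # Sentiment distribution for all processed records.
        "overall_sentiment_split": Counter(r["sentiment"] for r in records),
        # Number of records per source on each day.
        "volume_by_source": Counter((r["day"], r["source"]) for r in records),
        # Number of records with a particular sentiment per source.
        "sentiment_split_per_source": Counter(
            (r["source"], r["sentiment"]) for r in records
        ),
        # Number of records for each category (type breakdown omitted).
        "volume_by_category": Counter(r["category"] for r in records),
    }


# Illustrative records only.
records = [
    {"source": "Twitter", "sentiment": "positive", "category": "Support", "day": date(2016, 5, 1)},
    {"source": "Twitter", "sentiment": "negative", "category": "Billing", "day": date(2016, 5, 1)},
    {"source": "Facebook", "sentiment": "negative", "category": "Support", "day": date(2016, 5, 2)},
    {"source": "YouTube", "sentiment": "neutral", "category": "Support", "day": date(2016, 5, 2)},
]

reports = build_reports(records)
for name, counts in reports.items():
    print(name, dict(counts))
```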
In this tutorial, you completed text analysis of content from one of the social media streams (Facebook, Twitter, or YouTube). You activated a data flow that references an instance of the data set rule and an instance of the free text model rule. The data set connected to the appropriate API, and the free text model rule conducted the text analysis. You also analyzed the processed records in the NLP Portal by using the available analysis tools.