Analyzing the metadata of YouTube videos in Pega 7.1.9
The Pega 7 Platform provides text analytics so that users can analyze text-based content such as news feeds, emails, and postings on social media streams including Facebook, Twitter, and YouTube. Use this tutorial to learn how to configure the Pega 7 Platform to analyze the metadata of YouTube videos.
The social video-sharing platform YouTube provides several tools for community interaction. Users can upload videos and make them available for others to watch. When uploading a video, users can provide metadata that helps index the video. The metadata includes titles, keywords, descriptions, tags, author's name, and categories.
Using the new text analytics capability of the Pega 7 Platform, you can analyze the video metadata for particular keywords. You can retrieve video URLs and comments to get community feedback, such as implicit knowledge about users, videos, and community interests. Such information can provide strategic insights and influence enterprise decisions.
- Prerequisites
- Creating an instance of the YouTube data set
- Creating an instance of the Free Text Model rule
- Creating an instance of the Data Flow rule
- Analyzing the metadata of YouTube videos
Prerequisites
- Obtain a Google API key from the Google Developers website. This key is necessary to configure the YouTube data set and get access to the YouTube data.
- Add the PEGA-NLP ruleset to your application.
- If you use IBM WebSphere Application Server or Oracle WebLogic Server to run the Pega 7 Platform, configure the Signer and SSL Certificate settings. Without this configuration, the YouTube data set does not work.
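Before you configure the data set, it can help to confirm that the API key works. The following is a minimal sketch, run outside of the Pega 7 Platform, that builds a request to the search method of the YouTube Data API v3. The helper name build_search_url and the placeholder key are assumptions made for illustration, and the requests that the YouTube data set sends internally may differ.

```python
from urllib.parse import urlencode

# Public endpoint of the YouTube Data API v3 search method.
SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(api_key, keyword, max_results=5):
    """Return a search URL for videos that match a keyword (hypothetical helper)."""
    params = {
        "part": "snippet",          # ask for title, description, and channel title
        "type": "video",            # restrict results to videos
        "q": keyword,               # search term
        "maxResults": max_results,  # number of results to return
        "key": api_key,             # the Google API key from the prerequisites
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Replace YOUR_API_KEY with the key from the Google Developers website.
    print(build_search_url("YOUR_API_KEY", "pega"))
```

Opening the printed URL in a browser should return a JSON document when the key is valid and an error response when it is not.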
Creating an instance of the YouTube data set
Create and configure a YouTube data set called YouTubeData to establish a connection with the YouTube Data API.
- Click the Application menu in Designer Studio and switch to your application.
- In the App Explorer, click <app_name> > Data Model > Data Set.
- Right-click YouTube, and click Create.
- Name the data set YouTubeData.
- From the Type list, select YouTube.
- Specify the context where you want to create the data set:
- In the Apply to (class) field, select the Data-Social-YouTube class.
- Click Create and open.
- On the YouTube tab, provide the Google API key.
- Optional: Select the Retrieve video URL check box. If the metadata of a particular YouTube video contains the keywords that you specify, this option retrieves the URL of that video.
- Optional: Select the Retrieve comments check box. If the metadata of a particular YouTube video contains the keywords that you specify, this option retrieves all the user comments about the video.
- In the Keywords section, click Add keyword and type the keyword or keywords that you want to find in the video metadata. The metadata that contains the keywords undergoes text analysis.
- Optional: In the Authors section, click Add author and type the names of one or more users whose videos you want to ignore.
- Click Save.
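The effect of the keyword and author settings can be pictured with a short sketch. It is an illustration only, not the data set's actual implementation: the record layout, the sample videos, the function names, and the case-insensitive substring matching are all assumptions.

```python
def matches_keywords(video, keywords, ignored_authors=()):
    """Return True if the metadata mentions any keyword and the author is not ignored."""
    ignored = {author.lower() for author in ignored_authors}
    if video.get("author", "").lower() in ignored:
        return False
    # Combine the metadata fields into one lowercase text to search in.
    text = " ".join(
        [video.get("title", ""), video.get("description", "")]
        + list(video.get("tags", []))
    ).lower()
    return any(keyword.lower() in text for keyword in keywords)


def video_url(video_id):
    """Build the standard watch URL for a video ID."""
    return f"https://www.youtube.com/watch?v={video_id}"


# Made-up sample records standing in for YouTube video metadata.
videos = [
    {"id": "a1", "title": "Pega 7 overview", "description": "Intro",
     "tags": ["bpm"], "author": "Alice"},
    {"id": "b2", "title": "Cooking pasta", "description": "Dinner ideas",
     "tags": ["food"], "author": "Bob"},
    {"id": "c3", "title": "Release notes", "description": "What is new in Pega",
     "tags": [], "author": "Carol"},
]

selected = [v for v in videos
            if matches_keywords(v, ["pega"], ignored_authors=["Carol"])]
urls = [video_url(v["id"]) for v in selected]
print(urls)  # ['https://www.youtube.com/watch?v=a1']
```

Only the first video is selected: the second does not mention the keyword, and the third comes from an ignored author.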
Creating an instance of the Free Text Model rule
Create a Free Text Model rule called SampleModel and configure it to analyze sentiment only. For more information, see the Free Text Model rule.
- In the Records Explorer, click Decision > Free Text Model.
- Click Create.
- Name the rule SampleModel.
- Specify the context where you want to create the rule:
- In the Apply to (class) field, select the Data-Social-YouTube class.
You do not need to create the SampleModel rule in the same class as the YouTubeData data set, but it needs to be in the Data-Social-YouTube class hierarchy. You can use the top level class or the base class.
- Click Create and open.
- Enable sentiment analysis:
- Select the Enable sentiment analysis check box.
- In the lexicon field, select pySentimentLexicon.
- In the sentiment model field, select pySentimentModels.
- Click the I/O Mapping tab.
- In the input field, set the .pyText property.
- In the output field, set the .NLPOutcome property. Create the property if it does not exist. This must be a single-page property defined on the Data-NLP-Outcome class.
- Click Save.
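To make the I/O mapping concrete, the sketch below shows one simple way that lexicon-based sentiment scoring can turn input text into an outcome page. It is not the algorithm behind the Free Text Model rule: the tiny lexicon, the scoring rule, and the "positive" and "neutral" labels are assumptions. Only the .pyText and .NLPOutcome property names come from the steps above.

```python
import re

# Tiny made-up lexicon for illustration; real lexicons are far larger.
SENTIMENT_LEXICON = {
    "great": 1, "love": 1, "helpful": 1,
    "bad": -1, "broken": -1, "terrible": -1,
}


def overall_sentiment(text):
    """Classify text as positive, negative, or neutral by summing word scores."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(SENTIMENT_LEXICON.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(record):
    """Read the input text (pyText) and attach an outcome page (NLPOutcome)."""
    outcome = {"pyOverallSentiment": overall_sentiment(record["pyText"])}
    return {**record, "NLPOutcome": outcome}


result = analyze({"pyText": "The update is terrible and the player is broken."})
print(result["NLPOutcome"]["pyOverallSentiment"])  # negative
```

The outcome is kept on its own page so that later steps can read the sentiment without touching the input text.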
Creating an instance of the Data Flow rule
Create a data flow called NLPProcess to reference the SampleModel rule and to process the metadata of the YouTube videos that are handled by the YouTube data set.
- In the Records Explorer, click Data Model > Data Flow.
- Click Create.
- Name the rule NLPProcess.
- Specify the context where you want to create the rule:
- In the Apply to (class) field, select the Data-Social-YouTube class.
- Click Create and open.
- Double-click the Source shape.
- In the Source properties dialog box, from the Source list, select Data set.
- From the data set list, select the YouTube data set (YouTubeData) and click Submit.
- Navigate to the Source shape and click the green add icon.
- From the list, select Free Text Model.
- Double-click the Free Text Model shape.
- In the Free Text Model properties dialog box, reference the SampleModel rule.
- Click Submit.
- Navigate to the Free Text Model shape and click the green add icon.
- From the list, select Filter.
- Double-click the Filter shape.
- Name the shape Sentiment.
- In the Filter conditions section, specify the following condition: .NLPOutcome.pyOverallSentiment = "negative"
The outcome property that you use in the filter must be the same as the one that you specified in the SampleModel rule.
- Click Submit.
- Double-click the Destination shape.
- In the Destination properties dialog box, from the Destination list, select Activity.
- In the Activity field, reference the following activity: pxSaveSummaryForReporting.
- Click Submit.
- Click Save.
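The shapes configured above form a simple pipeline: source, model, filter, destination. The following sketch imitates that structure with Python generators. It is a conceptual illustration under stated assumptions, not how the Pega 7 Platform runs a data flow: the function names, the toy classifier that stands in for the SampleModel rule, the sample records, and the list that stands in for the destination activity are all hypothetical.

```python
def source(records):
    """Source shape: emit records one at a time."""
    yield from records


def free_text_model(records, classify):
    """Free Text Model shape: attach an outcome page to each record."""
    for record in records:
        outcome = {"pyOverallSentiment": classify(record["pyText"])}
        yield {**record, "NLPOutcome": outcome}


def sentiment_filter(records, wanted="negative"):
    """Filter shape: keep records whose overall sentiment equals `wanted`."""
    for record in records:
        if record["NLPOutcome"]["pyOverallSentiment"] == wanted:
            yield record


def destination(records, store):
    """Destination shape: hand each record to a sink (a list stands in for the activity)."""
    for record in records:
        store.append(record)
    return store


def toy_classifier(text):
    """Hypothetical stand-in for the SampleModel rule."""
    return "negative" if "bad" in text.lower() else "positive"


# Made-up sample records.
incoming = [
    {"pyText": "Bad audio quality"},
    {"pyText": "Nice tutorial"},
    {"pyText": "Really bad editing"},
]
saved = destination(
    sentiment_filter(free_text_model(source(incoming), toy_classifier)), []
)
print(len(saved))  # 2
```

Because each stage consumes the previous one lazily, a record that fails the filter never reaches the destination, which matches the intent of the Sentiment filter shape.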
Analyzing the metadata of YouTube videos
Activate the NLPProcess data flow and keep it active until it processes some records.
- Click the Application menu in Designer Studio, and switch to your application.
- Open the NLPProcess data flow.
- Click Actions > Run.
- In the Data Flow Test Run dialog box, click Activate.
- Wait until the data flow processes some records.
You created the NLPProcess data flow, which uses the YouTubeData data set and the SampleModel rule. The YouTubeData data set filters the metadata of YouTube videos according to the keywords that you specified in it. The SampleModel rule checks the overall sentiment of the metadata content. At the end, the metadata with negative sentiment is saved through the pxSaveSummaryForReporting activity. You can use a report definition to retrieve the saved information.