This documentation site is for previous versions. Visit our new documentation site for current releases.

Parsing emails

Updated on May 17, 2024

Pega Platform provides the pxEmailParser model that you can use as a preprocessing model to analyze incoming emails and parse their content into logical components: body, signature, and disclaimer. You can define the components on which you want to perform text analysis and which you want to exclude from analysis.

Purpose

A typical use case for the email parser is when you expect the signature and disclaimer to adversely affect the outcomes of topic or sentiment analysis. The email parser ensures that downstream models work only on the relevant portions of the email. You define which parts of an email to include in the analysis by configuring the text analyzer that is associated with an email channel.

Parsing email content before further text analysis
A flow chart shows that an incoming email is parsed into three components: body, signature, and disclaimer.

Email components

Email components that the email parser can identify hold specific types of information:

Body
Contains the main message of an email.
Disclaimer
Holds a legal notice or warning, for example, a copyright or confidentiality disclaimer. The disclaimer usually follows the signature.
Signature
Contains a sign-off message, the sender's name, contact details, and similar information. The signature is usually placed at the end of an email.
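The following examples show how the email parser splits an email. In each example, the original email comes first, followed by the extracted text, in which <START:...> and <END> tags mark each identified component.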

Hi,

Can we have a call tomorrow to discuss on this recovery data set.

Thanks & Regards

Dave

-----------------------------------------------------

This message has been prepared by a Sales or Trading function of one or more affiliates of the Bank and is not the product of the Research Dept. It is not a research report.

This should not be construed as an offer to sell or the solicitation of an offer to buy any security in any jurisdiction where such an offer or solicitation would be illegal. It does not constitute a recommendation or take into account the investment objectives, financial conditions, or needs of individual clients.

This email and any files transmitted with it are confidential and intended solely for the person or entity to whom they are addressed and may contain confidential and/or privileged material.

Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you have received this email in error please contact the sender and delete the material from any computer.

<START:body> Hi,

Can we have a call tomorrow to discuss on this recovery data set. <END>

<START:signature> Thanks & Regards

Dave <END>

<START:disclaimer>

-----------------------------------------------------

This message has been prepared by a Sales or Trading function of one or more affiliates of the Bank and is not the product of the Research Dept. It is not a research report.

This should not be construed as an offer to sell or the solicitation of an offer to buy any security in any jurisdiction where such an offer or solicitation would be illegal. It does not constitute a recommendation or take into account the investment objectives, financial conditions, or needs of individual clients.

This email and any files transmitted with it are confidential and intended solely for the person or entity to whom they are addressed and may contain confidential and/or privileged material.

Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you have received this email in error please contact the sender and delete the material from any computer. <END>

Hi John,

We could replicate it in our environment.

Could you please provide the environment details?

Thanks

Mark

<START:body> Hi John,

We could replicate it in our environment.

Could you please provide the environment details? <END>

<START:signature> Thanks

Mark <END>

Hello,

The last merge into Cambridge was performed Friday.

Please change your bug status accordingly.

DISCLAIMER:

The information contained in this e-mail message is for the use of the addressee and is solely intended for the person to whom it has been sent. This message may contain legally privileged and confidential information which may not be made public. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the same. Internet e-mails are not necessarily secure. ADP does not accept responsibility for changes made to this message after it was sent. ADP may monitor e-mails for business and operational purposes.

<START:body> Hello,

The last merge into Cambridge was performed Friday.

Please change your bug status accordingly. <END>

<START:disclaimer>

DISCLAIMER:

The information contained in this e-mail message is for the use of the addressee and is solely intended for the person to whom it has been sent. This message may contain legally privileged and confidential information which may not be made public. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the same. Internet e-mails are not necessarily secure. ADP does not accept responsibility for changes made to this message after it was sent. ADP may monitor e-mails for business and operational purposes. <END>
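
The tagged output in the examples above can be split back into separate components with a few lines of code. The following Python sketch is illustrative only: it assumes that the extracted text is available as a plain string in the <START:...> and <END> format shown above, and the names TAGGED_SPAN and split_components are hypothetical, not part of Pega Platform.

```python
import re

# Matches one tagged span such as "<START:body> ... <END>".
# The tag format is taken from the examples above; the regex is an assumption.
TAGGED_SPAN = re.compile(r"<START:(\w+)>(.*?)<END>", re.DOTALL)


def split_components(tagged_text):
    """Return a dict that maps each component name to its stripped text."""
    components = {}
    for name, text in TAGGED_SPAN.findall(tagged_text):
        components[name] = text.strip()
    return components


sample = (
    "<START:body> Hi John,\n\n"
    "We could replicate it in our environment.\n\n"
    "Could you please provide the environment details? <END>\n\n"
    "<START:signature> Thanks\n\nMark <END>"
)

parts = split_components(sample)
print(sorted(parts))        # component names found in the sample
print(parts["signature"])   # text of the signature component
```

An email without a disclaimer, like this sample, simply yields no disclaimer key, which mirrors the second example above.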

Configuration

Your application parses emails according to the settings of the text analyzer that is associated with an email channel. In the text analyzer configuration, you can select the default pxEmailParser or a different model as the preprocessing model with which you want to parse email content. You can also decide which type of analysis (topic, sentiment, entity) you want to perform on each email component (body, signature, disclaimer).
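
To make the idea of per-component analysis concrete, the following Python sketch models the choice as a simple mapping from component to analysis types. It is a conceptual illustration under stated assumptions: the mapping, its values, and the text_for_analysis function are hypothetical and do not correspond to Pega rule names, properties, or APIs. In Pega Platform, you make these choices in the text analyzer configuration.

```python
# Hypothetical settings: which analyses run on which email component.
# These names are illustrative and are not Pega rule or property names.
ANALYSIS_BY_COMPONENT = {
    "body": {"topic", "sentiment", "entity"},
    "signature": {"entity"},
    "disclaimer": set(),  # excluded from every analysis
}


def text_for_analysis(components, analysis):
    """Join the text of every component on which `analysis` is enabled."""
    selected = [
        text
        for name, text in components.items()
        if analysis in ANALYSIS_BY_COMPONENT.get(name, set())
    ]
    return "\n\n".join(selected)


components = {
    "body": "Could you please provide the environment details?",
    "signature": "Thanks\n\nMark",
    "disclaimer": "This email is confidential.",
}

# Sentiment analysis sees only the body; entity analysis also sees the signature.
print(text_for_analysis(components, "sentiment"))
print(text_for_analysis(components, "entity"))
```

With these assumed settings, the disclaimer never reaches a downstream model, which is the effect described in the Purpose section.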

Training and testing

You can train the email parser with sample emails from your domain to increase the accuracy with which the email parser identifies the signature, body, and disclaimer. As you train or troubleshoot problems with the email parser, you can test the model to see if it works as expected.
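
One way to judge whether the parser works as expected is to compare its output with hand-labeled emails, component by component. The Python sketch below assumes that both the expected and the predicted components are available as dictionaries; the component_accuracy function and the exact-match criterion are assumptions made for illustration and do not describe how Pega Platform reports test results.

```python
COMPONENT_NAMES = ("body", "signature", "disclaimer")


def component_accuracy(expected, predicted):
    """Per component, the share of emails whose predicted text matches exactly.

    A component that is missing from a dict is treated as an empty string.
    """
    if not expected or len(expected) != len(predicted):
        raise ValueError("expected and predicted must be non-empty and equal in length")
    scores = {}
    for name in COMPONENT_NAMES:
        matches = sum(
            1
            for exp, pred in zip(expected, predicted)
            if exp.get(name, "") == pred.get(name, "")
        )
        scores[name] = matches / len(expected)
    return scores


# Hand-labeled components for two test emails (illustrative data).
expected = [
    {"body": "Hi John,", "signature": "Thanks\n\nMark"},
    {"body": "Hello,", "disclaimer": "DISCLAIMER: sample text"},
]
# Hypothetical parser output: the second email's disclaimer leaked into the body.
predicted = [
    {"body": "Hi John,", "signature": "Thanks\n\nMark"},
    {"body": "Hello,\n\nDISCLAIMER: sample text"},
]

scores = component_accuracy(expected, predicted)
print(scores)
```

A low score for one component suggests which kind of sample email to add when you train the parser further.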

Supported languages

The pxEmailParser model supports several languages.
