
Extract content from image-based files

Updated on January 27, 2022

With the new DocumentOcr component, you can convert an image-based document that contains text into text that you can manipulate in an automation. Use this component with images of text, such as a faxed document, or with documents that contain both text and images. The component works with image files (such as .png, .jpeg, and .tiff), PDF files, and Microsoft Word documents.

  • Use in automations to extract data from the supported file formats.

  • Use in automations for searching and copying data from unstructured documents.
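The searching and copying use case can be pictured as a step that runs once the document has been converted to text. The following is a minimal sketch in Python that uses only the standard library. It does not call the DocumentOcr component or any Pega API; the sample text, the field labels, and the `find_fields` helper are all hypothetical and stand in for whatever text the OCR step returns.

```python
import re

# Hypothetical sample of text as it might come back from an OCR step.
# This is illustrative data, not output from the DocumentOcr component.
SAMPLE_TEXT = """ACME Supplies
Invoice No: INV-20417
Date: 2022-01-27
Total due: $1,284.50
"""


def find_fields(text):
    """Return the labeled values found in the extracted text as a dict.

    The labels and patterns are assumptions chosen for this example.
    Fields that do not appear in the text are left out of the result.
    """
    patterns = {
        "invoice": r"Invoice\s+No\s*:\s*(\S+)",
        "date": r"Date\s*:\s*(\d{4}-\d{2}-\d{2})",
        "total": r"Total\s+due\s*:\s*\$?([\d,]+\.\d{2})",
    }
    found = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            found[name] = match.group(1)
    return found


fields = find_fields(SAMPLE_TEXT)
print(fields)
```

Because OCR output often varies in spacing and capitalization, the patterns tolerate flexible whitespace and ignore case. Real documents usually need patterns tuned to their own layout.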

For more information, see DocumentOcr Component.

