Document Processing Service profiles
To select the OCR and highlighting service profiles for the Document Processing Service (DPS), set the ocrPredefinedProfile and highlightPredefinedProfile parameters of the configureDPSABBYY data transform, respectively. For more information, see Configuring the Document Processing Service component.
The list below describes the available profiles:
- BarcodeRecognition_Accuracy
Used for barcode extraction. In this profile, the system extracts only barcodes, and does not detect pictures, text, or tables. This profile optimizes settings for accuracy.
For compatibility purposes, you can also access this profile by using the BarcodeRecognition name.
- BarcodeRecognition_Speed
Used for barcode extraction. In this profile, the system extracts only barcodes, and does not detect pictures, text or tables. This profile optimizes settings for processing speed.
- BookArchiving_Accuracy
Used for creating an electronic library. The settings for this profile are optimized for accuracy and best quality. This profile enables font style detection and full synthesis of the logical structure of the document.
- BookArchiving_Speed
Used for creating an electronic library. This profile optimizes settings for processing speed in the following way:
- The system enables font style detection and full synthesis of the logical structure of the document for best quality.
- The document analysis and recognition process works faster.
- Default
Sets all the processing parameters to the values of the BookArchiving_Accuracy profile.
- DocumentArchiving_Accuracy
Used for creating an electronic archive. This profile optimizes settings for accuracy in the following way:
- The system enables the detection of the maximum amount of text in an image, including text that is embedded in the image.
- The system does not perform skew correction.
- The system does not detect fonts and styles.
- The system does not perform full synthesis of the logical structure of the document.
- DocumentArchiving_Speed
Used for creating an electronic archive. This profile optimizes settings for processing speed in the following way:
- The system enables the detection of the maximum amount of text in an image, including text that is embedded in the image.
- The system does not perform skew correction.
- The system does not detect fonts and styles.
- The system does not perform full synthesis of the logical structure of a document.
- The document analysis and recognition process works faster.
- DocumentConversion_Accuracy
Used for converting documents. The settings for this profile are optimized for accuracy and best quality. This profile enables font style detection and full synthesis of the logical structure of the document.
- DocumentConversion_Speed
Used for converting documents. This profile optimizes settings for processing speed in the following way:
- The system enables font style detection and full synthesis of the logical structure of the document for best quality.
- The document analysis and recognition process works faster.
- EngineeringDrawingsProcessing
Used for recognizing technical drawings. The system takes into account the large size and the complexity of engineering diagrams, as well as different text orientations within the image. The purpose of this profile is to convert the images into a searchable PDF format. This profile uses the following settings:
- The system enables the detection of all text in an image, including text blocks in a vertical orientation.
- The system does not perform full synthesis of the logical structure of a document.
- HighCompressedImageOnlyPdf
Used for creating high-compression PDF files that contain full documents saved as pictures. This profile uses the following settings:
- The system does not perform document recognition or synthesis of the logical structure of a document.
- The system does not perform skew correction.
- PDF export is optimized for minimum size of the output file.
- The entire document is saved as a picture using the PEM_ImageOnly mode.
- TextExtraction_Accuracy
Used for extracting text from a document. This profile optimizes settings for accuracy in the following way:
- The system enables the detection of all text in an image, including small text areas that are of low quality. Pictures and tables are not detected.
- The system does not detect fonts and styles.
- The system does not perform full synthesis of the logical structure of a document.
- TextExtraction_Speed
Used for extracting text from a document. This profile optimizes settings for processing speed in the following way:
- The system enables the detection of all text in an image, including small text areas that are of low quality. Pictures and tables are not detected.
- The system does not detect fonts and styles.
- The system does not perform full synthesis of the logical structure of a document.
- The document analysis and recognition process works faster.
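
Because profile names are plain strings, a typo in ocrPredefinedProfile or highlightPredefinedProfile is easy to miss. The following Python sketch shows one way to validate a name against the list above before you put it into your configuration. It is not part of the Document Processing Service: the helper names normalize_profile and pick_profile, and the settings dictionary, are hypothetical and exist only for illustration. The profile names, the BarcodeRecognition compatibility name, and the two parameter names come from this topic.

```python
# Hypothetical helper, not part of the Document Processing Service.
# It only checks profile names against the list documented in this topic.

KNOWN_PROFILES = frozenset({
    "BarcodeRecognition_Accuracy",
    "BarcodeRecognition_Speed",
    "BookArchiving_Accuracy",
    "BookArchiving_Speed",
    "Default",
    "DocumentArchiving_Accuracy",
    "DocumentArchiving_Speed",
    "DocumentConversion_Accuracy",
    "DocumentConversion_Speed",
    "EngineeringDrawingsProcessing",
    "HighCompressedImageOnlyPdf",
    "TextExtraction_Accuracy",
    "TextExtraction_Speed",
})

# Compatibility name described for BarcodeRecognition_Accuracy.
ALIASES = {"BarcodeRecognition": "BarcodeRecognition_Accuracy"}


def normalize_profile(name: str) -> str:
    """Return the documented profile name, or raise ValueError if unknown."""
    candidate = name.strip()
    canonical = ALIASES.get(candidate, candidate)
    if canonical not in KNOWN_PROFILES:
        raise ValueError(f"Unknown profile name: {name!r}")
    return canonical


def pick_profile(family: str, prefer_speed: bool = False) -> str:
    """Choose the _Speed or _Accuracy variant of a profile family.

    Profiles without variants (for example, Default) are returned unchanged.
    """
    if family in KNOWN_PROFILES:
        return family
    suffix = "Speed" if prefer_speed else "Accuracy"
    return normalize_profile(f"{family}_{suffix}")


# Illustrative container for the two parameter values; how the values are
# passed to the configureDPSABBYY data transform is not shown here.
settings = {
    "ocrPredefinedProfile": pick_profile("DocumentArchiving", prefer_speed=True),
    "highlightPredefinedProfile": pick_profile("TextExtraction"),
}
```

With this sketch, pick_profile("DocumentArchiving", prefer_speed=True) yields DocumentArchiving_Speed, and a misspelled name raises ValueError instead of reaching the configuration.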