Optical Character Recognition (OCR) PDF

OCR PDF generates a searchable PDF file and stores it in Object Storage. For example, Document Understanding can take a PDF file that contains text and images, and return a PDF file in which that text is searchable.

Supported features:
  • Generate searchable PDF
  • Single request
  • Batch request

OCR PDF Example

The following example shows how OCR PDF is used in Document Understanding.

Input
[Figure: OCR PDF input, a page from a PDF document]
API Request:
{
  "processorConfig": {
    "processorType": "GENERAL",
    "features": [
      {
        "featureType": "TEXT_EXTRACTION",
        "generateSearchablePdf": true
      }
    ]
  },
  "inputLocation": {
    "sourceType": "OBJECT_STORAGE_LOCATIONS",
    "objectLocations": [
      {
        "source": "OBJECT_STORAGE",
        "namespaceName": "",
        "bucketName": "",
        "objectName": ""
      }
    ]
  },
  "compartmentId": "",
  "outputLocation": {
    "namespaceName": "",
    "bucketName": "",
    "prefix": ""
  }
}
Output:
A searchable PDF file, written to the Object Storage location given in outputLocation.
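
The request body above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_ocr_pdf_request and its parameters are hypothetical, and only the JSON keys and constant values are taken from the API request shown above. It assumes that several input files can be listed as additional objectLocations entries and that the output uses the same namespace as the input. It only builds and prints the JSON and does not call the service.

```python
import json


def build_ocr_pdf_request(namespace, bucket, object_names, compartment_id,
                          output_bucket, output_prefix):
    """Build a request body shaped like the API request shown above.

    The function and argument names are hypothetical; only the JSON keys
    and constant values come from the documented example.
    """
    if not object_names:
        raise ValueError("at least one input object name is required")
    return {
        "processorConfig": {
            "processorType": "GENERAL",
            "features": [
                {
                    "featureType": "TEXT_EXTRACTION",
                    # This flag requests the searchable PDF output.
                    "generateSearchablePdf": True,
                }
            ],
        },
        "inputLocation": {
            "sourceType": "OBJECT_STORAGE_LOCATIONS",
            # Assumption: one entry per input file in the same bucket.
            "objectLocations": [
                {
                    "source": "OBJECT_STORAGE",
                    "namespaceName": namespace,
                    "bucketName": bucket,
                    "objectName": name,
                }
                for name in object_names
            ],
        },
        "compartmentId": compartment_id,
        "outputLocation": {
            # Assumption: output is written under the input's namespace.
            "namespaceName": namespace,
            "bucketName": output_bucket,
            "prefix": output_prefix,
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    body = build_ocr_pdf_request(
        namespace="example-namespace",
        bucket="example-input-bucket",
        object_names=["example.pdf"],
        compartment_id="example-compartment-id",
        output_bucket="example-output-bucket",
        output_prefix="ocr-results",
    )
    print(json.dumps(body, indent=2))
```

Building the body from a function keeps the fixed parts (processorType, featureType, generateSearchablePdf) in one place, so only the storage locations and compartment vary between requests.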