OCR Extension (Tesseract)

These methods are available in:

  • GdPicture.NET Document Imaging SDK Ultimate.
  • GdPicture.NET Document Imaging SDK. *
  • GdPicture.NET Image SDK. *
  • GdPicture.NET TWAIN PRO SDK. *
  • GdPicture.NET TWAIN SDK. *
 

 

* A license for the optional GdPicture Tesseract Plugin is also required. This plugin is included free of charge in the Ultimate edition.

 

Note: For PDF/OCR purposes, see the PDF Creation / Building Searchable PDF & PDF/A section.

 
Name 
Description 
 
Use this function to release from memory all information about the last OCR process done by the OCRTesseractDoOCR() function.
 
Starts a character recognition process on a GdPicture image, or on an area of a GdPicture image defined by the SetROI() function.
Each recognition language requires its own files to be deployed. See the Dictionary parameter.
 
This function returns the bottom position, in pixels, of one of the characters recognized during the last OCR process done by the OCRTesseractDoOCR() function.
 
This function returns the ASCII code of one of the characters recognized during the last OCR process done by the OCRTesseractDoOCR() function.  
 
This function returns the confidence of one of the characters recognized during the last OCR process done by the OCRTesseractDoOCR() function.  
 
This function returns the number of characters recognized during the last OCR process done by the OCRTesseractDoOCR() function.  
 
This function returns the left position, in pixels, of one of the characters recognized during the last OCR process done by the OCRTesseractDoOCR() function.
 
This function returns the line position of one of the characters recognized during the last OCR process done by the OCRTesseractDoOCR() function.  
 
This function returns the right position, in pixels, of one of the characters recognized during the last OCR process done by the OCRTesseractDoOCR() function.
 
This function returns the number of spaces detected before a character recognized during the last OCR process done by the OCRTesseractDoOCR() function.  
 
This function returns the top position, in pixels, of one of the characters recognized during the last OCR process done by the OCRTesseractDoOCR() function.
 
This is the overview for the OCRTesseractGetOrientation method overload. 
 
The Tesseract engine is based on a learning algorithm. Therefore, the accuracy of a character recognition process can depend on previous OCR processes.
Use this function to reinitialize the Tesseract engine to its default configuration.
 
Defines a maximum height for recognized characters.  
 
Defines a maximum width for recognized characters.
 
Defines a minimum height for recognized characters.  
 
Defines a minimum width for recognized characters.  
 
Sets the number of learning passes for each recognition process (1 by default).
A value between 2 and 5 can improve recognition quality.
A value of 2 is recommended for full-page OCR.
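
The per-character values described above (character code, line position, number of preceding spaces, confidence) can be combined to rebuild the recognized text line by line. The following is a minimal, SDK-independent sketch in Python. It does not call GdPicture.NET. The RecognizedChar record, the rebuild_lines helper, and the sample values are all hypothetical, and the confidence scale (higher meaning more reliable) is an assumption, not something this page states.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RecognizedChar:
    """Hypothetical record holding the per-character values described above."""
    code: int          # character code of the recognized character
    line: int          # line position of the character
    spaces: int        # number of spaces detected before the character
    confidence: float  # recognition confidence (assumed: higher is better)


def rebuild_lines(chars: List[RecognizedChar],
                  min_confidence: float = 0.0,
                  placeholder: str = "?") -> List[str]:
    """Group characters by line and restore the detected spaces.

    Characters whose confidence is below min_confidence are replaced
    by the placeholder so that doubtful results stay visible.
    """
    lines: Dict[int, List[str]] = {}
    for ch in chars:
        glyph = chr(ch.code) if ch.confidence >= min_confidence else placeholder
        lines.setdefault(ch.line, []).append(" " * ch.spaces + glyph)
    # Emit the lines in ascending line order.
    return ["".join(parts) for _, parts in sorted(lines.items())]


# Made-up sample values, for illustration only.
sample = [
    RecognizedChar(ord("H"), 1, 0, 95.0),
    RecognizedChar(ord("i"), 1, 0, 90.0),
    RecognizedChar(ord("y"), 1, 1, 40.0),
    RecognizedChar(ord("o"), 1, 0, 88.0),
    RecognizedChar(ord("u"), 2, 0, 92.0),
]

print(rebuild_lines(sample))                       # ['Hi yo', 'u']
print(rebuild_lines(sample, min_confidence=50.0))  # ['Hi ?o', 'u']
```

In a real application, each record would be filled from the character count and the per-character functions listed in the table, after a call to OCRTesseractDoOCR().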
Copyright (c) 2009-2011 www.gdpicture.com. All rights reserved.