GdPictureImaging.PdfAddGdPictureImageToPdfOCR Method

Appends a GdPicture image to the multipage PDF file created by the PdfOCRStart() or PdfOCRStartStream() function.

Each recognition language requires its own dictionary files, which you must deploy with your application. See the Dictionary and DictionaryPath parameters.

C#
public String PdfAddGdPictureImageToPdfOCR(
    int PdfID, 
    int ImageID, 
    TesseractDictionary Dictionary, 
    String DictionaryPath, 
    String CharWhiteList
);
Visual Basic
Public Function PdfAddGdPictureImageToPdfOCR(
    ByVal PdfID As Integer, 
    ByVal ImageID As Integer, 
    ByVal Dictionary As TesseractDictionary, 
    ByVal DictionaryPath As String, 
    ByVal CharWhiteList As String
) As String
Parameters

PdfID
    A PDF identifier returned by the PdfOCRStart() or PdfOCRStartStream() function.
ImageID
    The GdPicture image to add to the multipage PDF.
Dictionary
    The dictionary to use. A member of the TesseractDictionary enumeration.
DictionaryPath
    The path where the engine can find the dictionary files (see the TesseractDictionary enumeration).
CharWhiteList
    A white list of characters that restricts recognition to those characters. For example:
      • To recognize only digits, use "0123456789".
      • To recognize only uppercase letters, use "ABCDEFGHIJKLMNOPQRSTUVWXYZ".
    Use an empty string to recognize all characters.
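
The CharWhiteList values described above could be passed as in the following C# sketch. It is only an illustration: the class name, the helper name AddDigitsOnlyPage and the GdPicture namespace in the using directive are assumptions, not part of this reference; only the PdfAddGdPictureImageToPdfOCR signature and the two white list strings come from this page.

```csharp
using GdPicture; // Assumption: namespace exposed by your GdPicture.NET assembly; adjust to your version.

public static class OcrWhiteListSketch
{
    // The two white lists quoted in the CharWhiteList description.
    public const string Digits = "0123456789";
    public const string UpperAlpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Hypothetical helper: appends one page to the PDF while recognizing digits only.
    // pdfId must come from PdfOCRStart() or PdfOCRStartStream();
    // imageId must identify a GdPicture image that is already loaded.
    public static string AddDigitsOnlyPage(
        GdPictureImaging imaging, int pdfId, int imageId, string dictionaryPath)
    {
        return imaging.PdfAddGdPictureImageToPdfOCR(
            pdfId,
            imageId,
            TesseractDictionary.TesseractDictionaryEnglish,
            dictionaryPath,
            Digits); // Pass UpperAlpha for uppercase letters, or "" to recognize all characters.
    }
}
```

Keeping the white lists as named constants makes it easier to see at the call site which characters the engine is allowed to return.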
Returns

The recognized text.

This function requires the optional GdPicture Tesseract Plugin. This plugin must be unlocked with the SetLicenseNumberOCRTesseract() function. 

Use the GetStat() function to determine if this function succeeded.
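
A status check after the call might look like the C# sketch below. It assumes that GetStat() returns a GdPictureStatus value and that GdPictureStatus.OK indicates success; neither detail is stated on this page, so verify both against your version of the library. The class name, the helper name AddPageOrThrow and the GdPicture namespace are likewise hypothetical.

```csharp
using System;
using GdPicture; // Assumption: namespace exposed by your GdPicture.NET assembly; adjust to your version.

public static class OcrStatusSketch
{
    // Hypothetical helper: appends one page and throws if the library reports a failure.
    public static string AddPageOrThrow(
        GdPictureImaging imaging, int pdfId, int imageId, string dictionaryPath)
    {
        // An empty CharWhiteList means all characters are recognized.
        string text = imaging.PdfAddGdPictureImageToPdfOCR(
            pdfId,
            imageId,
            TesseractDictionary.TesseractDictionaryEnglish,
            dictionaryPath,
            "");

        // Assumption: GetStat() reports the status of the last call,
        // and GdPictureStatus.OK means that call succeeded.
        GdPictureStatus status = imaging.GetStat();
        if (status != GdPictureStatus.OK)
        {
            throw new InvalidOperationException(
                "PdfAddGdPictureImageToPdfOCR failed with status " + status);
        }

        return text;
    }
}
```

Checking the status immediately after the call matters because the return value alone cannot distinguish a failed call from a page on which no text was recognized.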

How to scan pages from a document feeder into a multipage searchable PDF file.

Dim oGdPictureImaging As New GdPictureImaging
Dim ImageID As Integer
Dim bContinue As Boolean = True
Dim PdfID As Integer

oGdPictureImaging.SetLicenseNumber("XXX") 'Replace XXX with a demo or commercial license key
oGdPictureImaging.SetLicenseNumberOCRTesseract("XXX") 'Replace XXX with your Tesseract plugin key

'Open the default TWAIN source and let the feeder scan pages automatically
oGdPictureImaging.TwainOpenDefaultSource(Me.Handle)
oGdPictureImaging.TwainSetAutoFeed(True)
oGdPictureImaging.TwainSetAutoScan(True)

'Start the multipage searchable PDF
PdfID = oGdPictureImaging.PdfOCRStart("output.pdf", True, "", "", "", "", "")

If PdfID <> 0 Then
   While bContinue
         ImageID = oGdPictureImaging.TwainAcquireToGdPictureImage(Me.Handle)
         If ImageID <> 0 Then 'Assumes 0 means the acquisition failed
            'OCR the page with the English dictionary and append it to the PDF
            oGdPictureImaging.PdfAddGdPictureImageToPdfOCR(PdfID, ImageID, TesseractDictionary.TesseractDictionaryEnglish, "C:\Program Files\GdPicture.NET\Redist\OCR", "")
            oGdPictureImaging.ReleaseGdPictureImage(ImageID)
         End If
         'Keep going while the source still has pages to transfer
         bContinue = (oGdPictureImaging.TwainGetState > TwainStatus.TWAIN_SOURCE_ENABLED)
   End While
   oGdPictureImaging.PdfOCRStop(PdfID)
End If
oGdPictureImaging.TwainCloseSource()
Copyright (c) 2009-2011 www.gdpicture.com. All rights reserved.