GdPictureImaging.PdfOCRCreateFromMultipageTIFF Method (Integer, TesseractDictionary, String, String, String, Boolean, String, String, String, String, String)

Creates a multipage searchable PDF from a multipage TIFF image by running OCR on it. The source can be either a read-only or an editable multipage TIFF image. The recognized text is written to the PDF as invisible text, so the document can be indexed and searched.

 

Each recognition language requires its own dictionary files, which you must deploy with your application. See the Dictionary parameter.

C#
public String PdfOCRCreateFromMultipageTIFF(
    int ImageID, 
    TesseractDictionary Dictionary, 
    String DictionaryPath, 
    String CharWhiteList, 
    String FilePath, 
    Boolean PDFA, 
    String Title, 
    String Author, 
    String Subject, 
    String Keywords, 
    String Creator
);
Visual Basic
Public Function PdfOCRCreateFromMultipageTIFF(
    ByVal ImageID As Integer, 
    ByVal Dictionary As TesseractDictionary, 
    ByVal DictionaryPath As String, 
    ByVal CharWhiteList As String, 
    ByVal FilePath As String, 
    ByVal PDFA As Boolean, 
    ByVal Title As String, 
    ByVal Author As String, 
    ByVal Subject As String, 
    ByVal Keywords As String, 
    ByVal Creator As String
) As String
Parameters
Parameter    Description
ImageID 
The GdPicture image identifier of the multipage TIFF image to save as PDF.
Dictionary 
The dictionary to use. A member of the TesseractDictionary enumeration.  
DictionaryPath 
The path in which the engine can find the dictionary files (see the TesseractDictionary enumeration).
CharWhiteList 
Restricts recognition to your own white list of characters. For example:
  • To recognize only digits, use "0123456789".
  • To recognize only uppercase letters, use "ABCDEFGHIJKLMNOPQRSTUVWXYZ".
Use an empty string to recognize all characters.
FilePath 
The complete file path of the PDF to create.
PDFA 
True to generate the PDF in PDF/A format; otherwise, False.
Title
The title of the PDF.
Author
The author of the PDF.
Subject
The subject of the PDF.
Keywords
The keywords of the PDF.
Creator 
The name of the application which creates the PDF.  
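
The CharWhiteList values described above are ordinary strings, so they can be written by hand or built in code. The following is a minimal sketch in plain C# that does not call GdPicture.NET at all. The WhiteLists class and its Combine helper are hypothetical names introduced here for illustration, and the idea that two lists can be merged by concatenating their characters is an assumption based on the examples above, not something this topic states.

```csharp
using System;
using System.Linq;

// Hypothetical helper class (not part of GdPicture.NET) that prepares
// strings suitable for the CharWhiteList parameter.
static class WhiteLists
{
    // Digits only, matching the "0123456789" example.
    public const string Digits = "0123456789";

    // Uppercase A-Z, generated instead of typed by hand.
    public static readonly string UpperAlpha =
        new string(Enumerable.Range('A', 26).Select(c => (char)c).ToArray());

    // Empty string: no restriction, all characters are recognized.
    public const string All = "";

    // Assumption: a combined list is the concatenation of its parts.
    // Duplicate characters are dropped, keeping the first occurrence.
    public static string Combine(params string[] lists)
    {
        return new string(string.Concat(lists).Distinct().ToArray());
    }
}
```

For instance, WhiteLists.Combine(WhiteLists.Digits, WhiteLists.UpperAlpha) yields a string of the ten digits followed by the 26 uppercase letters, which could be passed as CharWhiteList when only part numbers or similar codes are expected.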
Returns

The recognized text.

This function requires the optional GdPicture Tesseract Plugin. This plugin must be unlocked with the SetLicenseNumberOCRTesseract() function. 

Call the GetStat() function afterwards to determine whether this function succeeded.
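
A typical call sequence might look like the sketch below. It is an illustration under stated assumptions rather than an official sample, and it only compiles with a reference to the GdPicture.NET assembly and the Tesseract plugin. Only PdfOCRCreateFromMultipageTIFF, SetLicenseNumberOCRTesseract, GetStat and the TesseractDictionary enumeration are named in this topic. The GdPicture namespace, SetLicenseNumber, TiffCreateMultiPageFromFile, ReleaseGdPictureImage, GdPictureStatus.OK and the TesseractDictionaryEnglish member are assumptions about the rest of the library and should be checked against the documentation of your version. All file paths and license keys are placeholders.

```csharp
using System;
using GdPicture; // assumption: the namespace may differ between versions

class SearchablePdfExample
{
    static void Main()
    {
        GdPictureImaging imaging = new GdPictureImaging();

        // Placeholder keys: replace with your own license numbers.
        imaging.SetLicenseNumber("YOUR_GDPICTURE_KEY");             // assumption: core unlock call
        imaging.SetLicenseNumberOCRTesseract("YOUR_TESSERACT_KEY"); // required by this method

        // Assumption: loads a multipage TIFF and returns its image ID (0 on failure).
        int imageID = imaging.TiffCreateMultiPageFromFile(@"C:\scans\input.tif");
        if (imageID == 0)
        {
            Console.WriteLine("Could not load the TIFF: " + imaging.GetStat());
            return;
        }

        string text = imaging.PdfOCRCreateFromMultipageTIFF(
            imageID,
            TesseractDictionary.TesseractDictionaryEnglish, // assumption: enum member name
            @"C:\ocr\dictionaries",   // DictionaryPath (placeholder folder)
            "",                       // CharWhiteList: empty = all characters
            @"C:\scans\output.pdf",   // FilePath of the PDF to create
            true,                     // PDFA
            "Scanned report",         // Title
            "Jane Doe",               // Author
            "Archive",                // Subject
            "scan, ocr",              // Keywords
            "MyScanApp");             // Creator

        // The return value is the recognized text; success is reported by GetStat().
        if (imaging.GetStat() == GdPictureStatus.OK) // assumption: OK is the success value
            Console.WriteLine("PDF created, {0} characters recognized.", text.Length);
        else
            Console.WriteLine("OCR failed: " + imaging.GetStat());

        imaging.ReleaseGdPictureImage(imageID); // assumption: frees the image resource
    }
}
```

The status is checked with GetStat() rather than by inspecting the returned string, because an empty result could also come from a page that contains no recognizable text.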

Copyright (c) 2009-2011 www.gdpicture.com. All rights reserved.