You are here: GdPicture.NET > Documentation > Classes > GdPictureImaging Class > GdPictureImaging Methods > PDF Creation > Building Searchable PDF & PDF/A > SaveAsPDFOCRStream > SaveAsPDFOCRStream Method (Integer, Stream, TesseractDictionary, String, String, Boolean, String, String, String, String, String)
logo.gif
ContentsIndexHome
PreviousUpNext
GdPictureImaging.SaveAsPDFOCRStream Method (Integer, Stream, TesseractDictionary, String, String, Boolean, String, String, String, String, String)

Saves a GdPicture image as single page searchable PDF stream and performs an OCR recognition. 

The recognized text is written invisibly to the PDF in order to simplify indexing and search. 

 

For each language of recognition, you have to deploy specific files. See the Dictionary parameter.

C#
public String SaveAsPDFOCRStream(
    int ImageID, 
    ref Stream PDFStream, 
    TesseractDictionary Dictionary, 
    String DictionaryPath, 
    String CharWhiteList, 
    Boolean PDFA, 
    String Title, 
    String Author, 
    String Subject, 
    String Keywords, 
    String Creator
);
Visual Basic
Public Function SaveAsPDFOCRStream(
    ByVal ImageID As Integer, 
    ByRef PDFStream As Stream, 
    ByVal Dictionary As TesseractDictionary, 
    ByVal DictionaryPath As String, 
    ByVal CharWhiteList As String, 
    ByVal PDFA As Boolean, 
    ByVal Title As String, 
    ByVal Author As String, 
    ByVal Subject As String, 
    ByVal Keywords As String, 
    ByVal Creator As String
) As String
Parameters
Parameters 
Description 
ImageID 
GdPicture Image Identifier.  
PDFStream 
The Stream object to save a GdPicture image.  
Dictionary 
The dictionary to use. A member of the TesseractDictionary enumeration.  
DictionaryPath 
The path into which the engine can find the dictionary files (see the TesseractDictionary enumeration).  
CharWhiteList 
This parameter can be used to specify your own white list of chars. IE:
  • If you want to recognize only numeric you can use "0123456789".
  • If you want to recognize only uppercase alpha you can use "ABCDEFGHIJKLMNOPQRSTUVWXYZ"...
Use empty string to recognize all characters.  
PDFA 
True to generate PDF in PDF/A format else False.  
Title 
The title of the PDF.  
Author 
The PDF Author.  
Subject 
The PDF Subject.  
Keywords 
The PDF Keywords.  
Creator 
The name of the application which creates the PDF.  
Returns

The recognized text.

This function requires the optional GdPicture Tesseract Plugin. This plugin must be unlocked with the SetLicenseNumberOCRTesseract() function. 

Use the GetStat() function to determine if this function succeeded.

What do you think about this topic? Send feedback!
Copyright (c) 2009-2011 www.gdpicture.com. All rights reserved.