OCR - Get Text

IMPORTANT: The GPL GhostScript (32-bit) library is required to support PDF files. You can download the installer for the 32-bit Windows version at https://www.ghostscript.com/download/gsdnld.html.

Declaration

<AMOCR IMAGE="text" ALLPAGE="yes/no" TOP="number" 
LEFT="number" WIDTH="number" HEIGHT="number" RESULTVARIABLE="text" />

Description: Retrieves text from an image and populates an existing variable with the results.

Practical Usage

Commonly used as a way to gather text from supported image files.

General Parameters

Property

Type

Required

Default

Markup

Description

Image

Text

Yes

(Empty)

IMAGE="C:\temp\Image.jpg"

The path and filename of the image file to retrieve text from. Supported image formats are JPG, PNG, TIFF, GIF and BMP.

NOTE: Although a variety of formats are supported, image data with lossless compression, such as TIFF, is recommended.

Entire image/page

 

 

 

 

If enabled, the entire image/page is searched for text (enabled by default). This is a visual mode parameter used only at design time, so it has no markup.

Specified region (improves accuracy)

 

 

 

 

If enabled, only a specific region of the image is searched. Click Pick region to open a dialog in which you can select the region. See Pick Region Dialog for more details. This is a visual mode parameter used only at design time, so it has no markup.

Top

Number

Yes if specifying region

(Empty)

TOP="223"

The topmost pixel coordinate of the region to search. This parameter is active only if the Specified region parameter is enabled.

Left

Number

Yes if specifying region

(Empty)

LEFT="115"

The leftmost pixel coordinate of the region to search. This parameter is active only if the Specified region parameter is enabled.

Width

Number

Yes if specifying region

(Empty)

WIDTH="647"

The width of the region to search, in pixels. This parameter is active only if the Specified region parameter is enabled.

Height

Number

Yes if specifying region

(Empty)

HEIGHT="647"

The height of the region to search, in pixels. This parameter is active only if the Specified region parameter is enabled.

Populate variable with OCR result text

Text

Yes

(Empty)

RESULTVARIABLE="Text"

The name of an existing variable to populate with the retrieved text.

Exact copy (do not format text)

Yes/No

No

No

EXACTCOPY="YES"

If set to YES, no formatting will occur and an exact copy of the text will be read. Set to NO by default.
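
As an illustration of how the general parameters combine, the following sketch reads a fixed region of an image and keeps the text unformatted. The file path and coordinate values are the sample values from the Markup column above; the variable name ocrResult is hypothetical and is created first because the activity populates an existing variable.

```xml
<AMVARIABLE NAME="ocrResult"></AMVARIABLE>
<AMOCR IMAGE="C:\temp\Image.jpg" TOP="223" LEFT="115"
WIDTH="647" HEIGHT="647" RESULTVARIABLE="ocrResult"
EXACTCOPY="YES" />
```

Narrowing the region to the area that actually contains text is the approach the Specified region parameter describes as improving accuracy.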

Advanced Properties

Property

Type

Required

Default

Markup

Description

Page Range All

Yes/No

No

Yes

ALLPAGE="yes"

If set to YES, text is retrieved from every page of the file and the Pages parameter is ignored (YES by default).

Page Range Pages

Number

No

No

  1. PAGE="1-3"

  2. PAGE="2,4,6"

  3. PAGE="3"

If set to YES, text will be retrieved from specific pages in a range (NO by default). Supports specification of a single page, specific pages or a sequence of pages in a range (see the Markup column for examples). Note that only GIF images support multiple pages. If this parameter is set to YES, the All parameter is ignored.

Languages

Yes/No

No

English

SPANISH="YES" PORTUGUESE="YES"

The language(s) of the text contained in the image file that should be read. Available languages are:

  • English

  • Spanish

  • Portuguese

  • French

  • Russian

  • Dutch

  • German

  • Italian

Use ICR (digits only)

Yes/No

No

No

ICR="YES"

If set to YES, ICR (Intelligent Character Recognition), a more advanced handwriting recognition system, will be used to recognize numbers or digits. Set to NO by default.

Invert image colors

Yes/No

No

No

INVERT="YES"

If set to YES, image colors are inverted: light becomes dark and dark becomes light. If this activity has trouble recognizing text, inverting the colors may increase the text's contrast and improve recognition accuracy.
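
The advanced properties can be combined in a single step. The sketch below is one possible combination, not a recommended configuration: it reads pages 2, 4 and 6 of a multi-page file, treats the text as Spanish, and inverts the colors before recognition. The file path C:\temp\Scan.gif and the variable name pageText are hypothetical, and setting ALLPAGE="no" alongside PAGE is an assumption based on the note that the Pages parameter is ignored when All is set to YES.

```xml
<AMVARIABLE NAME="pageText"></AMVARIABLE>
<AMOCR IMAGE="C:\temp\Scan.gif" ALLPAGE="no" PAGE="2,4,6"
SPANISH="YES" INVERT="YES" RESULTVARIABLE="pageText" />
```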

Example

Description: Gets text from a selected region of an image file and populates a variable with the results. The selected region is "{X=169,Y=113,Width=1083,Height=1090}". The retrieved text is then displayed in a message box.

<AMVARIABLE NAME="theText"></AMVARIABLE>
<AMOCR IMAGE="C:\Sample OCR\SAMPLEM.tif" 
TOP="113" LEFT="169" WIDTH="1083" 
HEIGHT="1090" RESULTVARIABLE="theText" />
<AMSHOWDIALOG MESSAGE="%theText%" />
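
For comparison, a variation of the example above that reads the whole image instead of a region might look like the following. It is a sketch that assumes omitting TOP, LEFT, WIDTH and HEIGHT causes the entire image to be searched, which the Entire image/page parameter suggests but this page does not state in markup terms.

```xml
<AMVARIABLE NAME="theText"></AMVARIABLE>
<AMOCR IMAGE="C:\Sample OCR\SAMPLEM.tif" ALLPAGE="yes"
RESULTVARIABLE="theText" />
<AMSHOWDIALOG MESSAGE="%theText%" />
```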