OCR - Get line(s)
Declaration
<AMOCR ACTIVITY="get_lines" OCRENGINE="text (options)" PAGESEGMODE="text (options)" IMAGE="text" TOP="number" LEFT="number" WIDTH="number" HEIGHT="number" FILTER="text" MATCHCASE="YES/NO" RESULTDATASET="text" PAGE="number" ALLPAGE="YES/NO" ENGLISH="YES/NO" TURKISH="YES/NO" SPANISH="YES/NO" PORTUGUESE="YES/NO" FRENCH="YES/NO" RUSSIAN="YES/NO" DUTCH="YES/NO" GERMAN="YES/NO" ITALIAN="YES/NO" ICR="YES/NO" INVERT="YES/NO" />
Description
Retrieves one or more lines of text from an image or PDF file and then creates and populates a dataset with the results.
Practical usage
Use this activity with the OCR - Get word(s), Image - Capture screen, and Input - Move mouse activities to find the X, Y position of one or more lines of text inside an object (for example, a button or control) in a window that does not use the Windows API (for example, a Java window), so that the mouse can be moved to the correct location automatically.
Parameters
General
Property | Type | Required | Default | Markup | Description |
---|---|---|---|---|---|
OCR Engine | Text (options) | Yes | Tesseract | OCRENGINE="tesseract" | Specifies the OCR engine to use to retrieve one or more lines of text from an image or PDF file. The engines referenced in this topic are Tesseract (default) and Legacy. |
Page Segmentation Mode | Text (options) | Yes, if the OCR Engine parameter is set to Tesseract | Single Block | | Specifies the page segmentation mode used to scan the image or PDF file. For a more accurate scan, select the option that best suits the position of the lines of text to retrieve. This parameter is only available, and required, if the OCR Engine parameter is set to Tesseract. The available options are: |
Image | Text | Yes | (Empty) | IMAGE="C:\temp\Image.jpg" | Specifies the path and file name of the image or PDF file to use with this activity. Supported image formats are JPG, PNG, TIFF, GIF, and BMP. NOTE: Although a variety of formats are supported, formats with lossless compression, such as TIFF, are recommended. |
Entire image/page | --- | --- | --- | --- | If selected (default), this activity searches the entire image/page in the file for lines of text. NOTE: This parameter does not contain markup and is only displayed in visual mode for task construction and configuration purposes. |
Specified region (improves accuracy) | --- | --- | --- | --- | If selected, this activity only searches a specified region of the image or PDF file for lines of text. Click Pick Region to open a dialog that allows you to select an image region. Depending on the OCR Engine parameter's current setting, see Pick Region dialog (Tesseract OCR Engine) or Pick Region dialog (Legacy OCR Engine) for more details. |
Top | Number | Yes, if Specified region is selected | (Empty) | TOP="223" | The topmost pixel coordinate of the specified region in the image or PDF file. This parameter is only available if the Specified region parameter is selected. |
Left | Number | Yes, if Specified region is selected | (Empty) | LEFT="115" | The leftmost pixel coordinate of the specified region in the image or PDF file. This parameter is only available if the Specified region parameter is selected. |
Width | Number | Yes, if Specified region is selected | (Empty) | WIDTH="647" | The total pixel width of the specified region in the image or PDF file. This parameter is only available if the Specified region parameter is selected. |
Height | Number | Yes, if Specified region is selected | (Empty) | HEIGHT="647" | The total pixel height of the specified region in the image or PDF file. This parameter is only available if the Specified region parameter is selected. |
Filter (optional) | Text | No | (Empty) | FILTER="text" | Applies a filter that only populates the dataset with information regarding the text specified in this parameter. If the text appears multiple times in the image, each instance will be recorded. |
Match case | Yes/No | No | No | MATCHCASE="YES" | If selected, the Filter parameter is case sensitive. This parameter is disabled by default. |
Create and populate dataset with word(s) information | Text | Yes | (Empty) | RESULTDATASET="theDataset" | The name of the dataset to create and populate with information about the retrieved lines of text. See Datasets below for more details. |
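The region and filter parameters above can be combined in a single step. The line below is a minimal sketch that uses only attributes from the declaration. The file path, region coordinates, filter text ("Total"), and dataset name are placeholder values, not taken from a real task. The comment line is an annotation for the reader and may need to be removed before pasting.

```xml
<!-- Hypothetical values: adjust IMAGE, the region coordinates, FILTER, and RESULTDATASET for your task. -->
<AMOCR ACTIVITY="get_lines" OCRENGINE="tesseract" IMAGE="C:\temp\Image.jpg" TOP="223" LEFT="115" WIDTH="647" HEIGHT="647" FILTER="Total" MATCHCASE="YES" RESULTDATASET="theDataset" />
```

Because MATCHCASE is set to YES here, the filter is case sensitive. Omit MATCHCASE to keep the default behavior.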
Advanced
Property | Type | Required | Default | Markup | Description |
---|---|---|---|---|---|
Page range: All | Yes/No | No | Yes | ALLPAGE="YES" | If selected (default), lines of text are retrieved from all pages in a range, and the Page(s) parameter is disabled. NOTE: Only GIF images support multiple pages. |
Page range: Page(s) | Number | No | No | | If selected, lines of text are retrieved from specific pages in a range, and the All parameter is disabled. This parameter is disabled by default. Supports specification of a single page, specific pages, or a sequence of pages in a range (see the Markup column for examples). NOTE: Only GIF images support multiple pages. |
Languages | Yes/No | No | English | SPANISH="YES" | Specifies which languages are found in the image file selected for the Image parameter. The languages listed in the declaration above are English, Turkish, Spanish, Portuguese, French, Russian, Dutch, German, and Italian. NOTE: This parameter resets if the OCR Engine parameter is switched to a different engine. |
Use ICR (digits only) | Yes/No | No | No | ICR="YES" | If selected, ICR (Intelligent Character Recognition), a more advanced handwriting recognition system, is used to recognize numbers or digits. This parameter is disabled by default. This parameter is only available if the OCR Engine parameter is set to Legacy. |
Invert image colors | Yes/No | No | No | INVERT="YES" | If selected, image colors are transformed from light to dark and dark to light. If this activity has trouble recognizing lines of text, inverting can add more contrast to the text and assist in more accurate results. This parameter is only available if the OCR Engine parameter is set to Legacy. |
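As a rough illustration of the page range and language parameters, the line below retrieves French and German lines of text from one page of a multi-page file. It is a sketch only: the attribute names come from the declaration, but the file name is hypothetical, and pairing ALLPAGE="NO" with PAGE="2" to select a single page is an assumption, because the markup examples for the Page(s) parameter are not shown in this topic. The comment line is an annotation for the reader and may need to be removed before pasting.

```xml
<!-- Assumption: ALLPAGE="NO" together with PAGE selects a single page. The file name is hypothetical. -->
<AMOCR ACTIVITY="get_lines" OCRENGINE="tesseract" IMAGE="C:\temp\multipage_scan.gif" ALLPAGE="NO" PAGE="2" FRENCH="YES" GERMAN="YES" RESULTDATASET="theDataset" />
```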
Additional notes
Datasets
A dataset is a multiple column, multiple row container object. This activity creates and populates a dataset containing a specific set of fields. The table below describes these fields (assuming the dataset name assigned was theDataset).
Name | Type | Return Value |
---|---|---|
theDataset.PageIndex | Number | Returns the page index. |
theDataset.LineIndex | Number | Returns the line index. |
theDataset.Text | Text | Returns the retrieved text. |
theDataset.Top | Number | Returns the topmost pixel coordinate of the retrieved line of text in the image or PDF file. |
theDataset.Left | Number | Returns the leftmost pixel coordinate of the retrieved line of text in the image or PDF file. |
theDataset.Width | Number | Returns the total pixel width of the retrieved line of text in the image or PDF file. |
theDataset.Height | Number | Returns the total pixel height of the retrieved line of text in the image or PDF file. |
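To show how the fields in the table above might be read, the sketch below loops through the dataset and displays each retrieved line with its position. The AMOCR line uses only attributes from the declaration. The AMLOOP and AMSHOWDIALOG markup is not documented in this topic and is assumed here, so verify it against the corresponding Loop and Dialog topics before use. The comment lines are annotations for the reader and may need to be removed before pasting.

```xml
<!-- Retrieve all lines of text from the image into a dataset named theDataset. -->
<AMOCR ACTIVITY="get_lines" OCRENGINE="tesseract" IMAGE="C:\temp\Image.jpg" RESULTDATASET="theDataset" />
<!-- Assumed markup: loop once per dataset row and show the fields of the current row. -->
<AMLOOP ACTIVITY="dataset" DATASET="theDataset" />
<AMSHOWDIALOG>Line %theDataset.LineIndex% on page %theDataset.PageIndex%: %theDataset.Text% (left %theDataset.Left%, top %theDataset.Top%)</AMSHOWDIALOG>
<AMLOOP ACTIVITY="end" />
```

Inside the loop, each field is referenced by surrounding its name with percent signs, so the Left, Top, Width, and Height values of the current row can also be passed to a later step, such as one that moves the mouse.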
Example
- Copy and paste the sample AML code below directly into the Task Builder Steps Panel.
- To successfully run the sample code, update parameters containing user credentials, files, file paths, or other information specific to the task to match your environment.
Description
This sample task retrieves English and Spanish lines of text from an image and then creates and populates a dataset with the results.
<AMOCR ACTIVITY="get_lines" SPANISH="YES" ALLPAGE="YES" IMAGE="C:\temp\document_scan.png" OCRENGINE="tesseract" RESULTDATASET="Results" />