OCR (Optical Character Recognition) Actions

Description: OCR (Optical Character Recognition) is the technology of converting handwritten, typewritten, or printed text in scanned documents, faxes, or images into a computer-readable text format that is editable and searchable. It reads the text in an image and translates it into a form that can be manipulated with a word processor or similar tool. The OCR action provides these capabilities. It converts the text found in an image to a variable or dataset, making it possible to search for a word or phrase, store the text more compactly, and apply techniques such as text mining or text-to-speech.

IMPORTANT: The GPL Ghostscript (32-bit) library is required to support PDF files. You can download the installer for the 32-bit Windows version at https://www.ghostscript.com/download/gsdnld.html.

Available Activities

The following table briefly describes the available activities for this action. Select the corresponding link for more details regarding each activity.

Activity             Description

OCR - Get text       Retrieves text from an image and populates a variable with results.

OCR - Get word(s)    Retrieves one or more words in an image and populates a dataset with results.

OCR - Get line(s)    Retrieves one or more lines in an image and populates a dataset with results.
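
The three activities differ mainly in the shape of their results: individual words, whole lines, or one block of text. The sketch below illustrates that relationship in Python. It is not the product's implementation or API. The Word structure, the group_into_lines helper, the pixel tolerance, and the sample values are all hypothetical, chosen only to show how word-level results with positions could be grouped into lines and then joined into a single text value.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with its bounding box (hypothetical structure)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def group_into_lines(words, tolerance=5):
    """Group words whose top edge is within `tolerance` pixels of the line's first word."""
    lines = []
    for word in sorted(words, key=lambda w: (w.top, w.left)):
        if lines and abs(lines[-1][0].top - word.top) <= tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Order the words in each line from left to right.
    return [sorted(line, key=lambda w: w.left) for line in lines]


# Hypothetical word-level results, similar in spirit to a "get words" dataset.
words = [
    Word("Invoice", 10, 10, 60, 12),
    Word("Number", 75, 12, 55, 12),
    Word("12345", 135, 11, 40, 12),
    Word("Total", 10, 40, 40, 12),
    Word("due", 55, 41, 25, 12),
]

# Line-level results, similar in spirit to a "get lines" dataset.
line_texts = [" ".join(w.text for w in line) for line in group_into_lines(words)]

# A single text value, similar in spirit to a "get text" variable.
full_text = "\n".join(line_texts)
```

In this sketch the word list plays the role of a dataset with one row per word, line_texts plays the role of a dataset with one row per line, and full_text plays the role of a single variable.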