PDF - Search
Declaration
<AMPDF ACTIVITY="search" SESSION="text" SOURCE="text" PASSWORD="text (encrypted)" FIND="text" RESULTDATASET="text" REGEX="YES/NO" PAGE="number" />
Description
Searches for the occurrence of one or more text strings (that is, particular characters, words, or patterns of characters) in a PDF file and then populates a dataset with the results. Regular expressions can be used to provide a more concise and flexible method of finding text.
Practical usage
Commonly used to locate a word or phrase in a PDF file in order to retrieve related data (for example, the pages where the text was found, font information, or text position) and/or to perform other operations that reference the found text (for example, Text - Insert, Clipboard - Copy, Clipboard - Paste).
Parameters
Resource
Property | Type | Required | Default | Markup | Description |
---|---|---|---|---|---|
Resource | --- | --- | --- | --- | Specifies the source of the PDF file. The available options are Session and File. NOTE: This parameter does not contain markup and is only displayed in visual mode for task construction and configuration purposes. |
Session | Text | Yes, if the Resource parameter is set to Session | PDFSession1 | SESSION="mySession" | The existing session to associate with this activity. This parameter becomes active and is required if the Resource parameter is set to Session. |
Source PDF | Text | Yes, if the Resource parameter is set to File | (Empty) | SOURCE="C:\temp\source.pdf" | The path and file name of the PDF file to search for text strings. This parameter becomes active and is required if the Resource parameter is set to File. |
Password (optional) | Text | No | (Empty) | PASSWORD="encrypted" | The password needed to open the existing PDF file, if the file is password-protected. |
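As a hedged illustration of the Session option, the step below searches a PDF through an existing session instead of a file path. It assumes a session named PDFSession1 (the default listed in the table) was already created by an earlier step in the task; the search text and dataset name are placeholders.

```xml
<AMPDF ACTIVITY="search" SESSION="PDFSession1" FIND="PDF Security" RESULTDATASET="theDataset" />
```

SOURCE and PASSWORD are omitted in this sketch, since the table lists them as applying only when the Resource parameter is set to File.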
Criteria
Property | Type | Required | Default | Markup | Description |
---|---|---|---|---|---|
Find | Text | Yes | (Empty) | FIND="PDF Security" | The text string to search for in the PDF file. |
Use regular expression | Yes/No | No | No | REGEX="YES" | If selected, indicates that the value entered in the Find parameter is a regular expression. If disabled (default), the value is literal text. |
Create and populate dataset | Text | Yes | (Empty) | RESULTDATASET="theDataset" | The name of the dataset to create and populate with search results. See Datasets for more information on the fields this dataset creates. |
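A minimal sketch of a regular-expression search, using only attributes listed in the declaration. The pattern INV-[0-9]+ (the literal text INV- followed by one or more digits) and the file path are hypothetical. This page does not state which regular-expression dialect the activity uses, so the pattern sticks to syntax that most dialects share.

```xml
<AMPDF ACTIVITY="search" SOURCE="C:\temp\myDocument.pdf" FIND="INV-[0-9]+" REGEX="YES" RESULTDATASET="theDataset" />
```

With REGEX omitted or set to NO, the same FIND value would be treated as literal text, brackets included.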
Page
Property | Type | Required | Default | Markup | Description |
---|---|---|---|---|---|
Page range | --- | Yes | All | --- | Specifies the pages to search for the text string in the PDF file. The available options are All and Page(s). NOTE: This parameter does not contain markup and is only displayed in visual mode for task construction and configuration purposes. |
Page(s) | Text | Yes, if Page range is set to Page(s) | (Empty) | PAGE="1,3,5" | If enabled, specifies the pages to search for the text string in the PDF file. For a single page, enter the page number. Use a comma (,) to specify more than one page (for example, 1,3,5). Use a dash (-) to specify a range of pages (for example, 5-10). |
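The two steps below sketch the separator forms described for the Page(s) parameter: the first limits the search to pages 1, 3, and 5 with a comma-separated list, and the second searches pages 5 through 10 with a dash-separated range. The file path, search text, and dataset names (listResults, rangeResults) are placeholders. Whether a list and a range can be combined in a single PAGE value is not stated here, so each step uses only one form.

```xml
<AMPDF ACTIVITY="search" SOURCE="C:\temp\myDocument.pdf" FIND="PDF Security" RESULTDATASET="listResults" PAGE="1,3,5" />
<AMPDF ACTIVITY="search" SOURCE="C:\temp\myDocument.pdf" FIND="PDF Security" RESULTDATASET="rangeResults" PAGE="5-10" />
```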
Additional notes
Datasets
A dataset is a multiple-column, multiple-row container object. This activity creates and populates a dataset containing a specific set of fields in addition to the standard dataset fields. The table below describes these fields (assuming the dataset was named theDataset).
Name | Type | Return Value |
---|---|---|
theDataset.FontIsAccessible | True/False | Indicates whether the font representing the found text is present (installed) in the system. |
theDataset.FontIsEmbedded | True/False | Indicates whether the font representing the found text is embedded. |
theDataset.FontIsSubset | True/False | Indicates whether the font representing the found text is a subset. |
theDataset.FontName | Text | The name of the font matching the found text. |
theDataset.FontSize | Number | The size of the font matching the found text. |
theDataset.ForegroundColor | Text (Color) | The HTML color value of the font matching the found text represented in |
theDataset.Page | Number | The page number where the matching text was found. |
theDataset.Position | Number | The position of the matching text represented in hash code. |
theDataset.Text | Text | The text found. |
theDataset.XIndent | Number | The X coordinate of the found text. |
theDataset.YIndent | Number | The Y coordinate of the found text. |
Example
- Copy and paste the sample AML code below directly into the Task Builder Steps Panel.
- To successfully run the sample code, update parameters containing user credentials, files, file paths, or other information specific to the task to match your environment.
Description
This sample task searches a PDF file for a text string on multiple pages and then creates and populates a dataset with the results.
<AMPDF ACTIVITY="search" SOURCE="C:\temp\myDocument.pdf" FIND="PDF Security" RESULTDATASET="theDataset" PAGE="1-5" />