Advanced Search and Replace
Advanced Search and Replace action tasks are used to locate strings of data within the job file and to replace them with other strings of data. Unlike Search and Replace action tasks, they allow the use of regular expressions.
Using regular expressions, it is possible to search for patterns rather than specific strings. For instance, a pattern can be specified to find all valid email addresses or phone numbers within the data stream.
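As a rough illustration of pattern-based searching, the sketch below uses Python's re module to find phone numbers in a string. The pattern is a deliberately simplified assumption (three digits, three digits, four digits, separated by hyphens); real phone number formats vary widely, and the regular expression engine used by the task may differ from Python's in some details.

```python
import re

# Simplified, illustrative pattern: NNN-NNN-NNNN.
# This is an assumption for the example, not a complete phone number validator.
PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

sample = "Call 555-123-4567 or 555-987-6543. Order 12345 shipped."

# findall returns every non-overlapping match of the pattern.
matches = PHONE_PATTERN.findall(sample)
print(matches)
```

A fixed search string could only find one specific number; the pattern finds every substring with the same shape.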
In a regular expression, substrings can be captured as groups using parentheses. Values of capturing groups - the matched substrings - can then be included in the replacement string using the dollar sign syntax: $1 ... $9. The numbering follows the order in which groups appear in the search string.
For example, to replace all instances of "Page x/y" (Page 1/3, Page 2/3, etc.) in a document with "Page x of y total pages", the regular expression must contain parentheses to capture the values of x and y: Page\s(\d*)\/(\d*). The first capturing group, (\d*), contains the value of x, the second the value of y.
The replacement string would then be: Page $1 of $2 total pages (where $1 contains x, and $2 contains y).
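The page-number example above can be tried outside the task with a short Python sketch. Python's re module writes group references as \g<1> rather than $1, so the hypothetical helper dollar_to_python (not part of any product) translates the dollar sign syntax first. This only approximates the behavior described here; the task's own engine may treat edge cases differently.

```python
import re

def dollar_to_python(replacement):
    # Hypothetical helper: convert $1..$9 references to Python's \g<n> form.
    # Literal backslashes are escaped first so re.sub does not interpret them.
    escaped = replacement.replace("\\", "\\\\")
    return re.sub(r"\$([1-9])", r"\\g<\1>", escaped)

SEARCH = r"Page\s(\d*)\/(\d*)"           # search expression from the example
REPLACE = "Page $1 of $2 total pages"    # replacement string from the example

text = "Page 1/3\nPage 2/3\nPage 3/3"
result = re.sub(SEARCH, dollar_to_python(REPLACE), text)
print(result)
```

Each line of the sample text becomes "Page x of y total pages", with x and y taken from the first and second capturing groups.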
For more information about regular expressions, visit a website like https://www.regular-expressions.info/.
To test your regular expressions, go to: https://regex101.com/.
Input
Any text-based file can be used in this task, even formats that are not directly compatible with Connect Workflow. As long as the text is visible in a text-based editor (such as Notepad), it is readable and supported by this task.
Processing
The appropriate changes are made to the data file (replacing text).
Output
The modified data file is output from this task. Metadata is not modified in any way if it is present.
Task properties
General tab
- Search mode group: Select your chosen search mode within this group.
- Search line by line: Select if you want each line in the data stream to be searched separately. When this option is selected, Connect Workflow considers each line as an individual data stream (lines are separated by Line Feed characters). It minimizes memory requirements but may also limit hits, since lines are considered separately. Note that it is not possible to use search expressions that specify multiple data lines when this option is selected.
- Search whole file: Select if you want the entire data stream to be searched as if it were a single string of text. When this option is selected, Connect Workflow loads the entire file in memory. It offers more flexibility, since search expressions may span across multiple lines and may result in more successful hits. Note that since this option uses more memory, it may affect performance.
- String to search: Enter your search string or regular expression in this variable property box. To enter multiple strings or expressions, press Enter after each one. (Note that only one string can be entered in the Replace with box.)
- Treat as regular expression: Select to specify that the string or strings entered above are to be interpreted as regular expressions rather than ordinary text strings. This option disables all position options as well as the Whole words only option.
- Search options group
- Case sensitive: Select to force the plugin to match the character casing of the search string above with the characters found in the file. If this option is selected, “DAY” and “Day” will not be considered as matching the search string “day”.
- Whole word only: Select to force the plugin to search only for strings that match the search string from beginning to end (cannot be used with regular expressions). If this option is selected, “DAY” and “DAYS” will not be considered as matching strings.
- Position options group: Specify the location where the string must be found using this group. Note that this whole group is disabled when the Treat as regular expression option is selected.
- Anywhere on the line: Select to indicate that the search string can be anywhere on the line.
- At the beginning of a line: Select to indicate that the search string must be the first string on the line.
- At the end of a line: Select to indicate that the search string must be the last string on the line.
- At column: Select to indicate that the search string must be in a specific column. Specify the column number (the value must be greater than 0) in the Column value box below.
- Between specific words: Select to indicate that the search string must be between specific words. Specify these words in the Word before and Word after boxes below.
- Occurrence related: Select to indicate that the search string must be found a specific number of times before a string replacement is performed. If the Search line by line option is selected in the Search mode group, the search counter is reset for every line. If the Search whole file option is selected in the Search mode group, the search counter is not reset before the end of the file. Select one of the occurrence options (described below) in the list box below and enter a value in the Occurrence value box beside it.
- At occurrence: The replacement will take place only when the specified number of occurrences has been reached. Specifying 2 occurrences, for instance, means that only the second occurrence will be replaced.
- At every specified occurrence: The replacement will take place every time the specified number of occurrences is reached. Specifying 2 occurrences, for instance, means that the second, the fourth and the sixth (and so on) occurrence will be replaced.
- All after occurrence: All occurrences of the search string will be replaced once the specified number of occurrences has been reached. Specifying 2 occurrences, for instance, means that all occurrences after the second one will be replaced.
- All before occurrence: All occurrences of the search string will be replaced until the specified number of occurrences has been reached. Specifying 5 occurrences, for instance, means that the first four occurrences will be replaced.
- Replace with: Enter the string that must be used as the replacement string when a match is found. If the search string is a regular expression that captures groups, the values of groups - the matched substrings - can be accessed using the dollar sign syntax: $1 ... $9. The numbering follows the order in which groups appear in the search string. For example: "Page $1 of $2 total pages".
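The four occurrence options, and the way the search counter is reset in each search mode, can be modeled with a short Python sketch. The function name, the mode names ("at", "every", "all_after", "all_before") and the line_by_line flag are all hypothetical, chosen for this illustration. The sketch also assumes that "All after occurrence" leaves the specified occurrence itself untouched. It uses plain-text matching, since the position options are disabled for regular expressions.

```python
def replace_by_occurrence(text, search, replacement, mode, n, line_by_line=False):
    """Illustrative model of the occurrence options (not the plugin's code)."""
    if not search or n < 1:
        raise ValueError("search must be non-empty and n must be at least 1")

    if line_by_line:
        # Search line by line: the occurrence counter restarts on every line.
        return "\n".join(
            replace_by_occurrence(line, search, replacement, mode, n)
            for line in text.split("\n")
        )

    def should_replace(count):
        if mode == "at":          # only the n-th occurrence
            return count == n
        if mode == "every":       # the n-th, 2n-th, 3n-th, ...
            return count % n == 0
        if mode == "all_after":   # assumption: occurrences after the n-th
            return count > n
        if mode == "all_before":  # occurrences before the n-th
            return count < n
        raise ValueError("unknown mode: " + mode)

    parts, pos, count = [], 0, 0
    while True:
        idx = text.find(search, pos)
        if idx == -1:
            break
        count += 1
        parts.append(text[pos:idx])
        parts.append(replacement if should_replace(count) else search)
        pos = idx + len(search)
    parts.append(text[pos:])
    return "".join(parts)


data = "a a a a a a"
print(replace_by_occurrence(data, "a", "X", "at", 2))
print(replace_by_occurrence(data, "a", "X", "every", 2))
print(replace_by_occurrence(data, "a", "X", "all_after", 2))
print(replace_by_occurrence(data, "a", "X", "all_before", 5))
```

Passing line_by_line=True to the same call shows the effect of the search mode: with the text "a a" on two separate lines and the "at" mode set to 2, the second "a" on each line is replaced, whereas in whole-file mode only the second "a" in the entire text is.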
On Error tab
For a description of the options on the On Error tab, see Using the On Error tab.
Miscellaneous tab
The Miscellaneous tab is common to all tasks.
It contains a text area (Task comments) that lets you write comments about the task. These comments are saved when the dialog is closed with the OK button and are displayed in The Task Comments Pane.
Check the option Use as step description to display the text next to the icon of the plugin in the Process area.
The tab also provides an option to highlight the task in The Process area with the default color, set in the Preferences (see Colors), or the color selected or defined under Highlight color on this tab.
To revert the selected highlight color to the default color, open this tab, turn the Highlight option off and close the dialog with the OK button; then turn highlighting back on.
Highlighting can also be turned on and off via the task's contextual menu and with the Highlight button on the View ribbon.