How to Extract Regex Values From Properties
The Regex Extractor component can be used to perform regex value extraction against any available property.
How to Add the Regex Extractor to AutoClassifier
- Navigate to the AutoClassifier Pipelines component page.
- Click New Component and select Regex Extractor from the component list.
- Name your new Regex Extractor component and click Add.
- Click Apply to save your changes.
- Ensure your new Regex Extractor component is placed in the list of existing pipeline stages.
How to Configure the Regex Extractor Component
To configure your Regex Extractor component, select it from the components list and complete the following fields in the Configuration section to extract regex values from your content source:
- Output Property: Enter the property that receives the extracted values.
- Input Property: Enter a comma-separated list of properties, or "*" for all properties, for the regex to be applied against.
- Pattern: Enter any valid Regex Pattern.
- Return Ordering: Select how the returned values are ordered. The ordering applies to the input properties in the order they were entered.
  - Hit Count: Use Hit Count when values should be returned in order of how often they match. For example, when extracting non-specific matches such as product numbers, this returns the product numbers ordered by how many times each one matches. With Max Values = 1, only the value with the most hits is returned.
  - Position: Use Position when values should be returned in the order in which they are matched. For example, when extracting non-specific matches such as product numbers, this returns the product numbers in the order in which they are found. With Max Values = 1, only the first matched value is returned.
- Max Values: Specify the maximum number of unique values to return. Leave empty for all values.
- Ignore Whitespace: Enable this checkbox to ignore whitespace in the regex extraction.
- Click Add.
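The difference between the two Return Ordering options, and the effect of Max Values, can be illustrated with a small Python sketch. This is not the component's actual implementation: the `extract` function, its parameter names, and the sample text are hypothetical, and tie-breaking by first occurrence under Hit Count is an assumption made for the example.

```python
import re
from collections import Counter


def extract(text, pattern, ordering="position", max_values=None):
    """Hypothetical stand-in for the extraction step (not the product's code)."""
    # Collect every match in the order it appears in the text.
    matches = [m.group(0) for m in re.finditer(pattern, text)]
    if ordering == "hit_count":
        # Unique values sorted by number of hits, most frequent first.
        # Assumption: ties keep first-seen order.
        unique = [value for value, _ in Counter(matches).most_common()]
    else:
        # Unique values in order of first appearance.
        unique = list(dict.fromkeys(matches))
    # An empty Max Values (None here) returns all unique values.
    return unique if max_values is None else unique[:max_values]


text = "Order AB-100, then AB-200, AB-200 again, and AB-200 once more."
pattern = r"[A-Z]{2}-\d{3}"

print(extract(text, pattern, "position", 1))   # ['AB-100']
print(extract(text, pattern, "hit_count", 1))  # ['AB-200']
print(extract(text, pattern, "hit_count"))     # ['AB-200', 'AB-100']
```

In this sample, AB-100 appears first but only once, while AB-200 appears three times, so the two orderings return different values when Max Values is 1.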
Input Properties
Depends on the script.
Output Properties
Depends on the script.
Script
This component can be used to do any kind of processing specified by a script. The script can be written in C# or VB.NET. See How to Use Custom Logic (Script) for more information.