Extraction Pro stencil
The Extraction Pro stencil automatically extracts information from documents using a machine learning AI model. Its advanced functionality allows you to extract line items from multiple tables on a single document. When this stencil is used, the document does not have to go through OCR or training first.
After going through the Extraction Pro stencil, each data field will be highlighted on the document as follows:
- Green = 80% or higher confidence
- Yellow = 50% - 80% confidence
- Red = under 50% confidence
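The color bands above can be read as a simple threshold check. The following is a minimal sketch, not the product's implementation: the function name `highlight_color` and the 0.0 to 1.0 confidence scale are assumptions made for illustration. Exactly 80% is treated as green and exactly 50% as yellow, following the wording "80% or higher" and "under 50%".

```python
def highlight_color(confidence: float) -> str:
    """Map a confidence score (assumed scale: 0.0 to 1.0) to a highlight color."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= 0.80:   # "80% or higher"
        return "green"
    if confidence >= 0.50:   # 50% up to, but not including, 80%
        return "yellow"
    return "red"             # "under 50%"


if __name__ == "__main__":
    for score in (0.95, 0.80, 0.65, 0.50, 0.20):
        print(f"{score:.2f} -> {highlight_color(score)}")
```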
Note: Each document that is processed through this workflow stencil will decrement your page count license by the number of pages displayed.
Available connections
There are two outgoing connections from an Extraction Pro stencil: green if the extraction was successful and red if the extraction failed.
Configuration
- To access stencil properties, double-click an Extraction Pro stencil, or right-click it and select Properties.
- In the Display Name box, enter a name for the stencil. The display name will be the name of the folder representing the workflow step in the Document Panel.
- Configure the stencil using the following tabs:
Data Fields
Tip: To help properly configure data field mappings, you can view names of all extracted values on the Data Fields tab in Document Properties.
Data Field: The Intelligent Capture data field. The information in the Extraction Field will be mapped to this data field.
Extraction Field: The field that is returned by the machine learning AI model. The field name and spelling must match the document's field label. This field supports data field substitution.
Duplicate Selection: Determines which value to use for the data field if there are multiple matched values on a document. The following options are available:
- First: Use the first matched value.
- Last: Use the last matched value.
- None: The field is ignored.
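The three Duplicate Selection options can be illustrated with a short sketch. This is a hypothetical helper, not part of the product: the name `select_value` and the list-of-strings input are assumptions. It also assumes the option only matters when more than one value matched, since the setting is described as applying "if there are multiple matched values".

```python
from typing import Optional, Sequence


def select_value(matches: Sequence[str], duplicate_selection: str) -> Optional[str]:
    """Pick the value for a data field from the matched values.

    Returns None when the field is left unset (ignored).
    """
    if not matches:
        return None
    if len(matches) == 1:
        # Assumption: with a single match there is nothing to choose between.
        return matches[0]
    if duplicate_selection == "First":
        return matches[0]
    if duplicate_selection == "Last":
        return matches[-1]
    if duplicate_selection == "None":
        return None  # multiple matches: the field is ignored
    raise ValueError(f"unknown Duplicate Selection option: {duplicate_selection!r}")


if __name__ == "__main__":
    values = ["INV-001", "INV-002", "INV-003"]
    for option in ("First", "Last", "None"):
        print(option, "->", select_value(values, option))
```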
Line Items
Line Items Field (Optional): Select the line item data field that contains the line items you want to map.
Table Expression: An expression that allows you to specify which tables to extract data from. When this field is left empty, data from all tables will be extracted.
Example: (Table.Columns[1] = "description" or Table.Title = "Products") and Table.Title <> "Additional Taxes"
The above expression returns any table whose second column is titled "description" or whose title is "Products", but ignores any table with the title "Additional Taxes".
The following table expression variables are available:
- Table.Title. Example: Table.Title = "Billing"
- Table.Index. Example: Table.Index = 1 (1 based)
- Table.Location.Page. Example: Table.Location.Page = 1 (1 based)
- Table.Location.Top. Example: Table.Location.Top > 350
- Table.Location.Bottom. Example: Table.Location.Bottom < 600
- Table.Location.Left. Example: Table.Location.Left < 20
- Table.Location.Right. Example: Table.Location.Right = 500
- Table.Location.Width. Example: Table.Location.Width > 500
- Table.Location.Height. Example: Table.Location.Height < 200
- Table.Columns.Count. Example: Table.Columns.Count = 2
- Table.Columns[index number] as a string. Example: Table.Columns[0] = "quantity". Index is zero based. Use lowercase.
Table Row Expression: An expression that allows you to specify which table rows to extract data from. When this field is left empty, data from all table rows will be extracted.
Example: TableRow.Count = 3 and TableRow[0].Name = "quantity" and TableRow[1].Value.Length > 0
The above expression returns rows from tables that have three columns, where the first column's name is "quantity" and the second column's value has a length greater than zero.
The following table row expression variables are available:
- TableRow.Count. Example: TableRow.Count = 2, specifies there are two columns in the table row.
- TableRow[index].Value as string. Example: TableRow[0].Value = "item5", specifies the content of the cell. Index is zero based. Use lowercase.
- TableRow[index].Value.Length. Example: TableRow[1].Value.Length < 5, specifies a cell with a value that is shorter than 5 characters. Index is zero based.
- TableRow[index].Value.Contains(). Example: TableRow[0].Value.Contains("important"), specifies a cell with a value that contains the string "important". Use lowercase.
- TableRow[index].Name as string. Example: TableRow[1].Name <> "#summary", specifies any row's column name that is not "#summary". Index is zero based. Use lowercase.
  Note: The #summary column is a name for automatically detected summary rows, such as 'Total', 'Total Due', and 'Total Taxes'. This column can be used in the table row expression to either include or exclude it from the results.
- TableRow[index].Name.Length. Example: TableRow[0].Name.Length > 20, specifies a row's column name that is longer than 20 characters. Index is zero based.
- TableRow[index].Name.Contains(). Example: TableRow[1].Name.Contains("taxes"), specifies a row's column name that contains the string "taxes". Index is zero based. Use lowercase.
Data Field: The Intelligent Capture data field. The information in the Extraction Field will be mapped to this data field.
Extraction Field: The field that is returned by the machine learning AI model. The field name and spelling must match the document's field label. This field supports data field substitution.
- Click Save.
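To make the Table Expression and Table Row Expression examples from the Line Items tab more concrete, the sketch below re-expresses both as plain Python predicates and runs them on made-up data. It does not use or reproduce the product's expression engine: the `Table` class, the sample tables, and the function names are hypothetical, and rows are modeled as lists of (column name, cell value) pairs purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Cell = Tuple[str, str]  # (column name, cell value), both lowercase


@dataclass
class Table:
    title: str
    columns: List[str]                      # zero-based column names
    rows: List[List[Cell]] = field(default_factory=list)


def table_matches(table: Table) -> bool:
    # (Table.Columns[1] = "description" or Table.Title = "Products")
    #   and Table.Title <> "Additional Taxes"
    second_is_description = len(table.columns) > 1 and table.columns[1] == "description"
    return (second_is_description or table.title == "Products") and table.title != "Additional Taxes"


def row_matches(row: List[Cell]) -> bool:
    # TableRow.Count = 3 and TableRow[0].Name = "quantity" and TableRow[1].Value.Length > 0
    return len(row) == 3 and row[0][0] == "quantity" and len(row[1][1]) > 0


def make_rows(columns: List[str], values: List[List[str]]) -> List[List[Cell]]:
    """Pair each cell value with its column name."""
    return [list(zip(columns, row)) for row in values]


product_cols = ["quantity", "description", "amount"]
tax_cols = ["tax", "description"]
tables = [
    Table("Products", product_cols,
          make_rows(product_cols, [["2", "widget", "10.00"], ["1", "", "5.00"]])),
    Table("Additional Taxes", tax_cols, make_rows(tax_cols, [["vat", "20%"]])),
    Table("Notes", ["note"], make_rows(["note"], [["call back"]])),
]

# Tables that pass the table expression, then rows that pass the row expression.
selected = [t for t in tables if table_matches(t)]
kept = [[value for _, value in row] for t in selected for row in t.rows if row_matches(row)]

if __name__ == "__main__":
    print([t.title for t in selected])
    print(kept)
```

With this sample data, only the "Products" table passes the table filter, and its second row is dropped because the description cell is empty.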