PlanetPress Capture ICR Best Practices
From Workflow 7.5 onwards, PlanetPress Capture supports Intelligent Character Recognition (ICR). However, this technology comes with certain limitations. A successful integration of ICR within a business requires the application of best practices by all parties involved: Form designer, Workflow designer and User.
Here we present a list of recommended best practices. Each of these guidelines aims to maximize the likelihood that characters are recognized and to minimize the risk of errors due to an incorrect analysis.
You will find the following information, when applicable, for each best practice:
- Target: The targeted audience. There are 3 possibilities: Form designer, Workflow designer and User.
- What: A brief description of the best practice. This could include an explanation of the concepts that are addressed.
- Why: A brief explanation of the reasoning behind the relevance of this guideline.
- How: How to apply this best practice.
This section lists the best practices to implement, in no particular order of importance. Check the targeted audience of each one to see whether it applies to you.
Using the Most Restrictive Mask
- Target: Form designer.
- What: In the Capture Options tab of a Capture object, the mask type indicates the type of character to be recognized. There are 3 possible selections: numeric, alphabet and alphanumeric. The alphabetic mask type allows you to select the letter case.
The following guidelines are applicable when configuring a PlanetPress Capture object that utilizes ICR:
- If the collected data is expected to be a number, select the numeric mask type.
- If the collected data is expected to be a letter, select the alphabet mask type, then choose the letter case:
- If upper case letters are expected, select Upper case in the Case option menu. The captured characters are immediately converted to capital letters, i.e. the ICR engine recognizes a lower case a but displays it in upper case.
- If lower case letters are expected, select Lower case in the Case option menu. As with upper case letters, the captured characters are converted to lower case and displayed as such.
- If proper names or nouns are expected (i.e. only the first letter must be a capital letter), select Capitalization in the Case option menu. Only the first letter is converted to a capital letter.
- If no specific format is expected, select None in the Case option menu. The letters are interpreted as written and no conversion is done, i.e. characters in lower case are displayed as such.
- If the collected data is expected to be a combination of numbers and letters, select the alphanumeric mask type.
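The effect of the four Case options described above can be pictured with a small Python sketch. This is only an illustration of the described behavior, not PlanetPress code: the function name apply_case_option is hypothetical, and it assumes the option labels are passed as plain strings.

```python
def apply_case_option(text: str, case_option: str) -> str:
    """Illustrate the Case option on an already-recognized string.

    Hypothetical helper: the option names mirror the menu entries
    described in the text, but this is not a PlanetPress API.
    """
    if case_option == "Upper case":
        return text.upper()
    if case_option == "Lower case":
        return text.lower()
    if case_option == "Capitalization":
        # Only the first letter is converted; the rest stays as written.
        return text[:1].upper() + text[1:]
    if case_option == "None":
        # No conversion: characters are kept exactly as written.
        return text
    raise ValueError(f"unknown case option: {case_option!r}")


print(apply_case_option("smith", "Upper case"))
print(apply_case_option("smith", "Capitalization"))
```

With these assumptions, a lower case a written in an Upper case field comes out as A, which matches the behavior described for the Upper case option.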
Why: Reducing the number of expected characters increases the probability that the correct one is matched. For example, it prevents the letter l (a lowercase L) from being recognized as the numeric value 1 (one) and vice versa. If the mask type is alphanumeric, the letter a could also be recognized as a 2, since Capture also interprets how the movement was traced.
How: In the Capture Options tab, use the Mask Type and Case option settings to filter the expected data.
The following diagram illustrates the available mask types. It is recommended to select the mask type that is the closest to the desired result. An alphanumeric field should be used as a last resort.
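To see why a restrictive mask helps, consider the following Python sketch. It assumes a recognizer that returns a score per candidate character; the scores, the MASK_CHARSETS table and the pick_character function are all hypothetical and only illustrate the principle, not how the ICR engine works internally.

```python
import string
from typing import Dict

# Assumed character sets for each mask type (illustration only).
MASK_CHARSETS = {
    "numeric": set(string.digits),
    "alphabet": set(string.ascii_letters),
    "alphanumeric": set(string.digits + string.ascii_letters),
}


def pick_character(scores: Dict[str, float], mask_type: str) -> str:
    """Return the best-scoring candidate that the mask type allows."""
    allowed = MASK_CHARSETS[mask_type]
    candidates = {ch: s for ch, s in scores.items() if ch in allowed}
    if not candidates:
        raise ValueError(f"no candidate matches mask type {mask_type!r}")
    return max(candidates, key=candidates.get)


# Made-up scores for a single vertical stroke: 1 and l are nearly tied.
stroke_scores = {"1": 0.48, "l": 0.47, "I": 0.05}

print(pick_character(stroke_scores, "alphanumeric"))  # digit narrowly wins
print(pick_character(stroke_scores, "alphabet"))      # digits are excluded
```

In this sketch the alphanumeric mask leaves the near-tie between 1 and l in play, while the alphabet mask removes the digit from consideration altogether.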
Guidelines for Capture-Ready Fields
- Target: Form designer
- What: Only one character per Capture field can be recognized. When expecting multiple characters making up a word or phrase, you must make sure that the user only writes one character per field. In order to do so, you must make sure that the fields are big enough and have enough space between each one. The best practice is to make sure that there is a boundary surrounding the field where ink marks are to be written.
Why: To prevent ink marks from spilling over from one field to another. If fields A and B are too close together and the ink marks from field A spill over into field B, the marks captured in field B are treated as part of a character written in field B. For example, if a number such as 9, 1 or 7 is written across two fields, its bottom tip could be interpreted as the number 1 in the second field. (Refer to the example below.)
How: Make sure there’s enough space between fields, and redesign the document if there isn’t. No minimum distance between 2 fields is required, except for the 7mm border that the Anoto digital pen needs in order to recognize the pattern being used.
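The spill-over problem can be illustrated with a short Python sketch that checks which fields a pen stroke touches. The coordinates, the field layout and the fields_touched function are hypothetical; they assume fields are axis-aligned rectangles measured in millimetres and are not taken from PlanetPress.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in mm
Point = Tuple[float, float]              # (x, y) in mm


def fields_touched(stroke: List[Point], fields: Dict[str, Box]) -> Set[str]:
    """Return the names of all fields containing at least one stroke point."""
    touched = set()
    for x, y in stroke:
        for name, (left, top, right, bottom) in fields.items():
            if left <= x <= right and top <= y <= bottom:
                touched.add(name)
    return touched


# Two assumed fields separated by a 1 mm gap.
fields = {"A": (0.0, 0.0, 8.0, 10.0), "B": (9.0, 0.0, 17.0, 10.0)}

contained_stroke = [(2.0, 2.0), (6.0, 2.0), (4.0, 9.0)]
overshooting_stroke = [(6.0, 2.0), (8.5, 6.0), (9.5, 9.0)]

print(sorted(fields_touched(contained_stroke, fields)))
print(sorted(fields_touched(overshooting_stroke, fields)))
```

A stroke that touches more than one field is the situation to design against: here the tip of the second stroke lands in field B, where it could be read as a separate character.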
Writing in a Legible Way
- Target: User.
- What: It is important to write legibly, i.e. to take care to write well-defined numbers and letters that are easily interpreted.
Caution: You must write on a flat and smooth surface, e.g. a delivery person should use a clipboard.
Why: Some characters are easily confused. For example, a 7 can be interpreted as a 1 and vice versa. A letter i whose dot is drawn as a circle can also cause a conflict, because the dot can be read as an o.
How:
- Write an additional line under the number 1.
- Write an additional line across the number 7.
- The ICR functionality of PlanetPress Capture cannot recognize dotted letters (like i, j) whose dots are drawn as circles. Such a letter would be analyzed as an i AND an o. Therefore, draw dots as dots and not as circles.
- In French, the ç is somewhat sensitive. Take care to draw the letter carefully. It is recognized in most cases, but attention must be paid.
- The number 8 is also sensitive. It is recommended to trace it in one movement instead of drawing 2 circles on top of each other.
Selecting the Correct Language When Using the Capture Fields Processor Task
- Target: Workflow designer.
- What: It is crucial that the correct language is selected when using the ICR recognition option. This will affect how the captured data is interpreted.
Why: The filters available to interpret the ink marks made with the Anoto digital pen let you select the engine language to be used. Doing so gives you the results that most closely match the captured data. Once the correct language is selected, ICR can interpret language-specific characters such as û, à, é, etc.
How: This option is available from the Capture Fields Processor task.
Possibility of Interpretation Error in an Automated Process
- Target: Workflow designer
- What: We cannot be 100% sure that PlanetPress Capture recognizes a character as it should. Therefore, a value interpreted with ICR should only be processed automatically if its confidence level is above a set threshold.
Why: An automated process can handle data incorrectly when a value has been misinterpreted. Such occurrences should be minimized as much as possible.
How: Provide a special process (possibly manual handling) for cases where the automated process did not reach a high confidence level in its analysis of the ink marks. Use the Capture condition plugin, which includes the ICRContent option. It can be configured to be true if the confidence level is greater than a certain value.
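The routing logic described above can be sketched in Python as follows. This is not the Capture condition plugin itself: the IcrResult type, the route function, the 0 to 100 confidence scale and the threshold of 80 are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class IcrResult(NamedTuple):
    """Hypothetical container for one recognized field."""
    field: str
    value: str
    confidence: float  # assumed scale: 0 (no confidence) to 100


def route(result: IcrResult, threshold: float = 80.0) -> str:
    """Send a result to the automated branch only above the threshold.

    Mirrors the idea of a condition that is true when the confidence
    level is greater than a certain value; everything else falls back
    to manual handling.
    """
    if result.confidence > threshold:
        return "automated"
    return "manual_review"


results = [
    IcrResult("quantity", "7", 95.0),
    IcrResult("quantity", "1", 62.5),
]

for r in results:
    print(r.field, r.value, route(r))
```

Where to set the threshold is a trade-off: a higher value sends more documents to manual handling but lowers the chance that a misread character is processed automatically.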