How to Classify Images
If you want to classify images within documents, use the Image Extractor component in combination with the components below.
Set the Image Extractor output (ExtractedImagesBinaryData) as the "Additional Input Property" field of the image processing components.
The accepted extensions field of the image processing components applies only to standalone image files. It does not apply to images extracted from documents, or to the documents those images were extracted from.
Image Processor: Custom Vision
- This component analyzes images and extracts any detected entities.
- This component uses Microsoft Custom Vision AI.
Prerequisites
- Install Visual C++ Redistributable for Visual Studio 2012
- Windows Server Feature: .NET Framework 3.5 Features
- Navigate to Microsoft Custom Vision AI
- Create a new project
- Click the Training Images tab, upload your training images, and train your model
How to Configure the Custom Vision Image Processor
To configure, use the following image and steps:
- Open your pipeline.
- Expand the New Component section.
- Select Custom Vision AI.
- Enter a component name.
- Click the + Add link.
- Click Apply.
- Click the name of the component in the ordered list to open it for configuration.
- Api Key: Enter the API key for your trained Custom Vision project. To find your API key, do the following:
  - From the Custom Vision web portal, navigate to the newly trained project.
  - In the top navigation menu, click the sprocket icon and go to the project Settings.
  - Click Resources > Key.
  - Copy the key and paste it in the Api Key field.
- Send raw response as metadata: Enable this field to attach the JSON response from Microsoft Custom Vision to the list of output properties.
- Prediction API Url: Enter the Prediction API URL for your trained project. To find your Prediction API URL, do the following:
  - From the Custom Vision web portal, navigate to your trained project and click the Performance tab.
  - Click Prediction URL.
  - Select the URL under "If you have an image file:".
  - Paste the URL in the Prediction API Url field.
- Label score threshold: Specify a value between 0 and 1 to represent the minimum confidence score accepted for an entity label.
- Accepted extension: Specify the image file extensions that you want to process. Separate each extension with a semicolon (;).
- Additional Input Property: Specify an input property of type List<byte []> that represents additional images that are processed by the pipeline. For example, a list of all images extracted previously from a document.
- Max degree of parallelism: Enter a number to specify the maximum degree of parallelism.
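Before saving the configuration, it can help to check the Api Key and Prediction API Url outside the pipeline. The following is a minimal Python sketch, not the component's implementation. It builds the kind of request the Custom Vision prediction endpoint expects for an image file (a `Prediction-Key` header and an `application/octet-stream` body) and filters a response by a score threshold. The function names, the placeholder URL and key, the sample tags, and the assumption that the threshold is compared against each prediction's `probability` value are illustrative only.

```python
import json
import urllib.request


def build_prediction_request(prediction_url, api_key, image_bytes):
    """Build (but do not send) a POST request for an image file."""
    return urllib.request.Request(
        prediction_url,
        data=image_bytes,
        method="POST",
        headers={
            "Prediction-Key": api_key,
            "Content-Type": "application/octet-stream",
        },
    )


def labels_above_threshold(response, threshold):
    """Return (tagName, probability) pairs scoring at least `threshold` (0 to 1)."""
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    return [
        (p["tagName"], p["probability"])
        for p in response.get("predictions", [])
        if p["probability"] >= threshold
    ]


# Hypothetical values; replace them with the ones copied from the portal.
request = build_prediction_request(
    "https://example.invalid/prediction-url", "YOUR_API_KEY", b"image bytes"
)

# To send the request for real (requires network access and a valid key):
# with urllib.request.urlopen(request) as reply:
#     response = json.load(reply)

# Made-up sample response, used here instead of a live call.
response = json.loads(
    '{"predictions": [{"tagName": "invoice", "probability": 0.92},'
    ' {"tagName": "receipt", "probability": 0.31}]}'
)
print(labels_above_threshold(response, 0.5))
```

Running the filter against a few real responses is a quick way to choose a Label score threshold before entering it in the component.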
Input Properties
- File RawData
- (Optional) The property specified in the Additional input property configuration option.
Output Properties
Property | Type |
---|---|
 | Text – Multi |
 | Text – Multi |
RawResponse | Text – Multi |
High Data Usage: Save Money by Using a Trigger
To help you avoid accidentally high cloud data costs, BA Insight designed a predefined "trigger" for the NLP component:
- This is applicable only for AutoClassifier version 5.0.
- The trigger is enabled by default, in sample form.
- The trigger provides sample code that lists the sources to process (with sample values).
- You must modify the trigger code according to your needs.
Caution: Not implementing this trigger can result in high data costs from your cloud data provider.
For example, when crawling 5 different content sources with a total of 1M items, a high number of documents processed for Natural Language extraction can generate high cloud data costs.
Use the following steps:
- Add the desired component as usual.
- When configuring the component, expand the Trigger section and adapt the predefined trigger to your needs.
- Continue with component configuration.
The predefined (modify before using) script value:
//change the trigger script accordingly
//to the sources that you want this stage to process
//for the example below to work, ContentSource metadata must be
//available on the item to be processed; otherwise adapt your
//script to match your metadata and sources you want to process
var allowedContentSources = new List<string>() { "Source1", "Source2" };
string contentSource = item.Get<string>("ContentSource");
if (string.IsNullOrEmpty(contentSource) ||
    !allowedContentSources.Contains(contentSource))
    return false;
return true;
The components affected by this change are all the BA Insight components that make API requests to Microsoft:
- Microsoft Text Analytics
- Custom Vision AI
- Image Processor MS Computer Vision
- Video Processor Microsoft Video Indexer
How to Trigger Your Pipelines Only for Video Files
Enter the script below in the code window of the Trigger section described above.
With this script, the component runs only if the detected file extension is one of the video formats listed in the script.
- Add to the script only the formats you truly wish to process.
- Adding all the supported video file extensions is unnecessary and inefficient.
This aids you in reducing data usage, thereby saving money.
For more on using Pipeline Triggers, see Add Triggers to Determine When Your Pipelines Run.
// add here all allowed extensions, but always in lowercase
var allowedExtensions = new List<string>() { "mpeg4", "mp4", "avi" };
string fileext = item.Get<string>("escbase_fileextension");
if (fileext != null)
{
    if (allowedExtensions.Contains(fileext.ToLower()))
        return true;
}
return false;
Image Processor: MS Computer Vision
This component analyzes images and extracts detected text and concepts.
- This component uses Microsoft Computer Vision API.
How to Configure the Microsoft Computer Vision Processor
To configure the Microsoft Computer Vision Processor, use the following image and steps:
- Open your pipeline.
- Expand the New Component section.
- Select Image Processor MS Computer Vision.
- Enter a component name.
- Click the + Add link.
- Click Apply.
- Click the name of the component in the ordered list to open it for configuration. Complete the following fields:
- Endpoint Url: The endpoint URL used when configuring the Cognitive Services instance.
- Api Key: Enter the API key obtained after configuring the Cognitive Services API.
- Extract text: If selected, OCR is enabled (any text present in the images will be extracted in a separate property)
- Extract tags: Enable this field for label/object recognition.
- Label score threshold: Specify a value between 0 and 1 to represent the minimum confidence score accepted for an image label.
- Accepted Extensions: Enter the image file extensions that you want to process with the Cognitive Services API. The API only supports the following image formats: JPEG, PNG, GIF, BMP.
- Additional input property: Specify an input property of type List<byte []> representing additional images to be processed by the pipeline. For example, a list of all the images previously extracted from a document.
- Cached data validity in days: Specify the amount of time, in days, that the image tagging data is cached. If the stage receives the same image multiple times, the API is called only once, and subsequent requests reuse the cached result while the cache is valid.
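The caching behavior described for Cached data validity in days can be pictured with the following Python sketch. It is an illustration of the idea only, not the component's code: keying the cache on a SHA-256 hash of the image bytes, the `TaggingCache` class, and the `call_api` callback are all assumptions made for this example.

```python
import hashlib
import time

SECONDS_PER_DAY = 86400


class TaggingCache:
    """Reuse a tagging result for identical image bytes while it is still valid."""

    def __init__(self, validity_days, call_api, clock=time.time):
        self.validity_seconds = validity_days * SECONDS_PER_DAY
        self.call_api = call_api  # hypothetical function: image bytes -> result
        self.clock = clock        # injectable so that expiry can be simulated
        self.entries = {}         # image hash -> (stored_at, result)

    def get_tags(self, image_bytes):
        key = hashlib.sha256(image_bytes).hexdigest()
        now = self.clock()
        cached = self.entries.get(key)
        if cached is not None and now - cached[0] < self.validity_seconds:
            return cached[1]      # cache hit: the API is not called
        result = self.call_api(image_bytes)
        self.entries[key] = (now, result)
        return result


calls = []


def fake_api(image_bytes):
    """Stand-in for the real tagging call; records each invocation."""
    calls.append(image_bytes)
    return ["label"]


current_time = [0.0]
cache = TaggingCache(validity_days=7, call_api=fake_api, clock=lambda: current_time[0])

cache.get_tags(b"same image")
cache.get_tags(b"same image")          # served from the cache
current_time[0] = 8 * SECONDS_PER_DAY  # eight days later the entry has expired
cache.get_tags(b"same image")
print(len(calls))
```

A longer validity period means fewer API calls for repeated images, at the cost of serving older tagging results.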
Input Properties
- File RawData
- (Optional) The property specified in the Additional input property configuration option.
Output Properties
Property | Type |
---|---|
 | Text – Multi |
 | Text |
MSAllLabelsWithScore | Text – Multi – each entry is in the following format: Label;Score |
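Code that consumes MSAllLabelsWithScore downstream has to split each entry back into its label and score. The following is a minimal Python sketch, assuming that the score is a decimal number that `float()` can parse. The function name and the sample entries are made up for illustration.

```python
def parse_label_score(entry):
    """Split one "Label;Score" entry into (label, score)."""
    # Split on the last semicolon so that a label containing ";" stays intact.
    label, separator, score = entry.rpartition(";")
    if not separator:
        raise ValueError(f"expected 'Label;Score', got {entry!r}")
    return label.strip(), float(score)


# Hypothetical entries in the Label;Score format.
entries = ["outdoor;0.98", "building;0.87", "sky;0.42"]
parsed = [parse_label_score(e) for e in entries]
strong = [label for label, score in parsed if score >= 0.8]
print(strong)
```

The same parsing applies to any property that uses the Label;Score entry format.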
Image Processor: Amazon Rekognition
This component analyzes images and extracts detected text and concepts.
- This component uses Amazon Rekognition API
How to Configure Amazon Rekognition Image Processor
To configure the Amazon Rekognition Image Processor, use the following image and steps:
- Open your pipeline.
- Expand the New Component section.
- Select Image Processor Amazon Rekognition.
- Enter a component name.
- Click the + Add link.
- Click Apply.
- Click the name of the component in the ordered list to open it for configuration. Complete the following fields:
- If you enable the Use Credentials file field:
  - Credentials file location: Enter the location of the AWS credentials file. For example, C:\Users\Luca\Desktop\credentials.txt. An AWS credentials file may look like the following example:
    [{profilename}]
    aws_access_key_id = {accessKey}
    aws_secret_access_key = {secretKey}
  - Credentials Profile Name: Enter the AWS profile name for the credentials file.
- If you do not enable the Use Credentials file field:
  - API Key: Enter the API Key for your Amazon Rekognition instance.
  - Secret Access Key: Enter your Amazon account secret access key.
- Amazon Web Service Region: Select the Region of your Amazon Web Service. For more information on the supported Regions for Amazon Rekognition, see the Amazon documentation.
- Extract text: If selected, OCR is enabled and any text present in the images will be extracted in a separate property.
- Extract Labels: If selected, label/object recognition is enabled.
- Label score threshold: Specify a value between 0 and 100 that will represent the minimum confidence score accepted for an image label.
- Accepted Extensions: Enter the image file extensions that you want to process with the component. The API only supports the following image formats: JPEG, PNG.
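If you use a credentials file, a quick check that the file contains the profile entered in Credentials Profile Name can save a failed pipeline run. The following Python sketch reads the INI-style format shown above with the standard `configparser` module. The function name, the profile name, and the key values are placeholders, and the check is a convenience that runs outside the product, not something the component itself performs.

```python
import configparser


def read_aws_profile(credentials_text, profile_name):
    """Return (access key, secret key) for `profile_name`, or raise ValueError."""
    # Disable interpolation so that a "%" in a key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile_name):
        raise ValueError(f"profile {profile_name!r} not found")
    section = parser[profile_name]
    required = ("aws_access_key_id", "aws_secret_access_key")
    missing = [name for name in required if name not in section]
    if missing:
        raise ValueError(f"profile {profile_name!r} is missing: {', '.join(missing)}")
    return section["aws_access_key_id"], section["aws_secret_access_key"]


# Placeholder content; in practice, read the file at Credentials file location.
sample = """
[myprofile]
aws_access_key_id = EXAMPLEACCESSKEY
aws_secret_access_key = EXAMPLESECRETKEY
"""
access_key, secret_key = read_aws_profile(sample, "myprofile")
print(access_key)
```

A `ValueError` from this check points to a profile name that does not match the file, or to a missing key line.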
Input Properties
- File RawData
- (Optional) The property specified in the Additional input property configuration option.
Output Properties
Property | Type |
---|---|
AWSExtractedLabels | Text – Multi |
AWSExtractedText | Text |
AWSAllLabelsWithScore | Text – Multi – each entry is in the following format: Label;Score |