How to Classify Videos

High Data Usage: Save Money by Using a Trigger
How to Trigger Your Pipelines Only for Video Files
How to Extract Labels and Concepts from Videos
Configure the Microsoft Video Indexer component for offline processing

High Data Usage: Save Money by Using a Trigger

To help you avoid unexpectedly high cloud data costs, BA Insight designed a predefined "trigger" for components that call Microsoft cloud services, such as the NLP component:

The trigger is enabled by default, in sample form.
The trigger provides sample code that lists the sources to process (with sample values).
You must modify the trigger code according to your needs.

IMPORTANT! Not implementing this trigger can result in high data costs from your cloud data provider.

For example, when you crawl 5 content sources with a total of 1M items and no trigger limits which items are processed, the high number of documents sent for Natural Language extraction can generate high cloud data costs.

Use the following steps:

Add the desired component as usual.
When configuring the component, expand the Trigger section and adapt the predefined trigger to your needs.
Continue with component configuration.

The predefined (modify before using) script value:

Sample Script - Modify Before Using

// Change the trigger script to match the sources
// that you want this stage to process.
// For the example below to work, ContentSource metadata must be
// available on the item to be processed; otherwise, adapt the
// script to match your metadata and the sources you want to process.

var allowedContentSources = new List<string>() { "Source1", "Source2" };
string contentSource = item.Get<string>("ContentSource");
if (string.IsNullOrEmpty(contentSource) ||
    !allowedContentSources.Contains(contentSource))
    return false;

return true;

The components affected by this change are all the BA Insight components that make API requests to Microsoft:

Microsoft Text Analytics
Custom Vision AI
Image Processor MS Computer Vision
Video Processor Microsoft Video Indexer

How to Trigger Your Pipelines Only for Video Files

You can enter the script below into the code textbox on the Trigger page. With this script, the pipeline runs only when the detected file extension is one of the video formats in the list. Add only the formats you actually want to process; listing every supported video file extension is unnecessary and inefficient.

For more on using Pipeline Triggers, see Add Triggers to Determine When Your Pipelines Run.

// Add all allowed extensions here, always in lowercase.
var allowedExtensions = new List<string>() { "mpeg4", "mp4", "avi" };
string fileext = item.Get<string>("escbase_fileextension");
if (fileext != null)
{
    if (allowedExtensions.Contains(fileext.ToLower()))
        return true;
}
return false;
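
If you want to restrict processing by both content source and file extension, the two checks shown on this page can be combined into a single trigger. The following is a minimal sketch, not a script supplied by BA Insight. It assumes that both the ContentSource and escbase_fileextension metadata are available on the item. The FakeItem and TriggerSketch classes are hypothetical stand-ins that let the logic run outside a pipeline; in the Trigger code textbox, only the body of ShouldProcess would apply, using the real item object.

```csharp
using System;
using System.Collections.Generic;

// Demo: run the combined check against two stand-in items.
var video = new FakeItem().Set("ContentSource", "Source1").Set("escbase_fileextension", "MP4");
var other = new FakeItem().Set("ContentSource", "Source3").Set("escbase_fileextension", "mp4");
Console.WriteLine(TriggerSketch.ShouldProcess(video)); // True
Console.WriteLine(TriggerSketch.ShouldProcess(other)); // False

static class TriggerSketch
{
    // Returns true only when the item comes from an allowed source
    // AND has an allowed video extension (sample values, adapt both lists).
    public static bool ShouldProcess(FakeItem item)
    {
        var allowedContentSources = new List<string>() { "Source1", "Source2" };
        var allowedExtensions = new List<string>() { "mpeg4", "mp4", "avi" };

        string contentSource = item.Get<string>("ContentSource");
        if (string.IsNullOrEmpty(contentSource) ||
            !allowedContentSources.Contains(contentSource))
            return false;

        string fileext = item.Get<string>("escbase_fileextension");
        if (fileext == null)
            return false;

        return allowedExtensions.Contains(fileext.ToLower());
    }
}

// Hypothetical stand-in for the pipeline item; it only mimics Get<T>(name).
class FakeItem
{
    private readonly Dictionary<string, object> _props = new Dictionary<string, object>();

    public FakeItem Set(string name, object value) { _props[name] = value; return this; }

    public T Get<T>(string name) =>
        _props.TryGetValue(name, out var value) ? (T)value : default;
}
```

Checking the content source first means that items from other sources are rejected before the extension is examined.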

How to Extract Labels and Concepts from Videos

Video Processor: Azure AI Video Indexer

This component uses Azure AI Video Indexer to analyze videos and extract detected labels and concepts.

Prerequisites

Before configuring the Microsoft Video Indexer component, you must complete the following prerequisites:

In the Microsoft Azure portal, you must create an Azure AI Video Indexer resource. For more information, see Create an Azure AI Video Indexer (VI) account in the Microsoft documentation.
In the Microsoft Azure portal, you must create a New app registration or select an existing registration. For more information, see Register an application in the Microsoft documentation.
After registering your application, you must grant the "Contributor" role to your application. For more information, see Assign Azure roles using the Azure portal in the Microsoft documentation.

Component configuration

To configure the component, use the following steps:

Open your pipeline.
Expand the New Component section.
Select Video Processor Microsoft Video Indexer.
Enter a component name.
Click the + Add link.
Click Apply.
Click the name of the component in the ordered list to open it for configuration. Complete the following fields:
1. Tenant Id: Enter the tenant ID for your Microsoft Azure video indexer application. See How to find your Microsoft Entra tenant ID for more information.
2. App Id: Enter the application ID for your Microsoft Azure video indexer application. You can find your application ID from the Overview page in the Microsoft Azure portal.
3. App Secret: Enter the application secret key for your Microsoft Azure video indexer application. You can find your application secret key from the Certificates & secrets page in the Microsoft Azure portal.
4. Resource Group: Enter the resource group name for your Microsoft Azure video indexer application. You can find your resource group name from the Resource groups page in the Microsoft Azure portal.
5. VI Account Name: Enter the Video Indexer account name for your Microsoft Azure video indexer application. You can find your VI Account Name from the Azure AI Video Indexer page in the Microsoft Azure portal.
6. Subscription Id: Enter the subscription ID for your Microsoft Azure video indexer application. You can find your subscription ID from the Subscriptions page in the Microsoft Azure portal.
7. Extract transcripts: Enable this field if you want to obtain transcripts from the videos.
8. Explicit content detection: Enable this field to detect whether the video contains explicit or inappropriate scenes.
9. Language detection: Enable this field to detect the language spoken in the videos.
10. Accepted extensions: Specify the video file extensions you want the stage to process. Separate each extension using a semicolon (;).
11. Results interrogation interval in seconds: Specify the interval at which the component requests results from Microsoft. The annotation process is asynchronous: the video file is uploaded to Microsoft, and the component then polls the server for results at this interval.
12. Number of interrogation retries: Specify the maximum number of retries for the interrogation process. If the number of retries exceeds the specified number, the annotation process is aborted without returning any results.
13. Maximum video size (in MB): Specify the maximum video size. Only videos of this size or smaller are processed.
14. Overwrite raw data with extracted info: Enable this field if you want the raw data to be replaced with the extracted labels and transcripts. This is useful when you want to index the information extracted from the video rather than the raw video data.
15. Send raw response as metadata: Enable this field to store the annotation response as a serialized JSON.
16. Additional input property: Specify an input property of type List<byte[]>, which represents the additional videos to be processed by the pipeline. For example, a list of all of the videos previously extracted from a document.
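
Taken together, the interrogation interval and the number of interrogation retries bound how long the component waits for a result. The following is a rough sketch of that arithmetic. The values 30 and 20 are illustrative, not product defaults; the PollingSketch helper is hypothetical; and the calculation assumes one request per interval, ignoring upload and request time.

```csharp
using System;

// Illustrative values only; these are not product defaults.
int intervalSeconds = 30;
int maxRetries = 20;

TimeSpan budget = PollingSketch.MaxWait(intervalSeconds, maxRetries);
Console.WriteLine(budget.TotalMinutes); // 10

static class PollingSketch
{
    // Approximate upper bound on polling time, assuming one request per interval.
    public static TimeSpan MaxWait(int intervalSeconds, int maxRetries)
        => TimeSpan.FromSeconds((double)intervalSeconds * maxRetries);
}
```

If your videos typically take longer than this bound to annotate, the process is aborted without returning results, so consider raising one of the two settings.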

Input Properties

File RawData
(Optional) The property specified in the “Additional input property” configuration option.

Output Properties

Property	Type
`MSVideoAllVideoLabels`	Text – Multi
`MSVideoTranscripts`	Text – Single
`MSVideoEntireJSON`	Minified JSON
`MSVideoExplicitContent`	Text – “True”/“False”
`MSVideoLanguage`	Text – Multi
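
A later stage can act on these output properties, for example to skip videos flagged as explicit. The following sketch shows only the comparison logic. It assumes that MSVideoExplicitContent holds the text "True" or "False" as listed in the table; the OutputSketch helper is hypothetical. In a pipeline script, the value would presumably be read with a call such as item.Get<string>("MSVideoExplicitContent"), which is an assumption based on the trigger samples on this page rather than documented behavior.

```csharp
using System;

Console.WriteLine(OutputSketch.IsExplicit("True"));  // True
Console.WriteLine(OutputSketch.IsExplicit("False")); // False
Console.WriteLine(OutputSketch.IsExplicit(null));    // False

static class OutputSketch
{
    // Treats a missing value as "not flagged"; adjust if you prefer
    // to reject items whose explicit-content result is unavailable.
    public static bool IsExplicit(string flag)
        => string.Equals(flag, "True", StringComparison.OrdinalIgnoreCase);
}
```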

Configure the Microsoft Video Indexer component for offline processing

You can configure the Microsoft Video Indexer component to be used for offline processing in an online pipeline. For more detail on adding an offline processing component, see the documentation on how to use offline processing.

Create an offline pipeline with the Microsoft Video Indexer component.
Click the component and configure it, as described in step 7 of the Component configuration section above.
Create an online pipeline and add the offline processing component.
Enter a name for your offline processing component.
Click the component and specify the following configuration settings:
1. File Storage Location: Specify a File Share location for storing raw file data.
2. Store Raw Data: Enable this field if a component configured in the Offline Pipeline requires the raw data file.
3. Include Properties: Enter the following values: FileExtension,OriginalPath,FileName,Url.
4. Offline Pipeline: Select the offline pipeline that you created in the first step of this procedure.
Click Apply.