Machine Learning Classification Providers

About Machine Learning Classification Providers
Existing Machine Learning Provider Information
How to Add a New Machine Learning Provider
How to Configure Your Machine Learning Provider
How to Map Your Untrained ML Provider to a Taxonomy
Set Rules for Your Terms (Nodes)
Train Your Machine Learning Provider
How to Retrain Your ML Provider with New Documents (Data)
Rules Engine Pipeline Stage

About Machine Learning Classification Providers

Machine Learning Providers are trainable models, which classify your content as follows:
- Automatically, without rules
- With a combination of ML tagging and rules
Machine learning classification providers can be found by selecting the "Machine Learning Providers" link from the left side navigation. See the graphic below.
From this screen you can manage your Machine Learning Providers.
For each Machine Learning Provider, you can map ONE taxonomy.
There is a 1-to-1 mapping of taxonomies and Machine Learning Providers.

Existing Machine Learning Provider Information

Field	Description
Name	ML Provider name
Type	Type of ML Provider: Microsoft Azure Machine Learning Studio
Status	Available values: Trained: Training documents have been uploaded and the ML provider algorithm has ingested the data from them. Not Configured: ML provider is not configured Initialized: No training documents have been uploaded Needs Retraining: ML provider contains documents, but new documents have been added that have not been captured by the ML algorithm. Until retraining is performed, the ML provider uses only the data from the documents it has been trained on.
Documents	Number of documents uploaded for the purpose of data training.
Last Trained	Execution date or time of last completed training session.
Last Training Duration	Duration of last training session in format hh:mm:ss
Usage	Mapped taxonomy. Example: Sentiment, ConversationSentiment, etc.
Enabled	Values are: Enabled Not Configured
Delete	Click to delete an ML Provider.
Export	Click to export a Machine Learning Provider as an XML configuration file for later import into another environment. Note: You import an XML configuration file (containing your settings) using the "Import" button under the box that contains Machine Learning types. After import, you must retrain your Machine learning component.

How to Add a New Machine Learning Provider

Two types of Machine Learning providers are supported:

MS Azure Machine Learning Studio
- Prerequisites

To add a new Machine Learning Provider, navigate to the Machine Learning Provider page and expand the section New Provider .
Select your ML provider type.
Enter your component name in the Component Name: field.
1. Note that spaces and special characters are both NOT supported in component names.
2. Your component name may have a maximum 24 alphanumerical characters

How to Configure Your Machine Learning Provider

Use the following instructions to configure your machine learning providers:

Click the name of your Machine Learning provider from the Name column in the table at the top of the page (see the graphic above).
The Settings screen opens. See the graphic below.
Enter the Component name.
1. Note: Once set, your component name cannot be changed later!
  
  MS Azure Machine Learning Studio
  1. MS Azure ML Studio. Enter the following information (all settings are required):
    1. Wrapper Service Address:
      1. The windows service address created from python wrapper script
      2. Default: http://localhost:1989
    2. Resource Group Name:
      1. The name of the resource group the Azure Machine Learning workspace belongs to
    3. Workspace Name:
      1. The name of your Azure Machine Learning Service Workspace
      2. Available from your Azure portal. For more about Azure Workspaces, click here
    4. Subscription ID:
      1. The subscription id from Resource Group.
    5. Tenant ID: Tenant Id from App created for authentication in Prerequisites
    6. Service Principal ID: Application Id from App created for authentication in Prerequisites
    7. Service Principal Password: Client Secret from App created for authetncation in Prerequisites
    8. Storage Account Name:
      1. The name of your Azure storage account.
      2. Where your temporary files (data) are stored.
      3. Files are stored for retraining. Also, contains a key for authentication.
    9. Storage Account Key:
      1. The Storage Account Key of your Azure storage account
    10. Aml Compute Cluster Name:
      1. The name of compute cluster created in Prerequisites
    11. Instance Nodes Number:
      1. Number of iterations allowed to run in parallel during training (how many nodes are allowed to consume from the compute cluster in parallel)
    12. Web Service Allocated Cores:
      1. The number of CPU cores to allocate for the Webservice endpoint used for classification.
    13. Web Service Allocated Memory:
      1. Number of GB allocated for the Webservice endpoint used for classification
    14. Tag score threshold:
      1. A setting (floor) of confidence.
      2. Threshold of confidence (probability) required for tags to be returned by your ML provider.
      3. For example, a tag with a threshold of 0.2 will not be returned if your threshold score is set to 0.3 or higher.
        Values: 0.0-1.0.
    15. Max Concurrent Calls:
      1. This should only be changed if you get error during training that the max concurrent calls value is too high.
      2. This depends on your Azure ML Workspace Location but in general 200 value is used.
    16. Training Status Refresh Interval In Minutes:
      1. Amount of time (in minutes) between re-polling the API of the ML provider for ready state.
      2. The more data you have, the greater this value should be.
      3. No maximum value.
    17. Request Timeout in Hours:
      1. Timeout limit for trainer to run.
When finished, click Apply.
Click "Machine Learning Providers" from the left-side navigation.
Your newly created Machine Learning Provider appears in the list of "Existing Machine Learning Providers".

How to Map Your Untrained ML Provider to a Taxonomy

You can map your untrained Machine Learning Provider to a new or existing taxonomy.
Click "Manage Taxonomies" from the left-side navigation.
Select a taxonomy.
1. Enter a new taxonomy in the Enter new taxonomy name text box and click Create and then select the new taxonomy from the list shown,
  or
2. Select an existing taxonomy from the list shown.
The Rules page opens.
To add a term (node), right-click your Taxonomy at the top of the right pane and select "Add term" from the sub-menu that appears.
1. Enter all the terms you have training data for.
In the following example, the Taxonomy "SentimentDetect" has two terms defined
- Positive
- Negative
These terms are detected by the ML algorithm in conversation or text that it processes.
Select your Taxonomy and then select the (available) ML provider for the taxonomy from the drop-down menu in the MACHINE LEARNING tab. The setting is automatically saved.
Now set the training data for your terms or "nodes."
1. Note: Training data can not be set for the root ("SentimentDetect," in the example below).
Select the node you want to add training data for.
1. In the example below, we select the Positive node.
Drag-and-drop or click the icon at the bottom of the MACHINE LEARNING tab to add the documents that contain training data.

See the example below (the documents added contain "positive" and "negative" language).
Selecting the document in the left-side pane reveals the document contents on the right side.
Add training documents for all nodes.

Set Rules for Your Terms (Nodes)

Term rules support the following rules:

RegEx
Kusto Query Language (KQL)

For each term or "node" the ML rule used when processing training documents is shown in the Term Rule dialog box.

You can set the Term Rule for each term (or node) as follows:

No rule
Default (auto-generated) rule
Combination of complex (RegEx/KQL) rules and ML rules

If the term or ML rules find a match in a training document, the tag is applied.

Enter the Term Rule for each term in your taxonomy.
Click Save changes after each rule is entered.

To clear the training data for a Machine Learning Provider, use the "Clear All Training Data" button next to the Train button discussed in the following section.

If you change the taxonomy a Machine Learning provider is configured for, you must clear the training data before using the Machine Learning Provider with another taxonomy.

This way old training data is dropped.

Train Your Machine Learning Provider

Once the procedures above are complete, you are ready to train your Machine Learning Provider.

If you are still on the Rules page, click back in your web browser.
Select Machine Learning Providers from the left side navigation.
The available ML providers appear.
- Note: The Status of your ML provider will be "Needs Training."
Select your Machine Learning Provider by clicking its name.
Click the Train button at the bottom of the screen.

Note: Machine learning training is not Incremental, it is Full.

If you add documents to your ML provider, a full training process must be run to incorporate all data.
Depending on how much data you provide your ML provider, and how many items, the training process can take 3-6 hours or more.
Proceed to set up your Rules Engine Pipeline stage next (below), the tagging engine which applies the ML provider taxonomy.

How to Retrain Your ML Provider with New Documents (Data)

Re-open your ML provider and add more documents, using the steps in the procedure above.
The Status of the ML provider under "Existing Machine Learning Providers" changes from "Trained" to "Needs Retraining" as it contains new data that has not been processed.
Repeat steps 2-5 above to retrain your ML provider and incorporate the new data.

Rules Engine Pipeline Stage

For ML providers to tag, you must add the Rules Engine pipeline stage.
See the following graphic.

How to Test Your ML Provider Taxonomy

To see a sample of which tags you receive based on ML classification, go to the Pipeline Testing page, highlighted below.

Select Pre-Recorded data from the drop-down menu, or else enter in raw data taken from one of your training documents.
For insight into ML tagging, check the Show Raw Rules Engine Tags check box to view which tags are generated from each output property.
Set your Log Level using the drop-down menu.
Click the Start Test button at the bottom of the screen.
The test runs.

Standard Output

The following is returned if the Show Raw Rules Engine Tags check box is NOT checked:

Input Properties
- body: Raw Text Data value
- uid: Unique Identification Number
Output Properties
- Classification Tags: Tags generated by your Taxonomy
- Status: Test status

Expanded Output

The following is returned if the Show Raw Rules Engine Tags check box is checked:

Input Properties
- body: Raw Text Data value.
- uid: Unique Identification Number.
Output Properties
- RulesEngine: Tags generated from rules.
- [ML Provider]: Tags generated by your Taxonomy.
- Status:
  - Test status.
  - Values are "Success" or "Failure".
  - To debug test failures, set your log level to "debug" using the drop-down menu above the message.