Amazon Comprehend Medical (NLP)

This component analyzes text and extracts medical entities and inferred medical concepts, such as RxNorm concepts.

This component uses the Amazon Comprehend Medical NLP service.

High Data Usage: Save Money by Using a Trigger

To help you avoid accidentally high cloud data costs, BA Insight designed a predefined "trigger" for the NLP component:

  • The trigger is enabled by default, in sample form.

  • The trigger provides sample code that lists the sources to process (with sample values).

  • You must modify the trigger code according to your needs.

Note: Not implementing this trigger can result in high data costs from your cloud data provider.

For example, when you crawl 5 different content sources with a total of 1M items, sending every document for natural language extraction can generate high cloud data costs.

Use the following steps:

  1. Add the desired component as usual.

  2. When configuring the component, expand the Trigger section and adapt the predefined trigger to your needs.

  3. Continue with component configuration.

The predefined script value (modify before using):

// Change the trigger script to match the sources
// that you want this stage to process.
// For the example below to work, ContentSource metadata must be
// available on the item to be processed; otherwise, adapt the
// script to match your metadata and the sources you want to process.
var allowedContentSources = new List<string>() { "Source1", "Source2" };
string contentSource = item.Get<string>("ContentSource");
if (string.IsNullOrEmpty(contentSource) ||
    !allowedContentSources.Contains(contentSource))
    return false;
return true;

The Components affected by this change are all the BA Insight components that make API requests to Microsoft:

  • Microsoft Text Analytics

  • Custom Vision AI

  • Image Processor MS Computer Vision

  • Video Processor Microsoft Video Indexer

How to Trigger Your Pipelines Only for Video Files

The script below can be entered into the Trigger screen code window above.

With this script, the pipeline runs only if the detected file extension is one of the video formats listed in the script.

  • Add to the script only the formats you truly wish to process.

  • Adding all the supported video file extensions is unnecessary and inefficient.

This helps you reduce data usage and therefore cost.

For more on using Pipeline Triggers, see Add Triggers to Determine When Your Pipelines Run.

// Add all allowed extensions here, always in lowercase.
var allowedExtensions = new List<string>() { "mpeg4", "mp4", "avi" };
string fileext = item.Get<string>("escbase_fileextension");
if (fileext != null)
{
    if (allowedExtensions.Contains(fileext.ToLower()))
        return true;
}
return false;
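
The two trigger scripts above can be combined when a stage should run only for video files that come from specific content sources. The following is a minimal sketch under stated assumptions, not BA Insight code: `TriggerSketch` and `ShouldProcess` are hypothetical names, and a plain dictionary stands in for the pipeline `item` so that the logic can be run outside the product. In a real trigger you would read the two values with `item.Get<string>(...)`, as the scripts above do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of BA Insight): combines the
// content-source check and the file-extension check in one place.
public static class TriggerSketch
{
    // Sample values only; replace with your own sources.
    static readonly HashSet<string> AllowedContentSources =
        new HashSet<string> { "Source1", "Source2" };

    // Case-insensitive set, so "MP4" and "mp4" match the same entry.
    static readonly HashSet<string> AllowedExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "mpeg4", "mp4", "avi" };

    // 'metadata' is an assumption: it stands in for the pipeline item.
    public static bool ShouldProcess(IDictionary<string, string> metadata)
    {
        string contentSource;
        string fileExt;
        metadata.TryGetValue("ContentSource", out contentSource);
        metadata.TryGetValue("escbase_fileextension", out fileExt);

        // Missing metadata means the item is skipped.
        if (string.IsNullOrEmpty(contentSource) || string.IsNullOrEmpty(fileExt))
            return false;

        return AllowedContentSources.Contains(contentSource)
            && AllowedExtensions.Contains(fileExt);
    }
}
```

Skipping items whose metadata is missing keeps the trigger on the cheap side: an item is sent to the cloud service only when both checks pass.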

How to Configure the Amazon Comprehend Medical Component

To configure this component, use the following image and steps:

Credentials can be provided in one of two ways:

  • Enter an AWS credentials file and an AWS Region.

  • Provide the API Key and API Secret directly.

  1. Check Use Credentials file to use a credentials file.

  2. AWS Credentials file location:
    1. Enter the location of the AWS credentials file.
    2. Example: C:\Users\Luca\Desktop\credentials.txt

      Example AWS Credentials File

      [{profilename}]
      aws_access_key_id = {accessKey}
      aws_secret_access_key = {secretKey}

  3. Credentials Profile Name:
    1. Enter the AWS profile name used in the credentials file
      or
    2. Use the API Key and API Secret directly (see the next two fields).


  4. API Key:
    1. Set up Amazon Comprehend, obtain an API key, and enter the key into this field.
  5. API Secret:
    1. Enter the API secret.
  6. Amazon Web Service Region:
    1. Select the Region of your Amazon Web Service.
    2. The supported Regions for Amazon Comprehend Medical are listed in the AWS documentation.
  7. Input Property:
    1. The property whose text is used for entity extraction.
    2. Default value: 'body'
  8. Include Entities:
    1. Check to include extraction of medical entities.
  9. Roll Up Category Entities:
    1. Check to roll up the entities from all categories into a single property.
    2. Output Property: AmazonMedicalEntities
  10. Entities score threshold:
    1. Enter a value between 0 and 1.
    2. Entities with a score below the threshold are not reported.
  11. Maximum Entities:
    1. Enter the maximum number of entities to return.
    2. The returned entities are trimmed to this maximum by score.
  12. Include RxNorm Concepts:
    1. Check to include extraction of RxNorm inferred concepts.
  13. RxNorm Concept score threshold:
    1. Enter a value between 0 and 1.
    2. Concepts with a score below the threshold are not reported.
  14. RxNorm Maximum Concepts:
    1. Enter the maximum number of concepts to return.
    2. The returned concepts are trimmed to this maximum by score.
  15. Include LCD CM Concepts:
    1. Check to include extraction of LCD CM (ICD-10-CM) inferred concepts.
  16. LCD CM Concept score threshold:
    1. Enter a value between 0 and 1.
    2. Concepts with a score below the threshold are not reported.
  17. LCD CM Maximum Concepts:
    1. Enter the maximum number of concepts to return.
    2. The returned concepts are trimmed to this maximum by score.
  18. Send raw response as metadata:
    1. Check to include the raw response returned by the Amazon web service as metadata.
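
The score threshold and maximum settings above work together: low-scoring results are dropped first, and the remainder is cut down to the configured maximum. The sketch below illustrates one plausible reading of that behavior. It assumes that "trimmed by score" means the highest-scoring results are kept; `ScoredEntity` and `EntityFilter.Apply` are hypothetical names used for illustration and are not taken from the component.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: an extracted text with its confidence score (0 to 1).
public sealed class ScoredEntity
{
    public string Text { get; }
    public double Score { get; }

    public ScoredEntity(string text, double score)
    {
        Text = text;
        Score = score;
    }
}

// Hypothetical helper illustrating threshold-then-trim filtering.
public static class EntityFilter
{
    public static List<ScoredEntity> Apply(
        IEnumerable<ScoredEntity> entities, double scoreThreshold, int maxEntities)
    {
        if (scoreThreshold < 0 || scoreThreshold > 1)
            throw new ArgumentOutOfRangeException(
                nameof(scoreThreshold), "Expected a value between 0 and 1.");
        if (maxEntities < 0)
            throw new ArgumentOutOfRangeException(
                nameof(maxEntities), "Expected a non-negative value.");

        return entities
            .Where(e => e.Score >= scoreThreshold) // drop results below the threshold
            .OrderByDescending(e => e.Score)       // assumption: highest score first
            .Take(maxEntities)                     // trim to the configured maximum
            .ToList();
    }
}
```

Under this reading, a threshold of 0.5 and a maximum of 2 applied to scores 0.95, 0.80, 0.40, and 0.99 would keep the entries scored 0.99 and 0.95.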

Output Properties

Property                               Type
AmazonMedicalEntities                  Text – Multi
AmazonMedicalResponse                  Text
AmazonMedicalSerializedEntitiesJson
  • Serialized value of the most important entities.
  • Note: this is useful for summary generation.

Medical Entities

The component detects medical entities in your text. The following table lists the categories it detects and the output property that each category is written to, in addition to the output properties listed above.

Amazon Category                 Property                                   Type
ANATOMY                         AmazonMedicalANATOMY                       Text – Multi
MEDICAL_CONDITION               AmazonMedicalMEDICALCONDITION              Text – Multi
MEDICATION                      AmazonMedicalMEDICATION                    Text – Multi
PROTECTED_HEALTH_INFORMATION    AmazonMedicalPROTECTEDHEALTHINFORMATION    Text – Multi
TEST_TREATMENT_PROCEDURE        AmazonMedicalTESTTREATMENTPROCEDURE        Text – Multi
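
Judging from the rows above, each property name appears to be the prefix `AmazonMedical` followed by the Amazon category with its underscores removed. The sketch below expresses that naming pattern. The rule is inferred from the table only, and `PropertyNames.ForCategory` is a hypothetical helper, not part of the component.

```csharp
using System;

// Hypothetical helper: derives an output property name from an Amazon
// category, following the pattern visible in the table above.
public static class PropertyNames
{
    const string Prefix = "AmazonMedical";

    public static string ForCategory(string amazonCategory)
    {
        if (string.IsNullOrWhiteSpace(amazonCategory))
            throw new ArgumentException(
                "Category must not be empty.", nameof(amazonCategory));

        // Inferred rule: remove underscores and keep the name in upper case.
        return Prefix + amazonCategory.Trim().Replace("_", "").ToUpperInvariant();
    }
}
```

A helper like this can be handy when mapping detected categories to index properties in your own code, but check the resulting names against the properties the component actually emits.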

Amazon Types Examples

Amazon Type    Property                    Type
AGE            AmazonMedicalAGE            Text – Multi
PROFESSION     AmazonMedicalPROFESSION     Text – Multi
DXNAME         AmazonMedicalDXNAME         Text – Multi
GENERICNAME    AmazonMedicalGENERICNAME    Text – Multi
BRANDNAME      AmazonMedicalBRANDNAME      Text – Multi

Text to Output Example

Passing the following text to Amazon Comprehend Medical would produce the corresponding output: