Azure OpenAI information extractor component
The Azure OpenAI information extractor component uses natural language queries, answered by Azure OpenAI or OpenAI large language models (LLMs), to extract information from your documents. The component can also suggest metadata properties and questions based on the documents that you provide during configuration.
Prerequisites
You must complete the following prerequisites before setting up the Azure OpenAI information extractor component:
Configure the component
To configure your component, do the following:
-
In the AutoClassifier administration portal, add a new component to a new or existing pipeline.
-
When adding your component, select Azure OpenAI information extractor from the New Component list and provide a name for your component.
-
In the Configuration section, use the Select AI Provider field to specify whether you want to use Azure OpenAI or OpenAI as your provider.
-
If you selected Azure OpenAI, specify the following fields:
-
In the Deployment URL field, enter the deployment URL of your Azure OpenAI service resource. To find your deployment URL, in Azure AI Studio, click Deployments in the left panel, then click your deployed resource. On the deployed resource page, copy the Target URI value in the Endpoint section.
You must include the API version in the endpoint URL. Upland BA Insight recommends copying the URL from Azure AI Studio because it already includes the API version.
-
In the Api Key field, enter the API key for your Azure OpenAI service resource.
-
To find your API key, in the Azure portal, click your Azure OpenAI service resource.
-
In the left panel, click Keys and Endpoint and copy the value in the KEY field.
-
In the Input Property field, enter the input property that you want the component to get from the extracted text.
-
In the Max Character To Process field, enter the maximum number of characters that you want the component to process. Every Azure OpenAI model has a limit on the number of tokens that it can process. For reference, 1 token equates to roughly 4 characters. For more information, see Azure OpenAI Service quotas and limits in the Microsoft documentation.
-
In the Max Metadata Suggestions field, specify the maximum number of metadata suggestions that you want to receive from the Azure OpenAI model.
-
You can also get suggestions for extracted properties by clicking Choose Files, selecting any number of files, and clicking Get suggestions from files. After you have done so, the metadata table will populate with various property names and queries based on the data that was extracted from the selected documents.
-
If you selected OpenAI, specify the following fields:
-
In the Model field, enter the name of your OpenAI model. For example, gpt-4.
-
In the API Key field, enter the secret key for your OpenAI resource. To locate your OpenAI API keys, see https://platform.openai.com/api-keys.
-
In the Input Property field, enter the input property that you want the component to get from the extracted text.
-
In the Max Character To Process field, enter the maximum number of characters that you want the component to process. Every OpenAI model has a limit on the number of tokens that it can process. For reference, 1 token equates to roughly 4 characters. For more information, see Azure OpenAI Service quotas and limits in the Microsoft documentation.
-
In the Max Metadata Suggestions field, specify the maximum number of metadata suggestions that you want to receive from the OpenAI model.
-
You can also get suggestions for extracted properties by clicking Choose Files, selecting any number of files, and clicking Get suggestions from files. After you have done so, the metadata table will populate with various property names and queries based on the data that was extracted from the selected documents.
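Two of the values above can be checked before you enter them: whether the deployment URL carries an API version, and how many characters fit within a model's token limit. The following is a minimal sketch, not part of the product. The function names, the example URL, the API version string, and the token figures are all hypothetical, and the 4-characters-per-token ratio is only the rough estimate quoted in the steps.

```python
from urllib.parse import urlparse, parse_qs

# Rough ratio quoted in the configuration steps; real tokenization varies.
CHARS_PER_TOKEN = 4


def has_api_version(deployment_url: str) -> bool:
    """Return True if the URL has a non-empty api-version query parameter."""
    query = parse_qs(urlparse(deployment_url).query)
    return bool(query.get("api-version"))


def max_characters(token_limit: int, reserved_tokens: int = 0) -> int:
    """Estimate a Max Character To Process value from a model's token limit.

    reserved_tokens is a hypothetical allowance for the questions and the
    model's answer; set it to 0 if you do not want to reserve anything.
    """
    usable_tokens = max(token_limit - reserved_tokens, 0)
    return usable_tokens * CHARS_PER_TOKEN


# Hypothetical values for illustration only.
example_url = (
    "https://example-resource.openai.azure.com/openai/deployments/"
    "example-deployment/chat/completions?api-version=2024-02-01"
)
print(has_api_version(example_url))
print(max_characters(8192, reserved_tokens=1000))
```

Because the ratio is approximate, treat the result as an upper bound and choose a lower value if documents are truncated or requests fail.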
Metadata table
After you select files and get suggestions from them, the Azure OpenAI information extractor populates the metadata table with properties and property questions based on the content of those files. The following table describes each column in the metadata table:
| Column | Description |
|---|---|
| Property Name | This column displays the name of the metadata property. For example, First_Name. |
| Property Question | This column displays the question that the AI model is asked to get the data for the stated property. For example, for the First_Name property, the question is "What is the patient’s first name?". |
| DataType | This column displays an initial data type of the property. You can manually change the data type of a property by clicking the dropdown and selecting another data type from the list. |
| Edit | You can click this button to edit the applicable property. |
| Delete | You can click this button to delete the applicable property from the table. |
Additionally, you can manually add a metadata property and question to the list. In the fields below the table, provide a property name, property question, and select a data type, then click Add.
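If you plan your properties before entering them, it can help to think of each table row as a small record. The sketch below is illustrative only: the class and function names are hypothetical, the data type value "Text" is a placeholder rather than a confirmed option in the dropdown, and the duplicate-name check is an assumption about what a sensible table looks like, not documented product behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataProperty:
    name: str       # Property Name column
    question: str   # Property Question column
    data_type: str  # DataType column


def add_property(table: list, name: str, question: str, data_type: str) -> list:
    """Append a row, rejecting blank fields and duplicate property names."""
    if not name.strip() or not question.strip():
        raise ValueError("property name and question are required")
    if any(existing.name == name for existing in table):
        raise ValueError(f"duplicate property name: {name}")
    table.append(MetadataProperty(name, question, data_type))
    return table


# Build a draft table with one row taken from the example in this section.
draft_table = add_property([], "First_Name", "What is the patient's first name?", "Text")
print(draft_table)
```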
Output details
Based on the property names and property questions that were added to the metadata table, the Azure OpenAI information extractor component will extract the applicable metadata from the document. For example:
First_Name: {'Tony','Steve'}
Last_Name: {'Stark','Rogers'}
Contact_Number: {'4493435','4493436'}
Work_Phone: {'4323123'}
Patient_ID: {'94643534','94643536'}
Review_Date: {'1 08 2024','1 09 2024'}
Doctor_s_Name: {'Dr Rogers'}
Doctor_s_ID: {'2067995460'}
Doctor_s_Email: {'rogers@cityhospital.com'}
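If you need to work with values like these in a script, the layout above can be read into a dictionary. This is a minimal sketch that assumes the output is available as plain text in exactly the layout shown, one property per line with single-quoted values inside braces. That assumption is not confirmed by this documentation, and the function name is hypothetical.

```python
import re

# One property per line: a name, a colon, then quoted values inside braces.
LINE_PATTERN = re.compile(r"^\s*([^:]+):\s*\{(.*)\}\s*$")
VALUE_PATTERN = re.compile(r"'([^']*)'")


def parse_output(text: str) -> dict[str, list[str]]:
    """Map each property name to the list of values extracted for it."""
    properties = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        name, values = match.groups()
        properties[name.strip()] = VALUE_PATTERN.findall(values)
    return properties


sample = """First_Name: {'Tony','Steve'}
Work_Phone: {'4323123'}
Doctor_s_Email: {'rogers@cityhospital.com'}"""

for name, values in parse_output(sample).items():
    print(name, values)
```

Each property maps to a list because a single document can yield several values for the same question, as First_Name does in the example.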