What are Components?
AutoClassifier Components are the functions and operations that AutoClassifier performs on your data. These operations can include:
- Read or extract metadata
- Modify existing metadata
- Create new metadata
- Expand data classification
- Capture and store data
How Do You Use Components?
To perform these operations, you can add and modify Components to:
- Perform script processing to manage your results at a detailed level
- Limit the number of values that are returned as output
- Extract the body and metadata from raw binary files
- Write custom classification tags
- Use custom components to perform operations such as script-based classification
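As an illustration of script-based classification combined with an output limit, here is a minimal Python sketch. It is not AutoClassifier's scripting API: the `RULES` table, the `classify` function, and the `max_values` parameter are all hypothetical names chosen for this example.

```python
from typing import Dict, List

# Hypothetical rule table: tag -> keywords that trigger it.
# This is an illustrative format, not AutoClassifier's own.
RULES: Dict[str, List[str]] = {
    "finance": ["invoice", "payment"],
    "legal": ["contract", "clause"],
    "hr": ["salary", "onboarding"],
}


def classify(body: str, max_values: int = 2) -> List[str]:
    """Return at most max_values tags, ordered by number of keyword hits."""
    text = body.lower()
    # Count keyword occurrences per tag.
    scores = {tag: sum(text.count(k) for k in kws) for tag, kws in RULES.items()}
    matched = [tag for tag, score in scores.items() if score > 0]
    # Highest score first; ties broken alphabetically for stable output.
    matched.sort(key=lambda tag: (-scores[tag], tag))
    # Limit the number of values returned as output.
    return matched[:max_values]
```

Calling `classify(text, max_values=1)` would keep only the strongest tag, which is one way to think about limiting the values a component returns.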
Components are Bundled into Pipelines
- Components, which are bundled into pipelines, are integrated into the metadata enhancement process to calculate and return additional metadata.
- Depending on the resource and component dependencies, you choose whether your pipelines run sequentially or in parallel, and what triggers each component to run.
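The idea of a pipeline whose components run in order, each guarded by a trigger, can be sketched as follows. This is a conceptual model only, assuming a document is a plain dictionary; `Component`, `run_pipeline`, and the example components are hypothetical and do not reflect how AutoClassifier is configured.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


@dataclass
class Component:
    """Hypothetical component: a named operation plus a trigger condition."""
    name: str
    run: Callable[[Doc], None]                      # updates the document in place
    trigger: Callable[[Doc], bool] = lambda doc: True  # runs always by default


def run_pipeline(doc: Doc, components: List[Component]) -> List[str]:
    """Run components sequentially; return the names of those that fired."""
    executed = []
    for component in components:
        if component.trigger(doc):
            component.run(doc)
            executed.append(component.name)
    return executed


def add_length(doc: Doc) -> None:
    doc["length"] = len(doc["body"])


def tag_long(doc: Doc) -> None:
    doc["tags"] = ["long"]


# The second component depends on metadata produced by the first,
# so the two must run sequentially rather than in parallel.
pipeline = [
    Component("length", add_length),
    Component("tag_long", tag_long, trigger=lambda d: d.get("length", 0) > 10),
]
```

In this sketch, a dependency between components (here, `tag_long` reading `length`) is what forces sequential order; components with no shared dependencies could in principle run in parallel.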
Add Pipeline Stages to Process Input Documents
After you configure your components and pipelines, you can add other components as stages to your pipeline to post-process the input documents and refine the input metadata.
- This metadata can be pushed back along with the documents to their storage location (O365, for example) to be used in free-text queries.
- For example, if your source contains raw binary files, you could add the Tika Extractor component as a stage in your AutoClassifier pipeline.
- Tika Extractor extracts the document body and metadata, which lets AutoClassifier process the document body as plain text.
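To show why an extraction stage belongs before classification, the following Python sketch chains two stages over a raw binary input. The extraction stage here is a simple stand-in that decodes bytes as UTF-8; it is not Tika Extractor and does not call any Tika API. All function names and metadata keys are hypothetical.

```python
from typing import Any, Dict

Doc = Dict[str, Any]


def extract_stage(doc: Doc) -> Doc:
    """Stand-in for an extraction stage: turn raw bytes into plain text."""
    raw = doc["raw"]
    doc["body"] = raw.decode("utf-8", errors="replace")
    doc["metadata"] = {"size_bytes": len(raw)}
    return doc


def classify_stage(doc: Doc) -> Doc:
    """Later stage: works on the plain-text body produced by extraction."""
    body = doc["body"].lower()
    doc["metadata"]["contains_invoice"] = "invoice" in body
    return doc


def process(raw: bytes) -> Doc:
    """Run the stages in order: extraction first, then classification."""
    doc: Doc = {"raw": raw}
    for stage in (extract_stage, classify_stage):
        doc = stage(doc)
    return doc
```

If the stages were swapped, `classify_stage` would fail because no plain-text body exists yet, which mirrors the reason for placing an extractor ahead of the stages that read document text.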