NetDocuments Adapter
The NetDocuments Adapter integrates AutoClassifier with NetDocuments.
NetDocuments Adapter Capabilities and Limitations
- Performance
- Adapter can push additional metadata to NetDocuments
- Adapter can perform incremental crawls
- Throttling
- Note: BA Insight AutoClassifier handles throttling pushbacks from the NetDocuments API and throttles down its requests accordingly.
- Profile Attributes
- You must manually create the Profile Attributes in your NetDocuments repository and enable them on the cabinets you wish to index.
- For more information, see Defining Custom Profile Attribute.
- Defined Custom Attributes
IMPORTANT!
AutoClassifier does not currently support Link Metadata types.
Only non-linked types are supported.
- Custom attributes must have the same names as the AutoClassifier output properties you wish to push to NetDocuments.
- Single-valued metadata
- AutoClassifier single-valued metadata can be pushed to Text or Notes type NetDocuments attributes.
- The Text type is limited to 50 characters, while the Notes type is limited to 60,000 characters.
- Multi-valued metadata
- AutoClassifier multi-valued metadata can only be pushed to Lookup-Table type attributes.
- Metadata values sent by the AutoClassifier Engine are automatically pushed as available Lookup-Values to the Lookup-Table attributes that match the metadata name.
- See the Lookup-Table configuration that AutoClassifier can push multi-valued properties to in the example below.
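As a rough illustration of the type rules above, the following Python sketch picks a target attribute type for an output property. The function `pick_attribute_type` and its arguments are hypothetical and are not part of AutoClassifier or the NetDocuments API; only the 50 and 60,000 character limits and the Lookup-Table rule come from this page.

```python
TEXT_LIMIT = 50        # Text attribute limit stated above
NOTES_LIMIT = 60_000   # Notes attribute limit stated above


def pick_attribute_type(values, multi_valued):
    """Return an attribute type that could hold `values`, or None if none fits.

    `values` is a list of strings; `multi_valued` says whether the
    output property is multi-valued (hypothetical inputs for this sketch).
    """
    if multi_valued:
        # Multi-valued metadata can only go to Lookup-Table attributes.
        return "Lookup-Table"
    if not values:
        return None
    length = len(values[0])
    if length <= TEXT_LIMIT:
        return "Text"
    if length <= NOTES_LIMIT:
        return "Notes"
    return None  # too long for either single-valued type


print(pick_attribute_type(["Contract"], multi_valued=False))
print(pick_attribute_type(["Paris", "London"], multi_valued=True))
```

A check like this could be run over sample output before creating the profile attributes, to decide which type each one needs.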
- Special Character Metadata Names
- NetDocuments does not support attribute (metadata) names that contain special characters.
- For example: Microsoft_Location.
- To handle these, you need to add the Metadata Name Sanitizer component as one of the last stages in your processing pipeline.
- If none of your output metadata names contain special characters, this component is not needed.
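To make the idea of name sanitizing concrete, here is a minimal Python sketch. It is not the Metadata Name Sanitizer component itself, whose exact rules are not described on this page. The sketch assumes that a "special character" is anything other than an ASCII letter or digit and that such characters are simply removed; the real component may replace them or follow different rules.

```python
import re

# Assumption for this sketch: only ASCII letters and digits are "safe".
_SPECIAL = re.compile(r"[^A-Za-z0-9]")


def sanitize_name(name: str) -> str:
    """Remove every character that is not an ASCII letter or digit."""
    return _SPECIAL.sub("", name)


def needs_sanitizing(name: str) -> bool:
    """True if the name contains at least one special character."""
    return _SPECIAL.search(name) is not None


print(sanitize_name("Microsoft_Location"))
```

Running `needs_sanitizing` over your output property names is one way to decide whether the sanitizer stage is needed in the pipeline at all.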
- Newly created profile attributes and changed cabinet profile attribute settings
- When you create a new profile attribute and associate it with your NetDocuments cabinets and folders, the NetDocuments REST API used by the AutoClassifier Adapter might not immediately return information about the new attribute, so metadata might not be pushed.
- To force the NetDocuments REST API to reload the current list of profile attributes, re-authorize with NetDocuments OAuth as described in the configuration steps below.
- If you created a new Lookup Table attribute in NetDocuments and want to populate it with AutoClassifier metadata, first add at least one key (a sample entry) to the lookup table so that the NetDocuments API considers the profile attribute to be in use.
- Special characters for metadata values
- Currently, metadata values are cleaned up before they are pushed to NetDocuments, and certain non-alphanumeric characters are removed.
- If the profile attribute uses a lookup table, all special characters are removed; otherwise, only '\' is removed.
- Multi-valued metadata threshold recommendation
- BA Insight recommends applying thresholds to multi-valued metadata, so that only the top X most relevant values are used.
- This is required because every metadata value of a multi-valued attribute must belong to the lookup table.
- Without a threshold, the number of lookup value keys can grow very large over time.
- For example, for NLP, you can configure your stages to return only the 10 most relevant labels.
- Using the Rules Engine is a good multi-value metadata option because the number of potential tags is limited to the number of taxonomy nodes.
- Note: NLP entities, for example, can produce an unlimited number of metadata values, depending on the content of your documents.
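The threshold recommendation above can be sketched as a simple top-X selection. This is an illustration only: the `(label, score)` input shape and the function `top_labels` are assumptions, and in AutoClassifier the threshold is configured on the pipeline stages rather than written in code.

```python
import heapq


def top_labels(scored_labels, limit=10):
    """Keep only the `limit` highest-scoring labels.

    `scored_labels` is an iterable of (label, score) pairs, a
    hypothetical shape chosen for this sketch.
    """
    best = heapq.nlargest(limit, scored_labels, key=lambda pair: pair[1])
    return [label for label, _score in best]


candidates = [("invoice", 0.91), ("memo", 0.12), ("contract", 0.67)]
print(top_labels(candidates, limit=2))
```

Capping the list this way bounds how many new keys a single document can add to a lookup table.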
- Read-Only fields
- In NetDocuments, profile attributes can be configured as read-only, so that only Admin users with the appropriate permissions can change them.
- To have the AutoClassifier NetDocuments Adapter work with such fields, make sure the user you authorize with has the proper admin permissions.
How to Configure the NetDocuments Adapter
Use the following steps to configure the NetDocuments Adapter:
- Select the NetDocuments Adapter Source to configure.
- Complete the fields shown in the graphic below.
- Server Region:
- Region of the NetDocuments repository (US, EU, AU)
- API URL:
- Base URL used for subsequent API calls.
- Changes automatically based on the selected Server Region.
- Authorize:
- Click to obtain the Authorization Code.
- You are redirected to NetDocuments and prompted to log in and grant access to the application.
- Click Allow.
- Authorization Code:
- Paste into this field the code that appears after you click the Authorize button.
- Get Refresh/Access Tokens:
- Click this button to obtain the Access and Refresh tokens from NetDocuments.
- Repository ID:
- ID of the repository to be indexed.
- To obtain the Repository ID, log in with a NetDocuments admin account and go to Admin → Information and Settings → General → ID.
- Get Cabinets:
- When selected, fetches a list of all Cabinets in the repository and adds the list to the Cabinets to Process field, each cabinet on a separate line.
- Select the properties you require (hold down Shift to select multiple properties).
- Click the Apply button.
- Cabinets to Process
- A list of all the cabinets in the repository to be indexed.
- After the list is populated, you can manually remove the cabinets you do not want to include in processing.
- Each cabinet is added on a new line in the text area, in the format "cabinet name:cabinet id".
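If you edit the Cabinets to Process list by hand, a quick check of the "cabinet name:cabinet id" format can catch mistakes. The Python sketch below is a hypothetical helper, not part of the adapter. It assumes the ID is everything after the last colon, so that cabinet names containing a colon still parse; the cabinet IDs in the example are made up.

```python
def parse_cabinets(text):
    """Parse 'cabinet name:cabinet id' lines into {cabinet_id: name}."""
    cabinets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the LAST colon (assumption: IDs contain no colon).
        name, sep, cabinet_id = line.rpartition(":")
        name, cabinet_id = name.strip(), cabinet_id.strip()
        if not sep or not name or not cabinet_id:
            raise ValueError(
                f"Expected 'cabinet name:cabinet id', got: {line!r}"
            )
        cabinets[cabinet_id] = name
    return cabinets


sample = "Contracts:NG-AAAA1111\nHR: Policies:NG-BBBB2222\n"
print(parse_cabinets(sample))
```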
- Get Cabinet Attributes:
- When selected, this option fetches a list of Cabinet attributes in the repository for the selected Cabinets.
- Classifier Controlled Attributes:
- A list of Cabinet Attributes that are updated exclusively by AutoClassifier.
- Filter:
- The NetDocuments filter that is applied at crawl time.
- The filter may be empty; its maximum length is 1500 characters.
- Enumeration Page Size:
- Maximum number of retrieved items.
- Crawl Metadata Only:
- When checked, Document data is excluded during the crawl.
- Only metadata will be available for tagging.
- Max File Size To Download (MB):
- The maximum file size, in MB, that will be downloaded for documents.
- All items exceeding this value will be processed in Metadata Only mode.
- Number Of Parallel Blob Downloads:
- The number of document blobs that can be downloaded in parallel.
- Adjust this value based on the file sizes in your NetDocuments system and your network connection to it.
- Number Of Parallel Document Updates:
- The number of concurrent document metadata update requests sent to the NetDocuments system.
- Max Time To Wait For Request (minutes):
- The total waiting time for each request to the NetDocuments API.
- Lookup Table Limit:
- Maximum number of values that can be added to a profile attribute of type Lookup Table.
- Default value: 1000000
- Request Retry Max Attempts:
- The number of times a request to the NetDocuments API should be retried.
- Full Crawl Start Date:
- Adapter will retrieve only the items with a modified date later than the date shown in this text box.
- Total crawl time varies by size of repository.
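To picture how the Request Retry Max Attempts and Max Time To Wait For Request settings might interact, here is a minimal Python sketch of a retry loop with exponential backoff. It is not the adapter's implementation. The function `call_with_retry`, the backoff schedule, and the reading of the two settings (total number of attempts, and an overall time budget per request) are all assumptions made for illustration.

```python
import time


def call_with_retry(request, max_attempts=3, max_wait_minutes=5.0,
                    base_delay=1.0, sleep=time.sleep, clock=time.monotonic):
    """Call `request()` until it succeeds or the limits are reached.

    max_attempts: total attempts (assumed meaning of the retry setting).
    max_wait_minutes: overall time budget (assumed meaning of the
    wait setting). `sleep` and `clock` are injectable for testing.
    """
    deadline = clock() + max_wait_minutes * 60
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            if clock() + delay > deadline:
                raise  # waiting again would exceed the time budget
            sleep(delay)


attempts = {"count": 0}


def flaky_request():
    # Hypothetical request that fails twice, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


print(call_with_retry(flaky_request, max_attempts=5, sleep=lambda s: None))
```

Doubling the delay between attempts is a common way to back off when an API is throttling; the values used here are placeholders, not adapter defaults.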