How to Enrich Indexed Content with Metadata
Select and Organize Metadata to Index (Add Metadata to Datasets)
Datasets provide a way to enrich indexed content with metadata from an associated content source.
For example, suppose you have an indexed record that defines an employee profile.
- This master record is linked to another content source through a common employeeID field.
- This other source contains the employee status, for example, active, on leave, and so on.
- By joining this status record as a dataset with the master record, your indexed record is richer in terms of properties.
The result is a higher probability that users find the results they search for.
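The join described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the product performs the join: the record fields other than employeeID, the sample values, and the `enrich` helper are all hypothetical.

```python
# Hypothetical master records; only the employeeID field comes from the text.
master_records = [
    {"employeeID": 101, "name": "Ada"},
    {"employeeID": 102, "name": "Grace"},
    {"employeeID": 103, "name": "Linus"},
]

# Hypothetical status dataset, keyed by the shared employeeID field.
status_dataset = {
    101: {"status": "active"},
    102: {"status": "on leave"},
}


def enrich(record, dataset, key="employeeID"):
    """Return a copy of the record with the matching dataset row merged in."""
    enriched = dict(record)
    # A record with no matching dataset row is returned unchanged.
    enriched.update(dataset.get(record[key], {}))
    return enriched


enriched_records = [enrich(r, status_dataset) for r in master_records]
for rec in enriched_records:
    print(rec)
```

Records 101 and 102 gain a status property, while record 103 has no matching row in the dataset and keeps only its original properties.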
- Datasets are a type of content source that has limited functionality.
- For example, you cannot crawl a dataset.
- You can associate one or more datasets with your content source.
- You cannot associate one content source with another content source.
- For example, you cannot associate an Exchange content source with a Documentum content source.
Similar to content sources, you create the dataset connection before you create the dataset.
(For more information about supported datasets or for assistance setting up your dataset, contact bainsight-support@uplandsoftware.com.)
This section provides a SQL example.
How to Add a Dataset Connection
- Datasets > Dataset Connections: See the Dataset Connections > General Settings page that appears.
- Click New and select your dataset connection. Dataset connectors are only supported for encrypted databases. When you are adding your dataset connection, you can only do so with the basic connection mode.
- The <name of the dataset> Connection page appears. Complete the following fields.
- Title (required): Enter a unique, descriptive name.
- SQL Server (required): Enter the name of the server where your database is located (such as myservername) or the IP address for this server.
- Database Name (required): Enter the name of your database, such as Training.
- Authentication mode: Specify one of the following:
- Use Service Account: Use this account to grant access to the database for the accounts running the Job Service and the Connectivity Hub site.
- Specify User Account: Use the account credentials for a single user.
- Click Save.
- Go to the Datasets page.
- Click the General Settings tab.
- Add the Filtering Clause to link the dataset to the content source. This clause identifies the record information that the two sources have in common.
- Title (required): Enter the title of your database.
- Connection (required and automatically entered): The connection to your dataset.
- SQL Query (required): Specify a valid SQL statement without filtering parameters.
- SQL Filter Clause (required): Specify a valid SQL filtering condition. The filtering parameters can be specified using:
- Bracketed [] names for replacement purposes. For example, specify [ID],'P2'.
- Typed parameters to improve performance. For example, @S_PARAMNAME for strings and @I_PARAMNAME for integers, where PARAMNAME can be any name.
Typed parameter names must be written in uppercase. Parameter names that are written in lowercase, such as @i_paramname, will not be recognized.
- Click Advanced Settings and complete the following fields.
- Click the Cache check box to enable caching for the purposes of improving performance.
- Cache Timeout (required): By default, this specification is set to 1000.
- Script Library: Enter a VB.NET script whose functions are used in all other scripts as library methods. For example, type:
Function doSomeThing(inval As String) As String
    ' Return the input string with every "a" replaced by "b".
    Return inval.Replace("a", "b")
End Function
- Click Save.
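To illustrate how the SQL Query and SQL Filter Clause fields relate, here is a minimal Python sketch that uses an in-memory SQLite database in place of SQL Server. Several things here are assumptions rather than documented behavior: the EmployeeStatus table, the `lookup` helper, and in particular the idea that the filter clause is appended to the query as a WHERE condition. SQLite writes named parameters with a leading colon, so the typed parameter that would be @I_EMPLOYEEID in the dataset settings appears as :I_EMPLOYEEID.

```python
import sqlite3

# Hypothetical dataset table standing in for the SQL Server database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EmployeeStatus (employeeID INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO EmployeeStatus VALUES (?, ?)",
    [(101, "active"), (102, "on leave")],
)

# SQL Query: a valid statement without filtering parameters.
sql_query = "SELECT employeeID, status FROM EmployeeStatus"

# SQL Filter Clause: the condition that links a dataset row to a content record.
# Uppercase typed parameter name, written with SQLite's ':' prefix.
sql_filter_clause = "employeeID = :I_EMPLOYEEID"


def lookup(connection, query, filter_clause, params):
    """Assumed combination: append the filter clause as a WHERE condition."""
    statement = f"{query} WHERE {filter_clause}"
    return connection.execute(statement, params).fetchall()


rows = lookup(conn, sql_query, sql_filter_clause, {"I_EMPLOYEEID": 102})
print(rows)  # [(102, 'on leave')]

no_match = lookup(conn, sql_query, sql_filter_clause, {"I_EMPLOYEEID": 999})
print(no_match)  # []
```

Binding the value as a parameter, rather than pasting it into the statement text, lets the database reuse the prepared statement, which is one plausible reason typed parameters help performance.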
How to Import Metadata from Another Content Source
Prerequisites
You can import metadata from another content source if both of these content sources are similar.
Choose to import your metadata when both of the following conditions are met:
- You have two or more similar content sources that share metadata properties with the same custom scripting or complex mapping.
- Regenerating the metadata for each of the content sources would require excessive manual reconfiguration.
Import Metadata
- Go to Content > Actions > Metadata: Click and the Content Metadata page appears.
- Select import source: Use the down arrow to locate a content source such as SQL Content that you want to import the metadata from.
- Import: Click OK to import the specified metadata and to see the properties, descriptions and so on in the table below New.
- On the Content Metadata page, review the metadata columns:
- Edit/Create new mappings where a single metadata column can be mapped to one, two, or more crawled properties in Property, or not mapped to any crawled properties.
- Map the metadata in the source system to SharePoint.
- Map the crawled properties to the managed properties.
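As a rough mental model of the import, the mappings can be pictured as a table from each metadata column to the crawled properties it reads from. The Python sketch below is hypothetical throughout: the column names, the property names, and the `import_mappings` helper are invented for illustration and do not reflect the product's internal storage.

```python
import copy

# Hypothetical mappings: metadata column -> crawled properties it reads from.
# A column can map to one property, several properties, or none.
source_mappings = {
    "Title": ["title"],
    "Author": ["created_by", "modified_by"],
    "Notes": [],
}


def import_mappings(source):
    """Return an independent copy so later edits do not alter the source."""
    return copy.deepcopy(source)


target_mappings = import_mappings(source_mappings)

# Review and adjust the imported mappings for the target content source.
target_mappings["Author"].append("owner")

print(source_mappings["Author"])
print(target_mappings["Author"])
```

Copying the mappings rather than sharing them matches the review step above: the imported columns can be edited for the target content source while the source they came from stays as it was.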
How to Add Metadata to a Dataset
Datasets enrich the metadata that is associated with a document by adding metadata from another source (the dataset).
For this reason, any metadata that is added in the Metadata page that has a value is associated with the related documents.
- Each dataset is a secondary content source that has its own set of metadata.
- Metadata that is associated with datasets provides information that otherwise would not be present for a document.
- You can specify metadata (custom or from the dataset columns) for your datasets using the Dataset Metadata page that is similar to the Content Metadata page.
- For example, you might choose to specify DS_ as the prefix for your dataset metadata in order to track this metadata with the content metadata (ESC_) in the index.
- This tracking is necessary because after indexing, there is no way to tell where the data came from.
Unless you specify a prefix that is different from the prefix ESC_ that is used for content sources, there is no way to track the source of your metadata.
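The effect of the prefixes can be sketched as follows. The DS_ and ESC_ prefixes come from the text above; the property names, the sample values, and the `with_prefix` and `origin` helpers are hypothetical and only illustrate why distinct prefixes keep the two sources apart in the index.

```python
CONTENT_PREFIX = "ESC_"  # prefix used for content source metadata
DATASET_PREFIX = "DS_"   # example dataset prefix


def with_prefix(prefix, metadata):
    """Return the metadata with every property name prefixed."""
    return {f"{prefix}{name}": value for name, value in metadata.items()}


# Hypothetical metadata; both sources happen to define a "Status" property.
content_metadata = {"Name": "Ada", "Status": "profile"}
dataset_metadata = {"Status": "on leave"}

# With distinct prefixes, the two "Status" properties do not collide.
indexed = {
    **with_prefix(CONTENT_PREFIX, content_metadata),
    **with_prefix(DATASET_PREFIX, dataset_metadata),
}


def origin(property_name):
    """Tell where an indexed property came from, based on its prefix."""
    if property_name.startswith(DATASET_PREFIX):
        return "dataset"
    if property_name.startswith(CONTENT_PREFIX):
        return "content source"
    return "unknown"


for name in sorted(indexed):
    print(name, "=", indexed[name], "from", origin(name))
```

If both sources used the same prefix, the two Status values would share one property name and their origin could no longer be recovered from the index.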
- Go to Content Sources > SQL Content > > Metadata and see the Metadata page that appears.
- New: Click to select any of the following and complete the required Title field and make any changes that you require:
- Boolean metadata
- DateTime metadata
- Integer metadata
- Numeric metadata
- Text metadata
- Title (required): Enter a title.
- Value
- The value is calculated from a content source (Default)
- The value is calculated by an enrichment pipeline: Click to add the generated metadata from your Database Connector to the content.
- To make changes to the metadata settings, click > Edit:
- Reopen the selected metadata pane/Delete.
- Click Generate to see specified metadata type results.