Configuring AutoClassifier Content Enrichment Web Service in SharePoint

The AutoClassifier CEWS component allows you to enrich the managed properties of items at index time.

  • In order for this processing to happen, you configure your SharePoint system to call this enrichment component at crawl time.
  • This can be achieved by setting the AutoClassifier enrichment component on the current Search Service application.

Procedure:

The Content Enrichment functionality is configured and enabled with a Windows PowerShell cmdlet. The steps below generate that cmdlet and run it:

  1. To help you build the cmdlet with the right parameters, use the Cews Settings page.
  2. To locate this page, access the AutoClassifier SharePoint app and click the "Cews Settings" button.
  3. On the Cews Settings page, input your information into all the fields shown.
    When you are done, click "Generate PS Cmdlet" to generate a script.
    1. SharePoint Content Enrichment is strict about which properties can be posted to the service and which properties can be returned.
      1. As a result, a property produced by AutoClassifier is not returned when invoked by the SharePoint crawler unless it is specified in the Output Properties field. (An output property can be added only if it already exists in the Search Service application.)
    2. To increase the accuracy of the Input and Output fields, you can import managed properties from your SharePoint farm for selection.
      1. You can re-import the properties at any point in the future.
    3. To export managed properties from SharePoint for import, AutoClassifier provides a PowerShell script.
      1. Open the SharePoint Management Shell as a Farm Admin.
      2. Navigate to <SharePoint AddIn root directory>\Config\export-mp.ps1
      3. Execute export-mp.ps1.
      4. Once the CSV file is generated, you can use the upload option to import the managed properties into AutoClassifier.

  4. Copy the generated script.
  5. Next, you use this script to set up the enrichment component.
  6. Open the SharePoint Management Shell as a Farm Admin.
  7. Enter the generated script and, when requested, type the name of the default search service application.
    The following graphic offers an example:
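
Separately from the graphic, the note in step 3 says an output property is accepted only if it already exists in the Search Service application. A minimal sketch of how that could be checked from the SharePoint Management Shell before generating the cmdlet is given here. It assumes a single Search Service application in the farm, and the property name "Some Property" is only a placeholder; this is an illustration, not part of the AutoClassifier tooling.

```powershell
# Assumption: run in the SharePoint Management Shell as a Farm Admin,
# with exactly one Search Service application in the farm.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Placeholder list: replace with the output properties you plan to use.
$outputProperties = @("Some Property")

foreach ($name in $outputProperties) {
    try {
        # Look the managed property up by name; a failed lookup lands in the catch block.
        $mp = Get-SPEnterpriseSearchMetadataManagedProperty -SearchApplication $ssa -Identity $name -ErrorAction Stop
        Write-Host "Found managed property '$($mp.Name)'."
    }
    catch {
        Write-Warning "Managed property '$name' was not found in the Search Service application."
    }
}
```

Any property reported as missing would need to be created as a managed property before it can be listed in the Output Properties field.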

Alternative CEWS Configuration Method

The following configuration method is PowerShell driven, as explained in the links below, which provide background information:

Procedure:

  1. To add a content enrichment component to a SharePoint Search Service application you prepare and run a script by using:
    1. cmdlets
    2. variables
    3. parameters
  2. The first variable is the Search service application:
    $ssa = Get-SPEnterpriseSearchServiceApplication
  3. The second variable is the new content enrichment configuration:
    $CEconfig = New-SPEnterpriseSearchContentEnrichmentConfiguration
  4. Next, add the required parameters to the $CEconfig variable:

    $CEconfig.Endpoint = "http://<servername>:1967/EnrichmentMS.svc"

    # These property examples are for preview. Adjust to your situation.
    $CEconfig.InputProperties = "IsDocument", "body", "Author", "Filename", "FileType", "MimeType", "FileExtension", "OriginalPath", "url", "ContentType", "contentclass", "LastModifiedTime", "Size", "escbasecrawlurl"

    # These would be, for example, your AutoClassifier output properties.
    $CEconfig.OutputProperties = "Some Property"

    $CEconfig.SendRawData = $True

    # Value in KB
    $CEconfig.MaxRawDataSize = 8192

    $CEconfig.DebugMode = $False

  5. Next, set your new content enrichment configuration:
    Set-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa -ContentEnrichmentConfiguration $CEconfig
  6. Use the following command to confirm your new enrichment:
    Get-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa
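
Beyond reading the configuration back, it can be useful to confirm that the server can reach the enrichment endpoint at all. The sketch below is one possible check, assuming a single Search Service application and a Windows version that includes Test-NetConnection. It only tests that the TCP port is open, not that the service responds correctly.

```powershell
# Assumption: run in the SharePoint Management Shell on a server that performs crawls.
$ssa = Get-SPEnterpriseSearchServiceApplication
$current = Get-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa

# Show the fields that were set in the procedure above.
$current | Format-List Endpoint, InputProperties, OutputProperties, SendRawData, MaxRawDataSize, DebugMode

# Parse the endpoint URL and try to open a TCP connection to its host and port.
$uri = [System.Uri]$current.Endpoint
$probe = Test-NetConnection -ComputerName $uri.Host -Port $uri.Port

if (-not $probe.TcpTestSucceeded) {
    Write-Warning "Cannot reach $($uri.Host) on port $($uri.Port)."
}
```

If the warning appears, checking firewall rules and that the enrichment service is running would be a reasonable next step before starting a crawl.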

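Set-SPEnterpriseSearchContentEnrichmentConfiguration accepts the settings only as a configuration object, not as individual parameters, so the complete configuration still uses the two variables. As one consolidated script, the code appears as shown here:

$ssa = Get-SPEnterpriseSearchServiceApplication
$CEconfig = New-SPEnterpriseSearchContentEnrichmentConfiguration
$CEconfig.Endpoint = "http://<servername>:1967/EnrichmentMS.svc"
$CEconfig.InputProperties = "IsDocument", "body", "Author", "Filename", "FileType", "MimeType", "FileExtension", "OriginalPath", "url", "ContentType", "contentclass", "LastModifiedTime", "Size", "escbasecrawlurl"
$CEconfig.OutputProperties = "Some Property"
$CEconfig.SendRawData = $True
$CEconfig.MaxRawDataSize = 8192
$CEconfig.DebugMode = $False
Set-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa -ContentEnrichmentConfiguration $CEconfig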
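
If the enrichment configuration needs to be withdrawn, for example while troubleshooting crawl errors, the reverse operation might look like the following sketch. It assumes a single Search Service application and is offered as an illustration, not as an AutoClassifier-specific procedure.

```powershell
# Assumption: run in the SharePoint Management Shell as a Farm Admin.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Remove the content enrichment configuration from the Search Service application.
# Later crawls will then no longer call the enrichment endpoint.
Remove-SPEnterpriseSearchContentEnrichmentConfiguration -SearchApplication $ssa
```

The configuration can be reapplied afterwards by running the configuration script again.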