Filter content from a Content Source

Crawling your content sources may capture content or content types that you do not want. To restrict the crawl to specific content, use the Content Filter feature described below.

  • Depending on the connector you are using, the filter options that appear (as shown in the graphic below) will vary.

If you change anything on the Content Filters page, you must run a full crawl (by running a target job) to update the index.

  1. Navigate to Content Sources.

    1. Click <your Web Service Connector> > > Edit.

    2. Click the Content Filters tab.

  2. Depending on the connector that you are using, any of the following fields may appear. Refer to the following table for more information and configuration instructions:

    Setting Description
    Content types

    The Content type tree contains the content types from your source connection. The tree is populated when you run the datastore types load job from either the Tasks or Connections heading links at the top of the screen.

    Note: These fields are optional.

    1. Leave the default Include All, or choose to include/exclude specific content types. You can use the Shift key for multiple selections.

    Datastores selection

    This setting allows you to break up your content sources into datastore groups or individual datastores. This setting only applies to datastores that you flag as active for this connection.

    1. Specify if you want to include all datastores, or only include/exclude the datastores specified in the textbox.

    2. In the textbox, enter a comma-separated list of datastore IDs or ranges. For example, 1, 2, 3-10, 50-100.

    Custom Filters

    This setting allows you to specify your custom filters and their separators in the textbox. Refer to the description text for examples.

    Locales

    This setting allows you to specify the locale that is configured for your environment. You can specify only one value per content source. For example, en-US or fr-CA.

    This field is optional. If no value is provided, the default locale configured on the source system is used.
    Warning: This field will override the values of the Custom Filters field.
    Category Filter

    This setting allows you to specify category filters for various source system items. Note the following:

    • You can include or exclude these items by specifying a tag followed by the category item. For example, IncludeCatalog or ExcludeCatalog.

    • Include/exclude filters are separated by a semicolon (;). Include/exclude tags and their values are separated by an equal sign (=). For example, IncludeCatalog=Mobiles ; ExcludeCatalog=Software.

    • Multiple values are delimited by a vertical bar (|). For example, IncludeKnowledgeBase=Outlook|Tips and tricks.

      • To escape any of the characters mentioned above in a category name, add a backslash (\) before each character.

    • Include and exclude filters for the same category type can appear on the same line, but you cannot mix category types. For example:

      • IncludeCatalog=Mobiles ; ExcludeCatalog=Software is valid.

      • IncludeCatalog=Mobiles ; ExcludeIncidents=Inquiry is invalid.

    • You do not need to have both the include and exclude filters for an item.

    • The include/exclude value may be empty. For example, IncludeIncidents=.

    For more information, see the setup and configuration instructions for your specific connector.

    Warning: This field will override the values of the Custom Filters field.
  3. Save your changes, then run a full crawl target job as described in the Setting up a target section.
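
Before pasting values into the Datastores selection and Category Filter fields, it can help to check them against the syntax rules listed above. The following Python sketch is one way to do that. It is not part of the product: the function names (parse_datastore_ids, parse_category_filter, split_unescaped) are hypothetical, and the connector's own parsing may differ in details such as whitespace handling or how a literal backslash is treated.

```python
import re

SPECIAL = ";=|"  # delimiters that the rules above say can be escaped with a backslash


def parse_datastore_ids(text):
    """Expand a list such as "1, 2, 3-10" into a sorted list of integer IDs."""
    ids = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            ids.update(range(start, end + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


def split_unescaped(text, delimiter):
    """Split on delimiter, ignoring delimiters that follow a backslash."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in SPECIAL:
            current.append(text[i:i + 2])  # keep the escape pair for later passes
            i += 2
        elif ch == delimiter:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def parse_category_filter(text):
    """Return (category_type, {"Include": [...], "Exclude": [...]}).

    Raises ValueError for a malformed clause or for mixed category types.
    """
    category, result = None, {}
    for clause in split_unescaped(text, ";"):
        clause = clause.strip()
        if not clause:
            continue
        pieces = split_unescaped(clause, "=")
        if len(pieces) != 2:
            raise ValueError(f"expected exactly one '=' in {clause!r}")
        tag, raw_values = pieces[0].strip(), pieces[1]
        for action in ("Include", "Exclude"):
            if tag.startswith(action) and len(tag) > len(action):
                break
        else:
            raise ValueError(f"tag must be Include<Type> or Exclude<Type>: {tag!r}")
        kind = tag[len(action):]
        if category is None:
            category = kind
        elif kind != category:
            raise ValueError(f"cannot mix category types {category!r} and {kind!r}")
        values = [
            re.sub(r"\\([;=|])", r"\1", v.strip())  # drop the escaping backslashes
            for v in split_unescaped(raw_values, "|")
            if v.strip()
        ]
        result.setdefault(action, []).extend(values)
    return category, result


if __name__ == "__main__":
    print(parse_datastore_ids("1, 2, 3-10, 50-100"))
    print(parse_category_filter("IncludeCatalog=Mobiles ; ExcludeCatalog=Software"))
```

With this sketch, the valid example from the list above parses into a single Catalog filter with one included and one excluded value, while IncludeCatalog=Mobiles ; ExcludeIncidents=Inquiry raises a ValueError because it mixes category types.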