How to Configure Your Elasticsearch Search Engine

Note: Search engines in SmartHub are often referred to as "backends." Consider the terms interchangeable.

Use the following procedure to configure the Elasticsearch search engine.

Prerequisites

  • Elasticsearch v7.x-v8.x must be installed.
  • If Elasticsearch is not installed, download Elasticsearch and install it.

Limitations

Note the following limitations when using an Elasticsearch search engine in your SmartHub environment:

  • Elasticsearch does not guarantee consistent ordering for documents with identical sort values (including nulls or empty values), unless a secondary sorting criterion is specified. This means that when multiple documents share the same sort value, and you do not define a secondary sort criterion, the order of their appearance in the search results may vary between query executions, leading to inconsistencies in pagination. For more details, see Paginate search results in the Elasticsearch documentation.

    • To ensure a consistent sort order, Upland BA Insight recommends that you include another field, such as a unique identifier, as a tie-breaker. In SmartHub, this can be achieved by adding a new Query Scripting Processor tuning stage with the following code:

      Query.SortList.Add(
          SortType.SortByProperty,
          "escbase_title.keyword", // this is an example of a property that can be used
          typeof(System.String),
          SortDirection.Ascending
      );
    • Once configured, this stage must be placed before the Elastic Translator Stage in the User experience tuning section.
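
To illustrate what a tie-breaker means on the Elasticsearch side, the following minimal Python sketch appends a secondary sort field to a search request body. It uses the standard Elasticsearch `sort` syntax. The helper name `with_tie_breaker` and the sample query are assumptions made for illustration, and the sketch does not show what the Elastic Translator stage actually sends.

```python
import copy


def with_tie_breaker(body: dict, field: str, order: str = "asc") -> dict:
    """Return a copy of a search request body whose sort list ends with `field`.

    Hypothetical helper: it only illustrates the tie-breaker idea and is not
    part of SmartHub or of any Elasticsearch client library.
    """
    result = copy.deepcopy(body)
    sort = result.setdefault("sort", [])
    # Skip the append if the field is already one of the sort criteria.
    already_sorted = any(
        entry == field or (isinstance(entry, dict) and field in entry)
        for entry in sort
    )
    if not already_sorted:
        sort.append({field: {"order": order}})
    return result


# Sample request body (assumed): sort by date, which can contain ties.
body = {
    "query": {"match": {"escbase_title": "report"}},
    "sort": [{"ElasticLastUpdate": {"order": "desc"}}],
}

# Documents that share the same ElasticLastUpdate value are now ordered
# by the second criterion, so page boundaries stay stable between requests.
stable = with_tie_breaker(body, "escbase_title.keyword")
```

A field that is unique per document makes the most reliable tie-breaker, because two documents can never compare as equal on it.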

Connect SmartHub to an Elastic Cloud or Self-hosted Elasticsearch On-Premise Instance

Make sure that you have the following information and access for the search engine(s) that you are configuring:

  • Elastic Cloud URL
  • Access to Elasticsearch On-Premise

Add an Elasticsearch Search Engine

Procedure:

  1. Navigate to the SmartHub Administration page at http(s)://<web-app-url>/_admin
    1. For example: http://smarthub.azurewebsites.net/_admin.
  2. From the General Settings page, click ADD NEW SEARCH ENGINE to add your new Elasticsearch search engine.
  3. Type: Select Regular
  4. Search Engine: Select Elastic or AWS Opensearch (for AWS-hosted Search service) from the drop-down list.
  5. Enter the search engine Name and corresponding information in the appropriate fields, as shown in the following image:
  6. Rank offset formula coefficients (Optional): Enter these values only if you selected the Rank Based mixing algorithm, which is set in the Properties for the SSA page:

    • BOOST: Enter the boost factor.

    • OFFSET: Enter the rank offset.

  7. From the Authentication Mode field, select the authentication mode for your environment's Elasticsearch type from the drop-down menu. The possible authentication modes are:

    • None

    • Basic

    • Token Based

  8. Enter the credentials used to access the Elasticsearch service, based on the search engine you selected in step 4.

    • If you selected AWS Opensearch (AWS-hosted Elastic), use the following credentials:

      • AWS Access Key: Enter the access key for your AWS-hosted Elastic instance.

      • AWS Secret Key: Enter the secret key for your AWS-hosted Elastic instance.

      • Use Temporary Credentials: Use the AWS service to obtain temporary credentials.

        • Expiration time: Set the expiration time of the temporary credentials. The default value is 15 minutes.

      • Authenticate via AWS Profiles:
        • Store profiles in a shared AWS credentials file and use them to access AWS Elastic search engine content.
        • Note: The file must be saved with an .ini extension:
          • Example: credentials.ini
        • AWS Profile name: 
          • Enter AWS profile name.
          • Example: basic_profile
        • AWS Credentials file location:
          • Enter the location of the AWS credentials file.
          • Example: C:\Users\sdkuser\customCredentialsFile.ini
    • If you selected Elastic, use the following credentials:

      • Basic

        • Account: Enter the account name for your Elastic service.

        • Password: Enter the password for your Elastic service account.

      • Token Based
        • For configuration details, see the Token-based Authentication instructions.
      Example AWS credentials file (used with Authenticate via AWS Profiles):
      [{profile_name}]
      aws_access_key_id = {accessKey}
      aws_secret_access_key = {secretKey}
  9. Modify the search engine configuration by entering your configuration settings into the Parameters field.
    See the following example code:

    Search engine configuration example

    <configuration>       
        <settings>             
            <setting Name='ElasticServerAddress' Value='http://localhost:9200' />             
            <setting Name='Indices' Value='index1, index2' />             
            <setting Name='Timeout' Value='30' />             
            <setting Name='SourceMappings' Value='00000000-0000-0000-0000-000000000000#index1,index2;' />     
        </settings>
    </configuration>
    1. Use the following table to specify these settings:

    | Parameter | Description | Default Value |
    |---|---|---|
    | ElasticServerAddress | Required. URL of your Elastic Cloud service instance. Obtain this URL from your Elastic Cloud provider. | http://localhost:9200 |
    | Indices | Required. One or more comma-separated (,) indices to be used for search. | index1,index2 |
    | Timeout | Optional. Timeout of the search, in seconds. If the search takes longer than this value (30 seconds by default), the search is canceled. | 30 |
    | SourceMappings | Optional. A mapping between SearchQuery.SourceId and Elasticsearch indices. A search query with a matching Source ID is executed against the specified indices. Use this option for performance tuning when you have multiple search engines. | 00000000-0000-0000-0000-000000000000#index1,index2 |
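
The Parameters value is plain XML, so it can be checked outside SmartHub before you paste it in. The following Python sketch reads the example configuration and splits the Indices and SourceMappings values. The helper names are hypothetical, and the SourceMappings grammar (entries separated by `;`, source ID and indices separated by `#`) is inferred from the example value rather than taken from a formal specification.

```python
import xml.etree.ElementTree as ET

# The example search engine configuration from this page.
PARAMETERS = """
<configuration>
    <settings>
        <setting Name='ElasticServerAddress' Value='http://localhost:9200' />
        <setting Name='Indices' Value='index1, index2' />
        <setting Name='Timeout' Value='30' />
        <setting Name='SourceMappings' Value='00000000-0000-0000-0000-000000000000#index1,index2;' />
    </settings>
</configuration>
"""


def read_settings(xml_text: str) -> dict:
    """Collect every <setting> element into a Name -> Value dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {s.get("Name"): s.get("Value") for s in root.iter("setting")}


def split_list(value: str) -> list:
    """Split a comma-separated value and drop surrounding whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_source_mappings(value: str) -> dict:
    """Parse 'sourceId#index1,index2;' entries (assumed format) into a dict."""
    mappings = {}
    for entry in value.split(";"):
        if not entry.strip():
            continue  # ignore the empty piece after a trailing ';'
        source_id, _, indices = entry.partition("#")
        mappings[source_id.strip()] = split_list(indices)
    return mappings


settings = read_settings(PARAMETERS)
indices = split_list(settings["Indices"])
source_mappings = parse_source_mappings(settings["SourceMappings"])
```

A parse error from `ET.fromstring` at this point usually indicates an unclosed tag or a mismatched quote in the pasted configuration.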

Tuning Stages

After creating the search engine, two "Elastic translator" stages are automatically added to the Tuning categories:

  • Query Tuning stage

  • Result Tuning stage

These are search engine-specific Tuning stages and do not apply to any other search engines you have configured.

Perform the following steps for both the Query Tuning and Result Tuning Elastic Translator stages:

  1. Click the stage name.
  2. Modify the Stage Configuration by entering your configuration settings into the Parameters field.
    See the following example code.

    Stage Configuration Example

    <configuration>         
        <settings>               
            <setting Name='Indices' Value='index1,index2' />               
            <setting Name='RefinablePropertiesSuffix' Value='' />            
            <setting Name='Timeout' Value='30' />               
            <setting Name='EnableFuzziness' Value='true' />               
            <setting Name='TextQueryType' Value='best_fields' />               
            <setting Name='DateProperties' Value='ElasticLastUpdate' />               
            <setting Name='NumericProperties' Value='FileSize' />               
            <setting Name='SourceMappings' Value='00000000-0000-0000-0000-000000000000#index1,index2;' />               
            <setting Name='FieldBoost' Value='' />               
            <setting Name='ShowAccurateResultCount' Value='false' />                
            <setting Name='BodyField' Value='FileContent' />           
        </settings>
    </configuration>
  3. Use the following table to specify this code:

| Parameter | Optional/Required | Description | Default Value |
|---|---|---|---|
| Indices | Required | One or more comma-separated (,) indices to be used for search. | index1,index2 |
| RefinablePropertiesSuffix | Optional | Suffix to append to the Elasticsearch field name when building an aggregation on top of it. | |
| Timeout | | Timeout of the search, in seconds. If the search takes longer than this value (30 seconds by default), the search is canceled. | 30 |
| EnableFuzziness | | Specify whether fuzzy matching is used. The fuzzy query uses similarity based on Levenshtein edit distance. | false |
| TextQueryType | | The type of multi_match query to use. The multi_match query builds on the match query to enable multi-field queries. | best_fields |
| DateProperties | | The date properties to be used as refiners. | ElasticLastUpdate |
| NumericProperties | | The numeric properties to be used as refiners. | FileSize |
| SourceMappings | | A mapping between SearchQuery.SourceId and Elasticsearch indices. A search query with a matching Source ID is executed against the specified indices. Use this option for performance tuning when you have multiple search engines. | 00000000-0000-0000-0000-000000000000#index1,index2 |
| FieldBoost | | Use the boost operator ^ to make one term more relevant than another. The default boost value is 1, but it can be any positive floating point number. Boosts between 0 and 1 reduce relevance. Example: escbase_author^3,escbase_fileextention^0.3. Note: Take the field names from Elasticsearch, not from the property mapper. | |
| ShowAccurateResultCount | | Specify whether the Elasticsearch maximum count of 10000 results or the accurate count is used. Enabling this parameter might affect performance. | false |
| BodyField | Optional | Specify the field that corresponds to the body field. If the Elasticsearch index was created (or updated) with Connectivity Hub v2.x, set this value to FileContent. If the index was created (or updated) with Connectivity Hub v3.x, set this value to escbase_fulltextcontent. Note: If the setting is missing, the default value is used. | FileContent |
| AdditionalUrlQueryParameters | | Extra URL parameters to send with the Elasticsearch search request, written as name,value. Example: search_type,dfs_query_then_fetch. To send multiple parameters, separate them with a semicolon (;). Example: param1,value1;param2,value2; | |
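
The FieldBoost and AdditionalUrlQueryParameters values follow small delimiter-based formats, which the following Python sketch decodes using the examples from the table. Both helper functions are hypothetical and exist only to make the formats concrete. In particular, treating a field without `^` as boost 1 is an assumption based on the stated default, and how SmartHub parses these values internally is not documented here.

```python
from urllib.parse import urlencode


def parse_field_boost(value: str) -> dict:
    """Parse 'field^boost,field^boost' into a field -> boost dictionary."""
    boosts = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        field, separator, boost = item.partition("^")
        # Assumption: a field listed without '^' keeps the default boost of 1.
        boosts[field] = float(boost) if separator else 1.0
    return boosts


def parse_url_parameters(value: str) -> dict:
    """Parse 'name,value;name,value;' into a name -> value dictionary."""
    params = {}
    for pair in value.split(";"):
        if not pair.strip():
            continue  # ignore the empty piece after a trailing ';'
        name, _, param_value = pair.partition(",")
        params[name.strip()] = param_value.strip()
    return params


boosts = parse_field_boost("escbase_author^3,escbase_fileextention^0.3")
single = urlencode(parse_url_parameters("search_type,dfs_query_then_fetch"))
multiple = urlencode(parse_url_parameters("param1,value1;param2,value2;"))
```

Here `single` and `multiple` are the query strings that would follow the `?` in a search request URL, which is a convenient way to confirm that a multi-parameter value is separated correctly.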

 

Logs

  • By default, logs appear in the directory: <SmartHub_Install_Directory>\Logging.

Operator NEAR:
  • The Elasticsearch search engine supports the operator NEAR.

  • Supported syntax:

  1. (A OR B) NEAR(5) C => "A C"~5 OR "B C"~5
  2. (A OR B) NEAR(5) (C OR D) => "A C"~5 OR "B C"~5 OR "A D"~5 OR "B D"~5
  3. (A OR "B C") NEAR(5) D => "A D"~5 OR ("B C D"~5 AND "B C")
  4. (A AND B AND C) NEAR(5) (D AND E) => "A B C D E"~5
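
The four rewrites above follow a regular pattern, which the following Python sketch reproduces for already-parsed operands. It is an illustration of the listed rules only, not SmartHub's implementation: the function names are hypothetical, the operands are assumed to be supplied as lists of terms, and no query parsing is attempted.

```python
def _proximity_clause(left: str, right: str, distance: int) -> str:
    """Build the proximity clause for one pair of operands."""
    clause = f'"{left} {right}"~{distance}'
    # An operand that is itself a phrase (contains a space) must also match
    # as an exact phrase, as in rewrite 3.
    phrases = [term for term in (left, right) if " " in term]
    if not phrases:
        return clause
    exact = " AND ".join(f'"{phrase}"' for phrase in phrases)
    return f"({clause} AND {exact})"


def expand_or_near(lefts: list, rights: list, distance: int) -> str:
    """Expand (l1 OR l2 ...) NEAR(n) (r1 OR r2 ...) into proximity clauses."""
    # The right-hand operand varies slowest, matching the order in rewrite 2.
    clauses = [
        _proximity_clause(left, right, distance)
        for right in rights
        for left in lefts
    ]
    return " OR ".join(clauses)


def expand_and_near(lefts: list, rights: list, distance: int) -> str:
    """Expand (l1 AND l2 ...) NEAR(n) (r1 AND r2 ...) into a single clause."""
    terms = " ".join([*lefts, *rights])
    return f'"{terms}"~{distance}'


rewrite_1 = expand_or_near(["A", "B"], ["C"], 5)
rewrite_2 = expand_or_near(["A", "B"], ["C", "D"], 5)
rewrite_3 = expand_or_near(["A", "B C"], ["D"], 5)
rewrite_4 = expand_and_near(["A", "B", "C"], ["D", "E"], 5)
```

Each OR on either side multiplies the number of proximity clauses, so a query with many alternatives on both sides of NEAR expands into a correspondingly long Elasticsearch query.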