Coordinate Your Search Engines to Work Together

 

Access and Specify Settings for Multiple Search Engines

  1. Navigate to the SmartHub Admin page at: http(s)://[web-app-url]/_admin.
    1. For example: http://localhost:1234/_admin.
  2. Go to the Federator Properties page.
  3. Left panel in the UI:
    • Federator Settings: Access the general SmartHub settings.
    • Security Settings: Secure your SmartHub.
    • User Profile Settings: Configure your user profile/picture providers.
    • Additional Settings: Configure extra settings such as the cache expiration time.
    • Extensibility: Manage your backends and pipeline stages.
  4. General Settings:
  5. Security Settings:
    • Authentication Mode: Specify the authentication type.

      Note: Every authentication mode comes with its own UI.
    • Admin Users: Specify users that are allowed to access the admin page.
    • Trusted App Redirect URLs: Specify the apps that are allowed to communicate with SmartHub.
  6. User Profile Settings:
    • User Profile Providers: Configure providers that fetch the user properties.
    • User Picture Providers: Configure providers that fetch the user pictures.
  7. Additional Settings:

Specify Results Ordering

Normalize Your Results

  • Normalize your results, or prioritize content sources, by specifying a mixing algorithm.

  • Mixing algorithm means the order in which the search results are displayed on the results page.

  • If you set the mixing algorithm to Rank-Based in the Administrator user interface, the SmartHub search results are shown in descending order of rank.

  • While ranking scores apply only within a single search engine, the mixing algorithm normalizes results across your search engines.

    • For example, a 1,000 point score from search engine A might not be equivalent to a 1,000 point score from search engine B.
    • The equivalent score on engine B might be 500, or 5,000.
    • The rank-based mixing algorithm ensures that when the results are merged, end users see their results sorted by decreasing relevancy (or your specified relevancy score).
  • An alternative to applying the mixing algorithm to search engines is to rank one content source over another.

    • For example, a regional farm using SmartHub might want to rank its local content over its corporate content.

    • Conversely, a global deployment might choose to rank headquarters content higher than its regional content.

  • To perform a ranking operation, modify the mixing algorithm values.

Specify the Mixing Algorithm

  1. Go to the SmartHub Additional Configuration page.



  2. Mixing Algorithm: Click and the SmartHub pop-up window appears.
    This window enables you to choose a mixing method:

    • Rank-Based: Orders the search results based on the ranking from the backend search engine, as well as the boost and offset values that you can assign.

      The SmartHub search results are shown in descending order of rank, calculated as:
         Rank = (RankS * Boost) + Offset

      Where:

      • RankS: Rank that is determined by the backend search engine.

      • Boost and Offset: Values that can be set for each backend.

        • For example, if you want to add prominence to results from Backend1, you could set the Boost value to 2.

          • In this case, if a search result on the main backend has a rank of 100, the same search result on Backend1 would have a rank of 200.

        • Using the same example, to give additional prominence to Backend1 through an offset instead, you could leave the boost values set to 1 and set the offset value for the additional backend to 50.

          • In this case, the same search results on the main backend and Backend1 would have values of 100 and 150, respectively.

        • Experiment to determine the optimal boost and offset values for your data set in order to get the required results.

    • Round Robin:
      • Orders the search results by taking the first result from the first location, the first result from the second location (if there is one) and the first result from the third location (if there is one).
      • The process repeats starting with the second result from the first location until all of the results from all of the locations are mixed together.
    • Pseudo Random:

      • Orders the search results pseudo-randomly by doing a light shuffle of the round-robin results, which appear random to the end user.

      • The pseudo-random algorithm mixes the search results using the Fisher-Yates shuffle algorithm, and rearranges these results so that the more relevant results for each backend are shown before the less relevant results.

      • Whenever possible, the algorithm ensures that the same, or a similar, number of results appears on each page for each backend.

    • Weighted Round Robin:
      • Orders the search results based on the designated backend weight.
      • See more details in the "Mixing Results from Multiple Backends using the Weighted Round Robin algorithm" section below.
    • Scriptable:

      • Enables you to write a custom script to decide how the results should be mixed between the backends.

        See the Scriptable Mixing Capability section below for more details.

      Tip: Use Weighted Round Robin, Round Robin and Pseudo Random mixing operations when one of your backends does not return scores.
      - Pseudo-random mixing is based on the Fisher-Yates shuffle algorithm and gives the impression of returned scores even when no scores are returned.
      - When possible, this algorithm ensures that each page displays a similar number of search results from each backend.

  3. Click OK.
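
The Rank-Based formula and the Round Robin ordering described in step 2 can be sketched in a few lines of C#. This is only an illustration of the documented behavior, not SmartHub's implementation: the Hit record, the MixingSketch class, and the settings dictionary are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a search result; not the SmartHub SearchResult class.
public record Hit(string Backend, string Title, double Rank);

public static class MixingSketch
{
    // Rank = (RankS * Boost) + Offset, with Boost and Offset set per backend.
    public static double AdjustedRank(double backendRank, double boost, double offset)
        => (backendRank * boost) + offset;

    // Rank-based mixing: sort all hits by adjusted rank, highest first.
    public static List<Hit> RankBased(
        IEnumerable<Hit> hits,
        Dictionary<string, (double Boost, double Offset)> settings)
    {
        // Assumption: a backend without settings keeps its rank (boost 1, offset 0).
        double Score(Hit h) =>
            settings.TryGetValue(h.Backend, out var s)
                ? AdjustedRank(h.Rank, s.Boost, s.Offset)
                : h.Rank;

        return hits.OrderByDescending(h => Score(h)).ToList();
    }

    // Round robin: the first result of each backend, then the second of each,
    // and so on until every list is exhausted.
    public static List<Hit> RoundRobin(List<List<Hit>> perBackend)
    {
        var mixed = new List<Hit>();
        int longest = perBackend.Count == 0 ? 0 : perBackend.Max(list => list.Count);
        for (int i = 0; i < longest; i++)
        {
            foreach (var list in perBackend)
            {
                if (i < list.Count)
                {
                    mixed.Add(list[i]);
                }
            }
        }
        return mixed;
    }
}
```

With a boost of 2 and an offset of 0, AdjustedRank turns a backend rank of 100 into 200. With a boost of 1 and an offset of 50, it gives 150, which matches the two examples in step 2.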

Mixing Results from Multiple Backends using the Weighted Round Robin algorithm

  • This mixing method is useful when you want to mix results returned from different sources which don't provide relevancy scores (document Rank metadata) or where the scores are not consistent across the board.

  • This mixing algorithm enables you to assign a specific importance (weight) to each backend.

How to Enable the Weighted Round Robin Mixing Method

  1. Go to SmartHub Admin > Additional Settings > Mixing Algorithm.
  2. Select "Weighted Round Robin" from the Mixing algorithm dropdown.
  3. Provide the backend names and weights for each of them in the following format: BackendName1,backend1Weight;BackendName2,backend2Weight;
    • See more details in the "How to Configure the Weighted Round Robin algorithm" section below.

How to Configure the Weighted Round Robin algorithm

  • Backend Weights: Text field that accepts the backend names and the weights associated with them.
    • Example: SharePointOnline,7;NetDocs,3;
    • The weights must be integer values greater than or equal to 1.
    • Note: If the SmartHub Search Engine returns documents from backends other than the ones specified in the Backend Weights field, then those results will be ignored. A "Warning" message will appear in the logs informing the user about this.
    • Hint: For intuitive configuration and use, the sum of all weights should be 10, that is, the number of results for a regular page.
      • With the example above, this ensures that on any results page the user sees 7 documents from the SharePoint Online backend and 3 documents from the NetDocuments backend (if there are enough results available).
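
To check a Backend Weights value before pasting it into the Admin page, a small parser such as the following can help. It is a sketch under stated assumptions: BackendWeightsParser is a hypothetical helper, not part of SmartHub, and treating backend names as case-insensitive is an assumption made here.

```csharp
using System;
using System.Collections.Generic;

public static class BackendWeightsParser
{
    // Parses "BackendName1,weight1;BackendName2,weight2;" into a name -> weight map.
    public static Dictionary<string, int> Parse(string setting)
    {
        // Assumption: backend names are compared without regard to case.
        var weights = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        foreach (var entry in setting.Split(';', StringSplitOptions.RemoveEmptyEntries))
        {
            var parts = entry.Split(',');
            if (parts.Length != 2)
            {
                throw new FormatException($"Expected 'Name,Weight' but found '{entry}'.");
            }

            var name = parts[0].Trim();
            // Weights must be integer values greater than or equal to 1.
            if (name.Length == 0 || !int.TryParse(parts[1].Trim(), out int weight) || weight < 1)
            {
                throw new FormatException($"Invalid entry '{entry}': weights must be integers >= 1.");
            }

            weights[name] = weight;
        }

        return weights;
    }
}
```

For the example value SharePointOnline,7;NetDocs,3; the parser returns two entries whose weights add up to 10, in line with the hint above.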

How it Works

Let's assume the following configuration:

  • Assume you have 3 Backends configured in your SmartHub Admin:
    • SharePoint Online, Azure, and NetDocs
    • The Backend Weights setting is: 
      • SharePointOnline,7;Azure,2;NetDocs,1;
  • Assume your query returns 10 results from SharePoint Online, 5 results from Azure and 1 from NetDocs, for a total of 16 results
  • Assume your results page shows a maximum of 10 results, so you'll receive 2 pages of results

The algorithm splits the page into 3 "zones" for results. The number of zones is based on the formula: NumberOfZones = RowsPerPage / NumberOfBackends. In this example, 10 / 3 gives 3 zones, with the result rounded down.

Each zone shows an evenly distributed number of results from each backend.
Page 1 of the results looks like this:

  • Zone 1 shows:
    • 3 documents from SharePoint Online
    • 1 document from Azure
    • 1 document from NetDocuments
  • Zone 2 shows:
    • 2 documents from SharePoint Online
    • 1 document from Azure
    • 0 documents from NetDocuments
  • Zone 3 shows:
    • 2 documents from SharePoint Online
    • 0 documents from Azure
    • 0 documents from NetDocuments
Note that within each zone, the results are also ordered by descending backend weight (the backend with the highest weight has its results at the top of the zone).
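
The zone counts in this example are consistent with splitting each backend's weight across the zones as evenly as possible and giving the remainder to the earlier zones. The sketch below reproduces the numbers above under that assumption. It is inferred from the example rather than taken from the SmartHub source, and ZoneSketch is a hypothetical name.

```csharp
using System;

public static class ZoneSketch
{
    // NumberOfZones = RowsPerPage / NumberOfBackends, using integer division.
    public static int NumberOfZones(int rowsPerPage, int numberOfBackends)
        => rowsPerPage / numberOfBackends;

    // Splits one backend's weight across the zones as evenly as possible.
    // Earlier zones receive the remainder, so a weight of 7 over 3 zones gives 3, 2, 2.
    public static int[] SlotsPerZone(int weight, int zones)
    {
        var slots = new int[zones];
        for (int zone = 0; zone < zones; zone++)
        {
            slots[zone] = weight / zones + (zone < weight % zones ? 1 : 0);
        }
        return slots;
    }
}
```

With 3 zones, the weights 7, 2, and 1 yield 3/2/2, 1/1/0, and 1/0/0 slots per zone, which matches the page 1 layout listed above.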

How to use the Weighted Round Robin algorithm in a Scriptable Mixing Stage

  • In your custom mixing stage, you can use the algorithm by calling the WeightedRoundRobinMixingAlgorithm.MixResults function.

  • The function has the following definition:

    List<SearchResult> MixResults(Dictionary<string, Queue<SearchResult>> backendResultsQueues, SearchQuery query, Dictionary<string, int> backendWeights, int numberOfSections)

Parameters:

  • backendResultsQueues
    • Dictionary where the key is the backend name and the value is a queue constructed from the SearchResults.RelevantResults list
  • query
    • This is the user query object
  • backendWeights
    • Dictionary where the key is the backend name and the value is the associated weight
  • numberOfSections
    • This value defines the number of zones to be used for a page
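
Building the backendResultsQueues argument might look like the sketch below. The SearchResult and SearchResults classes here are minimal hypothetical stand-ins so that the example compiles on its own. In a real mixing stage these types come from SmartHub, and only BackendName and RelevantResults are taken from this page. QueueBuilder and the Title property are illustrative names.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins; the real SmartHub types have more members.
public class SearchResult
{
    public string Title { get; set; } = "";
}

public class SearchResults
{
    public string BackendName { get; set; } = "";
    public List<SearchResult> RelevantResults { get; set; } = new List<SearchResult>();
}

public static class QueueBuilder
{
    // Builds the backendResultsQueues argument: backend name -> queue of its results.
    public static Dictionary<string, Queue<SearchResult>> Build(
        IEnumerable<SearchResults> perBackendResults)
    {
        var queues = new Dictionary<string, Queue<SearchResult>>();
        foreach (var backendResults in perBackendResults)
        {
            // The queue keeps the order in which the backend returned its results.
            queues[backendResults.BackendName] =
                new Queue<SearchResult>(backendResults.RelevantResults);
        }
        return queues;
    }
}

// Inside a scriptable mixing stage, the call might then look like this
// (weights and zone count taken from the "How it Works" example; untested):
//   var weights = new Dictionary<string, int>
//       { ["SharePointOnline"] = 7, ["Azure"] = 2, ["NetDocs"] = 1 };
//   return WeightedRoundRobinMixingAlgorithm.MixResults(
//       QueueBuilder.Build(PerBackendResults), Query, weights, 3);
```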

Mixing and Pagination

Mixing is applied to the results from all of the search backends before pagination is applied.

  • For example, if you request 10 results per page, the SmartHub engine requests 10 results from each of the configured backends.

  • The results are mixed and sorted before all of the results are returned.

  • Only the top 10 results are returned with pagination.

  • The pipeline extensibility stages are applied to the entire mixed and sorted set of results before pagination is applied.

  • These stages let you implement custom mixing algorithms that override SmartHub's built-in mixing algorithms.
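
The order of operations described above (fetch one page from each backend, mix, then cut to one page) can be sketched as follows. This is an illustration only: Scored and PaginationSketch are hypothetical names, a simple sort by rank stands in for the mixing step, and each backend list is assumed to arrive already sorted by relevance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a ranked result.
public record Scored(string Backend, double Rank);

public static class PaginationSketch
{
    public static List<Scored> FirstPage(IEnumerable<IEnumerable<Scored>> perBackend, int rowLimit)
    {
        return perBackend
            // Each backend contributes at most one page of results.
            .SelectMany(results => results.Take(rowLimit))
            // Mixing (here a plain sort by rank) runs on the combined set.
            .OrderByDescending(r => r.Rank)
            // Pagination is applied last: only one page is returned.
            .Take(rowLimit)
            .ToList();
    }
}
```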

Specify the Query Response Time

  1. Go to the SmartHub Additional Configuration page.
  2. Query Timeout: Click and the SmartHub pop-up window appears.


  3. Query timeout: The query, or backend, timeout represents the time span (in milliseconds) that is allocated to each backend for query response.

If a backend is not capable of returning results in the specified time span, an error is displayed on the search result page:

See the warning.

Customize Your Search Error Display

  1. Go to the SmartHub Additional Configuration page.
  2. Error Handling: Click and the SmartHub pop-up window appears.

    See the Search errors section of the UI.
  3. Display mode:

    • Show first: Search errors appear at the top of the search results. Enabled by default.
    • Show last: Search errors appear at the bottom of the search results.
    • Don't show: Search errors are not displayed.
  4. Error icon: Select the .png icon that appears for errors.

    After you make a selection, the related Error Icon or Warning Icon changes.

  5. Warning icon: Select the .png icon that appears for warnings.
  6. Error Title Template: Choose one/both:
    • error level: %level%
    • error message: %message%
  7. Error Description Template: %description%: the error details
  8. OK: Click OK.
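
As an illustration of how the placeholders combine, the sketch below expands a template such as %level%: %message% with plain string replacement. How SmartHub performs the substitution internally is not documented here, so treat this only as a way to preview a template. ErrorTemplateSketch is a hypothetical helper.

```csharp
using System;

public static class ErrorTemplateSketch
{
    // Replaces the documented placeholders with the values of one error.
    public static string Expand(string template, string level, string message, string description)
    {
        return template
            .Replace("%level%", level)
            .Replace("%message%", message)
            .Replace("%description%", description);
    }
}
```

For example, the title template %level%: %message% with the level Warning and the message Backend1 timed out expands to "Warning: Backend1 timed out".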

Specify the Query Syntax

  • The backend list text (FederatorBackends:"...") can be added at the beginning or the end of the query.
    • If you add the text in more than one location, the query parses only one and considers the second location to be part of the query term.
  • Quotes (" ") are mandatory. If the quotes are not specified for the backend list, the backend list is ignored, and the query is passed as-is to the main backend only.
  • The backend list can also be specified in the query text box.
    • Simple cases are supported, but the full KQL syntax is not supported.
  • If the backend list is not specified, or if this list is empty, the SmartHub acts as a pass through and queries only the main backend.
  • If you specify a query against multiple backends, each query must be separated by a semi-colon (;).
  • Backend names are case-insensitive and must exist in the Total additional backends in order to be queried against.
    • A warning is issued for backends that are specified in this list, but are not Registered backends.
  • If a backend is specified more than once (either explicitly or as a result of * expansion), this backend is queried only once.
  • You can use the asterisk character (*) to specify starts-with behavior, as shown in the last three examples below:
    • FederatorBackends:"backend1,backend2": Queries against backends named backend1 and backend2
    • FederatorBackends:"*": Queries all the backends
    • FederatorBackends:"FirstBackend; s*": Queries the first backend and all backends that start with s
    • FederatorBackends:"back*": Queries all the backends that start with the word "back"
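
The matching rules above (case-insensitive names, starts-with wildcards, each backend queried only once, a warning for unregistered names) can be sketched as follows. This is not SmartHub's parser. BackendListSketch is a hypothetical helper, and accepting both commas and semicolons as separators is an assumption based on the examples, which use both.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BackendListSketch
{
    // Resolves the quoted backend list against the registered backends.
    // Returns the backends to query; unregistered names are added to 'warnings'.
    public static List<string> Resolve(
        string backendList, IReadOnlyList<string> registered, List<string> warnings)
    {
        var selected = new List<string>();
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        // Assumption: both ',' and ';' separate entries.
        var entries = backendList.Split(new[] { ',', ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var raw in entries)
        {
            var pattern = raw.Trim();
            if (pattern.Length == 0)
            {
                continue;
            }

            bool isWildcard = pattern.EndsWith("*", StringComparison.Ordinal);
            string prefix = isWildcard ? pattern.Substring(0, pattern.Length - 1) : pattern;

            var matches = registered.Where(name => isWildcard
                ? name.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
                : name.Equals(prefix, StringComparison.OrdinalIgnoreCase));

            bool any = false;
            foreach (var name in matches)
            {
                any = true;
                // A backend named twice, or matched by two patterns, is queried once.
                if (seen.Add(name))
                {
                    selected.Add(name);
                }
            }

            if (!any && !isWildcard)
            {
                warnings.Add($"Backend '{pattern}' is not registered.");
            }
        }

        return selected;
    }
}
```

With the registered backends FirstBackend, sharepoint, Sql, and backend1, the list "FirstBackend; s*" resolves to FirstBackend, sharepoint, and Sql, while "backend1,BACKEND1,back*" resolves to backend1 alone.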

Scriptable Mixing Capability

The mixing script is called at query time after the results are returned from all the backends.

The script has access to:

  • PerBackendResults: List<SearchResults> 
    • The list of results returned by each backend
    • To know which backend a SearchResults object belongs to you can check SearchResults.BackendName
    • See the SearchResults class to find all the available properties
  • Query: SearchQuery
    • The SearchQuery object that was executed for the current search
    • You can use this object to read information about the search - see SearchQuery class to find all the properties available
  • MixingError: FederatorError
    • This allows you to return an error to the engine so that it knows something went wrong at mixing time
    • To set an error:

      Sample Mixing error

      MixingError = new FederatorError(FederatorErrorLevel.Error, "Something went wrong");

    • The available Error levels are:

      • Error

      • Info

      • Warning

The script is expected to return:

  • List<SearchResult> that contains a number of results less than or equal to Query.RowLimit (10 by default), which represents the list of results that should be displayed for the current search page.
    • If you return fewer than Query.RowLimit results, the paging mechanism does not consider additional results to be returned after the current page.

Sample script that returns only the 1st result of each backend:

Sample mixing script
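// Collect the first (top) result from each backend.
var results = new List<SearchResult>();

foreach (var backendResults in PerBackendResults)
{
    // Skip backends that returned nothing for this query.
    if (backendResults.Count > 0)
    {
        results.Add(backendResults[0]);
    }
}

// The returned list is what the current results page displays.
return results;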