Configure the mixing algorithm
Specify the Mixing Algorithm
- In the SmartHub administration portal, click General Settings.
- Click the Mixing Multiple Search Engines link.
- The Mixing Multiple Search Engines dialogue appears.
- Mixing Algorithm: This window allows you to choose a mixing method:
Rank-Based:
Orders the search results based on the ranking from the search engine, as well as the boost and offset values that you can assign.
The SmartHub search results are shown in descending order of rank, calculated as:
Rank = (RankS * Boost) + Offset
Where:
- RankS: Rank that is determined by the search engine.
- Boost and Offset: Values that can be set for each search engine.
For example, if you want to add prominence to results from Backend1, you could set the Boost value to 2.
In this case, if a search result on the main search engine has a rank of 100, the same search result on Backend1 would have a rank of 200.
Using the same example, to raise Backend1 results by a fixed amount instead, you could leave both Boost values set to 1 and set the Offset value for the additional search engine to 50.
In this case, the same search result on the main search engine and Backend1 would have ranks of 100 and 150, respectively.
Experiment to determine the optimal boost and offset values for your data set in order to get the required results.
- Round Robin:
- Orders the search results by taking the first result from the first location, the first result from the second location (if there is one) and the first result from the third location (if there is one).
- The process repeats starting with the second result from the first location until all of the results from all of the locations are mixed together.
- Weighted Round Robin:
- Orders the search results based on the designated search engine weight.
- See the "Mixing Results from Multiple Search Engines using the Weighted Round Robin algorithm" section below for more details.
- Support for pseudo-random mixing has been removed; use Weighted Round Robin instead.
Scriptable:
Enables you to write a custom script to decide how the results should be mixed between the search engines.
See the Scriptable Mixing Capability section below for more details.
Tip: Use the Weighted Round Robin or Round Robin mixing method when one of your search engines does not return scores. When possible, Round Robin ensures that each page displays a similar number of search results from each search engine, and Weighted Round Robin keeps the proportions set by the weights.
- Click OK.
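The Rank-Based formula and the Round Robin order described above can be illustrated with a short Python sketch. This is not SmartHub code: the function names, the tuple layout of a result, and the default Boost of 1 and Offset of 0 are assumptions made for the illustration.

```python
from itertools import zip_longest


def rank_based_mix(results_by_engine, settings):
    """Sort all results by Rank = (RankS * Boost) + Offset, highest first."""
    scored = []
    for engine, results in results_by_engine.items():
        # Assumed defaults for engines without explicit settings.
        boost, offset = settings.get(engine, (1, 0))
        for title, rank_s in results:
            scored.append((rank_s * boost + offset, engine, title))
    return sorted(scored, key=lambda item: item[0], reverse=True)


def round_robin_mix(result_lists):
    """Take the first result from each location, then the second, and so on."""
    missing = object()  # marks locations that have run out of results
    mixed = []
    for row in zip_longest(*result_lists, fillvalue=missing):
        mixed.extend(item for item in row if item is not missing)
    return mixed


# The same document has a search engine rank of 100 on both engines.
results = {"Main": [("doc-a", 100)], "Backend1": [("doc-a", 100)]}
boosted = rank_based_mix(results, {"Main": (1, 0), "Backend1": (2, 0)})
shifted = rank_based_mix(results, {"Main": (1, 0), "Backend1": (1, 50)})
interleaved = round_robin_mix([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]])
print(boosted)
print(shifted)
print(interleaved)
```

With a Boost of 2 the Backend1 copy scores 200, and with an Offset of 50 it scores 150, matching the two examples in the Rank-Based description.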
Mixing Results from Multiple Search Engines using the Weighted Round Robin algorithm
- This mixing method is useful when you want to mix results returned from different sources that don't provide relevancy scores (document Rank metadata) or whose scores are not consistent across search engines.
- This mixing algorithm enables you to assign a specific importance (weight) to each search engine.
How to Enable the Weighted Round Robin Mixing Method
- Go to SmartHub Admin > Additional Settings > Mixing Algorithm
- Select "Weighted Round Robin" from the Mixing algorithm dropdown.
- Provide the search engine names and their weights in the following format: BackendName1,backend1Weight;BackendName2,backend2Weight;
- See the "How to Configure the Weighted Round Robin Algorithm" section below for more details.
How to Configure the Weighted Round Robin Algorithm
- Backend Weights: Text field that accepts the search engine names and the weights associated with them.
- Example: SharePointOnline,7;NetDocs,3;
- The weights must be integer values greater than or equal to 1.
- Note: If the SmartHub Search Engine returns documents from search engines other than the ones specified in the Backend Weights field, those results are ignored. A "Warning" message appears in the logs informing the user about this.
- Hint: For intuitive configuration and use, the sum of all weights should be 10, that is, the number of results on a regular page.
- With the example above, this ensures that on any results page the user sees 7 documents from SharePoint Online and 3 documents from the NetDocuments search engine (if there are enough results available).
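To check a Backend Weights value before pasting it into the field, the format and the weight rule above can be expressed as a small Python sketch. The function name and the error messages are hypothetical, and SmartHub's own parsing may differ, for example in how it treats whitespace.

```python
def parse_backend_weights(setting):
    """Parse 'Name1,weight1;Name2,weight2;' into a {name: weight} dict."""
    weights = {}
    for entry in setting.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # the trailing ';' produces an empty entry
        name, _, raw_weight = entry.rpartition(",")
        name = name.strip()
        if not name:
            raise ValueError(f"expected 'Name,weight' but got {entry!r}")
        try:
            weight = int(raw_weight)
        except ValueError:
            raise ValueError(f"weight for {name!r} is not an integer") from None
        if weight < 1:
            raise ValueError(f"weight for {name!r} must be 1 or greater")
        weights[name] = weight
    return weights


weights = parse_backend_weights("SharePointOnline,7;NetDocs,3;")
print(weights)
# Following the hint above, the weights ideally add up to the page size.
print(sum(weights.values()) == 10)
```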
How it Works
Assume the following configuration:
- You have 3 search engines configured in your SmartHub Admin:
- SharePoint Online, Azure, and NetDocs
- The Backend Weights setting is:
- SharePointOnline,7;Azure,2;NetDocs,1;
- Your query returns 10 results from SharePoint Online, 5 results from Azure, and 1 from NetDocs, for a total of 16 results.
- Your results page shows a maximum of 10 results, so you receive 2 pages of results.
The algorithm splits the page into 3 "zones" for results.
- The number of zones is based on the formula: NumberOfZones = RowsPerPage / NumberOfBackends. Here, 10 / 3 gives 3 zones as a whole number.
- The results from each search engine are distributed as evenly as possible across the zones.
- Page 1 of the results looks like this:
- Zone 1 shows:
- 3 documents from SharePoint Online
- 1 document from Azure
- 1 document from NetDocuments
- Zone 2 shows:
- 2 documents from SharePoint Online
- 1 document from Azure
- 0 documents from NetDocuments
- Zone 3 shows:
- 2 documents from SharePoint Online
- 0 documents from Azure
- 0 documents from NetDocuments
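The zone counts for Page 1 can be reproduced with a short Python sketch. The splitting rule used here (each weight divided across the zones, with any remainder going to the earliest zones) is inferred from the example and is an assumption, not a description of SmartHub's internal implementation. The function name and the guard for very small pages are also hypothetical.

```python
def zone_allocation(weights, rows_per_page):
    """Return one {engine: count} dict per zone for a single results page."""
    # NumberOfZones = RowsPerPage / NumberOfBackends, as a whole number.
    # Assumed guard: always use at least one zone.
    number_of_zones = max(1, rows_per_page // len(weights))
    zones = [{} for _ in range(number_of_zones)]
    for engine, weight in weights.items():
        base, extra = divmod(weight, number_of_zones)
        for index, zone in enumerate(zones):
            # The first `extra` zones receive one additional result.
            zone[engine] = base + (1 if index < extra else 0)
    return zones


zones = zone_allocation({"SharePointOnline": 7, "Azure": 2, "NetDocs": 1}, 10)
for number, zone in enumerate(zones, start=1):
    print(f"Zone {number}: {zone}")
```

For the configuration above, this yields 3, 1, and 1 documents in Zone 1, then 2, 1, and 0 in Zone 2, then 2, 0, and 0 in Zone 3, which matches the listing for Page 1.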
How to use the Weighted Round Robin Algorithm in a Scriptable Mixing Stage
In your custom mixing stage, you can use the algorithm by calling the WeightedRoundRobinMixingAlgorithm.MixResults function.
- The function has the following definition:
List<SearchResult> MixResults(Dictionary<string, Queue<SearchResult>> backendResultsQueues, SearchQuery query, Dictionary<string, int> backendWeights, int numberOfSections)
Parameters:
- backendResultsQueues: Dictionary where the key is the search engine name and the value is a queue constructed from the SearchResults.RelevantResults list
- query: This is the user query object
- backendWeights: Dictionary where the key is the search engine name and the value is the associated weight
- numberOfSections: This value defines the number of zones to be used for a page
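To show how the queues, the weights, and the number of sections relate to each other, here is a Python analogue of a weighted round robin pass over one page. It is not the MixResults function and does not call it: the name `mix_page_sketch`, the omission of the query parameter, the per-zone ordering by configuration order, and the behaviour when a queue runs empty are all assumptions made for the illustration.

```python
from collections import deque


def mix_page_sketch(backend_results_queues, backend_weights, number_of_sections):
    """Drain up to one page of results from the per-engine queues, zone by zone."""
    mixed = []
    for section in range(number_of_sections):
        for engine, weight in backend_weights.items():
            # Assumed split: remainder of the weight goes to the first zones.
            base, extra = divmod(weight, number_of_sections)
            quota = base + (1 if section < extra else 0)
            queue = backend_results_queues.get(engine, deque())
            for _ in range(quota):
                if not queue:
                    break  # assumed: an exhausted engine contributes fewer results
                mixed.append(queue.popleft())
    return mixed


queues = {
    "SharePointOnline": deque(f"sp{i}" for i in range(1, 11)),
    "Azure": deque(f"az{i}" for i in range(1, 6)),
    "NetDocs": deque(["nd1"]),
}
weights = {"SharePointOnline": 7, "Azure": 2, "NetDocs": 1}
page_one = mix_page_sketch(queues, weights, 3)
print(page_one)
```

The queues are consumed in place, so whatever remains in them after the call is what would be left for a later page in this sketch.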
Mixing and Pagination
Mixing is applied to the results from all of the search engines before pagination is applied.
- For example, if you request 10 results per page, the SmartHub engine requests 10 results from each of the configured search engines.
- The combined results are mixed and sorted before anything is returned.
- Only the top 10 results of the mixed set are returned for the page.
Note:
- The pipeline extensibility stages are applied to the entire mixed and sorted set of results before pagination is applied.
- These stages let you implement custom mixing algorithms that override SmartHub's built-in mixing algorithms.
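The order of operations for the first page (fetch from every search engine, mix, then cut to the page size) can be sketched in Python as follows. The `fetchers` callables, the result tuples, and the rank-sorting mixer are hypothetical stand-ins for the configured search engines and mixing algorithm. How SmartHub requests later pages is not covered here.

```python
def first_page(fetchers, rows_per_page, mix):
    """Request a full page from every engine, mix everything, keep the top rows."""
    fetched = {name: fetch(rows_per_page) for name, fetch in fetchers.items()}
    mixed = mix(fetched)          # mixing sees all fetched results
    return mixed[:rows_per_page]  # pagination is applied last


def mix_by_rank(fetched):
    """Stand-in mixer: flatten all results and sort by rank, highest first."""
    everything = [result for results in fetched.values() for result in results]
    return sorted(everything, key=lambda result: result[1], reverse=True)


index_a = [("a1", 90), ("a2", 70), ("a3", 50), ("a4", 10)]
index_b = [("b1", 80), ("b2", 60), ("b3", 40)]
fetchers = {
    "EngineA": lambda count: index_a[:count],
    "EngineB": lambda count: index_b[:count],
}
page = first_page(fetchers, 3, mix_by_rank)
print(page)
```

With a page size of 3, six results are fetched and mixed, and only the three highest-ranked ones are returned.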