How to Manage Rules for Generating Previews

 

Specify Preview Generation Rules

Preview rules define how Previews are generated in each mode (online or offline).

(The terms Preview generation rules, Preview rules, and import rules are used interchangeably.)

  • Write these rules to control the granularity of the Preview results.

By default, Smart Previews uses two types of import rules:

  • Online generation (at viewing time):
    • Files that are smaller than 200 KB
  • Pre-generation (at indexing time):
    • Files that are larger than 200 KB

Both of these rules use the default Preview databases to store the Previews.

  • Rules are applied to content sequentially, from the top of the list to the bottom.
    • For example, if you specify an online rule with a maximum file size of 500 KB and the document exceeds this size, the rule is skipped and the preview generation service evaluates the next rule.
  • This is true whether the next rule is for offline or online generation.
  • When you edit existing rules or create new rules, you can choose operations to perform on the imported content, such as storing document Previews in different databases based on the URL of the document, or excluding certain types of documents from the Preview generation process.
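
The ordering behavior described above can be sketched in a few lines of Python. This is only an illustration of top-to-bottom, first-match evaluation, not the service's actual implementation: the Rule class, its field names, and the url_prefix condition are hypothetical, and treating a file of exactly 200 KB as within the online rule's limit is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    behavior: str               # "online", "offline", or "ignore"
    url_prefix: str             # hypothetical URL condition; "" matches every URL
    max_size_kb: Optional[int]  # None means no size limit
    database: str = "default"


def first_matching_rule(rules, url, size_kb):
    """Walk the rules top to bottom and return the first one that accepts the document."""
    for rule in rules:
        if not url.startswith(rule.url_prefix):
            continue  # URL condition not met: evaluate the next rule
        if rule.max_size_kb is not None and size_kb > rule.max_size_kb:
            continue  # document exceeds the size limit: evaluate the next rule
        return rule
    return None


# Assumed stand-ins for the two default rules (online up to 200 KB, offline otherwise).
DEFAULT_RULES = [
    Rule("online", "", 200),
    Rule("offline", "", None),
]

small = first_matching_rule(DEFAULT_RULES, "http://myserver/a.docx", 150)
large = first_matching_rule(DEFAULT_RULES, "http://myserver/b.pdf", 900)
print(small.behavior, large.behavior)
```

In this sketch the 900 KB document skips the online rule because of its size and falls through to the offline rule, which mirrors the 500 KB example in the list above.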

Write Preview Generation Rules

  1. Go to the Web Admin Console: Click Control the Import Service on the Smart Previews Web Admin console page that appears.
  2. Go to the Preview generation rules section of the Service management page.
  3. Behavior: Click the down arrow and select:
    1. Generate at viewing time: Generate Previews upon request (online mode).
      1. Generate at viewing time rules should precede pre-generate at indexing time rules.
    2. Pre-generate at index time: Generate Previews at indexing time (offline mode).
    3. Ignore document: Do not generate Previews for these documents.
  4. When URL: Click the down arrow and select:
    1. Starts with: Checks whether the document URL begins with the import rule specified in the Value field.
      1. Start the value with a scheme, such as http:// or notes://, and end the URL with a single asterisk (*) wildcard symbol (two asterisks are not allowed).
      2. The asterisk lets the URL match any character sequence that follows the specified syntax.
    2. Matches regular expression: In this mode the Smart Previews Import Server uses regular expressions to compare URLs.
      1. Specify the expression in the Value field. For more information about the allowed regular expression syntax, see Microsoft: Regular Expression Language - Quick Reference.
      2. Regular expressions are not validated upon submission; see the log files if you do not get the expected results.
        You can use regular expressions to match a specific file extension, such as the following rule:

        ^.*\.(extension1|extension2)$


        with the following components:

        ^.*
        - Starts with anything
        \.
        - Contains a literal dot
        (extension1|extension2)
        - Contains extension1 or extension2 (and so on)
        $
        - The end of the line, nothing else follows

        1. For example, write ^.*\.(rtf|doc|docx)$. To apply the 3 Kb limit, use the Max file size (Kb) field in the rule.
        2. All URL comparisons are case-sensitive.
  5. Value:
    1. Enter an expression.
    2. This value is compared to the value that you select in When URL.
  6. Max file size (Kb): Enter the maximum allowed file size.
  7. Save into database: Click the down arrow and select the database where the Previews are stored.
    • Choose default or New Database.
  8. If you add a rule, click Add.
  9. To make a change to an existing rule or to delete a rule, click Edit or Delete.
  10. Click Save.
    1. If a rule disappears, look for a message below the Preview generation rules table stating that the rule was not saved.
    2. If you refresh your page, the rule is deleted.
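
Because regular expressions are not validated upon submission, it can help to try a pattern against sample URLs before entering it in the Value field. The following is a minimal sketch in Python; the Smart Previews Import Server uses the Microsoft regular expression syntax, so this assumes only that a simple pattern like this one behaves the same way in both engines. The function name and the sample URLs are made up for illustration.

```python
import re

# Pattern from the steps above: any URL ending in .rtf, .doc, or .docx.
EXTENSION_RULE = re.compile(r"^.*\.(rtf|doc|docx)$")


def matches_extension_rule(url):
    # No IGNORECASE flag: the steps above state that URL comparisons are case-sensitive.
    return EXTENSION_RULE.search(url) is not None


for url in [
    "http://myserver/docs/report.docx",
    "http://myserver/docs/report.DOCX",
    "http://myserver/docs/report.docx?version=2",
    "http://myserver/docs/report.pdf",
]:
    print(url, matches_extension_rule(url))
```

Only the first URL matches. The uppercase extension fails the case-sensitive comparison, and the query string in the third URL means the URL no longer ends with the extension.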

Preview Generation Rules Examples

How to Specify No Previews for Matching URLs

This example shows a rule that uses the Ignore document behavior to skip Preview generation for all documents whose URL matches the following syntax: http://nopreview/*

  • This rule uses the asterisk (*) to match any character sequence that follows the specified syntax.

  • If no other rules are present, Previews are generated for all of the other input URLs and stored in the default database.
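
The Starts with comparison in this example can be approximated as a case-sensitive prefix check. The sketch below is an assumption about how the trailing asterisk is interpreted, not the product's code; the function name is hypothetical, and rejecting values without exactly one trailing asterisk is a choice made here to reflect the one-asterisk restriction.

```python
def starts_with_rule_matches(rule_value, url):
    """Hypothetical check for a Starts with value that ends in a single asterisk."""
    if rule_value.count("*") != 1 or not rule_value.endswith("*"):
        raise ValueError("expected exactly one asterisk, at the end of the value")
    prefix = rule_value[:-1]       # everything before the wildcard
    return url.startswith(prefix)  # case-sensitive prefix comparison


print(starts_with_rule_matches("http://nopreview/*", "http://nopreview/team/plan.docx"))
print(starts_with_rule_matches("http://nopreview/*", "http://myserver/team/plan.docx"))
```

Under these assumptions the first URL matches the ignore rule and the second does not, so the second document would still be evaluated by any remaining rules.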

How to Exclude .asp Files

To exclude all .asp files, specify the following regular expression:

^.*\.asp$

How to Exclude a File From a Folder

To exclude files from a specific folder, follow these steps:

  1. Behavior: Select Ignore document.
  2. When URL: Select Starts with.
  3. Value: Use this rule as a template:
       http://myserver/some_path/folder_to_exclude/*

Rule Validation Errors

When you add, edit, or save Preview generation rules, the rules are validated.

If the specified operation cannot be completed, an error message appears at the bottom of the table in red lettering:

  • Save: If a rule is not properly specified, the rule is not saved in the database.
  • Add: If a new rule does not pass validation, the rule is not added.
  • Edit: If a saved rule has been changed from a valid rule syntax to an invalid syntax, the rule is marked with an error message until you change or delete the rule.

Stop or Start the Preview Generation Service

BA Insight recommends that you pause any crawls before you stop the service because crawled documents accumulate and remain in the transfer folder when the service is stopped.

  • When you stop the import service, you can also choose to remove any documents that are in the Smart Previews Import Server queue.

  • For example, select this operation for documents that are no longer relevant because you removed a content source from SharePoint.

To stop and restart the Preview generation service, complete these steps:

  1. Remove documents pending on this server: Select this option if you want to clear all of the files currently in the incoming and temp folders when you stop the Smart Previews Import Server.
  2. Stop the Preview Generation Service: Click.
  3. Start the Preview Generation Service: Click to restart the service.
  4. Resume the crawls.

Change OCR Support

Disable OCR Support

By default, both the PDF and Images radio buttons on the Control the Import Server page are selected.

  • This feature is CPU-intensive and can impact performance. Do not enable OCR with online Previews.
  • For the best results, use this setting for images that are high-quality and without artifacts.

To disable OCR support, complete these steps:

  1. Go to the Service management page > Manage CPU Usage: See the OCR support section.
  2. Click Images (optional)
  3. Click PDF (optional)
  4. Click Save for all updates.

Determine which PDFs are OCR'd

To determine which documents have (and which documents have not) been processed with Optical Character Recognition (OCR), execute the following query on the Preview Configuration database.

  • This database contains the ProcessedDocumentsStatistics table that stores the statistics for the generated Previews.

To execute this query on this table, edit the following sample query to meet your requirements:

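USE Preview_Configuration;

-- List the documents for which OCR was not performed.
-- Change the value to 1 to list the documents that were OCR'd.
SELECT * FROM ProcessedDocumentsStatistics
WHERE OCRPerformed = 0;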

The possible OCRPerformed values are:

  • 0 = OCR is not performed
  • 1 = OCR is performed
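
To see how many documents fall under each OCRPerformed value, the sample query can be adapted into a grouped count. The sketch below runs that idea against an in-memory SQLite table purely as a stand-in for the Preview Configuration database, so that it is runnable anywhere. The DocumentUrl column and the sample rows are hypothetical; only the table name and the OCRPerformed column come from the text above.

```python
import sqlite3

# In-memory stand-in for the ProcessedDocumentsStatistics table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProcessedDocumentsStatistics ("
    "DocumentUrl TEXT, "     # hypothetical column, for illustration only
    "OCRPerformed INTEGER)"  # 0 = OCR is not performed, 1 = OCR is performed
)
conn.executemany(
    "INSERT INTO ProcessedDocumentsStatistics VALUES (?, ?)",
    [
        ("http://myserver/a.pdf", 1),
        ("http://myserver/b.pdf", 0),
        ("http://myserver/c.pdf", 0),
    ],
)


def count_by_ocr_status(connection):
    """Return a mapping of OCRPerformed value to number of documents."""
    rows = connection.execute(
        "SELECT OCRPerformed, COUNT(*) FROM ProcessedDocumentsStatistics "
        "GROUP BY OCRPerformed ORDER BY OCRPerformed"
    ).fetchall()
    return dict(rows)


print(count_by_ocr_status(conn))
```

The same SELECT with GROUP BY should carry over to the real database, but check the actual column names in ProcessedDocumentsStatistics before relying on it.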

Disable Previews in a Browser

By default, Previews are enabled on all of the supported web browsers, which include:

  • Chrome
  • Firefox
  • Internet Explorer
  • Safari

If you want to disable Previews on any of these browsers, use these steps:

Note: In a multi-server environment, modify the UserAgentsList.xml file for each web front end (WFE).

  1. Go to C:/Program Files/Common Files/microsoft shared/Web Server Extensions/15/TEMPLATE/FEATURES/LongitudeService.2013_LongitudeV4 2013.
  2. Right-click the UserAgentsList.xml file and select Edit.
    This file appears in Notepad.


  3. For each browser that you do not want to support Previews, go to that browser's Expression tag and remove the value HTML.
  4. Save and close the file.