How to Remove Previews from Your Configured Storage

About

There are multiple reasons to clear specific previews from the database:

  • To purge previews that are no longer useful
  • To force preview regeneration for specific previews

The following steps apply regardless of which storage type you use:

  • SQL Server
  • Azure SQL and blob storage

Using PowerShell

Procedure:

  1. Prepare a SQL statement that selects the DocumentUri from the ProcessedDocumentsStatistics table for all the documents you want to remove:
    1. Choose the document conditions:
      • Document modified date
      • Document size
      • ConnectivityHub content ID
      • etc.
    2. The query only needs to return the DocumentUri column, which contains the URLs of the documents whose previews will be deleted.

      Sample

      SELECT DocumentUri FROM ProcessedDocumentsStatistics WHERE DocumentUri like '%content.5%'

  2. Open a PowerShell window as Administrator and navigate to the BAInsight Fast Proxy Incoming folder configured for your SmartPreviews Fast Proxy service.
  3. Modify and run the script below so that Invoke-Sqlcmd connects to your SmartPreviews Configuration database and retrieves all the documents you want to remove (use the query you prepared in step 1).

    $results = Invoke-Sqlcmd -Query "SELECT DocumentUri FROM ProcessedDocumentsStatistics WHERE..." -ConnectionString "..."

     

    WARNING

    Review which documents, and how many, your query returned, because the previews for all of those documents will be removed.
  4. Modify and run the script below:

    PowerShell
    # Write one URL per line (Format-Table can truncate long URLs to the console width)
    $results | ForEach-Object { $_.DocumentUri } | Out-File previewsToRemove.txt
    # Get-Content resolves the relative path against the current PowerShell location
    Get-Content previewsToRemove.txt | ForEach-Object { Set-Content -Path ([GUID]::NewGuid().ToString() + ".in") -Value "<Document><CrawledProperty propertyName=`"url`" varType=`"31`" propertySet=`"11280615-f653-448f-8ed8-2915008789f2`">$([System.Security.SecurityElement]::Escape($_))</CrawledProperty></Document>" }
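
The one-liner in step 4 is dense, so the sketch below unpacks the same logic into a small function that may be easier to review and adapt. The function name New-PreviewRemovalFile and its -IncomingFolder parameter are hypothetical and not part of the product; the XML payload is copied from the script above. It is an illustration, not an official replacement for the documented script.

```powershell
# Hypothetical helper: writes one Fast Proxy ".in" request file per document URL.
function New-PreviewRemovalFile {
    param(
        [string[]] $DocumentUri,
        # Assumed to be the Fast Proxy Incoming folder; defaults to the current location.
        [string] $IncomingFolder = (Get-Location).Path
    )

    foreach ($uri in $DocumentUri) {
        # Skip blank entries so that no empty request files are produced.
        if ([string]::IsNullOrWhiteSpace($uri)) { continue }

        # XML-escape the URL (&, <, >, quotes) before embedding it in the document.
        $escaped = [System.Security.SecurityElement]::Escape($uri.Trim())
        $xml = '<Document><CrawledProperty propertyName="url" varType="31" ' +
               'propertySet="11280615-f653-448f-8ed8-2915008789f2">' +
               $escaped + '</CrawledProperty></Document>'

        # A fresh GUID keeps the file names unique, as in the one-liner.
        $path = Join-Path $IncomingFolder ([guid]::NewGuid().ToString() + '.in')
        Set-Content -LiteralPath $path -Value $xml

        # Emit the path of the file that was just created.
        $path
    }
}
```

With the $results variable from step 3, a call could look like New-PreviewRemovalFile -DocumentUri $results.DocumentUri, run from the Incoming folder. Returning the created paths makes it easy to compare the number of files written with the number of rows the query returned.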

Using SQL Management Studio

  1. Use SQL Management Studio to connect to your Preview Configuration database, either:
    1. SQL Server instance
    2. Cloud-based SQL database
  2. Create a query against the ProcessedDocumentsStatistics table that returns all the documents you want to remove:
    1. Choose the document conditions:
      1. Document modified date
      2. Document size
      3. ConnectivityHub content ID
      4. Etc.
    2. The query only needs to return the DocumentUri column, which contains the URLs of the documents whose previews will be deleted.
  3. Copy the URLs into a text file, one URL per line. Save the text file on the server that houses your SmartPreviews Fast Proxy service.
  4. Launch your local Notepad application.

    1. Copy the script below.

    2. Paste the script into your empty Notepad document and modify it so that it references the file from step 3, above:

      PowerShell
      [System.IO.File]::ReadLines("<path to TXT file from step 3>") | ForEach { Set-Content -Path ([GUID]::NewGuid().ToString() + ".in") -Value "<Document><CrawledProperty propertyName=`"url`" varType=`"31`" propertySet=`"11280615-f653-448f-8ed8-2915008789f2`">$([System.Security.SecurityElement]::Escape($_))</CrawledProperty></Document>" }
  5. Open a PowerShell window as Administrator and navigate to the BAInsight Fast Proxy Incoming folder configured for your SmartPreviews Fast Proxy service.
  6. Run the updated script from step 4.
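
Because the URL list in step 3 is assembled by hand, it can pick up blank lines, stray spaces, or duplicates. The sketch below shows one way to review the file before running the script. The function name Get-PreviewRemovalUrl and the example path are hypothetical.

```powershell
# Hypothetical helper: returns the distinct, non-blank URLs found in a text file.
function Get-PreviewRemovalUrl {
    param([string] $Path)

    Get-Content -LiteralPath $Path |
        ForEach-Object { $_.Trim() } |      # drop leading and trailing whitespace
        Where-Object { $_ -ne '' } |        # ignore blank lines
        Select-Object -Unique               # remove exact (case-sensitive) duplicates
}

# Example review before running the script from step 4 (hypothetical path):
# $urls = @(Get-PreviewRemovalUrl -Path 'C:\Temp\previewsToRemove.txt')
# "{0} previews will be removed" -f $urls.Count
# $urls | Select-Object -First 10
```

Checking the count and the first few entries serves the same purpose as the warning in the PowerShell procedure: every URL in the file results in a removed preview.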

Script Results and Behavior

  • When the script runs, it creates files in your Fast Proxy Incoming folder.
    • There is one file, named <GUID>.in, for each URL you want to delete.
  • Once Fast Proxy detects the files, it processes them and cleans up the preview information stored for those documents:
    • Cleans the ProcessedDocumentsStatistics table
    • Cleans the ProcessedDocuments table
    • Removes the preview container, pages, resources, and thumbnails
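
Since the ProcessedDocumentsStatistics table is cleaned as part of processing, one way to confirm the removal is to re-run the selection query and compare its result with the URLs you submitted. The sketch below assumes that approach. The function name Get-RemainingPreviewUrl is hypothetical, and how long Fast Proxy takes to process the files is not stated here, so allow it some time before checking.

```powershell
# Hypothetical helper: lists the targeted URLs that still appear in a later query result.
function Get-RemainingPreviewUrl {
    param(
        [string[]] $Targeted,      # URLs submitted for removal
        [string[]] $StillPresent   # DocumentUri values returned by re-running the query
    )

    # -contains compares strings case-insensitively by default.
    $Targeted | Where-Object { $StillPresent -contains $_ }
}

# Example (the query and connection string are the ones you used to select the documents):
# $after = Invoke-Sqlcmd -Query "SELECT DocumentUri FROM ProcessedDocumentsStatistics WHERE..." -ConnectionString "..."
# $left  = @(Get-RemainingPreviewUrl -Targeted $urls -StillPresent $after.DocumentUri)
# "{0} previews are still listed" -f $left.Count
```

An empty result suggests that the statistics rows for all submitted documents are gone. It does not by itself verify the ProcessedDocuments table or the stored preview files.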