Capacity Planning and Preview Mode: Offline or Online
Document Considerations
As you plan your capacity requirements for Smart Previews, record the values for the following criteria in your environment (your requirements may change over time):
Requirements | Value required by your environment |
---|---|
How many documents are indexed per content source? | |
How frequently do the crawled documents change? | |
For how many documents do you want to generate Previews? | |
For what types of documents do you want to generate Previews? | |
The following sections provide details about the Preview processes.
Preview Process
Previews are generated in one of two modes:
- Online mode
- Offline mode
Online Mode
- The user clicks the preview icon.
- A fetcher downloads the document.
- The preview is generated, stored in the database, and returned to the user. This step occurs only on the first request.
- Subsequent requests for the same preview retrieve it directly from the database.
Offline Mode
- The documents are captured during crawling and sent for preview generation.
- The documents are moved to the preview server, where the previews are generated and then stored in the database.
- This method requires no user request.
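The two modes can be pictured as one preview store with two entry points. The following is a minimal sketch, not the Smart Previews implementation: `PreviewStore`, `get_preview`, `on_crawl`, and the `fetch` and `render` callables are hypothetical names, and an in-memory dictionary stands in for the database.

```python
from typing import Callable, Dict


class PreviewStore:
    """Hypothetical stand-in for the preview database (an in-memory dict)."""

    def __init__(self, fetch: Callable[[str], bytes], render: Callable[[bytes], str]):
        self._fetch = fetch      # assumed: downloads a document by ID
        self._render = render    # assumed: turns document bytes into a preview
        self._cache: Dict[str, str] = {}
        self.render_count = 0    # how many previews were actually generated

    def _generate(self, doc_id: str, content: bytes) -> str:
        preview = self._render(content)
        self.render_count += 1
        self._cache[doc_id] = preview
        return preview

    def get_preview(self, doc_id: str) -> str:
        """Online mode: generate on the first request, then serve from the store."""
        if doc_id in self._cache:
            return self._cache[doc_id]
        return self._generate(doc_id, self._fetch(doc_id))

    def on_crawl(self, doc_id: str, content: bytes) -> None:
        """Offline mode: generate at crawl time, with no user request."""
        self._generate(doc_id, content)


documents = {"a.docx": b"alpha", "b.pdf": b"beta"}
store = PreviewStore(
    fetch=lambda doc_id: documents[doc_id],
    render=lambda content: content.decode().upper(),
)

store.on_crawl("a.docx", documents["a.docx"])  # offline: generated during the crawl
first = store.get_preview("b.pdf")             # online: generated on the first click
second = store.get_preview("b.pdf")            # served from the store, not regenerated
```

In this sketch, a document that was already processed at crawl time never reaches the fetch step when a user requests its preview.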
Which Preview Process is Right for You?
Using offline Preview generation, Smart Previews generates and stores Previews at crawl time.
Preview Process | Primary Benefit |
---|---|
Offline (crawl-time) Previews | |
Online (On-demand) Previews | |
Crawling and Document Modifications
Consider the following when determining hardware requirements for the Smart Previews components:
- Crawling
- Database cache
- Document modification operations
Initial Offline Preview Cache Build
- The initial Preview cache database is populated during a full crawl after Smart Previews is first deployed.
- Future crawls, whether full or incremental, update the Preview cache database only with previews for documents that changed since the last crawl.
- Smart Previews receives a copy of each crawled document.
- This initial build of the Preview cache can require more hardware resources than normal operations.
- BA Insight recommends completing this initial build before going live in a production environment.
Recommended Database Cache Size
- Smart Preview Cache: 8 GB per 100k documents
- Longitude_Configuration database: 150 MB per 100k documents
- Longitude_UserProfile database: 10 MB per user, assuming 30 documents in each workspace
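These per-unit figures can be turned into a rough storage estimate. The sketch below is illustrative only: the function and class names are hypothetical, the two database figures are treated as megabytes, and the 500,000-document, 200-user environment is an invented example.

```python
from dataclasses import dataclass

# Per-unit figures taken from the recommendations above.
PREVIEW_CACHE_GB_PER_100K_DOCS = 8
CONFIG_DB_MB_PER_100K_DOCS = 150   # assumption: the figure is in megabytes
USER_PROFILE_MB_PER_USER = 10      # assumes about 30 documents per workspace


@dataclass
class StorageEstimate:
    preview_cache_gb: float
    configuration_db_mb: float
    user_profile_db_mb: float


def estimate_storage(documents: int, users: int) -> StorageEstimate:
    """Scale the per-100k-document and per-user figures linearly."""
    units = documents / 100_000
    return StorageEstimate(
        preview_cache_gb=units * PREVIEW_CACHE_GB_PER_100K_DOCS,
        configuration_db_mb=units * CONFIG_DB_MB_PER_100K_DOCS,
        user_profile_db_mb=users * USER_PROFILE_MB_PER_USER,
    )


# Hypothetical environment: 500,000 documents and 200 users.
estimate = estimate_storage(documents=500_000, users=200)
print(estimate)
```

For this example the estimate comes to 40 GB for the Preview cache, 750 MB for the configuration database, and 2,000 MB for the user profile database.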
How Long Does it Take to Generate a Preview?
- Previews are typically generated at a rate of 600 to 1,500 documents per core per hour (equivalent to about 230 to 400 MB per core per hour).
- The first time a user Previews a search result, the Preview can require more time to render because the browser needs to cache the required resources.
- This is the same behavior that you see when you open any SharePoint site.
- Some file types can require more time. For example:
- Emails (excluding attachments) process near the top speed of 1500 files/core/hr.
- Scanned PDF files are closer to the lower end of the range (600 files/core/hr).
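The throughput range gives a rough way to bound how long a cache build might take. This is a back-of-the-envelope sketch: `estimate_build_hours` is a hypothetical helper, it assumes throughput scales linearly with the number of cores, and the one-million-document, eight-core scenario is an invented example.

```python
SLOW_DOCS_PER_CORE_HOUR = 600    # lower end of the range (for example, scanned PDFs)
FAST_DOCS_PER_CORE_HOUR = 1500   # upper end of the range (for example, emails)


def estimate_build_hours(documents: int, cores: int, docs_per_core_hour: float) -> float:
    """Hours needed to generate previews, assuming linear scaling across cores."""
    if cores <= 0 or docs_per_core_hour <= 0:
        raise ValueError("cores and docs_per_core_hour must be positive")
    return documents / (cores * docs_per_core_hour)


# Hypothetical scenario: 1,000,000 documents on 8 cores.
worst = estimate_build_hours(1_000_000, cores=8, docs_per_core_hour=SLOW_DOCS_PER_CORE_HOUR)
best = estimate_build_hours(1_000_000, cores=8, docs_per_core_hour=FAST_DOCS_PER_CORE_HOUR)
print(f"{best:.1f} to {worst:.1f} hours")  # prints "83.3 to 208.3 hours"
```

Real builds depend on the mix of file types, so the two rates are better read as bounds than as a forecast.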
How to Choose a Preview Process
Offline vs. Online
- Offline (crawl time) and Online (On-Demand) Previews are each designed to address different requirements.
- Both processes can run simultaneously on the same hardware configuration.
- Both processes are activated by default.
- Typically, you specify a rule that enables documents up to a specified size to be generated online.
- Rules can also be applied based on date or file type (regex).
- Larger or less frequently accessed documents are generated offline.
- You can use both processes or only one of them.
- Using Offline Preview generation, Smart Previews generates Previews for documents while they are being crawled.
- This means that Previews are already available when users request them and require no additional processing at that point.
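A size and file-type rule of the kind described above might look like the following. This is a sketch of the idea only, not the Smart Previews rule syntax: `OnlineRule`, `choose_mode`, the 5 MB threshold, and the regular expression are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional, Pattern


@dataclass
class OnlineRule:
    """Hypothetical rule: matching documents are generated online (on demand)."""
    max_size_bytes: int
    file_type_pattern: Optional[Pattern[str]] = None

    def matches(self, file_name: str, size_bytes: int) -> bool:
        if size_bytes > self.max_size_bytes:
            return False  # too large for on-demand generation
        if self.file_type_pattern and not self.file_type_pattern.search(file_name):
            return False  # file type is not covered by the rule
        return True


def choose_mode(rule: OnlineRule, file_name: str, size_bytes: int) -> str:
    """Return 'online' when the rule matches, otherwise 'offline'."""
    return "online" if rule.matches(file_name, size_bytes) else "offline"


# Illustrative values: documents up to 5 MB with a .docx or .pdf extension.
rule = OnlineRule(
    max_size_bytes=5 * 1024 * 1024,
    file_type_pattern=re.compile(r"\.(docx|pdf)$", re.IGNORECASE),
)

small_pdf = choose_mode(rule, "report.PDF", 1024)
large_pdf = choose_mode(rule, "big.pdf", 50 * 1024 * 1024)
email = choose_mode(rule, "mail.msg", 10)
```

Here the small PDF is routed online, while the large PDF and the email fall through to offline generation.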