Website Connector Prerequisites

Connector Features and Requirements

Features                    Supported               Additional Information
Searchable content types    Yes
  • All content types.
  • Meta tags found in HTML documents can be extracted via Connectivity Hub.
  • See the Connectivity Hub documentation on how to configure your content sources for this.
Content Update              Full and Incremental
Permission Types            No
  • All content is indexed as public.
  • Security is assigned via the ACL Script in the Content > Advanced tab.
Required Software           .NET Framework v4.7.2
Hardware

Rendering HTML web pages requires a large amount of CPU resources and memory.

BA Insight recommends the following hardware:

  • A server with a minimum of 5 GB RAM and 8 CPU cores available for the connector to process sites correctly.

 

Authentication Protocols

The following authentication protocols are supported:

Authentication Protocol  Description Prerequisite
Anonymous Access

The connector does not pass any information to the web server.

None

HTTP Basic Authentication

The connector passes the username and password for authentication in the standard HTTP Authorization header.

The username and password of the account to use for authentication.
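
As an illustration of what passing credentials in the standard HTTP header means, the following minimal sketch builds a Basic Authorization header as defined in RFC 7617. It is not the connector's own code; the function name and the sample credentials are hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header (RFC 7617)."""
    # The credentials are joined with a colon and Base64-encoded.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical crawl account, for illustration only.
headers = basic_auth_header("crawl-account", "example-password")
```

Because Base64 is an encoding and not encryption, Basic authentication is normally used only over HTTPS.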

Azure AD Application

The connector interacts with Azure Active Directory to obtain a token and passes it in the HTTP Authorization header.

The website must be secured via Azure AD.

The connector requires the following:

  • The ID of the Azure tenant where the website is deployed.
  • The Client ID of the application in Azure AD.
  • A certificate to obtain an Azure AD token must be uploaded to the certificate store on the computer where the connector is installed.
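
The sketch below shows how the three items above could fit together in a token request. It assumes the Microsoft identity platform v2.0 token endpoint and the client credentials flow with a certificate-signed assertion; the connector's actual flow is not described here, so treat the flow, the `Bearer` scheme, and all function names as assumptions. Signing the assertion with the certificate's private key is not shown.

```python
from urllib.parse import urlencode

# Standard assertion type for certificate-based client authentication.
CLIENT_ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def token_endpoint(tenant_id: str) -> str:
    """Token endpoint for a tenant (assumes the v2.0 endpoint)."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def token_request_body(client_id: str, scope: str, signed_assertion: str) -> str:
    """Form-encoded body for a client credentials request (assumed flow).

    signed_assertion is a JWT signed with the uploaded certificate's
    private key; producing it is outside the scope of this sketch.
    """
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "scope": scope,
        "client_assertion_type": CLIENT_ASSERTION_TYPE,
        "client_assertion": signed_assertion,
    })


def bearer_header(access_token: str) -> dict[str, str]:
    """Authorization header carrying the token (Bearer scheme assumed)."""
    return {"Authorization": f"Bearer {access_token}"}
```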
oAuth Authentication
  • The connector interacts with the identity provider to obtain refresh, access, and ID tokens for authentication.

  • The access and ID tokens are provided to a bootstrapping page on the website for initialization.

The application used by the website must be configured as follows:

  • Allow the authorization code flow with PKCE.
  • Provide refresh, access, and ID tokens.
  • Add the Connector oAuth Redirect URL to the list of authorized Redirect URLs.
    • This is typically http://localhost:2406/oauthresult.aspx.
      Note that the redirect URL is case-sensitive and must exactly match the URL through which the connector is accessed. The /oauthresult.aspx part of the URL is always lower case.
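
To illustrate two of the requirements above, the following sketch generates a PKCE code verifier and its S256 code challenge as described in RFC 7636, and checks a redirect URL with an exact, case-sensitive comparison. It is a generic example, not the connector's implementation; the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Random PKCE code verifier: 32 bytes, URL-safe Base64, no padding."""
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 code challenge: URL-safe Base64 of SHA-256(verifier), no padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def redirect_is_authorized(redirect_url: str, authorized_urls: list[str]) -> bool:
    """Exact, case-sensitive match against the registered redirect URLs."""
    return redirect_url in authorized_urls


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
```

With this comparison, http://localhost:2406/OAuthResult.aspx would not match a registered http://localhost:2406/oauthresult.aspx, which is why the casing must be identical.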

The website must be modified to add an extra page that initializes the application for crawling. When the website is crawled, this page is called with the ID or access tokens passed via the URL. The page is then responsible for storing the necessary token in the right location so that the crawling account is considered successfully authenticated and the browser does not prompt for authentication.
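
The first step such a page performs, reading the tokens from the URL, can be sketched as follows. The query parameter names `id_token` and `access_token` and the sample page path are assumptions made for illustration; the real page is written in the website's own technology and must also store the token wherever the application expects it, which is not shown.

```python
from urllib.parse import parse_qs, urlparse

# Assumed parameter names; the actual names depend on the configuration.
TOKEN_PARAMETERS = ("id_token", "access_token")


def extract_tokens(url: str) -> dict[str, str]:
    """Return the tokens found in the query string of the given URL."""
    query = parse_qs(urlparse(url).query)
    tokens = {}
    for name in TOKEN_PARAMETERS:
        values = query.get(name)
        if values:
            # Keep the first value if a parameter is repeated.
            tokens[name] = values[0]
    return tokens


# Hypothetical bootstrapping page URL.
tokens = extract_tokens("https://www.example.com/crawl-init?id_token=abc&access_token=xyz")
```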

Additionally, make sure you have the following information before starting the installation and configuration:

  • The Client ID of the application used by the website.
  • The authentication endpoint of the identity server.
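
For orientation, the sketch below shows how the Client ID and the authentication endpoint are typically combined into an authorization request for the authorization code flow with PKCE. The parameter names come from the OAuth 2.0 and PKCE specifications; the default scope, the sample endpoint, and the function name are assumptions, and the request the connector actually sends may differ.

```python
from urllib.parse import urlencode


def build_authorization_url(
    auth_endpoint: str,
    client_id: str,
    redirect_url: str,
    code_challenge: str,
    scope: str = "openid offline_access",  # assumed scope, adjust per provider
) -> str:
    """Build an authorization request URL for the code flow with PKCE."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_url,
        "scope": scope,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    # Append to an existing query string if the endpoint already has one.
    separator = "&" if "?" in auth_endpoint else "?"
    return f"{auth_endpoint}{separator}{urlencode(params)}"


# Hypothetical identity provider endpoint and client ID.
url = build_authorization_url(
    "https://idp.example.com/authorize",
    "my-client-id",
    "http://localhost:2406/oauthresult.aspx",
    "example-code-challenge",
)
```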