CrawlQueue API
The CrawlQueue API enables near-real-time incremental crawling in Connectivity Hub.
Connectivity Hub exposes this endpoint; use it to push items and folders for crawling.
When "push incremental crawl mode" is enabled for a content source, the default incremental crawl logic is replaced entirely: Connectivity Hub recrawls only the items pushed through the CrawlQueue API endpoint.
By default, the Connectivity Hub site is secured with NTLM authentication, so every call to the CrawlQueue API must be authenticated.
The endpoint exposes 3 operations, as shown in the image below:
Swagger is integrated with Connectivity Hub.
Use Swagger to test the CrawlQueue API and familiarize yourself with its methods and parameters.
To explain how to interact with this API, we use the SharePoint Online connector as an example.
The following objects can be pushed for recrawling by the next incremental crawl:
- A site collection
- In Connectivity Hub, a site collection represents a datastore.
- Adding/removing site collections is done through the Datastore load task available at the connection level.
- A site collection can be marked for recrawling via the CrawlQueue API, provided that it was already crawled by a previous crawl task.
- Marking a site collection for recrawling also marks all the sites, lists, document libraries, and files within it.
- For example:
folderId = "{sitecollectionrootweb.Id}:{URLRelativeToSiteCollection}:{sitecollectionrootweb.WebTemplate}"
Sample for site collection https://baidev.sharepoint.com/sites/CustomerBug5600
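The folderId pattern above can be assembled with a small helper. This is a minimal Python sketch, not part of Connectivity Hub: the function name `build_folder_id` is hypothetical, and the relative URL and web template values are placeholders chosen for illustration. The root web ID is assumed here to be the ID of the site at the site collection URL.

```python
def build_folder_id(root_web_id: str, relative_url: str, web_template: str) -> str:
    """Join the three parts of a folderId with ':' as in the pattern above."""
    return f"{root_web_id}:{relative_url}:{web_template}"


# Placeholder values for illustration only; they are not taken from a real crawl.
folder_id = build_folder_id(
    "be83fd40-6812-4606-a7e2-c3abf8ae68e4",  # assumed root web ID
    "/",                                      # hypothetical relative URL
    "STS",                                    # hypothetical web template name
)
print(folder_id)
```

The exact relative URL and web template strings that Connectivity Hub expects are best taken from an item that was already crawled.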
- A site, list, or document library
- Marking a site for recrawling also marks all its sub-sites, lists, document libraries, and files.
- Marking a list or document library for recrawling also marks all folders and files within it.
- Example for a site:
id = "WEB##{spoSite.Id}##{SiteURLConvertedTobase64}"
folderId = "{sitecollectionrootweb.Id}:{URLRelativeToSiteCollection}:{sitecollectionrootweb.WebTemplate}"
folderSubId = "WEB##{spoSite.Id}##{SiteURLConvertedTobase64}"
Sample for site with:
- ID: be83fd40-6812-4606-a7e2-c3abf8ae68e4
- URL: https://baidev.sharepoint.com/sites/CustomerBug5600
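For a site, the id and folderSubId share the same "WEB##..." value, which could be built as in the sketch below. The helper names `encode_url` and `build_site_id` are hypothetical, and encoding the URL as UTF-8 with the standard Base64 alphabet is an assumption; the source only states that the site URL is converted to Base64.

```python
import base64


def encode_url(url: str) -> str:
    """Base64-encode a URL. UTF-8 and the standard alphabet are assumptions."""
    return base64.b64encode(url.encode("utf-8")).decode("ascii")


def build_site_id(site_id: str, site_url: str) -> str:
    """Build the value used for both id and folderSubId of a site."""
    return f"WEB##{site_id}##{encode_url(site_url)}"


site_key = build_site_id(
    "be83fd40-6812-4606-a7e2-c3abf8ae68e4",
    "https://baidev.sharepoint.com/sites/CustomerBug5600",
)
print(site_key)
```

If the connector uses a different text encoding for the URL, only `encode_url` needs to change.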
- Example for a list:
id = "LIST{list.Id}##{ParentSiteURLConvertedTobase64}##{list.ParentWeb.Id}"
folderId = "{sitecollectionrootweb.Id}:{URLRelativeToSiteCollection}:{sitecollectionrootweb.WebTemplate}"
folderSubId = "LIST{list.Id}##{ParentSiteURLConvertedTobase64}##{list.ParentWeb.Id}"
Example for list with:
- ID: bc79d072-4a4c-4140-bc97-6c9ebec41c87
- Parent site URL: https://baidev.sharepoint.com/sites/CustomerBug5600
- Parent site ID: be83fd40-6812-4606-a7e2-c3abf8ae68e4
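The list pattern differs from the site pattern in two ways: the list ID follows the "LIST" prefix directly, and the parent web ID comes last. A minimal sketch, assuming a hypothetical helper `build_list_id` and UTF-8 Base64 encoding of the parent site URL (the encoding is not specified in the source):

```python
import base64


def build_list_id(list_id: str, parent_site_url: str, parent_web_id: str) -> str:
    """Build the value used for both id and folderSubId of a list or library."""
    # Assumption: the parent site URL is UTF-8 text, Base64-encoded.
    encoded_url = base64.b64encode(parent_site_url.encode("utf-8")).decode("ascii")
    # No separator between the "LIST" prefix and the list ID, as in the pattern.
    return f"LIST{list_id}##{encoded_url}##{parent_web_id}"


list_key = build_list_id(
    "bc79d072-4a4c-4140-bc97-6c9ebec41c87",
    "https://baidev.sharepoint.com/sites/CustomerBug5600",
    "be83fd40-6812-4606-a7e2-c3abf8ae68e4",
)
print(list_key)
```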
- A document or list item
id = the item ID within the list or document library
folderId = "{sitecollectionrootweb.Id}:{URLRelativeToSiteCollection}:{sitecollectionrootweb.WebTemplate}"
folderSubId = "LIST{list.Id}##{ParentSiteURLConvertedTobase64}##{list.ParentWeb.Id}"
Example for item with:
- Item ID 1 in the list with ID bc79d072-4a4c-4140-bc97-6c9ebec41c87
- site URL: https://baidev.sharepoint.com/sites/CustomerBug5600
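For an item, the three values travel together: the item's own ID, the folderId of its site collection, and the folderSubId of its parent list. The sketch below groups them in one record and serializes it with the field names used in this section. The class name `QueueEntry` and the JSON shape are assumptions for illustration; the actual request format should be checked in Swagger.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueEntry:
    """Hypothetical container for the identifiers of one pushed item."""

    item_id: str
    folder_id: str
    folder_sub_id: str

    def to_dict(self) -> dict:
        # Key names follow the identifiers used in this section.
        return {
            "id": self.item_id,
            "folderId": self.folder_id,
            "folderSubId": self.folder_sub_id,
        }


# Angle-bracket values are placeholders for strings built from the patterns above.
entry = QueueEntry(
    item_id="1",
    folder_id="<folderId of the site collection>",
    folder_sub_id="<folderSubId of the parent list>",
)
print(json.dumps(entry.to_dict(), indent=2))
```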