CrawlQueue API
The CrawlQueue API enables near-real-time incremental crawling in Connectivity Hub.
Connectivity Hub exposes this endpoint; use it to push items and folders for crawling.
When "push incremental crawl mode" is enabled for a content source, the default incremental crawl logic is replaced entirely: Connectivity Hub re-crawls only the items pushed through the CrawlQueue API endpoint.
Authentication
By default, the Connectivity Hub site is secured with NTLM authentication, so every call to the CrawlQueue API must be authenticated.
The endpoint exposes 3 operations, shown below:
Swagger
Swagger is integrated with Connectivity Hub.
Use Swagger to test the CrawlQueue API and familiarize yourself with its methods and parameters.
To explain how to interact with this API, we use the SharePoint Online connector as an example.
The following objects can be pushed so that the next incremental crawl re-crawls them:
- A site collection
- In Connectivity Hub, a site collection represents a datastore.
- Adding/removing site collections is done through the Datastore load task available at the connection level.
- A site collection can be marked for re-crawling via the CrawlQueue API only if it was already crawled by a previous crawl task.
- Marking a site collection for re-crawling also marks all sites, lists, document libraries, and files within that site collection.
- For example:
folderId = "{sitecollectionrootweb.Id}:{URLRelativeToSiteCollection}:{sitecollectionrootweb.WebTemplate}"
Sample for site collection https://baidev.sharepoint.com/sites/CustomerBug5600
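As a minimal sketch, the folderId pattern above can be assembled with plain string formatting. The function name `build_folder_id` and all sample values are hypothetical placeholders; the real root web ID, the exact form of the relative URL, and the web template come from your own SharePoint Online tenant.

```python
def build_folder_id(root_web_id: str, relative_url: str, web_template: str) -> str:
    """Join the three parts with ':' following the folderId pattern."""
    return f"{root_web_id}:{relative_url}:{web_template}"


# Placeholder values for illustration only; they are not taken from a real tenant.
example_folder_id = build_folder_id(
    root_web_id="00000000-0000-0000-0000-000000000000",
    relative_url="/",
    web_template="STS",
)
print(example_folder_id)
```

Because the same folderId is reused when pushing sites, lists, and items from that site collection, it can be convenient to compute it once and pass it along.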
- A site, list, or document library
- Marking a site for re-crawling also marks all its sub-sites, lists, document libraries, and files.
- Marking a list or document library for re-crawling also marks all folders and files within it.
- Example for a site:
id = "WEB##{spoSite.Id}##{SiteURLConvertedTobase64}"
folderId = "{sitecollectionrootweb.Id}:{URLRelativeToSiteCollection}:{sitecollectionrootweb.WebTemplate}"
folderSubId = "WEB##{spoSite.Id}##{SiteURLConvertedTobase64}"
Sample for site with:
- ID: be83fd40-6812-4606-a7e2-c3abf8ae68e4
- URL: https://baidev.sharepoint.com/sites/CustomerBug5600
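The site identifier could be built along these lines, using the sample site ID and URL given above. This is a sketch under one assumption that the text does not confirm: that "converted to base64" means standard Base64 over the UTF-8 bytes of the URL. The helper names are hypothetical.

```python
import base64


def to_base64(url: str) -> str:
    # Assumption: UTF-8 bytes, standard Base64 alphabet.
    return base64.b64encode(url.encode("utf-8")).decode("ascii")


def build_web_id(site_id: str, site_url: str) -> str:
    """Build the value used for both id and folderSubId when pushing a site."""
    return f"WEB##{site_id}##{to_base64(site_url)}"


site_id = "be83fd40-6812-4606-a7e2-c3abf8ae68e4"
site_url = "https://baidev.sharepoint.com/sites/CustomerBug5600"
web_id = build_web_id(site_id, site_url)
print(web_id)
```

If Connectivity Hub does not recognize the pushed site, the encoding assumption is the first thing to verify against an identifier produced by an actual crawl.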
- Example for a list:
id = "LIST{list.Id}##{ParentSiteURLConvertedTobase64}##{list.ParentWeb.Id}"
folderId = "{sitecollectionrootweb.Id}:{URLRelativeToSiteCollection}:{sitecollectionrootweb.WebTemplate}"
folderSubId = "LIST{list.Id}##{ParentSiteURLConvertedTobase64}##{list.ParentWeb.Id}"
Sample for list with:
- ID: bc79d072-4a4c-4140-bc97-6c9ebec41c87
- Parent site URL: https://baidev.sharepoint.com/sites/CustomerBug5600
- Parent site ID: be83fd40-6812-4606-a7e2-c3abf8ae68e4
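The list identifier has three parts, which the following sketch builds and then splits apart again as a sanity check. The function names are hypothetical, and the Base64 step again assumes UTF-8 bytes with the standard alphabet, which the text does not spell out.

```python
import base64


def build_list_id(list_id: str, parent_site_url: str, parent_web_id: str) -> str:
    """Build the value used for id and folderSubId when pushing a list."""
    # Assumption: the parent site URL is UTF-8 encoded before Base64 encoding.
    encoded_url = base64.b64encode(parent_site_url.encode("utf-8")).decode("ascii")
    return f"LIST{list_id}##{encoded_url}##{parent_web_id}"


def parse_list_id(value: str) -> tuple[str, str, str]:
    """Split a list identifier back into (list ID, parent site URL, parent web ID)."""
    head, encoded_url, parent_web_id = value.split("##")
    if not head.startswith("LIST"):
        raise ValueError("not a list identifier")
    parent_site_url = base64.b64decode(encoded_url).decode("utf-8")
    return head[len("LIST"):], parent_site_url, parent_web_id


list_identifier = build_list_id(
    "bc79d072-4a4c-4140-bc97-6c9ebec41c87",
    "https://baidev.sharepoint.com/sites/CustomerBug5600",
    "be83fd40-6812-4606-a7e2-c3abf8ae68e4",
)
print(parse_list_id(list_identifier))
```

Note that, unlike the site pattern, the list pattern has no `##` between the `LIST` prefix and the list ID; the sketch follows the pattern exactly as written.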
- A document or list item
id = item ID in the list or document library
folderId = "{sitecollectionrootweb.Id}:{URLRelativeToSiteCollection}:{sitecollectionrootweb.WebTemplate}"
folderSubId = "LIST{list.Id}##{ParentSiteURLConvertedTobase64}##{list.ParentWeb.Id}"
Sample for item with:
- ID 1 from the list with ID bc79d072-4a4c-4140-bc97-6c9ebec41c87
- Site URL: https://baidev.sharepoint.com/sites/CustomerBug5600
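To tie the three fields together for a single item, here is a hedged sketch that collects id, folderId, and folderSubId in one record. The `CrawlQueueEntry` class and `item_entry` function are hypothetical. How the fields are actually sent (request body or parameters) is not stated here, so check the Swagger page before wiring this into a real call. The parent site ID is taken from the list sample above, and the folderId is left as a placeholder.

```python
import base64
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CrawlQueueEntry:
    """Hypothetical container for the three fields named in this section."""
    id: str
    folderId: str
    folderSubId: str


def item_entry(item_id: int, list_id: str, parent_site_url: str,
               parent_web_id: str, folder_id: str) -> CrawlQueueEntry:
    # Assumption: Base64 over the UTF-8 bytes of the parent site URL.
    encoded_url = base64.b64encode(parent_site_url.encode("utf-8")).decode("ascii")
    return CrawlQueueEntry(
        id=str(item_id),
        folderId=folder_id,
        folderSubId=f"LIST{list_id}##{encoded_url}##{parent_web_id}",
    )


entry = item_entry(
    item_id=1,
    list_id="bc79d072-4a4c-4140-bc97-6c9ebec41c87",
    parent_site_url="https://baidev.sharepoint.com/sites/CustomerBug5600",
    parent_web_id="be83fd40-6812-4606-a7e2-c3abf8ae68e4",
    folder_id="<folderId of the site collection>",  # placeholder, not a real value
)
print(asdict(entry))
```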