File Scripts

How to Remove the HTML Tags from Item Body

  • Usage: Unstructured data column

  • Description: Use the HOST.CleanupHtmlTags function to remove the HTML tags from the content before it is indexed.

Remove HTML Tags Before Indexing
' Read the raw bytes of the item body
dim origFile() as byte = HOST.GetFileContent()
' Remove the HTML tags and return the remaining text
dim noHtml as string = HOST.CleanupHtmlTags(origFile)
return noHtml

How to Copy and Rename a File

  • Usage: Unstructured data column

  • Description:

    • Use this function to copy a file for indexing.

    • This function removes the temporary file and fixes the extension.

Copy File for Indexing
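' Path of the source file, read from the FILEPATH column
dim origFile as string = HOST.GetStringValue("FILEPATH")
' Extension without the leading dot
dim ext as string = origFile.Substring(origFile.LastIndexOf(".") + 1)
' Temporary path that uses the same extension
dim newFilePath as string = HOST.GetTempFilePath(ext)
File.Copy(origFile, newFilePath)
return newFilePath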
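
The script above takes the extension with LastIndexOf("."). When a file name contains no dot, LastIndexOf returns -1 and Substring(0) yields the whole path rather than an extension. The following is a minimal standalone sketch of a more defensive extraction that relies on System.IO.Path.GetExtension. The helper name GetExtensionOrDefault and the fallback value "bin" are hypothetical and are not part of the HOST API. Whether HOST.GetTempFilePath accepts such a fallback is an assumption.

```vbnet
Imports System
Imports System.IO

Module ExtensionSketch
    ' Hypothetical helper (not part of the HOST API): returns the extension
    ' without the leading dot, or the fallback when the name has no extension.
    Function GetExtensionOrDefault(filePath As String, fallback As String) As String
        ' Path.GetExtension returns ".docx" style values, or "" when there is none
        Dim ext As String = Path.GetExtension(filePath)
        If String.IsNullOrEmpty(ext) Then Return fallback
        Return ext.TrimStart("."c)
    End Function

    Sub Main()
        Console.WriteLine(GetExtensionOrDefault("c:\test\file.docx", "bin"))
        Console.WriteLine(GetExtensionOrDefault("c:\test\README", "bin"))
    End Sub
End Module
```

Run as a console program, this prints docx and then bin. Inside a script, the result could be passed to HOST.GetTempFilePath in place of the Substring expression.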

How to Add Multiple Files to a ZIP File, Dynamically

  • Usage: Unstructured data column

  • Description:

    • If you have a one-to-many relationship with data to files, you can attach all of the files for crawling when you use this script.

    • Select ZIP type from the list of unstructured options.

Add Multiple Files to a .zip File
' Pull multiple file references; assumes the second query returns all of the files
dim ds as Dataset = HOST.GetDataSet()
dim fieldName as string = "filepath"
dim dsindex as integer = 1
for i as integer = 0 to ds.Tables(dsindex).Rows.Count - 1
    dim fullFileName as string = HOST.MyToString(ds.Tables(dsindex).Rows(i)(fieldName))
    dim shortFileName as string = fullFileName.Substring(fullFileName.LastIndexOf("\") + 1)
    HOST.AddFileToZip(fullFileName, shortFileName)
next
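
The loop above uses the part of each path after the last backslash as the entry name, so two files with the same name in different folders would receive the same entry name. How HOST.AddFileToZip treats a repeated entry name is not documented here, so the following standalone sketch simply assumes that repeats should be avoided. The helper UniqueEntryNames is hypothetical and is not part of the HOST API. It prefixes each repeated name with a counter, and it does not guard against a file that is literally named with such a prefix.

```vbnet
Imports System
Imports System.Collections.Generic

Module ZipEntryNameSketch
    ' Hypothetical helper (not part of the HOST API): derives a short entry
    ' name for each full path and prefixes repeated names with a counter.
    Function UniqueEntryNames(fullPaths As IEnumerable(Of String)) As List(Of String)
        ' Tracks how many times each short name has been seen (case-insensitive)
        Dim seen As New Dictionary(Of String, Integer)(StringComparer.OrdinalIgnoreCase)
        Dim result As New List(Of String)
        For Each fullPath As String In fullPaths
            ' Same extraction as the script: everything after the last backslash
            Dim shortName As String = fullPath.Substring(fullPath.LastIndexOf("\") + 1)
            Dim count As Integer = 0
            If seen.TryGetValue(shortName, count) Then
                seen(shortName) = count + 1
                result.Add(count.ToString() & "_" & shortName)
            Else
                seen(shortName) = 1
                result.Add(shortName)
            End If
        Next
        Return result
    End Function

    Sub Main()
        Dim paths As String() = {"c:\a\report.txt", "c:\b\report.txt", "c:\b\notes.txt"}
        Console.WriteLine(String.Join(", ", UniqueEntryNames(paths)))
    End Sub
End Module
```

Run as a console program, this prints report.txt, 1_report.txt, notes.txt. In a script, each resulting name could be passed as the second argument to HOST.AddFileToZip.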

How to Make a Copy of the File to Filter

  • Usage: Unstructured data column

  • Description: Use this function if the filter locks the file or encounters problems opening it.

    • You can also rename the file.
    • The resulting copy is deleted when the filtering is complete, because its path is returned by the call to HOST.GetTempFilePath.

Make a Copy of File to Filter
dim origFile as string = HOST.GetStringValue("document")
try
    if not String.IsNullOrEmpty(origFile) then
        dim ext as string = origFile.Substring(origFile.LastIndexOf(".") + 1)
        dim newFilePath as string = HOST.GetTempFilePath(ext)
        File.Copy(origFile, newFilePath)
        File.SetLastWriteTime(newFilePath, Date.Now)
        return newFilePath
    else
        return "NOFILE"
    end if
catch x as Exception
    ' If the copy fails, fall back to the original file
    return origFile
end try

How to Create a ZIP File

  • Usage: Unstructured data column

  • Description: Use this code to package several files into one ZIP file. If there is only one file, you can conditionally return that single file instead of a ZIP file.

Create a .zip File
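' Example file list; in practice it is usually built from your data
dim files() as string = {"c:\test\Federation.log","c:\test\file.docx"}
' With a single file, return it directly instead of creating a ZIP file
if files.Length = 1 then return files(0)
dim nfile as string = HOST.GetTempFilePath("zip")
HOST.CreateZIPFile(nfile, files, false)
return nfile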

How to Download a File from a Website

  • Usage: Unstructured data column

  • Description: Retrieve a binary file stream from a website, based on the URL that is provided in a field from your data source.

Retrieve a Binary File Stream
dim fileURL as string = HOST.GetStringValue("FILEURLCOLUMN")
dim fileBytes() as byte
try
    ' Leave the user name blank for anonymous access
    fileBytes = HOST.RunByteRequest(fileURL, "username-leaveblankforanon", "pass")
    dim ext as string = HOST.GetStringValue("FileExtension")
    dim newFilePath as string = HOST.GetTempFilePath(ext)
    HOST.SetFileExtension(ext)
    File.WriteAllBytes(newFilePath, fileBytes)
    return newFilePath
catch efe as Exception
    HOST.WriteTrace("Exception on file retrieve:" + efe.Message)
    return ""
end try