Boundaries settings

Boundaries are the division between records: they define where one record ends and the next record begins; for an explanation see Record boundaries.

CSV, Excel or Database file boundaries

Since database data sources are structured the same way as CSV and Excel files, the options for these file types are identical.

  • Record limit: Defines how many records are displayed in the Data Viewer. This does not affect output production; when generating output, this option is ignored. To disable the limit, use the value 0 (zero).

  • Line limit: Defines the maximum number of detail lines displayed in any detail table. This is useful for files with a high number of detail lines, which can slow down the DataMapper interface. This does not affect output production; when generating output, this option is ignored. To disable the limit, use the value 0 (zero).

  • Trigger: Defines the type of rule that controls when a boundary is set, creating a new record.

    • Record(s) per page: Defines a fixed number of lines in the file that go in each record.

      • Records: The number of lines (rows) in the file to put in each record.

    • On change: Defines a new record when a specific field (Field name) has a new value.

      • Field name: Displays the fields in the top line. The boundaries are set on the selected field name.

    • On script: Defines the boundaries using a custom JavaScript. For more information see Setting boundaries using JavaScript.

    • On field value: Sets a boundary on a specific field value.

      • Field name: Displays the fields in the top line. The value of the selected field is compared with the Expression below to create a new boundary.

      • Expression: Enter the value or Regular Expression to compare the field value to.

      • Use Regular Expression: Treats the Expression as a regular expression instead of static text. For more information on using Regular Expressions (regex), see the Regular-Expressions.info Tutorial.
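
As a rough illustration of how the "Record(s) per page" and "On change" triggers group rows, here is a minimal sketch in plain JavaScript. It does not use the DataMapper API: the helper names splitEvery and splitOnChange, the sample rows and the CustomerID field are all hypothetical, and the grouping logic is an assumption based on the descriptions above.

```javascript
// Hypothetical helper: mimics "Record(s) per page" by putting a fixed
// number of rows in each record.
function splitEvery(rows, perRecord) {
  const records = [];
  for (let i = 0; i < rows.length; i += perRecord) {
    records.push(rows.slice(i, i + perRecord));
  }
  return records;
}

// Hypothetical helper: mimics "On change" by starting a new record
// whenever the value of the given field differs from the previous row.
function splitOnChange(rows, fieldName) {
  const records = [];
  let previous;
  for (const row of rows) {
    const startsNew = records.length === 0 || row[fieldName] !== previous;
    if (startsNew) records.push([]);
    records[records.length - 1].push(row);
    previous = row[fieldName];
  }
  return records;
}

// Sample data (invented for this sketch).
const rows = [
  { CustomerID: "C1", Item: "Pen" },
  { CustomerID: "C1", Item: "Ink" },
  { CustomerID: "C2", Item: "Paper" },
  { CustomerID: "C3", Item: "Stapler" },
  { CustomerID: "C3", Item: "Staples" },
];

const fixedSizes = splitEvery(rows, 2).map((r) => r.length);
const changeSizes = splitOnChange(rows, "CustomerID").map((r) => r.length);
console.log(fixedSizes);  // [ 2, 2, 1 ]
console.log(changeSizes); // [ 2, 1, 2 ]
```

With a fixed size, the rows of customer C2 and C3 end up in the same record; with "On change", each customer gets a record of its own, whatever the number of rows.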

PDF file boundaries

For a PDF file, Boundaries determine how many pages are included in each record. You can set this up in one of three ways: by giving a static number of pages; by checking a specific area on each page for text changes, specific text, or the absence of text; or by using an advanced script.

  • Record limit: Defines how many records are displayed in the Data Viewer. To disable the limit, use the value 0 (zero).

  • Trigger: Defines the type of rule that controls when a boundary is set, creating a new record.

    • On page: Defines a boundary on a static number of pages.

      • Number of pages: Defines how many pages go in each record.

    • On text: Defines a boundary on a specific text comparison.

      • Start coordinates (x,y): Defines the left and top coordinates of the data selection to compare with the text value.

      • Stop coordinates (x,y): Defines the right and bottom coordinates.

      • Use Selection: Select an area in the Data Viewer and click the Use selection button to set the start and stop coordinates to the current data selection.

        Note: In a PDF file, all coordinates are in millimeters.

      • Times condition found: When the boundaries are based on the presence of specific text, you can specify after how many instances of this text the boundary can be effectively defined. For example, if a string is always found on the first and on the last page of a document, you could specify a number of occurrences of 2. This way, there is no need to inspect other items for whether it is on the first page or the last page. Having found the string two times is enough to set the boundary.

      • Pages before/after: Defines the boundary a certain number of pages before (-) or after (+) the current page. This is useful if the text triggering the boundary is not located on the first page of the record.

      • Operator: Selects the type of comparison (for example, "contains").

      • Word to find: Compares the text value with the value in the data source.

      • Match case: Makes the text comparison case sensitive.

    • On script: Defines the boundaries using a custom JavaScript. For more information see Setting boundaries using JavaScript.

    • On all pages: Sets a boundary after the last page, creating one source record.
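
The interaction between Times condition found and Pages before/after can be sketched as follows. This is a simplified model in plain JavaScript, not the DataMapper implementation: the function findBoundaries is hypothetical, each page is reduced to the text found in the selected area, only a "contains" comparison is modelled, and the assumption that the counter resets after each boundary is ours.

```javascript
// Hypothetical model of the "On text" trigger with a "contains" operator.
// pageTexts[i] is the text found in the selected area on page i (0-based).
// Returns the page indexes at which a new record would start; an index
// equal to pageTexts.length means "at the end of the file".
function findBoundaries(pageTexts, word, timesFound, pageOffset, matchCase) {
  const needle = matchCase ? word : word.toLowerCase();
  const starts = [];
  let hits = 0;
  pageTexts.forEach((text, page) => {
    const haystack = matchCase ? text : text.toLowerCase();
    if (haystack.includes(needle)) {
      hits += 1;
      if (hits === timesFound) {
        starts.push(page + pageOffset); // Pages before (-) or after (+)
        hits = 0;                       // assumed: counting restarts
      }
    }
  });
  return starts;
}

// Text on the first page of every document: one hit is enough.
const firstPageOnly = findBoundaries(
  ["Invoice 1001", "", "Invoice 1002", "", "", "Invoice 1003"],
  "invoice", 1, 0, false
);
console.log(firstPageOnly); // [ 0, 2, 5 ]

// Text on the first and last page of every document: wait for two hits,
// then set the boundary one page after the second hit.
const firstAndLast = findBoundaries(
  ["REF-77", "", "REF-77", "REF-78", "REF-78"],
  "REF-", 2, 1, true
);
console.log(firstAndLast); // [ 3, 5 ]
```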

Text file boundaries

For a text file, Boundaries determine how many 'data pages' are included in each record. These don't have to be actual pages, as is the case with PDF files. The data page delimiters are set in the Text file boundaries.

  • Record limit: Defines how many records are displayed in the Data Viewer. This does not affect output production; when generating output, this option is ignored. To disable the limit, use the value 0 (zero).

  • Selection/Text is based on bytes: Select this option for text records with fixed width fields whose length is based on the number of bytes and not the number of characters.

  • Trigger: Defines the type of rule that controls when a boundary is set, creating a new record.

    • On delimiter: Defines a boundary on a static number of data pages.

      • Occurrences: The number of times that the delimiter is encountered before fixing the boundary. For example, if you know that your documents always have four pages delimited by the FF character, you can set the boundaries after every four delimiters.

    • On text: Defines a boundary on a specific text comparison.

      • Location:

        • Selected area:

          • Select the area button: Uses the value of the current data selection as the text value. Making a new selection and clicking on Select the area will redefine the location.

          • Left/Right: Defines where to find the text value in the row.

          • Top/Bottom: Defines the start and end row of the data selection to compare with the text value.

        • Entire width: Ignores the column values and compares using the whole line.

        • Entire height: Ignores the row values and compares using the whole column.

        • Entire page: Compares the text value on the whole page. Only available with contains, not contains, is empty and is not empty operators.

      • Times condition found: When the boundaries are based on the presence of specific text, you can specify after how many instances of this text the boundary can be effectively defined. For example, if a string is always found on the first and on the last page of a document, you could specify a number of occurrences of 2. This way, there is no need to inspect other items for whether it is on the first page or the last page. Having found the string two times is enough to set the boundary.

      • Delimiters before/after: Defines the boundary a certain number of data pages before or after the current data page. This is useful if the text triggering the boundary is not located on the first data page of the record.

      • Operator: Selects the type of comparison (for example, "contains").

      • Word to find: Compares the text value with the value in the data source.

      • Use selected text button: Copies the text in the current data selection to the Word to find field, to be used as the text to compare with.

      • Match case: Makes the text comparison case sensitive.

    • On script: Defines the boundaries using a custom JavaScript. For more information see Setting boundaries using JavaScript.
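
To make the On delimiter trigger and its Occurrences setting more concrete, the sketch below splits a text on the FF (form feed) character and groups every four data pages into one record, as in the example above. It is plain JavaScript rather than DataMapper code; the function splitOnDelimiter and the sample text are hypothetical, and the sample assumes that the file does not end with a trailing delimiter.

```javascript
// Hypothetical helper: splits the text into data pages on the delimiter,
// then sets a boundary after every `occurrences` data pages.
function splitOnDelimiter(text, delimiter, occurrences) {
  const dataPages = text.split(delimiter);
  const records = [];
  for (let i = 0; i < dataPages.length; i += occurrences) {
    records.push(dataPages.slice(i, i + occurrences));
  }
  return records;
}

const FF = "\f"; // form feed character
// Eight data pages (invented sample content), separated by FF.
const text = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"].join(FF);

const records = splitOnDelimiter(text, FF, 4);
console.log(records.length);        // 2
console.log(records[0].join(", ")); // p1, p2, p3, p4
```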

XML file boundaries

The delimiter for an XML file is a node. The Boundaries determine how many of those nodes go in one record. This can be a specific number, or a variable number if the boundary is to be set when the content of a specific field or attribute within a node changes (for example when the invoice_number field changes in the invoice node).

  • Record limit: Defines how many records are displayed in the Data Viewer. This does not affect output production; when generating output, this option is ignored. To disable the limit, use the value 0 (zero).

  • Trigger: Defines the type of rule that controls when a boundary is set, creating a new record.

    • On Element: Defines a new record on each new instance of the XML element selected in the Input Data settings.

      • Occurrences: The number of times that the element is encountered, as a direct descendant of the same parent element, before the boundary is set.

    • On Change: Defines a new record when a specific field or attribute in the XML element has a new value.

      • Field: Displays the fields and (optionally) attributes in the XML element. The value of the selected field determines the new boundaries.

      • Also extract element attributes: Check this option to include attribute values in the list of content items that can be used to trigger a boundary.

JSON file boundaries

The delimiter for a JSON file is an object or array inside the selected parent element (see JSON file boundaries). The Boundaries determine how many of them go in one record.

Note: Only arrays and objects can be seen as a record. It is not possible to split the JSON between key-value pairs.

  • Record limit: Defines how many records are displayed in the Data Viewer. This does not affect output production; when generating output, this option is ignored. To disable the limit, use the value 0 (zero). The default value is 200.

  • Trigger: Defines the type of rule that controls when a boundary is set, creating a new record.

    • On element: Creates a new record in the output for each object or array, or - if you set a higher number of occurrences - after every n-th object or array in the parent element.

      • Occurrences: The number of times that an element is encountered in the parent element before fixing the boundary.

    • On change: Creates a new record each time the value in a certain key-value pair changes.

      • Field: Displays the keys of key-value pairs that exist at the root of direct child elements of the selected parent element. The value of the selected field determines the new boundaries.
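
The difference between the two JSON triggers can be sketched as follows, again in plain JavaScript and outside of the DataMapper. The sample document, the invoices parent element, the customer key and the helpers recordsOnElement and recordsOnChange are all hypothetical; the code only models the behavior described above.

```javascript
// Sample JSON (invented). The array "invoices" plays the role of the
// selected parent element; each object in it is a candidate record.
const source = JSON.parse(`{
  "invoices": [
    { "customer": "A", "total": 10 },
    { "customer": "A", "total": 5 },
    { "customer": "B", "total": 7 },
    { "customer": "B", "total": 3 },
    { "customer": "B", "total": 9 }
  ]
}`);
const children = source.invoices;

// Hypothetical helper: "On element" with a number of occurrences.
function recordsOnElement(elements, occurrences) {
  const records = [];
  for (let i = 0; i < elements.length; i += occurrences) {
    records.push(elements.slice(i, i + occurrences));
  }
  return records;
}

// Hypothetical helper: "On change" on a key at the root of each child.
function recordsOnChange(elements, key) {
  const records = [];
  elements.forEach((element, index) => {
    const changed = index === 0 || element[key] !== elements[index - 1][key];
    if (changed) records.push([]);
    records[records.length - 1].push(element);
  });
  return records;
}

const perElement = recordsOnElement(children, 2).map((r) => r.length);
const perCustomer = recordsOnChange(children, "customer").map((r) => r.length);
console.log(perElement);  // [ 2, 2, 1 ]
console.log(perCustomer); // [ 2, 3 ]
```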