10/29/2019

FAQ-1506: Support For Files Larger Than 2GB

Description

Can PlanetPress Suite handle files that are larger than 2GB in size? What are some other memory limitations of PlanetPress Suite?

Contents

The answer to this question is not a simple one, as the full explanation involves technical details that go beyond the scope of our documentation, FAQs and support services. However, here are some basic rules of thumb to keep in mind when dealing with large files.

Seeking vs. Contiguous

The 2GB limitation that PlanetPress Suite may encounter only occurs in specific situations. These situations all have one thing in common: PlanetPress Suite needs random access to a file, i.e. it needs to navigate back and forth within the contents of the file. Conversely, when it reads from or writes to a file sequentially (i.e. from beginning to end without ever navigating backward), it can do so on files of any size, up to your file system's limit.
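As an illustration of sequential access in general (not of PlanetPress Suite's internals), the following minimal Python sketch processes a stream strictly from beginning to end, one chunk at a time. The function name and chunk size are hypothetical choices made for this example. Because it never navigates backward and never holds more than one chunk in memory, the total size of the input does not matter.

```python
import io


def count_newlines_sequentially(stream, chunk_size=1024 * 1024):
    """Count newline bytes by reading forward only, one chunk at a time."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object means end of stream
            break
        total += chunk.count(b"\n")
    return total


# A small in-memory stream stands in for a large file opened in binary mode.
sample = io.BytesIO(b"line 1\nline 2\nline 3\n")
print(count_newlines_sequentially(sample, chunk_size=4))
```

The same function would work on a file opened with open(path, "rb"), since it only ever calls read() and never asks for a position inside the file.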

Operations that require seeking back and forth in a file larger than 2GB may fail because PlanetPress Suite can only use offsets (addresses that identify the location of any item inside the file) smaller than 2GB. Those offsets are required when navigating inside a file, but not when reading and writing sequentially, because each read or write operation simply continues from the last location that was accessed in the file.
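A 2GB ceiling on offsets is typical of software that stores file positions in a signed 32-bit integer. This article does not state how PlanetPress Suite stores its offsets, so treat that as an assumption. Under that assumption, the minimal Python sketch below shows why a position at or beyond 2GB cannot be represented; the helper name is hypothetical.

```python
import struct

TWO_GB = 2**31  # 2,147,483,648 bytes


def fits_in_signed_32bit(offset):
    """Return True if offset can be stored in a signed 32-bit integer."""
    try:
        struct.pack("<i", offset)  # "<i" is a little-endian signed 32-bit int
        return True
    except struct.error:
        return False


print(fits_in_signed_32bit(TWO_GB - 1))  # True: last addressable position
print(fits_in_signed_32bit(TWO_GB))      # False: one byte too far
```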

Therefore, not all actions are subject to this 2GB limitation. Some actions, however, are known to be affected, including but not limited to:

  • Add & Remove Text (PlanetPress Workflow Task): This task will yield undefined results with files larger than 2GB.
  • Windows Queue Output: Log file may not correctly count printed pages.
  • Folder Output: Undefined results when using Concatenate on a folder output, except when writing to a PDF.

Some other processes, tasks and operations may also suffer from these kinds of limitations, such as file splitters, search & replace, etc.
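One way to guard against these failures is to check a file's size before handing it to an operation that needs random access. The following Python sketch is a hypothetical pre-flight check, not a PlanetPress Suite feature. The function name and the exact threshold (the largest signed 32-bit value) are assumptions made for illustration.

```python
import os
import tempfile

# Assumed limit: the largest offset a signed 32-bit integer can hold.
SEEK_LIMIT_BYTES = 2**31 - 1


def exceeds_seek_limit(path, limit=SEEK_LIMIT_BYTES):
    """Return True if the file at path is larger than the assumed limit."""
    return os.path.getsize(path) > limit


# Demonstration with a 10-byte temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10)
    demo_path = tmp.name

print(exceeds_seek_limit(demo_path))           # False: far below the limit
print(exceeds_seek_limit(demo_path, limit=5))  # True: 10 bytes exceeds 5
os.remove(demo_path)
```

A file that fails this check could be split into smaller files, or routed through sequential-only steps, before it reaches a task known to be affected.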

PDFs are a different beast

In the case of PDFs, the limitation is quite different. While a PDF can theoretically have an unlimited number of pages and a size of up to 10GB, much of the work done on a PDF file takes place in the computer's RAM, and the amount of RAM an application can use is limited by 32-bit architecture. This means that PlanetPress Suite, which is a 32-bit application, is more likely to run out of RAM while working with a PDF file than it is to encounter the 2GB limitation.

To put it in the words of Adobe:

Memory limits cannot be characterized as precisely as architectural limits because the amount of available memory and the ways in which it is allocated vary from one product to another. Memory is automatically reallocated from one use to another when necessary: when more memory is needed for a particular purpose, it can be taken from memory allocated to another purpose if that memory is currently unused or its use is nonessential (a cache, for example). Also, data is often saved to a temporary file when memory is limited. Because of this behavior, it is not possible to state limits for such items as the number of pages in a document, number of text annotations or hypertext links on a page, number of graphics objects on a page, or number of fonts on a page or in a document.

The limitation itself is therefore not actually the 2GB file size, even though problems may seem to appear around that size.

XML is yet another case

XML files used in PlanetPress Suite (mostly as data files) have a limitation that is somewhat different and much lower than 2GB. While this isn't directly related to the 2GB file size question, it does answer the question about other memory limitations.

PlanetPress Suite cannot read XML files sequentially; it always loads them entirely into memory. Even if you have over 2GB of memory, however, it is not possible to load an XML file that large. Without going too deep into technical details, the realistic maximum size for XML files is around 100 to 150MB. This is due to overhead, memory limitations and fragmentation.
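If an XML data file is too large, one possible workaround is to split it into smaller files before PlanetPress Suite ever sees it. The Python sketch below is a hypothetical preprocessing step, not part of PlanetPress Suite. It streams through the XML with the standard library's iterparse and groups records into batches. It assumes the records are direct children of the root element; the function name, the "job" tag and the batch size are made up for this example.

```python
import io
import xml.etree.ElementTree as ET


def iter_record_batches(source, record_tag, batch_size):
    """Yield lists of serialized record elements, batch_size at a time.

    Assumes every record is a direct child of the root element.
    """
    batch = []
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            elem.tail = None  # ignore whitespace that follows the record
            batch.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # discard processed records to keep memory flat
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:  # last, possibly shorter, batch
        yield batch


# A tiny in-memory document stands in for a large XML data file.
sample = io.BytesIO(b"<jobs><job id='1'/><job id='2'/><job id='3'/></jobs>")
for batch in iter_record_batches(sample, "job", batch_size=2):
    print(len(batch), "record(s):", batch)
```

Each batch could then be wrapped in its own root element and written to a separate file, with the batch size chosen so that every output file stays well under the 100 to 150MB range mentioned above.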