When you scan paper records with optical character recognition (OCR) software, PDF is a great output format to use. OCR converts the text in the document image into text that you can edit, add to, and search, and that you can reuse in other applications, such as Microsoft Word and Google Docs.
However, if your primary concern when scanning documents is archiving them for posterity's sake, then you might consider using the PDF/A format, which is designed precisely for that.
PDF/A Preserves Records for the Long Term
The trick with archiving documents is that we don’t know how technology will change. The file formats and storage methods in use today may not be relevant down the road.
While there are no guarantees, PDF/A is the best bet to store your document so it can be viewed in its original format. It's the standard that many online archives and libraries (including the U.S. Library of Congress) are adopting for the preservation of their digital records.
PDF/A is an ISO-standardized subset of the Portable Document Format (ISO 19005), and it differs from regular PDF in a few ways. Since archived documents are supposed to be true reflections of the originals, PDF/A is meant for files whose content and text stay fixed rather than being modified. However, the text can still be searchable, so it is easy to find information and copy and paste what you need.
PDF/A Benefits
To help ensure that archived documents remain accessible in years to come, PDF/A adds some features and imposes some limitations and protections that guard against obsolescence. The idea is to make records as self-contained and device-independent as possible. Everything required to display the document exactly as the original, in every instance, is contained within the PDF/A file itself, including fonts, colour profiles, images, and more.
Among the features:
- Support for embedded fonts (allowing unlimited, universal rendering) instead of font linking
- Colourspaces defined in a device-independent manner
- A user interface for reading embedded annotations
- Mandated use of standards-based metadata
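The standards-based metadata in the last item is stored as an XMP packet, which is also where a PDF/A file declares its own version and conformance level. The sketch below is a minimal illustration, assuming you have already extracted the XMP packet as a string (the extraction step is not shown). The function name `pdfa_level` and the sample packet are hypothetical; the `pdfaid:part` and `pdfaid:conformance` properties are the identifiers PDF/A uses for this purpose.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XMP/RDF and by the PDF/A identification schema.
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def pdfa_level(xmp_packet):
    """Return a label such as 'PDF/A-1b' if the packet declares PDF/A, else None."""
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # The two properties may appear as attributes or as child elements.
        part = desc.get(f"{{{PDFAID_NS}}}part") or desc.findtext(f"{{{PDFAID_NS}}}part")
        conf = desc.get(f"{{{PDFAID_NS}}}conformance") or desc.findtext(
            f"{{{PDFAID_NS}}}conformance"
        )
        if part:
            return f"PDF/A-{part.strip()}{(conf or '').strip().lower()}"
    return None


# Hypothetical sample packet for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(pdfa_level(SAMPLE_XMP))
```

A declaration like this only states what the file claims to be. Confirming that a file really meets the standard requires a dedicated PDF/A validator.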
To ensure long-term document viability, PDF/A doesn't allow:
- Audio and video content
- Encryption
- JavaScript and executable file launches
- Font linking
- Bookmarks
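Several of the restrictions above leave visible traces in a PDF file, so a quick pre-check before archiving is possible. The sketch below is a rough heuristic only, not a validator: it scans the raw bytes for PDF name tokens commonly associated with encryption, JavaScript, launch actions, and sound or movie content. The function name, the marker table, and the sample bytes are all assumptions made for illustration, and the scan will miss anything stored inside compressed object streams.

```python
import re

# Assumed indicator tokens for some features the list above excludes.
# This table is illustrative and far from exhaustive.
FORBIDDEN_MARKERS = {
    "encryption": rb"/Encrypt\b",
    "JavaScript": rb"/JavaScript\b|/JS\b",
    "executable launch": rb"/Launch\b",
    "audio or video": rb"/Sound\b|/Movie\b",
}


def find_forbidden_features(pdf_bytes):
    """Return labels of features whose marker tokens appear in the raw bytes."""
    return [
        label
        for label, pattern in FORBIDDEN_MARKERS.items()
        if re.search(pattern, pdf_bytes)
    ]


# Hypothetical fragments standing in for real files.
SUSPECT = (
    b"%PDF-1.4\n1 0 obj\n"
    b"<< /Type /Action /S /JavaScript /JS (app.alert('hi');) >>\nendobj\n"
    b"trailer\n<< /Encrypt 5 0 R >>\n"
)
CLEAN = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(find_forbidden_features(SUSPECT))
print(find_forbidden_features(CLEAN))
```

An empty result does not prove that a file is PDF/A compliant. It only means none of these particular tokens were found in uncompressed form.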
Other PDF/A Considerations
As technology progresses and changes, some users might fear that future releases of PDF/A won't be backward compatible with older ones. That is not allowed under the ISO standards, which require new versions of PDF/A to be able to display all files made in previous releases.
When it comes to file size, PDF/A files are larger than regular PDFs because of their embedded fonts, colour profiles, and added features, but the difference is usually marginal. In a few rare instances, certain colour profiles can lead to much larger file sizes.
Note that PDF/A-1 comes in two conformance levels. PDF/A-1b is the basic level: it ensures the document looks the same whenever it is displayed, and it does support text search, but anomalies with character coding may affect the results. If being able to search text is of paramount importance to your business, then the stricter PDF/A-1a level, which requires reliable Unicode character mapping, is the better choice.
At MES Hybrid Document Systems, more than 90% of the documents we scan on behalf of our clients are saved in PDF/A format, to protect their long-term viability. To find out how we can help your office go paperless, call us today for a free document scanning quote.