Zarvan JBIG2 Compression Engine

Overview

JBIG2 is an image compression standard for bi-level (black and white) images. Designed for use in both Internet and desktop imaging applications, JBIG2 offers certain advantages over traditional bi-tonal compression schemes. One such advantage is that it supports both lossless and lossy compression of scanned images.

In its lossless mode, JBIG2 can generate image files that are 3 to 5 times smaller than the corresponding TIFF Group 4 images. The lossy mode of JBIG2 can achieve much higher compression without significant loss of image quality. Furthermore, by compressing across pages in a multi-page TIFF file, size reductions of 10-fold or more are achievable.
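
The size ratios quoted above translate directly into storage savings: a file that is r times smaller frees a fraction 1 - 1/r of the original space. The following is a minimal C++ sketch of that arithmetic. The function name is illustrative only and is not part of any Zarvan API.

```cpp
#include <cassert>

// Fraction of the original storage that is freed when a file becomes
// `ratio` times smaller (for example, ratio = 10 for a 10-fold reduction).
// Hypothetical helper for illustration; not a Zarvan library function.
double storageSavedFraction(double ratio) {
    assert(ratio >= 1.0);      // a ratio below 1 would mean the file grew
    return 1.0 - 1.0 / ratio;  // e.g. 1 - 1/10 = 0.9
}
```

A ratio of 3 gives a saving of about 67%, a ratio of 5 gives 80%, and a ratio of 10 gives 90%.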

JBIG2 compression was incorporated into the PDF specification starting with PDF 1.4 (Acrobat 5). Most JBIG2 compressed images are embedded inside PDF files for easier viewing with the free Acrobat Reader.

JBIG2 Encoder

The JBIG2 specification (available as a 1.12MB PDF file) was published in 2000 as the international standard ITU-T T.88, and in 2001 as ISO/IEC 14492. The specification deals only with decoding JBIG2 images. Many aspects of the encoding were left open as implementation choices. Also, unlike traditional formats and codecs such as TIFF or JPEG, JBIG2 offers several coding schemes and templates to deal with a variety of pixel patterns.

The use of symbol dictionaries and symbol matching in JBIG2 enables very effective encoding of documents that contain recurring symbols, making JBIG2 ideal for compressing scanned documents. The compression process starts by segmenting a page into symbol classes and building symbol dictionaries through pattern matching and similarity analysis. The symbol dictionaries are stored in the document together with a content stream that encodes positional information for each instance of a symbol. In lossy compression, minor differences among symbols in a similarity class are thrown away. In lossless encoding, the minor differences are encoded as refinements.
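
The dictionary-building step described above can be sketched in a few lines of C++. This is a simplified illustration, not the Zarvan encoder's algorithm: it assumes glyphs have already been segmented from the page, compares only glyphs of identical size, and uses a plain pixel-difference count (Hamming distance) with a hypothetical threshold `maxDiff` as the similarity test. All type and function names are invented for the example. It models the lossy case, where a matched glyph is replaced by its dictionary symbol and the small differences are discarded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A bi-level glyph bitmap: one byte per pixel (0 = white, 1 = black).
struct Bitmap {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> pixels;  // row-major, width * height entries
};

// One placement of a dictionary symbol on the page.
struct Instance {
    std::size_t symbolIndex;  // index into the symbol dictionary
    int x;                    // left edge of the glyph on the page
    int y;                    // top edge of the glyph on the page
};

// Number of pixels that differ between two equally sized bitmaps.
std::size_t hammingDistance(const Bitmap& a, const Bitmap& b) {
    assert(a.width == b.width && a.height == b.height);
    std::size_t diff = 0;
    for (std::size_t i = 0; i < a.pixels.size(); ++i) {
        if (a.pixels[i] != b.pixels[i]) ++diff;
    }
    return diff;
}

// Hypothetical classifier: builds a symbol dictionary and a list of
// symbol placements from a sequence of segmented glyphs.
class SymbolClassifier {
public:
    explicit SymbolClassifier(std::size_t maxDiff) : maxDiff_(maxDiff) {}

    // Returns the index of the first dictionary symbol within maxDiff
    // pixels of the glyph; adds the glyph as a new symbol if none matches.
    std::size_t classify(const Bitmap& glyph) {
        for (std::size_t i = 0; i < dictionary_.size(); ++i) {
            const Bitmap& s = dictionary_[i];
            if (s.width == glyph.width && s.height == glyph.height &&
                hammingDistance(s, glyph) <= maxDiff_) {
                return i;
            }
        }
        dictionary_.push_back(glyph);
        return dictionary_.size() - 1;
    }

    // Classifies the glyph and records where it appears on the page.
    void place(const Bitmap& glyph, int x, int y) {
        Instance inst = {classify(glyph), x, y};
        instances_.push_back(inst);
    }

    const std::vector<Bitmap>& dictionary() const { return dictionary_; }
    const std::vector<Instance>& instances() const { return instances_; }

private:
    std::size_t maxDiff_;
    std::vector<Bitmap> dictionary_;
    std::vector<Instance> instances_;
};
```

After every glyph on a page has been passed to `place`, `dictionary()` holds one bitmap per symbol class and `instances()` holds the positional information. A lossless encoder would additionally keep the per-glyph differences from the matched symbol so they can be encoded as refinements.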

The quality of the compressed JBIG2 image and the level of compression are largely determined by the choice of algorithms and heuristics that go into symbol classification and pattern matching. As a result, different JBIG2 encoder implementations will generally produce different compressed files from the same input.

The use of JBIG2 compression was made popular by the inclusion of JBIG2 filters in Acrobat 5. Unfortunately, the Acrobat decoder cannot handle the full range of compression options envisioned in the JBIG2 specification.

The Zarvan JBIG2 encoder is written in portable C++ and produces JBIG2 streams that are compatible with Adobe Acrobat. Certain advanced features (such as non-standard adaptive templates for arithmetic encoding) are deliberately not used, in order to maintain this compatibility.

JBIG2 Encoder Features:
Cross-platform object oriented C++ code
Creates both multi-page JBIG2 (*.jb2) and linearized PDF files
Compresses across pages in a multi-page document
Virtual interfaces are callable from many languages
Compresses in both lossless and lossy modes
Encodes from any file or stream
Integrated with Zarvan PDF Library for real-time document composition
Integrated with Alfresco ECM

JBIG2 Decoder

Zarvan JBIG2 Decoder is a full implementation of the JBIG2 specification as published by the standards committee. The decoder was also written in portable C++ and is callable from a number of programming languages.

For most applications, JBIG2 compressed images are embedded inside PDF documents and are decoded by the Acrobat viewer. The Zarvan JBIG2 decoder is integrated with the Zarvan PDF Library and can extract and decode JBIG2 image streams from PDF files. It is also possible to convert these streams to other formats, such as Fax Group 4. A potential application of the decoder is the conversion of existing PDF files into PDF/A, the standardized subset of the PDF specification suitable for long-term archiving.

JBIG2 Decoder Features:
Cross-platform object oriented C++ code
Can decode JBIG2 (*.jb2) files and PDF streams
Can generate both sequential and random-access JBIG2 files
Virtual interfaces are callable from many languages
Arithmetic, Huffman and MMR decoders
Can decode Generic, Text and Halftone regions
Supports standard and user defined Huffman tables
Supports standard and user defined adaptive templates
Supports symbol refinement
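
The sequential and random-access organizations mentioned in the list above are distinguished by a flags byte in the header of a standalone JBIG2 file. The sketch below reads that header. It assumes the layout given in the file format annex of ITU-T T.88: an 8-byte signature, one flags byte, and a 4-byte big-endian page count that is present only when the count is known. The struct and function names are illustrative and are not part of the Zarvan decoder's interface. JBIG2 streams embedded in PDF files carry no such file header, so this applies to standalone files only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fields of a standalone JBIG2 file header (hypothetical struct).
struct Jbig2FileHeader {
    bool valid;               // signature matched and header was complete
    bool sequential;          // true = sequential, false = random-access
    bool pageCountKnown;      // false when the encoder left it unspecified
    std::uint32_t pageCount;  // meaningful only if pageCountKnown is true
};

// Parses the header at the start of `data`. Returns valid == false if the
// signature does not match or the buffer is too short.
Jbig2FileHeader parseJbig2FileHeader(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    static const std::uint8_t kSignature[8] = {0x97, 0x4A, 0x42, 0x32,
                                               0x0D, 0x0A, 0x1A, 0x0A};
    Jbig2FileHeader h = {false, false, false, 0};

    // Need the 8-byte signature plus the flags byte.
    if (size < 9 || std::memcmp(data, kSignature, 8) != 0) return h;

    const std::uint8_t flags = data[8];
    h.sequential = (flags & 0x01) != 0;      // bit 0: file organization
    h.pageCountKnown = (flags & 0x02) == 0;  // bit 1 set = count unknown

    if (h.pageCountKnown) {
        if (size < 13) return h;  // truncated: page count field is missing
        h.pageCount = (static_cast<std::uint32_t>(data[9]) << 24) |
                      (static_cast<std::uint32_t>(data[10]) << 16) |
                      (static_cast<std::uint32_t>(data[11]) << 8) |
                      static_cast<std::uint32_t>(data[12]);
    }
    h.valid = true;
    return h;
}
```

A caller would typically read the first 13 bytes of a *.jb2 file, pass them to `parseJbig2FileHeader`, and use the `sequential` field to decide how to locate the segment data that follows.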

Business Benefits

The JBIG2 compression standard can enable many businesses to streamline their document-centric operations. From reduced storage requirements to fast and secure serving of documents on the Internet, JBIG2 can provide immediate improvements in document-related processes and lead to significant ROI in the short term.

  • Reduced Storage Costs

    JBIG2 compressed documents are typically 10 times smaller than the corresponding multi-page TIFF G4 image files. Batch conversion of existing TIFF files to JBIG2 compressed documents can therefore reclaim about 90% of the storage they occupy. In addition to saving on storage costs, smaller files are also easier and faster to distribute, back up or archive.

  • Faster Downloads

    Smaller files are faster to download. A JBIG2 compressed document can be downloaded about 10 times faster than the corresponding TIFF file. Also, if a JBIG2 image is embedded inside a linearized PDF document, the document is further optimized for on-line viewing. The Acrobat viewer can open such documents in streaming mode, downloading only the parts that are required to view each page. For linearized PDF documents, the time to view each page becomes independent of the number of pages in the document.

  • Content Management

    Document imaging is a significant component of many ECM systems. Traditionally, most scanned image files were compressed using one of the Fax compression schemes and stored in TIFF files. Within the ECM system, scanned documents may be annotated, distributed for review and comment, or published on the Web portal. It is also often the case that the TIFF versions of the documents need to be preserved and archived. TIFF is one of the legally recognized electronic forms of a scanned document.

    Zarvan Technologies has created a unique extension to the Alfresco ECM that provides the option of preserving the TIFF documents for archiving while seamlessly creating a JBIG2 compressed PDF version of each document for use within the ECM system. The Zarvan Content Transform for Alfresco integrates easily with any Alfresco ECM and provides real-time optimization of scanned images as they are imported or scanned into the system. The resulting PDF documents are encrypted and optimized for web access. They are also typically 10 times smaller than the corresponding TIFF files.

  • Publishing & Distribution

    Businesses that are document-centric, such as legal or insurance companies, deal with large numbers of electronic files. Many of these documents are scanned into multi-page TIFF files. A typical TIFF file of a few hundred pages is several megabytes in size. These large files are inherently insecure and difficult to distribute, whether by publishing on the Web or via e-mail. Many e-mail accounts have limits on attachment sizes and will bounce e-mails that exceed them. Also, there is no universal, platform-agnostic viewer for TIFF images. JBIG2 technology provides a mechanism for overcoming these problems. It converts the TIFF file into a standard compressed, secured and web-optimized PDF document suitable for publication and distribution.

  • Downloads

    Evaluation copies of pre-built applications and a Software Development Kit are available upon request. Please send an e-mail to info@zarvantech.com.
