Lossy Compressed Image Formats Study

Mozilla Corporation, October 2013

Introduction

This study compares the compression performance of four different image formats: JPEG, JPEG XR, WebP, and HEVC-MSP. The latter three formats were chosen because they are frequently discussed as possible JPEG successors.

This study addresses compression performance only. Other technical, legal, and market factors that might be considered when evaluating codecs are outside its scope.

Quality Comparison Algorithms

We chose to test with four algorithms:

  1. Y-SSIM
  2. RGB-SSIM
  3. IW-SSIM
  4. PSNR-HVS-M

All of these algorithms compare two images and return a number indicating the degree to which the second image is similar to the first. In all cases, no matter what the scale, higher numbers indicate a higher degree of similarity.

It is unclear which algorithm best matches human visual perception, so we tested with four of the most respected algorithms.
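As a rough illustration of what such a metric computes, the sketch below applies the SSIM formula over a single global window of luma samples. It is a simplification and not the tooling used in this study: real SSIM implementations average the statistic over many local windows. The function name `global_ssim` is hypothetical, and the constants 0.01 and 0.03 are the customary defaults from the SSIM literature.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equal-length sequences of luma samples.

    Simplified sketch: real SSIM averages this statistic over local windows.
    Returns 1.0 for identical inputs; lower values mean less similarity.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and the same length")

    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term

    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Comparing an image with itself yields 1.0, and any distortion pulls the score below that, which matches the "higher is more similar" convention described above.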

Image Sets

  1. Lenna: Widely used Lenna image, a 512x512 PNG.
  2. Kodak: 24 PNG images from the Kodak Lossless True Color Image Suite.
  3. Tecnick: 100 images from Tecnick's public test images. We used the original-size RGB color images.

We had planned to include a set of very large images (~20 megapixels each), but some encoders had issues with them. Because this prevented a full comparison, we cut that image set from this study. Encoder developers were notified of any problems found.

Methodology

All evaluation results should be easily reproducible using publicly available tools.

MATLAB is the only non-free (as in beer) software used. We recommend tweaking the test harness to use GNU Octave if you would like to test without access to MATLAB.

The following software is used to generate results for this study:

PNG test images are converted to CCIR 601 full-range Y'CbCr 4:2:0, which is then fed directly into the encoders. To convert back to PNG for quality scoring (e.g. SSIM), we decode to Y'CbCr 4:2:0 and then encode that to PNG. Doing this consistently lets us avoid the pre- and post-processing done by production encoders and decoders, so that we test the encoding algorithms themselves as closely as possible. Direct encoding and decoding for JPEG, JPEG XR, and WebP is done via custom encoder and decoder programs (source code on GitHub with the testing scripts), which call directly into the encoding and decoding APIs. The HEVC-MSP encoder and decoder accept and output Y'CbCr 4:2:0 directly, so no custom program is necessary.
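The conversion step can be sketched as follows. This is a minimal illustration, not the study's harness: it uses the full-range 601 coefficients as published in the JFIF specification, and it assumes chroma is subsampled by averaging each 2x2 block, which may differ from the filter the actual tools use. The function names are hypothetical.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycbcr_full(r, g, b):
    """Convert one 8-bit RGB pixel to full-range 601 Y'CbCr (JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (clamp8(y), clamp8(cb), clamp8(cr))


def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    Assumption: width and height are even. `plane` is a list of rows.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[row][col] + plane[row][col + 1]
             + plane[row + 1][col] + plane[row + 1][col + 1] + 2) // 4
            for col in range(0, width, 2)
        ]
        for row in range(0, height, 2)
    ]
```

In 4:2:0 the luma plane keeps its full resolution while each chroma plane is stored at half the width and half the height, so only the Cb and Cr planes would be passed through `subsample_420`.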

HEVC-MSP files are penalized 80 bytes per image file because HEVC-MSP is just a bitstream with no container. This penalty approximates the size of container data.

The algorithm is as follows, where F is the format being evaluated and Q is a JPEG quality level:

  1. Compress source PNG image to JPEG at quality Q, with Y'CbCr 4:2:0 as the intermediate.
  2. Convert JPEG back to PNG using Y'CbCr 4:2:0 as the intermediate.
  3. Record quality score between the source PNG and the PNG produced from the JPEG, as well as the JPEG's file size.
  4. Perform a binary search of the target format's quality range, with interpolation, to find the file size of the image in format F that matches the JPEG quality score. The same process used for JPEG is used to compress and compare images in the target format F.
  5. Calculate the file size ratio for format F to JPEG.
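Step 4 can be sketched as below. This is an illustration under stated assumptions rather than the study's actual script: `encode_and_score` is a hypothetical callback that encodes at an integer quality setting and returns a (quality score, file size) pair, the score is assumed to be non-decreasing as the setting increases, and linear interpolation between the two bracketing settings is assumed. For an encoder whose parameter runs the other way (lower value means higher quality), the callback would need to invert it.

```python
def size_at_score(encode_and_score, target_score, q_min, q_max):
    """Estimate the file size at which a format reaches `target_score`.

    encode_and_score(q) -> (score, size_in_bytes); hypothetical callback.
    Assumes score is non-decreasing in q over [q_min, q_max].
    """
    lo, hi = q_min, q_max
    lo_score, lo_size = encode_and_score(lo)
    hi_score, hi_size = encode_and_score(hi)

    # Target outside the reachable range: return the nearest endpoint.
    if target_score <= lo_score:
        return float(lo_size)
    if target_score >= hi_score:
        return float(hi_size)

    # Invariant: lo_score < target_score <= hi_score.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        mid_score, mid_size = encode_and_score(mid)
        if mid_score < target_score:
            lo, lo_score, lo_size = mid, mid_score, mid_size
        else:
            hi, hi_score, hi_size = mid, mid_score, mid_size

    # Linearly interpolate the size between the two adjacent settings.
    fraction = (target_score - lo_score) / (hi_score - lo_score)
    return lo_size + fraction * (hi_size - lo_size)
```

Interpolation matters because quality settings are discrete: the target score usually falls between two adjacent settings, and taking either one alone would bias the size estimate.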

For image sets that include multiple images, the reported result is the arithmetic mean over the images in the set.
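The size bookkeeping can be sketched as follows. The 80-byte HEVC-MSP penalty comes from the text above. The text does not say whether the mean is taken over per-image ratios or over file sizes, so the sketch assumes a mean of per-image ratios. All names are hypothetical.

```python
HEVC_MSP_CONTAINER_PENALTY = 80  # bytes added per HEVC-MSP file, per the study


def effective_size(format_name, raw_bytes):
    """Return the file size used for comparison, including any penalty."""
    if format_name == "HEVC-MSP":
        return raw_bytes + HEVC_MSP_CONTAINER_PENALTY
    return raw_bytes


def mean_size_ratio(size_pairs):
    """Average the per-image ratio of format size to JPEG size.

    size_pairs: list of (format_bytes, jpeg_bytes), one pair per image.
    Assumption: the set result is the mean of per-image ratios.
    """
    if not size_pairs:
        raise ValueError("need at least one image")
    ratios = [format_bytes / jpeg_bytes for format_bytes, jpeg_bytes in size_pairs]
    return sum(ratios) / len(ratios)
```

A result below 1.0 means the format needed fewer bytes than JPEG, on average, to reach the same quality score.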

Note: Video formats such as VP8 and HEVC typically use 'studio swing' Y'CbCr, with a restricted range of 16 to 235 instead of the full range of 0 to 255. When working with RGB data, the scaling for studio swing is accomplished as part of the colorspace conversion process. Some image formats derived from video formats, such as WebP, inherit video's conventional range in their common RGB conversions. In our study we adopted a methodology that uses identical colorspace conversion for all formats, because the objective metrics were developed against greyscale images. While these metrics correlate well with perception in their intended applications, they are known to exaggerate the perceptual impact of small brightness or contrast shifts, which can be caused by differences in colorspace conversion. As a result, the study does not consider the effect of colorspace or range differences that would typically be found in production. Manual visual spot checking did not suggest the conversion had a large effect on perceptual quality.
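To make the range difference concrete, the sketch below maps 8-bit luma between full range (0 to 255) and studio swing (16 to 235) with a plain linear scale. It is illustrative only and was not part of the study's pipeline. The function names are hypothetical, and chroma, which conventionally uses a slightly different studio range, is left out.

```python
def full_to_studio_luma(y_full):
    """Map full-range luma (0..255) to studio-swing luma (16..235)."""
    return round(16 + 219 * y_full / 255)


def studio_to_full_luma(y_studio):
    """Map studio-swing luma (16..235) back to full range (0..255)."""
    value = round((y_studio - 16) * 255 / 219)
    return max(0, min(255, value))  # clamp out-of-range inputs


# Studio swing has only 220 luma codes, so some of the 256 full-range
# levels must share a code after conversion.
distinct_studio_codes = len({full_to_studio_luma(v) for v in range(256)})
```

Because 256 input levels collapse onto 220 codes, a round trip through studio swing shifts some brightness values slightly. This is the kind of small change the objective metrics can over-penalize.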

Results (Raw Data)

The following Excel (.xlsx) files contain the full results for this study. These can be opened with MS Excel or LibreOffice 4.1.x.

Change over JPEG Quality Range at Equivalent Y-SSIM

The goal for this section is to visualize file size ratios, where JPEG is always 1.0, over a range of JPEG qualities. File sizes are recorded at equivalent Y-SSIM values. In each graph, the Y axis represents file size ratios. The X axis represents a range of JPEG quality values. There is one graph for each image set.

Graph 1: Lenna.png, Y-SSIM quality metric, lower is better

Graph 2: Average for Kodak image set, Y-SSIM quality metric, lower is better

Graph 3: Average for Tecnick image set, Y-SSIM quality metric, lower is better

Change over JPEG Quality Range at Equivalent RGB-SSIM

The goal for this section is to visualize file size ratios, where JPEG is always 1.0, over a range of JPEG qualities. File sizes are recorded at equivalent RGB-SSIM values. In each graph, the Y axis represents file size ratios. The X axis represents a range of JPEG quality values. There is one graph for each image set.

Graph 1: Lenna.png, RGB-SSIM quality metric, lower is better

Graph 2: Average for Kodak image set, RGB-SSIM quality metric, lower is better

Graph 3: Average for Tecnick image set, RGB-SSIM quality metric, lower is better

Change over JPEG Quality Range at Equivalent IW-SSIM

The goal for this section is to visualize file size ratios, where JPEG is always 1.0, over a range of JPEG qualities. File sizes are recorded at equivalent IW-SSIM values. In each graph, the Y axis represents file size ratios. The X axis represents a range of JPEG quality values. There is one graph for each image set.

Graph 1: Lenna.png, IW-SSIM quality metric, lower is better

Graph 2: Average for Kodak image set, IW-SSIM quality metric, lower is better

Graph 3: Average for Tecnick image set, IW-SSIM quality metric, lower is better

Change over JPEG Quality Range at Equivalent PSNR-HVS-M

The goal for this section is to visualize file size ratios, where JPEG is always 1.0, over a range of JPEG qualities. File sizes are recorded at equivalent PSNR-HVS-M values. In each graph, the Y axis represents file size ratios. The X axis represents a range of JPEG quality values. There is one graph for each image set.

Graph 1: Lenna.png, PSNR-HVS-M quality metric, lower is better

Graph 2: Average for Kodak image set, PSNR-HVS-M quality metric, lower is better

Graph 3: Average for Tecnick image set, PSNR-HVS-M quality metric, lower is better

Bibliography and Relevant Reading

  1. WebP Compression Study, Draft 0.1. May 18, 2011. Google.
  2. HD View: JPEG XR updates. May 30, 2013. Matt Uyttendaele (Microsoft).
  3. Structural similarity. Wikipedia.
  4. The SSIM Index for Image Quality Assessment.
  5. IW-SSIM: Information Content Weighted Structural Similarity Index for Image Quality Assessment.
  6. Nikolay Ponomarenko homepage - PSNR-HVS-M download page.

Contributors