Data Compression/Lossy vs. Non-Lossy Compression

From Wikibooks, open books for an open world

Lossy Compression vs Non-Lossy Compression

Lossy image compression and lossy video compression (such as JPEG compression, MPEG compression, and fractal image compression) give much better reduction in size (much higher compression ratio) than we find in almost any other area of data compression.

Lossy compression is best used to reduce the size of video data, where defects in the picture can be hidden as long as the general structure of the picture remains intact. This type of compression is called lossy because part of the reason it compresses so well is that actual data from the video image is lost and then replaced with some approximation.

Fractal compression is based on the concept of self-similarity in fractals. It describes each region of the picture in terms of other regions of the same picture that resemble it.

The storage is so inexact that when some of the fractal data is replaced with encrypted text, it is very difficult to see a difference in the picture, which means that important messages can be embedded in video images with little fear of detection.[citation needed]

If we could achieve the same level of compression without loss of data, we wouldn't use lossy compression, because even when the loss of quality in the image is imperceptible, data is still lost.

One of the reasons that there is still some interest in new lossless compression techniques is that only data that tolerates inexactness can survive lossy compression. It is often the case that the loss of a single bit renders a whole phrase or line of data inaccurate. This is why we attempt to build more and more stable memory systems. The recent shift from RD RAM to SD RAM, for instance, was partially because RD RAM needed more interactive maintenance of its data.

If we are so protective of the memory of data, then it makes sense that we must also be protective of the compression scheme we use to store and retrieve data. So the only places lossy compression can be used are places where accuracy at the bit level does not materially affect the quality of the data.

Image compression

Most image compression and video compression algorithms have four layers (is there a standard terminology for these things?):

  • A "modeling" or "de-correlation" layer converts the raw pixels into more abstract information about the pictures -- and, at the decoder, a "rendering" part converts that abstract information back into the raw pixels.
  • A "quantizer" layer that throws away some of the details in that abstract information that humans are unlikely to notice -- and, at the decoder, a "dequantizer" that restores an approximation of the information.
  • An "entropy coder" layer that compresses and packs each piece of information into the bitstream to be transmitted -- and, at the decoder, the "entropy decoder" that accepts the bitstream and unpacks each piece of information.
  • A layer that adds synchronization, interleaving, and error detection to the raw bitstream just before transmitting it.
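
The four layers can be sketched on a single row of pixels. The following Python is only a toy illustration, not any real codec: the function names, the `STEP` value, and the frame layout (4-byte length, 4-byte CRC-32, payload) are assumptions made for this example. Prediction from the previous reconstructed pixel stands in for the modeling layer, `zlib` stands in for the entropy coder, and a CRC-32 stands in for the transmission layer (it provides error detection only, with no synchronization or interleaving).

```python
import zlib

STEP = 8  # hypothetical quantizer step size; a larger step discards more detail


def encode(pixels, step=STEP):
    """Toy encoder for one row of 8-bit pixels (values 0..255)."""
    indices = []
    prev = 0  # the value the decoder will have reconstructed so far
    for p in pixels:
        residual = p - prev            # layer 1: model each pixel as "previous + difference"
        q = round(residual / step)     # layer 2: quantizer keeps only a coarse difference
        if not -128 <= q <= 127:
            raise ValueError("step too small for one-byte indices")
        indices.append(q)
        prev += q * step               # track the decoder's approximation, not the original
    raw = bytes(q & 0xFF for q in indices)
    payload = zlib.compress(raw)       # layer 3: entropy coding (DEFLATE as a stand-in)
    crc = zlib.crc32(payload)          # layer 4: error detection for transmission
    return len(payload).to_bytes(4, "big") + crc.to_bytes(4, "big") + payload


def decode(frame, step=STEP):
    """Inverse of encode(); returns an approximation of the original pixels."""
    length = int.from_bytes(frame[:4], "big")
    crc = int.from_bytes(frame[4:8], "big")
    payload = frame[8:8 + length]
    if zlib.crc32(payload) != crc:
        raise ValueError("corrupted frame")
    pixels = []
    prev = 0
    for b in zlib.decompress(payload):  # entropy decoder
        q = b - 256 if b > 127 else b   # undo the signed-byte packing
        prev += q * step                # dequantizer plus "rendering"
        pixels.append(prev)
    return pixels
```

Because the encoder predicts from the value the decoder will actually reconstruct, the quantization error does not accumulate along the row: each decoded pixel differs from the original by at most half a step.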

Since we talk about entropy coding elsewhere in this book, and the Data Coding Theory book discusses synchronization and error detection, this section will focus on the other layers.

Some kinds of "modeling" processes give us information that is useful for things other than data compression, such as "de-noising" and "resolution enhancement".

Some popular "modeling" or "de-correlation" algorithms include:

  • delta encoding
  • Fourier transform -- calculated using the fast Fourier transform (FFT) -- in particular, the discrete cosine transform (DCT)
  • wavelet transform -- calculated using a fast wavelet transform (FWT) -- in particular, some discrete wavelet transform (DWT)
  • motion compensation
  • matching pursuit
  • fractal transform[1][2]
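
As an illustration of what "de-correlation" means for one of the transforms listed above, here is a minimal sketch of an orthonormal one-dimensional DCT written directly from its definition. It is an O(N²) teaching version, not a fast algorithm, and the function names `dct` and `idct` are chosen for this example. For a smooth row of pixels, nearly all of the energy lands in the first few coefficients, which is what makes the later quantizer and entropy coder effective.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a sequence, computed directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


row = [100, 102, 104, 106, 108, 110, 112, 114]  # a smooth gradient of pixel values
coeffs = dct(row)
energy = sum(c * c for c in coeffs)
print("share of energy in the first two coefficients:",
      (coeffs[0] ** 2 + coeffs[1] ** 2) / energy)
```

The transform by itself loses nothing (up to floating-point rounding); the loss happens only when the small coefficients are quantized or dropped.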


There is a huge amount of information in a raw, uncompressed movie. All movie and image compression algorithms can be categorized as one of:

  • Lossless methods: methods that make no changes whatsoever to the image; the decompressed image is bit-for-bit identical to the original.
  • "Nondegrading methods" or "transparent methods": methods that make some minor changes to the image; the decompressed image is not exactly bit-for-bit identical, but the changes are (hopefully) invisible to the human eye.
  • "Degrading methods" or "low-quality methods": methods that introduce visible changes to the image.

... "idempotent" one-time loss vs. "generational" loss at every iteration ...

In theory, a "perfectly" compressed file (of any kind) should have no remaining patterns. Many people working on compression algorithms make pictures of the compressed data as if it were itself an image. [3] The human visual system is very sensitive to (some kinds of) patterns in data. If a human can describe a repeating pattern in the compressed data precisely enough, you can use that description to help model the original image more accurately, leading to a smaller compressed image file.
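
A crude numerical companion to looking at the compressed data as a picture is to measure the entropy of its byte histogram. The sketch below is an assumption-laden illustration: the word list and the helper `byte_entropy` are made up for this example, and `zlib` stands in for "a compressor". A value close to 8 bits per byte means the histogram is nearly flat; it does not prove that no patterns remain, since byte-level entropy ignores the order of the bytes.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of the byte histogram, in bits per byte (0 to 8)."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical test input: random words drawn from a small vocabulary.
random.seed(0)
words = ["lossy", "lossless", "pixel", "wavelet",
         "residual", "entropy", "coder", "image"]
text = " ".join(random.choice(words) for _ in range(5000)).encode("ascii")
packed = zlib.compress(text, 9)

print("original:  ", len(text), "bytes,", byte_entropy(text), "bits/byte")
print("compressed:", len(packed), "bytes,", byte_entropy(packed), "bits/byte")
```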

JFIF: JPEG File Interchange Format

JPEG 2000

JPEG 2000 is useful in applications that require very high quality -- such as the DICOM medical image format -- because the compressor can choose higher quality settings -- including a completely lossless mode -- than are available in earlier JPEG standards.

JPEG 2000 is an image coding system based on wavelet technology. [4] The JPEG 2000 compression standard uses the biorthogonal CDF 5/3 wavelet (also called the LeGall 5/3 wavelet) for lossless compression and a CDF 9/7 wavelet for lossy compression. However, a variety of other kinds of wavelets have been proposed and used in experimental data compression algorithms. [5]
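
To show why the 5/3 wavelet suits lossless coding, here is a minimal sketch of one level of the reversible integer 5/3 lifting steps on a one-dimensional signal. It is not a JPEG 2000 codec: the function names are chosen for this example, the input length is assumed to be even, and symmetric extension at the borders is assumed. Every step adds or subtracts an integer computed from values the decoder also has, so the inverse undoes each step exactly.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform on an even-length list of ints.

    Returns (s, d): the low-pass (smooth) and high-pass (detail) halves.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("this sketch assumes an even length of at least 2")
    even, odd = x[0::2], x[1::2]
    half = len(odd)
    d = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]  # mirror at the right border
        d.append(odd[n] - (even[n] + right) // 2)         # predict: odd sample from its neighbours
    s = []
    for n in range(half):
        left = d[n - 1] if n > 0 else d[0]                # mirror at the left border
        s.append(even[n] + (left + d[n] + 2) // 4)        # update: keep the running average
    return s, d


def inverse_53(s, d):
    """Exact inverse of forward_53(): undo the update step, then the predict step."""
    half = len(s)
    even = []
    for n in range(half):
        left = d[n - 1] if n > 0 else d[0]
        even.append(s[n] - (left + d[n] + 2) // 4)
    x = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]
        x.extend([even[n], d[n] + (even[n] + right) // 2])
    return x
```

On a smooth signal the detail half is mostly zeros, which an entropy coder can store very compactly, while the smooth half can be transformed again for further levels.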

Open-source implementations of JPEG 2000 are available. [6]

Lossy and residual give lossless

Sometimes people download a (highly compressed) image, video, or music file and use it to decide whether they really want to spend a (much longer) time downloading the high-resolution lossless version of the file.

Rather than delete the original (lossy) version of the file and start downloading the lossless version from scratch, several people are experimenting with methods of using the partial information in the lossy version of the file to reduce the time required to download the "residue", also called the "residual": the "rest" of the file (i.e., the details that the lossy compressor discarded). [7] [8] [9] [10] [11] [12] [13] [14] [15]

As long as the size of the (compressed) residue is significantly less than the size of the file stored in a stand-alone lossless format, the user has saved time -- even though the total lossy + residue size is usually larger than a stand-alone lossless format.
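
The idea can be sketched with 8-bit samples. This is an illustrative toy, not any of the published schemes cited above: the step size `STEP`, the function names, and the sine-wave test signal are assumptions made for the example. The lossy layer keeps only a coarse version of each sample, and the residual is whatever must be added back to recover the original exactly.

```python
import math
import zlib

STEP = 16  # hypothetical quantizer step for the lossy layer


def lossy_encode(samples):
    """Keep only the coarse part of each 8-bit sample."""
    return bytes(s // STEP for s in samples)


def lossy_decode(code):
    """Approximate each sample by the middle of its quantization bin."""
    return bytes(q * STEP + STEP // 2 for q in code)


def residual(samples, approx):
    """What the lossy layer discarded, stored modulo 256 so it fits in a byte."""
    return bytes((s - a) % 256 for s, a in zip(samples, approx))


def restore(approx, res):
    """Lossy approximation plus residual gives back the exact original."""
    return bytes((a + r) % 256 for a, r in zip(approx, res))


samples = bytes(int(127 + 100 * math.sin(i / 20)) for i in range(2000))
preview = lossy_decode(lossy_encode(samples))   # what the user downloaded first
extra = residual(samples, preview)              # what still has to be downloaded
assert restore(preview, extra) == samples

# Compare compressed sizes of each piece (zlib as a stand-in entropy coder).
for name, blob in [("lossy", lossy_encode(samples)),
                   ("residual", extra),
                   ("stand-alone lossless", samples)]:
    print(name, len(zlib.compress(blob, 9)))
```

The printed sizes depend on the signal and on the entropy coder, so they are best read as a way to experiment with the trade-off described above rather than as a general result.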

GIF and PNG, image file formats designed to be downloaded over relatively slow modems, have some of this characteristic -- they are designed to support "partial downloads".

Such lossless formats include JPEG XR, DTS-HD Master Audio, MPEG-4 SLS (lossless audio compression), Wavpack Hybrid, OptimFROG DualStream, ...

References

  1. Wikipedia: fractal transform
  2. "Fractal based Image compression algorithm (and source code)"
  3. "Technical Overview of Cartesian Perceptual Compression" (c) 1998-1999 Cartesian Products, Inc.
  4. Joint Photographic Experts Group : JPEG 2000 standard
  5. The Wavelet Discussion Forum
  6. The JasPer Project by Michael D. Adams, an implementation of JPEG-2000 Part-1.
  7. "Flexible 'Scalable to Lossless'?"
  8. Detlev Marpe, Gabi Blättermann, Jens Ricke, and Peter Maaß "A two-layered wavelet-based algorithm for efficient lossless and lossy image compression" 2000.
  9. GCK Abhayaratne and DM Monro. "Embedded to lossless coding of motion compensated prediction residuals in lossless video coding"
  10. G.C.K. Abhayaratne and D.M. Monro. "Embedded to Lossless Image Coding (ELIC)". 2000.
  11. CH Ritz and J Parsons. "Lossless Wideband Speech Coding". 2004. "lossless coding of wideband speech by adding a lossless enhancement layer to the lossly baselayer produced by a standardised wideband speech coder"
  12. Nasir D. Memon, Khalid Sayood, Spyros S. Magliveras. "Simple method for enhancing the performance of lossy plus lossless image compression schemes" 1993.
  13. Qi Zhang, Yunyang Dai, and C.-C. J. Kuo. "Lossless video compression with residual image prediction and coding (RIPC)". 2009.
  14. Gianfranco Basti and M. Riccardi and Antonio L. Perrone. "Lossy plus lossless residual encoding with dynamic preprocessing for Hubble Space Telescope fits images" 1999.
  15. Majid Rabbani and Paul W. Jones. "Digital image compression techniques". 1991. Chapter 8: Lossy plus lossless residual encoding.