Data Compression/References

Benchmark files


To do:
Are there any benchmarks for evaluating Wikipedia: differential compression?


To do:
Should we have a list of desired features for future benchmark sets, something like "Some Data Compression Corpora We Need Badly"?

open-source example code

Most creators of data compression algorithms release open-source implementations of their work. This makes it much easier to evolve the algorithms by combining clever ideas from many different sources.


To do:
Should we link to a good, open-source, well-commented implementation of, for example, LZW *here*, or in the section of the book that discusses LZW?

  • jvm-compressor-benchmark is a benchmark suite for comparing the time and space performance of open-source compression codecs on the JVM platform. It currently includes the Canterbury corpus and a few other benchmark file sets, and compares LZF, Snappy, LZO-java, gzip, bzip2, and a few other codecs. (Is the API used by the jvm-compressor-benchmark to talk to these codecs a good interface standard for compression algorithms?)
  • inikep has put together a benchmark for comparing the time and space performance of open-source compression codecs that can be compiled with C++. It currently includes 100 MB of benchmark files (bmp, dct_coeffs, english_dic, ENWIK, exe, etc.), and compares snappy, lzrw1-a, fastlz, tornado, lzo, and a few other codecs.
  • "Compression the easy way": a simple C/C++ implementation of variable-bit-length LZW in one .h file and one .c file, with no dependencies.
  • BALZ by Ilia Muraviev - the first open-source implementation of ROLZ compression[1]
  • QUAD - an open-source ROLZ-based compressor from Ilia Muraviev
  • LZ4 "the world's fastest compression library" (BSD license)
  • QuickLZ "the world's fastest compression library" (GPL and commercial licenses)
  • FastLZ "free, open-source, portable real-time compression library" (MIT license)
  • The .xz file format (one of the compressed file formats supported by 7-Zip and LZMA SDK) supports "Multiple filters (algorithms): ... Developers can use a developer-specific filter ID space for experimental filters." and "Filter chaining: Up to four filters can be chained, which is very similar to piping on the UN*X command line."
  • libarchive (win32 LibArchive): a library for reading and writing streaming archives. The bsdtar archiving program is based on libarchive. The library is highly modular, "designed ... to make it relatively easy to add new archive formats and compression algorithms". It can read and write (including compression and decompression) archive files in a variety of formats, including ".tgz" and ".zip". BSD license. See also the libarchive WishList.
  • WebP is a new image format that provides lossless and lossy compression for images on the web. "WebP lossless images are 26% smaller in size compared to PNGs. WebP lossy images are 25-34% smaller in size compared to JPEG images at equivalent SSIM index." WebP is apparently the *only* format supported by web browsers that supports both lossy compression and an alpha channel in the same image. When the experimental "data compression proxy" is enabled in Chrome for Android, all images are transcoded to WebP format.[2] BSD license.
  • VP8 and WebM video compression ...
  • The Ogg container format, often containing compressed audio in Vorbis, Speex, or FLAC format, and sometimes containing compressed video in Theora or Dirac format, etc.
  • libPGF: a library for reading and writing files in the Progressive Graphics File (PGF) format. Uses a fast wavelet transform and supports both lossless and lossy compression, as well as alpha transparency. LGPL.
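
To illustrate two ideas from the list above (comparing codecs on time and space, and the filter chaining that the .xz format allows), here is a minimal sketch that uses only the Python standard library. It is not taken from jvm-compressor-benchmark or from inikep's benchmark, and it measures a single run on synthetic data, so the numbers it prints are only indicative. The names make_sample, xz_delta, CODECS, and benchmark are hypothetical and exist only for this example.

```python
import bz2
import lzma
import time
import zlib


def make_sample(n=50_000):
    # Hypothetical test input: a ramp of 16-bit little-endian counters.
    # Chosen so that a delta filter with distance 2 has something to exploit.
    return b"".join((i % 65536).to_bytes(2, "little") for i in range(n))


def xz_delta(data):
    # An .xz filter chain: a delta filter feeding LZMA2.
    # The distance of 2 bytes matches the 16-bit sample width (an assumption
    # about the input, not something the codec detects by itself).
    filters = [
        {"id": lzma.FILTER_DELTA, "dist": 2},
        {"id": lzma.FILTER_LZMA2, "preset": 6},
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


# Each codec is a (compress, decompress) pair taking and returning bytes.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
    "xz+delta": (xz_delta, lzma.decompress),
}


def benchmark(data):
    # Returns {codec name: (compressed size in bytes, compression seconds)}.
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        assert decompress(packed) == data  # round-trip check
        results[name] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    sample = make_sample()
    print(f"input     {len(sample):8d} bytes")
    for name, (size, seconds) in benchmark(sample).items():
        print(f"{name:9s} {size:8d} bytes  {seconds * 1000:8.2f} ms")
```

The (compress, decompress) pair used in CODECS is about the smallest interface a benchmark can rely on. A real suite would also want streaming input, repeated runs, and separate decompression timings, and would run on a standard corpus such as the Canterbury corpus rather than on synthetic data.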

Further reading

non-wiki resources