Data Compression/Streaming Compression
[edit] file vs streaming compression
Many data compression algorithms are file oriented. They assume that underlying transmission protocols have transmitted every bit of the compressed data file without error.
A few data compression algorithms are designed to be used with streaming one-way broadcast. At any time, a receiver may be turned on and start listening to the broadcast. In order for a receiver to start up in mid-stream (or recover from severe transmission errors), such systems typically break up the stream into "blocks". The decoder uses synchronization marks or checksums to detect the beginning of the next block, and starts decoding from there. At each new block, it re-initializes and re-starts the decoding process. Block decoders that use a fixed dictionary or fixed Huffman table forget the old one and read a new dictionary or table at the beginning of each block. Block decoders that use an adaptive dictionary forget the old dictionary, reinitialize the default dictionary, and begin building a new dictionary.
The block size is a compromise between compression (longer blocks usually give better compression) and quick restarts (shorter blocks allow the decompressor to restart more quickly after an error or after turning on). Because practically all errors can be detected with CRC codes, a streaming decoder knows which blocks are in error, and typically rejects the entire block. Streaming audio decoders use a variety of techniques to mask such errors -- they may attempt to "fill in" with a prediction that previous tones will continue. To avoid "clicks", they may choose to ramp down the volume to silence at the end of the last good block, play silence during the bad block, and then ramp back up to normal volume at the beginning of the next good block.
There has been little research on alternatives to "blocks". Some people speculate that, compared to using a block size of B bytes, it may be possible to achieve better compression and just as quick restarts with a (non-blocking) converging decoder designed so that all decoders, no matter when they are turned on, eventually converge on the same adaptive dictionary after at most B bytes. An adaptive block decoder has very little context (and poor compression) at the beginning of each block, gradually rising to lots of context (and much better compression) at the end of the block. A non-adaptive block decoder has the overhead of a dictionary or Huffman table at the beginning of each block (in effect, zero compression during the dictionary or table, and high compression during the data in the remainder of the block). The converging decoder, in theory, has some intermediate context (and some intermediate compression) at all times, without dictionary or table overhead or times of poor compression.
This page may need to be