Fundamentals of Data Representation: Sound compression



As you can see, we have some serious issues with the size of sound files. Take a look at the size of a 3-minute pop song recorded in mono at a sample rate of 44 kHz and a sample resolution of 16 bits.

44,000 * 16 * 180 = 126,720,000 bits (roughly 15 MB)
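
The same arithmetic can be sketched in a few lines of Python. The function name and the `channels` parameter are illustrative assumptions, not part of any standard library; the calculation above corresponds to a single channel.

```python
def uncompressed_size_bits(sample_rate_hz, bits_per_sample, duration_s, channels=1):
    """Return the size in bits of uncompressed sampled audio."""
    return sample_rate_hz * bits_per_sample * duration_s * channels

# 3 minutes (180 seconds) at 44 kHz and 16 bits per sample, one channel
bits = uncompressed_size_bits(44_000, 16, 180)
size_bytes = bits // 8  # 8 bits per byte

print(bits)        # 126720000
print(size_bytes)  # 15840000
```

Passing `channels=2` would double the result, which is why stereo recordings are larger still.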

As you are probably aware, an MP3 of the same length would be roughly 3 MB, a fifth of the size. So what gives? The raw file sizes for sounds are simply too big to store and transmit easily; what is needed is a way to compress them.

Lossless

Lossless compression - compression that doesn't lose any accuracy, so the file can be decompressed into an identical copy of the original audio data


WAV files typically don't involve any compression at all and will be the size that you have already calculated. There are lossless compressed file formats out there, such as FLAC, which compress the audio data to roughly 50% of its original size. One technique that can be used for this is run-length encoding, which looks for the same value repeated many times in a row and, instead of recording each occurrence separately, stores the value once along with how many times it occurs. Let us take a hypothetical set of sample points:

0000000000000000000001234543210000000000000000000123456787656789876

As you can see, the silent areas take up a large part of the file. Instead of recording each silent sample individually, we can store a count of how many silent samples there are in a row, massively reducing the file size:

(21-0)123454321(17-0)123456787656789876
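
A minimal Python sketch of this idea follows, using the same `(count-value)` notation. It assumes every sample is a single digit, and the `min_run` threshold (below which a run is simply written out in full) is an assumption made for illustration; the function names are hypothetical.

```python
import re
from itertools import groupby

def rle_encode(samples, min_run=4):
    """Encode a string of single-digit samples, writing long runs as (count-value)."""
    out = []
    for value, group in groupby(samples):
        count = len(list(group))
        if count >= min_run:
            out.append(f"({count}-{value})")
        else:
            # Short runs are cheaper to store as they are
            out.append(value * count)
    return "".join(out)

def rle_decode(encoded):
    """Expand every (count-value) marker back into the repeated value."""
    return re.sub(r"\((\d+)-(\d)\)",
                  lambda m: m.group(2) * int(m.group(1)),
                  encoded)

original = "0" * 8 + "12321" + "0" * 6
encoded = rle_encode(original)
print(encoded)  # (8-0)12321(6-0)

# Lossless: decoding gives back exactly what we started with
assert rle_decode(encoded) == original
```

Because decoding restores the input exactly, this is a lossless technique.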

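Another technique used by FLAC files is linear prediction, which predicts each sample from the samples before it and stores only the (usually small) difference between the prediction and the actual value.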
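
To illustrate the idea, the sketch below uses the simplest possible predictor, "the next sample equals the previous one", and keeps only the prediction errors. This is a simplification chosen for illustration: FLAC's real predictors are more sophisticated, and the function names here are hypothetical.

```python
def to_residuals(samples):
    """Predict each sample as equal to the previous one; keep only the error."""
    residuals = []
    previous = 0  # assume silence before the first sample
    for s in samples:
        residuals.append(s - previous)
        previous = s
    return residuals

def from_residuals(residuals):
    """Rebuild the original samples by adding each error to the prediction."""
    samples = []
    previous = 0
    for r in residuals:
        previous += r
        samples.append(previous)
    return samples

samples = [100, 102, 104, 105, 105, 103, 100]
residuals = to_residuals(samples)
print(residuals)  # [100, 2, 2, 1, 0, -2, -3]

# Lossless: the samples can be recovered exactly
assert from_residuals(residuals) == samples
```

After the first value the residuals are much smaller numbers than the samples themselves, so they can be stored in fewer bits.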

Lossy

FLAC files are still very large. What is needed is a format that produces much smaller files that can be easily stored on your computer or portable music device, and easily transmitted across the internet.

Lossy compression - compression that discards some of the original data, so the original cannot be recovered exactly; the resulting files are generally smaller than with lossless compression


As we have already seen, to make smaller audio files we can decrease the sampling rate and the sampling resolution, but we have also seen the dreadful effect this can have on the final sound. There are other, cleverer methods of compressing sounds. These methods won't let us get back the exact audio that we started with, but the result will be close. This is lossy compression.
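
As a rough illustration of why lost accuracy cannot be recovered, the sketch below reduces 16-bit samples to 8 bits by discarding the lowest 8 bits and then scales them back up. It assumes non-negative integer samples, and the function names are hypothetical.

```python
def reduce_resolution(samples, from_bits=16, to_bits=8):
    """Keep only the most significant bits of each sample."""
    shift = from_bits - to_bits
    return [s >> shift for s in samples]

def restore_resolution(reduced, from_bits=16, to_bits=8):
    """Scale the samples back up; the discarded bits come back as zeros."""
    shift = from_bits - to_bits
    return [r << shift for r in reduced]

samples = [0, 1000, 12345, 32767]
reduced = reduce_resolution(samples)
restored = restore_resolution(reduced)

print(reduced)   # [0, 3, 48, 127]
print(restored)  # [0, 768, 12288, 32512]
```

The restored values are close to the originals but not identical, which is what makes the process lossy.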

Some audiophiles stick by vinyl records, as this analogue, uncompressed music format doesn't lose audio accuracy the way an MP3 does. However, dirt and wear degrade the quality of vinyl.

There are many lossy compressed audio formats out there, including MP3, AAC and OGG (which is open source). The compression works by reducing the accuracy of parts of the sound that are considered to be beyond the auditory resolution of most people. This method is commonly referred to as perceptual coding. It uses psychoacoustic models to discard, or reduce the precision of, components that are less audible to human hearing, and then records the remaining information efficiently. Because the accuracy of certain frequencies is lost, you can sometimes tell the difference between the original and the lossy version, for example by hearing the loss of high- and low-pitched tones.

Exercise: Sound compression
Why is it necessary to compress sound files?

Answer :

So that they take up less space when stored, for example on portable music players, and can be sent more quickly across the internet.

Name the two categories of compression available and give a file format for each.

Answer :

Lossy (MP3/AAC/OGG) and lossless (FLAC).

Perform run-length encoding on the following sound file:

012344444444444432222222222222211111111111111000000000000

Answer :

0123(11-4)3(13-2)(14-1)(11-0)
Describe a technique used to compress MP3 files.

Answer :

Perceptual coding reduces the accuracy of frequencies stored in a sound file that are beyond the auditory resolution of most people.

When would it be best to use FLAC instead of OGG, and vice versa?

Answer :

  • FLAC: when you really care about the sound quality and you're not bothered about the file size
  • OGG: when you are trying to make a sound file as small as possible