Digital Signal Processing/Sound Processing

Digital Sound

Sampling Frequency

Sound in the digital realm is stored in one or more arrays of discrete samples, with each array corresponding to one channel (e.g. stereo sound requires two channels, and thus two arrays of samples). The interval of time between samples is constant, and is chosen according to the kind of signal being represented. Since we are interested in sound, and the upper limit of human hearing is generally accepted as 20 kHz, the Nyquist-Shannon sampling theorem can be used to determine the sampling interval needed to accurately reconstruct the signals we're interested in.

This theorem states that

Exact reconstruction of a continuous-time baseband signal from its samples is possible if the signal is bandlimited and the sampling frequency is greater than twice the signal bandwidth.

Essentially, this means that a signal limited to a certain range (audible sound: roughly 20 Hz to 20 kHz) can be reconstructed without error if it is sampled at a rate greater than twice its bandwidth. The Red Book audio CD standard sets the sampling rate at 44,100 Hz. This rate can represent frequencies up to 22.05 kHz, half the sampling rate, which leaves a margin of about 2 kHz above the 20 kHz limit of hearing. That margin gives the anti-aliasing filter room to roll off.
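The arithmetic behind these figures can be sketched in a few lines of Python. The function and constant names below are illustrative only and do not come from any standard or library.

```python
def nyquist_frequency(sample_rate_hz: float) -> float:
    """Return the highest frequency representable at this sampling rate."""
    return sample_rate_hz / 2.0


def sample_interval(sample_rate_hz: float) -> float:
    """Return the time between consecutive samples, in seconds."""
    return 1.0 / sample_rate_hz


def min_sample_rate(bandwidth_hz: float) -> float:
    """Return the rate that the sampling rate must exceed (twice the bandwidth)."""
    return 2.0 * bandwidth_hz


CD_RATE_HZ = 44_100          # Red Book sampling rate
HEARING_LIMIT_HZ = 20_000    # commonly accepted upper limit of hearing

nyquist_hz = nyquist_frequency(CD_RATE_HZ)         # 22050.0
margin_hz = nyquist_hz - HEARING_LIMIT_HZ          # 2050.0
interval_us = sample_interval(CD_RATE_HZ) * 1e6    # about 22.68 microseconds
required_hz = min_sample_rate(HEARING_LIMIT_HZ)    # 40000.0

print(f"Nyquist frequency: {nyquist_hz} Hz, margin: {margin_hz} Hz")
print(f"Sample interval: {interval_us:.2f} us, rate must exceed {required_hz} Hz")
```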

44.1 kHz is the general standard sampling rate for digital audio on consumer-level equipment, while 48 kHz is common when working with film or video. Many recording engineers also prefer to record classical or otherwise complex music at 88.2 or 96 kHz, and some claim to be able to perceive a difference.

When converting from 48 kHz to 44.1 kHz, a sonic blurring effect can sometimes occur. The two rates are in the ratio 160:147, so almost every output sample falls between two input samples and must be interpolated, and the result is only as good as the interpolation filter and the arithmetic precision used to compute it. The conversion from 88.2 kHz to 44.1 kHz or from 96 kHz to 48 kHz is simpler, since the ratio is exactly 2:1. After low-pass filtering to remove content above the new Nyquist frequency, the computer or device doing the conversion only has to discard every other sample. To bypass the non-integer case, a high-quality digital-to-analog converter can be used to bring, for example, a 48 kHz signal back to its analog form, which is then fed into another high-quality analog-to-digital converter that re-samples the signal at 44.1 kHz. This technique is common practice in recording studios, where high-end equipment is trusted to do the conversion transparently. In other situations, the sonic distortion caused by converting the audio in software or hardware may be of little concern.
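As a rough illustration of why the two kinds of conversion differ, the sketch below reduces each pair of rates to a ratio. It also shows a caveat of the 2:1 case: if every other sample is simply discarded without low-pass filtering first, content above the new Nyquist frequency folds back into the audible band. The helper names are hypothetical, and the 30 kHz test tone is an assumption chosen only to make the aliasing visible.

```python
import math
from fractions import Fraction


def resampling_ratio(src_rate_hz: int, dst_rate_hz: int) -> Fraction:
    """Return the output/input sample-rate ratio in lowest terms."""
    return Fraction(dst_rate_hz, src_rate_hz)


def tone(freq_hz: float, rate_hz: float, count: int) -> list[float]:
    """Generate `count` samples of a cosine at freq_hz, sampled at rate_hz."""
    return [math.cos(2.0 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def drop_every_other(samples: list[float]) -> list[float]:
    """Naive 2:1 decimation: keep even-indexed samples, with no filtering."""
    return samples[::2]


ratio_48_to_44 = resampling_ratio(48_000, 44_100)   # 147/160: needs interpolation
ratio_96_to_48 = resampling_ratio(96_000, 48_000)   # 1/2: plain decimation

# A 30 kHz tone is representable at 96 kHz but lies above the 24 kHz
# Nyquist frequency of the 48 kHz target rate.
ultrasonic = tone(30_000, 96_000, 64)
naive = drop_every_other(ultrasonic)

# Without a low-pass filter, the decimated samples are indistinguishable
# from an audible 18 kHz tone (48 kHz - 30 kHz) sampled at 48 kHz.
alias = tone(18_000, 48_000, 32)
max_error = max(abs(a - b) for a, b in zip(naive, alias))

print(ratio_48_to_44, ratio_96_to_48)
print(f"largest difference from the 18 kHz alias: {max_error:.2e}")
```

A practical sample-rate converter would apply a low-pass filter before discarding samples. That step is omitted here to keep the aliasing effect visible.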

Bits Per Sample

While the sampling frequency determines the time resolution of an audio signal, the number of bits per sample determines how finely its amplitude is resolved. Red Book audio CDs store each sample as a 16-bit signed integer. This means that when an audio signal is converted for use on a CD, each sample's value is quantized to one of 65,536 integer levels in the range -32768 to +32767.
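A minimal sketch of this quantization step follows. It assumes the input samples are floating-point values normalized to the range -1.0 to 1.0 and scales them by 32767. Both the normalization and the scaling factor are conventions chosen for this example, not requirements of the CD standard.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def quantize_16bit(sample: float) -> int:
    """Map a normalized sample (assumed in [-1.0, 1.0]) to a 16-bit signed integer.

    Values outside the assumed range are clipped rather than wrapped.
    """
    scaled = round(sample * INT16_MAX)
    return max(INT16_MIN, min(INT16_MAX, scaled))


levels = INT16_MAX - INT16_MIN + 1   # 65536 distinct values, i.e. 2**16

print(quantize_16bit(0.0), quantize_16bit(1.0), quantize_16bit(-1.0))
print(quantize_16bit(2.0), quantize_16bit(-2.0))   # out-of-range input is clipped
```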

Wave Files

Wave (WAV) files contain a digital representation of sound. The audio is normally stored uncompressed, so the data can be sent to the digital-to-analog converter for playback without an added decompression step. This also means that the format consumes a great deal of storage space.
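To give a sense of the storage cost, the sketch below computes the data rate of uncompressed audio and writes a short mono test tone to an in-memory WAV file with Python's standard `wave` module. The 440 Hz tone, the one-second duration, and the helper names are assumptions made for illustration.

```python
import io
import math
import struct
import wave


def pcm_bytes_per_second(rate_hz: int, channels: int, bits_per_sample: int) -> int:
    """Return the data rate of uncompressed PCM audio in bytes per second."""
    return rate_hz * channels * bits_per_sample // 8


def tone_wav_bytes(freq_hz: float = 440.0, seconds: float = 1.0,
                   rate_hz: int = 44_100) -> bytes:
    """Build a mono 16-bit WAV file containing a sine tone and return its bytes."""
    frame_count = int(seconds * rate_hz)
    # Pack each sample as a little-endian 16-bit signed integer.
    frames = b"".join(
        struct.pack("<h", round(32767 * math.sin(2.0 * math.pi * freq_hz * n / rate_hz)))
        for n in range(frame_count)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)        # mono
        wav_file.setsampwidth(2)        # 2 bytes = 16 bits per sample
        wav_file.setframerate(rate_hz)
        wav_file.writeframes(frames)
    return buffer.getvalue()


cd_rate = pcm_bytes_per_second(44_100, 2, 16)   # 176400 bytes per second
cd_minute_mib = cd_rate * 60 / 2**20            # roughly 10 MiB per minute

wav_bytes = tone_wav_bytes()
print(f"CD-quality stereo: {cd_rate} B/s, about {cd_minute_mib:.1f} MiB per minute")
print(f"one second of mono 16-bit audio: {len(wav_bytes)} bytes including header")
```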

MP3 Compression

OGG Compression