Digital Signal Processing/Discrete Data


Continuous data is something that most people are familiar with. A quick analogy: when an analog data sensor (e.g., your eye) becomes active (you open it), it starts receiving input immediately (you see sunshine), converts the input (light rays) to a desired output (optic nerve signals), and sends the data off to its destination (your brain). It does this without hesitation and continues until the sensor turns off (you close your eyes). The output is often called a data "stream"; once started, it might run forever unless something tells it to stop. Now, instead of a physical sensor, if we can define our data mathematically as a continuous function, we can calculate the data value at any point along the data stream. It's important to realize that this gives us an infinite (∞) number of data points, no matter how small the interval between the start and stop limits of the data stream.

This brings us to the related concept of discrete data. Discrete data is non-continuous: it exists only at certain points along an input interval, which gives us a finite number of data points to deal with. A discrete data set has no value at a time point where no data was recorded.

The analogy with vision is illumination by a stroboscope: the scene is seen as a series of separate images. Because all information between two images is lost, the strobe frequency must be high enough that the movements of a moving object are not missed. Discrete data can also be defined by a mathematical function, but one that can only be evaluated at the discrete input points. Such functions are called "discrete functions" to distinguish them from the continuous variety.

Discrete functions and data give us the advantage of being able to deal with a finite number of data points.

Sets and Series

Discrete data is written as an ordered set of values, such as:

X[n] = [1 2 3& 4 5 6]

We will be using the "&" symbol to denote the data item that occurs at point zero. Now, by filling in values for n, we can select different values from our series:

X[0] = 3
X[-2] = 1
X[3] = 6

We can move the zero point anywhere in the set that we want. It is also important to note that we can pad a series with zeros on either side, so long as we keep track of the zero-point:

X[n] = [0 0 0 0 1 2 3& 4 5 6] = [1 2 3& 4 5 6]

In fact, we assume that any point in our series without an explicit value is equal to zero. So if we have the same set:

X[n] = [1 2 3& 4 5 6]

We know that every value outside of our given range is zero:

X[100] = 0
X[-100] = 0
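
The indexing rules above can be sketched in a few lines of Python. This is only an illustration: the helper name make_sequence and its zero_index parameter are assumptions made for this example, not notation used elsewhere in this book. zero_index is simply the list position of the item marked with "&".

```python
def make_sequence(values, zero_index):
    """Return a lookup function x(n) for a finite discrete sequence.

    values     -- the explicitly listed samples, left to right
    zero_index -- list position of the sample marked with '&' (hypothetical name)
    """
    def x(n):
        i = n + zero_index           # translate the index n to a list position
        if 0 <= i < len(values):
            return values[i]
        return 0                     # any point without an explicit value is zero
    return x

# X[n] = [1 2 3& 4 5 6]: the '&' item (3) sits at list position 2
X = make_sequence([1, 2, 3, 4, 5, 6], zero_index=2)
print(X(0), X(-2), X(3))     # 3 1 6
print(X(100), X(-100))       # 0 0

# Padding with zeros changes the list but not the sequence,
# as long as the zero point is tracked: [0 0 0 0 1 2 3& 4 5 6]
X_padded = make_sequence([0, 0, 0, 0, 1, 2, 3, 4, 5, 6], zero_index=6)
print(all(X(n) == X_padded(n) for n in range(-20, 21)))   # True
```

Returning zero for every out-of-range index is what makes the padded and unpadded forms interchangeable.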

Stem Plots

Discrete data is frequently represented with a stem plot. Stem plots mark each data point with a dot and draw a vertical line (the stem) between the horizontal axis (the time or index axis) and the dot:

F[n] = [5& 4 3 2 1]
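
As a rough illustration of what such a plot looks like, the following Python sketch draws a text-only stem plot. It assumes the values are non-negative integers, and the function name stem_plot is made up for this example; a plotting library would normally be used instead.

```python
def stem_plot(values, zero_index=0):
    """Return a text stem plot of a sequence of non-negative integers."""
    lines = []
    for level in range(max(values), 0, -1):
        row = ""
        for v in values:
            if v == level:
                mark = "o"           # the dot marking the data point
            elif v > level:
                mark = "|"           # the stem below the dot
            else:
                mark = " "
            row += f"{mark:>3}"
        lines.append(row.rstrip())
    lines.append("--+" * len(values))                      # horizontal axis
    lines.append("".join(f"{i - zero_index:>3}"            # index n of each item
                         for i in range(len(values))))
    return "\n".join(lines)

# F[n] = [5& 4 3 2 1]: the '&' item is the first one, so zero_index is 0
print(stem_plot([5, 4, 3, 2, 1]))
# Output:
#   o
#   |  o
#   |  |  o
#   |  |  |  o
#   |  |  |  |  o
# --+--+--+--+--+
#   0  1  2  3  4
```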

About the Notation

The notation we use to denote the zero point of a discrete set was chosen arbitrarily. Textbooks on the subject will frequently use arrows, periods, or underscores to denote the zero position of a set. Here are some examples:

            |
            v
Y[n] = [1 2 3 4 5]

            .
Y[n] = [1 2 3 4 5]

            _
Y[n] = [1 2 3 4 5]

All of these notations are too tricky to write in Wikibooks, so we have decided to use an ampersand (&) to denote the zero point. The ampersand is not used for any other purpose in this book, so hopefully we can avoid some confusion.

Sampling

Sampling is the process of converting continuous data into discrete data. The sampling process takes a snapshot of the value of a given input signal, rounds it if necessary (for discrete-in-value systems), and outputs the discrete data. A common example of a sampler is an Analog-to-Digital Converter (ADC).

Let's say we have a function based on time (t). We will call this continuous-time function f(t):

f(t) = 2t u(t)

Here u(t) is the unit step function, which is 0 for negative t and 1 for positive t. Now, if we want to sample this function mathematically, we can plug in discrete values for t (here the integers, one sample per unit of time) and read the output. We will denote our sampled output as F[n]:

F[n] = 0 : n ≤ 0
F[1] = f(1) = 2
F[2] = f(2) = 4
F[100] = f(100) = 200

This means that our output series for F[n] is the following:

F[n] = [0& 2 4 6 8 10 12 ...]
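
The same sampling can be sketched in Python. The helper sample and its period parameter are assumptions added for this illustration; the text above samples at integer values of t, which corresponds to period=1. The value of the unit step at exactly t = 0 is taken to be 1 here, a convention that does not affect f(0).

```python
def u(t):
    """Unit step: 0 for t < 0, 1 otherwise (value at t = 0 is a convention)."""
    return 1 if t >= 0 else 0

def f(t):
    """The continuous-time function f(t) = 2t u(t)."""
    return 2 * t * u(t)

def sample(func, n, period=1):
    """Return the n-th sample of func, taken at time t = n * period."""
    return func(n * period)

F = [sample(f, n) for n in range(0, 7)]
print(F)                  # [0, 2, 4, 6, 8, 10, 12]
print(sample(f, 100))     # 200
print(sample(f, -3))      # 0
```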

Reconstruction

We digitize (sample) our signal, we do some magical digital signal processing on that signal, and then what do we do with the results? Frequently, we want to convert that digital result back into an analog signal. The problem is that the sampling process can't perfectly represent every signal. Specifically, the Nyquist-Shannon sampling theorem states that a signal sampled at a rate of $2N$ can be perfectly reconstructed only if it contains no frequencies at or above $N$. To give an example, audio CDs are sampled at a rate of 44100 samples per second. This means that frequencies on an audio CD must stay below $44100\div 2=22050$ Hz. Humans can hear frequencies up to around 20000 Hz. From this, we can conclude that the audio CD is well-suited to storing audio data for humans, at least from a sample rate standpoint. A device that converts from a digital representation to an analog one is called a Digital-to-Analog Converter (DAC).
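
To see why frequencies above half the sample rate cannot be represented, the following Python sketch samples two cosines at the CD rate. The 30000 Hz test tone is an arbitrary value chosen for this illustration. Its samples turn out to be the same as those of a 14100 Hz tone, so after sampling the two cannot be told apart (an effect usually called aliasing).

```python
import math

SAMPLE_RATE = 44100                   # audio CD samples per second
NYQUIST = SAMPLE_RATE / 2             # half the sample rate, in Hz

def sampled_cosine(freq_hz, n):
    """n-th sample of a unit-amplitude cosine at freq_hz."""
    return math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)

above = 30000                         # arbitrary test tone above the limit
alias = SAMPLE_RATE - above           # 14100 Hz, below the limit

# Compare the first 1000 samples of both tones, allowing for rounding error
same = all(
    math.isclose(sampled_cosine(above, n), sampled_cosine(alias, n), abs_tol=1e-9)
    for n in range(1000)
)
print(NYQUIST)    # 22050.0
print(same)       # True
```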