Clock and Data Recovery/Structures and types of CDRs/The CDR based on a first order PLL

This fundamental architecture is often used for CDRs. As it contains all the fundamental features of a PLL, it is also useful to fix concepts and to better understand the more sophisticated architectures of higher order.

This architecture (the simplest of the three) is best suited to implement CDRs with fast and slip-free acquisition of phase lock.

Its response to an abrupt change of phase (step input), even if combined with a frequency difference between the timing of the received signal and the free-running frequency of the local oscillator (ramp input), is always free from slips.

The phase difference during the acquisition is always a decreasing function. The only exception is a small drift that may originate from a mistuning between f_fr and f_p: it would only be evident during periods of missing transitions, and only if the sign of the mistuning made the VCO oppose the direction of the acquisition transient.

Generalities of the 1st order (type 1) PLL

Simple and efficient. Preferred for phase aligners, but also valid wherever there are no special requirements.

It may present itself under different guises, as seen in the page of examples, but the mathematical model remains the same.

It is maybe easier to derive the linear model equations from the first example in the following figure (the "slave"), although the third example (the "phase aligner") is the one most frequently found in actuality.

The only function that does not belong to the linear model (and the only one that is not a transfer function) is, obviously, the one that depends on non-linearities for its definition: the jitter tolerance function.

Two different jitter tolerance functions are derived for the two different examples studied here, which incorporate respectively one and two hard non-linearities.

The Phase Aligner is a CDR circuit that shifts the phase of the received signal so that it matches the phase of a reference clock.

The simplest form of the phase aligner is the so-called "gated oscillator". A gated oscillator is inactive until the first transition of the incoming burst is detected: at that moment the oscillator is immediately released to its free-running state, starting with the phase that best matches the phase information of that first transition in the burst of pulses.

The gated oscillator is great for applications where the burst is very short and the signal strength high. It could be found in the old Teleprinter systems, where one character at a time made up one burst, but it can also be seen in modern applications like PON.

Another kind of phase aligner is the delay-locked loop.^[1]

But the best compromise between fast acquisition and immunity from phase noise (jitter) can be found taking advantage of all the initial transitions available in the incoming burst.

The case has been well standardized for PON systems in ITU-T Recommendation G.984.3. It specifies that the burst starts with some initial pulses without transitions, used to assess the optical gain needed in the receiver for that burst. Immediately after that, a preamble of pulses is sent with the maximum possible number of transitions for the CDR phase acquisition.

Then bits that shall be correctly regenerated begin, and by then the CDR is expected to be already locked and to sample correctly these information (=payload) bits.

The preamble on each packet is discarded by higher levels of the protocol stack.^[2]

The phase aligner that implements the CDR of a modern PON OLT receiver is an interesting example.^[3]

In accordance with the definition of a phase aligner, it shall incorporate the elastic buffer function, and the delay line can be located as the first stage of processing of the received signal, with the phase comparator just behind. (Dieter Verhulst, et al. 2004^[3] is a good reference for further reading on this aspect. It may be remarked that the phase comparator in the CDR of this article is of the linear type.)

Topology of an OLT phase aligner. The received signal, delayed by the delay line, is recovered with (= regenerated by the sampling of) the system clock.
This approach (that the delay line is the first processing done on the received signal) inherently implements also the phase alignment expected of an elastic buffer.

The circuit of the above figure is in fact a 1st order, type 1, control loop (= a first order PLL), and a complete CDR if also the gray part in the figure above is included.

jω and t functions

The system under study is described by a mathematical model that is relatively simple as long as the system is linear and time-invariant (together abbreviated as LTI) and causal.

Under these conditions the model is simplified:

the differential equations become algebraic polynomial equations in the transformed domain;
all solutions in the t domain are functions equal to 0 for all t ≤ 0;
if the system is in steady state, the Laplace transforms (functions of s) coincide exactly with the corresponding Fourier transforms. It is enough to substitute s with 0 + jω in the former to obtain the latter.

But, on one hand, PLLs are never really unstable and, on the other hand, a complex function of a complex variable is very difficult to measure and to plot!

To analyze the steady state operation of the PLL (where the transfer function belongs), the transformed domain is not left, but two substantial simplifications apply to all the functions of s that describe such steady state operation.

Real independent variable. In steady state not all possible values of s are to be considered, but just those of its imaginary part ω, which is a real variable. A function of the real variable ω is obtained, but it still assumes complex values, described by a magnitude and an argument.
Real dependent variable. If our system is a minimum phase system (a reasonable assumption), just the magnitude (= absolute value) of the system frequency response shall be used, as it contains all the information. The argument is neglected by the PLL engineers.

To analyze the transient operation of the PLL, the functions in the Laplace domain are reverse transformed to obtain solutions in the form of time function (e.g. the Jitter Transfer Function is reverse transformed to obtain the Unit Step Response).

Jitter transfer function

A jitter x(t) (that represents a deviation of the timing instants from their ideal positions), applied to the PLL input, combines with the feedback signal y(t). Their difference generates the error signal ε(t) at the input of the phase comparator.

Moving to the transformed s domain:

( X(s) - Y(s) ) = E (s)

The other two blocks of the loop establish the relation:

Y(s) = E(s) · G_φ · G_f · G_VCO / s

Eliminating E(s) by substitution, the explicit relation between Y(s) and X(s) is obtained:

Y(s) / X(s) = 1 / (1 + s/G)

Note: G = open loop gain = G_φ · G_f · G_VCO
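The substitution step can be spelled out as follows. This is only a worked sketch of the algebra, using the same symbols as the text (G is the product of the three block gains):

```latex
\begin{aligned}
Y(s) &= \bigl(X(s) - Y(s)\bigr)\,\frac{G}{s} \\
Y(s)\left(1 + \frac{G}{s}\right) &= X(s)\,\frac{G}{s} \\
\frac{Y(s)}{X(s)} &= \frac{G/s}{1 + G/s} = \frac{1}{1 + s/G}
\end{aligned}
```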

The transfer function is useful in this form when the system stability is studied.

The magnitude is normally represented in a Bode magnitude plot.

The magnitude of the transfer function in this case is:

\left|{\tfrac {Y(j\omega )}{X(j\omega )}}\right|=\left|{\tfrac {1}{1+j\omega /G}}\right|={\tfrac {1}{\sqrt {1+(\omega /G)^{2}}}}
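The magnitude formula can be evaluated numerically to see the low-pass shape. The following is a minimal sketch, not a reference implementation: the function names and the value chosen for G are illustrative assumptions.

```python
import math

def jitter_transfer_mag(omega, G):
    """|Y(jw)/X(jw)| = 1 / sqrt(1 + (w/G)^2) for the 1st order loop."""
    return 1.0 / math.sqrt(1.0 + (omega / G) ** 2)

def to_db(x):
    """Convert a magnitude ratio to decibels."""
    return 20.0 * math.log10(x)

# Hypothetical open loop gain in rad/s, chosen only for illustration.
G = 2.0 * math.pi * 1.0e6

for ratio in (0.01, 0.1, 1.0, 10.0, 100.0):
    mag = jitter_transfer_mag(ratio * G, G)
    print(f"w/G = {ratio:7.2f}   |Y/X| = {to_db(mag):8.2f} dB")
```

At ω = G the magnitude is 1/√2 (about 3 dB down), and well above G it falls by about 20 dB for each decade of ω.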

All CDRs act as low pass filters on the timing signal, and remove all the (jitter) frequencies above a cut off frequency, called ω_n in the model of a 1st order loop and ω_n2 in the models of the 2nd order loops.

The figure above contrasts the magnitude of the jitter transfer of a 1st order loop (in red) with the jitter transfer of a 2nd order loop of type 1 (in blue).

The comparison is made in the (unrealistic) case that ω_n = ω_n2, in order to show the relative merits of the fundamental models studied in this book.
The 2nd order loop of type 1 is the best architecture if sharp jitter filtering is required, as the next pages emphasize. (It has also weak points, so no architecture is best in all aspects).
The slope of its low-pass asymptote is 40 dB/decade (corresponding to a double zero at frequency ∞).
The 1st order loop is not remarkable in this respect: the slope of its low-pass asymptote is 20 dB/decade (corresponding to a single zero at frequency ∞).
The frequency of the received line pulses, ω_p, is shown as well, as a vertical spike of the yellow line.

ω_n1 is a function of G only

The cut-off frequency ω_n (where the asymptotes cross) is for this architecture exactly equal to the loop gain G.

It is called ω_n or ω_n1, depending on whether being the parameter of a 1st order loop is to be stressed or not. The two notations are equivalent.

This is only approximately true for the other two loops that are going to be investigated in detail further on.

The cut-off frequencies of those two other (2nd order) loops coincide exactly with G only if their damping ratio ζ is exactly 1, and deviate slightly from G (one increases, the other decreases) proportionally to twice the deviation of ζ from 1.

The deviation is relatively small because, as will be repeated a few times, ζ shall not be much different from 1.

All PLLs act as low-pass filters for the input jitter.

The meaning of the parameter ω_n (for all PLLs) can be seen as:

the loop interprets all jitter frequency components below ω_n as useful signal to lock to;
the loop interprets all jitter frequency components above ω_n as input noise and rejects them.

Error signal

The error signal is the result of the comparison between:

an input signal (the CDR input in a slave CDR, the local clock in a phase aligner) and
the feedback signal that locks into it (the local clock in a slave CDR, the phase shifted received signal in a phase aligner).

The time distance between input and output can also be seen as a phase distance if time is divided by the duration of a line pulse, i.e. by 2π/ω_p.

In the range of frequencies where the PLL must track the phase of the incoming bit stream (from ω = 0 to ω = ω_n), the error signal should be close to zero.
In the range of frequencies where the PLL must disregard the phase of the incoming bit stream (from ω = ω_n to ω = ∞), the error signal should be close to the input signal.

The error signal is a function belonging to the mathematical model, indicated respectively as ε(t) or E(s) or E(jω), and is of special interest for CDRs because it models the distance between the eye center and the sampling clock inside the CDR.

E(jω) is especially meaningful, because:

jitter is investigated in "steady state" conditions, i.e. with the Fourier transforms: sinusoidal input, sinusoidal output and sinusoidal waveforms at every inner node of the PLL;
the PLL output tracks exactly the input with just the difference ε(t);
ε(t) is exactly the difference between the eye center and the sampling instant: when ε(t) is maximum the lateral eye margin is minimum;
the maximum value of a sinusoidal deviation is its magnitude: the useful function is therefore |E(jω)|.

Its magnitude tells, at every jitter frequency, the amplitude of the sinusoidal distance between the input and the output phases. It is easy to realize that this function is the error signal function:

\left|\mathrm {E} (j\omega )\right|=\left|X(j\omega )-Y(j\omega )\right|=\left|X(j\omega )\right|{\tfrac {\sqrt {\left(\omega ^{2}/\omega _{n1}^{2}\right)}}{\sqrt {1+\left(\omega ^{2}/\omega _{n1}^{2}\right)}}}

Note: G = open loop gain = ω_n1 = natural frequency

To have a model function that can synthetically represent case by case how much of the input jitter becomes error signal, it is practical to normalize the error signal to a jitter input of exactly 1 rad:

\left|{\tfrac {E(j\omega )}{X(j\omega )}}\right|={\tfrac {\sqrt {\left(\omega ^{2}/\omega _{n1}^{2}\right)}}{\sqrt {1+\left(\omega ^{2}/\omega _{n1}^{2}\right)}}}={\tfrac {1}{\sqrt {1+\left(\omega _{n1}^{2}/\omega ^{2}\right)}}}
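The normalized error function can be checked against the jitter transfer function numerically. This is a small sketch under the 1st order model of this page; the function names and the normalized value ω_n1 = 1 are assumptions made for illustration.

```python
import math

def error_mag(omega, omega_n1):
    """|E(jw)/X(jw)| = 1 / sqrt(1 + (w_n1/w)^2): a high-pass shape."""
    return 1.0 / math.sqrt(1.0 + (omega_n1 / omega) ** 2)

def transfer_mag(omega, omega_n1):
    """|Y(jw)/X(jw)| = 1 / sqrt(1 + (w/w_n1)^2): a low-pass shape."""
    return 1.0 / math.sqrt(1.0 + (omega / omega_n1) ** 2)

OMEGA_N1 = 1.0  # normalized natural frequency (assumption)

for omega in (0.01, 0.1, 1.0, 10.0, 100.0):
    e = error_mag(omega, OMEGA_N1)
    h = transfer_mag(omega, OMEGA_N1)
    # For this 1st order loop the two squared magnitudes add up to 1.
    print(f"w = {omega:6.2f}  |E/X| = {e:.4f}  |Y/X| = {h:.4f}  sum of squares = {e*e + h*h:.6f}")
```

Below ω_n1 the error is a small fraction of the input jitter (the loop tracks); above ω_n1 the error approaches the full input jitter (the loop no longer follows).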

The regeneration of the data depends on sampling the received pulses (that have undergone amplification, equalization and filtering of out-of-band noise) close to the time of maximum amplitude of each pulse. At that time the remaining noise and intersymbol interference still alter the pulse to a certain extent. The phase error shifts the sampling point away from its optimum and closer to the side of the eye, where the eye is less open and the probability of error increases.^[4] In other words, a phase error that does not affect clock tracking may still increase the bit error rate to intolerable levels.

Jitter tolerance

Jitter tolerance of the 1st order slave with frequency control of the VCO

For the slave 1 - 1 loop, the jitter tolerance can be obtained as |X(jω)|_{|Ε(jω)| = Φ_leo}, which corresponds to the hypothesis that the tolerance is limited by (and only by) the lateral eye opening, see also the relevant page. (Φ_leo is the lateral eye opening expressed in radians.)

In other words the mathematical function is obtained as if it was the range of the phase comparator that set the tolerance limit, but reduced to the lower value Φ_leo.

The comparator range corresponds to the ideal width of the received pulse left and right of its mid point ( ±π ).

The function that gives the ratio of the input to the error is:

{\tfrac {X(s)}{\mathrm {E} (s)}}={\tfrac {X(s)}{X(s)-Y(s)}}={\tfrac {1}{1-\left({\tfrac {1}{1+s/G}}\right)}}={\tfrac {1+s/G}{s/G}}

Its magnitude in steady state (s = jω), that represents the magnitude of the sinusoidal input generating an error of magnitude 1 rad ( |Ε(jω)| = 1 rad ), is the normalized jitter tolerance function:

\left|{\tfrac {X(j\omega )}{\mathrm {E} (j\omega )}}\right|=\left|X(j\omega )\right|_{\left|\mathrm {E} (j\omega )\right|=1}={\sqrt {\tfrac {(1+\omega ^{2}/\omega _{n}^{2})}{\left(\omega ^{2}/\omega _{n}^{2}\right)}}}

The magnitude of the jitter tolerance function (de-normalized) is the same function but taken for |E(jω)| = Φ_leo :

\left|X(j\omega )\right|_{\left|\mathrm {E} (j\omega )\right|=\Phi _{leo}}=\Phi _{leo}{\sqrt {\tfrac {(1+\omega ^{2}/\omega _{n}^{2})}{\left(\omega ^{2}/\omega _{n}^{2}\right)}}}
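The de-normalized tolerance can be tabulated directly from this formula. The sketch below is illustrative only: the function name and the normalized ω_n = 1 are assumptions, and Φ_leo = 2.25 rad is simply the value this page uses for its figures.

```python
import math

def jitter_tolerance(omega, omega_n, phi_leo):
    """|X(jw)| that produces |E(jw)| = phi_leo in the 1st order slave loop (rad)."""
    r = omega / omega_n
    return phi_leo * math.sqrt((1.0 + r * r) / (r * r))

OMEGA_N = 1.0   # normalized natural frequency (assumption)
PHI_LEO = 2.25  # lateral eye opening in rad

for omega in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0):
    tol = jitter_tolerance(omega, OMEGA_N, PHI_LEO)
    print(f"w/w_n = {omega:8.3f}   tolerance = {tol:10.3f} rad = {20.0 * math.log10(tol):7.2f} dB")
```

Far above ω_n the tolerance settles at Φ_leo; far below ω_n it grows as Φ_leo·ω_n/ω, i.e. by about 20 dB for each decade towards lower jitter frequencies.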

The Bode magnitude plot is given in the following figure, for different values of the parameter G:

If another value $\Psi_{leo}$ better simulates the circuit tolerance limit, the curve plotted still applies provided it is translated vertically by the amount $20\log_{10}(\Psi_{leo}/\Phi_{leo})$ dB.

At low jitter frequencies there is very good tracking, even if the jitter has a large amplitude. The circuit has time to follow these large but slow variations. It takes an extremely large jitter amplitude to reach the limit of tolerance.

At very high jitter frequencies the circuit is unable to follow the jitter that varies too fast. The tolerance is in practice exactly the lateral eye opening (or the phase comparator range, whichever limit is reached first). In fact, the PLL is not able to track the jitter at all and the local clock stays unmoving with respect to it.

The normalized jitter tolerance for all three important loop models used in (slave) CDRs is shown in the following figure.

The figure is drawn under the hypothesis that the tolerance is limited by (and only by) the lateral eye opening, which is reasonable in most practical cases of slave CDRs; see also the relevant page.

It may be noted that the Bode plot of the curve of normalized jitter tolerance is the mirror image (across the x axis) of the normalized error curve.

The asymptote towards low frequencies (in the log-log plot) has a slope of 20 dB/decade for both the type 1 systems, and of 40 dB/decade for the type 2 system.

Jitter tolerance of the 1st order phase aligner

For this architecture, there are two non linearities that shape the jitter tolerance function:

the range limitation of the phase comparator (better: the lateral eye opening), and
the range limitation of the phase adder (= of the elastic buffer).

The mathematical model to use is still the 1 - 1, with reference to the block diagram variant of the phase aligner (the bottom one in the figure at the beginning of this page).

The phase comparator multiplies the error signal by G_φ.

The integrator (or accumulator, if implemented digitally) is modeled via a $G_f/s$ block.

The fixed local clock is followed by the integrator and by the elastic buffer (that the output of the integrator controls). The final output is a phase that is the accumulation of the phase comparator output. It can be modeled as a block $G_f G_+/s$ (i.e. as a frequency controlled VCO), where $G_+ = 1$.

In order to be consistent with the formulas that apply to the other two loop variants in the figure, a gain G_VCO shall be added to the block diagram, so that the open loop gain becomes exactly $G = G_\varphi G_f G_{VCO}$.

Note that the integrator output is defined as -Y(s), which completes the correspondence with the model of the two other loop variants of the figure.

This VCO has a limited output range, corresponding to the range possible for the delay of the (variable) delay line.

Whenever the adder is driven (by -y(t)) to reach one of its limits, the adder is abruptly re-set to its central position.

This shifts the adder characteristic so that its center corresponds to the delay present at the moment of reaching the threshold, and becomes the zero reference for x(t) - y(t) from then on.

Using the results obtained in the relevant page about the tolerance function and the tolerance of the phase adders, the tolerance region defined with the lateral eye opening is further reduced. A horizontal line clamps it at a magnitude equal to the distance between the elastic buffer center and either of the two slip thresholds at its extremes.

The figure above contrasts the case of the 1 - 1 loop (in blue) with the case of the other two important loops (2-1 and 2-2) only to better understand the former. The other two loops in fact are not used for phase-aligners, as explained in a previous page.

For all the three loops, that are considered with the same natural frequency, the figure above shows:

an asymptotic value of the tolerance equal to 7 dB, 20log₁₀(2.25), as Φ_leo is chosen as 2.25 rad.
a clamp towards low jitter frequency to a magnitude of 28 dB, which corresponds to a total length of the elastic buffer (= of the phase adder) of 9 UI.
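The clamped tolerance of the 1 - 1 phase aligner can be sketched as the smaller of the two limits at each jitter frequency. This is only an illustration under stated assumptions: the function names and the normalized ω_n = 1 are hypothetical, decibels are taken relative to 1 rad (consistent with 20log₁₀(2.25) ≈ 7 dB), and the clamp is entered directly as the 28 dB level quoted above, converted to radians.

```python
import math

def to_db(x):
    """Magnitude in dB relative to 1 rad."""
    return 20.0 * math.log10(x)

def aligner_tolerance(omega, omega_n, phi_leo, clamp):
    """Eye-opening limit of the 1 - 1 loop, clamped by the phase adder range (rad)."""
    r = omega / omega_n
    eye_limit = phi_leo * math.sqrt((1.0 + r * r) / (r * r))
    return min(eye_limit, clamp)

PHI_LEO = 2.25                 # rad, value quoted in the text
CLAMP = 10.0 ** (28.0 / 20.0)  # rad, the 28 dB clamp level quoted in the text
OMEGA_N = 1.0                  # normalized natural frequency (assumption)

for omega in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0):
    tol = aligner_tolerance(omega, OMEGA_N, PHI_LEO, CLAMP)
    print(f"w/w_n = {omega:8.3f}   tolerance = {to_db(tol):6.2f} dB")
```

At low jitter frequencies the elastic buffer range is the binding limit; at high jitter frequencies the lateral eye opening is.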

The rejection of generated (VCO) noise

This section addresses the modifications that the loop itself makes on the noise
that is generated by the loop components or picked up from supply, ground and through cross-talk
and that ultimately propagates to the output signal.

The phase aligners are not addressed because their output has no phase noise (by definition),
even though a similar approach could be used to evaluate how the added noise may increase their BER.
But in phase aligners the most important signal impairments come from the transmission line and the PLL added noise is often less important.

Rather than with the noise spectrum or other noise characteristics (that are different with different loop components and technologies), this paragraph deals with the modifications that the loop itself operates on the noise frequency components.

The three blocks that make up the PLL all generate and pick up (from supply, ground and through cross-talk) noise that is added to the signal.

In the model of the loop the noise is added at the end of each block, as in the figure below. As the noise is relatively small with respect to the signal amplitude, it is correct to study it with a linear model.

As noise tends to be relatively stable, the steady state model generates useful results.

As the main source of (phase) noise is almost always the VCO, the relevant curve is plotted in the next figure for different values of the open loop gain.

The formulas derived from the loop model tell that the loop attenuates (disregards) the noise in the frequency band ( 0 to ω_n) where it tracks the input signal and leaves the noise unmodified in the band where it attenuates (from ω_n to ∞) the input signal.

This general behavior is common to all PLL variants (that are unity feedback, low pass loops).

The 1 - 1 loop is not particularly selective against added noise: it rejects the low frequency components from the VCO with a high pass below ω_n that is typical of the type 1 systems.
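The high-pass action on VCO noise can be made concrete with a short sketch. It assumes, as in the loop model described above, that the VCO noise N(s) is added at the VCO output of a unity feedback loop with open loop gain G/s; with no input jitter this gives Y(s)/N(s) = 1/(1 + G/s) = s/(s + G). The function name and the normalized G = 1 are illustrative assumptions.

```python
import math

def vco_noise_transfer_mag(omega, G):
    """|Y(jw)/N(jw)| = (w/G) / sqrt(1 + (w/G)^2), noise injected at the VCO output."""
    r = omega / G
    return r / math.sqrt(1.0 + r * r)

G = 1.0  # normalized open loop gain, equal to w_n for this loop (assumption)

for omega in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0):
    mag = vco_noise_transfer_mag(omega, G)
    print(f"w/G = {omega:8.3f}   |Y/N| = {20.0 * math.log10(mag):8.2f} dB")
```

Below ω_n the VCO noise is attenuated by about 20 dB per decade; above ω_n it reaches the output essentially unmodified.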

Noise coming from the earlier blocks is treated as the incoming signal, apart from a flat attenuation (or amplification).

Unit Step Response

The unit step response can be obtained from the (jitter) transfer function as follows:

take the (jitter) transfer function that is in the s domain; (its reverse transform -that is a function of t- is the unit impulse response). (The transform of the unit impulse is just a constant 1 in the s domain.)
multiply by the transform of the unit step function, that is 1/s;
reverse-transform into the time domain, i.e. obtain the USR:

{\color {Blue}1-e^{-\omega _{n}t}}
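The closed form can be cross-checked against a direct time-domain simulation of the loop. The sketch below integrates dy/dt = ω_n·(x − y) with a forward Euler step for a unit step input; the function names, the step size and the normalized ω_n = 1 are assumptions made for illustration.

```python
import math

def usr_analytic(omega_n, t):
    """Unit step response of the 1st order type 1 loop: 1 - exp(-w_n * t)."""
    return 1.0 - math.exp(-omega_n * t)

def usr_simulated(omega_n, t_end, dt):
    """Forward Euler integration of dy/dt = w_n * (x - y) with x = 1 for t >= 0."""
    y = 0.0
    samples = []
    steps = int(round(t_end / dt))
    for k in range(steps + 1):
        samples.append((k * dt, y))
        y += omega_n * (1.0 - y) * dt
    return samples

OMEGA_N = 1.0  # normalized natural frequency (assumption)
samples = usr_simulated(OMEGA_N, t_end=5.0, dt=1e-3)
max_err = max(abs(y - usr_analytic(OMEGA_N, t)) for t, y in samples)
print(f"largest deviation between simulation and closed form: {max_err:.2e}")
```

After a time 1/ω_n the output has covered about 63% of the step, and the remaining phase error keeps decaying exponentially without overshoot.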

The case of our 1st order type 1 loop is plotted in blue here below, together with the USR of the two other important loop models.

The parameter values used for this figure are chosen primarily to allow a meaningful comparison. The value of ζ used for the 2nd order loops in this figure is lower than the values used in practice.

The step response can also be seen as the response of the CDRs to the appearance of the incoming signal.

The local oscillator phase is uncorrelated with the phase of the incoming signal and, before the incoming signal (in some applications: the burst) is detected, there exists a phase difference.

The amount of such difference is unknown to the CDR before the burst is actually received.

As soon as the burst is detected, the phase difference between the first received pulse -delayed by the present length of the delay line- and the closest transition of the local clock is computed by the phase comparator.

The value of the difference is fed into the control loop: the phase lock starts.

The evolution of the phase lock is described by the step response.

The height of the step, in radians, is the value of the initial phase difference.

The response when left free-running (slave 1st order type 1)

A phase aligner cannot go "free-running", because its local oscillator is always locked by definition.

A slave CDR instead is left free-running, for instance, when the incoming signal has no level transitions ( and consequently the phase comparator cannot detect its phase ).

At that moment the phase comparator output steps abruptly to its neutral -i.e. no phase difference- level.

In a 1st order loop, the comparator output waveform is applied directly to the VCO input (apart from some flat amplification).

The VCO abruptly changes its frequency (that was f_p as long as it was in lock) jumping to its free-running frequency f_fr.

The sampling point (its phase is the integral of the VCO frequency step) drifts from its lock-in point following a linear (phase) ramp.

The slope of the ramp is the frequency difference between the free-running frequency of the local oscillator and the frequency of the remote transmit oscillator:

Δf = f_fr - f_p.

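With the frequencies expressed in Hz, the phase drifts by 2π · |f_fr - f_p| / f_p rad during each period of the received pulses. The (1st order type 1) loop therefore drifts by one full pulse period (2π rad) after f_p / |f_p - f_fr| periods of received pulses, and by 1 rad after f_p / (2π · |f_p - f_fr|) periods.

Errored bits appear as soon as the drift exceeds the lateral eye opening Φ_leo (1 rad is an improbably low value, 2 rad is a more likely threshold for the onset of errored bits).

If a (slave 1st order type 1) CDR is designed to overcome a run length of RL pulses, the drift accumulated over RL periods must stay within Φ_leo, so it shall exhibit a VCO frequency accuracy tighter than:

| f_fr - f_p | / f_p ≤ Φ_leo / (2π · RL)

This is always stricter than the rough bound 1 / RL, which corresponds to a drift of one full pulse period.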
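The free-running drift can be worked through numerically. The sketch below assumes frequencies expressed in Hz, so that the phase drifts by 2π·|f_fr − f_p|/f_p rad per pulse period; under that assumption a bound of 1/RL on the relative frequency error corresponds to a drift of one full pulse period (2π rad) over RL periods. The function names, the 1.25 GHz line rate, the 100 ppm offset and the run length of 72 are hypothetical values chosen for illustration.

```python
import math

def drift_per_period_rad(f_fr, f_p):
    """Phase drift (rad) accumulated in one pulse period while free-running."""
    return 2.0 * math.pi * abs(f_fr - f_p) / f_p

def periods_to_drift(phase_rad, f_fr, f_p):
    """Number of pulse periods needed to accumulate a drift of phase_rad."""
    return phase_rad / drift_per_period_rad(f_fr, f_p)

def max_relative_offset(run_length, max_drift_rad):
    """Largest |f_fr - f_p| / f_p keeping the drift within max_drift_rad over run_length periods."""
    return max_drift_rad / (2.0 * math.pi * run_length)

F_P = 1.25e9                 # hypothetical line rate, Hz
F_FR = F_P * (1.0 + 100e-6)  # hypothetical free-running frequency, +100 ppm
RL = 72                      # hypothetical run length, pulses

print("periods to drift 2*pi rad:", periods_to_drift(2.0 * math.pi, F_FR, F_P))
print("periods to drift 2 rad   :", periods_to_drift(2.0, F_FR, F_P))
print("max relative offset for 2 rad over RL:", max_relative_offset(RL, 2.0))
print("max relative offset for 2*pi rad over RL:", max_relative_offset(RL, 2.0 * math.pi))
```

With a 100 ppm offset the sampling point slides by a full pulse period in about 10 000 periods, but reaches a 2 rad eye limit in roughly a third of that time.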

Continuous time representation: is it sufficient?

The model described above and the simulator available on Google Docs both use a continuous time representation: is it sufficient in the case of the 1-1 CDRs that often operate close to the line baud frequency?

This question shall be asked here, because this architecture is used in the applications where the loop has to operate closest to the line frequency.

Discrete time representation is always more complex, in most cases unnecessary

In the burst mode receiver, the number of input signal transitions (which is almost the same as the number of clock cycles available for the acquisition of the lock) is small.
For instance, in the case of the 2.5 Gbps DS / 1.25 Gbps US GPON, just 20 to 50 transitions are truly available for the OLT CDR to lock the phase of an incoming burst.

Is a discrete time representation of our models to be used in order to ensure accuracy?

Luckily, in the case of the regenerator CDRs, that are in all practical cases associated with continuous mode operation, long acquisition times and tight jitter transfer bandwidth, all the modeling presented so far is adequate and accurate.

Let’s see the case of the PON OLT receiver (and of all burst mode receivers), where acquisition times are at the other extreme of the range (= very short).

The PON OLT receiver circuit (=burst mode) referred to above^[3] becomes slightly different in a discrete time representation:^[5]

The unit step response changes as well, but not so fundamentally, as the following figure shows us:

A continuous time representation is accurate enough also in this case, provided the scale factor of the clock frequencies of the two simulations is taken properly into account!

This result should come as no surprise, if we consider that the CDR function is essentially an averaging function of subsequent phase information coming as a discrete time series.

The need to filter severely in some cases (regenerator CDRs) makes discrete time modeling a largely unnecessary complication.

In the case of burst mode receivers instead, the reaction of the control system has to be so fast as to be executed within a few clock cycles. In this case as well, luckily, the simplicity of the 1 - 1 architecture makes the discrete time system behave not differently from a system where the phase information was inputted (and the whole system operated) continuously.
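The closeness of the two representations can be illustrated with a deliberately simple discrete-time model. This is a sketch under stated assumptions, not the circuit of the cited reference: the loop is reduced to a hypothetical accumulator update y[k+1] = y[k] + a·(x[k] − y[k]), executed once per clock cycle, with a = ω_n·T and an illustrative value a = 0.1.

```python
import math

def discrete_step(a, n):
    """Unit step response of y[k+1] = y[k] + a * (1 - y[k]), starting from y[0] = 0."""
    y = 0.0
    out = []
    for _ in range(n + 1):
        out.append(y)
        y += a * (1.0 - y)
    return out

def continuous_step(rate, n):
    """Continuous-time response 1 - exp(-rate * k), sampled at the clock instants k."""
    return [1.0 - math.exp(-rate * k) for k in range(n + 1)]

A = 0.1  # hypothetical loop gain per clock cycle (w_n * T)
N = 50   # number of clock cycles, of the order of a burst preamble

disc = discrete_step(A, N)
cont = continuous_step(A, N)
matched = continuous_step(-math.log(1.0 - A), N)  # continuous rate matched to the discrete pole

print("max |discrete - continuous| :", max(abs(d - c) for d, c in zip(disc, cont)))
print("max |discrete - matched|    :", max(abs(d - m) for d, m in zip(disc, matched)))
```

In this simplified model the discrete response is 1 − (1 − a)^k, which stays within a few percent of the step height of the continuous curve for a = 0.1, and coincides with a continuous exponential at the sampling instants once the rate is rescaled to −ln(1 − a).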

References

[1] Wikipedia: delay-locked loop
[2] Serial Programming/Forming Data Packets
[3] Dieter Verhulst, Xin Yin, Johan Bauwelinck, Peter Ossieur, Xing-Zhi Qiu and Jan Vandewege (the INTEC team of Ghent University), “A robust phase detector for 1.25Gbit/s burst mode data recovery”, IEICE Electronics Express, Vol. 1, No. 18, pp. 562-567 (2004). http://www.jstage.jst.go.jp/article/elex/1/18/1_562/_article
[4] Ransom Stephens, “Tektronix Jitter 360° Knowledge Series”, from http://www.tek.com/learning/
[5] Alan V. Oppenheim, Ronald W. Schafer, John R. Buck: Discrete-Time Signal Processing, Prentice Hall, ISBN 0-13-754920-2
