Time, Phase, Frequency, Delay

Signal theory is a topic which intimidates many audiophiles. We generally have a good grasp of the concept of time and feel comfortable discussing frequency. But many consider phase and signal delay to be magical quantities in some sort of esoteric alchemy, understandable only with great effort and through extensive mathematical abstraction. In this article I will attempt to present a relatively simple treatment of these subjects, with emphasis on their importance in loudspeaker design and testing.

First, let’s make sure that we all agree about the meaning of time domain data and frequencies. Time domain data is any quantity that varies with time, which in the case of loudspeakers is usually a sound pressure or signal voltage. Music (as viewed on an oscilloscope) is an example, as are the other various signals (impulses, MLSs, sweeps, tone bursts, noise) used in loudspeaker tests.

You might think that frequency is another word for pitch, describing the notes of a musical scale. But for this discussion, frequency is a parameter relating only to sinewave-like signals. Very few musical instruments approximate a sinewave shape in their output. While we may hear a pitch when listening to an acoustical sinusoid (a sinewave-like waveform), a note from an instrument such as a piano is not a single sinusoid but a pattern of sinusoids. In most cases, we identify the note with the lowest strong sinewave frequency present, but in some cases the ear can hear a pitch corresponding to a low frequency which isn’t really there in any appreciable amount.

The sinewave around which this concept of frequency is based is a special repeating time domain signal. An ideal sinewave is forever; it has no beginning or end. It (along with its phase shifted alter-ego, the cosine wave) has a specific value at any specified time. To minimize the confusion, we’ll refer to points in time using numbers to indicate positions. "Time = 0" doesn’t mean "the beginning of time", but just some convenient reference time with all previous time to be denoted in negative seconds and all later time in positive seconds.

Mathematical sinewaves and cosinewaves have a maximum size (amplitude) of 1. The frequency of these waves is a measure of how many times a second the basic sinewave shape repeats itself.

Cosine waves and sine waves are identical except for a phase or time shift of 90 degrees, which means 90/360 (or one fourth) of the basic waveform. Sine waves have a value of 0 at time=0, and cosine waves have a value of +1 at time=0. Cosines rather than sines are generally used in discussions of Fourier theory.

Fourier showed that any time domain signal can in principle be made from a sum of sized and delayed cosine waves. This means that you could, at least in theory, take a huge number of cosine waves, put them all through a network which adds the voltages (shifted and amplified) together at each point in time, and duplicate any possible time domain signal at the output. Remember that each individual cosine wave varies over time between positive values and negative values, and the summation at any point can therefore be positive, negative, or zero. If you are uncomfortable with the idea that your treasured recording of Beethoven’s Ninth could be duplicated with just a bunch of sinewave generators, I should mention that it would require an infinite number of them; this is abstract theory, you know.

The Fourier transform, used in many signal analyzers, essentially breaks a time domain waveform into its component cosine waves. The transform does this by revealing the size and phase position required of the cosines at each frequency to reconstruct the original waveform.

As an example of Fourier summation, a square wave contains cosine waves (delayed and sized) only at the frequencies which are odd multiples of the repetition rate.
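To make the square wave example concrete, here is a minimal Python sketch. It assumes a square wave that swings between +1 and -1 and crosses zero going upward at time 0; under that assumption each odd harmonic k has a size of 4/(pi*k) and is a cosine shifted by -90 degrees. The function name and the choice of 2000 terms are mine, purely for illustration.

```python
import math

def square_wave_partial_sum(t, freq_hz, n_terms):
    """Sum the first n_terms odd harmonics of a +/-1 square wave at time t."""
    total = 0.0
    for i in range(n_terms):
        k = 2 * i + 1                    # odd harmonic number: 1, 3, 5, ...
        size = 4.0 / (math.pi * k)       # size of this harmonic
        phase = -math.pi / 2             # each cosine is shifted by -90 degrees
        total += size * math.cos(2 * math.pi * k * freq_hz * t + phase)
    return total

# A quarter of the way through a 1 Hz cycle the ideal square wave sits at +1,
# and three quarters of the way through it sits at -1.
top = square_wave_partial_sum(0.25, 1.0, 2000)
bottom = square_wave_partial_sum(0.75, 1.0, 2000)
```

With a finite number of terms the sum only approaches the flat top and bottom; the more harmonics you add, the closer it gets.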

The classic nonrepeating waveform is the impulse. This waveform has a value of zero at all times except at time 0, where the waveform is at plus infinity. These idealized spikes don’t happen in real life, of course, yet they are central to the concept of frequency response. If you could take all the cosine waves at all possible frequencies, all with the same amplitude of one, all defined at the same time=0, and play them simultaneously, they would sum to form this impulse. Another way of stating this is that the impulse has equal contribution from all frequencies, all at zero phase shift (where phase refers to that of a cosine waveform). A system which could pass this waveform would have a perfectly flat frequency response in magnitude and phase.

The construction of an impulse from cosines can be envisioned as follows: since all these cosine waves have zero phase shift, they each equal 1 at time 0. Therefore, the sum of all the infinite number of cosine waves at time 0 will be the sum of an infinite number of 1s. At non-zero times, however, you will have some cosine waves (extending up through infinite frequency) being at positive values, some at negative values, and some at zero. At times which are not 0, the positives will balance out the negatives and the resulting sum will be zero (note that this is not meant to be a proof, but merely to serve as a conceptual aid).
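The same idea can be tried numerically. The sketch below is only a finite stand-in for the infinite sum: it assumes 1000 cosines at 1, 2, ... 1000 Hz, so the spike is tall rather than infinite, the cancellation away from time 0 is approximate rather than exact, and (because the frequencies are evenly spaced) the spike repeats once a second.

```python
import math

def cosine_sum(t, n_freqs):
    """Sum unit-size, zero-phase cosines at 1, 2, ..., n_freqs Hz at time t."""
    return sum(math.cos(2 * math.pi * k * t) for k in range(1, n_freqs + 1))

peak = cosine_sum(0.0, 1000)       # every cosine equals 1 at time 0
off_peak = cosine_sum(0.3, 1000)   # positives and negatives largely cancel
```

The value at time 0 grows with every cosine added, while the value away from time 0 stays small no matter how many are used.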

Because the impulse contains all frequencies phase aligned at equal levels, an approximation of it makes a very handy test signal. If you feed a bandwidth and size limited impulse through a system, all its component cosine waves might come out the other side with their sizes changed and their phase shifted. This change is called the frequency response of the system (a frequency domain representation). The changes to each cosine consequently affect the shape of the impulse as it passes through the system, forming what is known as the impulse response (a time domain representation). You could apply cosine or sine waves of each frequency, one at a time, to measure the frequency response characteristic of a system and tally the curve up on a frequency-by-frequency basis. But the impulse can be used to do it all at once, assuming you can perform the Fourier transform to analyze the cosine components after passing through the measured system. A real world version of the impulse, finite in size and bandwidth, is often used to measure the frequency response of loudspeakers with devices such as the basic IMP (more sophisticated techniques are available in Liberty Audiosuite or IMP/M for deriving the impulse response with greater immunity to noise).
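As a rough illustration of going from an impulse response to a frequency response, the sketch below applies a plain discrete Fourier transform to a short list of samples. It is not how IMP or Liberty Audiosuite work internally; the function names and the eight-sample "system" (one that halves the signal and delays it by three samples) are hypothetical.

```python
import cmath
import math

def dft(samples):
    """Plain discrete Fourier transform (slow, but fine for short records)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]

def frequency_response(impulse_response):
    """Return (magnitude, phase in degrees) per bin, from 0 Hz to half the sample rate."""
    spectrum = dft(impulse_response)
    half = len(impulse_response) // 2 + 1
    return [(abs(c), math.degrees(cmath.phase(c))) for c in spectrum[:half]]

# Hypothetical system: halves the signal and delays it by 3 samples.
impulse_response = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]
response = frequency_response(impulse_response)
```

For this made-up system the magnitude comes out the same at every frequency, while the phase steps down by 135 degrees per bin and wraps around whenever it passes -180 degrees.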

The reason for all this trouble to define any signal as a collection of cosine waves is so that unified descriptions can be made of predominantly linear systems such as loudspeakers, which modify signals. If we can describe how a linear device modifies cosine waves of any frequency, we need not measure the separate characteristics of square-wave response, triangle wave response, or Beethoven’s Ninth Symphony response. The full frequency response including phase and magnitude data, or equivalently the impulse response, contains the information needed to mathematically determine how a system will treat most any waveform within its dynamic range.

The concept of phase tends to confuse a lot of speaker builders and audiophiles. But it’s really very easy. The phase shift of a cosine wave is, first of all, a relative measurement defined in relation to an unshifted cosine wave. Take your wave in question and scale its size (make it bigger or smaller in height) so that its maximum value is +1. Then see how far you must move it right or left to get it to match the reference cosine wave. Phase is always a comparison between two cosine waves of the same frequency. If the reference isn’t specified it can usually be assumed to be that at the input of a measured system or else a spectrum where all cosines are aligned at a reference time. We don’t measure phase shift in seconds, but in portions of a cosine wave cycle.

This is the first main difference between phase shift and delay. While delay is measured in time, i.e., seconds, the units used for phase shift are radians (there are 2*pi of these per cycle) or degrees (360 per cycle).

But because the sine or cosine waves are eternally repeating, a funny thing happens: shifting a wave backwards or forward one full cycle (360 degrees) gives the same result as not shifting it at all, and is also the same as shifting it any whole number of cycles forward or back. As a result, any shift possible can be done within a phase change of one cycle. This is true because we are talking about only cosine waves of a single frequency. As long as a single frequency waveform is being used for measurement of a speaker, you cannot tell whether the shift is x degrees or 360+x degrees; there will appear to be (and will in fact be) no difference at all between the two.

Delay, on the other hand, is how far in seconds you must shift a waveform to the left on a typical oscilloscope plot to get it to align with a reference unshifted waveform (usually the input signal). If your waveform for analysis is a simple cosine wave, you can still get there by shifting only within whatever time equals 360 degrees. The actual phase shift corresponding to a given time shift will depend on the frequency of the cosine wave. For instance, if the frequency were 1/360th of a Hertz, each cycle of the wave would last 360 seconds and one second’s delay would correspond to one degree of phase shift (360 degrees/360 seconds). A delay of 6 minutes would correspond to a phase shift of 360 degrees, the same as no phase shift at all. The same 6 minute delay for a 1/1440 Hz waveform (each cycle being 1440 seconds), however, would correspond to 90 degrees.
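The arithmetic in these examples is simply degrees = 360 x frequency x delay. A small sketch, with function names of my own choosing, reproduces the three cases and shows the full-cycle ambiguity by folding the answer into the range from -180 to +180 degrees. The result is reported as a positive number of degrees of lag.

```python
def phase_shift_degrees(delay_s, freq_hz):
    """Phase lag, in degrees, from delaying a cosine of freq_hz by delay_s seconds."""
    return 360.0 * freq_hz * delay_s

def wrap_degrees(angle):
    """Fold an angle into the range -180 (inclusive) to +180 (exclusive)."""
    return (angle + 180.0) % 360.0 - 180.0

one_second = phase_shift_degrees(1.0, 1.0 / 360.0)        # about 1 degree
six_minutes = phase_shift_degrees(360.0, 1.0 / 360.0)     # about 360 degrees
six_minutes_wrapped = wrap_degrees(six_minutes)           # about 0: same as no shift
slower_wave = phase_shift_degrees(360.0, 1.0 / 1440.0)    # about 90 degrees
```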

But most waveforms are not single frequency waves but, in Fourier theory, are collections of cosine waves. And these complex waveforms may not repeat at all, and if they do, will not repeat at the same rate as each component cosine wave. So delay is not such an ambiguous quantity for such a complex waveform. But delay can be rather hard to define if the waveform is warped by a magnitude response or is phase shifted in such a way that its shape doesn’t come out resembling the original. This is usually the case for loudspeakers, most of which do not have equal delay at all frequencies (even if the frequency response magnitude might be essentially flat) and which consequently alter complex waveform shapes.

Whether phase distortion as generated in existing loudspeaker systems is audible is a subject of some controversy. I think we can agree that an imagined pathological system that delayed, for instance, mids and highs by only microseconds but bass frequencies by several months would certainly have an audible delay characteristic. But less severe delay errors may or may not be such a problem, and the dividing line is not clearly known.

If you are designing a speaker and wish it to have ideal waveform replication, what do you look for in the phase response? Do you want a speaker which has the smallest delay possible? One which imparts the same phase shift at all frequencies? Or that has flat "group delay"? Or that shows no curving sections in a plot of phase versus frequency? Just what does an ideal phase characteristic look like?

Delay itself is what audio playback is all about. When you listen to your recording of Belafonte at Carnegie Hall, there is a delay of over thirty years at work on that signal! In speaker measurements, the amount of delay you get depends on the distance from the speaker to the measuring microphone. Remember, sound travels about a foot per millisecond; you get an additional delay of around one millisecond for each foot of speaker to mike spacing. In short, the "true phase response" of a microphone or speaker depends on just how far away you measure it; it has meaning only if you specify some reference plane. So there would seem to be little motivation to pursue minimum delay in a speaker design.

How about the phase of a complex waveform like an impulse or Beethoven’s Ninth? Can you shift a complex waveform by some number of degrees? In general, no. You can, however, shift each of its component cosine waves. But the result may not be what you expect. If you shift each component by 360 degrees or any whole multiple thereof, you will of course not change any of them and will therefore not change the complex waveform. If you shift by 180 degrees, you will invert the waveform (turn it upside down)! But if you shift by, say, 65 degrees, what do you get?

Shifting all the cosine components of a complex waveform by angles which aren’t whole multiples of 180 degrees (for example, a fixed shift of 65 degrees for any frequency) results in severe waveform distortion! So equal phase shift at all frequencies is generally not desirable from the standpoint of waveform fidelity.
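You can see this for yourself with a very small experiment. The sketch below assumes a made-up waveform of just two cosines (1 Hz at full size plus 3 Hz at half size) and shifts both by the same angle; the waveform and names are illustrative only.

```python
import math

# Hypothetical test waveform: (relative size, frequency in Hz) of each cosine.
COMPONENTS = [(1.0, 1.0), (0.5, 3.0)]

def waveform(t, shift_deg):
    """Sum the components at time t after shifting each one by shift_deg degrees."""
    shift = math.radians(shift_deg)
    return sum(a * math.cos(2 * math.pi * f * t + shift) for a, f in COMPONENTS)

times = [i / 200 for i in range(200)]            # one period of the 1 Hz component
original = [waveform(t, 0) for t in times]
unchanged = [waveform(t, 360) for t in times]    # identical to the original
inverted = [waveform(t, 180) for t in times]     # the original turned upside down
warped = [waveform(t, 65) for t in times]        # a different shape altogether
```

The original peaks at 1.5, where both cosines crest together. After the 65 degree shift the two crests no longer coincide and the peak drops to roughly 1.27, so the result is neither a delayed nor a rescaled copy of what went in.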

What you want for ideal waveform replication is not a response without delay nor with a constant phase shift versus frequency, but with constant delay versus frequency. This is known as a uniform delay or "linear phase" characteristic.

The same delay applied to all frequencies means that the phase shift will be different for different frequencies. In fact, the phase shift of a linear phase system can be expressed as being proportional to frequency (remember that each phase angle has many aliases; you can add or subtract any whole multiple of 360 degrees to each angle value without changing it). A frequency response with uniform magnitude (the "dB" part) and which imparts the same delay to all frequencies will preserve waveform shape. If magnitude or delay is not flat, however, the delay is difficult to determine because the shape of a complex (multi-frequency) test signal is changed by the system being measured. And phase shift or delay is rather ambiguous if you consider single cosine waves in isolation.

One way to deal with this impasse is to consider the cosine components at closely spaced frequencies and their phase shifts in relation to each other as they pass through the system. In other words, look at how the phase shift changes for small changes in frequency. This leads to a definition of what is called "group delay", mathematically defined as the negative rate of change of phase versus frequency. On a plot of phase versus linear frequency, this is the "slope" or downward steepness of the phase curve (the sharp upward edges are a result of phase ambiguity where -180 degrees is equated to +180 degrees). A uniform, waveform preserving phase response will have a constant value of this slope over the entire curve.
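Here is one way the slope might be estimated from measured data, as a sketch only. It assumes the phase is given in degrees at linearly spaced frequencies, and that the true phase changes by less than 180 degrees between neighboring points so the wraparound jumps can be undone. The test data is a hypothetical pure 1 millisecond delay.

```python
def wrap_degrees(angle):
    """Fold an angle into the range -180 (inclusive) to +180 (exclusive)."""
    return (angle + 180.0) % 360.0 - 180.0

def unwrap_degrees(phases_deg):
    """Undo the +/-180 degree jumps so the phase curve is continuous."""
    out = [phases_deg[0]]
    for p in phases_deg[1:]:
        step = wrap_degrees(p - out[-1])     # smallest equivalent step
        out.append(out[-1] + step)
    return out

def group_delay_seconds(freqs_hz, phases_deg):
    """Estimate group delay between neighboring points: -(change in phase) / (360 * change in frequency)."""
    ph = unwrap_degrees(phases_deg)
    return [
        -(ph[i + 1] - ph[i]) / (360.0 * (freqs_hz[i + 1] - freqs_hz[i]))
        for i in range(len(ph) - 1)
    ]

# Hypothetical data: a pure 1 ms delay sampled every 100 Hz from 100 Hz to 2 kHz.
freqs = [100.0 * i for i in range(1, 21)]
phases = [wrap_degrees(-360.0 * f * 0.001) for f in freqs]
delays = group_delay_seconds(freqs, phases)
```

Every entry of delays comes out at about 0.001 second, the constant slope expected of a uniform delay.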

But beware! The phase response curves shown by most measurement systems (including IMP and Liberty Audiosuite) are normally given in log frequency format. A plot of a uniform delay system, other than one with zero delay, will then show a line for which the downward slope gets steeper as frequency is increased, because more frequencies are scrunched together toward the right side. In such a plot, a response with a phase shift line which tilted downward without a bend in it would actually have a non-uniform delay.

Beware also that constant group delay does not guarantee uniform time delay. A uniform time delay will exhibit constant group delay, but the opposite is not necessarily true. For example, remember the shift of each cosine component by 65 degrees. The phase response of a system bringing about this shift could be represented by a straight line at 65 degrees. The slope of that curve and therefore the group delay of such a system is a constant zero. Yet the resulting waveform distortion definitely indicates non-uniform time delay.

A uniform time delay implies that you could remove delay from the phase response to achieve a horizontal line at 0 degrees (which means no delay). If you can’t get to a straight horizontal line by removing delay or if the horizontal line you then achieve is at some angle other than 0 degrees, the system being measured will exhibit waveform distortion. So it seems that a way to determine whether a delay is constant over a band of frequencies is to work backwards and see whether removing some constant delay will get you to zero degrees everywhere within that band. To remove delay at a given frequency, take the time delay being tried (in seconds) and multiply it by 360 times the frequency. Do this for each data frequency and add the result to each raw phase angle (in degrees), remembering that you can also add or subtract 360 degrees as many times as necessary at each point to reduce the answer to within +/-180 degrees. Do this for various delay values until a best approximation of a straight line is reached.
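The procedure just described can be sketched in a few lines of Python. Two things here are my own assumptions rather than part of the recipe: the trial delays are taken from a fixed list, and "best" is scored as the smallest worst-case distance from 0 degrees, which folds the straightness check and the 0 degree check into one step. The measured data is a hypothetical 2.5 millisecond delay.

```python
def wrap_degrees(angle):
    """Fold an angle into the range -180 (inclusive) to +180 (exclusive)."""
    return (angle + 180.0) % 360.0 - 180.0

def remove_delay(freqs_hz, phases_deg, delay_s):
    """Add 360 * frequency * delay to each raw phase angle, then wrap to +/-180."""
    return [wrap_degrees(p + 360.0 * f * delay_s) for f, p in zip(freqs_hz, phases_deg)]

def flatness_error(phases_deg):
    """Largest departure from 0 degrees (an assumed scoring rule)."""
    return max(abs(p) for p in phases_deg)

def best_delay(freqs_hz, phases_deg, candidates_s):
    """Try each candidate delay and keep the one leaving the flattest phase."""
    return min(
        candidates_s,
        key=lambda d: flatness_error(remove_delay(freqs_hz, phases_deg, d)),
    )

# Hypothetical measurement: a pure 2.5 ms delay sampled every 100 Hz.
freqs = [100.0 * i for i in range(1, 21)]
raw = [wrap_degrees(-360.0 * f * 0.0025) for f in freqs]
candidates = [i * 0.0001 for i in range(51)]      # 0 to 5 ms in 0.1 ms steps
found = best_delay(freqs, raw, candidates)
residual = remove_delay(freqs, raw, found)
```

In this example the search lands on 2.5 ms and the residual phase sits at 0 degrees throughout. Note that with data every 100 Hz, delays differing by a whole 10 ms cannot be told apart, so the candidate range has to be kept narrower than that.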

That isn’t very practical to do by hand. It is much more practical to simply apply a known complex waveform to the system and see whether it passes through uncorrupted in the time domain. That is the motivation for square wave or triangle wave testing. However, such simple time domain testing may tell you that there is waveform distortion, but won’t clearly indicate whether any problem uncovered is due to frequency response magnitude or phase error. In addition, meaningful square wave tests are not easily conducted on loudspeakers because of the inability to remove the effects of room reflections or echoes which also strongly alter the shapes of the reproduced periodic waveforms.

With IMP or LAUD generated quasi-anechoic responses, one can easily remove delay mathematically using the computer until the best approximation to a straight line is achieved on the plot over the frequencies of interest. This can be done by trial and error without much effort. If the line is then essentially near 0 degrees (if at 180 degrees, you can reverse the speaker leads), the delay of the speaker is uniform and complex waves made up from these frequencies will pass phase aligned. Of course, the magnitude of the frequency response must also be flat to achieve waveform integrity. In IMP or LAUD, fixed amounts of delay can be subtracted from or added to a phase plot by using the [F9] key.

You can get a close starting value for the amount of delay to remove by measuring the distance from speaker to the mike and multiplying the value in feet by 0.886 (or the value in meters by 2.91). The result is the number of milliseconds it took for the signal to reach the mike after it left the speaker. For this to be valid (in the case of IMP and LAUD), the measurement should be made with a "Cal" normalization from the signal at the crossover input (in dual channel mode for LAUD). You should also set the first time marker to a placement of "1" before any transformations so that the time window includes the entire time of flight. (For further information about normalization, time markers, windows, and measurement devices, see the series of IMP articles in Speaker Builder issues 1, 2, 3, 4 and 6 of 1993, the IMP Guide, or the Liberty Audiosuite manual.)
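For convenience, the distance-to-delay conversion can be wrapped in a tiny helper. This sketch uses the 0.886 and 2.91 factors quoted above; the function name and the unit labels are my own.

```python
MS_PER_FOOT = 0.886    # milliseconds of delay per foot of spacing
MS_PER_METER = 2.91    # milliseconds of delay per meter of spacing

def time_of_flight_ms(distance, unit="ft"):
    """Starting estimate, in milliseconds, of the speaker-to-mike delay."""
    if unit == "ft":
        return distance * MS_PER_FOOT
    if unit == "m":
        return distance * MS_PER_METER
    raise ValueError("unit must be 'ft' or 'm'")

three_feet = time_of_flight_ms(3.0, "ft")    # about 2.66 ms
one_meter = time_of_flight_ms(1.0, "m")      # about 2.91 ms
```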

I should again mention that you probably need not have such extremely uniform phase response or waveform shape integrity for good sound reproduction. It is not unusual to find that the phase curve of a very good speaker straightens only over short portions of the audio band and often away from 0 degrees, revealing a non-uniform delay. The degree of phase distortion which is tolerable before audible sound quality is affected is another subject altogether.