Distortion Isolation in the Time Domain

PRAXIS' "Distortion Isolation" is a technique for separating nonlinear distortions and noises (harmonic and intermodulation distortion, buzzes and rattles, etc.) from linear effects (such as frequency response, phase response, and room reflections) in the reproduced result of any program material.

For those who remember their Fourier theory, the operation of Distortion Isolation is relatively easy to understand.   But just in case, I'll start with a simple review.

Remember that if a system is both linear and time-invariant, then its effects on an input signal can be completely characterized by determining the system's impulse response.  The impulse response is equivalent to the full complex frequency response, and these two can be converted between each other using the Fourier Transform and the Inverse Fourier Transform.  This essentially means that if you know a linear, time-invariant (LTI) system's impulse response (or complex frequency response), then you can in principle calculate its exact output signal, for any input signal in its passband, as shown in the two equivalent diagrams below.
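
For readers who prefer to see this numerically, here is a minimal sketch of the equivalence, using only the Python standard library.  The short impulse response "h" and input "x" are made-up illustrative values, and the naive DFT is only for demonstration; a practical tool would use an FFT library.  It shows that convolving with the impulse response in the time domain gives the same output as multiplying complex spectra and transforming back.

```python
import cmath

def convolve(x, h):
    """Linear convolution of two sequences (direct form)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, adequate for short sequences."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

# Hypothetical impulse response and input signal (illustrative values only).
h = [1.0, 0.5, 0.25]
x = [1.0, -1.0, 2.0, 0.0, 3.0]

# Time-domain path: convolve the input with the impulse response.
y_time = convolve(x, h)

# Frequency-domain path: zero-pad both to the full output length,
# multiply the spectra, and transform back.
n = len(x) + len(h) - 1
X = dft(x + [0.0] * (n - len(x)))
H = dft(h + [0.0] * (n - len(h)))
y_freq = [v.real for v in dft([a * b for a, b in zip(X, H)], inverse=True)]

max_difference = max(abs(a - b) for a, b in zip(y_time, y_freq))
```

The two results agree to within floating-point rounding, which is the sense in which the impulse response and the complex frequency response carry the same information.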

"Linear" refers to a situation where the system's effects on an input signal (the impulse response) do not change when test levels are changed.  You get the same impulse response at low levels as at high levels; the system never distorts, compresses, rattles, or clips no matter how much you scale the input signal.  This is an idealized situation, as no real systems or devices are completely linear: even a wire will eventually melt or arc over if driven with a large enough voltage signal.  Real-world loudspeakers and amplifiers will produce greater distortions as signal levels are increased beyond some point, though with some amplifiers the distortion can increase as signal level is decreased.

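The level-independence described above can be checked numerically.  The following is only a sketch: both "systems" are hypothetical stand-ins (a two-tap filter, and the same filter followed by hard clipping), not models of any real device.  A system passes the check if scaling the input by some gain simply scales the output by the same gain.

```python
def lti_system(x):
    """Hypothetical linear system: a two-tap FIR filter."""
    return [x[n] + 0.5 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def clipping_system(x, limit=1.0):
    """The same filter followed by hard clipping, a simple nonlinearity."""
    return [max(-limit, min(limit, v)) for v in lti_system(x)]

def is_level_independent(system, x, gain, tol=1e-12):
    """True if system(gain * x) equals gain * system(x) within tol."""
    driven_harder = system([gain * v for v in x])
    scaled_output = [gain * v for v in system(x)]
    return all(abs(a - b) <= tol for a, b in zip(driven_harder, scaled_output))

x = [0.1, 0.4, -0.3, 0.2]
print(is_level_independent(lti_system, x, gain=10.0))       # True
print(is_level_independent(clipping_system, x, gain=10.0))  # False
```
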
"Time-Invariant" means that the system produces the same output result every time the same input signal is fed to it.  The result doesn't change over time, and has no random components.  The act of testing once doesn't affect the following tests.  Again, this is an idealized situation, as real systems and devices have internal noise which will not be the same each time.  In a practical sense, there will also be significant external noise in any test environment, the effects of which will not be the same each time a test is run.  This will be particularly true when acoustical devices (loudspeakers and microphones) are involved.

Since none of our real audio systems are completely linear and time-invariant (or "LTI"), we might think of them as having both an LTI part and a non-LTI part, both of which act on any input signal.  The resulting output could then be considered to be the sum of the LTI and non-LTI parts of the system.

The LTI part includes all the effects of frequency response, delays, and environmental reflections, which might be mitigated through equalization, room treatments, or directivity control.  The Non-LTI part includes the harmonic and intermodulation distortions, hums, noises, rattles, and spurious or aliasing tones, which are usually more difficult to deal with.  Mathematically speaking, the LTI part is that which can be calculated, given the system's impulse response and a representation of the input signal.  The Non-LTI part is the portion of the signal which cannot be calculated that way.  This, of course, assumes that the system is "approximately linear", so that it can meaningfully be said to have an impulse response when measured at levels low enough to avoid significant distortion, yet high enough to stay above measurement and ambient noise.

If you've been following up to this point, it should be clear where all this is going.  If we can measure an impulse response that we can consider to represent the small-signal characterization of the audio system, then given a digital signal, we can calculate the LTI-only part of the output.  And, if we digitize the actual output signal of the system, and subtract the calculated LTI part from it,  we are then left with the isolated Non-LTI part, as illustrated in the diagram below.  The distortion and noise part is isolated, for analysis or audition.
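
The idea can be tried out in a few lines of Python.  This is a toy simulation under stated assumptions, not the PRAXIS implementation: the "device" is a hypothetical short filter followed by hard clipping, the impulse response is obtained by feeding it a small impulse, and all function names are invented for the illustration.

```python
import math

def convolve(x, h):
    """Linear convolution of two sequences (direct form)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def device(x, clip=0.8):
    """Hypothetical device under test: FIR filter followed by hard clipping.
    Returns as many samples as it was fed, like a fixed-length recording."""
    hidden_h = [1.0, 0.3, -0.1]
    filtered = convolve(x, hidden_h)[:len(x)]
    return [max(-clip, min(clip, v)) for v in filtered]

def measure_impulse_response(system, length, level=0.01):
    """Small-signal characterization: feed a low-level impulse, rescale the result."""
    probe = [level] + [0.0] * (length - 1)
    return [v / level for v in system(probe)]

def isolate(system, program, ir):
    """Return (LTI part, Non-LTI part) of the system's response to 'program'."""
    actual = system(program)                     # digitized actual output
    lti = convolve(program, ir)[:len(program)]   # calculated LTI-only output
    non_lti = [a - p for a, p in zip(actual, lti)]
    return lti, non_lti

ir = measure_impulse_response(device, length=8)
loud = [math.sin(2.0 * math.pi * n / 16) for n in range(64)]
quiet = [0.1 * v for v in loud]

_, residual_loud = isolate(device, loud, ir)
_, residual_quiet = isolate(device, quiet, ir)
peak_loud = max(abs(v) for v in residual_loud)
peak_quiet = max(abs(v) for v in residual_quiet)
```

At the low level the residual is essentially zero, because the device is behaving linearly there; at the high level the residual contains only what the clipping removed from the signal.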

This is somewhat similar in flavor to the Baxandall/Hafler power amplifier distortion tests [1],[2], by which you can listen to the difference between the input and a scaled-down output of a power amplifier.  An ideal amplifier should reveal no audible difference.  Such subtraction techniques can give not only a numerical measure of the difference (the energy in the difference can be determined using some other calculations or test equipment) but also a way to subjectively experience the difference by itself, to hear the character of what the system being measured is doing to a given input signal.  The new technique described here, however, is also applicable to loudspeakers, listening environments, or entire audio systems as well as to amplifiers.

Reasons for Isolating

The Distortion Isolation process can be useful for a number of purposes.  

For the equipment or system designer, it can give direction for improvement efforts.  If the distortion and noise (Non-LTI) part is very low or non-offensive sounding, then significant improvements are more likely to come from equalization, noise reduction, room treatment, or directivity control.  If the Non-LTI part is high and/or irritating, then efforts toward improving linearity or signal level handling are warranted.

Isolating the Non-LTI portion will also help in determining the severity or give hints about the cause of problems.  Mechanical rattles should be more easily identified.  Amplifier clipping has an abrupt, rather immediate sound.  Signal compression tends to leave a Non-LTI part that sounds a lot like the LTI part.

Distortion isolation also somewhat mitigates the subjective nature of some listening tests.  Human ears can still be used to judge how much of a problem the distortion is, but the effects are not as much masked by the linear parts of program material as in a normal listening test.  This could perhaps lead to better agreement by listeners about the effects of such distortions, as the effects should be more apparent.  Of course A/D and D/A converters are involved, so there could be some objection about the purity of such results, particularly by those who are wary of digital audio.

This kind of test can also help separate listener fatigue from real system problems.  Sometimes just having the volume too high, even if clean, makes it sound bad.  Ears are not perfectly linear, either!  Distortion Isolation can help determine whether the system is having trouble or the listener is just tired of too-high volume.

Optimizing the results

We can improve on the test, if we are wanting to detect nonlinear distortions but are not as concerned with noises or rattles.  Almost any modern synchronous measurement system provides a way to average measurements.  This averaging will tend to reinforce the repeatable part of a response, while rejecting the time-variant noises.  Averaging is always a good thing when acquiring the impulse response, as it reduces noise in the system characterization.  The acquisition of the Actual Output (the top path in the previous figure) can also be averaged, but doing this will depend on what you are trying to accomplish.  You can use averaging to mitigate a test environment that is not free of ambient noise.  If you are searching for repeatable non-linearities, then averaging is good.  If you are looking for, say, door panel rattles in a vehicle, then averaging is probably not a good idea, as the rattles may not be the same each time and would then be reduced in the process.  In such a case, a quiet test environment is required as noise reduction via averaging is not an option.
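
The effect of synchronous averaging is easy to demonstrate.  The sketch below is a simulation with assumed values (a sine as the repeatable part, Gaussian noise as the time-variant part, 16 takes); it is not how PRAXIS implements averaging.  With uncorrelated noise, the noise level after averaging N takes should fall by roughly the square root of N.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def average_takes(takes):
    """Sample-by-sample average of several time-aligned recordings."""
    n = len(takes)
    return [sum(samples) / n for samples in zip(*takes)]

random.seed(0)  # fixed seed so the illustration is repeatable

# Repeatable part of the response (assumed: a plain sine).
clean = [math.sin(2.0 * math.pi * n / 32) for n in range(256)]

def noisy_take():
    """One simulated acquisition: the repeatable part plus fresh random noise."""
    return [v + random.gauss(0.0, 0.1) for v in clean]

single = noisy_take()
averaged = average_takes([noisy_take() for _ in range(16)])

noise_single = rms([a - b for a, b in zip(single, clean)])
noise_averaged = rms([a - b for a, b in zip(averaged, clean)])
```

A rattle that differs from take to take behaves like the noise term here, which is why averaging suppresses it along with the ambient noise.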

The observant reader will notice that, for acoustic systems like that shown in the previous diagram, a measurement microphone is also included.  Any noises or distortions due to the microphone will also unavoidably appear in the Non-LTI output signal.  Unfortunately, for high level tests as are usually of interest for this process, the only option is an infusion of money.  Typical unmodified electret microphone capsules usually will begin to compress at higher listening levels.  For best results, a high voltage condenser microphone capable of handling high SPL levels is recommended.  Of course, for non-acoustic tests, this is not required, as only electrical probes are then involved.

For this technique to work well, you need to first acquire a clean impulse response that is indicative of the linear performance of the system being tested under the exact same conditions as will be used for the output acquisition.  When testing acoustic systems, the impulse response should be at least as long as the reverberation time of the system and room.  The impulse response should also be measured at relatively low input levels, to avoid compression of the output signal.  The best method to use for acquiring the impulse response is the "Log-Swept Sine" technique, as that has the ability to eliminate harmonic distortion in the measurement [3], leaving a very pure impulse response result.
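
For reference, a log-swept sine stimulus can be generated with the commonly used exponential-sweep formula, as in the sketch below.  The sweep limits, duration, and sample rate are assumed example values, the function names are invented, and this covers only the stimulus; the deconvolution step that turns the recorded sweep into an impulse response is not shown.

```python
import math

def log_chirp(f_start, f_stop, duration, sample_rate):
    """Sine sweep whose frequency rises exponentially from f_start to f_stop."""
    k = math.log(f_stop / f_start)
    n_samples = int(round(duration * sample_rate))
    scale = 2.0 * math.pi * f_start * duration / k
    return [math.sin(scale * (math.exp(k * (n / sample_rate) / duration) - 1.0))
            for n in range(n_samples)]

def instantaneous_frequency(f_start, f_stop, duration, t):
    """Frequency (Hz) that the sweep has reached at time t seconds."""
    return f_start * (f_stop / f_start) ** (t / duration)

# Assumed example: 20 Hz to 20 kHz in one second at a 48 kHz sample rate.
sweep = log_chirp(20.0, 20000.0, duration=1.0, sample_rate=48000)
f_end = instantaneous_frequency(20.0, 20000.0, 1.0, 1.0)
```

Because the sweep spends equal time in each octave, low frequencies get much more energy than they would in a linear sweep of the same length.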

The Process

The Distortion Isolation (Time Domain) process has been largely automated for the Liberty Instruments "PRAXIS" system, in its "Distortion Iso (Time Domain)" script.  The script guides the user and runs through the process as follows, performing the setups and acquisitions.

First is selection of the program material to be used.  It should not be too long, to minimize the time required to later calculate the expected LTI portion of the output signal.  WAV files of half a minute or less are manageable with most computers.  The file is played through the system in a dry run, so the user can turn up the volume to the desired levels (usually to where he thinks he hears some objectionable effects) and can adjust the input gain levels of the measurement system (in the "Levels Form") to optimize its dynamic range.  These settings must then remain stable through the rest of the process.

Next an impulse response measurement is made, preferably using some user-configurable averaging, via a Log-Swept Sine ("Log Chirp") at a reduced drive level.  This impulse response should be made immediately before the acquisition of the actual output of the system, to reduce the possibility of changes in air pressure, microphone position, or electronic component drift causing error in the calculated LTI portion.  It is important that the impulse response that is measured is that of the system output with respect to the internal digital signal path of the computer measurement system, including all gains and delays, measured as exactly as possible.  The A/D converters and D/A converters of the measurement system, like the microphone, are all part of the system being measured, and must be included in the IR measurement.  In PRAXIS, a special "Chirp (synchronous)" measurement stimulus is provided expressly for this purpose.

 The software then combines the WAV file and the measured  impulse response data using an operation known as linear convolution.  The result of the convolution is another data set that represents what the A/D output would be if the program file were played through an LTI version of the system that is being measured.  

Finally, the WAV file program material is actually played through the system, and the result is digitized and recorded by the measurement system.  If only time-invariant phenomena are being investigated (that is, if noise or rattles should be reduced), then this recording can be repeated and averaged as long as the user wishes.  After this, the expected signal is simply subtracted, sample by sample, from the measured signal, leaving the Non-LTI signal.

Output Format

The PRAXIS script provides this in a format in which the Left channel (channel 1) contains the LTI portion and the Right channel (channel 2) contains the Non-LTI portion.  Summing the two of course gives the entire measured output signal.  The data can be saved in PRAXIS' "px2" format, or can be exported as a stereo WAV file.
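
As an illustration of this layout, the sketch below writes a 16-bit stereo WAV with the LTI part on the left and the Non-LTI part on the right, using Python's standard "wave" module.  The function name and sample values are made up, and using one common scale factor for both channels is my assumption (it keeps the relationship between the channels intact); I am not claiming this is how the PRAXIS export works internally.

```python
import io
import struct
import wave

def write_di_wav(target, lti, non_lti, sample_rate=44100):
    """Write a stereo 16-bit WAV: left = LTI part, right = Non-LTI part.
    Both channels share one scale factor so their relative level is kept."""
    peak = max(max(abs(v) for v in lti), max(abs(v) for v in non_lti), 1e-12)
    scale = 32767.0 / peak
    frames = bytearray()
    for left, right in zip(lti, non_lti):
        frames += struct.pack("<hh", int(round(left * scale)),
                              int(round(right * scale)))
    with wave.open(target, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)          # 2 bytes per sample = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))

# Made-up sample values, written to an in-memory buffer instead of a disk file.
lti_part = [0.5, -0.25, 0.1]
non_lti_part = [0.05, 0.0, -0.02]
buffer = io.BytesIO()
write_di_wav(buffer, lti_part, non_lti_part)

# Read the file back to confirm its layout.
buffer.seek(0)
with wave.open(buffer, "rb") as wav:
    channels = wav.getnchannels()
    sample_width = wav.getsampwidth()
    frame_count = wav.getnframes()
    first_left, first_right = struct.unpack("<hh", wav.readframes(1))
```

Passing a file name instead of the in-memory buffer writes the same data to disk.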

The result files are best auditioned using headphones.  They can be played using various media players, or the PRAXIS program (in its full or free/demo operation) can be used for this.  PRAXIS can be downloaded from http://libinst.com/praxis_downloads.htm at no charge, and can be used to examine, further process, or listen to example files created using the Distortion Isolation method.  To use PRAXIS to listen, load the WAV file(s) into the Primary Plot using the menu "File, Open", selecting "Files of Type" = "wav" and browsing to where the files are stored.  After loading the file (it may take a few seconds, as the files are large), select the menu "File, Listen" on the Primary Plot form, and select the soundcard to use.  You can open the soundcard's mixer to select inputs or adjust volume.  With stereo WAV files, there will be a series of buttons labeled "m", "s", "L" and "R".  Before starting the playback, select "m" to hear both channels combined in mono (for a Distortion Isolation file, this will be the same as the Actual Output signal).  Select "L" to hear the LTI portion.  Most interestingly, select "R" to hear the distortion and noise (Non-LTI) portion by itself.  Click on the blue arrow at the left to begin playback.

Some Example Files

I had a request from a potential user in the automotive industry to generate some Distortion Isolation files with a car stereo system in a vehicle.  The following files were generated using PRAXIS 1.37c, with an M-Audio Transit card and an ACO Pacific 7012 microphone (with the "PS9200 Kit").  The vehicle was a 2005 Chrysler Town and Country van, with a six-speaker sound system.  The deck in the van was a JVC KD-SX980 (50W x 4).  This vehicle was used because it had a line level input on its panel.  The computer used was an older 650MHz Compaq laptop.

Two WAV files were used.  First was about 30 seconds of the cut "Lie Still Little Bottle" by They Might Be Giants from the "Lincoln" CD.  The second was from the cut "First We Take Manhattan" from Jennifer Warnes' "Famous Blue Raincoat" CD.  These were selected because of their heavy bass lines and reasonably clean sound.

We tried to get the cuts to stimulate some strong panel vibrations in the van.  We could make the speakers sound overdriven, but buzzes were not apparent.  The vehicle door panels were (disappointingly, for our tests) well behaved.  During the second take on the Jennifer Warnes cut, we also leaned a tray of silverware against a door speaker and tested without averaging to try to bring out some buzzes, with slight success.

In most of these cuts, a hum is audible in the background.  This is from operating lawn mowers in the neighborhood, the tests being done in a suburban setting on a Saturday afternoon.  The program material was fed in mono, simultaneously to all loudspeakers in the van.

You can download and listen to the files using the links below.  Remember, the left channel is "LTI", the right channel is "Non-LTI" (distortion and noise).  Also, all files are "normalized", that is, scaled in level to best use the 16-bit data format, so don't expect the "6dB higher level" to be 6dB higher when you listen.

Addendum, May 7, 2005: Another distortion isolation test was made the following weekend, this time using a slow "log-chirp" audio track to stimulate panel buzzing.  A log-chirp is a sinusoidal wave that slides slowly  in frequency.   The chirp stops at 400Hz (to avoid possibly damaging tweeters, which cannot usually handle high level steady state signals).  The test was done using a different vehicle (a 1997 Honda DelSol), and through an FM modulator.  There is evidence of significant compression, but it is unknown how much of this is from the speakers, or from the FM radio or the modulator.  But panel buzzing is distinctly audible in the Right channel.  This wav file is quite handy for testing for buzzes by ear, also, so a version is provided that can be easily used to burn a test CD.  There are few speakers or headphones that don't audibly rattle or buzz at even moderate levels when playing this!



LSLB original program.wav "Lie Still Little Bottle", single channel, no processing
LSLB DI Low Level.wav "Lie Still Little Bottle" Distortion Isolation file, at moderate listening level, 4x averaging
LSLB DI High Level.wav "Lie Still Little Bottle" Distortion Isolation file, at about 6dB higher level.  Some clipping evident on singer's voice. 4x averaging
FWTM original program.wav "First We Take Manhattan", single channel
FWTM DI High Level.wav "First We Take Manhattan" Distortion Isolation file, at high level, 4x averaging.  Speaker overload evident, slight buzzing.
FWTM DI Mod Level.wav "First We Take Manhattan" Distortion Isolation file, about 3dB lower, no averaging on Actual recording.  With silverware tray added to try for extra buzzing!
BassSweep.wav A log-chirp running from 10Hz to 400Hz over 10 seconds.  Used to generate the "DelSol 1.wav" file (below).
BassSweep44kHzStereo.wav A stereo (both channels identical), 44.1kHz sampled version of BassSweep, that can be easily burned onto CD for by-ear testing of speakers, cars, and rooms for buzzes and rattles.
DelSol 1.wav A Distortion Isolation file made using "BassSweep.wav", of  a 1997 Del Sol's stock stereo system, via an FM modulator. Played at a level where audible buzzing was apparent.

Analysis and comments

One oddity that was noticed is that there is a point early in the LSLB DI recordings at which it sounds like there is a breakup or distortion.  The same sound is not evident (to me) in the original program file.  I would have thought this was definitely non-linear distortion, but it appears in the LTI (left) channel!  Apparently my ears aren't so good at differentiating linear frequency-response effects from non-linear distortion effects.

The addition of the silverware tray didn't seem to make too much difference in the last test.  It was not noticeably rattling inside the vehicle, "live", either, but is mentioned in this report for completeness.

Files from these kinds of tests can be further processed to provide more objective results.  For each of the two "DI High Level" files, I separated the right and left channels, took their FFTs (to find their spectra), and then smoothed the results with 1/12th octave smoothing.  Then I divided the right (Non-LTI) spectrum by the left (LTI) spectrum to get a plot of relative "dB distortion" using program material as stimulus.  The graphs were better than I would have expected, with distortion about 40dB down through most of the midrange.  At lower frequencies, the curves rise significantly, probably because those frequencies are far below the cutoff of the loudspeakers.  At higher frequencies, the curve also rises, which could be due to more distortion, or to greater noise.  This latter rise may also be because harmonics of individual tones will be pushed toward higher frequencies.
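
The spectrum division can be sketched roughly as follows.  This is not the PRAXIS processing: the naive DFT, the simple power-average smoothing, the function names, and the synthetic test signals (a Non-LTI channel at exactly 1% of the LTI channel's amplitude) are all assumptions made for the illustration.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) transform)."""
    n = len(x)
    return [abs(sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n)
                    for m in range(n)))
            for k in range(n // 2 + 1)]

def smooth_fractional_octave(mag, fraction=12):
    """Power-average each bin over a window one 1/fraction octave wide."""
    half_width = 2.0 ** (1.0 / (2.0 * fraction))
    out = []
    for k in range(len(mag)):
        lo = int(math.ceil(k / half_width))
        hi = int(math.floor(k * half_width))
        window = mag[lo:hi + 1]
        out.append(math.sqrt(sum(v * v for v in window) / len(window)))
    return out

def relative_db(non_lti_mag, lti_mag, floor=1e-12):
    """Non-LTI spectrum divided by LTI spectrum, in dB, bin by bin."""
    return [20.0 * math.log10(max(d, floor) / max(s, floor))
            for d, s in zip(non_lti_mag, lti_mag)]

# Synthetic channels: three tones, with the Non-LTI channel 40 dB below the LTI one.
n = 128
tone_bins = (4, 8, 16)
lti = [sum(math.sin(2.0 * math.pi * b * m / n) for b in tone_bins)
       for m in range(n)]
non_lti = [0.01 * sum(math.cos(2.0 * math.pi * b * m / n) for b in tone_bins)
           for m in range(n)]

lti_mag = smooth_fractional_octave(magnitude_spectrum(lti))
non_lti_mag = smooth_fractional_octave(magnitude_spectrum(non_lti))
distortion_db = relative_db(non_lti_mag, lti_mag)
```

Only the bins that actually contain signal (4, 8, and 16 here) give a meaningful ratio; with real program material the spectrum is broadband, so the curve is meaningful over most of the audio range.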

The most surprising aspect of these plots is their similarity.  Both spectrum curves are overlaid in the PRAXIS plot shown below.  Considering that they were made using different program material files, the similarity is remarkable.


Thanks to Joachim Gerhard for his ideas, efforts, and bug reports about the Distortion Isolation script in PRAXIS.

Thanks to Tom Alverson for his assistance in setting up and making the example files, and for the use of his van.


[1] Baxandall, P.,  "Audible amplifier distortion is not a mystery", Wireless World, November 1977, pp.63-66.

[2] Hafler, D.,  "A Listening Test for Amplifier Distortion", HiFi News & Record Review, November 1986, pp.25-29. 

[3] Farina, A., "Simultaneous Measurement of Impulse Response and Distortion with a Swept-Sine Technique", presented at the AES 108th Convention, Paris, France, 2000 February 19 - 22.