The Glossary of Audio Measurements and Design Terms

system · April 3, 2023, 3:23pm

Use your browser search function (CTRL+F) to find particular topics you want to learn more about!

Introduction

In recent years, measurements of audio gear have become more widely available, with various sources putting out objective testing results of all sorts of headphones, DACs, amplifiers, power conditioners, you name it. But many of these measurements can be a bit difficult to understand at a glance. What does a particular graph mean? And how is it tested? This article serves as an introduction and guide to audio measurements, and as a glossary explaining the different tests and terms you might see: what they are, how they’re done, and what they represent.

The explanations given here will intentionally not go into too much detail and only tell you what you need to know, so that almost anyone can gain a sufficient understanding of a particular term or measurement, but links to additional reading are provided if you’re interested to learn more.

If you think that something is missing, or would like something explained or explored in a little more depth, leave a comment on the forum or come and ask on Discord!

The Basics

Before diving into specific tests and terms, let’s explain the three main ways that measurements of the output of a device can be presented:

Oscilloscope/Waveform:

The most basic way to show what a device is outputting is simply to show how the output changes over time. For a headphone/speaker, we would measure SPL (Sound Pressure Level), and for a DAC/Amplifier, we would measure voltage. In the graph below, you can see what a 1khz sine wave looks like on an oscilloscope as an example.

A sine wave is a very basic signal, and when playing music, the waveform will look quite a bit more complex, such as the graph below.

You can also see the waveform of a piece of music by looking at it in a program such as Audacity or Adobe Audition.

Additional Reading:

Tektronix - What is an oscilloscope?

FFT:

In many situations, an oscilloscope or waveform view will not be very useful for inspecting a signal or looking for certain types of distortion. There are two primary reasons for this.

Firstly, the scope view typically shows voltage on a linear scale, so looking for very small distortions on a larger signal is extremely difficult. They might not even be big enough to shift a pixel on your display! Using a logarithmic scale can fix this, but it makes it nearly impossible to show all content on the display at once when there are both small and large signals involved.

Secondly, the scope view makes it difficult to see at a glance what the frequency of the signal is, or what combinations of different frequency signals are making up what we are seeing.

For example, the image below shows the oscilloscope view of a signal that has two main frequency components. And without getting out a calculator, we can’t easily see what those frequencies are, or what the relative levels between them are.

To solve this, we use a ‘Fast Fourier Transform’ or FFT. The maths behind this is outside the scope of this article, and if you would like to learn more about it I’d strongly recommend watching this video, but all you need to know to understand audio measurements is that it breaks a section of signal down into lots of smaller, individual frequency components, and displays them with frequency on the X axis, and level on the Y axis. (Note: FFT bins have a finite width, and a bin does not always correspond to exactly 1hz.)

Now things are looking a bit more clear! We can instantly see that the signal we viewed on the scope is a combination of two sine waves, one at 1khz and one at 10khz. We can also see that way down at the bottom there is a flat ‘floor’. This is just white noise. All real devices will have some level of random noise, and usually some additional distortion too which will show as smaller or lower level ‘spikes’ on the FFT.

NOTE: Never use an FFT to judge the level of a noise floor; always use RMS values for this.

The noise floor shown on an FFT depends on the FFT parameters. An FFT breaks the signal into a number of different bins and shows the total magnitude of signal in each of those bins, so to work out the actual noise level from an FFT, you need to know the number of bins.
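As a rough sketch of why the displayed floor moves with FFT size: assuming ideal white noise and a rectangular window, the noise power is spread evenly over N/2 bins, so each bin sits about 10*log10(N/2) dB below the true RMS noise level. The function names and the -100dB example figure below are illustrative only, and the window-dependent correction that real analyzers apply is ignored here.

```python
import math

def fft_processing_gain_db(fft_size: int) -> float:
    """Approximate dB gap between RMS noise and the per-bin FFT floor.

    Assumes white noise and a rectangular window (no window correction).
    """
    return 10 * math.log10(fft_size / 2)

def displayed_floor_db(rms_noise_db: float, fft_size: int) -> float:
    """Where the flat 'floor' would appear on the FFT for a given RMS noise."""
    return rms_noise_db - fft_processing_gain_db(fft_size)

# Same hypothetical device (-100 dB RMS noise), three different FFT sizes.
for n in (1024, 32768, 1048576):
    print(n, round(displayed_floor_db(-100.0, n), 1))
```

Under these assumptions, each 32x increase in FFT size lowers the displayed floor by about 15dB, even though the noise of the device has not changed at all.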

An FFT also helps us to see whether there are any very small components in a signal that are not obvious on a scope view.

The image above shows the scope view AND the FFT for a signal. Just looking at the scope, it would look to be a 1khz sine wave, and that’s all. But if we look at the FFT, we can see that there is a little bit of content at 2khz, and some at 3 and 4khz too. Because it’s so small (less than 0.000001V!), it’s nearly impossible to see on the scope. The Y axis on our FFT, however, uses a decibel scale, which is logarithmic rather than linear: every time you go down by 20dB, the actual voltage is reduced by a factor of 10. So -40dB would be 100x less, -60dB is 1000x less and so on. This makes viewing even incredibly low level signal components very easy. FFTs can be displayed with either linear or logarithmic scales, but our hearing also operates on a roughly logarithmic scale for both frequency and amplitude, so log scales are much more representative of how we perceive things, and this is usually how FFTs are presented.
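The decibel arithmetic above can be written as a couple of one-line conversions. This is a minimal sketch for voltage ratios only (power ratios use 10 rather than 20), and the function names are made up for the example.

```python
import math

def db_to_voltage_ratio(db: float) -> float:
    """Convert a level in dB to a voltage ratio (20 dB per factor of 10)."""
    return 10 ** (db / 20)

def voltage_ratio_to_db(ratio: float) -> float:
    """Convert a voltage ratio to dB."""
    return 20 * math.log10(ratio)

# Each 20 dB step down divides the voltage by 10.
for db in (-20, -40, -60, -120):
    print(db, "dB ->", db_to_voltage_ratio(db))
```

Relative to 1V, -120dB works out to one millionth of a volt, which is the sort of level that is invisible on a linear scope view.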

Additional Reading:

NTI Audio - FFT Basics

NI - Understanding FFTs and Windowing

DesignNews - FFT Gain (Why FFT should not be used to evaluate noise floor)

Line Chart:

Lastly, many measurements, rather than showing an FFT or oscilloscope view, will plot one particular factor or value on the Y axis and another on the X axis. What these factors are will depend on the measurement, so always check the axis labels, but to provide an example, let’s show something that you’re perhaps familiar with, a frequency response chart.

In this case the measurement plots frequency on the X axis with device output level in dB on the Y axis, which is the frequency response of this device.

But the X and Y axis could contain any number of factors, such as THD vs Level, or Channel balance vs Attenuation.

Measurement Test Setup

Measurements can be influenced by the equipment and configuration used for the test setup. The same device tested on two different analyzers or in different environments may show different results. For this reason, we provide a short summary of the test setup, including both the measurement equipment, and any applicable settings/configurations of the device itself.

Audio Precision APx555 B-Series analyzer
Measurement setup and device under test are running on regulated 230V power from a Furman SPR-16-Ei
Dummy load is a Neurochrome HP-LOAD
Amplifier was warmed up for 12 hours prior to testing
‘Class A+Servo’ mode used unless otherwise specified
Exact analyzer/filter configurations for each measurement are detailed in the full reports
CH1 (Blue) = Left, CH2 (Red) = Right

Full Measurement Reports

Full Report Example

With an Audio Precision analyzer, you can create sequences within the software that will perform a user defined set of measurements automatically. The output of this sequence is saved as a .PDF file.

This is helpful for the person doing the measurements, as it allows you to create a long sequence of many types of measurements and setup configurations that can complete a full set of tests on a device in under an hour, where doing the same manually would have taken a full day or longer.

For the reader, it is helpful as it provides quite a lot of information about the configuration of the analyzer itself for a given measurement, such as bandwidth, input sensitivity, reference levels and more. It allows the user to verify exactly how a measurement has been taken and for those with in-depth knowledge to either confirm that a measurement has been done ‘properly’, or to point out potential issues in the setup if they exist.

Whilst the full report is not useful for most readers, and the majority of people will only be interested in the main provided measurement post itself, it is provided for the purposes of transparency, and it also contains some additional measurements, or repeats of other measurements but with different test parameters that some may find interesting. Please note however that the full report cannot tell you anything about the external setup such as configuration of a DAC/Amplifier under test, or what the external dummy load is set to.

Measurement Explanations:

In this section, we’ll talk through what each of the measurements that we provide (and some that we don’t) for headphones, DACs, and amplifiers are, how they’re tested, and a bit of info about what they might mean. If you see a measurement somewhere and you’re not sure what it is, come back to this post! If you spot something that is missing, do let us know. We will be updating this with more measurements over time.

If you’d like to jump to the glossary of terms instead, click here.

Index

Bandwidth
CMRR (Common Mode Rejection Ratio)
Crosstalk
CSD (Cumulative Spectral Decay)
Delay/Latency
Dynamic Intermodulation Distortion (DIM)
Dynamic Range
Frequency Response
Impulse Response
Interchannel Phase
Intermodulation Distortion (IMD)
Intersample Overs Susceptibility
Jitter
Linearity
Low Level Signal Output
Multitone
Noise (20hz-20khz, 20hz-96khz and 20hz-1Mhz)
Nyquist Reconstruction Filter / Oversampling
Output Impedance
Power
Power On/Off Behaviour (Safety Test)
SNR
Square Wave Output
THD+N (SINAD)
THD+N vs Frequency
THD+N vs Output Level
Volume Matching vs Level of Attenuation

Bandwidth

The bandwidth of a device is the range of frequencies that it is able to record or output.

In simplest terms, if a device can output as low as 20hz, and as high as 100khz, its bandwidth is 20hz-100khz.

Typically though, the bandwidth is not a hard limit. Instead, the sensitivity or output of a device will slowly decrease as you go up or down in frequency, and bandwidth is often given as the frequency at which a certain amount of attenuation is reached, most commonly -3dB.

As an example, the bandwidth of the device below would be described as having a -3dB point of 10khz.

This may also sometimes be described as the ‘Corner Frequency’, ‘Cutoff Frequency’, or simply shortened to ‘Bandwidth’ without any further descriptors.

This is usually tested by inputting a constant level sine-sweep signal into a device, and seeing at what frequency -3dB of attenuation is reached at the output, though it can also be tested with other content such as white noise, or derived from an impulse response test. A square wave test can also be used, as the rise time of a square wave can be used to calculate bandwidth.
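As a hedged illustration of the rise time approach, the sketch below uses the common 0.35 / rise-time rule of thumb. That rule assumes a simple first-order (single-pole) low-pass response and a 10% to 90% rise time, so it is only an approximation for real devices. The 3.5 microsecond figure and the function names are invented for the example.

```python
import math

def bandwidth_from_rise_time(rise_time_s: float) -> float:
    """Estimate the -3 dB bandwidth (Hz) from a 10-90% rise time (s).

    Uses the 0.35 / rise-time rule of thumb, which assumes a
    single-pole (first-order) low-pass response.
    """
    return 0.35 / rise_time_s

def first_order_attenuation_db(freq_hz: float, cutoff_hz: float) -> float:
    """Level of a first-order low-pass at freq_hz, relative to the passband."""
    return -10 * math.log10(1 + (freq_hz / cutoff_hz) ** 2)

cutoff = bandwidth_from_rise_time(3.5e-6)  # hypothetical 3.5 microsecond rise time
print(round(cutoff), "Hz")
print(round(first_order_attenuation_db(cutoff, cutoff), 2), "dB at the cutoff")
```

The second function shows where the -3dB figure comes from for this kind of response: at the cutoff frequency the output has dropped by 10*log10(2), or roughly 3dB.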

Additional Reading:

Electrical4U - Cutoff Frequency

Electronics Tutorials - Frequency Response Analysis of Amplifiers and Filters

CMRR (Common Mode Rejection Ratio)

CMRR is a measure of how well a balanced or differential device rejects common mode noise and interference. (Note: This is NOT related to shielding, that is a separate issue.) Common mode noise is noise present in phase on both the + and - signal lines.

A significant benefit of balanced connections is that because one of the two connections is inverted and subtracted from the other, any external common mode noise should in ideal situations be canceled out.

If external noise caused a change of +0.1V to the positive line, and also +0.1V on the negative line, then when we invert that negative signal at the receiving end, we end up adding +0.1V and -0.1V together, resulting in 0V and canceling out the change.

For this to work perfectly, both the positive and negative polarity signal paths must have identical gain, requiring very precisely matched signal paths through an amplifier.

If an amplifier does not have well matched signal paths and therefore a slight difference in gain, the CMRR value will be worse.

CMRR can sometimes vary depending on frequency.

CMRR is tested by connecting the differential output of an analyzer to an amplifier, inputting a signal, and measuring the output level.

The output of the analyzer is then switched to common mode, and a second level measurement is taken.

Diagram of the test circuit for CMRR

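CMRR is the difference in output level between the two measurements, expressed in dB. In other words, it is how much lower the output is when the signal is applied in common mode than when the same signal is applied differentially. A device with infinite CMRR would show no output at all in the common mode measurement, because all common mode content is rejected.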
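To make the link between gain matching and CMRR concrete, here is a simplified model of a differential receiver with slightly different gains on its two legs. This is a sketch under idealised assumptions (no frequency dependence, no source impedance imbalance), the names are made up, and the result is expressed as a positive number of dB, whereas some reports quote the same figure as a negative value.

```python
import math

def receiver_output(v_pos, v_neg, gain_pos, gain_neg):
    """Idealised differential receiver: subtracts the two legs after gain."""
    return gain_pos * v_pos - gain_neg * v_neg

def cmrr_db(gain_pos, gain_neg, level=1.0):
    """CMRR from the two measurements described above, in positive dB."""
    # Differential drive: the legs carry equal and opposite halves of the signal.
    diff_out = receiver_output(level / 2, -level / 2, gain_pos, gain_neg)
    # Common mode drive: both legs carry the same signal.
    cm_out = receiver_output(level, level, gain_pos, gain_neg)
    if cm_out == 0:
        return math.inf  # perfectly matched legs reject everything
    return 20 * math.log10(abs(diff_out) / abs(cm_out))

# A hypothetical 0.1% gain mismatch between the two legs.
print(round(cmrr_db(1.0, 0.999), 1), "dB")
```

In this model, a gain mismatch of only 0.1% already limits CMRR to about 60dB, which is why very precisely matched signal paths are needed for high figures.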

An XLR cable loopback on this test measures roughly -115dB on the APx555 at 1khz.

Additional Reading:

CircuitBread - What is Common Mode Rejection Ratio?

Audio Precision - Measuring CMRR

Wikipedia - Common-Mode Rejection Ratio

Crosstalk

Crosstalk is when signal from one channel leaks to the other, usually via a process called capacitive coupling.

This is typically a more prominent issue in devices with many audio channels in close proximity such as a mixing desk, but occurs in two-channel devices too. To test crosstalk, a high level signal is played through a device in one channel, and the output level of the other channel is measured. This is usually tested using a sine sweep rather than a single static signal, as crosstalk tends to increase at higher frequencies and therefore just testing at 1khz is insufficient. An example crosstalk measurement is shown below:

Crosstalk can be reduced most easily by physically separating/distancing the two signal paths, such as by using mono-block amplifiers instead of a stereo amplifier, though there are various aspects of device design that will influence crosstalk, and many stereo products can achieve extremely low levels of crosstalk.

Additional Reading:

Audio Precision (Video) - Crosstalk Basics

CSD (Cumulative Spectral Decay)

Cumulative spectral decay shows how the energy at a particular frequency decays over time after the input signal has stopped. It is effectively a frequency response chart, but with an additional time domain axis.

CSD is measured by performing an impulse response or transfer function measurement. For all linear, time invariant systems (which almost all audio products are, except at very low or very high levels), these mathematically provide the same information.

A “normal” FFT frequency response applies one Fourier transform to the entire impulse response, showing the total sum energy. In a CSD, a series of much shorter FFTs are applied to successive portions of the impulse response, meaning that CSDs have inherent tradeoffs of time and frequency accuracy (as shorter-time FFT windows are more coarse in frequency resolution).
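The time versus frequency tradeoff can be put into numbers with a small sketch. It assumes a plain FFT where the bin spacing is simply the sample rate divided by the window length, and the 48khz sample rate and window sizes are arbitrary example values.

```python
def window_tradeoff(sample_rate_hz: float, window_samples: int):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000 * window_samples / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_samples
    return duration_ms, bin_spacing_hz

# Long window: fine frequency detail, poor time detail. Short window: the reverse.
for samples in (4800, 480, 48):
    duration, spacing = window_tradeoff(48000, samples)
    print(samples, "samples:", duration, "ms window,", spacing, "Hz per bin")
```

A 1ms window at 48khz can only resolve frequency in 1000hz steps, so fine time detail and fine frequency detail cannot both be had in the same slice.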

CSD measurements are not often provided as they can be misleading to users without further information.

Firstly, CSD in any linear, time invariant system, shows the same information as frequency response. A peak in frequency response will show as ‘ringing’ on a CSD and vice versa, and can be influenced through the use of tools such as EQ. (Keep in mind that frequency responses and CSDs can be smoothed, and this may obscure narrow peaks/dips on either measurement.)

Headphones and speakers are in almost all circumstances approximately linear and time invariant, and therefore CSD is effectively just another way of displaying frequency response.

Additional Reading:

Jason Dai - Is CSD Really Important?

AudioJudgement - CSD Explained, and how to measure using ARTA

Latency

This measurement is typically only done for digital devices. It is simply the time taken between providing an input to the device, and it providing the output.

Most DACs have very low latency, but it can be affected by things such as oversampling, DSP, or digital buffers which add some latency.

Latency is usually not a concern for consumers, as it does not really matter if there is a 1ms delay between pressing ‘play’ and your music playing, or a 100ms delay. However when watching videos, delays around 50ms or higher can start to make things such as speech appear desynchronised.

It is also especially important for latency to be as low as possible in recording and audio production environments, so will be more of a concern there.

Additional Reading:

PreSonus - Digital Audio Latency Explained

Dynamic Intermodulation Distortion (DIM)

DIM is a technique used to measure the non-linearity of a device, and it is designed to be particularly sensitive to distortions produced during transient conditions typical of musical material.

In DIM measurements, a square wave at a frequency of 3.15 kHz is low-pass filtered and then combined with a sine wave at a frequency of 15 kHz.

If non-linearities are present, the DIM signal induces intermodulation distortion products at nine different frequencies ranging from 0.75 kHz to 13.35 kHz. DIM is then calculated as the ratio of the root mean square (RMS) sum of the levels of the nine intermodulation components, to the level of the 15 kHz sine wave. It is typically expressed as a percentage or in dB.
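The nine product frequencies and the final ratio can be sketched as below. This assumes the products fall at the difference between the 15 kHz sine and the first nine multiples of the 3.15 kHz square wave frequency, which reproduces the 0.75 kHz to 13.35 kHz range quoted above. The function names and the example levels are hypothetical.

```python
import math

SQUARE_HZ = 3150   # square wave fundamental
SINE_HZ = 15000    # probe sine wave

def dim_product_frequencies(orders=range(1, 10)):
    """Difference frequencies between the sine and multiples of the square wave."""
    return sorted(abs(SINE_HZ - n * SQUARE_HZ) for n in orders)

def dim_ratio(product_levels_v, sine_level_v):
    """RMS sum of the product levels divided by the level of the 15 kHz sine."""
    rms_sum = math.sqrt(sum(v * v for v in product_levels_v))
    return rms_sum / sine_level_v

print(dim_product_frequencies())
# Hypothetical measured product levels (volts) against a 1 V sine.
ratio = dim_ratio([3e-6, 4e-6], 1.0)
print(ratio * 100, "%", 20 * math.log10(ratio), "dB")
```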

DIM is usually tested at various output levels, and plotted as a ‘DIM vs output level’ graph.

Additional Reading:

Audio Precision - DIM 30 and DIM 100 Measurements per IEC 60268-3

Dynamic Range

Dynamic range is a ratio (in dB) of the largest signal a device can produce (without excessive distortion), to the noise floor of a device.

It is practically the same as SNR, but with the key difference that whereas SNR first tests a high level output, then turns off the signal and measures noise, dynamic range first tests a high level signal, then puts a second signal 60dB lower through the device and measures the noise floor whilst that signal is playing.

A major reason for using dynamic range (DNR) instead of SNR is that some devices have a ‘squelch’ or mute circuit that kicks in when the device is idle, resulting in the noise floor when idle being lower than the noise floor when the device is active, and therefore they ‘cheat’ the SNR test. Some DAC designs also inherently have more noise when active. The AES17 dynamic range test ensures that the device IS active when testing noise level. Additionally, SNR measures an arbitrary signal level vs noise level, whereas DNR measures the maximum device output vs noise floor.

It can be worth checking what full scale signal level was used for the test, as results are not always apples to apples comparable.

For amplifiers, dynamic range itself is not a particularly useful measurement, as an amp can get a better result simply by having the same or higher noise level as another device, but having a maximum signal output level far higher than you would ever need.

Therefore for amplifiers, RMS noise level is a more useful measurement to compare between devices, but for DACs, dynamic range is useful as users rarely run DACs with any volume attenuation.
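The amplifier caveat is easy to see with numbers. The sketch below treats dynamic range simply as maximum output over RMS noise, and the two amplifiers are invented examples rather than real devices.

```python
import math

def dynamic_range_db(max_output_vrms: float, noise_vrms: float) -> float:
    """Ratio of maximum output level to RMS noise level, in dB."""
    return 20 * math.log10(max_output_vrms / noise_vrms)

# Hypothetical amp A: twice the noise of amp B, but a much higher maximum output.
amp_a = dynamic_range_db(max_output_vrms=10.0, noise_vrms=2e-6)
amp_b = dynamic_range_db(max_output_vrms=4.0, noise_vrms=1e-6)

print("Amp A:", round(amp_a, 1), "dB")
print("Amp B:", round(amp_b, 1), "dB")
```

Amp A comes out ahead on dynamic range despite being the noisier of the two at any normal listening level, which is why RMS noise is the more useful comparison for amplifiers.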

Additional Reading:

Audio Science Review - Understanding Audio Dynamic Range / SNR

Frequency Response

Frequency response describes the level of different frequencies relative to each other. In headphones and speakers, this is the difference in loudness between different frequencies.

Most DACs and Amplifiers will have a ‘flat’ frequency response, where all frequencies up to at least 20khz are produced at equal level. Speakers and headphones, however, can vary quite significantly, causing them to have significant audible differences in presentation.

This is generally regarded as the most important measurement for headphones and speakers, and in many instances may be the only measurement provided.

The graph above is an example of a frequency response measurement, with frequency on the X axis and SPL on the Y axis, which shows us the relative loudness for any particular frequency. It also has a second dotted line which is a ‘target curve’, a theoretical ideal frequency response result, which helps us to see how the actual frequency response deviates from a neutral sounding target.

Frequency response can be tested in many different ways. Some of the most common are playing sine waves at a series of different frequencies and recording the level (stepped sine), playing a “sweep” which transitions from very low to very high frequencies and deriving the impulse response, and playing wide-band noise (which contains every frequency at a known level), but frequency response can also be measured by other means, including using music itself with transfer function measurements!

Frequency response measurements are often not provided for DACs/Amplifiers unless there is something unusual to show, but are available in our full reports. Because of impedance interactions, the “source” impedance (an amplifier for a headphone, a DAC for an amplifier, etc), and the “load” impedance (the headphone for the amp, the amplifier for the DAC) can change the frequency response of the device, so the source and load must be known for this to be a useful test.

Impulse Response

A Dirac delta function, otherwise known as a unit impulse, or just "impulse", is an instantaneous transition from zero to another value, then back to zero. The impulse response of a DAC or other digital device is tested by playing a signal called an ‘impulse’. This is complete digital silence, followed by a single sample at maximum value, followed by complete digital silence.

For an analog device, it is not possible to create a true perfect impulse as this would require infinite slew rate/bandwidth. However so long as the source providing the impulse has significantly higher bandwidth than the device we are looking to test, that is sufficient.

Mathematically, this impulse represents ALL frequencies at equal value, and therefore the output tells us a huge amount about the behavior of the device.

By using various transformation/analysis methods on the impulse response such as FFT, we can obtain frequency response as well as information about phase, cumulative spectral decay (CSD) and more. You can also get the impulse response of a device by performing an inverse Fourier transform on the frequency response (including its phase); mathematically they provide the same information and are interchangeable.

You may have come across impulse responses yourself if you have used tools such as room correction convolutions.

Another important aspect to note, is that with impulse responses provided by a DAC, you will often see talk about ‘ringing’ and ‘pre-ringing’, as shown below.

These are often described as bad/undesired effects, however this is not actually the case.

Firstly, ‘pre-ringing’ occurs with linear-phase filters, which means that there is no phase alteration caused by the filter, whereas a minimum-phase filter without pre-ringing WILL cause phase shift. Strictly speaking, linear phase is more accurate or ‘transparent’.

Secondly, ringing itself occurs ONLY in the presence of an ‘illegal’ signal, containing frequency content above half the sample rate. The reason the impulse response test itself shows so much ringing is because it represents ALL frequencies at equal level, so there is an enormous amount of illegal content to filter out.

For any normal music, ringing never actually occurs unless there is a synthetic component with content above the Nyquist frequency (22.05khz for standard audio) or the digital audio is clipping.

You may see DACs with impulse responses which advertise ‘low ringing’ as a feature, however this is actually simply because their digital filter is not filtering out much at all.

The amount of ringing is directly correlated to how effective the filter is, and counter-intuitively, more is actually usually better. Ringing does not occur with properly recorded music, but the trade-offs you have to make to reduce ringing such as early treble rolloff will affect ALL content.

Impulse response is often derived as part of a ‘Transfer Function’ measurement rather than measuring directly. Transfer function explains exactly how the signal is altered when being passed through a device under test, and any input signal can be used.

Additional Reading:

Audio Precision (Video) - Transfer Function Measurements

Audio Precision - Transfer Function Measurements

Interchannel Phase

Phase describes how closely in time two signals are aligned and is described using degrees.

0 deg or 360 deg means two signals are perfectly in phase/aligned, such as the two 1khz sines below.

180 deg means they are exactly half a cycle out of phase, such as the two 1khz sines below:

Sometimes, DACs, particularly those using two DAC chips instead of one, can exhibit an issue where one is converting slightly earlier than the other, resulting in the left and right channels being slightly out of phase. To test this, we simply play a swept-sine through a DAC and record the difference in phase between the two channels at various frequencies. Ideally this should be zero.
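A fixed time offset between channels shows up as a phase difference that grows with frequency. The sketch below assumes a pure, constant delay between the two channels, and uses one sample period at 44.1khz as a hypothetical example offset.

```python
def phase_offset_deg(delay_s: float, freq_hz: float) -> float:
    """Phase difference in degrees caused by a fixed time delay at one frequency."""
    return 360.0 * freq_hz * delay_s

ONE_SAMPLE_AT_44K1 = 1 / 44100  # hypothetical offset: one sample period

for freq in (100, 1000, 10000, 20000):
    print(freq, "Hz:", round(phase_offset_deg(ONE_SAMPLE_AT_44K1, freq), 2), "deg")
```

This is why the test uses a sweep rather than a single tone: an offset that is only a few degrees at 1khz can be a large fraction of a cycle at 20khz.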

Interchannel phase vs frequency for a DAC with incorrect digital processing

Intermodulation Distortion (IMD)

Intermodulation distortion is distortion resulting from two or more signals mixed together that is not a harmonic/multiple of either frequency.

It has been shown to be significantly more audible than harmonic distortion.

The most common methods for testing this are firstly SMPTE, using a strong, low-frequency interfering signal (60hz), and a weaker high frequency signal (7khz).

Secondly, CCIF/ITU-R IMD testing uses two high-frequency tones of equal level, and allows testing of high-frequency distortion with limited measurement bandwidth - for example, with 19 and 20khz test frequencies, low order distortion can be seen with only 21khz bandwidth.
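For the two-tone case, the low order product frequencies are simple sums and differences of the test tones. The sketch below lists only the second order difference product and the two third order products closest to the tones; higher orders exist but are left out, and the function name is made up.

```python
def low_order_imd_products(f1_hz: int, f2_hz: int) -> dict:
    """Second and third order difference products of two tones, f1 < f2."""
    return {
        "f2 - f1": f2_hz - f1_hz,        # second order
        "2*f1 - f2": 2 * f1_hz - f2_hz,  # third order, just below the tones
        "2*f2 - f1": 2 * f2_hz - f1_hz,  # third order, just above the tones
    }

for name, freq in low_order_imd_products(19000, 20000).items():
    print(name, "=", freq, "Hz")
```

With 19 and 20khz tones, these products land at 1khz, 18khz and 21khz, so they all fit within a 21khz measurement bandwidth, whereas the harmonics of either tone would start at 38khz.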

As a worked example of SMPTE testing: The 60hz+7khz signal is played through a device, and then the 60hz signal is filtered out from the measured result.

Intermodulation distortion products at 60hz intervals are created around the 7khz signal, and the IMD value is given as the ratio of the RMS level of these distortion components to the level of the 7khz signal. The image below shows a scope view and FFT of an SMPTE signal

The image below shows a zoomed in FFT showing the IMD components around the 7khz tone.

Additional Reading:

Sound-AU - Intermodulation Distortion

Audio Precision - More about IMD

Intersample Overs

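In digital audio, there is a maximum possible value. For 16 bit, there are 16 binary digits per sample, and the largest number they can represent is where all 16 bits are ‘1’, so 1111111111111111. Converting from binary to a regular numerical value, this equals 65535, meaning for 16 bit audio there are 65536 different possible values a sample can have (because we also include 0). (In practice, 16 bit audio samples are stored as signed values from -32768 to +32767, but the number of possible values is the same.)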

With a normal digital audio file, you cannot go above this value, it’s not possible. This maximum value is called ‘0dB full-scale’ or ‘0dBfs’ and no samples can be stored that are higher than this value.

However, when performing DSP that alters the level of samples, what happens when you try to increase them too much?

Typically, the result is that the audio will ‘clip’, where you get a flat line wherever samples have reached the maximum. If we take a full-scale 1khz 0dBfs sine for example, and apply 12dB of gain, this is what happens:

A similar situation occurs within DACs themselves when they oversample, because in some situations the digital samples might all be below 0dBfs while the actual waveform itself goes above it. See the image below of an 11.025khz sine; all the digital samples (squares) are below 0dBfs, shown as a red line, but the actual waveform goes above 0dBfs.
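The same situation can be reproduced numerically. This sketch assumes a 44.1khz sample rate and a sine at exactly one quarter of it with a 45 degree phase offset, so that every sample lands part-way up the wave rather than on its peaks. The phase offset and the variable names are choices made for the example.

```python
import math

SAMPLE_RATE = 44100
TONE_HZ = SAMPLE_RATE / 4   # 11.025 kHz
PHASE = math.pi / 4         # 45 degree offset, assumed for this example

def sample(n: int, peak: float) -> float:
    """Value of sample n of the sine for a given waveform peak."""
    return peak * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE + PHASE)

# Find the largest sample of a unit-peak sine, then scale the waveform up
# so that the largest stored sample sits exactly at full scale (1.0).
largest_sample = max(abs(sample(n, 1.0)) for n in range(8))
waveform_peak = 1.0 / largest_sample
stored = [sample(n, waveform_peak) for n in range(8)]

print("largest stored sample:", round(max(abs(s) for s in stored), 6))
print("true waveform peak:", round(waveform_peak, 4))
print("intersample over:", round(20 * math.log10(waveform_peak), 2), "dB")
```

No stored sample exceeds full scale, yet the waveform those samples describe peaks about 3dB higher, and a DAC has to reproduce that peak when it reconstructs the signal.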

Therefore, when a DAC oversamples, adding the extra samples to reconstruct the waveform, it may clip, and so we get an output from the DAC that looks like this:

The solution is to reduce the digital volume of the audio by a few dB BEFORE oversampling, and many DAC manufacturers such as RME, Chord, and Benchmark do indeed do this. This means that when we play the same signal as before, they can properly reconstruct and output it with no issues:

Unfortunately, many DACs do not do this, and as a result will clip in the presence of any intersample overs. Clipping is a particularly audible type of distortion and therefore something that we check all DACs for. We check with both +3dB intersample overs and +1dB intersample overs, as some DACs have a little bit of extra headroom, but not enough to account for all scenarios where intersample overs can occur.

Some DACs also have a behavior where in the presence of intersample overs they will not clip, but instead the value of the sample gets ‘wrapped around’ to positive if it was negative before, or negative if it was positive before.

This is a particularly worrying situation as it creates an enormous transient that could potentially damage speakers/headphones.

Additional Reading:

Benchmark - Intersample Overs in CD Recordings

Mnaganov - DAC Clipping on Intersample Peaks

Jitter

Jitter is a measure of time-domain accuracy which affects digital devices.

Because digital audio is stored and converted as discrete samples, a device must not only convert the value of each sample accurately, it must also convert it at precisely the right time. If the correct information is converted at incorrect times, this leads to distortion.

On the left is an example of a sine wave with no jitter. On the right is an example of the same data, but converted with extremely high jitter. All data/samples are identical, but the timing error distorts the analog output.

Jitter is tested and displayed differently depending on what is being measured.

DACs:

For DACs, a ‘J-Test’ is used, which is an undithered sine wave at exactly one-quarter of the sample rate frequency (11.025khz for 44.1khz, 12khz for 48khz), and in addition, the least-significant-bit, which is the lowest possible value that can be altered in the digital audio, is toggled with a frequency of usually 250hz.

This signal in essence causes the DAC to behave as a 2048x clock divider, and any time domain inaccuracies caused by jitter will show up as distortion components on the FFT as shown below.

Because this signal is above 10khz, all harmonics would be above 20khz, so harmonic distortion is not shown. Additionally because it is a single tone, intermodulation distortion is not a factor. The distortion components shown with this test are a direct result of modulation from clock variation (jitter).

NOTE: Do not directly compare the level of distortion components on a J-Test to the level of other frequency domain distortions. This is effectively a clever workaround to show time domain issues on a frequency domain plot. Some further maths is required to convert the results to time domain values and show the spectrum. Thresholds of audibility are not the same as other distortion types.
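As one example of the kind of conversion involved, the sketch below turns the level of a single pair of jitter sidebands into a peak timing error. It assumes the jitter is a single sinusoidal component that is small enough for the narrowband phase modulation approximation to hold, in which case each sideband sits at pi x tone frequency x peak jitter relative to the tone. The function name and the -120dB example figure are hypothetical, and real jitter spectra contain many components that need to be combined.

```python
import math

def sideband_to_peak_jitter_s(sideband_db_rel_tone: float, tone_hz: float) -> float:
    """Peak sinusoidal jitter (seconds) implied by one sideband pair.

    Small-angle approximation: sideband/tone amplitude = pi * tone_hz * jitter_peak.
    """
    sideband_ratio = 10 ** (sideband_db_rel_tone / 20)
    return sideband_ratio / (math.pi * tone_hz)

# Hypothetical reading: sidebands 120 dB below a 12 kHz J-Test tone.
jitter = sideband_to_peak_jitter_s(-120.0, 12000)
print(round(jitter * 1e12, 1), "ps peak")
```

Because the result scales with the tone frequency, the same timing error produces sidebands at different levels for different test tones, which is one reason these figures cannot be compared directly with other distortion measurements.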

I2S:

For I2S digital sources, jitter is measured by connecting the I2S output to a purpose-built, extremely low noise 1024x or 2048x clock divider.

The resulting 11.025khz or 12khz square wave is fed to the analog input of the analyzer, and inspected in the same way as performing a J-Test on a DAC.

AES/SPDIF:

AES and SPDIF jitter can be measured directly using the APx555 analyzer’s advanced digital I/O module. A jitter spectrum up to 100khz is provided alongside RMS and peak values. Very low frequency jitter is filtered out, because we don’t really care about whether a DAC is operating 0.001% too fast or slow, as that isn’t audible. What we care about is the short term variation in timing accuracy.

There is a LOT that can affect jitter, from the clock source itself, to signal-correlated jitter on AES/SPDIF, to the quality of the cable (yes, really).

Additional Reading:

Julian Dunn - Jitter Theory

Audio Precision - Measuring Jitter with a J-Test

Linearity

Linearity is the behavior of a device in which the output signal strength varies in direct proportion to the input signal strength. This applies to both digital and analog devices, though with different factors.

Any type of distortion or other signal content that is not intended/not part of the original signal will contribute to nonlinearity.

To test linearity on a DAC, a 1khz sine is played at 0dBfs, and the output value is recorded. The digital signal is then stepped down by a small value, say 1dB, and the difference between the actual output value and expected output value is recorded. If the signal was stepped down by 1dB, the output should reduce by exactly 1dB. This process is then repeated down to a very low level, often -120dB, and the results are plotted as shown below.

In this instance, the device is linear down to about XXXdB where it begins to deviate from the expected value. Its output is too high in level, mostly because there is some noise contributing to the output.

To test linearity of an amplifier, the same process is done, but instead of a digital signal, an analog signal is input to the device.

Linearity can be tested with or without a bandpass filter. Testing without a bandpass filter means that noise is factored into the RMS level of the output signal, so at lower levels the results will typically show a slow rise as noise becomes a dominant part of the signal. The higher the noise floor of the device, the higher the signal level at which this rise begins.
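
To illustrate why the unfiltered result rises at low levels, here is a minimal sketch that assumes the noise is uncorrelated with the signal, so that the two add as powers. The -120dBfs noise floor is a hypothetical value chosen only for the example.

```python
import math


def linearity_error_db(signal_dbfs, noise_dbfs):
    """Difference between the measured RMS level (signal plus noise)
    and the expected signal level, in dB. Assumes uncorrelated noise."""
    signal_power = 10 ** (signal_dbfs / 10)
    noise_power = 10 ** (noise_dbfs / 10)
    measured_db = 10 * math.log10(signal_power + noise_power)
    return measured_db - signal_dbfs


# Hypothetical device with a -120dBfs noise floor.
for level in (0, -60, -100, -110, -120):
    print(level, round(linearity_error_db(level, noise_dbfs=-120), 3))
```

The error is negligible at high levels and reaches about 3dB once the signal is at the same level as the noise, which is the 'slow rise' seen on unfiltered linearity plots.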

Testing with a bandpass filter means that only the level of the fundamental signal itself is inspected and noise is mostly ignored. This isn’t usually necessary for amplifiers, as noise is usually the main issue there and amplifiers rarely show deterministic behaviour at low levels. DACs, however, depending on topology, can exhibit unusual and sometimes signal-correlated behaviour at low levels, and preventing noise from obscuring the result can help to examine this behaviour better.

Additional Reading:

Texas Instruments - DAC Essentials: Static specifications & linearity

Low Level Signal Output

This is a test performed on DACs, whereby a -90.31dBfs sine is output and the scope view of the result is shown.

-90.31dBfs is chosen because this leaves only 3 possible digital values at 16 bit.
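
The figure can be checked with a short sketch. It assumes a 16 bit full scale value of 32767, a 1khz tone at a 44.1khz sample rate, and plain rounding with no dither; the tone frequency and sample rate are arbitrary choices for the example.

```python
import math

FULL_SCALE_16BIT = 32767  # largest positive value of a signed 16 bit sample

# Level of a sine whose peak is exactly 1 LSB, relative to full scale.
level_dbfs = 20 * math.log10(1 / FULL_SCALE_16BIT)

# Quantize one second of that sine (undithered) and collect the distinct values.
sample_rate = 44100
frequency = 1000
distinct_values = sorted(
    {round(math.sin(2 * math.pi * frequency * n / sample_rate))
     for n in range(sample_rate)}
)

print(round(level_dbfs, 2))
print(distinct_values)
```

The only sample values that appear are -1, 0 and +1, which is why the undithered 16 bit result looks like a three-level staircase on the scope.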

This test does not necessarily look for or evaluate anything specific, but can sometimes provide some insight into things such as whether a DAC is performing any sort of dithering/noise shaping, or whether there is any unusual periodic behavior that may not be apparent on an FFT.

We perform this test four times, using a 16 bit undithered sine, 16 bit dithered sine, 24 bit undithered sine, and 24 bit dithered sine.

By default, our measurement posts show the results with the analyzer set to a 96khz bandwidth to capture any high frequency issues and reduce ringing shown on the result in some instances, however results with the analyzer set to a 20khz bandwidth are available in the full reports if you’d like them.

Multitone / 32 Tone Test

This is a test where a signal composed of 32 individual sine waves is played through a device, using the APx555 as the analog source if testing an amplifier, or a 192khz sample rate if testing a DAC. The measured result is then shown on an FFT.

This measurement primarily addresses the concern held by some that single-sine testing may not adequately show distortion that could appear in complex music. In most cases it does not show anything that would not be shown through THD vs frequency testing or typical two tone IMD testing, however in some instances it may reveal unusual or unexpected behavior which would need to be investigated further.

Additional Reading:

Audio Precision - Using Multitones in Audio Testing

Noise (20hz-20khz, 20hz-96khz and 20hz-1Mhz)

All analog devices will have some level of noise; it is not possible to have none.

This test simply records the RMS value of the output of a device when not playing anything. The test is repeated three times, once with a bandwidth of 20hz-20khz, once with a bandwidth of 20hz-96khz and once with a bandwidth of 20hz-1Mhz. Lower RMS values are better as it means there is less noise.

An FFT is also provided to show if there is anything other than random noise, such as 50hz mains leakage or high frequency switching noise, though FFTs should never be used to compare the noise floors themselves as the apparent noise floor on the FFT can change depending on the number of points in the FFT analysis and number of averages. Use the FFTs ONLY to inspect non-random content, and use RMS values to actually compare noise levels. (See ‘FFT Gain’ for more info)

Nyquist Reconstruction Filter / Oversampling

Digital audio is based on the principle of Nyquist Reconstruction, which states that we can perfectly reconstruct an original analog signal from sampled data, up to a maximum frequency of half the sampling rate.

However, we can only do this IF we perfectly band-limit, i.e. filter out all content above half the sampling frequency.

DACs therefore use an oversampling or ‘reconstruction’ filter to interpolate between the samples in the 44.1khz data, adding extra samples to connect the dots in a way that filters out any content above the Nyquist frequency. Think of this as playing ‘connect the dots’ to recreate the smooth waveform from the samples.

The problem is that perfect band-limiting with an instant and infinite cutoff at 22.05khz would require infinite computing power, which is not possible, so we are limited in our ability to band-limit effectively and our filters will usually have a slower, gradual rolloff.

We can test the effectiveness of a DAC’s reconstruction filter by playing 44.1khz white noise, which contains all frequencies at equal levels, and inspecting how the DAC filters out unwanted content.

The image below is an example of a ‘good’ reconstruction filter. It keeps content below 20khz without attenuation, but attenuates content above this, and importantly, has attenuated extremely far by the time the Nyquist frequency of 22.05khz is reached.

The image below is an example of a ‘bad’ or ‘slow’ reconstruction filter, as it does not correctly filter out content above the Nyquist frequency, and also attenuates some of the content within the audible band (below 20khz).

The image below is an example of a nearly perfect reconstruction filter. It keeps all content all the way up to as close to the Nyquist frequency as possible, meaning as much of the original signal as possible is preserved, and then attenuates extremely steeply. This takes a fair amount of computing power, and so is usually only seen as a result of dedicated high performance reconstruction tools such as Signalyst HQPlayer or the Chord Hugo M-Scaler.

Reconstruction filters can also be ‘minimum phase’ or ‘linear phase’.

Minimum phase filters are simpler and can have less latency, which is important for production scenarios, but they will introduce some phase shift, meaning that different frequencies pass through the filter with different delays.

Linear phase filters, which can be implemented in the digital domain, delay all frequencies by the same amount and so do not impart any relative phase shift.

We can test for phase linearity most easily by performing an impulse response test. If there is no pre-ringing, and only post-ringing, the filter is minimum phase. If there are equal amounts of pre and post-ringing, the filter is linear phase. Some filters may be described as ‘intermediate-phase’.

Additional Reading:

TheProAudioFiles - Oversampling

Chord Electronics - Tapping into better audio (Note: This one is from a manufacturer, so keep in mind that there may be some bias at play here in regards to descriptions of audible effects, but there is some useful and interesting information about the maths and theory behind reconstruction)

Output Impedance (Damping Factor)

Output impedance is a measure of how much a source will drop in voltage when the load draws current. A source which had no change in output level depending on current draw would be said to have a 0Ω output impedance. In most situations, a lower output impedance is desirable.

‘Damping factor’ is a related term based on output impedance, though usually given as a ratio instead of an Ohm rating. To find the damping factor, you would divide the load impedance by the output impedance of the source. So for an 8Ω speaker on an amplifier with a 0.1Ω output impedance the damping factor would be 80.

Output impedance is tested by first playing a signal through a device, and measuring the output level when there is no load (the 200kΩ input impedance of the analyzer is also considered ‘unloaded’ because it is vastly higher than the output impedance). The same signal is then played, but with a load connected. Output impedance is then calculated using the following formula:

Output Impedance = Rload x ((Vunloaded / Vloaded) - 1)

Output impedance can also vary depending on frequency, and can be tested by performing the same test but with a sine sweep instead of a single frequency.

Usually, lower output impedance is desirable, however there are two main situations in which this is not the case.

Firstly, in some situations, subjectively people can find that amplifiers with higher output impedances sound better with certain dynamic driver headphones, as the combination of high output impedance, and the varying impedance vs frequency of the headphones, alters the frequency response of the headphones themselves, typically increasing volume in the bass region. Some amplifiers even have an option to deliberately increase output impedance for this reason.

Secondly, many DACs and preamplifiers will have an output impedance of around 50-100Ω for the purposes of short-protection. If, for example, a 4V signal was accidentally shorted to ground with an output impedance of 100Ω, this would only result in 0.04A (0.16W) of current draw, whereas with a 0.1Ω output impedance it would be 40A (160W) and risk damaging the device. A higher output impedance is therefore an effective protection measure, and does not usually matter because most amplifiers have an input impedance of tens of thousands of Ohms, meaning that even with a 100Ω output impedance the ratio is likely to be in the realm of 500:1. Some amplifiers have unusually low input impedance though, and some DACs have unusually high output impedance, so do check your device specs.
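
The numbers in this example can be reproduced with Ohm's law. The sketch below also estimates how much level is lost when a source drives an amplifier input; the 50kΩ input impedance is a hypothetical value chosen to match the 500:1 ratio mentioned above.

```python
import math


def short_circuit(v_out, output_impedance_ohms):
    """Current (A) and dissipated power (W) if the output is shorted to ground."""
    current = v_out / output_impedance_ohms
    power = v_out * current
    return current, power


def loading_loss_db(output_impedance_ohms, input_impedance_ohms):
    """Level change caused by the voltage divider between source and input."""
    ratio = input_impedance_ohms / (input_impedance_ohms + output_impedance_ohms)
    return 20 * math.log10(ratio)


print(short_circuit(4, 100))
print(short_circuit(4, 0.1))
print(round(loading_loss_db(100, 50_000), 4))
```

With these assumed values the level lost to the 100Ω output impedance is under 0.02dB, which is why it rarely matters in practice.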

Additional Reading:

Benchmark - Audio Myth “Damping factor isn’t much of a factor”

Power

Power is equal to voltage multiplied by current. It is a measure of the rate at which work is done, and some speakers and headphones can take quite a lot of power to get to higher volumes.

The amount of power an amplifier can deliver will vary depending on the impedance of the load, and this specification is most commonly tested by increasing the output of an amplifier until it reaches 1% THD. This is considered the maximum usable power an amplifier can deliver.

However, the maximum power alone does not give a particularly good indication of how well an amplifier may drive a difficult pair of headphones or speakers. It is simply an absolute maximum that can be output without massive levels of distortion, or possibly shutting off entirely to protect itself. It tells you nothing about how the amp will perform for power output levels under that maximum.

To get a better picture of how an amplifier will perform for difficult to drive headphones, it’s best to look at the THD vs level curve.

The graph below shows that the amplifier is capable of outputting a maximum of 4.5W into 32Ω before hitting 1% THD, or about 0.7W into 300Ω.

However, whilst it can provide a MAXIMUM of 4.5W, distortion begins to increase drastically down at around 0.2W, and by the time we are at 2W (less than half the maximum), distortion is already at a very high 0.05% (-66dB). So for a difficult to drive headphone that may only need 0.5-1W at most, this amplifier may not actually power them properly even though it can supply significantly more power than that.

If we plot the THD vs level of another amplifier, which can deliver only 3.5W at maximum for 32Ω, we can see that despite the lower maximum power, it is performing significantly better at higher output levels for the same load, and will perform much better for that headphone that needed 0.5-1W.

Whilst using Watts on the X axis allows us to quickly see what the maximum power is, this display can actually be a bit misleading, because levels are better compared on a logarithmic (decibel) scale. Twice the power is actually only +3dB!

Additionally, it makes seeing how an amplifier changes performance at the same output level for different loads quite difficult, as the traces shift left or right depending on both the actual amp output level, AND the impedance.

Instead, we can show it more clearly with a logarithmic voltage scale (dBV):

This much more clearly shows us how the two amplifiers perform at the same output level for the same load, AND how one amplifier can change in performance as the load gets more difficult.
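
To relate the two axis types, here is a minimal sketch converting a power figure into dBV, assuming a purely resistive load. It uses the 4.5W into 32Ω and 0.7W into 300Ω figures from the earlier example.

```python
import math


def power_to_dbv(power_watts, load_ohms):
    """RMS voltage needed to deliver the given power, expressed in dBV."""
    v_rms = math.sqrt(power_watts * load_ohms)
    return 20 * math.log10(v_rms)


print(round(power_to_dbv(4.5, 32), 2))   # 12Vrms
print(round(power_to_dbv(0.7, 300), 2))  # about 14.5Vrms
print(round(10 * math.log10(2), 2))      # doubling the power, in dB
```

Although 0.7W looks much smaller than 4.5W, it actually corresponds to a slightly higher output voltage, which is exactly the kind of detail that a Watts axis hides.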

We primarily show power using a dBV scale for these reasons, as well as it having the bonus that it makes finding how an amplifier may perform for YOUR headphones quite a bit easier. For example, if you have the HD600, it has a specified impedance of 300Ω and a sensitivity of 105dB(1Vrms).

1Vrms = 0dBV, so just look on the 300 Ohm line at 0dBV on the X-axis. That's the distortion your amp will have driving the headphones at 105dB SPL.

If you listen at 95dB SPL, just look at -10dBV instead. Easy! Though it is also worth considering that you should probably look at a level higher than what you think you listen at. Sensitivity of headphones is given at 1khz, and due to a combination of our hearing not being equally as sensitive at all frequencies, and the spectra of music not being flat (low frequency content is usually much higher in amplitude), it is not uncommon to see a ‘90dB’ listening level containing bass peaks/impulses at around 110-120dB. I would recommend that you look on the graph at where your headphones would reach 120dB, to ensure that you do not encounter any limits.
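
The lookup described above can be sketched as a small calculation. It uses the HD600 figures quoted in the text and assumes the sensitivity figure holds at all levels and that the load is a flat 300Ω, which is a simplification.

```python
def required_dbv(target_spl_db, sensitivity_db_at_1vrms):
    """Output level in dBV needed to reach the target SPL."""
    return target_spl_db - sensitivity_db_at_1vrms


def dbv_to_vrms(dbv):
    """Convert a level in dBV to volts RMS."""
    return 10 ** (dbv / 20)


SENSITIVITY = 105  # dB SPL at 1Vrms
IMPEDANCE = 300    # Ohms

for spl in (95, 105, 120):
    dbv = required_dbv(spl, SENSITIVITY)
    volts = dbv_to_vrms(dbv)
    watts = volts ** 2 / IMPEDANCE
    print(spl, dbv, round(volts, 3), round(watts, 4))
```

For the 120dB peak case this works out to +15dBV, so that is the point on the 300Ω trace worth checking for this headphone.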

From Benchmark Audio: “If our goal is to build transparent audio systems, there should be no audible distortion. If we keep the total distortion at or below the threshold of hearing (0 dB SPL), we can guarantee that it will be inaudible. When this level of performance is achieved, we are not relying on masking to hide the distortion. Instead, the distortion is inaudible because it is reproduced at levels that are below the threshold of hearing.

If we want to listen at 80 dB, peaks will reach about 100 dB. At this playback level, the distortion must be lower than -100 dB (0.001%) to absolutely guarantee that it is inaudible.

If we want to listen at 90 dB, peaks will reach about 110 dB. At this playback level, the distortion must be lower than -110 dB (0.0003%) to absolutely guarantee that it is inaudible.”

Additional Reading:

GeoffTheGreyGeek - Understanding Amplifier Power

Benchmark - Interpreting THD Measurements, think dB not percent!

Power On/Off Behaviour

When an amplifier is turned on or off, it can sometimes output a signal that could be dangerous to the connected headphones or speakers.

This is usually either a high level transient, which you might hear as a loud click/pop, or potentially a high DC offset.

Some amplifiers have been shown to output high levels of DC until they warm up.

To conduct this test, we connect the amplifier to a load (usually 300Ω for headphone amplifiers), and turn it on from cold, recording the output. An example result is shown below:

In this instance, there is only a very small signal (no more than 3mV), and the DC offset is very low, so it is likely that connected headphones will be completely safe when turning this device on.

The test is then repeated, but when turning the amplifier off from warm.

Regardless of what this test may show, we would strongly recommend that you ALWAYS disconnect headphones from your amplifier before turning it on or off unless the manufacturer explicitly recommends you do otherwise.

SNR (Signal to Noise Ratio)

SNR is the ratio of a signal to the noise floor.

To test this, a signal is played through the device and the RMS level is recorded. Then, the signal is stopped, and the RMS level of the output of the device is recorded again to capture the noise level. SNR may also be measured with an FFT by simply excluding the signal and distortion bins (e.g. the bins centered on 1khz and its harmonics for a 1khz stimulus).

SNR is given as the ratio between these two values expressed in decibels.
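
As a minimal sketch of that calculation, with hypothetical readings of 4V for the signal and 2µV for the idle noise:

```python
import math


def snr_db(signal_rms_volts, noise_rms_volts):
    """Signal to noise ratio in dB from two RMS voltage readings."""
    return 20 * math.log10(signal_rms_volts / noise_rms_volts)


# Hypothetical readings: 4V signal, 2 microvolts of noise with the signal stopped.
print(round(snr_db(4.0, 2e-6), 2))

# The same noise level against a 50mV signal gives a much lower figure.
print(round(snr_db(0.05, 2e-6), 2))
```

The second line shows why the starting level matters: the device is unchanged, but the reported SNR drops by roughly 38dB.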

SNR can be tested with a starting signal of any level. For DACs we usually use 0dBfs, and for amplifiers we use 4V (unity gain), 700mV (headphone level) and 50mV (IEM level). Make sure when comparing SNR results that the starting signal used was the same level.

SNR may give misleading results in some circumstances as some devices have a ‘squelch’ or mute system that kicks in when nothing is playing, thereby reducing the level of noise when idle to a level below what the noise floor would be when active.

To combat this, the AES17 dynamic range test is used (see AES17 dynamic range for more info).

Additional Reading:

Audio Precision (Video) - SNR Basics

Square Wave Output

Square wave measurements are among the oldest audio measurements, dating back to the analog oscilloscope era. While they don’t contain information that isn’t present in other measures, they are required for verifying some specifications (such as slew rate), and their extremely low crest factor makes them a uniquely challenging test of amplifiers.

Slew rate is defined in terms of voltage change per microsecond (V/µs), and may be measured by inputting a square wave (whose ideal slew rate is infinite) into a device and measuring the rate of voltage increase between the 10% and 90% points during the rise time of the square wave. A slew rate limited amplifier will distort fast, high-level transitions.

Overshoot on the square wave can imply that the circuit is close to being unstable, possibly indicating a risk should the device be exposed to an electrostatic discharge. (Other factors can cause overshoot or ringing, however, so this is not always the cause.)

Additional Reading:

ESP - Squarewave Testing

Audio Precision - Square Waves

AP - Measuring Slew Rate or Rise Time

Texas Instruments - Slew Rate Introduction (video)

EDN - Rise Time And Slew Rate - Not Quite the Same

THD+N (SINAD)

This is perhaps the most well-known and used measurement. THD+N stands for ‘Total harmonic distortion plus noise’, and SINAD stands for ‘SIgnal to Noise And Distortion’.

These are actually the same measurement - the relative level of distortion+noise to the desired signal - but THD+N is usually expressed as a percentage whereas SINAD is typically expressed as a ratio in decibels.

It is a measure of the RMS value of the signal, compared to the RMS value of all noise and harmonic distortion.

To test this, a test tone (could be any frequency but 1khz is most commonly used) is played through a device. The resulting output of the device is then fed to the analyzer. The RMS value is recorded, and then the 1khz tone is filtered out. The RMS value of the remaining signal, which now contains only the noise and distortion components, is measured, and THD+N/SINAD is given as the ratio between these two values.
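
A minimal sketch of the final step, showing how the same pair of RMS readings gives both a THD+N percentage and a SINAD figure in dB. The readings are hypothetical.

```python
import math


def thd_n(total_rms, residual_rms):
    """Return (THD+N in percent, SINAD in dB) from two RMS readings.

    total_rms is the full output signal, residual_rms is what remains
    after the test tone has been filtered out.
    """
    ratio = residual_rms / total_rms
    percent = ratio * 100
    sinad_db = -20 * math.log10(ratio)
    return percent, sinad_db


# Hypothetical readings: 4V total, 40 microvolts of noise and distortion.
percent, sinad = thd_n(4.0, 40e-6)
print(round(percent, 5), round(sinad, 2))
```

Each factor of ten in the percentage is 20dB, so 0.001% corresponds to 100dB SINAD and 0.0001% to 120dB.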

Whilst the 1khz tone can be filtered out digitally, in many high performance measurement systems including our APx555, the 1khz tone is filtered out in the analog domain BEFORE the analog-to-digital converter, because all real devices, including the ADC, have harmonic distortion. Therefore if the full signal was fed to the ADC, the ADC’s own distortion would contribute to the measurement. By filtering out the 1khz tone, the ADC will not produce any harmonics because there is no tone present, so we measure only the harmonics produced by the device under test, and additionally we can use a lower input sensitivity such as 310mV to more accurately measure the low-level noise, whereas the self-noise from the analyzer would be higher if we had to use a higher 5V input sensitivity to accommodate the full signal.

Example of the ‘residual’ of a 1khz THD+N test, which is all the content with the 1khz tone removed.

The result can be given simply as a THD+N percentage or SINAD dB value, however a view such as this one is often given:

This is part of the Audio Precision dashboard view, and shows the fundamental signal frequency, THD+N value, SINAD value, and the RMS value of the full signal.

Whilst THD+N/SINAD is a helpful number to quickly get a rough idea of the performance of a device, this single figure alone is not comprehensive, as it is only testing performance with a single tone at a single frequency, and also makes no distinction between noise, even order harmonics, odd order harmonics, high vs low order harmonics, mains leakage etc. Other forms of distortion that only show in situations where multiple input tones are present (such as Inter-Modulation Distortion) are not shown in SINAD figures.

Looking at the FFT produced during this test shows us a bit more information about what is contributing to the THD+N figure.

Harmonics at even multiples of the fundamental frequency are called ‘even order’ (for a 3khz signal these would be 6khz, 12khz, 18khz and so on), and harmonics at odd multiples of the fundamental frequency are called ‘odd order’ (for a 3khz signal these would be 9khz, 15khz, 21khz and so on).
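
The 3khz example can be generated with a few lines of code (the function name and the cut-off at the 7th harmonic are arbitrary choices for illustration):

```python
def harmonics(fundamental_hz, max_order=7):
    """Return (even order, odd order) harmonic frequencies up to max_order."""
    even = [fundamental_hz * n for n in range(2, max_order + 1) if n % 2 == 0]
    odd = [fundamental_hz * n for n in range(3, max_order + 1) if n % 2 == 1]
    return even, odd


even, odd = harmonics(3000)
print(even)
print(odd)
```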

The spikes at 50hz or 60hz and multiples of that value are called ‘mains leakage’, and could be due to either a ground loop or noise picked up by the signal path usually from the PSU transformer.

Even order harmonics are considered to be less audible and more pleasant than odd order harmonics.

Lower order harmonics are less audible as they are masked more by the fundamental frequency.

For the above reasons, don’t go by THD+N/SINAD alone. A device with 30dB poorer SINAD may actually be considerably better if the main components of that distortion are 2nd order and 3rd order distortion, compared to a device with 30dB better SINAD that is primarily composed of 5th/7th/9th order distortion for example.

Additional Reading:

AudioInterfacing - What is Total Harmonic Distortion Plus Noise

Audio Precision (Video) - THD+N Basics

Audio Precision - THD and THD+N, similar but not the same

THD+N vs Frequency

THD+N is most often tested at 1khz, but does not always remain constant vs frequency. For this reason, it is often critical to test THD+N at a range of frequencies and show the results as done in the graph below:

In this instance, THD+N at 20khz is around ten times higher than it is at 1khz.

THD+N testing at higher frequencies requires some compromise though. Usually for standard 1khz THD+N testing the analyzer bandwidth is set to 20khz to ensure that noise above 20khz does not contribute to the result. But a 20khz filter means that any harmonics above 20khz will also be filtered out. This results in a sudden dip on the graph usually at 6.67khz, where the 3rd harmonic is now above 20khz, and at 10khz, where ALL harmonics are now above 20khz:

For this reason, THD+N vs frequency is usually tested with a higher frequency cutoff filter instead of 20khz. This does mean that noise from 20khz-40khz is factored in though, so don’t use this measurement as an absolute value comparable to standalone 1khz THD+N tests. It should be used only to see if there is a profile of rising distortion into higher or lower frequencies.
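
The positions of the dips seen with a 20khz filter follow from simple division, sketched here. The function name is hypothetical.

```python
def highest_fundamental_hz(bandwidth_hz, harmonic_order):
    """Highest test frequency whose given harmonic still falls within
    the analyzer bandwidth."""
    return bandwidth_hz / harmonic_order


BANDWIDTH = 20_000  # analyzer bandwidth in Hz

for order in (2, 3, 5):
    print(order, round(highest_fundamental_hz(BANDWIDTH, order), 1))
```

Above 6.67khz the 3rd harmonic is lost, and above 10khz even the 2nd harmonic is outside the measurement bandwidth, so only noise remains in the result.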

Additional Reading:

Nihtila - THD+N vs Amplitude and Frequency

THD+N vs Output Level

(This mostly relates to DACs, for amplifiers, see ‘Power’)

THD+N for a DAC is most often tested at high levels, usually either 0dBfs output, or sometimes the level is adjusted to match 4V line level. But as well as THD+N varying according to frequency, it can also vary according to output level.

We test this by playing a 1khz tone through the device at a high level, and measuring THD+N. Then, the signal level is stepped down a small amount, and THD+N is measured again. This is repeated to a low level, usually -120dB, and produces a graph like this:

It is not uncommon for DACs to see a slight drop in THD+N towards the upper limit of their output. And certain DAC topologies will show quite large changes on the THD+N vs output level chart.

Additional Reading:

Nihtila - THD+N vs Amplitude and Frequency

Volume Matching vs Attenuation Level

Amplifiers typically use a potentiometer for volume control. This is the component attached to the volume knob that adjusts the signal level being fed to the amplifier itself.

Potentiometers are rarely perfectly matched, and therefore there will be a slight difference in the level of attenuation between the left and right channels.

Usually this difference is very small and not noticeable, but gets worse at very low levels, sometimes causing noticeable channel imbalance when listening very quietly, with IEMs for example.

This test measures the difference in gain between the left and right channels, at different levels of attenuation.

A typical 4V balanced or 2V unbalanced signal is fed to the amplifier, and then the amplifier is adjusted until one channel reaches gain values of +6dB (if possible), 0dB, -6dB, -12dB, -26dB, -32dB, and whatever value is required to achieve 50mV (IEM level) output. At each level, the difference in gain between the two channels is recorded, with lower values being better.

The volume is then reduced until the difference between the two channels reaches 1dB. The level of attenuation at which this happens is recorded. The same is then done for the point where the difference reaches 3dB.

The lower the result for each level, the more closely matched the two channels are. A set of example results is below:

+6dB (Max) = 0.03dB

0dB = 0.18dB

-6dB = 0.44dB

-12dB = 0.61dB

-26dB = 0.19dB

-32dB = 0.08dB

-38dB = 0.49dB (50mV IEM Level Output)

1dB channel difference reached at -43dB

3dB channel difference reached at -56dB
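
Each figure in a list like this is the level difference between the two channels, which can be sketched as below. The voltages are hypothetical and are constructed so that the right channel sits exactly 1dB below the left.

```python
import math


def channel_imbalance_db(v_left, v_right):
    """Absolute level difference between the two channels in dB."""
    return abs(20 * math.log10(v_left / v_right))


# Hypothetical RMS readings at one volume setting.
v_left = 0.050
v_right = v_left * 10 ** (-1 / 20)  # exactly 1dB lower than the left channel

print(round(channel_imbalance_db(v_left, v_right), 2))
```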

Additional Reading:

PracticalDevices - Channel Imbalance in your Volume Pot - Some Thoughts

Glossary of terms:

In this section we’ll provide definitions for various terms, acronyms and descriptions that you might see surrounding measurements. If ever you’ve seen an objective audio term and you weren’t sure what it meant, this section should provide an answer.

If there’s something not listed here let us know and we’ll add it!

Index

Bandwidth

CMRR (Common Mode Rejection Ratio)

Crosstalk

CSD (Cumulative Spectral Decay)

Delay/Latency

2nd Order / Second Order Harmonics:

A form of distortion. A tone not present in the original/input signal that exists on the output, at a 2x multiple of the fundamental frequency. For example, the 2nd order harmonic for a 3khz sine wave would be 6khz.

3rd Order / Third Order Harmonics:

A form of distortion. A tone not present in the original/input signal that exists on the output, at a 3x multiple of the fundamental frequency. For example, the 3rd order harmonic for a 3khz sine wave would be 9khz.

AC Coupled:

An AC coupled device only allows AC signals to pass through, but blocks DC.

This is done via the use of a high pass filter, often either a transformer or DC blocking capacitor. AC coupling should be done such that the high pass filter does not affect content close to or above 20hz, as this may then have audible effects.
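
For the capacitor case, the corner frequency of the resulting high pass filter can be estimated with the standard first-order RC formula. This is a simplified sketch; the 10µF capacitor and 10kΩ load are hypothetical values, and real circuits may behave differently.

```python
import math


def highpass_corner_hz(capacitance_farads, load_ohms):
    """-3dB frequency of a first-order RC high pass filter."""
    return 1 / (2 * math.pi * load_ohms * capacitance_farads)


# Hypothetical DC blocking capacitor of 10uF feeding a 10k Ohm input.
print(round(highpass_corner_hz(10e-6, 10_000), 2))

# The same capacitor feeding a 300 Ohm load moves the corner far higher.
print(round(highpass_corner_hz(10e-6, 300), 2))
```

With the 10kΩ load the corner sits well below 20hz, but with a 300Ω load it lands inside the audible band, which is why the capacitor value has to suit the load it is driving.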

AC coupling is used for a variety of reasons, most commonly protection in amplifiers as it prevents any DC offset on the input signal from being amplified, as well as preventing any loss in performance in ADCs that do not perform properly in the presence of DC offset.

ADC (Analog To Digital Converter):

An ADC is a device that records an incoming analog signal digitally, commonly found in microphone/recording interfaces and audio analyzers.

AES:

AES typically refers to the AES digital audio connection found on many digital audio devices. AES digital audio uses an XLR plug, but AES cables and analog XLR cables are not the same. An AES cable should have a characteristic impedance of 110Ω.

The digital information transmitted over AES is nearly identical to SPDIF, and an SPDIF source can be connected to an AES input via the use of a passive converter and vice versa. It is essentially a balanced version of SPDIF, however it also uses a much higher signal level.

AES can also refer to the Audio Engineering Society.

Amplifier:

An amplifier is a device intended to increase the amplitude of a signal, and to provide a source of current when driving a load such as headphones or speakers.

Amplifiers are not always used to amplify the incoming signal. In fact, in most headphone systems the output voltage of the DAC is higher than the output voltage driving the headphones, so the signal is attenuated, not amplified. However, the DAC outputs would not be able to supply enough current to effectively power the headphones, and so an amplifier is used.

Analyzer:

An audio analyzer is a device designed for the purpose of inspecting and analyzing the output of audio devices. It will include analog to digital converters, and will often include other components useful in various measurements such as a clean sine wave generator, notch filter, adjustable input impedance and various software features to facilitate taking complex measurements.

A standard ADC can be used for basic measurements instead of an analyzer if it is accurate enough, but an analyzer is often required for more advanced or accurate analysis. There are many different brands and models of analyzer available. We currently use an Audio Precision APx555B.

Analog:

A continuously variable physical property. In audio this refers to analog audio signals where voltage changes in amplitude continuously. It could be measured with infinite timing and amplitude precision with the right tools.

This is in contrast to ‘digital’, where information is stored as discrete, individual figures. Digital audio for example stores amplitude information in samples, typically 44100 times per second. Nothing exists between them, and these samples have a finite level of precision according to their bit-depth.

Attenuate / Attenuation:

To reduce the amplitude/level of a signal. This can be done digitally, usually called ‘DSP Volume’, or in the analog domain with devices such as potentiometers, stepped attenuators, or series resistors. The level of attenuation is typically described in decibels (dB).

Attenuation also describes the level of reduction of out of band signals by filters such as DAC and ADC reconstruction filters, or power filters/conditioners.

Balanced:

(Please read ‘single ended’ first)

A balanced audio configuration uses two signal connections. It requires that both of these connections have an identical impedance to ground. Balanced does NOT require that both the positive and negative connections carry an active signal. It could be that only the positive connection carries the signal, and the negative connection remains at 0V. So long as the impedance to ground on both is identical, the connection will reject external noise.

If the impedance to ground on both connections is not the same, noise rejection will not be fully effective. This can be tested and measured as ‘CMRR’. (See ‘CMRR’ in the Measurement Explanations section for more info)

If the negative connection carries the same signal as the positive but with inverse polarity, this is referred to as ‘differential’ operation. (See ‘differential’ for more info).

Bandpass filter:

A bandpass filter is a filter that allows only signals within a specified frequency range to pass through, filtering out signal content that is higher or lower than this range.

Bit Depth:

In digital audio, the signal is sampled at a rate of usually 44.1khz. Each of these samples is a number that describes the amplitude of the signal at that point in time. Each sample will have a ‘bit depth’, which is the number of bits stored in each sample.

16 bit audio has a maximum value (where all bits in a sample are ‘1’) of 65535, meaning that there are 65536 different possible values (as we also include 0), and this equates to 96dB of dynamic range. There is not enough precision to describe a change in amplitude smaller than 96dB below the maximum signal level.

Each bit added doubles the number of values a sample can have, adding 6dB of dynamic range. And so 24 bit audio has a maximum value of 16,777,215, providing a dynamic range of 144dB.
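
The relationship between bit depth, number of values and dynamic range can be sketched as follows. This is the simple ratio between the largest and smallest step, ignoring dither and noise shaping.

```python
import math


def bit_depth_stats(bits):
    """Return (maximum sample value, dynamic range in dB) for a bit depth."""
    levels = 2 ** bits
    dynamic_range_db = 20 * math.log10(levels)
    return levels - 1, dynamic_range_db


for bits in (16, 24):
    max_value, dr = bit_depth_stats(bits)
    print(bits, max_value, round(dr, 2))
```

Each extra bit adds about 6.02dB, so 16 bit works out to roughly 96.3dB and 24 bit to roughly 144.5dB.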

32 bit audio has a maximum value far higher, however no real device exceeds 24 bit resolution due to noise limitations, and therefore 32 bit audio is typically only useful when audio is being repeatedly modified and altered during production to preserve accuracy, and then the final result is exported as 16 bit or 24 bit.

Bit depth usually refers to the number of bits in each sample, but can also describe other things such as the level at which dithering is being applied.

BNC Connector:

The BNC connector is a single-ended, locking connector. It is commonly used for digital audio, in particular, carrying either SPDIF signals or signals from a dedicated external clock device. However it can in some instances be found in use for standard analog connections, such as on analyzers and oscilloscopes.

Buffer:

Digital information can either be continuously transmitted in a stream, or it can be captured and stored.

In some cases, digital information may need to be stored for a short duration before being re-transmitted to another device or component, and these short-term storages are called ‘buffers’.

Buffers can serve many purposes, including as a ‘digital reservoir’ to make sure that in the event a digital connection is interrupted, the device can continue to access and convert the received information for a little longer whilst the connection is restored or catches up. This is similar to how YouTube loads some of the video ahead of time rather than streaming each frame to you in real time, allowing playback to continue even if your connection is interrupted for a few seconds.

In audio, a buffer is typically used either when a device requires access to a larger number of digital audio samples rather than just the current one, such as for performing DSP, or to decouple the device from the clock signal provided by a source. Some DACs, rather than converting audio as and when it is received and instructed by the SPDIF source clock, will buffer the data and then convert it using their own internal clock.

For USB audio, all data is buffered, and the DAC’s internal clock is used.

Capacitive Coupling:

Capacitive coupling is the transfer of energy/signal either within an electrical circuit or between two separate electrical circuits induced by the electric field.

This means that in audio, signal can sometimes be transferred to another part of a circuit even when no direct connection exists via an electrical conductor. Most commonly this manifests as crosstalk between the left and right channels, particularly in devices where the two signal paths are in close proximity.

Channel:

An audio channel is a single audio signal path. In a headphone or speaker system, there are usually two channels, one for left, and one for right.

Some systems such as surround-sound setups may have more than two, commonly 5 or 7 audio channels.

Sometimes channel count may be shown with a decimal, such as ‘2.1’, ‘7.1’, ‘2.2’ etc. This is used when there is a subwoofer channel in use. The first number denotes the number of full range channels, and the number after the decimal denotes the number of subwoofer/low frequency only channels.

Systems with one channel are called ‘mono’, systems with two channels are called ‘stereo’, and systems with more are typically just referred to as ‘multi-channel’ or ‘surround-sound’. The number of subwoofers does not usually affect this naming scheme.

Characteristic Impedance:

Characteristic Impedance or surge impedance (not to be confused with standard or DC impedance) is a measure of what the input impedance of a cable would be if it were infinite in length.

It is used for RF and high frequency systems because a transmission line of finite length, when terminated at the receiving device with an impedance equal to the characteristic impedance of the cable, appears to the source like an infinitely long transmission line and produces no electrical reflections.

Characteristic impedance cannot be measured with a multimeter. An LCR meter must be used to measure the parallel capacitance of the cable, and then the series inductance.

Characteristic impedance is given using the formula:

Characteristic impedance = sqrt(inductance / capacitance)

A cable can have a characteristic impedance of for example 75Ω whilst having a DC resistance near 0Ω. Don’t be worried if the AES/SPDIF cable you bought measures 0Ω on a multimeter.
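As a rough illustration of the formula above, here is a minimal Python sketch. The function name and the per-metre inductance and capacitance readings are hypothetical, chosen only to resemble what a 75Ω coaxial cable might measure, and the formula assumes a lossless cable.

```python
import math

def characteristic_impedance(inductance_h: float, capacitance_f: float) -> float:
    """Characteristic impedance = sqrt(inductance / capacitance), lossless case."""
    return math.sqrt(inductance_h / capacitance_f)

# Hypothetical LCR meter readings for 1 m of coax (illustrative values only)
SERIES_INDUCTANCE_H = 0.377e-6    # 0.377 microhenries
PARALLEL_CAPACITANCE_F = 67e-12   # 67 picofarads

z0 = characteristic_impedance(SERIES_INDUCTANCE_H, PARALLEL_CAPACITANCE_F)
print(f"Characteristic impedance: {z0:.1f} ohms")
```

Both readings must be taken over the same length of cable, since it is only their ratio that matters.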

Cables with incorrect characteristic impedance used for clock signals and digital audio may produce electrical reflections and have an adverse effect on jitter.

Class A:

In a class A amplifier, a transistor conducts for a full 360 degrees of the output cycle and is biased to be ‘always on’. This has benefits including being able to use only one transistor if desired (whereas Class B/AB must use at least two), avoiding any crossover distortion caused by transistors switching on/off, and because the amplifying device is ‘always on’, there is no "turn on" time, no problems with charge storage, and generally better high-frequency performance and feedback loop stability (and usually fewer high-order harmonics).

However, because the device is always on even when not playing/amplifying anything, any energy not used to drive the output is converted to heat. This means Class A devices always pull the same power from the wall regardless of how much power they are outputting to a load, and will have higher power consumption and heat dissipation than other amplifier classes. Class A amplifier energy efficiency will almost never exceed 50% at absolute best, meaning a ‘6W’ per channel Class A amp (6W x 2, so 12W in total) would need to draw at the very least a constant 24W from the wall, and likely more.
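The wall power arithmetic above can be sketched in a few lines of Python. The function name is hypothetical, and the 50% default is the best-case efficiency assumed in the text, so real amplifiers would draw more than this figure.

```python
def minimum_wall_power_w(output_w_per_channel: float,
                         channels: int = 2,
                         efficiency: float = 0.5) -> float:
    """Lowest constant power draw for a class A amp at a given efficiency."""
    total_output_w = output_w_per_channel * channels
    return total_output_w / efficiency

# The '6W per channel' stereo example, at the best-case 50% efficiency
print(minimum_wall_power_w(6))                   # 24.0
# The same amp at a less optimistic (assumed) 25% efficiency
print(minimum_wall_power_w(6, efficiency=0.25))  # 48.0
```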

Class B:

In a class B amplifier, a transistor only conducts for 180 degrees of the output cycle. Because of this, at least two transistors must be used to be able to conduct the full output.

Class B amplifiers can be much more energy efficient than class A, with a maximum theoretical efficiency of 78.5% (with a full scale sine wave), but they have the drawback that there is inevitably some small level of mismatch in the crossover region where the signal moves from being conducted by one transistor to the other. This creates an effect known as ‘crossover distortion’.

This is an extremely audible type of distortion, and for this reason, class B amplifiers are almost never used in audio, with class AB being far more common.

Class AB:

In a class AB amplifier, a transistor conducts for somewhere between 180 degrees and 360 degrees of the output cycle.

Each transistor will conduct for the same 180 degrees as in a class B amplifier, but also a small amount above this, which helps to reduce crossover distortion, without the energy inefficiency of class A.

Class AB is more efficient than class A, but less efficient than class B. However in many cases it provides better linearity than class B.

The ‘class A bias’ describes what level of current the transistors are biased to operate within class A - that is, with no transistors shutting off. Whenever the device is providing less than this amount of current to the load, the amplifier will be operating as though it were class A, and any signal that draws current above this value behaves as class B.

Class D:

A class D amplifier uses some form of pulse-width modulation to drive the output. The transistors function as electronic switches instead of linear gain devices. They are either on or off, and the incoming signal is converted to a stream of pulses by pulse-width modulation or pulse-density modulation.

This stream of high amplitude pulses is then passed through a low-pass filter, which removes all high frequency components and leaves only the amplified signal.

Because of this binary (on/off) behaviour, class D amplifiers are sometimes referred to as ‘digital’ amplifiers (though the ‘D’ in ‘Class D’ does NOT actually stand for digital).

The primary benefit to class D amplifiers is power efficiency, achieving efficiency over 80% quite easily. Some can also accept a digital signal directly, which is then converted to a PDM or PWM output rather than using an analog input signal.

There are drawbacks to class D, including dynamic range being directly related to switching frequency, the necessary output lowpass filters causing phase shift at higher frequencies, and unlike linear gain Class A, B, and AB amplifiers, class D amplifiers are also susceptible to jitter in a similar fashion to DACs and ADCs.

Clipping:

Clipping describes when a signal is unable to go above a particular value, resulting in the peak of a waveform being ‘clipped off’.

This can occur in the digital domain when audio is amplified or processed, but samples are unable to be increased to a value above 0dBfs. Or in the analog domain when a limitation of either output voltage or current prevents a device from being able to output above a particular signal level.
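Digital clipping can be shown with a short Python sketch. Here full scale is assumed to be ±1.0, and a sine wave is boosted by 2x (about 6dB) so that its peaks would need to reach ±2.0. The function name `hard_clip` is illustrative only.

```python
import math

def hard_clip(samples, limit=1.0):
    """Force every sample to stay within -limit..+limit."""
    return [max(-limit, min(limit, s)) for s in samples]

# One cycle of a sine wave whose peaks (2.0) exceed full scale (1.0)
boosted = [2.0 * math.sin(2 * math.pi * n / 16) for n in range(16)]
clipped = hard_clip(boosted)

for before, after in zip(boosted, clipped):
    print(f"{before:+.3f} -> {after:+.3f}")
```

Samples that were already within range pass through unchanged, while everything above full scale is flattened to the limit, which is what produces the ‘clipped off’ peaks.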

CMRR:

Common Mode Rejection Ratio. A measure of how well a balanced or differential amplifier (or cable) rejects common mode noise by nature of closely matched differential signal paths. Not to be confused with shielding effectiveness.

(See ‘CMRR’ in the Measurement Explanations section for further info)

Corner Frequency:

The frequency at which a filter or device has reached -3dB attenuation, used for describing the bandwidth of analog devices and filter responses.

(See ‘Bandwidth’ in the Measurement Explanations section for further info)

Common Mode:

Common mode describes a signal that is applied identically to two inputs of an electrical device. When referenced to ground, a common-mode signal appears on both lines of a two-wire cable, in phase and with equal amplitudes. Most commonly this describes common mode noise in a balanced audio system. (See ‘CMRR’ for more info).

This is in contrast to ‘differential mode’ where a signal is applied with opposite or inverted polarity to two different inputs of a device.

Crossfeed:

Crossfeed is a feature of some audio devices which feeds some of the signal from the left channel into the right channel and vice versa. The signal is crossfed with a slight delay, and usually a low-pass filter.

This aims to emulate the effect of listening to a pair of speakers, where each ear can hear both channels, but at slightly different amplitudes and times, as opposed to headphones where each ear hears only one channel.

Crossover Distortion:

Crossover distortion is a type of distortion that occurs specifically when one of the signal transistors turns off during the waveform. It is most often a result of class B amplifier design, where the signal switches from being amplified or provided by one transistor to another.

Current:

Current is the rate of flow of electric charge.

Current Source Amplifier (Current Mode / Current Drive):

A current source amplifier is an amplifier that regulates the level of current supplied to a load, rather than regulating voltage.

Current source amplifiers have a damping factor of 0, and will cause many speakers and headphones to sound different, as the impedance curve of a headphone/speaker no longer affects the resulting SPL. This typically results in an increase in volume somewhere in the midbass region.

Current source amplifiers are quite rare due to both the fact that they alter the frequency response of many devices, and some challenges involved with designing them, particularly involving having them remain stable whilst no load is connected.

Current source amplifiers can be easily identified by an extremely high output impedance. The output impedance of an Enleum HPA-21 for example is 2.5 million Ohms.

If a device lists terms such as ‘current feedback’ or ‘current gain’, that does not mean it is a current source amplifier and these terms refer to other aspects of amplifier design. To regulate voltage an amplifier must have an output impedance LOWER than the load. To regulate current an amplifier must have an output impedance HIGHER than the load.

A device that has an output impedance equal to the load impedance is sometimes referred to as a ‘power mode amplifier’ by some community members, as the amplifier is regulating voltage and current equally, so it could be described as regulating power.

DAC (Digital to Analog Converter):

A digital to analog converter is a device that converts digital audio, stored either as sampled PCM data, or pulse-density-modulated DSD data, to an analog output.

Damping factor:

Damping factor is the ratio of the impedance of a load to the output impedance of an amplifier. (See ‘output impedance’ in the Measurement Explanation section for more information)
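As a quick worked example of this ratio, here is a minimal Python sketch. The 32Ω load and 0.5Ω output impedance are hypothetical values, while the 2.5 million Ohm figure is the current source amplifier example given earlier in this glossary.

```python
def damping_factor(load_ohms: float, output_ohms: float) -> float:
    """Damping factor = load impedance / amplifier output impedance."""
    return load_ohms / output_ohms

# Hypothetical voltage source amplifier driving a 32 ohm headphone
print(damping_factor(32, 0.5))      # 64.0
# Current source amplifier with 2.5 million ohm output impedance
print(damping_factor(32, 2.5e6))    # effectively 0
```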

dB (Decibels):

A decibel is a relative unit of measurement that expresses the ratio of two values on a logarithmic scale.

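Whereby for power, +1dB = 1.259x, +3dB = ~2x, and +6dB = ~4x and so on. For amplitude quantities such as voltage or sound pressure, the same change in dB corresponds to the square root of these ratios, so +6dB = ~2x and +20dB = 10x.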

dB itself is simply a ratio, however there are other units prefaced by dB that express a particular unit on a logarithmic scale, such as dBSPL (sound pressure level) and dBV (voltage).

‘dBrX’ is an expression of ‘decibels relative to reference value X’, and you may see values such as ‘dBrA’ used on our measurements, where A is a reference voltage.
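Converting between decibels and plain ratios is a one line calculation each way. The Python sketch below uses illustrative function names, and separates power ratios (10·log10) from amplitude ratios such as voltage or sound pressure (20·log10), since the same dB figure gives a different multiplier for each.

```python
def db_to_power_ratio(db: float) -> float:
    """Decibels to a power ratio."""
    return 10 ** (db / 10)

def db_to_amplitude_ratio(db: float) -> float:
    """Decibels to a voltage or sound pressure ratio."""
    return 10 ** (db / 20)

for db in (1, 3, 6, 20):
    print(f"+{db}dB: power x{db_to_power_ratio(db):.3f}, "
          f"amplitude x{db_to_amplitude_ratio(db):.3f}")
```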

dBfs / 0dBfs:

dBfs is short for ‘decibels full scale’. 0dBfs represents the maximum, or ‘full scale’ level of a device. In the case of a DAC, this is the maximum value it can output, and for an ADC, it is the maximum value it can record. Any signals above 0dBfs will become ‘clipped’ as the device is not capable of outputting or recording the required/intended value. (See ‘clipping’ for more info)

dBV:

dBV is short for ‘decibel Volt’. It is a logarithmic scale for voltage, with 1Vrms being equal to 0dBV.

For voltage, every 6dB is approximately a 2x change in value. So 6dBV = ~2Vrms, 12dBV = ~4Vrms, and -6dBV = ~0.5Vrms.
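For anyone wanting to convert between dBV and volts themselves, a minimal Python sketch is shown here. The function names are illustrative, and the only assumption is the 1Vrms = 0dBV reference stated above.

```python
import math

def vrms_to_dbv(vrms: float) -> float:
    """Voltage in Vrms to dBV (0dBV = 1Vrms)."""
    return 20 * math.log10(vrms)

def dbv_to_vrms(dbv: float) -> float:
    """dBV back to a voltage in Vrms."""
    return 10 ** (dbv / 20)

for dbv in (-6, 0, 6, 12):
    print(f"{dbv:+d}dBV = {dbv_to_vrms(dbv):.3f}Vrms")
```

The results land very close to, but not exactly on, 0.5, 1, 2 and 4Vrms, because a doubling of voltage is really about 6.02dB.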

DC (Direct Current):

Direct current is a one-directional flow of electric charge, where voltage remains positive or remains negative, as opposed to AC (alternating current) signals such as audio in which the voltage alternates between a positive voltage and negative voltage.

DC is typically referred to when discussing power supplies. Audio devices should not have a significant level of DC as this may damage connected devices.

DC can be eliminated via the use of a transformer or DC blocking capacitor, both of which only allow AC signals to pass through.

DC Offset:

DC offset is a measure of how much a signal is offset from 0V.

A sine wave with peaks of +1V and -1V has 0V DC offset. A sine wave with peaks of +1.5V and -0.5V has a +0.5V DC offset. Typically this also means that when nothing is playing, the output of the device would show a steady +0.5V DC voltage.
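The DC offset of a signal is simply its average value, which for a symmetrical waveform such as a sine wave is the midpoint between the positive and negative peaks. This can be checked with a short Python sketch (the function name `dc_offset` and the 1000 sample cycle are illustrative choices):

```python
import math

def dc_offset(samples):
    """The DC offset is the average value of the signal."""
    return sum(samples) / len(samples)

N = 1000  # samples in one full cycle
centered = [math.sin(2 * math.pi * n / N) for n in range(N)]  # peaks at +1V / -1V
shifted = [s + 0.5 for s in centered]                          # peaks at +1.5V / -0.5V

print(f"centered: {dc_offset(centered):+.3f}V")
print(f"shifted:  {dc_offset(shifted):+.3f}V")
```

A sine wave with peaks of +1.5V and -0.5V is an ordinary ±1V sine wave sitting on top of a +0.5V DC level.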

Audio devices can sometimes have a DC offset, which could pose a risk to connected devices. A DC offset present on a signal can also cause an ADC to perform worse, and therefore many ADCs will be ‘AC coupled’ to prevent this.

DC Coupled:

A DC coupled device allows both AC signals and DC signals to pass through. This is in contrast to an AC coupled device, which has a high-pass filter that allows AC signals to pass through but blocks DC signals. DC coupled amplifiers have no low-frequency cutoff, and may amplify DC offset on the incoming signal if present.

DDC (Digital to Digital Converter):

A DDC or digital to digital converter is a device that converts an incoming digital signal to another format. Most commonly, this involves taking a USB input and providing I2S, AES and SPDIF outputs. However the term can be used more broadly to describe any device with digital input and digital output. Some types of digital to digital converter are also called ‘reclockers’, as their primary intention is to provide a digital output with extremely low jitter to a DAC.

Delta Sigma:

Delta Sigma is a description given to a category of DACs which, instead of converting PCM signals at their native bit depth (as an R2R DAC does), use delta-sigma modulation to convert the high bit depth, low sample rate PCM information to a much higher sample rate, low bit depth format.

Almost all modern DACs are delta sigma. The approach was created because producing an R2R design accurate to 16 bit or higher is exceptionally difficult, whereas creating a converter with only, for example, 1 bit or 5 bits of resolution that can run at extremely high speed is much easier.

Mathematically, speed can be traded for native resolution. If you need to output 80% voltage, but the DAC is 1-bit, meaning it can only output 100% or 0%, then you can instead run it at a very high speed, switching the output on and off a few thousand times, with it being ‘on’ about 80% of the time. When you pass this signal through a low pass filter to effectively ‘average out’ the result, the resulting voltage is 80%.
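The idea of trading speed for resolution can be sketched in Python with a very simplified first-order modulator. This is only an illustration under stated assumptions: the function name is hypothetical, the input is a constant level rather than music, and the ‘low pass filter’ is just an average of the whole bitstream. Real delta-sigma DACs use higher-order modulators and proper filters.

```python
def one_bit_stream(target: float, length: int):
    """Produce a 1-bit stream whose average approaches `target` (0.0 to 1.0)."""
    accumulator = 0.0
    bits = []
    for _ in range(length):
        accumulator += target       # add the level we want to reproduce
        if accumulator >= 1.0:
            bits.append(1)          # output 'on' (100%)
            accumulator -= 1.0      # carry the remaining error forward
        else:
            bits.append(0)          # output 'off' (0%)
    return bits

bits = one_bit_stream(0.8, 1000)
average = sum(bits) / len(bits)     # crude stand-in for the low pass filter
print(bits[:10], f"average = {average:.3f}")
```

Each output value is only ever 0 or 1, yet the average of the stream settles at about 0.8, matching the 80% example above.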

Differential:

(Please read ‘single ended’ and ‘balanced’ first)

A differential audio configuration is a two wire connection that provides the signal on one conductor, and an inverted polarity version of the same signal on the other conductor. A differential amplifier amplifies the difference between these two signals.

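Because the difference between the two conductors is twice the amplitude of the signal on either one, the differential signal is 6dB higher in level. A differential signal will therefore have a 6dB higher signal to noise ratio than a balanced signal where only one conductor carries the signal.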

Digital:

Digital information is stored as discrete, individual figures. Digital audio for example stores amplitude information in samples, typically 44100 times per second. Nothing exists between them, and these samples have a finite level of precision according to their bit-depth.

This is in contrast to ‘analog’ where a physical property varies continuously and could in theory be measured with infinite timing and amplitude precision with the right tools. A slider that can be set to any position between two points is ‘analog’.

Discrete:

Discrete is a term used to describe a design that uses individual, dedicated components rather than ICs (integrated circuits) which combine several different components into a single device. Most commonly used to denote that an amplifier or DAC output stage uses discrete transistors and other components instead of IC opamps.

Distortion:

Distortion is any unwanted alteration to a signal. If a pure 1khz sine wave is played through a device, the result would ideally be a pure 1khz sine wave. However any real device will change the signal slightly, and any change is considered a distortion of the original signal.

Distortion can be broken down into different types, the most common of which is ‘THD’ (See ‘THD+N’ for more information), however distortion as an umbrella covers any unintended change to a signal.

‘Linear’ distortion refers to types of distortion where the signal is altered without producing additional frequencies. Such as an incorrect change in amplitude.

‘Nonlinear’ distortion refers to types of distortion where the signal is altered and additional frequencies are added, such as harmonic distortion.

Dither / Dithering:

Dithering is the process of adding a small amount of random noise to digital audio, to mask quantization errors and distortion that occur when reducing the bit depth of an audio file.

It seems counterintuitive, but works very effectively. Without dithering, truncation distortion can occur.

The most standard type of dithering is ‘TPDF’ or triangular probability density function. But some products or tools use high performance noise shaping to push the quantization noise outside of the audible band where it can be filtered out, leaving higher effective dynamic range within the audible band.
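To show why adding noise can help, here is a minimal Python sketch of triangular dither. It is an illustration under assumptions rather than how any particular product works: values are expressed in units of one quantization step (LSB), the ‘signal’ is a constant level of 0.3 LSB, and the function names and fixed random seed are arbitrary choices.

```python
import math
import random

def quantize(value_lsb: float) -> int:
    """Round to the nearest whole quantization step."""
    return math.floor(value_lsb + 0.5)

def triangular_dither(rng: random.Random) -> float:
    """Difference of two uniform values: triangular noise between -1 and +1 LSB."""
    return rng.random() - rng.random()

rng = random.Random(0)   # fixed seed so the run is repeatable
signal_lsb = 0.3         # a level smaller than one quantization step
count = 20000

plain = [quantize(signal_lsb) for _ in range(count)]
dithered = [quantize(signal_lsb + triangular_dither(rng)) for _ in range(count)]

mean_plain = sum(plain) / count
mean_dithered = sum(dithered) / count
print(f"without dither: {mean_plain:.3f} LSB")
print(f"with dither:    {mean_dithered:.3f} LSB")
```

Without dither the 0.3 LSB level is lost entirely, as every sample rounds to 0. With dither each sample is noisier, but the average lands close to 0.3 LSB, so the information is preserved as signal plus noise instead of being truncated.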

Driver:

A ‘driver’ is the part of a speaker or headphone that moves or ‘drives’ the air. In speakers this is the cone shaped part you see which moves. This is known as a ‘dynamic driver’. Dynamic drivers are the most common, but there are others such as planar magnetic, electrostatic, and balanced armature.

‘Driver’ can also refer to a software component needed on some systems for the full functionality of a USB DAC or audio interface.

DSD:

DSD (Direct Stream Digital) is an alternative method of storing and converting digital audio.

The vast majority of modern digital audio uses PCM (See ‘PCM’ for more info), which is a low sample rate, high bit-depth format. DSD is a very high sample rate format with a bit-depth of only 1-bit.

DSP:

DSP stands for ‘Digital Signal Processing’. This encompasses any process that alters the content of digital audio, such as upsampling, EQ, digital volume control, and digital crossfeed.

Dummy Load:

A dummy load is a device used when taking measurements of a headphone amplifier. It provides a specific impedance load that simulates having a headphone connected. A real headphone cannot be used as the behaviour of the headphone itself would impact the measured results.

Dynamic Range:

Dynamic range is a ratio of the maximum signal level a device can output to the noise floor. (See ‘AES17 Dynamic Range’ in the Measurement Explanation section for more info)

Electrical Reflection:

An electrical reflection is when a signal transmitted through a cable reflects back down in the other direction. The reflected signal overlays the outgoing signal and can cause interference to the original signal.

This is not an issue in audible range analog connections, but can cause issues with some very high speed digital connections. The most effective solution is to use a line termination resistor, which receives the electrical energy from the signal and prevents it reflecting back down the cable. This works best when the termination resistor is equivalent in impedance to the cable’s own characteristic impedance.

ENOB (Effective Number of Bits):

ENOB is a measure of the dynamic range of a DAC or ADC. It is the effective resolution in bits of a device after nonlinearities and noise are taken into account. Using the formula below, an ENOB of 16 bit equates to roughly -98dB THD+N, and an ENOB of 24 bit equates to roughly -146dB THD+N.

ENOB = (SINAD - 1.76) / 6.02
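The formula above is easy to apply in both directions. The Python sketch below uses illustrative function names, and takes SINAD as a positive number of dB (so a THD+N of -98dB is a SINAD of 98dB).

```python
def enob(sinad_db: float) -> float:
    """Effective number of bits from SINAD in dB."""
    return (sinad_db - 1.76) / 6.02

def sinad_for_bits(bits: float) -> float:
    """The SINAD in dB needed to reach a given ENOB."""
    return bits * 6.02 + 1.76

for bits in (16, 20, 24):
    print(f"{bits} bit needs about {sinad_for_bits(bits):.1f}dB SINAD")

# A hypothetical device measuring 120dB SINAD
print(f"120dB SINAD is about {enob(120):.1f} bits")
```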

Even Order:

Even order harmonics refer to harmonics that occur at even multiples of the fundamental frequency. For 1khz, this would be 2khz, 4khz, 6khz and so on. For 3khz this would be 6khz, 12khz and so on.

Feedback:

Feedback is a method of correcting for nonlinearities in an amplifier circuit. The most commonly used method is ‘negative feedback’ whereby a portion of the output signal of an amplification circuit is fed back to the input, and this signal is subtracted from the input signal itself. This means that distortion present on the output is subtracted from the input and should be canceled out to a large extent. (Read more here)

Opamps facilitate negative feedback loops as the output can be looped back to the inverting input of the opamp and summed with the non-inverting input.

FFT (Fast Fourier Transform):

Fast Fourier Transform is a method of separating and displaying a signal as a measure of its individual frequency components. (See ‘FFT’ in ‘The Basics’ section for more information)

FFT Gain:

FFT gain or FFT processing gain is the effect whereby the noise floor shown on an FFT is reduced when a higher point FFT is run, even if the actual noise level has not changed. This can also be taken advantage of to inspect content that exists below the noise floor of a device.

This is because an FFT displays each ‘bin’ or frequency range as a measure of amplitude. If you increase the number of bins in the FFT, and therefore the number of frequency ranges that the measured noise is divided into, less of the noise will fall into any one bin because there are now more of them.

Because noise is random, it is the noise power rather than the voltage that divides between bins. If our noise of 1Vrms is spread evenly between 10hz and 100hz, and we only have two bins in that range, then each bin would hold roughly half of the noise power and show roughly 0.71Vrms amplitude.

If we increase the number of bins to 4, then each bin would hold roughly a quarter of the noise power and show roughly 0.5Vrms amplitude. Each doubling of the number of bins therefore lowers the displayed noise floor by about 3dB.
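The effect of bin count on the displayed noise floor can be estimated with a short Python sketch. It assumes the noise is random and spread evenly across the bins, so that noise power (voltage squared) divides equally between them. The function names are illustrative only.

```python
import math

def per_bin_vrms(total_vrms: float, bins: int) -> float:
    """Noise level shown in each bin when the noise power is split evenly."""
    return total_vrms / math.sqrt(bins)

def noise_floor_drop_db(bins_before: int, bins_after: int) -> float:
    """How far the displayed noise floor falls when the bin count increases."""
    return 10 * math.log10(bins_after / bins_before)

for bins in (2, 4, 8):
    print(f"{bins} bins: {per_bin_vrms(1.0, bins):.3f}Vrms per bin")

print(f"2 -> 4 bins lowers the floor by {noise_floor_drop_db(2, 4):.2f}dB")
```

Under this assumption, each doubling of the number of bins lowers the displayed noise floor by about 3dB even though the total noise is unchanged.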

Filter:

A filter alters the amplitude of frequencies within a specified range. This can be implemented digitally or in the analog domain.

The most common types of filters you are likely to see discussed in audio are bandpass, lowpass, highpass, peak, and shelf. DACs have ‘reconstruction filters’, though this is actually just a name for a particular design of digital lowpass filter.

Frequency domain:

The frequency domain describes signals as a function of frequency and amplitude, but not time. The most prominent example is the FFT, which shows you the frequency and amplitude information for a signal, but does not show any information relating to time.

Gain:

Gain is a measure of how much a signal is amplified or attenuated, usually expressed in decibels.

Applying 6dB gain to a signal doubles the voltage level.

Harmonics:

A harmonic is a type of distortion or unwanted signal content that occurs in all real devices. It is a small amount of added signal content at a multiple of the fundamental or input signal frequency. For example the harmonics of a 1khz sine wave would be 2khz, 3khz, 4khz sine waves and so on.

High Pass Filter:

A high pass filter is a filter that only allows signals above a specified frequency to pass through, hence ‘highs pass through, high pass’.

HRTF (Head Related Transfer Function):

HRTF is the transfer function of a human head (and torso where applicable). It describes how sound is altered by the physical shape of the human body, particularly the head, outer ear and ear canal, before reaching the eardrum itself.

I2S:

I2S (sometimes labelled IIS) is an audio communication protocol most commonly found internally within DACs. It consists of a word clock line, a bit clock line and a serial data line, often alongside a master clock line, and occasionally with additional lines for tasks such as flagging when content is DSD instead of PCM.

I2S has also become popular as an external connection method between DACs and digital sources, most commonly over an HDMI connection.

This external method of I2S communication is called ‘I2S LVDS’ and whilst it uses the HDMI connector and cable, it is NOT the same as HDMI and connecting an HDMI source to an I2S device or vice versa may cause damage.

Impedance:

Electrical impedance, denoted ‘Z’ and measured in Ohms (Ω) is the opposition to the flow of current. A high impedance load will draw less current for a given voltage than a low impedance load.
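How much current a load draws follows from Ohm’s law (current = voltage / impedance). The Python sketch below compares two hypothetical headphones at the same voltage, and assumes for simplicity that the impedance is a single fixed value instead of something that varies with frequency.

```python
def load_current_ma(voltage_vrms: float, impedance_ohms: float) -> float:
    """Current drawn by a load in milliamps, using Ohm's law (I = V / Z)."""
    return voltage_vrms / impedance_ohms * 1000

# Hypothetical 32 ohm and 300 ohm headphones, both driven with 2Vrms
for impedance in (32, 300):
    print(f"{impedance} ohms: {load_current_ma(2.0, impedance):.1f}mA")
```

The 32Ω load draws almost ten times the current of the 300Ω load for the same voltage.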

It is most commonly found as part of the specification of a headphone or speaker, and additionally devices such as DACs and amplifiers will have an ‘input impedance’ or ‘output impedance’ specification. (See ‘input impedance’ and ‘output impedance’ for more info)

Standard impedance should not be confused with characteristic impedance (See ‘characteristic impedance’ for more info).

Additionally in other situations, ‘impedance’ may refer to a non-electrical property, such as acoustic impedance which describes the resistance to change in sound pressure level.

Impulse Response:

Impulse response shows how a pure impulse, which mathematically represents all frequencies at equal levels, is altered when being passed through a device. Thereby describing how the device alters content being passed or played through it. (See ‘Impulse Response’ in the Measurement Explanation section for more info)

Integrated Amplifier:

An integrated amplifier is an amplifier that combines the functionality of both a preamplifier and a power amplifier in one unit, allowing volume to be controlled, and often also input switching, without the use of a second product.

Intermodulation:

Intermodulation is the alteration of amplitude of one signal caused by interference from another. This produces distortion not at a harmonic (multiple) of one frequency, but at sums and differences of the two frequencies and their multiples. For example intermodulation distortion between a 7khz tone and 60hz tone would produce distortion components at 7060hz, 7120hz, 6940hz, 6880hz and so on. (See ‘IMD’ in the Measurement Explanation section for more info).
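The frequencies in the example above can be generated with a short Python sketch. The function name and the `max_order` parameter are illustrative, and only the sidebands around the higher tone are listed, as in the example. A real device would also produce other products.

```python
def intermod_sidebands(f_high_hz: int, f_low_hz: int, max_order: int = 2):
    """Sidebands at f_high plus or minus multiples of f_low."""
    products = []
    for n in range(1, max_order + 1):
        products.append(f_high_hz + n * f_low_hz)
        products.append(f_high_hz - n * f_low_hz)
    return sorted(products)

print(intermod_sidebands(7000, 60))  # [6880, 6940, 7060, 7120]
```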

Interpolation:

Interpolation is the process of estimating or calculating what the value between two known points should be. In audio, this typically refers to oversampling, whereby a DAC must interpolate and add new PCM samples between the existing 44.1khz PCM samples in order to reconstruct the original recorded waveform.
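The simplest possible form of interpolation is to draw a straight line between neighbouring samples and place a new sample halfway along it. The Python sketch below does exactly that to double the sample rate. This is a deliberately basic stand-in to show the concept. Oversampling DACs generally use far more sophisticated filters than straight lines, and the function name here is hypothetical.

```python
def upsample_2x_linear(samples):
    """Double the sample rate by inserting the midpoint between each pair."""
    output = []
    for current, following in zip(samples, samples[1:]):
        output.append(current)                    # keep the original sample
        output.append((current + following) / 2)  # new interpolated sample
    output.append(samples[-1])                    # final original sample
    return output

print(upsample_2x_linear([0.0, 1.0, 0.0]))  # [0.0, 0.5, 1.0, 0.5, 0.0]
```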

Intersample Overs:

Intersample Overs are when the sample values necessary to describe the original analog waveform fall above the maximum possible digital value. This occurs when music mastered with insufficient headroom is played through an oversampling DAC that has no internal digital headroom. (See ‘Intersample Overs’ in the Measurement Explanation section for more information)

J-Test:

The J-Test is a test used to evaluate the jitter performance of a device. (See ‘Jitter’ in the Measurement Explanations section for more info)

Jitter:

Jitter is the name given to time domain inaccuracies. Because digital audio is composed of samples at even intervals, not only must the data itself be converted accurately, but also it must be converted at precise intervals. (See ‘Jitter’ in the Measurement Explanations section for more info)

LCR Meter:

An LCR meter is a tool used for measuring inductance (L), capacitance (C), and resistance (R) of an electrical component or cable.

Linear, Time Invariant:

Linear describes a system where the output is precisely proportional to the input in both amplitude and time. This means it would be free from any alteration such as distortion or noise.

Scaling the input amplitude by a factor of X will scale the output amplitude by precisely the same factor of X.

Time Invariant means that inputting a signal Y now, will result in precisely the same output as if it were input 2 seconds later, other than a delay of 2 seconds. The output does not depend on when the input is provided.

Most audio systems are considered to be approximately linear, with how linear a system is being determined by its ‘nonlinearities’, which contribute changes to the output signal such as harmonic distortion and noise.

Most audio systems are considered to be ‘time invariant’, as they do not have properties which change over time. (Very long term effects such as capacitor aging are not considered in this. It almost always refers to only effects within a practically applicable timespan).

Linear Power Supply:

A linear power supply is a power supply that first uses a transformer to ‘step down’ or reduce the incoming AC mains voltage, and then rectifies the low-voltage AC to DC.

Linear power supplies are less efficient than switching/switch-mode power supplies, and are both larger and more expensive due to the need for a sizeable transformer, but produce no switching noise.

Line Level:

In audio, ‘Line Level’ refers to the typical or expected signal level on an XLR or RCA connection. For XLR this is 4V RMS, and for RCA this is 2V RMS.

Line level is the most commonly used value for these connections, but they are not hard and fast rules. Most audio products will happily accept signals higher than normal line level, and many modern DACs have output levels of about 5V RMS on XLR.

Load:

The ‘load’ is the device or component connected to the output of an amplifier. ‘Unloaded’ means there is nothing connected, with nearly infinite impedance between the signal and ground and therefore no current being drawn.

A load may be described as ‘difficult’ if it is low impedance, as this will draw more current for a given voltage than a high impedance load.

Typical loads are speakers, headphones and dummy loads. But may also describe amplifiers themselves, such as when an amplifier is connected to a DAC.

Amplifiers will usually be designed to have extremely high input impedance, so that they draw almost no current from the source device they are connected to.

Loopback:

A ‘loopback’ is when the output of a device is fed back to the input of the same device. This is most common when doing a loopback on an audio analyzer, to check what the baseline performance of the analyzer itself for a particular measurement is, and ensure that it is not limiting the measured performance of a device.

Low Pass Filter:

A low pass filter is a filter that only allows frequencies below a specified cutoff frequency to pass through, hence ‘lows pass through, low pass’.
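Real filters do not cut off instantly at the cutoff frequency, they roll off gradually. As a rough sketch, the gain of the simplest possible case (a first-order, single-pole low pass, chosen here purely as an illustration) can be calculated as below. The 1khz cutoff is an arbitrary assumption.

```python
import math


def first_order_lowpass_gain_db(frequency_hz: float, cutoff_hz: float) -> float:
    """Gain in dB of an ideal first-order (single-pole) low pass filter."""
    magnitude = 1.0 / math.sqrt(1.0 + (frequency_hz / cutoff_hz) ** 2)
    return 20.0 * math.log10(magnitude)


# Frequencies well below the cutoff pass almost untouched, those above are attenuated.
for f in (100.0, 1000.0, 10000.0, 100000.0):
    print(f"{f:>8.0f} Hz: {first_order_lowpass_gain_db(f, cutoff_hz=1000.0):7.2f} dB")
```

At the cutoff frequency itself this type of filter is about 3dB down, and higher order filters fall away more steeply above it.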

Mains Leakage:

Mains leakage is the term used to describe when mains electricity causes interference with a device. It shows up as 50hz (or 60hz depending on region) and harmonics of that frequency.

This is most common in devices with linear power supplies, as the transformers cause magnetic interference with the rest of the circuit.

Masking:

Perceptual masking in audio is when the audibility of one signal is reduced due to the presence of another signal.

An example is harmonic distortion, where harmonics lower in order (closer to the fundamental) are less audible as they are ‘masked’ by the fundamental tone.

A low level 2khz tone may be audible if played by itself, but may not be in the presence of a higher level 1khz tone as the 1khz tone masks it.

Modulate / Modulation:

Modulation is the process of varying one or more properties of one signal (called the carrier), with a separate signal.

This can be intentional, such as in DSD, which uses pulse density modulation, where an audio signal modulates the density of equal amplitude pulses.

It can also be unintentional, such as when variations in clock output (jitter) modulate the audio waveform.

Mono:

The term used to describe an audio system that has only one audio channel. A system with two speakers can still be mono if both are outputting the same signal.

Nonlinear / Nonlinearity:

A nonlinearity is any behaviour in a system whereby the output is not directly proportional to the input. A perfectly linear system would have no noise or distortion of any type that was not present on the input signal.

NOS (Non-Oversampling):

NOS describes when a DAC converts the incoming 44.1khz PCM data natively, without any oversampling or reconstruction filter.

This is something only doable on some ‘R2R’ DAC designs, as most DACs operate on some form of delta-sigma processing, and must oversample in order to achieve an acceptable level of accuracy.

True NOS requires a converter that is natively accurate to the bit depth of the data (typically 16 bit). Most modern DACs are natively 1-5 bits, but achieve a higher effective number of bits / accuracy through the use of high speed delta-sigma modulation.

Some devices can closely simulate a NOS output, by using a ‘zero-order-hold’ reconstruction filter, whereby they DO oversample and/or perform delta sigma processing, but attempt to hold the output voltage at the level of a sample until the next sample is received. This is not technically NOS as oversampling is still taking place, but the filter design is a digital method of simulating analog zero-order-hold.

NOS is technically inaccurate, as it does not reconstruct the signal according to the Nyquist theorem, but some listeners find it subjectively enjoyable or preferable to oversampling.

Notch Filter:

A notch filter is a type of filter that attenuates a single frequency to an extremely high degree, but ideally has as little impact on surrounding frequencies as possible. This is most commonly used in analyzers and measurement setups to filter out a 1khz tone, leaving only the noise and harmonics to be measured.

Nyquist Frequency:

The Nyquist frequency is the frequency at half the sampling frequency of a device or file.

For standard ‘redbook’ digital audio of 44.1khz, the Nyquist frequency is 22.05khz.

Devices cannot accurately capture or reproduce content above the Nyquist frequency.
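As a rough illustration of why content above the Nyquist frequency cannot be captured accurately, the sketch below calculates where a tone would appear if it were sampled with no filtering at all. The function name is hypothetical and ideal sampling is assumed.

```python
def alias_frequency(frequency_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a tone appears after ideal sampling with no filtering."""
    nyquist = sample_rate_hz / 2.0
    # Sampling cannot tell a frequency f apart from f plus any multiple of the sample rate.
    folded = frequency_hz % sample_rate_hz
    # Anything left above Nyquist is mirrored ('folded') back below it.
    return folded if folded <= nyquist else sample_rate_hz - folded


print(alias_frequency(1000.0, 44100.0))   # below Nyquist, unchanged
print(alias_frequency(30000.0, 44100.0))  # above Nyquist, folds down to 14100.0
```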

Odd Order:

Odd order harmonics refer to harmonics that occur at odd multiples of the fundamental frequency. For 1khz, this would be 3khz, 5khz, 7khz and so on. For 3khz this would be 9khz, 15khz and so on.

Opamp (Operational Amplifier):

An opamp is a voltage-amplifying component that has a differential input (a positive non-inverting input, and a negative inverting input) and single ended output. They are typically an IC (integrated circuit) component, but discrete opamps also exist.

They are commonly used because the differential input makes it easy to construct a negative feedback loop, where the output is looped back to the inverting input so that it is subtracted from the signal at the non-inverting input. Distortion present on the output is thereby subtracted from the input signal and largely cancelled out.

Oscilloscope:

An oscilloscope is a device used to measure and display the electrical output of a device as a waveform (amplitude over time). Some can also perform other functions such as calculating frequency or RMS values, calculating rise/fall time, or showing an FFT.

Out of Band:

The term ‘out of band’ refers to any signal that falls outside of the bandwidth being discussed. This typically either refers to content outside of the audible band of 20hz-20khz, or content above the bandwidth of digital audio, above the Nyquist frequency.

Output Impedance:

The output impedance is a measure of a device's propensity to drop in output voltage when the load draws current. (See ‘Output Impedance’ in the Measurement Explanation section for more info)

A high output impedance can alter the frequency response of connected loads if the impedance of the load varies depending on frequency, such as dynamic driver speakers or headphones.
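The effect on frequency response can be sketched by treating the output impedance and the load as a simple voltage divider. The headphone below is hypothetical (32 Ohms at most frequencies, rising to 64 Ohms at a resonance), and both impedances are treated as purely resistive, which is a simplification.

```python
import math


def load_level_db(output_impedance: float, load_impedance: float) -> float:
    """Level at the load relative to the unloaded output, in dB (resistive divider)."""
    return 20.0 * math.log10(load_impedance / (output_impedance + load_impedance))


def response_deviation_db(output_impedance: float, z_nominal: float, z_peak: float) -> float:
    """How much louder the load plays where its impedance peaks, in dB."""
    return load_level_db(output_impedance, z_peak) - load_level_db(output_impedance, z_nominal)


for z_out in (0.1, 32.0):
    deviation = response_deviation_db(z_out, z_nominal=32.0, z_peak=64.0)
    print(f"Output impedance {z_out} Ohm: {deviation:.2f} dB bump at the impedance peak")
```

With a very low output impedance the bump is negligible, whereas with a 32 Ohm output impedance it is roughly 2.5dB.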

Oversampling Filter:

An oversampling filter (also called a Nyquist filter, reconstruction filter, or upsampling filter) is the filter design used by a DAC, ADC or software to remove and attenuate content above the Nyquist frequency when oversampling/upsampling and reconstruct the original signal accurately. An ideal oversampling filter would remove all content above the Nyquist frequency and allow everything else to pass through.

Some oversampling filters attenuate too early, attenuating some content in the upper frequencies of the audible band.

Some oversampling filters do not attenuate fully by Nyquist. In an ADC, this causes unwanted ‘aliasing’, folding content above the Nyquist frequency back down into the audible band. In a DAC, this causes unwanted ‘imaging’ (not the same as the subjective term), reflecting audible band content up into the ultrasonic frequencies.

Passband:

The ‘passband’ is the range of frequencies which a bandpass filter allows to pass through.

PCM (Pulse Code Modulation):

PCM is the main way in which digital audio is stored. It consists of samples which describe the amplitude of an audio signal at that point in time, and these samples come at a frequency of usually 44.1khz.

It is an efficient way of storing audio digitally, as according to Nyquist Theory, it can capture, and we can reconstruct, the original audio signal up to a maximum of half the sampling frequency. For 44.1khz audio, this is 22.05khz.
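A minimal sketch of what PCM samples look like is shown below. It samples a full-scale 1khz sine wave at 44.1khz and rounds each sample to a 16 bit signed integer. Dither is left out for simplicity and the function name is illustrative.

```python
import math


def pcm_samples(frequency_hz, sample_rate_hz=44100, bit_depth=16, count=8):
    """Sample a full-scale sine wave and round each sample to a signed integer."""
    full_scale = 2 ** (bit_depth - 1) - 1  # 32767 for 16 bit audio
    return [
        round(full_scale * math.sin(2.0 * math.pi * frequency_hz * n / sample_rate_hz))
        for n in range(count)
    ]


# Each number is the amplitude of the waveform at one sample instant.
print(pcm_samples(1000.0))
```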

Pentaconn (4.4mm Balanced):

Pentaconn is the name given to a 5-pole 4.4mm jack connector used for analog audio, most commonly found on balanced IEMs and headphones.

Phase:

Phase describes the alignment of a periodical/repeating waveform. It is measured in degrees.

Two sine waves that are ‘in phase’, or at 0deg/360deg will rise, peak and fall at exactly the same time.

Two sine waves that are ‘out of phase’ by 180 degrees, will have one rise as the other falls and vice versa, moving exactly opposite.

‘Absolute Phase’ or ‘absolute polarity’ refers to whether two audio channels, which are in phase with each other, are in phase with the intended signal. This is not usually audible (and many recordings themselves have inverted phase for some or all elements), but can be in some situations. ‘Relative phase’, which describes whether the left channel is in or out of phase with the right, is much more important and immediately obvious when it is incorrect.
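Why relative phase matters can be shown by adding two equal sine waves together with a phase offset between them. This is a simplified numerical sketch, with a hypothetical function name, rather than a model of any real playback system.

```python
import math


def summed_peak(phase_offset_deg: float, points: int = 360) -> float:
    """Peak level of two equal unit sine waves added together with a phase offset."""
    offset = math.radians(phase_offset_deg)
    return max(
        abs(math.sin(2.0 * math.pi * n / points) + math.sin(2.0 * math.pi * n / points + offset))
        for n in range(points)
    )


for degrees in (0, 90, 180):
    print(f"{degrees:>3} degrees apart: summed peak = {summed_peak(degrees):.3f}")
```

In phase, the two waves reinforce each other and the peak doubles. At 180 degrees they cancel almost completely.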

Polarity:

In audio specifically, polarity refers to whether an audio signal/waveform is completely aligned with another, or inverted.

Two signals would be said to have the same polarity if when one increases in voltage from -1V to 2V the other does the same.

Two signals would be said to have inverted polarity if when one increases in voltage from -1V to 2V the other decreases from 1V to -2V.

Inverted polarity signals are used in balanced and differential connections.

Post Ringing:

Post ringing is a term used to describe an oscillation on an impulse response output after the impulse itself. As opposed to ‘pre ringing’ where the oscillation occurs before the impulse itself.

A minimum phase system will have only post ringing and no pre ringing. A linear phase system will have equal pre and post ringing.

Potentiometer:

A potentiometer is a voltage divider component, most commonly used for amplifier volume control. When the user turns the volume knob, this turns the potentiometer, and alters the signal level being fed to the amplifier itself, thereby adjusting volume.

Power:

Power is the rate at which work is done. In electrical terms, it is equal to voltage multiplied by current. This is most commonly used to describe the maximum output level an amplifier can supply for a particular load impedance.

Power is not a hard limit however, and a device capable of supplying up to 6 Watts of power may not perform equally at all output levels up to that point. (See ‘Power’ in the Measurement Explanation section for more info)
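Because current depends on the load impedance, the same voltage delivers very different power into different loads. The sketch below works this through for a purely resistive load. The voltages and impedances are illustrative assumptions.

```python
import math


def power_watts(voltage_rms: float, load_ohms: float) -> float:
    """Power into a resistive load: voltage multiplied by the current it draws."""
    current_rms = voltage_rms / load_ohms
    return voltage_rms * current_rms


def voltage_for_power(power_w: float, load_ohms: float) -> float:
    """RMS voltage needed to deliver a given power into a resistive load."""
    return math.sqrt(power_w * load_ohms)


print(power_watts(4.0, 32.0))    # 4 Vrms into 32 Ohms
print(power_watts(4.0, 300.0))   # the same voltage into 300 Ohms
print(voltage_for_power(6.0, 32.0))  # voltage needed for 6 Watts into 32 Ohms
```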

Power Amplifier:

A power amplifier is an amplifier intended for driving speakers, that does not have inbuilt volume control. It is intended to be used with an external pre-amplifier for volume control and input switching functionality.

Pre-Amplifier:

A pre-amplifier is a device intended to provide functionality of volume control, input switching, and sometimes to introduce a favourable colouration to a system.

Many audio systems use ‘power amplifiers’ which have no inbuilt volume control, and so a pre-amplifier is used for this purpose. A pre-amplifier is a line-level device and is not intended for driving speakers or headphones, though some may have additional features such as a built in headphone amplifier, left/right balance control or EQ.

Pre Ringing:

Pre ringing is a term used to describe an oscillation on an impulse response output before the impulse itself. As opposed to ‘post ringing’ where the oscillation occurs after the impulse itself.

A minimum phase system will have only post ringing and no pre ringing. A linear phase system will have equal pre and post ringing.

Pulse Density Modulation (PDM):

Pulse density modulation is a method of representing an analog signal using a stream of equal value pulses. Most commonly found either in the ‘DSD’ file format, or some DAC topologies.

A fixed frequency system is used, and for each frequency interval the output can either be on (“1”) or off (“0”) for the full interval.

The higher in amplitude the intended analog output should be, the more frequently the intervals will be ‘on’, ie: the more dense the pulses will be. The lower the intended analog output should be, the more frequently the intervals will be ‘off’, ie: the less dense the pulses will be. If the analog signal was intended to be 80% of maximum output, roughly 80% of the pulses would be ‘on’.

By modulating the density of these pulses, and then applying a low-pass filter to the output, the analog signal is produced. PDM systems become more accurate the faster they operate, as their effective dynamic range is directly proportional to how many pulses can be modulated in a given time period.
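The 80% example can be sketched with a very simple first-order encoder, which carries any leftover error forward to the next interval. This is an illustrative simplification, with a hypothetical function name. Real DSD encoders use much more sophisticated, higher-order noise shaping.

```python
def pdm_encode(levels):
    """Encode levels in the range 0..1 as a stream of 0/1 pulses (first-order sketch)."""
    accumulator = 0.0
    pulses = []
    for level in levels:
        accumulator += level
        if accumulator >= 1.0:
            pulses.append(1)      # interval is 'on'
            accumulator -= 1.0    # carry the leftover error forward
        else:
            pulses.append(0)      # interval is 'off'
    return pulses


pulses = pdm_encode([0.8] * 1000)
density = sum(pulses) / len(pulses)
print(pulses[:20], f"density = {density:.3f}")
```

Averaging the pulse stream (which is what the low-pass filter does) recovers a value very close to the intended 80%.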

Pulse Width Modulation (PWM):

Pulse width modulation is a method of representing an analog signal using a stream of equal value pulses. Most commonly found in some DAC topologies.

It is similar to PDM in that a fixed frequency system is used, but differs in that for each frequency interval, the pulse can be ‘on’ or active for only a fraction of it. By modulating the width of pulses, and then applying a low-pass filter to the output, the analog signal is produced. PWM systems become more accurate either by increasing their speed, as their effective dynamic range is directly proportional to how many pulses can be modulated in a given time period, or by increasing the degree of precision to which the width of each pulse can be adjusted. Most PWM systems in audio products have a width-modulation accuracy of about 5 bits, or 32 different potential width values.

PWM has an advantage over PDM in that switching frequency is constant, rather than varying depending on signal content.

Redbook:

Redbook is the nickname given to standard CD quality audio, which is 16bit, 44.1khz. The name originates from the ‘Rainbow Books’, a series of publications defining the various CD formats, in which the CD audio specification, published by Philips and Sony, was contained within a red book.

R2R:

R2R (R-2R) is a digital to analog conversion design which uses two precision resistors for each digital bit, to convert a digital binary number into an analogue output signal proportional to the value of the digital number.

Because of the repeating nature of this R-2R circuitry (16 times for 16 bit for example), it looks similar to, and is often referred to as a ‘ladder’ or ‘resistor ladder’.

R2R DACs have the benefit of requiring no digital processing of PCM or ‘Pulse Code Modulation’ signals, and can convert PCM digital audio natively, unlike delta sigma designs. The disadvantage is that their accuracy primarily relies upon having extremely tight tolerances in the resistor values. To build an R2R ladder accurate to 16 bit without any additional error correction methods would require resistors to be within roughly 0.0015% (one part in 65,536) of their intended value.

Most modern R2R DACs use some form of error correction or compensation to achieve higher accuracy.
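To give a sense of scale, the sketch below calculates the ideal output of a binary-weighted converter and the size of one least significant bit (LSB) as a percentage of full scale. It assumes an ideal ladder with an unsigned input code and a hypothetical 1V reference, and the function names are illustrative.

```python
def r2r_output(code: int, bits: int = 16, v_ref: float = 1.0) -> float:
    """Ideal output voltage of an n-bit binary-weighted converter for an unsigned code."""
    if not 0 <= code < 2 ** bits:
        raise ValueError("code out of range for this bit depth")
    return v_ref * code / 2 ** bits


def lsb_percent(bits: int) -> float:
    """Size of one least significant bit as a percentage of full scale."""
    return 100.0 / 2 ** bits


print(r2r_output(32768))           # only the most significant bit set: half of v_ref
print(f"{lsb_percent(16):.5f} %")  # one LSB at 16 bit
print(f"{lsb_percent(24):.7f} %")  # one LSB at 24 bit
```

One LSB at 16 bit is roughly 0.0015% of full scale, which is why resistor errors of even a few thousandths of a percent limit accuracy.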

RCA:

RCA (sometimes referred to as ‘cinch’) is a single channel, single ended analog connector intended for carrying line level signals. Commonly found on DACs and amplifiers.

Reconstruction Filter:

A reconstruction filter (also called a Nyquist filter, oversampling filter, or upsampling filter) is the filter design used by a DAC, ADC or software to remove and attenuate content above the Nyquist frequency when oversampling/upsampling and reconstruct the original signal accurately. An ideal oversampling filter would remove all content above the Nyquist frequency and allow everything else to pass through.

Some oversampling filters attenuate too early, attenuating some content in the upper frequencies of the audible band.

Some oversampling filters do not attenuate fully by Nyquist. In an ADC, this causes unwanted ‘aliasing’, folding content above the Nyquist frequency back down into the audible band. In a DAC, this causes unwanted ‘imaging’ (not the same as the subjective term), reflecting audible band content up into the ultrasonic frequencies.

Relay Attenuator:

A relay attenuator is a method of volume control whereby instead of signal being passed through and attenuated using an adjustable potentiometer, it is passed through resistors to attenuate the signal. The signal path and which resistors the signal is passed through is altered through the use of switching relays.

This is a specific type of stepped attenuator which is electronically controlled.

Residual:

Residual signal in audio is the signal you get if you take away the ‘intended’ signal.

For example, if you play a 1khz tone through the device, the ‘residual’ is everything coming out of the output that is NOT the 1khz tone. It is all of the distortion and noise.

Ringing:

Ringing is the term given to oscillations visible on the output of an impulse response test. Ringing occurs when a signal with content above the bandwidth of the device/system being tested is put through it.

It can also show in other situations where signal content may exceed the bandwidth of a device, such as square waves or when a waveform becomes clipped.

RMS:

In simplest terms, RMS can be considered roughly as the ‘average value’.

Mathematically, the root mean square of a set of numbers is defined as the square root of the mean square (the arithmetic mean of the squares) of the set.

For alternating electric current (such as audio signals), RMS is equal to the value of the constant direct current (DC) that would produce the same power dissipation in a resistive load. For example, a 4Vrms sine wave, despite having a peak of +/-5.657V, dissipates the same power as a constant 4V DC signal would for a given load.

You may often see voltage values given as ‘Vrms’ to indicate that it is an RMS and not a peak value.

However in many instances when discussing audio, this is simply shortened to V rather than Vrms.
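The definition can be checked numerically. The sketch below builds one cycle of a sine wave with a peak of 4 x sqrt(2) volts (about 5.657V) and calculates its RMS value directly from the samples. The 1000 sample resolution is an arbitrary choice.

```python
import math


def rms(values):
    """Root mean square: square every value, take the mean, then the square root."""
    return math.sqrt(sum(v * v for v in values) / len(values))


peak = 4.0 * math.sqrt(2.0)
sine = [peak * math.sin(2.0 * math.pi * n / 1000) for n in range(1000)]

print(f"peak = {max(sine):.3f} V, RMS = {rms(sine):.3f} V")
```

For a sine wave the RMS value is always the peak divided by the square root of 2. Other waveforms have different ratios.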

Sample Rate:

Sample rate is the frequency of samples in modern PCM digital audio, where each sample describes the amplitude of the waveform at that point in time.

The most common and standard sample rate for music content is 44.1khz.

Servo (DC Servo):

A DC servo circuit is a type of negative feedback system primarily intended to eliminate DC offset at the output of an amplifier without needing to put a DC blocking capacitor or transformer in the signal path.

SINAD:

SINAD (signal-to-noise-and-distortion ratio) is the reciprocal of THD+N. It is the same measurement, but expressed as a positive dB value instead of a percentage. For example, a THD+N of 0.001% is equivalent to a SINAD of 100dB.
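Converting between the two is a standard decibel calculation for a voltage ratio. A small sketch, with illustrative function names:

```python
import math


def thdn_percent_to_sinad_db(thdn_percent: float) -> float:
    """Convert a THD+N percentage into a SINAD figure in dB."""
    return -20.0 * math.log10(thdn_percent / 100.0)


def sinad_db_to_thdn_percent(sinad_db: float) -> float:
    """Convert a SINAD figure in dB back into a THD+N percentage."""
    return 100.0 * 10.0 ** (-sinad_db / 20.0)


for percent in (1.0, 0.01, 0.001, 0.0001):
    print(f"THD+N {percent}% = SINAD {thdn_percent_to_sinad_db(percent):.0f} dB")
```

Every factor of 10 improvement in THD+N adds 20dB of SINAD.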

Sine Sweep:

A sine sweep is a measurement where a single sine wave of increasing frequency is played through a device. This can be used to test many factors such as frequency response or THD+N vs frequency.

Slew Rate:

Slew rate is the rate at which a device can increase or decrease the output voltage. It is directly related to bandwidth as higher frequency signals require a faster slew rate.

Slew rate = 2 π f V

f = maximum frequency

V = peak voltage
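As a worked example, the formula gives the minimum slew rate needed to reproduce a sine wave of a given frequency and peak voltage without slew limiting. The 20khz and 10V figures below are illustrative assumptions, and the result is converted to volts per microsecond, which is how slew rate is usually quoted.

```python
import math


def required_slew_rate_v_per_us(max_frequency_hz: float, peak_voltage: float) -> float:
    """Minimum slew rate (in V/us) for a sine wave: 2 * pi * f * V."""
    volts_per_second = 2.0 * math.pi * max_frequency_hz * peak_voltage
    return volts_per_second / 1e6


print(f"{required_slew_rate_v_per_us(20000.0, 10.0):.2f} V/us")  # 20khz at 10V peak
```

Doubling either the frequency or the peak voltage doubles the slew rate required.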

Surface Mount Device (SMD) / Surface Mount Technology (SMT):

SMD/SMT refers to a method in which electrical components are mounted directly onto the surface of a printed circuit board (PCB), as opposed to ‘through hole’ mounted devices which have leads which go through holes in the PCB and are then soldered.

SMT devices are usually smaller than through hole equivalents as they require no leads and can be installed using precise, automated machinery.

This has many benefits such as being able to fit more components into an area on a board, reducing cost per component, and allowing for multi-layer PCB technology to operate directly beneath the component, but SMT devices are typically much more difficult or nearly impossible for the average user to repair.

Switch Mode Power Supply / Switching Power Supply (SMPS):

A switching power supply is a power supply which converts incoming mains AC to DC through the use of very high frequency switching/DC-pulsing components and filter capacitors. (See this article if you’d like to know more about how they work).

Switching power supplies have advantages over linear power supplies in that they can be much smaller, as the high frequency operation allows the step-down transformer to be more compact, they have much higher efficiency, and they are typically much cheaper.

They also will not produce as much 50hz/60hz mains leakage as they have no large transformer, but instead produce high frequency switching noise which must be filtered out.

SPDIF:

SPDIF (Sony/Philips Digital Interface) is a digital audio communication protocol found in many audio products.

SPDIF can be transmitted electrically, typically over an RCA or BNC cable which should have a characteristic impedance of 75 Ohms, or it can be transmitted optically, using the TOSLINK connection.

Optical SPDIF carries the same information as electrical SPDIF, just using amplitude of light instead of voltage.

SPDIF encodes a clock signal within the data stream, and as such the SPDIF source can in some cases have an impact on performance of the receiving device.

Sound Pressure Level (SPL):

Sound pressure or acoustic pressure is the local pressure deviation from the ambient atmospheric pressure, caused by a sound wave. Sound pressure level expresses this in decibels, relative to a reference pressure of 20 micropascals.
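The decibel figure is calculated against the standard reference pressure in air of 20 micropascals. A small sketch of the conversion, with an illustrative function name:

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # standard reference pressure in air, 20 micropascals


def spl_db(pressure_pa_rms: float) -> float:
    """Sound pressure level in dB SPL for a given RMS sound pressure in pascals."""
    return 20.0 * math.log10(pressure_pa_rms / REFERENCE_PRESSURE_PA)


print(f"{spl_db(1.0):.1f} dB SPL")  # 1 pascal is roughly 94 dB SPL
print(f"{spl_db(0.1):.1f} dB SPL")  # a tenth of the pressure is 20 dB lower
```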

Stepped Attenuator:

A stepped attenuator is a signal attenuation device typically used for volume control instead of a potentiometer.

The user turns a control, which puts various resistors in the signal path to attenuate it depending on position. An electronically controlled stepped attenuator is often referred to as a ‘relay attenuator’ as it uses relays to switch which resistors are in the signal path, rather than the user manually adjusting it.

Stereo:

A system with two audio channels, typically left and right.

Streamer:

A streamer is a device used to play music over a network, either by retrieving files from network accessible storage, or by providing a streaming endpoint for services such as Roon, Spotify, Tidal and DLNA/UPnP casting.

A streamer is typically a digital only device, and must feed an external DAC, but some devices combine both streaming and digital to analog conversion in one unit.

THD+N:

THD+N (Total harmonic distortion plus noise) is a measure of the RMS value of all of the harmonic distortion and noise at the output of a device. It can vary depending on the frequency of the test signal, output level, and the load as well as other factors, and gives a good indication of how well a device is performing in various situations.

The RMS value of ONLY the harmonics can also be measured, whilst excluding noise. This is simply THD, without the +N.
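As a minimal sketch of how the two figures relate, the harmonics and noise can be combined as a root sum of squares and expressed as a percentage of the fundamental. The levels below are made-up example values, and analyzers may differ in exactly how they define the reference level.

```python
import math


def thd_percent(fundamental_rms: float, harmonic_rms_levels) -> float:
    """THD: combined RMS of the harmonics as a percentage of the fundamental."""
    harmonics = math.sqrt(sum(h * h for h in harmonic_rms_levels))
    return 100.0 * harmonics / fundamental_rms


def thdn_percent(fundamental_rms: float, harmonic_rms_levels, noise_rms: float) -> float:
    """THD+N: combined RMS of harmonics and noise as a percentage of the fundamental."""
    residual = math.sqrt(sum(h * h for h in harmonic_rms_levels) + noise_rms ** 2)
    return 100.0 * residual / fundamental_rms


harmonics = [0.0003, 0.0004]  # hypothetical 2nd and 3rd harmonic levels in Vrms
print(f"THD   = {thd_percent(1.0, harmonics):.3f} %")
print(f"THD+N = {thdn_percent(1.0, harmonics, noise_rms=0.0012):.3f} %")
```

In this example the noise is larger than the distortion, so THD+N is dominated by the noise rather than the harmonics.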

Time Domain:

The time domain describes signals as a function of amplitude and time, but not frequency. The most prominent example is a scope or waveform view, which shows you the amplitude of a signal over time, but does not describe frequency.

The time domain and frequency domain are mathematically linked, and one can be derived from the other using Fourier transforms or Laplace transforms (and their inverses).
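The link between the two domains can be demonstrated with a small discrete Fourier transform (DFT). The sketch below takes time domain samples of a 1khz sine wave and finds which frequency bin holds the energy. The 8khz sample rate and 80 sample length are arbitrary assumptions chosen so that 1khz falls exactly on a bin, and real analyzers use the much faster FFT algorithm.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0hz up to Nyquist (plain, slow DFT)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


sample_rate = 8000
samples = [math.sin(2.0 * math.pi * 1000 * i / sample_rate) for i in range(80)]

magnitudes = dft_magnitudes(samples)
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * sample_rate / len(samples)
print(f"Largest frequency component at {peak_hz:.0f} Hz")
```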

Tone:

A signal of a particular frequency.

Topology:

The way in which parts are arranged or a product is designed. More broadly can be used to describe a design as a whole, but is typically used to refer to a principle of operation, for example ‘A class D topology’ or ‘R2R topology’.

TOSLINK (Optical SPDIF):

TOSLINK (Toshiba Link) is a standardized optical fiber connection system used for transmission of digital audio. It uses the SPDIF transmission protocol.

Transducer:

A transducer is a device that converts energy from one form to another. In audio, this almost exclusively refers to drivers, which convert electrical energy into acoustic/sound-pressure energy.

Transient:

A transient is a high amplitude, short duration sound at the beginning of a waveform that occurs in music. An example being the initial crack of a drumstick against the skin before the reverberation or body of the drum itself.

Undithered:

A signal which has not had dithering applied. (See ‘dithering’ for more info)

Upsample / Upsampling:

Upsampling is the process of applying a Nyquist reconstruction filter to digital audio, and in the process converting it to a higher sample rate, before feeding it to a DAC. This is usually done to facilitate the use of a higher performance reconstruction filter than what the DAC uses internally, and can sometimes be accompanied by processes such as high performance noise shaping/dithering. (See ‘Nyquist Reconstruction Filter’ or ‘Oversampling Filter’ for more info)

Voltage Source Amplifier (Voltage Mode / Voltage Drive):

A voltage source amplifier is an amplifier that regulates the level of voltage supplied to a load, rather than regulating current. Almost all amplifiers on the market are voltage source amplifiers.

Voltage source amplifiers have an extremely high damping factor and can be easily identified by a low output impedance. The output impedance of most headphone amplifiers is typically around or lower than 1 Ohm.

If a device lists terms such as ‘current feedback’ or ‘current gain’, that does not mean it is not a voltage source amplifier, as these terms refer to other aspects of amplifier design. To regulate voltage an amplifier must have an output impedance LOWER than the load. To regulate current an amplifier must have an output impedance HIGHER than the load.

A device that has an output impedance equal to the load impedance is sometimes referred to as ‘power mode’ by some community members, as the amplifier is regulating voltage and current equally, so it could be described as regulating power.

Volume:

Volume is the degree of loudness or intensity of a sound. Real volume is more accurately defined as sound pressure level (SPL), but is affected by factors such as amplifier gain and headphone/speaker sensitivity.

Waveform:

The waveform is a graphical representation of the amplitude of a signal over time. It is the time domain view of audio. Sometimes described as a ‘Scope View’ as oscilloscopes show the waveform of a signal, as opposed to an FFT which shows the frequency domain view.

XLR:

The XLR connector is a type of electrical connector found in audio. Most commonly used for balanced and differential audio systems.

XLR-3 has three pins, and is commonly found as an interconnect between a balanced DAC and amplifier. XLR-4 has four pins and is commonly used for balanced headphones.

This is a companion discussion topic for the original entry at https://headphones.com/blogs/features/the-glossary-of-audio-measurements-and-terms

GoldenSound · April 3, 2023, 3:39pm

Hopefully this is useful for many!

generic · April 3, 2023, 3:44pm

Longest post award!

pk500 · April 3, 2023, 6:12pm

Effing A: This is SUPER comprehensive and informative.

Well done. Thank you!
