Headroom is a foundational concept in audio technology that is often overlooked in the age of digital recording, especially among beginners. Understanding the fundamentals of audio, including the definition of headroom, will enhance your knowledge and, ultimately, your skill when working with audio.
What is headroom in audio? Headroom is the difference between an audio system's maximum signal handling capability and the signal level within the system. For audio devices, headroom is often defined from the nominal level (average level). In mixing/mastering, headroom is often defined from the mix/master bus peak level.
There's much more to headroom than this broad definition, and headroom can mean different things in different circumstances. In this article, we will go deep into the concept of headroom to better understand how it applies to our work in recording, mixing and mastering music and audio.
We'll begin by describing headroom as it relates to the nominal level of audio equipment. After that, we'll describe headroom as it relates to the peak level of audio within a system.
Overview Of Headroom
Before we get into the two different ways headroom is used in communication, I want to give a simplified explanation here. If you're already knowledgeable about audio terminology, this will make sense. Otherwise, if you're still confused after reading the entirety of this article, please return to this section for clarification. I hope this helps!
An audio device will be designed and calibrated to perform its best at a set nominal/average audio signal level. This is often defined as 0 VU (Volume Units), typically calibrated to +4 dBu in professional analog equipment.
The headroom of such devices is the difference between their ideal nominal level and their maximum signal level handling capabilities. Exceeding this maximum level will result in clipping distortion.
Professional analog systems typically max out around +24 dBu (even though dBu isn't technically a measurement of peak levels). Fixed-point digital systems always max out at 0 dBFS.
The headroom of an analog device calibrated to a nominal level of +4 dBu with a +24 dBu maximum is 20 dB. It can be said that there is 20 dB of available headroom.
We can use up this available headroom without clipping the signal. The transient peaks of the signal will surpass the nominal level, thereby eating up headroom. So if an analog signal's nominal level is +4 dBu and its peaks happen at +16 dBu, we're left with 24 – 16 = 8 dB of headroom.
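The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the example figures from this section; the function and constant names are made up for the sketch.

```python
def headroom_db(max_level_dbu: float, reference_level_dbu: float) -> float:
    """Headroom is the maximum level minus the reference level, in dB."""
    return max_level_dbu - reference_level_dbu


# Example figures taken from the text above.
MAX_LEVEL_DBU = 24.0  # typical professional analog maximum
NOMINAL_DBU = 4.0     # 0 VU calibration
PEAK_DBU = 16.0       # example transient peak

# Headroom of the device itself (measured from nominal level).
equipment_headroom = headroom_db(MAX_LEVEL_DBU, NOMINAL_DBU)

# Headroom left over once the signal's peaks are taken into account.
remaining_headroom = headroom_db(MAX_LEVEL_DBU, PEAK_DBU)

print(f"Equipment headroom: {equipment_headroom} dB")
print(f"Remaining headroom above the peaks: {remaining_headroom} dB")
```

The same subtraction serves both uses of the word: only the reference level changes (nominal level for the device, peak level for the signal).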
Analog systems are calibrated to an ideal nominal level to give enough headroom to avoid clipping while also getting enough signal to avoid excessive noise in the signal. It's a balancing act.
Digital systems don't have inherent “nominal levels” because the noise floor isn't nearly as problematic in digital systems. However, they certainly do have a hard ceiling for maximum signal level handling at 0 dBFS.
We have lots of headroom available in digital systems, and any difference between the highest peak signal level and 0 dBFS is considered “available headroom”. For example, if our mix peaks at -6 dBFS, we have 6 dB of headroom left before digital clipping.
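As a rough sketch of how that available headroom could be measured, the snippet below finds the highest peak in a list of samples and reports its distance from 0 dBFS. It assumes floating-point samples normalized so that 1.0 equals full scale; the function names are hypothetical.

```python
import math


def peak_dbfs(samples):
    """Return the highest absolute sample value in dBFS (1.0 = 0 dBFS)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence has no measurable peak
    return 20 * math.log10(peak)


def available_headroom_db(samples):
    """Distance in dB between the highest peak and the 0 dBFS ceiling."""
    return 0.0 - peak_dbfs(samples)


# A toy "mix" whose loudest sample reaches half of full scale.
mix = [0.1, -0.25, 0.5, -0.4, 0.05]

print(f"Peak level: {peak_dbfs(mix):.2f} dBFS")
print(f"Available headroom: {available_headroom_db(mix):.2f} dB")
```

Half of full scale works out to roughly -6 dBFS, which matches the 6 dB of headroom in the example above.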
Headroom can be thought of as a buffer that can be used up. The headroom of a system tells us how much more signal the system can handle from nominal or peak levels, depending on whether we're describing the device itself or the signal within it.
In general, we want to leave plenty of headroom when recording and mixing. Mastering is when we “use up” the available headroom and make the audio “loud” with special processing that brings the levels up while avoiding clipping.
The First Definition Of Headroom (Equipment Headroom)
The term headroom is often thrown around, and sometimes it's not truly understood. It's important to know what it is.
Headroom, technically speaking, is the available level between the nominal level (average level) an audio system is designed for and that system's maximum level. The maximum level before clipping of fixed-point digital systems is 0 dBFS, while professional analog systems typically max out around +24 dBu.
However, what's confusing is that headroom can also refer to the available level between a system's maximum level and the peak level of the signal within the system. We'll cover this in the second definition of headroom.
To break this down further, we need to understand what nominal level and maximum level mean.
Starting with the easier definition, the maximum level of a system refers to the point at which the audio system will clip.
In fixed-point digital audio, the maximum level is 0 dBFS (decibels full scale). This is the hard ceiling of digital audio. If we attempt to push our levels above 0 dBFS, we'll digitally clip the signal, where the tops of the waveform are completely flattened.
With professional analog audio, the maximum level is often set at +24 dBu (decibels relative to 0.775 volts RMS with an open or unloaded circuit). However, this level may vary depending on the specifics of the equipment. The “maximum level” in the analog realm tends to be much more forgiving, causing “soft clipping”, where the tops of the waveform are rounded significantly, though not completely flattened.
While many analog devices (especially mixers) will have metering that extends to the maximum level, many others (especially those with VU meters) will only meter a few dB above nominal “0 VU”. In the latter case, it is practically impossible to meter how far the signal is from the maximum, though we can listen for clipping to tell where the max point is. This is where knowing the inherent headroom above nominal is worthwhile.
It's critical to note that dBu is in reference to 0.775 volts rms (root mean square). The rms value is the square root of the mean of the squared signal values. Because audio signals swing between positive and negative values, a plain average would sit near zero, so rms is used to give a meaningful “average signal level”.
This means that dBu is technically a measure of RMS or “average” signal level, not peak level. It's worth noting that even though dBu is often used to describe peak levels, doing so is technically incorrect.
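To make the dBu reference concrete, here is a small sketch that converts between dBu and volts rms. It uses the rounded 0.775 V reference quoted above (the exact figure is the square root of 0.6, about 0.7746 V); the function names are made up for the example.

```python
import math

DBU_REFERENCE_VRMS = 0.775  # rounded reference voltage for 0 dBu


def dbu_to_vrms(level_dbu: float) -> float:
    """Convert a level in dBu to volts rms."""
    return DBU_REFERENCE_VRMS * 10 ** (level_dbu / 20)


def vrms_to_dbu(volts_rms: float) -> float:
    """Convert volts rms to a level in dBu."""
    return 20 * math.log10(volts_rms / DBU_REFERENCE_VRMS)


nominal_vrms = dbu_to_vrms(4.0)

# For a pure sine wave only, the peak voltage is the rms voltage times sqrt(2).
nominal_sine_peak = nominal_vrms * math.sqrt(2)

print(f"+4 dBu is about {nominal_vrms:.3f} V rms")
print(f"A +4 dBu sine wave peaks at about {nominal_sine_peak:.3f} V")
```

The last two lines show why rms and peak should not be confused: even a steady sine wave at +4 dBu has peaks about 3 dB higher than its rms level suggests.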
I dive deeper into the complexities of decibels in my article What Are Decibels? The Ultimate dB Guide For Audio & Sound.
Clipping in digital and analog audio systems causes distortion. Digital clipping tends to sound very harsh and gritty, approaching the sound of a square wave (since the tops of the waveform are flattened). Analog clipping is considered a bit more musical and varies depending on the device being overloaded (tape, tubes, transistors and transformers all have their own characteristics).
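The difference between the two clipping shapes can be sketched numerically. The hard clipper below simply flattens anything beyond the ceiling. The soft clipper uses a tanh curve, which is only a common textbook stand-in for gradual saturation and an assumption of this sketch, not a model of any particular tape machine, tube or transformer.

```python
import math


def hard_clip(x: float, ceiling: float = 1.0) -> float:
    """Flatten anything beyond the ceiling, as digital clipping does."""
    return max(-ceiling, min(ceiling, x))


def soft_clip(x: float, ceiling: float = 1.0) -> float:
    """Round the waveform off gradually as it approaches the ceiling."""
    return ceiling * math.tanh(x / ceiling)


# One cycle of a sine wave driven 6 dB over the ceiling (amplitude of about 2.0).
overdriven = [2.0 * math.sin(2 * math.pi * n / 16) for n in range(16)]

for sample in overdriven[:5]:
    print(f"in {sample:+.3f}  hard {hard_clip(sample):+.3f}  soft {soft_clip(sample):+.3f}")
```

Note that the tanh curve also reshapes samples below the ceiling slightly, which is part of why soft clipping is heard as gradual saturation rather than an abrupt flattening.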
The nominal level of an audio system refers to the average signal level that the system is designed to work with.
This one is a bit trickier to wrap our heads around, especially if we haven't worked with much analog equipment. That's because, in purely digital systems, there really isn't a strong case to be made regarding nominal levels, especially in modern 32-bit floating-point systems.
To understand nominal level, let's instead focus on analog systems before applying the concept to digital audio.
Analog equipment deals with analog audio signals, which are effectively AC electrical signals. This electricity passes through electrical components (resistors, capacitors, inductors, transistors, tubes, transformers, shall I go on?). Each component has its own limitations, which make up the audio device's limitations as a whole.
We must be aware of two limitations, the first of which has already been mentioned: analog audio equipment will have a maximum level at which it begins to clip.
The second limitation is the noise floor, which is caused by the inherent noise produced by the electrical components within the analog audio device. All analog equipment will introduce some amount of internally-generated noise to the signal.
The difference between the maximum level and the noise floor is referred to as the device's dynamic range.
So we have a situation where we'll overload the device and clip the signal if the signal level is too high. Conversely, if the signal level is too low, the signal-to-noise ratio will be low, and we'll have noisy audio from the device onward in the signal chain.
The nominal level is defined as the optimal level the device is designed to work at, ensuring the best dynamic range and headroom. Maintaining the proper nominal level will ensure a good signal-to-noise ratio and give plenty of buffer to keep the peaks of the signal from overloading the device.
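The balancing act between headroom and signal-to-noise ratio can be sketched as a simple level budget. The +24 dBu maximum and +4 dBu nominal level come from this article; the -90 dBu noise floor is a purely hypothetical figure chosen for illustration, as is the function name.

```python
def level_budget(operating_dbu: float,
                 max_dbu: float = 24.0,
                 noise_floor_dbu: float = -90.0) -> dict:
    """Split a device's dynamic range into headroom and signal-to-noise ratio.

    The default noise floor is a hypothetical value, not a real specification.
    """
    return {
        "headroom_db": max_dbu - operating_dbu,
        "snr_db": operating_dbu - noise_floor_dbu,
    }


at_nominal = level_budget(4.0)    # running at 0 VU (+4 dBu)
too_quiet = level_budget(-16.0)   # running 20 dB below nominal

# Headroom plus signal-to-noise ratio always adds up to the dynamic range.
dynamic_range_db = at_nominal["headroom_db"] + at_nominal["snr_db"]

print(f"At nominal: {at_nominal}")
print(f"20 dB too quiet: {too_quiet}")
print(f"Dynamic range: {dynamic_range_db} dB")
```

Running 20 dB below nominal buys 20 dB of extra headroom but gives up exactly 20 dB of signal-to-noise ratio, which is the trade-off the nominal level is meant to settle.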
Most professional analog devices are designed to operate with a nominal level of 0 VU (Volume Unit), which is calibrated to equal +4 dBu.
Bringing the discussion back to digital, digital systems do not have the same inherent limitations for noise, though they do have an absolute ceiling.
However, digital audio is only really a storage method. If we want to hear the audio, we must convert it to analog (to drive headphones or speakers). Furthermore, if we want to record sound, we use microphones (which are inherently analog) to convert sound into electrical signals before converting it to digital information. Analog instruments like bass guitar, which can be directly injected into a recording device, must also be converted from analog to digital for use in a digital system.
The analog-digital (A/D) and digital-analog (D/A) converters used to interface between analog and digital audio devices have their own nominal level, headroom and noise floor. Therefore, in practice, digital systems aren't completely removed from the concept of nominal level.
Furthermore, many audio plugins, especially those designed to emulate analog processors, will generally have their own nominal level sweet spot.
Unfortunately, there is no single standard to convert between digital and analog levels. Standards range from +4 dBu = -9 dBFS (Belgium VRT) to +4 dBu = -20 dBFS (American and Australian Post), with many A/D-D/A converters and audio plugins operating nominally at -18 dBFS or -20 dBFS = +4 dBu = 0 VU.
If that's at all confusing and you don't have a specifications sheet for your equipment or software, it's okay to assume -20 dBFS as the nominal level for digital audio. Remember that it's the “optimal level”, so maintaining an average of -20 dBFS is recommended but not necessarily crucial.
To give more context to the -20 dBFS recommendation, we can look at the typical +24 dBu analog clipping point and the hard 0 dBFS digital clipping point. If we match these clipping levels and take +4 dBu as our nominal level, we have 0 VU = +4 dBu = -20 dBFS.
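Since the mapping is just a fixed offset, it can be sketched as a pair of conversion functions. The default calibration below (+4 dBu = -20 dBFS) is the assumption discussed above, and the function names are invented for the example; always check the real calibration of your converter.

```python
def dbu_to_dbfs(level_dbu: float,
                ref_dbu: float = 4.0,
                ref_dbfs: float = -20.0) -> float:
    """Map an analog level to dBFS for a given calibration point."""
    return level_dbu - ref_dbu + ref_dbfs


def dbfs_to_dbu(level_dbfs: float,
                ref_dbu: float = 4.0,
                ref_dbfs: float = -20.0) -> float:
    """Map a digital level to dBu for a given calibration point."""
    return level_dbfs - ref_dbfs + ref_dbu


# With +4 dBu = -20 dBFS, the +24 dBu analog maximum lines up with 0 dBFS.
print(dbu_to_dbfs(24.0))

# With +4 dBu = -18 dBFS instead, 0 dBFS corresponds to +22 dBu.
print(dbfs_to_dbu(0.0, ref_dbfs=-18.0))
```

The second call shows why the calibration matters: with a -18 dBFS reference, the converter reaches full scale 2 dB before a +24 dBu analog device would clip.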
How Does “Equipment Headroom” Apply To Recording, Mixing & Mastering?
As you may have gathered from the sections above, the inherent headroom of audio equipment will affect each stage of music/audio production, from recording to mixing to mastering.
We may have heard something along the lines of “don't use up all the headroom of the system”. This means keeping the signal level below the clipping point. When running audio through equipment, the dynamics of the signal will likely cause the levels to exceed the nominal level (0 VU), even if only at the peaks.
Having headroom is essential for keeping the peaks of the audio from clipping. In some cases, it can also be advantageous, as we'll find out in the following paragraphs, to run average signal levels above 0 VU.
Knowing the “sweet spot” and headroom of each device we use will help reduce unnecessary clipping and noise in our recordings. The process of feeding each device with the ideal signal level is known as gain staging. In practice, gain staging means adjusting the outputs of each device to drive the following input at the ideal signal level.
If we consider the concept of nominal level discussed in the previous sections, we know that that is the “sweet spot” for the audio device. Usually, this will be calibrated to 0 VU (if the device has a VU meter), and 0 VU is likely set at or near +4 dBu or -20 dBFS (though many devices differ from this “pseudo-standard”).
Gain staging and headroom are important for the reasons above in both analog and digital audio systems. Proper gain staging will feed the audio device or system near its nominal level, and the inherent headroom of the device will allow for significant peaks in the input signal without distorting. It will also protect against poor signal-to-noise ratios.
This is essential in analog equipment, where the components have noise. Though digital systems are not prone to add inherent noise to the signal, they too benefit from proper gain staging.
There's often a benefit to running signals “in the red” or above the 0 VU mark with analog equipment. In certain pieces of gear, doing so can yield nice saturation in the signal while enhancing the signal-to-noise ratio further. In other pieces of gear, running signals in the red won't sound as good as nominal levels.
It is important to know what will sound good when recorded at nominal level versus above nominal level in a given recording session. It's also important to know what devices in the signal chain should and shouldn't be pushed too hard. This applies to mixing with analog equipment.
In the 1950s, when records were pressed on vinyl or printed to tape, mastering engineers (at one point referred to as “transfer engineers”) began competing to make the loudest records possible. This meant going well above the nominal level to achieve louder records by comparison.
Moving toward digital, we should consider the A/D converters that allow us to record analog signals in a digital system. For example, we could be miking an acoustic guitar to record into our digital audio workstation via an audio interface. The audio interface would convert the analog input signal to a digital signal that the DAW would then be able to record.
While noise may not be a huge concern, maintaining adequate headroom is essential when recording through an A/D converter. This helps avoid clipping the converter on the way in (resulting in digital hard clipping). It also keeps the signal clean across its dynamics. Though many A/D converters boast a wide dynamic range in their specifications, the majority will underperform near the top of that range, leading to subtle distortion even if they aren't clipping.
Proper gain staging is also important for audio plugins designed to emulate analog gear for the same reasons mentioned above. It's important to know what the “sweet spot” nominal levels are for each plugin's input and whether the plugin is typically driven into the red or not.
It bears repeating that the typical 0 VU on analog equipment represents +4 dBu with a maximum level of +24 dBu, meaning there's 20 dB of headroom. Being “in the red” means being within the 20 dB of headroom. So if we were to convert that to digital (somewhat erroneously), being “in the red” would be between -20 dBFS and 0 dBFS.
To recap, knowing the inherent headroom and nominal levels of our equipment is essential for proper gain staging, maintaining strong signal-to-noise ratios without clipping and distorting the audio.
The Second Definition Of Headroom (Mix Headroom)
The second definition of headroom is perhaps more common in the digital age. Instead of referring to the calibrated inherent headroom of an audio device, headroom is defined as the difference between the maximum level handling capability of the system and the maximum peak level of the audio within that system.
So when you hear things like “leave 3 dB of headroom for mastering” or “leave 6 dB of headroom for mastering”, the advice is related to the peak levels of the audio (in this case, the mix to be mastered).
We've already discussed how both analog and digital systems have maximum levels, but let's quickly touch on peak levels.
The peak level of an audio signal refers to the instantaneous measurement of the audio signal's level in both analog and digital systems. In practice, we're mostly concerned with the highest peak(s) of an audio signal over time, which tend to happen on the transients.
Under normal circumstances, it's the peak levels that we should be concerned with in regard to clipping our audio systems. Remember that 0 dBFS is the digital ceiling, while analog systems are often maxed out at +24 dBu (even though dBu is not technically in reference to peak level, as discussed previously).
So the average level of an audio signal is the rms value over a set window of time, and the peak level is measured each instant. Inherent/calibrated headroom in audio devices allows us to drive the device at a safe nominal signal level with plenty of room for peaks to occur without clipping.
Headroom relative to peak level is commonly used in digital systems, where every signal level is relative to full scale.
In an earlier section, we discussed how VU meters typically only show a few dB above nominal “0 VU”. In the digital world, we're always very aware of the maximum ceiling before clipping (0 dBFS), so knowing the available headroom of the audio peaks is easy.
How Does “Mix Headroom” Apply To Recording, Mixing & Mastering?
When recording, it's important not to peak and clip the signal.
In the analog realm, there is some benefit to recording “hot” (as close to maximum as is safe to avoid clipping) to improve the signal-to-noise ratio.
However, in digital, there's no need to risk overload by recording hot. In fact, as was discussed earlier, A/D converters often begin distorting well before the 0 dBFS limit, even though their specifications may show great dynamic range and maximum signal handling capabilities.
During mixing, gain staging is key. This means feeding each device in line with appropriate signal levels. Although some processors sound great when driven above their “nominal level”, many will introduce unwanted distortion when run too hot (or excessive noise when run too quiet). This is true of analog equipment as well as audio plugin emulations of analog gear (and even some purely digital plugins).
If you're into mixing, be sure to check out my article Essential Processors/Processes For Mixing Music & Audio.
When mixing, it's advised to keep a good amount of headroom on the mix bus to give the mastering engineer some breathing room. In general, 3 to 6 dB of headroom (between the highest peak and the maximum ceiling) is an advisable target to aim for when sending a mix to a mastering engineer.
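If a mix peaks closer to the ceiling than intended, the trim needed on the mix bus is a simple subtraction. The sketch below assumes a target of 6 dB of headroom and a measured peak of -1.5 dBFS, both chosen only for illustration; the function names are hypothetical.

```python
def trim_for_headroom(peak_dbfs: float, target_headroom_db: float = 6.0) -> float:
    """Gain change in dB that places the highest peak at the target headroom."""
    return -target_headroom_db - peak_dbfs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


measured_peak_dbfs = -1.5  # example peak reading from the mix bus meter
trim_db = trim_for_headroom(measured_peak_dbfs)

print(f"Trim the mix bus by {trim_db:+.1f} dB")
print(f"Equivalent linear gain: {db_to_linear(trim_db):.3f}")
```

A negative result means the mix should be turned down; a positive result means there is already more headroom than the target.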
Having this much headroom (without hard compression or limiting) will give the mastering engineer a nice, dynamic mix to work with. Achieving loudness is largely a process for mastering, so we don't need to worry about maxing out our headroom in the mix.
If you want to produce louder mixes, be sure to check out My New Microphone's Top 12 Professional Tips To Make Audio Mixes Louder.
A Note On Crest Factor
When discussing nominal and peak levels, it's worth knowing the term crest factor.
Crest factor is the ratio, in decibels, between the peak levels and the rms levels of a signal or mix.
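As a minimal sketch, crest factor can be computed from a list of samples by comparing the highest peak with the rms value. The sine and square waves below are just convenient test signals, and the function names are made up for the example.

```python
import math


def rms(samples):
    """Root mean square: square root of the mean of the squared samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def crest_factor_db(samples):
    """Peak-to-rms ratio of a signal, expressed in decibels."""
    rms_value = rms(samples)
    if rms_value == 0:
        raise ValueError("crest factor is undefined for a silent signal")
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak / rms_value)


N = 1000  # samples in one full cycle
sine = [math.sin(2 * math.pi * n / N) for n in range(N)]
square = [1.0 if n < N // 2 else -1.0 for n in range(N)]

print(f"Sine wave crest factor: {crest_factor_db(sine):.2f} dB")
print(f"Square wave crest factor: {crest_factor_db(square):.2f} dB")
```

A sine wave comes out at about 3 dB and a square wave at 0 dB, since a square wave's peak and rms levels are identical. Real music typically has a far higher crest factor than either test signal.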
Higher crest factors relate to greater dynamics in the signal or mix. With a high crest factor, we can eat up a significant portion of the headroom while maintaining appropriate nominal levels.
Lower crest factors relate to narrower dynamics in the signal or mix. With a low crest factor, we can maintain good average levels without eating up much of the headroom. Additionally, at the mastering (or mixing) stage, we can get a higher, “louder” average level before the peaks would clip.
Mastering with limiting effectively increases perceived loudness by reducing the crest factor, limiting the transients while bringing up the rms level.
For more information on limiting, check out the following My New Microphone articles:
• What Is The Difference Between Audio Compression & Limiting?
• Top 10 Best Limiter Plugins For Your DAW