We often hear that Harman’s over-ear target is “meant to capture the sound of good speakers in a good room in over-ear headphones,”
but… how well does it actually do this?
Additionally, something I’ve wondered lately is: why is this the description given, when the bass and treble shelves Harman chose for their adjustments (from 2013 onward) don’t really make an “anechoically flat” speaker (or, for headphones, a raw DF HRTF) measure more like a good speaker in a good room would?
Before going further it must be said: people say this because it is true. Sean Olive’s work has shown not only that speakers that measure with these features (a positive bass shelf and a negative treble shelf) are preferred over speakers that measure flat in a typical listening room, but also that similar corrections are preferred when applied to headphones.
That being said, it’s never been tested if these are the only adjustments that would be preferred in both speakers and headphones. So humor me a bit, as I talk about what I think could be another approach to approximating “good speaker-like timbre” in headphones.
Consider the CEA2034 plot below of a “good” speaker. I just picked the first one I saw on Spinorama.org; it doesn’t really matter which, given that “good” speakers tend to behave similarly (though not identically, which we’ll get into) in this regard.
Fig. 1 - CEA2034 output for AsciLab F6B speaker
Inspect the red (Early Reflections) and blue (Sound Power) curves. The Early Reflections curve is an estimate of all of the single-bounce first-reflections this speaker will have when placed in a typical listening room, while the Sound Power curve is the sum of the total emitted acoustic energy of a speaker as measured globally (at 360°, with measurements every 10°) around the speaker horizontally and vertically.
One should also inspect the green/grey (Early Reflections Directivity Index and Sound Power Directivity Index, ERDI and SPDI respectively) curves, which are the difference between the speaker’s direct sound and the red or blue curves. (Strictly, as I understand the standard, CEA2034 takes this difference from the Listening Window curve, which for a well-behaved speaker sits very close to On Axis.) This essentially allows us to isolate the impact of the indirect sound of the speaker’s design from the direct sound.
Upon doing so, it becomes clear that an anechoically flat speaker placed in a typical room at a particular distance does not exhibit a distinct 105 Hz or 2500 Hz shelf. This is due to what I’ll summarize here as the “room impact” arising from the mixing of the On Axis response with Early Reflections and Sound Power. Instead we see a gradual downslope.
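To make the directivity index idea concrete, here is a minimal sketch of the subtraction involved. The frequencies and levels are made-up placeholders rather than measurements of any speaker, and the function name is mine.

```python
# Illustrative only: placeholder numbers, not measurements of a real speaker.
freqs_hz = [100, 500, 1000, 5000, 10000]
reference_db = [85.0, 85.0, 85.0, 85.0, 85.0]           # direct-sound curve
early_reflections_db = [84.5, 83.5, 82.5, 81.0, 80.0]
sound_power_db = [84.0, 82.0, 80.5, 78.0, 76.0]


def directivity_index(reference, indirect):
    """Per-frequency difference in dB: reference curve minus indirect curve."""
    return [ref - ind for ref, ind in zip(reference, indirect)]


erdi = directivity_index(reference_db, early_reflections_db)
spdi = directivity_index(reference_db, sound_power_db)

for f, e, s in zip(freqs_hz, erdi, spdi):
    print(f"{f:>6} Hz  ERDI {e:4.1f} dB  SPDI {s:4.1f} dB")
```

Because the indirect curves fall away from the direct sound as frequency rises, both indices come out as rising curves, even though the underlying Early Reflections and Sound Power curves are falling.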
Fig. 2 - CEA2034 Estimated In Room Response for the AsciLab F6B
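For readers curious how an estimated in-room curve like this is put together, below is a rough sketch. It assumes the commonly cited CEA2034 weighting (12% Listening Window, 44% Early Reflections, 44% Sound Power, averaged as energy rather than in dB); please check the standard before relying on those weights. The input levels are placeholders, not data from the F6B.

```python
import math


def estimated_in_room_db(lw_db, er_db, sp_db):
    """Weighted energy average of three spinorama curves at one frequency.

    The 12/44/44 weights are an assumption taken from the commonly cited
    CEA2034 description, not something stated in this post.
    """
    energy = (0.12 * 10 ** (lw_db / 10)
              + 0.44 * 10 ** (er_db / 10)
              + 0.44 * 10 ** (sp_db / 10))
    return 10 * math.log10(energy)


# Placeholder levels (dB): (listening window, early reflections, sound power)
points = {
    100: (85.0, 84.5, 84.0),
    1000: (85.0, 82.5, 80.5),
    10000: (85.0, 80.0, 76.0),
}

for freq_hz, (lw, er, sp) in points.items():
    print(f"{freq_hz:>6} Hz  {estimated_in_room_db(lw, er, sp):.2f} dB")
```

Since the two indirect curves carry most of the weight, a speaker that is flat on axis but whose Early Reflections and Sound Power fall with frequency ends up with a downward-tilting in-room estimate.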
It’s prudent to mention that, while this “room impact” is characteristic of this speaker, and many Good™ speakers exhibit similar behavior (like the Genelec 8331A shown below), not all good speakers have similar room impact behavior.
Fig. 3 - CEA2034 output for Genelec 8331A speaker; take note of the similarity of the ERDI/SPDI curves to those of the AsciLab F6B
As far as I know, it hasn’t been studied whether it’s actually preferable to target an ERDI/SPDI similar to the AsciLab F6B (Fig. 1) or Genelec 8331A (Fig. 3) shown above instead of something that would produce an in-room response more like the curves that Harman arrived at. Other, equally Good™ speakers like the Dutch&Dutch 8C in Fig. 4 and Kii Three in Fig. 5 may have a more distinct Harman-like bass feature when placed in a typical room, due to their ERDI/SPDI rising closer to 100 Hz.
Fig. 4 - CEA2034 output for Dutch&Dutch 8C speaker
Fig. 5 - CEA2034 output for Kii Audio Three speakers
While it is certainly up for debate which would be “better” on average, these speakers’ less traditional “room impact” is a result of their design being a bit different from (or differently constrained than) typical 2-way or 3-way designs. Admittedly it is anecdotal, but in my experience more “conventionally” designed speakers are more common in general and much more common in studios, and these types of speakers exhibit behavior more similar to that shown in Fig. 1 and 3.
For this reason, I increasingly wonder whether a gradual downslope/down-shelf in the midrange, approximating the forces that cause good speakers to naturally downtilt in a room, could be equally or more preferred as a headphone preference adjustment to DF than the filters Harman arrived at for adjusting their In-Room Flat baseline (or could at least be received as more “speaker-like” sounding).
To be clear, I have zero new preference data from blind listening tests to support this idea, so for now it remains a theory. Furthermore, I personally doubt many people would choose to eschew the distinct, noticeable bass of Harman’s popular curves for something with less overtly noticeable bass outside of a blind test—there is a pervasive attitude that “more bass = better” in the consumer and audiophile realms alike.
So with this, I’m mostly aiming to add another possible approach to “preferable” to the mix, one that better resembles the typical behavior of speakers in rooms. But I should also mention that I’m not the first person to see merit in this approach for headphones! Oratory1990 has done something similar, transposing B&K’s Optimum Hifi curve—which is similar—to be used as a headphone target response by (I believe) combining it with Harman’s In Room Flat HATS measurement.
Fig. 6 Oratory1990’s “Optimum Hifi Curve” for over-ear headphones (red) vs. B&K’s original Optimum Hifi curve (black, dotted)
Now, my approach would differ: for over-ear headphones I would use the Diffuse Field (DF) HRTF of the rig the headphone measurements were performed on as a baseline, since DF is the appropriate HRTF baseline for playback of stereo recordings over headphones.
E.g., if I’m using Oratory1990’s measurements, I would use a preference adjustment applied to the KEMAR DF for the KB500x pinnae. And for IEMs measured on the B&K 5128, I would use a preference adjustment applied to JM-1 Diffuse Field, as that’s currently the best theoretical baseline we have for approximating an average human HRTF with consideration for the fact that IEMs are being measured in the ear canal of the 5128.
To capture this typical “room impact” tilt, I’ve opted to use a single 500 Hz high shelf filter with a Q of 0.4 as a primary preference adjustment on top of a Diffuse Field (anechoically flat, but for headphones) equalized headphone or IEM.
This shelf closely mimics the average corner frequency and slope of the Early Reflections DI and Sound Power DI curves, which roughly approximate the forces causing the downsloping in-room response of a “good speaker placed in a good room”. Of course, I invert the gain so the shelves are subtractive: ERDI/SPDI show up as a rise in the plot below because they measure the difference between the on-axis response and Early Reflections/Sound Power, not the Early Reflections/Sound Power curves themselves.
Fig. 7 - 500 Hz Filter at +4 and +8 dB compared to the ERDI/SPDI of a few good speakers
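If you want to see what this shelf does at specific frequencies, here is a small sketch of its magnitude response. It assumes the high-shelf biquad from the widely used RBJ Audio EQ Cookbook and a 48 kHz sample rate; both are my assumptions, and some EQ software defines shelf Q differently, so exact numbers may not match your app. The function names are mine.

```python
import cmath
import math


def high_shelf_coeffs(f0_hz, gain_db, q, fs_hz=48000.0):
    """RBJ-cookbook high-shelf biquad coefficients (assumed convention)."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0_hz / fs_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2 * q)
    k = 2 * math.sqrt(a_lin) * alpha
    b = (a_lin * ((a_lin + 1) + (a_lin - 1) * cos_w0 + k),
         -2 * a_lin * ((a_lin - 1) + (a_lin + 1) * cos_w0),
         a_lin * ((a_lin + 1) + (a_lin - 1) * cos_w0 - k))
    a = ((a_lin + 1) - (a_lin - 1) * cos_w0 + k,
         2 * ((a_lin - 1) - (a_lin + 1) * cos_w0),
         (a_lin + 1) - (a_lin - 1) * cos_w0 - k)
    return b, a


def magnitude_db(b, a, freq_hz, fs_hz=48000.0):
    """Gain of the biquad in dB at a single frequency."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq_hz / fs_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))


# Subtractive version of the shelf: 500 Hz, Q 0.4, -5 dB.
b, a = high_shelf_coeffs(500.0, -5.0, 0.4)
for freq_hz in (20, 100, 500, 2000, 10000, 20000):
    print(f"{freq_hz:>6} Hz  {magnitude_db(b, a, freq_hz):6.2f} dB")
```

Under this convention the gain at the 500 Hz corner is half the shelf gain (-2.5 dB for a -5 dB shelf), and the low Q spreads the rest of the transition over several octaves on either side, which is what makes it read as a tilt rather than a step.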
With speakers we prefer the sound of an anechoically flat speaker, but not a “flat in-room” speaker, and this seems to suggest that we prefer the downsloping response that results from “room impact”. Some in our community have theorized that people do not prefer Diffuse Field equalized headphones because this is equivalent to listening to a “flat” speaker, but without the “room impact” that would be preferred in similarly flat speakers in a listening room, and that to counteract this, we need to add the room impact’s downward slope back in via preference adjustments on top of the flat (Diffuse Field) response.
Whether or not this is the case is up for debate, but I think as far as filters for simulating the typical (preferential) effects that may arise when placing a typical anechoically flat speaker into a typical room, this simple 500 Hz filter is a rather elegant way for listeners to adjust the level of “approximated downslope due to room impact” they prefer added to their flat (Diffuse Field) equalized headphone responses.
After trying this method applied in the form of a subtractive 500 Hz high shelf filter with a Q of 0.4 to DF-equalized headphones and IEMs I’ve gotta say… I really, really enjoy it. Much more than I enjoy any preference target I’ve tried that includes a Harman-like bass and treble adjustment. There’s no overly sonorous or sludgy over-emphasis on bass overwhelming the midrange, but the bass is still mildly emphasized thanks to the downslope—it’s just not emphasized in what I perceive as a cartoonishly obvious way.
The midrange is natural sounding, roughly approximating -0.8 to -1 dB/octave with my preferred adjustment (-5 dB), while also being seamlessly integrated with the bass below it. There’s no exaggerated sense of bass bounce, contrast, or depth, but instead the midrange sounds effortless, coherent, and most of all correct.
The treble is where headphones have significantly more challenges than good speakers do, so it must be said that just equalizing your headphone to the DF curve relevant to your device’s measurements will not be enough to make this curve sound correct (similarly to how simply EQing to Harman will still present significant error in the treble relative to what current measurement analysis paradigms may suggest).
Personalized adjustment will be necessary ≥ 2 kHz to make the treble truly work, but this is true of EQing headphones measured on test fixtures from GRAS or B&K to any target curve—personalization is paramount, do not just blindly follow the line on the screen.
Below in Fig. 9 is what my HD 800 unit equalized to this target looks like on the B&K 5128 when manually accommodating the unique interaction between this headphone and my head/ears (green), compared to an HD 600 (pink). The big spike in the lower/mid treble is likely not present on my head—I certainly don’t hear it—as the outer-ears of measurement fixtures like the 5128 in Fig. 8 typically overestimate mid-treble relative to humans (credit to Sean Olive for this data).
Fig. 8 - Overestimation of treble by the 5128 (green) relative to a group of humans (blue), measured from headphones at the blocked ear canal entrance.
Fig. 9 - My HD 800 measured on B&K 5128, equalized to a DF-like response with a 500 Hz, -5 dB shelf with a Q of 0.4 and manually personalized treble (green), compared to HD 600 (pink) and the 5128 DF with a -0.8 dB/octave downslope applied (grey dotted).
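For reference, the grey dotted curve in Fig. 9 is a constant dB-per-octave tilt, which can be sketched as follows. The 1 kHz pivot (where the tilt crosses 0 dB) is my assumption here, and the flat 0 dB baseline simply stands in for a response that has already been compensated to DF; neither comes from measured data.

```python
import math


def tilt_db(freq_hz, slope_db_per_octave=-0.8, pivot_hz=1000.0):
    """Gain in dB of a constant-slope tilt; 0 dB at the (assumed) pivot."""
    return slope_db_per_octave * math.log2(freq_hz / pivot_hz)


def apply_tilt(freqs_hz, baseline_db, slope_db_per_octave=-0.8, pivot_hz=1000.0):
    """Add the tilt to a baseline curve given in dB."""
    return [level + tilt_db(f, slope_db_per_octave, pivot_hz)
            for f, level in zip(freqs_hz, baseline_db)]


freqs_hz = [20, 100, 500, 1000, 5000, 20000]
df_compensated_db = [0.0] * len(freqs_hz)  # placeholder: flat relative to DF

for f, level in zip(freqs_hz, apply_tilt(freqs_hz, df_compensated_db)):
    print(f"{f:>6} Hz  {level:+.2f} dB")
```

From 20 Hz to 20 kHz is just under ten octaves, so a -0.8 dB/octave tilt amounts to roughly 8 dB of total difference between the two ends of the audible band.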
My aim with this post is not to prescribe a new target curve, but to offer an alternative approach to producing a curve that still accommodates what Harman tells us is likely to be preferred (a downsloping response) but falls more in line with the behavior of typical speakers than what people may currently be using or predisposed to trying as a preference adjustment. Unlike a flat tilt from 20 Hz to 20 kHz, it also has the benefit of being available and easy to adjust in essentially any parametric EQ software.
Questions, comments, concerns? Let’s talk about it!