Musings re: Harman's headphone/IEM preference adjustments and how they relate to speakers

We often hear that Harman’s over-ear target is “meant to capture the sound of good speakers in a good room in over-ear headphones,”
but… how well does it actually do this?

Additionally, something I’ve wondered lately is… Why is this the description given, when the bass shelf and treble shelf Harman chose for their adjustments (from 2013 and onward) don’t really make an “anechoically flat” (or raw DF HRTF for headphones) speaker measure more like a good speaker in a good room would?

Before going further it must be said: People say this because it is true. Sean Olive’s work has shown not only that speakers measuring with these features (a positive bass shelf and a negative treble shelf) are preferred over speakers that measure flat in a typical listening room, but also that similar corrections are preferred when applied to headphones.

That being said, it’s never been tested if these are the only adjustments that would be preferred in both speakers and headphones. So humor me a bit, as I talk about what I think could be another approach to approximating “good speaker-like timbre” in headphones.

Consider the below CEA2034 plot of a “good” speaker—I just picked the first one I saw on Spinorama.org, doesn’t really matter which given “good” speakers tend to behave reasonably (though not exactly, which we’ll get into) similarly in this regard.

Fig. 1 - CEA2034 output for AsciLab F6B speaker

Inspect the red (Early Reflections) and blue (Sound Power) curves. The Early Reflections curve is an estimate of all of the single-bounce first-reflections this speaker will have when placed in a typical listening room, while the Sound Power curve is the sum of the total emitted acoustic energy of a speaker as measured globally (at 360°, with measurements every 10°) around the speaker horizontally and vertically.

One should also inspect the green/grey (Early Reflections Directivity Index and Sound Power Directivity Index, ERDI and SPDI respectively) curves, which are the difference between the red or blue curves and the “On Axis” curve. This essentially allows us to isolate the impact of the indirect sound of the speaker’s design from the direct (On Axis) sound.

Upon doing so, it becomes clear that, due to what I’ll summarize here as the “room impact” (the mixing of the On Axis response with Early Reflections and Sound Power), an anechoically flat speaker placed in a typical room at a particular distance does not exhibit a distinct 105 Hz or 2500 Hz shelf. Instead we see a gradual downslope.

Fig. 2 - CEA2034 Estimated In Room Response for the AsciLab F6B

It’s prudent to mention that, while this “room impact” is characteristic of this speaker, and many Good™ speakers exhibit similar behavior (like the Genelec 8331A shown below), not all good speakers have similar room impact behavior.

Fig. 3 - CEA2034 output for Genelec 8331A speaker. Take note of the similarity of the ERDI/SPDI curves to those of the AsciLab F6B

As far as I know, it’s not been studied whether it’s actually preferable to target an ERDI/SPDI similar to the AsciLab F6B (Fig. 1) or Genelec 8331A (Fig. 3) shown above instead of something that would produce an in-room response more like the curves that Harman arrived at. Other, equally Good™ speakers like the Dutch&Dutch 8C in Fig. 4 and Kii Three in Fig. 5 may have a more distinct Harman-like bass feature when placed in a typical room, due to their ERDI/SPDI rising closer to 100 Hz.

Fig. 4 - CEA2034 output for Dutch&Dutch 8C speaker


Fig. 5 - CEA2034 output for Kii Audio Three speakers

While it is certainly up for debate which would be “better” on average, these speakers’ less traditional “room impact” is a result of their design being a bit different (or differently constrained) than typical 2-way or 3-way designs. Admittedly it is anecdotal, but in my experience more “conventionally” designed speakers are more common, much more common in studios, and these types of speakers exhibit behavior more similar to those shown in Fig. 1 and 3.

For this reason, I increasingly wonder whether a gradual downslope or down-shelf in the midrange, approximating the forces that cause good speakers to naturally downtilt in a room, could be equally or more preferred as a headphone preference adjustment to DF than the filters Harman arrived at for adjusting their In-Room Flat baseline (or could even be received as more “speaker-like” sounding).

To be clear, I have zero new preference data from blind listening tests to support this idea, so for now it remains a theory. Furthermore, I personally doubt many people would choose to eschew the distinct, noticeable bass of Harman’s popular curves for something with less overtly noticeable bass outside of a blind test—there is a pervasive attitude that “more bass = better” in the consumer and audiophile realms alike.

So with this, I’m mostly aiming to add another possible approach to “preferable” to the mix, one that better resembles the typical behavior of speakers in rooms. But I should also mention that I’m not the first person to see merit in this approach for headphones! Oratory1990 has done something similar, transposing B&K’s Optimum Hifi curve (which has a similar downslope) for use as a headphone target response by (I believe) combining it with Harman’s In Room Flat HATS measurement.

Fig. 6 - Oratory1990’s “Optimum Hifi Curve” for over-ear headphones (red) vs. B&K’s original Optimum Hifi curve (black, dotted)

Now, my approach would differ, because for over-ear headphones I would use the Diffuse Field (DF) HRTF of the rig the headphone measurements were performed on as a baseline, because DF is the appropriate HRTF baseline for playback of stereo recordings over headphones.

E.g., if I’m using Oratory1990’s measurements, I would use a preference adjustment applied to the KEMAR DF for the KB500x pinnae. And for IEMs measured on the B&K 5128, I would use a preference adjustment applied to JM-1 Diffuse Field, as that’s currently the best theoretical baseline we have for approximating an average human HRTF with consideration for the fact that IEMs are being measured in the ear canal of the 5128.

To capture this typical “room impact” tilt, I’ve opted to use a single 500 Hz high shelf filter with a Q of 0.4 as a primary preference adjustment on top of a Diffuse Field (anechoically flat, but for headphones) equalized headphone or IEM.
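For anyone who wants to see what this filter does numerically, here is a minimal sketch of a 500 Hz, Q 0.4 high shelf. It assumes the biquad formulas from the RBJ Audio EQ Cookbook and a 48 kHz sample rate; neither is specified in the post, and EQ software may define shelf Q differently, so treat the numbers as approximate. The function names (`high_shelf_coeffs`, `shelf_db`, `avg_slope_db_per_octave`) are made up for illustration.

```python
import cmath
import math


def high_shelf_coeffs(f0, gain_db, q, fs=48000.0):
    """Biquad high-shelf coefficients (RBJ Audio EQ Cookbook form, an assumption)."""
    amp = 10.0 ** (gain_db / 40.0)  # square root of the linear shelf gain
    w0 = 2.0 * math.pi * f0 / fs
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    k = 2.0 * math.sqrt(amp) * alpha
    b = (
        amp * ((amp + 1) + (amp - 1) * cos_w0 + k),
        -2.0 * amp * ((amp - 1) + (amp + 1) * cos_w0),
        amp * ((amp + 1) + (amp - 1) * cos_w0 - k),
    )
    a = (
        (amp + 1) - (amp - 1) * cos_w0 + k,
        2.0 * ((amp - 1) - (amp + 1) * cos_w0),
        (amp + 1) - (amp - 1) * cos_w0 - k,
    )
    return b, a


def shelf_db(freq, f0=500.0, gain_db=-5.0, q=0.4, fs=48000.0):
    """Magnitude of the shelf in dB at one frequency."""
    b, a = high_shelf_coeffs(f0, gain_db, q, fs)
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq / fs)  # z^-1 on the unit circle
    z2 = z1 * z1
    h = (b[0] + b[1] * z1 + b[2] * z2) / (a[0] + a[1] * z1 + a[2] * z2)
    return 20.0 * math.log10(abs(h))


def avg_slope_db_per_octave(f_lo, f_hi, **kwargs):
    """Average slope between two frequencies, in dB per octave."""
    octaves = math.log2(f_hi / f_lo)
    return (shelf_db(f_hi, **kwargs) - shelf_db(f_lo, **kwargs)) / octaves


if __name__ == "__main__":
    for f in (20, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 20000):
        print(f"{f:>6} Hz  {shelf_db(f):6.2f} dB")
    print("125 Hz to 2 kHz slope:", avg_slope_db_per_octave(125.0, 2000.0))
```

With these formulas the response at the 500 Hz corner is exactly half the shelf gain in dB, and the low Q spreads the transition over several octaves, which is what makes it read as a gradual downslope rather than a distinct step.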

This shelf closely mimics the average corner frequency and slope of the Early Reflections DI and Sound Power DI curves that roughly approximate the forces causing a downsloping in-room response of a “good speaker placed in a good room”, though of course I invert the gain so the shelves are subtractive—the readout of ERDI/SPDI below showing as a rise is because it measures the difference between Early Reflections/Sound Power vs. the on-axis response, not the Early Reflections/Sound Power themselves.

Fig. 7 - 500 Hz Filter at +4 and +8 dB compared to the ERDI/SPDI of a few good speakers

With speakers we prefer the sound of an anechoically flat speaker, but not a “flat in-room” speaker, and this seems to suggest that we prefer the downsloping response that results from “room impact”. Some in our community have theorized that people do not prefer Diffuse Field equalized headphones because this is equivalent to listening to a “flat” speaker, but without the “room impact” that would be preferred in similarly flat speakers in a listening room, and that to counteract this, we need to “add the room impact’s downward slope in via preference adjustments on top of the flat (Diffuse Field) response.”

Whether or not this is the case is up for debate, but I think as far as filters for simulating the typical (preferential) effects that may arise when placing a typical anechoically flat speaker into a typical room, this simple 500 Hz filter is a rather elegant way for listeners to adjust the level of “approximated downslope due to room impact” they prefer added to their flat (Diffuse Field) equalized headphone responses.

After trying this method applied in the form of a subtractive 500 Hz high shelf filter with a Q of 0.4 to DF-equalized headphones and IEMs I’ve gotta say… I really, really enjoy it. Much more than I enjoy any preference target I’ve tried that includes a Harman-like bass and treble adjustment. There’s no overly sonorous or sludgy over-emphasis on bass overwhelming the midrange, but the bass is still mildly emphasized thanks to the downslope—it’s just not emphasized in what I perceive as a cartoonishly obvious way.

The midrange is natural sounding, roughly approximating -0.8 to -1 dB/octave with my preferred adjustment (-5 dB), while also being seamlessly integrated with the bass below it. There’s no exaggerated sense of bass bounce, contrast, or depth, but instead the midrange sounds effortless, coherent, and most of all correct.

The treble is where headphones have significantly more challenges than good speakers do, so it must be said that just equalizing your headphone to the DF curve relevant to your device’s measurements will not be enough to make this curve sound correct (similarly to how simply EQing to Harman will still present significant error in the treble relative to what current measurement analysis paradigms may suggest).

Personalized adjustment will be necessary ≥ 2 kHz to make the treble truly work, but this is true of EQing headphones measured on test fixtures from GRAS or B&K to any target curve—personalization is paramount, do not just blindly follow the line on the screen.

Below in Fig. 9 is what my HD 800 unit equalized to this target looks like on the B&K 5128 when manually accommodating the unique interaction between this headphone and my head/ears (green), compared to an HD 600 (pink). The big spike in the lower/mid treble is likely not present on my head—I certainly don’t hear it—as the outer-ears of measurement fixtures like the 5128 in Fig. 8 typically overestimate mid-treble relative to humans (credit to Sean Olive for this data).

Fig. 8 - Overestimation of treble between the 5128 (green) and a group of humans (blue), measured from headphones at the blocked ear canal entrance.

Fig. 9 - My HD 800 measured on B&K 5128, equalized to a DF-like response with a 500 Hz -5dB shelf with a Q of 0.4 and manually personalized treble (green), compared to HD 600 (pink) and the 5128 DF with a -0.8 dB/octave downslope applied (grey dotted).

My aim with this post is not to prescribe a new target curve, but instead to offer an alternative approach to producing a curve that still accommodates what Harman tells us is likely to be preferred (a downsloping response) but falls more in line with the behavior of typical speakers than what people may currently be using or predisposed to trying as a preference adjustment. Unlike a flat tilt from 20 Hz - 20 kHz, it also has the benefit of being available and easy to adjust in essentially any parametric EQ software.

Questions, comments, concerns? Let’s talk about it!

Interesting. I need to collect my thoughts and then I will post later, likely with some questions regarding how I can test this with some of my personal headphones. Anyway, it sounds like fun and I appreciate the effort you put in here.

A comment re: bass contour: something to note about Sean and Floyd’s work, particularly when Todd Welti is included, is that it is essentially a certainty that a discrete subwoofer is the assumed “good result” for a speaker. Bass is very significant to user experience, but ideal placement from a perspective of managing room modes and ideal placement from a perspective of stereo speakers very seldom coincide. The 105 Hz frequency for the low shelf is something Sean has explicitly identified as correlating with the crossover area from a loudspeaker to the subwoofer(s).

Modes are, in general, kind of the fly in the ointment for navel gazing too much about the exact contour of the bass and lower midrange: real rooms simply do not have the kind of smooth trends we can achieve in EQ’d headphones. This is kind of the corollary to the much greater ease of achieving axially smooth response with speakers vs. “HRTF smooth response” with headphones: you are picking poisons there.

Overall, I think that it may be a bit of a…dogmatic reading to parse “Harman headphones sound like good speakers in a good room” as “the Q values of the adjustments used in Harman parallel in-room speaker behavior”. Like, I know you’re not putting that on Sean per se, but he’s generally been pretty receptive to the idea that the general response trend of (conventionally) “good” speakers in rooms is a more significant factor in preference than any particular details of the adjustments.

To tangent a bit on this musing, something I really want to do is a study that starts with a series of MOA (method of adjustment) tests - slopes, discrete shelves, shelves + peaks - has users adjust to greatest preference, then does a round-robin comparison of the “most preferred” results of the different methods of adjustment to see which ones are “better” in terms of producing results that are more preferred than others.

The 105 Hz frequency for the low shelf is something Sean has explicitly identified as correlating with the crossover area from a loudspeaker to the subwoofer(s). Modes are, in general, kind of the fly in the ointment for navel gazing too much about the exact contour of the bass and lower midrange: real rooms simply do not have the kind of smooth trends we can achieve in EQ’d headphones.

So, feel free to correct me because I am not as speaker oriented, typically.

When it comes to using a room correction product with speakers—say GLM, Sonarworks, or DIRAC with an 8331 + sub—does the correction not usually default to EQing the bass region < 200-300 Hz as flat as possible by (trying to) address unevenness resulting from modes?

Obviously this does not end with a result that is “perfectly flat,” but the impression I had is that correction products (or manual correction done by studio engineers who think “flat is good”) are typically targeting or approximating a flat response in this band more often than targeting something like the ~100 Hz shelf. This may be a case where I just lack data, and I admit my smattering of experience of seeing how engineers equalize their speaker setups may be biasing me towards an incorrect generalization.

No doubt, the lack of modal bungus in the bass range is what the kids would call a “rare headphone W,” but headphones do indeed take an “L + ratio” in the treble.

I wouldn’t say that’s exactly what I’m saying. This post is rhetorically posing a question—is Harman actually similar to a typical speaker response?—where the answer is explained to be “to various extents, yes, but there are ways to achieve a speaker-like response in headphones that Harman’s work supports as preferable that aren’t just the two or three adjustments people dogmatically associate as ‘The Harman Filters’”

If anything it is more of a subtweet of Headphone/IEM enthusiasts who would insist something like the filter I propose is not sufficiently Harman-like. Neither Sean/Todd, nor the people saying Harman captures “the sound of a good speaker in a good room,” are the people I aim to “Uhm akshually” here.

As usual, my gripe is with how the community uses the published output of Harman’s work incredibly narrowly. They follow the result without considering the implications of the conclusions that have been arrived at concurrently with that result. I share in your perception that Sean is typically very open to different modes of adjustment allowing people to get the same kind of result, and this is one such way I think is currently under-explored due to what is essentially a myopic read of the data (and I think it’s an adjustment he would see as reasonable/in-step with the overall conclusions of his work, similar to the flat tilt from 20-20000 Hz).

So I offer it as a suggestion for anyone who wants to try something other than the two shelves or a flat tilt from 20 Hz to 20000 Hz, both of which have their own “kinds” of sound that are distinct from one another, as well as distinct from the filter I propose. I present this with a bit of context/justification from looking at speakers, framing it as a correction that in some ways parameterizes and mimics the downslope behavior of a typical, good speaker in a room because:

  • Many headphone/IEM people may be resistant to something that looks Not Exactly Like The Harman Adjustments they’re familiar with (we already experienced this with the flat tilt) even if it leads to similarly preferred, similarly “literature supported” curves. Framing it in terms of a way to achieve a “good speaker-like result”—a framing that has helped people feel confident in using Harman’s other curves—is helpful to get people to understand this is in-step with the other, more popular curves from Harman’s work
  • Most headphone/IEM people haven’t considered how the behavior of speakers in rooms—and how that behavior can directly lead to a preferable result—informed the conclusions of earlier work that eventually led to Sean going about his headphone/IEM research the way he did, giving people the ability to downslope a “flat” speaker response in headphones/IEMs.
  • The variation in types of downslope behavior we can see across “good speakers” is context that I think would be helpful for people equalizing their headphones to avoid restricting themselves to only one or two types of adjustment because they think their adjustments need to be “Harman-identical” in order to be “Harman-like” or supported by the literature.

You and me both, dude.

All right my brother, so what do we have to do at home in order to try this on some of our headphones? Is it better to start with a headphone that has mostly linear flat extension prior to EQ?

If I’m following along correctly (probably not) we do something like this. I think? :sweat_smile:

Basically just start by EQing a headphone you own to DF, then I’d say apply the curve (500 Hz high shelf, Q: 0.4; people on Discord seem to be liking anywhere from -3 to -6 dB). Focus on getting the mids right while listening to a track you know. Then correct for any treble stuff after that.

This seems like you’re EQing to a tilted DF curve and then adding more tilt with the 500 Hz shelf. You want to make sure you’ve clicked “Remove Adjustments” in the Preference Adjustments category before you AutoEQ to JM-1; then, once it’s EQed to untilted JM-1, you apply the 500 Hz shelf and cut to taste (making any necessary treble adjustments after that).

I can’t really comment on what people in studios do, but room correction just tries to “correct” to a target curve. Of those target curves, some have bass rises that are higher Q, some have only a downward slope at higher frequencies, and many are terrible.

I would also note that overall I don’t consider room “correction” to be a majority tendency in speakers by any means.

I mean, honestly, given what we know of the preferred results from the Harman work, I’m not sure if there’s a starting point which we would expect to not result in a preferred-but-distinct-from-harman response. Like, so long as you have a “fairly HRTF-y” starting point and the ability to adjust the overall balance of treble and bass in some way, I feel like there may be a functionally infinite number of potential filter sets that people could play with which would result in well-liked results.

I suppose I get that. I’m kind of wondering what the best way to break these misconceptions is, honestly. Or perhaps I should just be glad that this industry seems to ignore the science and try to take advantage :grin:

My main caution here is that the in-room prediction is…very much context dependent. For a substantially longer or shorter distance from the listener, or a room that has meaningfully “atypical” reverberation time, you’ll get a meaningfully different result. To give a specific example, an 8331 at a mastering desk less than 1m from the user will have a meaningfully different timbre than that same speaker at 3m in a large room with hardwood floors.

Yep, and part of the issue is no doubt that it just adheres to a line without the nuance of what the best correction for that situation would be.

In the studio world I actually think it’s gotten quite a bit more popular over the last decade or so. We should tell Aki to do another audit of studios sometime :slight_smile:

You have a point. I think there are certainly horrors that could be inflicted on a reasonable baseline, but thankfully the preference adjustments that have actually wound up being used have kept the fundamental HRTF shape of the midrange band intact. It’s ones that would alter that shape that I think would be easiest for me to call stinky.

LOL.

For sure. I’m not saying people should start EQing ceiling bounce curves into their headphones to “make it more speaker-like” or closely target an Estimated In Room Response of a given speaker for their headphone EQs lol. Just that a simple downslope filter—with an unspecified gain value, as to not prescribe a single distance or balance of direct to indirect sound—that seems to roughly resemble the way typical speakers often (but not always) downslope is a fairly uncontroversial thing to try as an alternative preference adjustment to the other two paradigms we currently have (Harman shelves and flat tilt).

Oh! Right. It’s obvious now that you point it out. I was applying two tilts. Thanks :slight_smile: The 500 Hz high shelf is the tilt.

Here’s the correct version for posterity:

Thank you my friend. Just to be clear, it’s a high shelf filter, not a low shelf?

I’ll be having some fun with this later today. I appreciate you brother!

Either a subtractive high shelf or an additive low shelf, pick yr poison.
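A quick way to check that the two options give the same tonal shape is to compare them numerically. The sketch below assumes both shelves follow the analog prototypes from the RBJ Audio EQ Cookbook with the same corner frequency and the same Q definition; real EQ apps may implement shelves differently, so this is an illustration rather than a guarantee. The function names are hypothetical.

```python
import math


def low_shelf_db(freq, f0, gain_db, q):
    """Low-shelf magnitude in dB (RBJ analog prototype, an assumption)."""
    amp = 10.0 ** (gain_db / 40.0)
    s = 1j * freq / f0  # normalized complex frequency
    damp = math.sqrt(amp) / q
    h = amp * (s * s + damp * s + amp) / (amp * s * s + damp * s + 1)
    return 20.0 * math.log10(abs(h))


def high_shelf_db(freq, f0, gain_db, q):
    """High-shelf magnitude in dB (RBJ analog prototype, an assumption)."""
    amp = 10.0 ** (gain_db / 40.0)
    s = 1j * freq / f0
    damp = math.sqrt(amp) / q
    h = amp * (amp * s * s + damp * s + 1) / (s * s + damp * s + amp)
    return 20.0 * math.log10(abs(h))


# Difference between a +5 dB low shelf and a -5 dB high shelf at 500 Hz, Q 0.4.
freqs = [20.0, 100.0, 500.0, 2000.0, 10000.0]
offsets = [
    low_shelf_db(f, 500.0, 5.0, 0.4) - high_shelf_db(f, 500.0, -5.0, 0.4)
    for f in freqs
]

if __name__ == "__main__":
    for f, off in zip(freqs, offsets):
        print(f"{f:>7.0f} Hz  offset {off:.3f} dB")
```

Under these assumptions the offset is the same at every frequency, so the two filters differ only in overall level: the additive low shelf plays louder and eats headroom, while the subtractive high shelf plays quieter.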

I’m not sure I can do this. I don’t have an accurate rig and the only good DF based presets I’m aware of are your 5128 presets. They already have a tilt, right?

Are you familiar with using CrinGraph-based (e.g. squig.link) sites? You can just load a headphone you own and then equalize it to a flat DF-based target on those, like mine at Listener's 5128 Database.

Let me know if you need any other help, I also gave some instruction to OutOfMemory above.

Yes. I thought you didn’t recommend squig.link but I’ll give it a try. At the very least your 5128 measurements should be a good starting point. I’d forgotten about them. :+1:

Thank you very much for the insights! It would be interesting to try out this approach, especially since I’ve never been able to reach satisfactory results with the other two approaches. Yet the most difficult, problematic and crucial issue that dwarfs all others, which is treble personalisation, remains the same.

I wish we would get some kind of breakthrough in that regard, that would make it somewhat easier. Now that would be a dream come true, a true revolution in personal audio.

The approach to use directivity indices is very interesting! That said, I think a smooth tilt might be more intuitive using this logic:

  1. The directivity of both musical instruments and loudspeakers is highly variable–see below. The most consistent characteristic is a general upward trend with frequency. A smooth tilt in the target curve would better reflect this general trend, rather than imposing a specific directivity profile that may not be universally applicable.

  2. The fact that full-range loudspeakers–as opposed to speakers that don’t even operate in that region–often exhibit a continuously sloped directivity in the bass region (see below) supports that further.

  3. Below the room’s transition frequency, the steady-state in-room response is the appropriate measurement. Room gain exhibits a slope of -12dB/oct, or 8–10dB/oct under less ideal conditions.

  4. A continuous slope might be more compatible with previous Harman research, which offered a treble control, but not a tilt control.

  5. It also aligns well with Harman’s statistical speaker model, which basically describes smooth = good.

  6. Toole’s “idealized room curve” (from Figure 12.4(d) of “Sound Reproduction”) is basically a modern version of the B&K Optimum Hifi Curve.

  7. Uneven distribution of more rapidly changing sound might affect the timbre of certain instruments more severely compared to an even distribution. Using your filter, this especially affects the mids.

An aggregate of directivity indices might lead to more insights. There is also the question of how preferences vary between headphones and speakers.


Afaik this wouldn’t matter all that much for stereo playback considering the directivity of instruments and speakers is different, and the final judgment for the correctness of timbre for the instruments is done on speakers?

Unless your argument is that accounting for timbral issues with instruments with a tilt that mirrors their directivity pattern is a more natural way to approach it, which… idk, I can’t really say I feel strongly one way or another. Like I don’t see any huge problem with that idea given it’s not crazily different from the directivity falloff for speakers, but it also doesn’t strike me as important as the directivity of the playback system that the instruments’ timbre was originally evaluated/mixed on.

These points are good reasons why the 20 Hz - 20000 Hz flat tilt is absolutely still an adjustment that should be kept as an option for people to equalize their headphones. Again, my suggestion isn’t to replace the flat tilt, but offer an alternative that seems equally compatible with the behavior of typical speakers and Harman’s model, offering the same helpful downslope in the midrange while not altering the timbre of bass and treble elements as much.

I mean, this is still a smooth high shelf of the same Q value as the treble cut used in some of Harman’s papers, it’s just at a considerably lower frequency :~)

Indeed, though the filter I’m suggesting is closer to the Steady State Room Curve below (also from Toole). What I like about it is that it builds in a “range of adjustment” to handle the variability of room impacts we see in the shaded area.

My feeling is actually that treating all areas with exactly the same slope (flat tilt) actually results in a bigger hit to timbre, but mostly in the bass and treble. I admit this is purely a subjective judgment, not one based on maths, but IME affecting the bass with the same slope that works best for the midrange always seems to cause a good deal of slowness due to sub-bass being elevated over mid-bass which is elevated over upper-bass.
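To put rough numbers on how a flat tilt stacks up the bass regions, here is a tiny sketch. The -1 dB/octave slope and the 1 kHz pivot are assumed values picked for illustration, not figures from this thread, and `tilt_db` is a made-up helper name.

```python
import math


def tilt_db(freq, slope_db_per_octave=-1.0, pivot_hz=1000.0):
    """Level of a straight-line tilt (in dB) relative to the pivot frequency."""
    return slope_db_per_octave * math.log2(freq / pivot_hz)


# Sub-bass, mid-bass and upper-bass sample points (choice of points is arbitrary).
bass = {f: tilt_db(f) for f in (30.0, 60.0, 120.0, 240.0)}

if __name__ == "__main__":
    for f, level in bass.items():
        print(f"{f:>5.0f} Hz  {level:+.2f} dB")
    print("20 Hz vs 20 kHz:", tilt_db(20.0) - tilt_db(20000.0), "dB")
```

Every octave lower gains the same amount, so the lowest bass always ends up the most elevated; a broad shelf, by contrast, flattens out as it moves away from its corner frequency.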

This has been an interesting conversation.

I think you and others here already probably know where I mostly stand on this topic, Listener. And what I believe is missing from conversations like this is some actual in-ear measurements of neutral speakers in semi-reflective rooms for comparison.

Of course there are still folks here who probably don’t believe that is the best target, baseline, or touchstone, for various reasons. There are probably even a few who still believe that an unmodified free or diffuse field response is still the best for headphone listening. And they are entitled to their opinions.

We can also debate how close or far away Harman’s 711 target is from the in-ear response to speakers in a room (even though there’s no way to actually measure it on a GRAS rig with no head/torso, like the one they used for most of their research).

There are a lot of good headphones on the market right now, though. Many of them are pretty neutral sounding, based on the reviews here and elsewhere, and not all of them conform precisely to the Harman target on GRAS. Certainly many of the open headphones don’t.

My suggestion to you and others is to find a group of headphones that you think approximates a neutral response for you, or the in-ear response of some good speakers, if you prefer. (You might even want to try listening to some good speakers, and doing your own informal comparisons with some different headphones.) Then look at the graphs of those headphones, and see what target curve seems to align best with them.

Or maybe better yet, just compute the average of those curves, and try using that as your target for EQ! Then add or subtract headphones from the grouping with different characteristics, until you zero in on precisely the type of response you want.
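If you would rather do the averaging yourself than rely on a site, a minimal sketch might look like the following. It assumes each curve is a list of (frequency in Hz, level in dB) pairs sorted by frequency, interpolates on a log-frequency axis, and takes a plain arithmetic mean of the dB values. That is one reasonable convention, not necessarily what the Earphones Archive tool does, and the example data is made up rather than taken from real measurements.

```python
import bisect
import math


def interp_db(curve, freq):
    """Interpolate a sorted [(hz, db), ...] curve linearly in log-frequency."""
    freqs = [f for f, _ in curve]
    if freq <= freqs[0]:
        return curve[0][1]  # clamp below the measured range
    if freq >= freqs[-1]:
        return curve[-1][1]  # clamp above the measured range
    i = bisect.bisect_right(freqs, freq)
    (f_lo, db_lo), (f_hi, db_hi) = curve[i - 1], curve[i]
    t = math.log(freq / f_lo) / math.log(f_hi / f_lo)
    return db_lo + t * (db_hi - db_lo)


def average_curves(curves, grid):
    """Average several curves on a shared frequency grid (mean of dB values)."""
    return [
        (f, sum(interp_db(c, f) for c in curves) / len(curves))
        for f in grid
    ]


# Made-up example data, not real headphone measurements.
hp_a = [(20.0, 2.0), (1000.0, 0.0), (20000.0, -6.0)]
hp_b = [(20.0, 6.0), (1000.0, 0.0), (20000.0, -2.0)]
target = average_curves([hp_a, hp_b], [20.0, 1000.0, 20000.0])

if __name__ == "__main__":
    for f, db in target:
        print(f"{f:>7.0f} Hz  {db:+.2f} dB")
```

In practice you would probably also want to level-match the curves at a common reference point before averaging, since measurements from different sources are rarely normalized the same way.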

Listener mentions squiglinks above. And the Earphones Archive squiglink allows you to compute an average of multiple headphone response curves. If you don’t see the headphones you want in their database, you can add more with the Upload FR button in the Equalizer menu. As many as you like.

It also has both the Stock HBK 1/3-octave 5128 DF, and the one Headphones.com uses. So you can experiment with both. The stock curve is my preference. But it is only useful as a coarse reference, good up to about 15-16 kHz imo. And I prefer to actually use an average of several headphones as my EQ model instead, rather than just DF + a slope or bass boost.

The only caveat I will mention is that the averaging tool in Earphones Archive seems a bit buggy. You may only get one shot to compute the average after loading/selecting all your preferred headphones, and you may have to relaunch the program if you want to make changes to your preferred headphone list. If you are loading a large number of headphones for your average, this can be somewhat of a nuisance.

I hope this will maybe be of help to some here.