Just in case any of you readers needs to look up the acronym HRTF like I did;
HRTF = Head-Related Transfer Function
I didn’t come up with this technique. It’s something that oratory1990 recommended on his Reddit, and he is an expert on headphone EQ. I’m merely theorizing why he recommends using pink noise with a sweep tone instead of the sweep tone alone.
Agreed. But short of testing by repeatedly changing a PEQ and then listening to real music, this is apparently better than using a sine sweep alone.
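For anyone who wants to try that stimulus, here’s a minimal Python sketch of a log sine sweep played over steady pink noise. The sweep range, duration and mix levels are my own guesses for illustration, not oratory1990’s exact recipe:

```python
# Sketch: a slow log sine sweep mixed over continuous pink noise.
# All levels/durations are guesses, not oratory1990's settings.
import numpy as np
import sounddevice as sd  # pip install sounddevice

fs = 48000
dur = 30.0                              # one full 20 Hz - 20 kHz sweep
t = np.arange(int(fs * dur)) / fs

# Pink noise: shape white noise to a 1/f power spectrum in the frequency domain.
spectrum = np.fft.rfft(np.random.randn(len(t)))
freqs = np.fft.rfftfreq(len(t), 1 / fs)
freqs[0] = freqs[1]                     # avoid dividing by zero at DC
pink = np.fft.irfft(spectrum / np.sqrt(freqs), n=len(t))
pink /= np.max(np.abs(pink))

# Logarithmic sine sweep, 20 Hz to 20 kHz.
f0, f1 = 20.0, 20000.0
k = np.log(f1 / f0)
sweep = np.sin(2 * np.pi * f0 * dur / k * (np.exp(t / dur * k) - 1))

# Mix the sweep a few dB above the noise so resonances stand out against it.
mix = 0.25 * pink + 0.5 * sweep
sd.play(np.column_stack([mix, mix]), fs)
sd.wait()
```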
Agreed. As I said, maybe the EQ I am using is already good enough. And maybe it is a quixotic quest to improve it further. But that doesn’t mean I don’t want to try. That’s part of what this hobby is all about: continuously improving our perception of sound quality, even in small increments.
Sorry about that! I hate it when I read a really interesting post and there are undefined acronyms or abbreviations that I don’t know. And here I went and did it myself.
I know that the DF in DFHRTF means “diffuse field”, but how does DFHRTF differ from HRTF, and what’s the application?
See Resolve’s explanation here:
@Resolve and @listen_r when you use pink noise alone, what are you listening for? How do you know something is off? How do you know what pink noise is supposed to sound like? I’m assuming you have a reference that you consider “correct” but I can’t think of what that would be.
I read all that and was still head-cocking. This helps:
Diffuse Sound Field is the region in a room where the sound pressure level is uniform, i.e. where the reflected sound dominates, as opposed to the region close to a noise source where the direct sound dominates.
Sound Fields, Terms and Definitions.
That’s right, and we use a diffuse field for the HRTF with headphones because that fits the use condition - headphones are worn on the head, meaning the sound does not come from any particular direction and has no specific localization (other than originating from ‘in the head’).
To go a bit deeper on this, the FR varies substantially when the sound source position changes. If that same FR change at the eardrum were to be observed with headphones, we’d most likely all report a meaningful change in timbre. This is because there’s no psychoacoustic priming for locatedness with headphones the way there is with speakers.
So really, the function of the DFHRTF with headphones, and why we use it, is in part psychoacoustic in nature - specifically due to the use condition of headphones.
Great question! Honestly I’m just using it to target resonances such that there aren’t any huge aberrations.
I don’t necessarily advise getting too granular with it based on the pink noise alone, but pink noise has been shown by F. Toole and S. Olive to have the lowest detection thresholds for resonances in loudspeakers, meaning it’s as easy or easier to detect resonances in continuous pink noise vs. other broadband signals like music. Some music, like Fast Car by Tracy Chapman, can perform nearly as well because of its spectral similarity to pink noise.
In other words, pink noise is about as good as or better than any musical stimulus at giving a wide-band idea of the tonal coloration (vis-à-vis resonances) of the playback transducer in situ, but might not be quite as useful for super fine-grained adjustment.
PS. This is super unscientific, but when I’m EQing pink noise I always imagine I’m listening to a distant waterfall. It’s not exactly a helpful idea, but it’s one that brings me peace.
That makes sense.
One of the things I do to verify my EQ is listening to Descending by Tool. Besides being a fantastic recording, it opens (and closes) with the sound of crashing waves. That’s a sound that has a broad frequency spectrum and is also familiar enough that I can tell when it’s unnatural.
ASIDE: I remember using the sound of crashing waves to spot bad lossy compression back when 128 kbps MP3 was a thing.
Yes. What works for me is Dr. David Griesinger’s method, and since I discovered it I’ve wanted it for all my headphones (but haven’t actually gotten it done for all of them, since it takes quite a bit of effort). It sounded like the best equalization I’d ever heard, though I have to say it didn’t cross my mind to actually test it against the reference speakers it was based on - that should provide some indication of how successful the method is, since it’s mostly trying to get headphone X to sound to you like your calibrated(-ish?) speaker(s) of choice in your room of choice.
No no, of course not. Your HRTF reflects how the frequency amplitudes are changed - by everything intervening along the way that’s specific to your personal body parts - between their values as the sound leaves the free-field (or realistic-field, if it’s good stereo speakers in a good room?) source and their values at the eardrum. It’s very much not an equal-loudness shape, but equal-loudness tuning does play an intermediate role, as you can hear in Dr. Griesinger’s explanation.
Essentially the method has you determine two equal-loudness tunings: one reflecting how your body hears sound from realistic away-from-body sources, and one reflecting how your body hears sound from a specific pair of headphones. To turn one into the other, you apply the equal-loudness tuning of the headphones as positive values and subtract the values of the realistic-field (speakers) tuning at each frequency, resulting in a final tuning that “turns those headphones into open-air speakers” for you specifically (within reason; it’s not going to be perfect, of course, because you won’t be listening to recordings all made with binaural eardrum microphones in a dummy head that has your exact HRTF).
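If it helps to see the arithmetic, here’s a toy sketch of that final subtraction step. The per-band numbers are invented purely for illustration - the real equal-loudness tunings come out of Griesinger’s listening procedure, not out of code:

```python
# Toy sketch of the subtraction at the heart of the method (as I understand it).
# eq_headphone / eq_speakers: the gain (dB) you needed per band to make each
# band equally loud by ear. All values below are made up for illustration.
bands_hz = [63, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]

eq_headphone = [4.0, 2.5, 1.0, 0.0, 0.0, -1.5, 3.0, 5.5, 2.0]   # headphone X
eq_speakers  = [1.0, 0.5, 0.0, 0.0, 0.5,  0.0, 1.0, 2.0, 0.5]   # stereo monitors

# Final correction: whatever the headphones needed *beyond* what the speakers
# needed is the error to EQ out of the headphones.
final_eq = [hp - sp for hp, sp in zip(eq_headphone, eq_speakers)]

for f, g in zip(bands_hz, final_eq):
    print(f"{f:>6} Hz: {g:+.1f} dB")
```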
Note that I keep talking about speakers, plural, because that’s how I modified his method to get what seemed like correct bass - his original method, aimed at recreating the experience of frontal out-of-head central sounds, uses only one full-range speaker, but I found the result wasn’t as good purely tonally as when I used stereo monitors at +/- 30 degrees in front of me. So I gave up on the otherwise reasonably successful (in some songs, not all) illusion of some central sounds appearing to come from sources out in front of me - which would otherwise pretty much never happen for me with headphones - in exchange for what seemed to me like considerable gains in overall timbral accuracy.
Achieving an ‘equal loudness shape’ FR is not the suggestion, though; rather, it’s about what HRTF matching would sound like perceptually. And since we’re talking about headphones, it only makes sense for that to be the DFHRTF, and since the DFHRTF is a calculation based on many angles of incidence (coming from effectively nowhere), it’s a bit unclear how this relates to perceptual evenness in tone-gen. At the moment I’m of the opinion that we shouldn’t be chasing equal loudness in tone-gen. But certainly there are times when fixing a notable peak that’s heard in tone-gen yields an overall much better result.
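For what it’s worth, the “calculation based on many angles of incidence” is, as I understand it, a power average of the directional HRTF magnitudes. A sketch of that idea, where the HRTF set itself is placeholder random data rather than real measurements:

```python
# Sketch of how a diffuse-field HRTF is derived: average the *power* of the
# directional HRTF magnitudes over (ideally uniformly spaced) directions.
# hrtf_mags is placeholder data; a real set would come from measurements.
import numpy as np

n_directions, n_freqs = 360, 512
rng = np.random.default_rng(0)
hrtf_mags = np.abs(1.0 + 0.3 * rng.standard_normal((n_directions, n_freqs)))

# Power-average across directions, then express in dB.
df_hrtf_db = 10 * np.log10(np.mean(hrtf_mags ** 2, axis=0))
print(df_hrtf_db.shape)  # one diffuse-field magnitude value per frequency bin
```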
It should sound like “real life”, unless I’m missing something. Or more specifically it should sound like the non-headphone listening scenario you used to determine that HRTF, which for most people trying this at home would probably be a “good (stereo) speakers in a good room” setup. It’s also what I call “truly flat” FR, the only condition where the reproduction chain “disappears”.
No, why would the completely artificial invention of “diffuse field” ever come into it? Nobody listens to music reflected at them equally from all directions, in non-headphone scenarios. People mostly like live concerts where the music is coming only from the stage, and then approximations of that in listening rooms, usually with frontal speakers, again firing the music at you mostly from the front. DF was always a weird concept to use, should’ve stayed “in the lab” in very specific experiments where the thing they were looking for couldn’t be investigated otherwise. But for people equalizing their headphones specifically as music consumers, I don’t see any use for DF.
So this is the root of the problem. We’re talking specifically with headphones, not speakers.
Trying to achieve a ‘speaker-like sound’ in headphones runs into problems when dealing with stereo recordings, because of the use condition of headphones and how that’s different from speakers at a distance. DF is the correct sound field for headphones given that they’re worn on the head. The sound is not coming from a particular direction the way it is with speakers. There’s no directivity or locatedness to the sound when headphones are worn on the head, the way there is in real life when you hear things, or with speakers at a distance at a given angle of incidence.
To understand this better, it’s worth reading Listener’s article on the subject or the deeper one from Theile.
Yes, and that’s why the best methods of improving the sound of headphones to make it as lifelike as possible involve 1. recording binaurally using a dummy head with microphones placed at the eardrum position (baking the HRTF into the recording, which even then remains problematic if the dummy head is not identical to the intended listener’s head), or 2. measuring the HRTF of the listener somehow and applying it to the recording somewhere in the audio chain before it’s fed into the transducer, as in expensive automated systems like the Smyth Realiser, or in work-intensive, equal-loudness-based “subjective measuring” methods like Dr. Griesinger’s.
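In sketch form, option 2 is just convolution: once you have impulse responses encoding your HRTF (one per speaker-to-ear path), you filter the audio with them before playback. The file names here are hypothetical, and this ignores details like head tracking that the Realiser adds:

```python
# Sketch of "applying the HRTF to the recording": convolve each speaker channel
# with the measured impulse response to each ear and sum at each ear.
# The HRIR file names are hypothetical; real ones come from measurements.
import numpy as np
import soundfile as sf                  # pip install soundfile
from scipy.signal import fftconvolve

audio, fs = sf.read("song.wav")         # stereo: columns are [left, right]
paths = ("LL", "LR", "RL", "RR")        # e.g. "LL" = left speaker -> left ear
hrir = {p: sf.read(f"hrir_{p}.wav")[0] for p in paths}  # assumed equal lengths

left_ear  = fftconvolve(audio[:, 0], hrir["LL"]) + fftconvolve(audio[:, 1], hrir["RL"])
right_ear = fftconvolve(audio[:, 0], hrir["LR"]) + fftconvolve(audio[:, 1], hrir["RR"])

out = np.column_stack([left_ear, right_ear])
out /= np.max(np.abs(out))              # normalize to avoid clipping
sf.write("song_binaural.wav", out, fs)
```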
You’re being a bit vague in your language, but I hope you’re not trying to tell me the sound coming out of headphones can’t be made to believably imitate the sound of a free-field concert or of speakers-in-a-room, since Smyth Research (at least) has clearly proven it can be done. If you accept that, the only thing remaining is to evaluate whether Griesinger’s method comes close enough to be worth the effort.
But it’s the wrong sound field for satisfying the user’s expectation that their music sound “as in real life without headphones”, so it should be corrected for, it should be eliminated as much as possible. Otherwise we’re going off-topic and there’s no need to even mention HRTF. Only people who want the music coming out of headphones to not sound like it’s coming out of headphones need to care about including HRTF information in their EQ calculations. HRTF characterizes only non-headphone listening scenarios. Why use it if you’re not trying to have your headphones emulate those scenarios?
Yes, and that’s the primary issue that Griesinger’s method purports to solve, specifically wrt. the perception of stage-central sounds as coming from a source outside and at some distance in front of your head. It works best with recordings made with his special dummy head (e.g. the opera samples from his 2017 Berlin AES presentation, corrected variant “All no 1st” on slide 45), but you can also have pleasant surprises with generic commercial recordings.
I feel it becomes a different question if we’re trying to answer “How to make headphones sound like real life” vs “How to make headphones sound natural with stereo recordings”. I’m squarely focused on the latter here, and for that, the DFHRTF is the one that makes the most sense, for the aforementioned reasons.
As for it not being what people prefer, this is very much clear from the Harman work. People prefer a downward tilt from bass to treble. And so I’m not suggesting DFHRTF matching to be ‘most preferred’ by any means. Merely that for headphones with stereo recordings, this is the best fit for tone color.
But as soon as we start trying to answer the other question - how to make headphones sound like real life - I’m not so sure we can get there with non-binaural content.
The Realiser approach - and binaural, head-tracked approaches in general - are functional so long as the illusion of externalization is preserved. This illusion is easily shattered by head movement (thus the tracker on the Realiser), because the auditory image will move with the head rather than staying placed in space. Theile talks about this in his 2016 conference paper which uses a Realiser for some experiments.
Because, and this is Theile’s point, our perception of timbre is directly connected to our perception of source incidence. Because headphone sound is non-localizable, we do not perceive directional HRTF cues as directional cues, but rather as timbral defects. This finding is pretty consistent historically, and is part of why free-field-based targets are inevitably the lowest-scored in tests of preference.
Griesinger’s method is specifically and directly debunked by sections 2.4 and 2.5 of Theile’s 1986 paper, which is part of why those standards were withdrawn. Empirically, we know that subjective loudness comparison cannot yield correct timbre when one source is not spatially localized. The solution to this with headphones is individualized HRTF compensation and head tracking, as implemented in the Realiser, but you cannot achieve the same effect with a static EQ.
Something like Jaakko Pasanen’s Impulcifer can potentially produce an externalized auditory image, but the illusion will break fairly quickly with head movement, so you end up with the odd situation of having to keep your head in a vice in order to hear things “correctly”. This is one of the factors that inherently limits the usability of dummy head recordings to, basically, lab tests, and it’s part of why modern spatial sound is probably the only route to achieving consistent timbral/tone-colour matching between loudspeakers and headphones for the same recordings.
OK, that… sounds… like a dispute between researchers. I can’t get in the middle of that; I can only note that it seems quite unreasonable to assume Griesinger would have continued working and presenting on his theory all the way to 2017, possibly beyond, if it had been so completely ‘debunked’ by a 1986 paper. Given the way these things usually go in science - and especially at its interface with practice or engineering - I find it more likely that further experiments have proven Theile’s old paper to not be quite as conclusive as it seemed, than that people have stupidly continued working and experimenting (and convincing themselves they’re getting worthwhile results) on something that clearly shouldn’t work because the Perfect Paper about the phenomenon was already published in 1986.
Anyway, I’m not promoting Griesinger’s method specifically for out-of-head source localization (since we can’t all just start listening to all our music re-recorded with his special dummy) but for overall timbral matching to a listener’s HRTF. And if using full-range stereo speakers instead of one smallish central speaker, I would say this method delivers. Can’t prove it with arguments that would be accepted in a scientific journal, I can just recommend that people interested in the topic try it for themselves.
I recently read a method for creating an HRTF measurement on oratory1990’s Reddit:
I measured my speaker with the IE mic in my ears. I then marked that position in space using sewing thread. Then I measured the exact same position with the mics but free-field. In REW, I then subtracted the measurements to get my own personal HRTF, or at least a starting point for that. I did this for both ears and averaged the results. After that, I applied my speaker target curve (-10 dB tilt) to the HRTF measurement.
After some manual tweaking of that target I now have what I consider to be the ideal target response for me. I saved that target in REW, so now I can use the mics to basically EQ any headphones (not IEMs, unfortunately) to that target.
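If I’m reading bjorken22’s steps right, the arithmetic boils down to something like this (hypothetical file names, and I’m guessing at how his -10 dB tilt is defined):

```python
# Sketch of the measurement arithmetic as I read it. The CSV exports
# (freq, dB pairs on a shared frequency grid) and the tilt definition
# are my assumptions, not bjorken22's exact workflow.
import numpy as np

def load_db(path):
    data = np.loadtxt(path, delimiter=",", skiprows=1)
    return data[:, 0], data[:, 1]       # frequency (Hz), magnitude (dB)

freq, in_ear_left  = load_db("speaker_in_ear_left.csv")
_,    in_ear_right = load_db("speaker_in_ear_right.csv")
_,    free_field   = load_db("speaker_free_field.csv")

# Personal HRTF estimate: what the head and ears add on top of the
# room/speaker response (averaged across ears, as he describes).
hrtf_db = (in_ear_left + in_ear_right) / 2 - free_field

# Target: HRTF plus a downward tilt; here -10 dB spread from 20 Hz to
# 20 kHz (about -1 dB/octave), since he doesn't specify the exact shape.
tilt_db = -10.0 * np.log2(freq / 20.0) / np.log2(20000.0 / 20.0)
target_db = hrtf_db + tilt_db
```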
I have to say this sounds similar to Griesinger’s method, except using in-ear mics. Of course this would only give the correct timbre, not the illusion of the music coming from phantom speakers.
Similar, but needs special hardware that not everyone has lying around (in-ear mics) whereas Griesinger’s method only really needs your favorite speakers and the headphones you’re correcting.
Plus it seems bjorken22 over there isn’t adapting the profile to each headphone; he’s just assuming every headphone should be producing the same timbre no matter where its membranes are sitting relative to the eardrum, or to the IE mics to be exact. Whereas that way of measuring the HRTF is only strictly correct for tuning IEMs that will radiate from the exact same point along the ear canal where the mics were sitting. Griesinger’s method adapts to this difference in membrane positioning (and other characteristics) of each headphone too, without additional manual tweaking. (Though of course with the larger headphones there’s always the issue of exact positioning on the head, which he admits tends to make for worse results than with on-ears and in-ears.)