Yes, though even without that you can still get within the ballpark and it’s still useful. I think this would be really cool for open canal too. Blaine made me some mics ages ago that sat at the canal entrance but were open in the middle, and had a capsule on the end of them. So you get the open canal data just not at the DRP. In a perfect world we use probe mics but uh… yeah I don’t have any of those. I’m going to look into this further.
Well the big monkey wrench in all of our approaches for this - as in that other crosstalk cancellation/reduction thing I mentioned in the spaciousness thread - is how the music was produced, what it was monitored on, and what it was intended to be listened to on. I wish we had some solid and up-to-date numbers on this, but my working assumption is that studios still use stereo speakers as the ultimate reference, even if they also double-check their masters with headphones before pressing “print”.
That would give those of us who tune for +/-30-degree speaker-based HRTFs a higher success rate listening to commercial recordings than anyone else. And some of the research seems to support this, like that 2013 Olive & Welti listener preference paper Blaine linked up-thread, where the speakers-in-room based targets were the most preferred.
So to be clear… the agreement between speaker and headphone preferences is largely about answering the question of how much bass and treble people prefer, and that seems to be roughly the same for both.
But when we’re talking about timbre, the sound field conditions for these devices are going to matter much more, particularly when it comes to the psychoacoustics of listening to sounds at a distance vs listening to the sound helmet (headphones). And that’s something where we want to look at Theile’s work. Moreover, just because something is mixed on speakers does not mean the playback condition needs to be the same for the timbre to sound right. Again, the change from speakers to headphones is going to involve a change to the psychoacoustic system.
There’s also the whole issue where a purely analog headphone signal doesn’t contain any interaural time or interaural level difference (ITD/ILD) information for both speakers in both ears, not to mention the time of arrival for various room reflections. I’m pretty sure it’s Theile who makes the argument that our brains automatically divide out the transfer function of that information from the whole impulse response at the eardrum to get the expected impulse response for timbre. So basically, I think the argument goes that eardrum FR targets based on in-room 2-channel stereo measurements will negatively affect perceived timbre because headphones don’t have all the rest of the info. If the front speaker-like presentation is what you’re after, though, I’ve heard that the best way to do this is to use room simulators or spatial audio with diffuse-field tuned headphones.
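For anyone curious what a bare-bones room/speaker simulator actually does, here’s a minimal sketch of the usual approach (my own illustration, not any specific product’s method): convolve each stereo channel with a binaural room impulse response (BRIR) of a virtual speaker at +/-30 degrees and sum the contributions at each ear. The file names and the `simulate_speakers` helper are hypothetical placeholders; you’d substitute BRIRs you’ve measured or downloaded.

```python
# Minimal speaker-simulation sketch: convolve stereo channels with BRIRs and
# sum at each ear. Assumes both BRIR files have the same length.
import numpy as np
import soundfile as sf               # assumes the pysoundfile package is installed
from scipy.signal import fftconvolve

def simulate_speakers(stereo, brir_left_spk, brir_right_spk):
    """stereo: (N, 2) array; each BRIR: (M, 2) array = (to left ear, to right ear)."""
    L, R = stereo[:, 0], stereo[:, 1]
    # Each virtual speaker reaches BOTH ears - this is the crosstalk and room
    # information that a plain headphone feed lacks.
    ear_left  = fftconvolve(L, brir_left_spk[:, 0]) + fftconvolve(R, brir_right_spk[:, 0])
    ear_right = fftconvolve(L, brir_left_spk[:, 1]) + fftconvolve(R, brir_right_spk[:, 1])
    out = np.stack([ear_left, ear_right], axis=1)
    return out / np.max(np.abs(out))  # crude peak normalization

mix, fs   = sf.read("mix.wav")            # the ordinary stereo master
brir_l, _ = sf.read("brir_left_30.wav")   # hypothetical BRIR captures
brir_r, _ = sf.read("brir_right_30.wav")
sf.write("virtual_speakers.wav", simulate_speakers(mix, brir_l, brir_r), fs)
```

Played back on diffuse-field tuned headphones (so the headphone itself isn’t layering a second, conflicting direction-dependent coloration on top), this is essentially what the commercial speaker-virtualization products do, just with much fancier BRIRs and head tracking.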
Plus, headphones hang off the side of the head. This can never be a natural presentation.
So basically your point is to improve the timbre without introducing FR elements that only sound right within a room-sized sound field and would be inappropriate and undesirable in a headphone cup sized sound field? And this is the core of why you want to use a DFHRTF? Then that still leaves open the question of preference-based targeting, right? As in, you’re not choosing DFHRTF because you want the music to sound like it’s all around you, it’s just because of technical reasons having to do with psychoacoustics and sound fields.
Sure it doesn’t, but:
- See my recent spaciousness topic post, where some people argue that those cues shouldn’t be heard from both speakers with both ears when listening to speakers either! Meaning that taking speaker presentation with crosstalk as a reference for correct stereo imaging has been wrong the whole time.
- If this is accepted, ILD shouldn’t be a problem with either speakers or headphones: each instrument is panned appropriately during mixing and has the right loudness ratio between left and right already baked into the commercial recording. ITD… I don’t think you can fix, except by recording everything with two microphones positioned exactly where your intended transducers will be for the listener, so it’s either a 2-mic recording for a precise speaker setup geometry, or a binaural recording for headphones. I don’t think pointing this out helps the HRTFs and target curves discussion in any obvious way, it’s just something that can’t be fixed with EQ (see the back-of-envelope note right after this list).
- Room reflections are again a detriment for correct imaging, but judged positively by many listeners, probably because of their contribution to overall FR tilt. I think for this component all we need to do is keep in mind to EQ in a good bass tilt and otherwise just be glad there are no (late arrival) wall reflections in headphones and enjoy the lack of associated imaging errors (though headphones can have their own early-arrival reflections in the front volume if they’re not very well designed).
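To put a back-of-envelope number on the “ITD can’t be fixed with EQ” point (this bit is mine, not from anything linked above): a pure interaural delay $\tau$ is, in the frequency domain, just

$$H(\omega) = e^{-j\omega\tau}, \qquad |H(\omega)| = 1 \ \text{at every frequency},$$

so its magnitude response is perfectly flat and a magnitude-only EQ curve has nothing to act on; the cue lives entirely in phase/arrival time. The delays involved are tiny anyway - roughly 0.6-0.7 ms at most for a source at 90 degrees, and well under that for a +/-30-degree speaker pair.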
Not specifically the cup, but rather the sound field condition of wearing headphones. There is no specific directivity to the sound source when you are wearing headphones like there is with speakers; you are permanently in the “it comes from no specific direction” condition.
This is why we talk a lot about the importance of turning your head relative to a specifically localized sound source (speakers). It doesn’t sound unnatural to us because we have all the rest of the psychoacoustic priming to interpret any sound change when we turn our heads as a change in localization: we can tell immediately that the sound is now coming from there. If you remove all of that localization priming we get with speakers, there’s nothing to tell our brains to interpret the FR as being specifically localized to there, and it would sound super weird.
That’s effectively what we’re doing when we put on headphones, and why it makes no sense to try to emulate any direction based sound-field with headphones - at least for stereo playback - if appropriate timbre or ‘tone color’ is something we care about.
See, I’ve always had a problem with this assumption, and still do. I think it’s just something people have always assumed because of the wrong ways the virtual stage has always presented subjectively from uncorrected headphones, but I think there are indications that it’s the wrong explanation for that result.
If simply putting the drivers close to the ear was removing all directionality effects, why would the OAE1 concept (and other similar ones) have ever made sense to even try? Can you seriously claim that rotating the driver forward or back inside an earcup should make no difference to the FR at the eardrum, that it’s always all just DF? Even forgetting the OAE1, why would Sennheiser have gone through the trouble of designing that odd shaped HD800 cup to angle the drivers away from the traditional 90 degree direction? Because it changes the FR!
I say there absolutely is directionality in headphone sound and it follows the driver’s axial direction. And especially with over-ears, since the pinna is fully involved, this should produce something much closer to the single-direction HRTF from a hypothetical speaker placed along the driver’s axial direction in space than to a DF type of response. This may have been less true in the past with all the badly designed cups with copious reflections shooting off everywhere, but in better damped cups and with the better earpad materials of today it should be a lot more apparent. And this changes all the calculations you have to do for timbral corrections and spatial effects vs. working on the DF assumption.
Try it.
It doesn’t sound weird. I think what you’re missing in your model of this is that even as the different directional HRTFs tell our brains where to deduce the sound is coming from, we still hear directional sound as having different timbre! So position and timbre are causally linked but they still remain separate perceptions! Just consider not being able to understand what someone behind you is mumbling under their breath, and wanting to turn around to face them to hear them better (hear their treble better). You wouldn’t do this if positional detection removed all traces of timbral differences via the brain going “welp, now I’ve used up this information to decide on positioning, I can delete it and just keep everything else”.
This is so not how it works. We still hear the different timbre of each different direction, as a separate quality.
And if this is the case, it stands to reason that some directional timbre you may decide to take as a reference should be possible to enjoy any time you’re able to reproduce it with or without all the other non-timbral localization cues.
Well it can’t be an assumption when the sound source is literally affixed to your head.
No that’s not the claim. It would absolutely make a difference to the FR at the eardrum. The point is that you’re still in the “sound helmet” listening condition regardless of the FR, and that’s a problem for direction-based sound field equalization. FR differences at the eardrum in this condition don’t gain the benefit of any other typical localization priming effects. This is also why projects like the OAE1 are somewhat novel and interesting. They’re precisely trying to break this condition and create the illusion of front-biased speaker timbre. It doesn’t really work… for most people who’ve heard it at least. But that’s why it’s an interesting concept.
Same thing with the HD 800, it’s what it’s doing to the FR that’s creating this effect - the illusion of spaciousness (and being acoustically very open).
But this actually works in my argument’s favor, or I think maybe you’ve misparsed it? The point of the exercise is not to suggest that “we can’t discern timbral differences from things behind us vs in front of us”, rather it’s to indicate that it doesn’t sound immediately weird, because of the priming. It’s an extreme example but it’s meant to illustrate the point. You then apply that same idea to front-biased sound fields of any kind when the directivity priming is removed, and that’s why it ends up sounding weird. To be clear… likely not as weird as the timbre of FR from speakers behind you, but still unnatural. I actually think the OAE1 demonstrates exactly why this is a problem.
I want to note though, I am led to believe there are people who genuinely like the OAE1, even if most of us who reviewed it did not. Assuming it’s not for its bass-to-treble delta and is actually for its treble timbre, it’s entirely possible that some people don’t hear FF or front-biased timbre as strange in the sound helmet condition. Maybe they are getting localization priming from auditory cues alone, and their brain then reinterprets the tone color accordingly.
Apple and others digitally simulate sound coming from a fixed location with IEMs and headphones, but it always sounds processed and constrained. There’s no transparency or nuance left.
Forced positioning and forced width date back several decades. It’ll be interesting to see whether modern computing can use raw horsepower to get around the artificiality. So far I haven’t heard anything great.
That’s probably just because it’s not done very well. But yeah, altering the playback material or adding head tracking… that’s its own can of worms. Here we’re just talking about stereo playback with typical content.
I’m naive after decades of trying to be sophisticated. I’ve heard really nice speakers and really nice headphone setups and gimmicky processing a la AirPods.
But I’ve given up trying or expecting to hear something approaching realism, especially with acoustic instruments that have unmistakable timbre when heard live. Piano, classical guitar, saxophone, oboe, brass, violin through bass. There is simply nothing that compares.
Even some jazz combos that might mix unamplified percussion with electric keyboards in a properly designed venue (Micks Black Box in Lititz PA) will never be replaced by a recording of the same on speakers (even in the same room, though that might be close) and certainly not on headphones.
This discussion is of interest, especially to those who might design headphones and playback chains, but the sound of Jarrett in the same room will always show up the recording as an impostor.
Almost certainly. You are always at the mercy of the recording chain and the environment. I’m not aware of any devices that are designed to recreate any one specific recording or emulate one specific space for such a recording. I think with stereo playback content you’re never going to get that no matter what you do.
New project to tune stuff to your HRTF released this year: this German developer says he’s been working on getting his HRTF measured and reproduced in headphones for 15 years, and he can now get it done for anyone based on simple manual measurements fed into his software.
He claims most of what matters about your HRTF is captured by measurements of your head circumference, neck height and pinna width, and you should be good to go with over-ear headphones because those incorporate the rest of the relevant features of your anatomy naturally.
It’s not completely revolutionary; I vaguely remember some company (Sony?) coming out more than 5 years ago with a similar system based on you putting a few head measurements into their app so they can estimate your HRTF.
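For a sense of how far simple head measurements can plausibly get you (to be clear, this is just the textbook spherical-head approximation, not this developer’s or Sony’s actual algorithm, and every name in the snippet is mine): head circumference alone already pins down the dominant low-frequency localization cue, the ITD.

```python
# Hedged sketch: estimate head radius from circumference and get an approximate
# ITD from the far-field Woodworth spherical-head formula. Illustrative only.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, roughly room temperature

def head_radius_from_circumference(circumference_m: float) -> float:
    """Treat the head as a sphere: r = C / (2*pi)."""
    return circumference_m / (2.0 * np.pi)

def woodworth_itd(radius_m: float, azimuth_deg: float) -> float:
    """Woodworth far-field approximation: ITD = (r / c) * (theta + sin(theta))."""
    theta = np.radians(azimuth_deg)
    return (radius_m / SPEED_OF_SOUND) * (theta + np.sin(theta))

# Example: 57 cm head circumference, virtual speaker at 30 degrees
r = head_radius_from_circumference(0.57)
print(f"estimated head radius: {r * 100:.1f} cm")                 # ~9.1 cm
print(f"ITD at 30 degrees: {woodworth_itd(r, 30) * 1e6:.0f} us")  # ~270 us
```

The pinna-width and neck-height measurements would presumably feed the higher-frequency, more individual parts of the model, which is exactly what a simple sphere can’t capture - and presumably why he says over-ears get to lean on your real pinna for the rest.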
Wrt. room contributions, it’s supposed to be based on some synthetic “perfect room”, not on any measurement taken in a specific real room, but for a “perfect room” it seems excessively reverb-y and diffuse.
Unfortunately there’s no function provided to adapt it to arbitrary headphones; it only supports a list of models already measured by him, so my best immediate bet was to test the HE-400SE settings through my HE-400i. I can echo other user comments that the results seem unpleasantly diffuse in the center, and only sound crisp in a couple of lobes angled where the speakers should be (30 or 45 degrees off-center, depending on your selection). Not much improves when you switch to the ambiophonics-inspired “crosstalk-free” variant. This haze in the center kinda ruins the most important part of the “forward and out of your head” effect - what Griesinger calls “auditory proximity” - even while other qualities of the result do suggest a pulled-back presentation from sources away from you. But maybe with further improvements as the user feedback keeps coming in…
I tried a few settings but the music always sounded behind me.