It’s less about there being an ideal response and more that the existing product market is meaningfully flawed from top to bottom. Letting those flaws dictate our analysis of other products both bakes unnecessary imperfections into the process and puts a quality ceiling on things, which can lead manufacturers to stop aiming higher.
This is exactly what happened with IEMs and the IEF Neutral 2020 target. Crin made a target with a midrange influenced by how IEMs tended to measure, and it served as a de facto standard for almost a quarter of a decade. The target was popular and IEM manufacturers tuned to it fairly commonly, and while that made things “decent” more often than not, it also meant things never really went past that level; the IEM market quickly stagnated in a somewhat mediocre place.
That was specifically to illustrate that averaging OE headphones for a target that’s also used for IEMs is a bad idea, but I think doing the same for headphone targets is a bad idea too, for the reasons raised above.
It’s hard to do a truly blind test with geometrically different headphones, given that the physical differences felt by the listener may bias their impressions even if they’re not familiar with the headphone. Something like Sean Olive’s “virtual headphone” methodology, where you simulate the FRs of the other headphones with a single headphone, would likely be better, but you’d want to do so with blocked-canal microphones IMO.
Oh sure, that wasn’t really up for debate though, I don’t think? I think we all agree on averaging being a useful tool in the toolkit, just depends on the use case.
I’m sure there are folks already doing scans for this purpose, including probably some developers.
If you could start with an IEM that measures well on an average head, ears, and canal, though, then maybe you’d have better luck making some personalized tweaks from there. It seems like an uphill battle to me, though.
I’ve been looking at a lot of papers on this over the last week. Yes, people are working on it, but almost all of them are doing so in the context of binaural audio reproduction. The others doing research on HRTFs are working from in-ear mics and/or scans of the outer ear; the canal is usually ignored.
There are a couple of papers where they take scans of people’s inner and outer ears, put them in a database, group them into subtypes, and calculate an HRTF for each subtype. Then they scan the user’s ears with a phone app, see which subtype they’re closest to, and apply that subtype’s HRTF to the audio output.
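To make that pipeline concrete, here’s a minimal sketch of the subtype-matching idea, assuming each ear scan has already been reduced to a small numeric feature vector (e.g. pinna and canal dimensions) and each subtype has a precomputed HRTF. All function and variable names here are hypothetical, not taken from any of the papers:

```python
import numpy as np

def build_subtypes(scan_features: np.ndarray, k: int, iters: int = 100) -> np.ndarray:
    """Group ear-scan feature vectors (one row per subject) into k subtypes
    with plain k-means; each centroid stands in for one subtype."""
    rng = np.random.default_rng(0)
    centroids = scan_features[rng.choice(len(scan_features), size=k, replace=False)]
    for _ in range(iters):
        # Assign every scan to its nearest centroid.
        labels = np.argmin(
            np.linalg.norm(scan_features[:, None, :] - centroids[None, :, :], axis=2),
            axis=1,
        )
        # Move each centroid to the mean of the scans assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = scan_features[labels == j].mean(axis=0)
    return centroids

def pick_hrtf(user_features: np.ndarray, centroids: np.ndarray, subtype_hrtfs: list):
    """Match a phone-app scan of the user's ear to the closest subtype and
    return that subtype's precomputed HRTF for use on the audio output."""
    nearest = int(np.argmin(np.linalg.norm(centroids - user_features, axis=1)))
    return subtype_hrtfs[nearest]
```

The personalization is only as fine-grained as the clustering, which is presumably why the papers report it as better than a generic HRTF but still short of a fully individual one.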
No one’s doing what audiophiles ultimately want, which is complete personalization of the sound using the ear canal (for IEMs) or the entire ear (for over-ears). Actually, hardly anyone is doing this kind of research for stereo playback at all. If you do come across such papers, let me know; I’d love to read them!
Well yeah, I mentioned that.
But doesn’t Dr. Olive’s methodology kind of have a different purpose?
I’m assuming he did that to see how closely they could ‘simulate’ other headphones by tuning a specific headphone, and whether it would have the same characteristics.
What I’m saying is that if FR is most of it, which I think is the claim here, then we should be able to check, in maybe an inverse manner, whether soundstage is indeed frequency response.
I haven’t read Dr. Olive’s conclusions. But let’s say: does the soundstage follow the FR of a specific headphone?
If the HD800’s FR were simulated on another headphone, would the soundstage it has follow?
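If I understand the virtual-headphone idea correctly, the FR side of that experiment is just a difference curve: measure both headphones on the same rig, then EQ the host headphone by the dB gap between the two measurements. A minimal sketch of that, assuming both measurements are in dB on a shared frequency grid (all names and values are hypothetical):

```python
import numpy as np

def simulation_eq_db(target_db: np.ndarray, host_db: np.ndarray) -> np.ndarray:
    """dB gain curve that, applied to the host headphone, makes its measured
    FR match the target headphone's FR (e.g. an HD800) on the same rig."""
    return target_db - host_db

# Example: where the target measures +3 dB and the host -1 dB,
# the EQ has to add 4 dB.
freqs = np.array([100.0, 1000.0, 5000.0])   # Hz
hd800_db = np.array([3.0, 0.0, -2.0])       # hypothetical measurement
host_db = np.array([-1.0, 0.5, 1.0])        # hypothetical measurement
print(simulation_eq_db(hd800_db, host_db))  # 4.0, -0.5, -3.0
```

So if soundstage were purely FR, a host headphone EQ’d this way should inherit the HD800’s soundstage; if it doesn’t, something beyond FR (at least as measured on that rig) is in play.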
For most of the papers, they either calculate an average using previously made scans, or assume some average value and call it good enough. Some of them took actual scans of the test subjects, custom-made HRTF profiles for them, and compared those to a more generic HRTF based on all kinds of factors. The conclusions usually say that having an EQ’d HRTF is better than none, and that custom ones are better still, though not by a huge amount.
Mind you, most of them are trying to get HRTF correct “enough” for binaural playback for virtual reality in particular, and they don’t use IEMs for playback either (usually small speakers on the headband). So none of that research is really all that applicable for audiophiles, since some of the treble wobble is due to wave cancellation inside the ear canal, or in the case of over-ear headphones, a combination of the ear canal and the coupled air space inside the ear cup. None of those effects apply if you use small speakers on a headband, which they call an FEC, or free-air equivalent coupling, type “headphone”. Basically, none of the headphones we use are open enough, other than some old Stax models that are literally speakers without ear cups.
The point here is not about the merits of using averages to indicate things in general; rather it’s about committing what’s called the is/ought fallacy (or naturalistic fallacy), which is that you can’t derive an ‘ought’ claim from an ‘is’ claim.
So particularly when it comes to the products being considered for evaluation, where we’re trying to determine if something is good, it makes no sense whatsoever to say “they do perform this way on average, therefore they should perform this way”. That’s also just saying they should adhere to the norm, not whether the norm is any good, and the latter is the question we’re really after here.
Even if you’re selective about them and your standard is “we preferred these headphones”, that still commits the same fallacy, and it’s quite different from what Dr. Olive did. You could have a whole bunch of horrible-sounding headphones and some would still be most preferred. An average of those is not a good indicator of how these products should perform.
And as Listener mentioned, the use of averages to indicate behavior for a given condition isn’t really the issue. Nobody is saying “headphones should behave this way on a rig”; that’s just describing how they vary.
Harman is scanning large numbers of heads, presumably to assist in their development of headphones and other audio devices, where analysis of in-ear measurements is needed. I don’t know how much has been made public though. Or what papers they may be planning. Or how far down the canal they go.
Floyd Toole had an agreement with Harman to make his research public. I think Harman has been less willing to share all their research though since he left, based on some of Doc Olive’s comments.
Can’t comment on the papers Luke mentioned, but I believe the redesigned ear canal simulator on the HBK 5128 rig, which is both longer and much more human-like in shape than previous couplers like the 711, is an average of a number of scans of different test subject canals.
I’ve posted this video many times before in similar contexts (on other forums), but Jude explains some of the important differences here…
I think I understand this, Resolve. And understand some of the concerns you and listen_r have raised re the potential misuses of averaging in this context. I think Tyll H also had some similar concerns about approaches like this. Perhaps this would be a good topic for another discussion at another time and place.
I stand by my opinion, though, that averaging can be a useful statistical/analytical tool in the right or appropriate context.
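As a concrete example of the kind of appropriate use I mean: averaging repeated measurements of the same headphone across reseats to suppress placement variance is entirely uncontroversial, unlike averaging across different products to derive a target. A minimal sketch, assuming measurements in dB on a shared frequency grid (names and values hypothetical):

```python
import numpy as np

def mean_fr_db(measurements_db: np.ndarray) -> np.ndarray:
    """Average several FR measurements (one row per reseat) taken on the
    same rig. Averaging in dB equals the geometric mean of the linear
    magnitudes, which is common practice for smoothing reseat variance."""
    return measurements_db.mean(axis=0)

# Five reseats of one headphone at three frequency points (dB).
reseats_db = np.array([
    [2.0,  0.1, -4.0],
    [2.2,  0.0, -3.1],
    [1.8,  0.2, -5.2],
    [2.1, -0.1, -3.8],
    [1.9,  0.0, -4.4],
])
print(mean_fr_db(reseats_db))  # 2.0, 0.04, -4.1
```

Same tool, different question: here the average estimates one headphone’s behavior, rather than telling other headphones what they ought to be.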