To kick this off, I’ll port over a recent post on imaging I made:
How I Define Imaging (and thoughts on how the phenomenon occurs)
Like many terms slung around in the audio world, imaging has as many definitions as there are opinions; one won’t find a concrete definition of what constitutes it. Most already know that I don’t think most IEMs have good imaging. I’m not alone in this sentiment either, with many in my audio circle holding similar opinions. So here, I’ll try to outline what I am listening for when I assess imaging, and why the vast majority of IEMs - and even headphones - are mediocre in this respect to my ears. Do understand that what I outline here is my interpretation only, and I’m always working to understand more about this stuff. Important terms are in bold.
First, I’m no expert, but I can tell you imaging is influenced a good deal by the specific tracks you’re listening to. Some tracks are better for imaging than others by virtue of how they have been mastered. I will not explore this variable further, and will assume that we are using the “best” tracks for imaging. Imaging itself can be broken down into several subsets. However, at its core, it refers to the extent to which a transducer is able to shape the perception of the “room” around the listener. So by extension, soundstage is a derivative of imaging rather than something distinct from it. Another commonly cited subset of imaging is **positional accuracy**. This is simply the degree to which a transducer is able to localize instruments on the soundstage and, in turn, the degree to which a listener can pinpoint them. This overlaps with **layering**: the space, or sense of physicality, between instruments on the stage. The two are not the same thing, though. A headphone like the HD6XX, for example, has pretty terrible imaging despite sounding reasonably “open” in terms of layering. Some will also wonder about terms like holographic or “3D” imaging. I dislike these phrases, and they’re slung around far too generously in my opinion. As I understand them, they describe the perception that instruments - usually percussive ones - “float” on the soundstage. By extension, this plays into soundstage height and the way an IEM shapes the walls of the stage. The IEMs that qualify as holographic to me are few and far between.
A phrase that I use quite often in my writing is **center image**. As the phrase implies, this is the field of sound that comes from dead center in front of a listener. Within the context of headphones and IEMs, it is a psychoacoustic illusion that arises from hearing two channels in conjunction. There are many IEMs that may have a center image but cannot project it. This is most obvious to me when focusing on the positioning of vocals on the stage. Transducers that are able to project the center image create what I perceive as soundstage depth. So you can imagine that most IEMs I have heard have little - or no - soundstage depth. Hell, even many headphones I’ve heard do not have soundstage depth. The HD800S (the lauded king of soundstage, by the way) is a prime example; vocals sound like they are coming - positionally - from inside my head. In general, I would say I put a strong priority on center image when assessing imaging.
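For anyone curious about the "two channels in conjunction" part, here is a minimal sketch of how a centered source exists in a stereo signal at all: it is simply the content that is identical in the left and right channels. The function names, the constant-power pan law, and the pan range of -1 to +1 are assumptions chosen for illustration. This only shows where the center image lives in the signal; it says nothing about whether a given transducer can project it.

```python
import math


def constant_power_pan(sample, pan):
    """Split a mono sample into (left, right) using a constant-power pan law.

    pan is assumed to run from -1.0 (hard left) through 0.0 (center)
    to +1.0 (hard right).
    """
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0 .. pi/2
    return sample * math.cos(angle), sample * math.sin(angle)


def mid_side(left, right):
    """Return (mid, side): what the channels share and where they differ."""
    return (left + right) / 2.0, (left - right) / 2.0


# A source panned dead center ends up equal in both channels,
# so its side (difference) component is essentially zero.
center_left, center_right = constant_power_pan(1.0, 0.0)
center_mid, center_side = mid_side(center_left, center_right)

# A source panned hard left lives in one channel only,
# so mid and side carry equal amounts of it.
hard_left, hard_right = constant_power_pan(1.0, -1.0)
left_mid, left_side = mid_side(hard_left, hard_right)
```

With a vocal mixed dead center, both drivers receive the same signal, and the brain fuses the two into a single source between the ears or in front of the head.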
Some will wonder about the correlation between imaging and frequency response. There is definitely a strong one, particularly with respect to treble. In my experience, no IEM that has had good imaging has had poor treble extension. But the converse does not hold true; I have heard many IEMs with excellent treble extension and unremarkable imaging. So what accounts for this distinction?
First, I want to plug an excellent point brought up by luisdent after I made this post, and that is our expectation of what constitutes a real-life listening environment. In real life, there are “reverb trails, room ambiance, positional precision, etc. When extension suffers, our ears can tell something is off. We hear high frequencies in room ambiance that we may not realize, and all instruments have interactions at that region and others as well. This allows the realism of the recording to come through which is heard as amazing depth or soundstage”. That aside, it can be hard to say what accounts for the distinction due to confounding variables. For example, standouts in imaging like the Andromeda 2020, Tia Trio, and Tia Fourte all make use of acoustic chambers. But with IEMs like the U12T and the Ikko OH10, we can isolate some of these variables. I suspect “good” imaging with these IEMs occurs not necessarily due to sheer extension, but rather due to the contrast between their post-10kHz dips and ~15kHz peaks. This contributes to a unique treble reverb and, by extension, to the way the stage is imaged. Similarly, dips in frequency response can contribute to the perception of more open staging. The most obvious way this is achieved is by dipping 3-4kHz to make the upper-midrange sound more distant, thus increasing perceived depth. Most of my favorite IEMs tend to make use of this tuning trick.
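To put rough numbers on the 3-4kHz dip described above, here is a small sketch that models such a dip as a single peaking EQ band, using the widely circulated Audio EQ Cookbook biquad formulas. The 3.5kHz center, the -4 dB depth, the Q of 1.5, and the 48kHz sample rate are all assumptions picked for illustration; they are not measurements of any IEM mentioned here, and real IEM tuning is done acoustically rather than with a digital filter like this.

```python
import cmath
import math


def peaking_eq_coeffs(f0_hz, gain_db, q, fs_hz=48000.0):
    """Biquad coefficients (b, a) for a peaking EQ band (Audio EQ Cookbook form)."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp)
    a = (1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp)
    return b, a


def gain_db_at(freq_hz, b, a, fs_hz=48000.0):
    """Magnitude of the filter's response at freq_hz, in dB."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(num / den))


# Hypothetical upper-midrange dip: -4 dB centered at 3.5 kHz.
b, a = peaking_eq_coeffs(f0_hz=3500.0, gain_db=-4.0, q=1.5)

dip_center_db = gain_db_at(3500.0, b, a)  # full depth of the dip
bass_db = gain_db_at(100.0, b, a)         # far from the dip, barely touched

for freq in (100, 1000, 2500, 3500, 5000, 10000):
    print(f"{freq:>6} Hz: {gain_db_at(freq, b, a):+.2f} dB")
```

Applying a band like this in a parametric EQ is one way to hear for yourself whether pulling back that region makes vocals sit further away on a set you already own.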
I want to stress that this is neither intended to be an authoritative post on imaging, nor need it apply to how you personally perceive imaging. This is simply how I understand it, and hopefully, this helps explain why I think very few transducers have good imaging overall.