How to compare amps more fairly - my experience

Last week I made a headphone amp comparison between four different amplifiers: RNHP, SP200, G103 and G111. I concluded that those headphone amplifiers sounded the same when volume matched. If you would like to know how I came to this conclusion, please read about my test here.
However, in this post I am going to share with you the lessons I learned from this test.
I learned about how to make headphone amp comparisons fairer and more accurate.
If that sounds interesting to you, please read along.

Generally, when comparing two sides, it’s important to give both sides equal chances.
However, two major problems keep us from doing that:
A.) Our bias
Bias is an inclination or prejudice for or against an entity, which may result in unfair treatment of that entity. We have to identify those biases and be self-reflective! Here are three examples:
-Price-value bias: The assumption that the more expensive amp always sounds better, so the listener subconsciously favours the more expensive amp and prematurely marks the cheaper amp as inferior. The Lake People G111 is more expensive than its older sibling, the G103-S, so we tend to think of the G111 as more valuable and better sounding.
-Reputation bias: The reputation of an entity leads us to perceive it the same way. For example:
The RNHP has a reputation for having an organic sound, which leads us to perceive the RNHP as more organic sounding. This process can also be found in the concept of conformity.
-Appearance bias: We associate the appearance of an entity with certain characteristics. For example, a modern-looking, high-tech style amplifier could be perceived as analytical and harsh, while an old-school-looking amplifier could be perceived as organic and lifelike. You can link this with the psychology of shapes or colours.

B.) Unequal circumstances
Unequal circumstances can lead to one party gaining an unfair advantage over the other. We have to arrange a level playing field to give the amps equal chances.
To arrange everything fairly, it’s recommended to set all amps to their best type of inputs/outputs and then volume match them. The key factor here is volume matching!
A volume level mismatch between amps can lead to inaccurate judgements of their sonic characteristics. For example: Imagine two identical sounding amps with a slight volume level difference. The louder of the two would be perceived as brighter and more detailed sounding, while the quieter amp would be perceived as warmer and more bassy sounding. Solely due to a slight volume level mismatch between identical amplifiers, the sound can be interpreted differently for each of them! If you want to read more about volume matching, please read my test, where I found no differences between amplifiers when volume matched.

C.) To conclude: The sound we perceive from an amplifier can be summed up in the following equation:
perceived sound quality = true sound of the headphone amplifier + bias (price, reputation, looks) + volume level mismatch
So, in order to perceive the true sound of the headphone amplifier,
we have to get rid of bias and volume level mismatch.

I hope that headphone amp comparisons will be done more fairly and with less bias in the future,
so the audiophile world can be more accurately represented to others.
Thank you for reading!
Sincerely, nhatlam96.

10 Likes

Very interesting test @nhatlam96. In some ways, I’m surprised but in others, not so much. I just have a few comments that help me make sense of what happened and why there is no perceivable difference between the different amps.

  1. Your source is Spotify -> SMSL Sanskrit. Even though Spotify is a great and convenient source of music, it’s not known to be the highest resolution available. If you are comparing whether these amps make sense to you and your normal scenario is just using Spotify, then it makes sense to continue using Spotify as your source. However, if you wanted to make the test more general and objective, I would suggest trying to acquire the best sources available so that you don’t have bottlenecks in other areas of your chain. While I don’t have experience with the SMSL, I can only hypothesize that the DAC isn’t quite revealing enough either.

  2. All of your amps kind of aim towards the same target audience. Both Lake People and RNHP are studio brands and I would expect them to produce a very neutral sound as a tool for studio work rather than musical enjoyment. I think it would be detrimental for a studio brand to have a colored amp when compared with other studios. The SP200 is kind of the odd man out but SMSL/Topping is striving to produce the most distortion-less amps out there so I doubt they are designing with musicality in mind either.

  3. Your amps are also all in what I would consider “budget to mid-fi” area as well. I would be interested to see a larger range of amps to see if that would make a difference. I think that area has so much competition and most of the consumers currently in that area want a “neutral” and “distortionless” amp so it makes for a very boring selection. If you were to throw in the Asgard 3, I wonder if that would change it up a bit or something like the V281.

Anyways, thanks for the writeup! It’s an interesting perspective to ponder.

13 Likes

Welcome to the forum. Nice work and writeup. I also believe bias is a potential issue and that there is value in blind listening (and tasting for liquids).

Volume matching also levels the playing field for a more valid comparison.

4 Likes

I’d suggest trying out the Asgard 3 along with one of those amps if you have the opportunity to. As @driftingbunnies stated, the amps you do have on hand have a neutral sound signature.

I also do believe that we definitely need to push blind testing more and isolate bias as much as we can.

2 Likes

Great test. Thanks. I think volume matching is the most significant variable and one that many fail to achieve when comparing.
For me the expectation bias is easy to nullify w/o resorting to blind testing. Simply be willing to be open to surprise - once you have an experience of that you can find & maintain that attitude again.

3 Likes

Of course, without actual blind testing … such comparisons are still subject to expectation bias (which works both ways), which significantly devalues the process. And volume-matching needs to be done empirically (i.e. at least with the test signal and multi-meter … doing it by ear is far too error prone).
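
For anyone who wants to do the multimeter check, the arithmetic could look roughly like this. It is only a sketch: it assumes you play the same test tone through both amps and read the AC RMS voltage at each output, and the 0.1 dB tolerance and the example readings are illustrative assumptions, not figures from this thread.

```python
import math


def level_difference_db(v_a: float, v_b: float) -> float:
    """Level of amp A relative to amp B in dB, from two RMS voltage readings."""
    if v_a <= 0 or v_b <= 0:
        raise ValueError("voltage readings must be positive")
    return 20.0 * math.log10(v_a / v_b)


def is_matched(v_a: float, v_b: float, tolerance_db: float = 0.1) -> bool:
    """True if the two readings are within tolerance_db of each other.

    The 0.1 dB default is an assumed target, not a standard quoted in the thread.
    """
    return abs(level_difference_db(v_a, v_b)) <= tolerance_db


# Hypothetical readings in volts RMS, taken with the same test tone on each amp.
reading_amp_a = 1.000
reading_amp_b = 1.059

diff = level_difference_db(reading_amp_a, reading_amp_b)
print(f"Amp A is {diff:+.2f} dB relative to amp B")
print("matched" if is_matched(reading_amp_a, reading_amp_b) else "adjust the volume and measure again")
```

A negative result means amp A is the quieter one; nudge its volume up (or amp B down) and repeat until the readings fall inside the tolerance.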

14 Likes

Proposed corollary: the less (but still) enjoyable gear (due to worse relative sound quality) must be played at higher volume compared to the more enjoyable gear in order to be enjoyed.

3 Likes

There are dozens of cognitive biases, and NONE of them can be dismissed by conscious effort. These biases are unconscious. It is because of these biases that blind testing is required. There is no other way to achieve an objective evaluation of any product, whether it be audio components or wine. The subject is well-known and has been widely studied for decades.

In this particular test, blind testing was unnecessary because no discernible differences were noted, despite being a sighted test. But it was not a proper test nonetheless.

3 Likes

Is the conclusion that blind testing was unnecessary here absolutely true? Is it possible that results could have been different if blindly tested?

4 Likes

You should try different kinds of amps! Class A/B amps like the Asgard 3 and Lyr 3 sound forgiving and full bodied with a bit of warmth. I love them despite regretting getting both. Choose one or the other. Passion for Sound did a really great review for the cheaper one. I’ve never heard a pure class A amp, but some mention them being slightly euphonic. Then going to OTL changes the listening experience completely. I only own one and have just as many things to say about its sound as I do about any headphone. :laughing:

I’ve come to accept there are many people here reviewing gear way outside my price range that are worthy of complete trust as they’ve shared their stories of upgrading over the months or years. Really thankful to be in a community where they’re so open about it. In my case the upgrades were very noticeable every time I made a jump.

7 Likes

@nhatlam96 contacted me via PM at ASR regarding a post I made there comparing RNHP and the Monoprice 887 THX amp. He asked if I’d mind if he quoted me here, but since I’ve already participated in some other discussions on these forums (mostly posting pics of my gear), I figured I’d join the conversation and share here what I shared in our conversation.

For context, my response here was in regards to nhatlam’s report of hearing no material difference across 4 amps using Focal Clear:

–

My background is in engineering, but I appreciate a (true) scientific approach to learning about HiFi–one where experience and measurements align. I’ve found the peanut gallery at ASR is sometimes hostile to any kind of sharing of subjective experiences. Meanwhile, there are plenty of other sites where people have no background and no interest in engineering or measurements and perform naive, flawed comparisons and jump to poor or hasty conclusions about various pieces of gear.

There are a lot of factors that can color opinions, volume matching being a major one. I recently got a Holo Audio May DAC and integrated it into my system with my existing Ayre QB9-DSD DAC and noticed the May was obviously fuller and more detailed, which immediately got me wondering about output voltage of the two DACs. Sure enough, the May outputs a significantly stronger signal; I was able to attenuate that in my preamp to match the level of the Ayre and–surprise!–the sound of the two DACs became much closer to one another.

Some other factors that can color perceptions include bottlenecks in the audio chain, the unreliability of listening memory, mood, fatigue, and even just where you focus attention while listening. I like to do repeat listens to the same tracks on the same day and also revisit on more than one day to try to counter some of these effects. Real differences ought to be repeatable.

I have found some interesting differences between amps over time, and I’ve been able to correlate those with measurements. For example, as Amir noted in the RNHP review, the Ether CX does not play well with that amp. Ether CX is a planar with very low impedance and somewhat low sensitivity–it needs a lot of current to drive well, and RNHP is not great at delivering that, but Topping A90 is. Meanwhile, at the other end of the spectrum, the 600 ohm Beyerdynamic DT 880 is also somewhat insensitive and needs (anecdotally) something like 300-ish mA to be well controlled. When it is, it sounds very detailed and refined. Many amps, including A90, run out of steam to drive heavier loads. If you simply look at the power-to-dB formulas and listen, lesser amps can make the 880s loud enough, but extra power makes them sound amazingly clean for such inexpensive headphones. They are an altogether different headphone out of my Violectric V280 or Pendant (tubes, but powerful–Pendant can put a whopping 2.5A into 300 ohms!).
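
The power-to-dB arithmetic mentioned above can be sketched roughly as follows. This is a minimal sketch, assuming the sensitivity is given in dB SPL per milliwatt and treating the headphone as a purely resistive load. The function name and the example figures are hypothetical placeholders, not manufacturer specifications. It only estimates what is needed to reach a given loudness and says nothing about how well an amp controls the driver.

```python
import math


def drive_requirements(sensitivity_db_per_mw: float,
                       impedance_ohm: float,
                       target_spl_db: float) -> tuple[float, float, float]:
    """Return (power in W, voltage in V RMS, current in A RMS) for a target SPL.

    Assumes sensitivity in dB SPL at 1 mW and a purely resistive load.
    """
    if impedance_ohm <= 0:
        raise ValueError("impedance must be positive")
    # Every 10 dB above the 1 mW reference level needs ten times the power.
    power_w = 10 ** ((target_spl_db - sensitivity_db_per_mw) / 10) / 1000
    voltage_v = math.sqrt(power_w * impedance_ohm)   # from P = V^2 / R
    current_a = math.sqrt(power_w / impedance_ohm)   # from P = I^2 * R
    return power_w, voltage_v, current_a


# Hypothetical example loads (illustrative numbers only).
examples = {
    "high-impedance dynamic (600 ohm, 96 dB/mW)": (96.0, 600.0),
    "low-impedance planar (23 ohm, 92 dB/mW)": (92.0, 23.0),
}

for name, (sensitivity, impedance) in examples.items():
    p, v, i = drive_requirements(sensitivity, impedance, target_spl_db=110.0)
    print(f"{name}: {p * 1000:.1f} mW, {v:.2f} V, {i * 1000:.1f} mA")
```

The two loads stress an amp differently: for the same loudness, the high-impedance load asks mostly for voltage swing, while the low-impedance load asks mostly for current.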

Anyway, those are some of my ongoing experiences with subjectively evaluating amps and amp / headphone combos. Focal Clear is a relatively low-impedance / high-sensitivity headphone that operates in a range that many headphone amps do well. It’s not too surprising to me that you’re not hearing differences–that makes sense to me given how relatively easy they are to drive.

–

Since this is now a broader conversation than the original PM, I’ll offer further that my engineering background is in ASIC verification. I prefer to summarize my job as “risk mitigation.” I need to evaluate an ASIC design and, nominally, make sure it works and try to ensure it won’t break when unexpected things happen. That means taking a hard look at what variables affect the system and finding ways to account and control for those variables.

There are a great many variables involved in evaluating listening chains. While experience has shown that headphones, IEMs, and speakers (usually) have the most dramatic effect on how a given chain sounds, you’re also listening to source music, DACs, amps, cables and interconnects, and a variety of potential deleterious effects. Plus, of course, all of your personal biases and potential differences in your hearing.

And then, on top of all that, there’s the challenge of trying to find the language to relate auditory experiences in a way that both parties agree that they heard the same thing. None of this is easy.

Anyway, that’s how I approach evaluating listening chains for differences. I do appreciate rigorous standards like double-blind testing, though it’s not always practical to implement testing like that, it doesn’t fully control variables like mood / fatigue / listening memory, and it’s easier to detect repeatable differences with experience and with familiar recordings.

So, on the whole I tend to be somewhat conservative about reporting differences unless I feel pretty sure about them. While I tend to find repeatable differences somewhat easier with headphones, they’re often a lot harder with DACs and amps. And, at a certain point, if I have to struggle so much to be certain of differences between listening chains, those differences aren’t going to be very important when I hang up the critical listening and just try to enjoy the music.

12 Likes

Here is a very good article on amplifiers and their sound, written by an engineer with a lifetime of knowledge and experience in the industry: https://sound-au.com/amp-sound.htm

5 Likes

The reason I am sticking with Spotify is because I was not able to distinguish between lossless and lossy reliably. Have you tried it? The files are volume matched and the comparison is done by ABX method. So there is no room for bias and volume level mismatch.
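
If you want to score an ABX run yourself, a rough way is to ask how likely your number of correct answers would be from pure guessing. The sketch below assumes each trial is an independent 50/50 guess and uses a one-sided binomial test. The function name and the example score are made up for illustration, and the usual 5% cutoff is a convention, not something from my test.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right by guessing alone.

    Assumes independent trials with a 50% chance of a correct guess each.
    """
    if trials <= 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials and trials > 0")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical run: 12 correct out of 16 trials.
p = abx_p_value(12, 16)
print(f"p = {p:.4f}")
print("unlikely to be guessing" if p < 0.05 else "consistent with guessing")
```

A small p-value only says the score is unlikely under guessing; a large one does not prove that no difference exists, only that this run did not show one.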

My SMSL DAC has a SINAD of 114 dB. If someone is able to differentiate between DACs above 100 dB SINAD, then he would also be able to differentiate between amps above 100 dB SINAD. However, I would then doubt that his comparison was free of bias or volume level mismatch.
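
For a sense of scale, a SINAD figure can be converted into an approximate noise-plus-distortion percentage. This is a minimal sketch that treats SINAD as the inverse of THD+N expressed in dB, which is a close approximation at values this high. The function name is my own, and the conversion says nothing about audibility by itself.

```python
def sinad_to_percent(sinad_db: float) -> float:
    """Approximate noise-plus-distortion level, as a percentage of the signal.

    Treats SINAD (dB) as the inverse of THD+N, an approximation that holds
    closely when SINAD is high.
    """
    return 100.0 * 10 ** (-sinad_db / 20)


# The two figures mentioned above: the 114 dB DAC and the 100 dB threshold.
for sinad in (114.0, 100.0):
    print(f"{sinad:.0f} dB SINAD is about {sinad_to_percent(sinad):.5f} % noise plus distortion")
```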

Coloured or distortionless sound is a matter of personal preference, because musicality is subjective.

The V281 has a SINAD of 111 dB and the Asgard 3 has a SINAD of 109 dB. They both lie in-between SP200 117 dB SINAD and RNHP 101 dB SINAD. Since the SP200 and RNHP already sounded the same, then logically the V281 and Asgard 3 should also make no difference.
Out of love and curiosity for this hobby, I will definitely try a volume matched comparison between V281 and SP200 in the future and update you guys! I just hope that bias will not lead me to false conclusions. For example:
Many people say that the Asgard 3 and V281 make the sound warm, which could lead to reputation bias. The V281 looks big, so it must sound musical → appearance bias. The V281 costs 1400€ and is a lot more expensive, so it must be an upgrade → price-value bias. Biases like these may distort the perception of sound, and I will do my best not to fall for them so that I can give a fair comparison.

This is true. It’s a notorious trick from hifi audio stores. Just dial up the volume slider, then it will sound better. Luckily I did volume matching in my test, so that wasn’t a problem. It’s good that you are pointing out that volume level mismatch is a real thing.

4 Likes

Suspending judgement is conscious & willful.

These “devices under test” are made for personal & subjective use & enjoyment, not scientific instruments for measurement.

2 Likes

Wouldn’t you say that the fact that you said “logically the V281 and Asgard 3 should make no difference” is another bias? You’re already assuming, on the basis of one measurement, that it is the main factor that determines the sonic qualities of an amp.

So let’s remove the V281 since it has reputation bias, appearance bias and price-value bias. Why would the Asgard 3, which is within the same budget as all your other amps and is relatively the same size, be considered warm and the others neutral? Is it truly just a reputation bias or is there something else there? How can you assume that everyone who has an Asgard 3 has fallen into the trap of reputation bias?

Wouldn’t it be fair to include all different types of amps (e.g. warm, neutral, dry, bright, dark, etc.) in your comparison if you truly wanted to prove that all amps are the same?

7 Likes

It is possible for 2 pieces of gear with equal (or with immaterial/tolerable variances) specifications or measurements to sound different.

Eliminate luck or chance where appropriate! I kid, perhaps you meant fortunately and not [luckily in the context of matters of chance], as you don’t strike me as someone who would.

3 Likes

Just want to confirm what I just read. It’s being preached that amps all sound the same and DACs sound the same? And if we hear differences we are listening with our wallets and eyes?

10 Likes

I’m not going to comment or touch this with a 10 ft pole.

12 Likes

Wasapi exclusive mode on Qobuz is great. Even better paired with Roon. Give it a shot!

5 Likes

Love this topic! Very relevant.

I pretty much avoid any type of comparisons because of the subjective part of them. If I cannot make sure that I am making a fair comparison, then I will avoid it as a whole.

Here is a very lengthy controversial article that covers our bias and various different topics, I think it’s very close to the OG topic: https://nwavguy.blogspot.com/2012/04/what-we-hear.html

The part that I found particularly interesting is where it says our audio memory starts to degrade after 0.2 seconds. But the article as a whole is pretty brave.

3 Likes