Why blind listening tests are flawed


This may sound like pure flame war bait - but here it is anyway. Since rebuilding my system from scratch, and auditioning everything from preamps to amps to DACs to interconnects to speaker cable etc., it seems clearer than ever.

I notice that I get easily fooled between bad and great sounding gear during blind auditions. Most would say "That should tell you that the quality of the gear is closer than you thought. Trust it".

But it's the process of blind listening tests that's causing the confusion, not a case of what I prefer to believe or justify to myself. And I think I know why it happens.

Understanding the sound of audio gear is a process of accumulating memories. You can listen to, say, new speakers for weeks and love them, until you start hearing something that bothers you and eventually you can't stand them anymore.

Subconsciously you're building a library of impressions that continues to fill in the blanks of the overall sound. When all the holes are filled - you finally have a very clear grasp of the sonic signature. But we know that doesn't happen overnight.

This explains why, many times, you'll love how something sounds until you don't anymore. Anyone experience that? I have - with all three B&W speaker upgrades I've made in my life, just to name a few.

Swapping out gear short term for blind listening tests is therefore counterproductive for accurately understanding the characteristics of any particular piece or system: it breaks the continuity of impression accumulation and becomes subtractive rather than additive. Confusion becomes the guaranteed outcome instead of clarity. In fact it's a systematic unlearning of the sound characteristics, because the impression accumulation is randomized. Wish I could think of a simpler way of saying that.

Ok, this is getting even further out there, but I also believe that when you're listening while looking at equipment, certain visual anchors accumulate too. You may hear a hi-hat that sounds shimmering, and subconsciously that impression gets associated with some metallic color or other visual aspect of the equipment you happen to be watching or remembering.

By looking at (or even mentally picturing) your equipment over time you have an immediate association with its sound. Sounds strange, but I've noticed this happening myself - and I have no doubt it speeds up the process of getting a peg on the overall sound character.

Obviously blind tests would void that aspect too resulting in less information rather than more for comparison.

Anyone agree with this? I don't remember hearing this POV before, but I'm sure many others have stated it because, of course, it happens to be true. ;)
larrybou
Ok, as I mentioned - there are two important considerations: level-matched components, and quick A/B switching with no time gaps. It can even be sighted, but those two things MUST happen in order to exclude variables that will cloud our judgement.

During the time it takes to swap out a component (and in even less time than that), your memory has failed you. There’s no possibility that you can remember the sound well enough to make an honest comparison. If you could, you would be an oddity, and very rich and famous. Suppose I played you a test tone - a single tone, not even anything as complex as a song - then very slightly shifted its pitch, unplugged the cables, plugged them back in again and played the new tone. You’d hear no difference. But if I quickly A/B’d them, everyone here would hear the very slight pitch difference.
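For anyone who wants to try the test-tone experiment, here is a rough Python sketch of it, not a calibrated listening test. The 1000 Hz and 1003 Hz frequencies, the 48 kHz sample rate and the helper names (`cents`, `sine`, `ab_sequence`) are all my own assumptions for illustration. It works out how big the pitch shift is in cents and builds a back-to-back A/B sequence with no gap between the two tones.

```python
import math

def cents(f_ref, f_new):
    """Pitch interval from f_ref to f_new in cents (1200 cents = one octave)."""
    return 1200 * math.log2(f_new / f_ref)

def sine(freq_hz, seconds, rate=48000, amplitude=0.5):
    """Sine tone as a list of float samples in the range [-amplitude, amplitude]."""
    n = int(seconds * rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def ab_sequence(f_a, f_b, seconds=1.0, rate=48000):
    """Tone A followed immediately by tone B, with no silence in between."""
    return sine(f_a, seconds, rate) + sine(f_b, seconds, rate)

# Hypothetical values: a 1 kHz reference and a very slightly sharper tone.
f_a, f_b = 1000.0, 1003.0
print("pitch shift in cents:", round(cents(f_a, f_b), 2))

# Samples for a quick A/B; writing them to a sound file or device is left out.
samples = ab_sequence(f_a, f_b)
print("total samples:", len(samples))
```

To mimic the cable-swap case, you would put a long stretch of silence between the two tones instead of joining them directly.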

I used to get tired of hearing about level matching. Level match this, level match that... However, when you experience how different something sounds when you change the volume (even just the smallest amount) you quickly realize how important it is to level match. Our hearing is not built to make comparisons without first making sure the levels aren’t skewing the test.
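To put a number on level matching, here is a minimal Python sketch, assuming you already have the two signals as lists of samples. The function names, the 1 kHz test tone and the 0.5 dB offset are made up for the example and not taken from any particular tool. It measures the level difference in dB from the RMS of each signal and scales one signal to match the other.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(a, b):
    """Level of b relative to a, in decibels (positive means b is louder)."""
    return 20 * math.log10(rms(b) / rms(a))

def match_level(a, b):
    """Return a copy of b scaled so that its RMS level equals that of a."""
    gain = rms(a) / rms(b)
    return [s * gain for s in b]

# Hypothetical signals: the same 1 kHz tone, with B boosted by 0.5 dB.
rate = 48000
tone_a = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]
tone_b = [s * 10 ** (0.5 / 20) for s in tone_a]

print("before matching (dB):", round(level_difference_db(tone_a, tone_b), 2))
matched = match_level(tone_a, tone_b)
print("after matching (dB):", abs(round(level_difference_db(tone_a, matched), 2)))
```

In a real comparison you would more likely measure the level at the speaker terminals or with a microphone, but the arithmetic is the same.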

I still don’t understand all of the pressure from an A/B test - I do it all of the time in the studio. Does A sound better than B? Well, let’s see: listen to A, listen to B - let’s go with B - next... I don’t sit in a session and live with a sound for a week, make a change and live with that one for another week. When you see the power in making quick, A/B comparisons, your ears will very easily tell you what’s right.

Sounds like you may not have trained your ears. If you need a blind test to hear the difference, then in my opinion the change is nothing special. It took me three years to train my ears to really hear how a component sounds or interacts with a system. Since building my own components and experimenting with parts swapping, I now have a pretty good understanding of how various parts change the sound, so I am not that quick to judge whether one is better than the other, but I can tell you what each part can and cannot do in general.

I also know that many people have not heard a component that really can make an improvement. I have found that the different manufacturers (the large majority of them at least) all have a similar sound, and each has its own sound, but nothing seems to do it all IMO. So when I read various comparisons, I see what I used to say myself before understanding how a better component can shape the sound.

For example, I use a direct heated triode preamp design. Only a few people on Audiogon have heard something like this and what it can do. No caps in the signal path. You may prefer the sound of something different, but most have not heard what a DHT component can do. I also have a switch that can change output resistors on the spot so you can hear what they sound like. Blind or not blind, you will hear the difference, and then it all comes down to your preference.

If you are familiar with two or three recordings, then you should be able to hear what a change in your system does almost immediately. Otherwise I am not sure you know what you are hearing or listening for.

My opinion. Happy Listening.
If you can't tell the difference with your eyes closed, then don't close your eyes when you listen.
Art Dudley has an interesting comparison in this month's Stereophile.

Two of the examples are brilliant.

a. you don't ask an art expert to determine whether a painting is genuine or a forgery using a blind ABX test.

b. a blind "sip" test showed Pepsi to be preferred in both Pepsi's and Coke's own tests - which caused Coke to launch the ill-fated New Coke - but there's a difference between a sip and a full can of the drink. While some may prefer Pepsi's (and New Coke's) sweeter taste in a sip, it's different when you drink an entire can.
The Coke/Pepsi/New Coke tests are a great way to illustrate the limitations of blind comparison tests. The only way you can really tell what people prefer is what they choose over the long run.

Interestingly, after 27 years of digital dominance, analog-chain LPs have been roaring back. Listeners are voting with their wallets, which is much more reliable than contrived short-term tests.

Short-term tests ignore the mechanism of mental schemas, whereby we build mental models of everything we sense. A short test ignores the mind's need to build schemas to understand constructions of various concepts (e.g., sonic signatures, musical values, etc.) and compare their virtues over time.