Why blind listening tests are flawed


This may sound like pure flame war bait - but here it is anyway. Since rebuilding my system from scratch, and auditioning everything from preamps to amps to DACs to interconnects to speaker cable etc., the problem with blind testing seems clearer to me than ever.

I notice that I get easily fooled between bad and great sounding gear during blind auditions. Most would say "That should tell you that the quality of the gear is closer than you thought. Trust it".

But it's the process of blind listening tests that's causing the confusion, not a case of what I prefer to believe or justify to myself. And I think I know why it happens.

Understanding the sound of audio gear is a process of accumulating memories. You can listen to, say, new speakers for weeks and love them, until you start hearing something that bothers you and eventually you can't stand them anymore.

Subconsciously you're building a library of impressions that continues to fill in the blanks of the overall sound. When all the holes are filled - you finally have a very clear grasp of the sonic signature. But we know that doesn't happen overnight.

This explains why many times you'll love how something sounds until you don't anymore. Anyone experience that? I have - with all 3 B&W speaker upgrades I've made in my life, just to name a few.

Swapping out gear short term for blind listening tests is therefore counterproductive for accurately understanding the characteristics of any particular piece or system, because it breaks the continuity of impression accumulation and becomes subtractive rather than additive. Confusion becomes the guaranteed outcome instead of clarity. In fact it's a systematic unlearning of the sound characteristics, as the impression accumulation is randomized. Wish I could think of a simpler way of saying that.

Ok, this is getting even further out there, but I also believe that when you're listening while looking at equipment there are certain anchors that accumulate too. You may hear a hi-hat that sounds shimmering, and subconsciously that impression gets associated with some metallic color or other visual aspect of the equipment you happen to be watching or remembering.

By looking at (or even mentally picturing) your equipment over time you have an immediate association with its sound. Sounds strange, but I've noticed this happening myself - and I have no doubt it speeds up the process of getting a peg on the overall sound character.

Obviously blind tests would void that aspect too, resulting in less information rather than more for comparison.

Anyone agree with this? I don't remember hearing this POV before. But I'm sure many others have stated it because, of course, it happens to be true. ;)
larrybou
I don't think Onhwy61 is reading the article right. He didn't say the art expert is analysing blind. He said, they don't make a conclusion based on small samples of an artwork when comparing real vs forgery.

But the conclusion is based on a long process that requires the total painting.

Thanks Doggieh, that makes more sense to me. I'll go back and look at the article again. I usually like Art D's writing, I just thought he was being shrill and wondered why he was even writing on the topic.

Let's not forget that magazines charge the equipment manufacturers for equipment reviews via advertising contracts. This is why overtly bad reviews are rarely seen (and when they happen you can be sure no such contract was signed).

What people perceive as sounding good is based on personal taste, and the only important part of a panel of experienced listeners "blindly" judging/reviewing a piece of gear is what they agree on. That can be useful information. I'm not sure what Learsfool means about how "we professional musicians learn" since he's unclear about what it is they (us) are learning...how to play in tune? How to get an "early music" tone? I've imposed my personal opinion on many thousands of listeners as a live sound mixer and musician, and if enough people told me I suck at it I'd stop or simply not get hired again, or maybe I'm simply lucky. I've had hifi gear (and guitar amps) start sounding not so good, greasing the way for that component's exit, due to the gear changing (it happens) or my taste changing or the gear simply sucking large chunks (a technical term) from day one. The quest for the "absolute sound" is silly if all you're looking for is validation of what might be your quirks (common among the wildly insecure), so it's always best to just hang with what feels right, and the absolutists can pound the collective sand.

10-22-14: Atmasphere
The need for blind tests illustrates to me just how little the bench measurements correspond to what we hear. If there was greater correlation, we would not need the blind tests at all.

Am I the only one that sees the irony?
I think the discrepancy between bench measurements and subjective impressions comes from the bench measurements being contrived by one mindset while we listen with a different one.

Consider: When did measurements get important? I say it was in the late '60s to accompany the introduction of solid state electronics. In the '50s and early '60s, did tube products from HH Scott, Fisher, Marantz, Dynaco, Heath, Eico, etc. specify bandwidth and THD? These measurements really came into vogue when solid state components started taking over the product mix in the late '60s. I think the engineers formulated the specs and the marketing people used the specs to convince the buying public that the SS components measured better and therefore sounded better. It became such dogma that the great offerings from C-J and ARC were laughed at for daring to allow a 1% THD.

But I haven't seen much serious challenge to the validity of these specs and measurements as they relate to human perception. Yet most standard measurements are in the frequency or input/output comparison domains, and few if any are in the time domain or amplitude domain. In other words, the specs are oriented toward sound, but not musical values.

Consider THD: It stands for *Total* harmonic distortion, intentionally lumping together even order and odd order harmonic distortion. Not too surprising, as tubes' THD favors even-order, which tends to enrich the sound. And the easy way to lower THD in an SS amp is to increase the number of feedback loops.

But every time you add a loop, you slow down the rise time, which affects the timing, a value that affects the rhythmic aspect of music but doesn't show up in test tones, the stock in trade of specs and measurements.
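
To make the THD point above concrete, here's a rough sketch in Python. It uses the usual textbook definition (RMS sum of the harmonic amplitudes divided by the fundamental). The two "amps" and their harmonic levels are made-up numbers for illustration only, not measurements of any real product, and the function names are just mine.

```python
import math

def thd(fundamental, harmonics):
    """Total harmonic distortion as a ratio.

    harmonics[0] is the amplitude of the 2nd harmonic,
    harmonics[1] the 3rd, and so on.
    """
    return math.sqrt(sum(h * h for h in harmonics)) / fundamental

def thd_even_odd(fundamental, harmonics):
    """Return (even-order THD, odd-order THD) for the same data."""
    even = [h for i, h in enumerate(harmonics) if (i + 2) % 2 == 0]
    odd = [h for i, h in enumerate(harmonics) if (i + 2) % 2 == 1]
    return thd(fundamental, even), thd(fundamental, odd)

# Hypothetical amplitudes relative to a fundamental of 1.0
# (2nd, 3rd, 4th, 5th harmonic). Assumed values, not real data.
amp_a = [0.01, 0.0, 0.0, 0.0]  # all distortion in the 2nd harmonic
amp_b = [0.0, 0.01, 0.0, 0.0]  # all distortion in the 3rd harmonic

for name, harmonics in (("A", amp_a), ("B", amp_b)):
    total = thd(1.0, harmonics)
    even, odd = thd_even_odd(1.0, harmonics)
    print(f"amp {name}: THD {total:.2%}  even {even:.2%}  odd {odd:.2%}")
```

Both hypothetical amps come out with the same single THD figure even though one is purely even-order and the other purely odd-order, which is the lumping-together I mean. Whether that difference is audible is a separate question the number alone can't answer.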

Another aspect of sound reproduction is resolution of small differences in amplitude, something sometimes called microdynamics. This resolution conveys the finer expressivity and interpretation of a musical piece, the kind that distinguishes Horowitz from a 2nd year piano major. Based on what I hear, this finer resolution is what distinguishes analog and tubes from digital and transistors. It's easy to feel the difference on voice or cello.

Case in point: I have the CDs of Rostropovich's Bach Cello Suites. When I played them for the first time, my wife found them incredibly irritating. Later I got the Starker LP reissue of the same works. It's one of her favorite recordings as long as it's analog. It's not really "ears" (she has tinnitus) or expectations (she had none); it's how the output affects your brain waves and emotional state.

If reproduced music doesn't create the intended emotional state, it's a big fail, no matter what the product's controlled listening tests, measurements, or spec sheets tell you.