Why Do So Many Audiophiles Reject Blind Testing Of Audio Components?


Because it was scientifically proven to be useless more than 60 years ago.

A speech scientist named Irwin Pollack conducted an experiment in the early 1950s. In a blind ABX listening test, he asked people to distinguish minimal pairs of consonants (like “r” and “l”, or “t” and “p”).

He found that listeners had no problem telling these consonants apart when they were played back immediately one after the other. But as he increased the pause between the playbacks, the listeners’ ability to distinguish between them diminished. Once the time separating the sounds exceeded 10-15 milliseconds (roughly 1/100th of a second), people had a very hard time telling obviously different sounds apart. Their answers became statistically no better than a random guess.

If you are interested in the science of these things, here’s a nice summary:

Categorical and noncategorical modes of speech perception along the voicing continuum

Since then, the experiment has been repeated many times (most recent major update in 2000: Reliability of a dichotic consonant-vowel pairs task using an ABX procedure).

So reliably recognizing the difference between similar sounds in an ABX environment is impossible. With a 15 ms playback gap, the listener's guess becomes no better than random. This happens because humans don't have any meaningful waveform memory. We cannot exactly recall the sound itself, and we rely on various mental models for comparison. It takes time and effort to develop these models, which makes us really bad at playing the "spot the sonic difference here and now" game.

Also, please note that the experimenters were using the sounds of speech. Human hearing has significantly better resolution and discrimination in the speech spectrum. If a comparison method does not work well with speech, it will not work at all with music.

So the “double blind testing” crowd is worshiping an ABX protocol that was scientifically proven more than 60 years ago to be completely unsuitable for telling similar sounds apart. And they insist all the other methods are “unscientific.”

The irony seems to be lost on them.

Why do so many audiophiles reject blind testing of audio components? - Quora
artemus_5

Showing 10 responses by dletch2


edgewound
67 posts
04-29-2021 12:21pm
Harman Int'l uses blind testing quite frequently to develop cost effective products that the market will consume.

Audiophiles reject blind testing out of fear. Fear of what? It's pretty obvious. The Oz syndrome.




Fear and ignorance.

steakster
1,141 posts
04-29-2021 12:48pm
There aren’t any equations for touch, smell, feel, hear or taste.


That would explain why the food industry places so much emphasis on tests equivalent to ABX testing, if not much more rigorous. They have whole societies and technical disciplines in place for the science of testing, and they use blind tests almost exclusively for taste. Pepsi Challenge, anyone?


djones51
3,869 posts
04-29-2021 3:12pm
It's really a depressing question. Why do so many people reject/fear science?

To quote Disney, "because when everyone is super, no one is super". Bonus points if you can identify the reference without Google.


Did you even read what you posted? Here, let me help!

In addition, the discussion emphasizes the usefulness of the ABX approach for testing clinical populations.

The results are interpreted as providing evidence for separate auditory and phonetic levels of discrimination in speech perception.

The obtained one- and two-step functions for both ABX and 4IAX tests are consistently better than the predicted discrimination functions, although the form of the obtained and predicted functions do match each other reasonably well.


The testing had absolutely nothing to do with blind testing, by the way. ABX is just one of many test procedures used: preference testing, paired testing, triads, etc.


Guess what: our brain can only detect timing differences out to 0.5 milliseconds. Does that mean we can’t discriminate audio signals longer than 0.5 milliseconds? If you don’t understand what you are reading, then it is best not to comment with authority. I don’t ask my mechanic to interpret my x-rays for a reason!


Here, let me illustrate how flawed your logic is. Audiophiles regularly claim that they can instantly tell the difference from one cable to another because the soundstage got wider, instruments became better defined, etc. Most of that is embedded in first-arrival information, stuff on the order of milliseconds. By the logic you attempted above, you should not even have been able to remember a difference! But you did. Why? Because we don't remember waveforms; we remember the impacts of them, and the accuracy of those memories decays too. So if I play something now, play it again 30 seconds later, and something in the image shifts 5 degrees, you will notice it. But if I played one now and another in a week, you would not be able to accurately identify a shift, and the result would be random.


p.s. The test in the literature is a discrimination test, like positional accuracy tests. It tests a very specific processing feature of our auditory system. The funny thing is, tests like this within the domain of audio reproduction don't even need ABX testing. I simply have to run the test with one cable, look at my results, then repeat the test with a different cable and look at those results. If they are the same, the cable made no difference. It does not matter how long our audio memory is. Again, don't take your medical x-rays to your mechanic. A minimal sketch of that single-cable-at-a-time comparison follows, with made-up hit counts and scipy assumed; it illustrates the idea, not anyone's actual test protocol.
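
```python
# A rough sketch, with made-up counts, of the comparison described above:
# run the same localization task once per cable and test whether the hit
# rates differ. A large p-value means no evidence the cable changed anything.
from scipy.stats import fisher_exact

cable_a = (41, 9)   # hypothetical (hits, misses) with cable A
cable_b = (39, 11)  # hypothetical (hits, misses) with cable B

_, p = fisher_exact([list(cable_a), list(cable_b)])
print(f"p = {p:.3f}")  # well above 0.05 here: no detectable difference
```
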
Long term evaluation is the only accepted way to evaluate audio gear. The snapshot of ABX testing is not reliable as most ABX testing results show.
 


Two falsehoods in two sentences. Care to try for 3?

mikelavigne
1,658 posts
04-29-2021 3:19pm
i've challenged blind testing advocates to show me a system that equals or exceeds the performance of my system using only blind testing as a system building method.



That does not even make sense.

cleeds
3,773 posts
04-29-2021 1:59pm
The notion that blind testing for audio is an absolute test is absurd, and on so many levels. There is abundant literature (although not enough) on the frailty and limitations of blind testing in all matters of research. (That doesn’t mean that blind testing doesn’t have its place in audio, but it’s useless for most audiophiles. It’s tedious. Time consuming. Boring. And still prone to errors.)



It's amazing that you could read this article (though I don't think you did; I think you are quoting others' excerpts) and reach this conclusion!


THE AUTHOR IS NOT ADVOCATING AGAINST BLIND TESTING! Can I be any more clear? What he is advocating against is poor-quality testing, where the results are taken as absolute without any consideration of whether the test implementation truly met its goals, and the opaqueness that often surrounds these tests!

Gee @cleeds, nice selective posting there. You know there are AES members and people with access to research literature here ...

This is a convention paper, not a journal paper, which means it does not go through the normal peer review of a formal journal paper.

https://secure.aes.org/forum/pubs/conventions/?elib=11480


The conventional .05 significance level used to analyze typical listening tests can produce a much larger risk of concluding that audible differences are inaudible than of concluding that inaudible differences are audible, resulting in strong systematic bias against those who believe differences are clearly audible between well designed components that are spectrally equated and not overdriven. This paper discusses ways to equalize error risks, introduces a quantitative measure of a listening test’s fairness, discusses implications for literature reviewers, and presents a statistical table enabling readers to conduct equal-error analyses without calculations.
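
To see the asymmetry the abstract describes, here is a quick sketch; the 16-trial session length and the listener who genuinely hears the difference 60% of the time are my assumptions, not numbers from the paper:

```python
# A sketch of the asymmetry described in the abstract; the 16-trial session
# and the "hears it 60% of the time" listener are assumptions for illustration.
from math import comb

def p_at_least(k: int, n: int, p: float) -> float:
    """P(at least k correct out of n) for a per-trial success rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 16  # a short listening session
# Smallest pass mark that keeps the false-positive (Type I) risk under 0.05:
k = next(k for k in range(n + 1) if p_at_least(k, n, 0.5) <= 0.05)
print(k, p_at_least(k, n, 0.5))   # 12 correct needed; Type I risk ~0.038
print(1 - p_at_least(k, n, 0.6))  # Type II risk ~0.83 for the 60% listener
```

So under these assumptions, a listener with a real but modest ability is declared "no difference" about 83% of the time, which is exactly the bias toward "inaudible" the paper is talking about.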

I volunteered for an ABX speaker-wire test at Klipsch HQ back in '06. The first five rounds, I was perfect: 5 for 5 identifying the more expensive wire versus the lamp cord.

My accuracy, as the test continued, began to deteriorate, as my ears desensitized to the source material and it all began to blur together, hearing the same small segment of the same musical passage over and over again. I finished the test 13/20. So I barely did better than a coin flip on the last 15.
 

13/20 across a range of test subjects would be statistically significant, but this points to bad test design, not to any error in blind testing. The result actually had nothing to do with blind testing at all, but with an ABX test where listener fatigue set in. Any good analysis of the results would also look at grouping to determine whether there was a listener-fatigue element. This goes back to the opacity of testing: all results and methods should be published.
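
A quick sanity check on that claim, using exact binomial arithmetic; the ten-listener pooled version is my hypothetical extension of the anecdote, not part of it:

```python
# Exact binomial arithmetic behind the sentence above; the ten-listener
# pooling is a hypothetical, not data from the Klipsch anecdote.
from math import comb

def p_at_least(k: int, n: int) -> float:
    """One-sided p-value for k-or-more correct out of n under pure guessing."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

print(p_at_least(13, 20))    # ~0.13: one listener at 13/20 could be guessing
print(p_at_least(130, 200))  # ~1.5e-5: ten listeners at that rate could not
```
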
The notion that blind testing for audio is an absolute test is absurd, and on so many levels. There is abundant literature (although not enough) on the frailty and limitations of blind testing in all matters of research. (That doesn’t mean that blind testing doesn’t have its place in audio, but it’s useless for most audiophiles.


No, there is not abundant literature that says blind testing is bad. You will have a hard time finding any. There is literature that deals with bad testing that happens to be blind, but not with the basic concept of blind testing. Every example given in this thread claims to show blind testing is bad, but not one of them actually does.