Why Do So Many Audiophiles Reject Blind Testing Of Audio Components?


Because it was scientifically proven to be useless more than 60 years ago.

A speech scientist by the name of Irwin Pollack conducted an experiment in the early 1950s. In a blind ABX listening test, he asked people to distinguish minimal pairs of consonants (like “r” and “l”, or “t” and “p”).

He found out that listeners had no problem telling these consonants apart when they were played back immediately one after the other. But as he increased the pause between the playbacks, the listener’s ability to distinguish between them diminished. Once the time separating the sounds exceeded 10-15 milliseconds (approximately 1/100th of a second), people had a really hard time telling obviously different sounds apart. Their answers became statistically no better than a random guess.
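
For anyone wondering what "statistically no better than a random guess" means for an ABX run, here is a minimal sketch. It assumes each trial is an independent coin flip under pure guessing (p = 0.5) and computes the one-sided binomial probability of scoring at least that well by luck. The function name `abx_p_value` and the trial counts are illustrative only; they are not taken from Pollack's experiment.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right out of `trials`
    when every answer is a pure guess (assumed p = 0.5 per trial)."""
    if trials < 0 or not 0 <= correct <= trials:
        raise ValueError("need 0 <= correct <= trials")
    # Count the equally likely answer sequences with >= `correct` hits,
    # then divide by the total number of sequences (2 ** trials).
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical sessions of 16 trials each.
    for hits in (8, 12, 16):
        print(f"{hits}/16 correct: p = {abx_p_value(hits, 16):.4f}")
```

A small p-value (0.05 is a common cutoff) says the score would be unlikely from guessing alone. A large one says the listener's answers cannot be told apart from guessing in that session.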

If you are interested in the science of these things, here’s a nice summary:

Categorical and noncategorical modes of speech perception along the voicing continuum

Since then, the experiment has been repeated many times (last major update in 2000: Reliability of a dichotic consonant-vowel pairs task using an ABX procedure).

So reliably recognizing the difference between similar sounds in an ABX environment is impossible. With a 15 ms playback gap, the listener’s guess becomes no better than random. This happens because humans don't have any meaningful waveform memory. We cannot exactly recall the sound itself, and rely on various mental models for comparison. It takes time and effort to develop these models, which makes us really bad at playing the "spot the sonic difference right here and now" game.

Also, please note that the experimenters were using the sounds of speech. Human ears have significantly better resolution and discrimination in the speech spectrum. If a comparison method is not working well with speech, it would not work at all with music.

So the “double blind testing” crowd is worshiping an ABX protocol that was scientifically proven more than 60 years ago to be completely unsuitable for telling similar sounds apart. And they insist all the other methods are “unscientific.”

The irony seems to be lost on them.

Why do so many audiophiles reject blind testing of audio components? - Quora
artemus_5
Well, how can you fault a business for trying to manufacture a market? And it is different in that you can buy a product and decide for yourself, using whatever methods you choose, if the item is worth the asking price. It is a tangible product which you can audition and compare against other similarly priced products. It isn't even remotely the same thing, and your suggesting it is similar suggests that you are simply another casualty of this age.
I believe he was talking about the point of advertising--advertisers use our own nature (and pride) to sell their products, making us believe they are better, more luxurious, more this more that. Some products may be better, and I think my post above specifically pointed out that comparison of different models and coming away liking whatever it is you picked out is a good thing. What’s not good is when you buy it and then lord it over others (or when you don't and then berate others who did). Not sure why there is an issue with that.
Me again. I commented above on the psychometric technique of Semantic Differential rating scales, which is the primary technique I use in assessing perception because it is easy for participants to use and it differentiates products along multiple dimensions simultaneously. However, I have also used another psychometric technique, which is based upon similarity judgements. It strikes me that this touches upon the issue of AB and ABX testing. I collect similarity data either by having participants in my studies rate the degree of perceived similarity between pairs of items, or with a "triad" method in which items are presented three at a time and the participant selects the one that seems most different. Each triad provides input to three cells of a (dis)similarity matrix. The cell for the two items that weren't chosen as being "different" is coded "0" for minimum dissimilarity, and the cells for the two pairings involving the "different" item are coded "1" for maximum dissimilarity. I sum the matrices for individual participants and analyze the summed matrix using Multidimensional Scaling to map all items within a multidimensional perceptual space.
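
The triad coding described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual analysis code. Items are assumed to be numbered 0..n-1, each triad is assumed to arrive as a tuple of three items plus the one picked as most different, and the names `triad_matrix` and `sum_matrices` are made up for this example. The Multidimensional Scaling step is left out; the summed matrix would be handed to whatever MDS routine you prefer.

```python
from itertools import combinations


def triad_matrix(n_items, triads):
    """Build one participant's symmetric dissimilarity matrix.

    `triads` is a list of ((a, b, c), odd) entries, where `odd` is the
    item the participant chose as most different (hypothetical format).
    """
    matrix = [[0] * n_items for _ in range(n_items)]
    for items, odd in triads:
        if len(set(items)) != 3 or odd not in items:
            raise ValueError("each triad needs 3 distinct items, one marked odd")
        for i, j in combinations(items, 2):
            # The two pairings involving the odd item are coded 1;
            # the remaining pair is coded 0, so its cell is left unchanged.
            if odd in (i, j):
                matrix[i][j] += 1
                matrix[j][i] += 1
    return matrix


def sum_matrices(matrices):
    """Sum per-participant matrices cell by cell."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Two hypothetical participants judging the same triad of items 0, 1, 2.
    p1 = triad_matrix(3, [((0, 1, 2), 2)])  # picks item 2 as most different
    p2 = triad_matrix(3, [((0, 1, 2), 0)])  # picks item 0 as most different
    for row in sum_matrices([p1, p2]):
        print(row)
```

Larger cell values in the summed matrix mean the two items were more often split apart by the odd-one-out choice, which is the distance-like input an MDS routine expects.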

The rating method provides what is essentially a continuous metric for the AB comparison and the triad method provides something that is functionally similar to ABX except with three different items and the question being reversed, i.e., not which two are the same, but which one is most different? The two items remaining after the odd-man-out become the "same" items in ABX.

Although many psychometricians assume that the similarity judgments and Semantic Differential ratings produce equivalent results, I find that is usually not the case. In my experience, the Semantic Differentials (using Factor Analysis) differentiate a larger number of independent perceptual dimensions whereas the similarity comparisons (using Multidimensional Scaling) sometimes reveal higher order perceptual qualities that are not easily described with words. Either way, the emphasis is on identifying the number of different ways that things vary. Perception is always multidimensional unless you intentionally restrict the range of variation. Ironically, that is exactly what happens when you compare only two or a small number of items.
Well, after you buy something, what you choose to do is up to the individual, and I feel it is no more likely with an expensive product than with a cheaper one. We see clues of this everywhere on this forum, with suggestions that a Magneplanar or horn speaker is better than the Wilson or other types. I see nothing wrong with advertising goals, because questioning proper etiquette removes individual responsibility from the experience. You might say that the person who succumbs to advertising and buys a very expensive speaker was manipulated, but I would never presume to know the person's motivation, nor that they were inspired by some dletch2 bias or by any reason other than that it sounded better. Why don't some of you admit that there is a huge moral piece to all of this in terms of levels of accepted consumption (see Wilson thread)?


Seriously, dltech2, you have to come up with something better than cults. You are wasting my time with these stretches. You are the one asking us to join the blind listening cult and telling us that without this method we are being duped. We are not asking you to change anything in your regimen; we just propose that it doesn't accomplish what you think in all cases, and we are not asking you to join our cult. Make no mistake, as per your definition, these are both cults apparently.