Audiogon Discussion Forum

@teo_audio

The place it counts is in the micro expression of transients and micro transients and the differences in level and timing between them.

This sums it up well.

This is where digital and class d falls apart. Those are the points of greatest distortion, in digital and class D.

I would agree, if "digital and class d" meant "mass market digital and mass market class d". Highest-end digital and class d are much harder to differentiate from highest-end analog.

In science, things are supposed to correlate to the situation at hand. Do you understand the question? Is the measurement relevant to the question at hand? If not, go back to the start and have a go at it again. Even when done, keep questioning the results and facts don’t exist..so that all finalized things can be gone over again and altered according to new results on the complex investigation of it all. That’s science.

Exactly. That's what I was pointing out to certain ASR folks. If a theory doesn't fit facts, keep working on the theory, instead of rejecting facts out of hand.

Engineering is specifically NOT exploration, engineering is designed for building things that work and use scientific theories turned into scientific law. Law...Which is a falsehood built for the engineering trade and training within it, for linear minds which are principally dogmatic in form and function.

I see it differently. Not a falsehood, but a model simple enough to be applied economically to day-to-day engineering.

In audio, the measurement and the analysis is wrong, just plain wrong. Too many engineering minds on the job, trying to play it safe and keep things ordered & black and white.

Measurements are measurements. If they are done competently, with calibrated instruments, and only what is actually measured is claimed, I'm happy to use them.

Analysis is a different story. Analysis always presupposes a theory, or at least a paradigm. And this I consider too rigid in current mainstream audio.

This is why the audiophile conundrum has existed for about 50 years. The ignorance of projection in the pundits that surround the engineering trade and ideals that are involved in the audio world. Interference (engineers from other areas) from outside audio (even more ignorant!!) helps keep the insanity frothing along nicely.

There are other reasons for the relative ignorance of the fundamental properties of the hearing system among practicing engineers. One of them is that not all relevant knowledge has even been discovered yet. Another is that some very relevant knowledge was discovered relatively recently, and practicing engineers weren't taught it.

To clarify, an engineer is not trained to commit to the scientific method or invention, they are trained to follow the books, as that is why they are engineers, not scientists who explore and change things as required when required.

Agree. People like me, trained as scientists, are often perceived as "irreverent" with regard to the dozens of audio engineering handbooks published over the past several decades. Most engineers (not all) take doubting certain things written in these handbooks as a manifestation of sheer stupidity.

Meanwhile, a whole parallel world of peer-reviewed audio science publications exists. It is instructive to observe how drastically the theories have changed over the past 50 years, prompted by more and more sophisticated experiments, and by breakthrough discoveries in the physiology of the mammalian auditory system.

If you want to explore in formal sense, go back to school and get trained to see all as theories, which are subject to change from/on new data, tests and proofing, correlation, etc. Get trained as a scientist.

Not practical for most practicing engineers. The change will only occur gradually, as older generations retire and new ones take their place.

When this mess erupts into fully blown projections in insanity of following the dogmatic rule books of engineering, we end up with things like ASR.

I like pretty much all ASR measurements. What I don't agree with is some of the analysis they derive from the measurements. The ASR crowd is very uneven: there are bona fide luminaries posting there, and also folks who keep scoring points for slighting others. Guess who ends up with more points?

The longer a problem sits unsolved, unresolved.. the more fundamental the error in the formulation of the question.

Agree. As an example, the Ptolemaic system was generally believed to be true for about 14 centuries.

Thus, the audiophile conundrum is deeper than this surface level stuff that people generally think it is. It’s deep in the minds involved, regarding how they explore reality.

Indeed.

As long as dogmatic minds try to figure out what is wrong in audiophiles vs measurements, without moving to true and proper scientific method...the longer they’ll be spinning around and getting no real correlating clarity in any of it.

I'd say the truly dogmatic minds don't even try to figure out what is wrong. They just reject the facts as aberrations, just like later-century Ptolemaic scholars ignored observed deviations in planetary movements that their preferred theory could not explain.

Let's dissect the thinking behind ignoring one such fact in audio: certain types of music, for instance classical symphonies and gamelan, tend not to sound right when published in CD format.

What is usually offered as grounds for rejecting such a statement? The Sampling Theorem, and one of the ways to calculate the dynamic range of a digital format.

The Sampling Theorem (https://en.wikipedia.org/wiki/Nyquist–Shannon_sampling_theorem) reads, in its original formulation:

If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart.

This theorem is often taken as "proof" that a sampling frequency of 44.1 kHz is sufficient for encoding any meaningful music signal. Because, "obviously", everything above 20 kHz can't be heard by humans, and thus is not worth encoding.

Let's look closely. What does "contains no frequencies higher than B hertz" actually mean? In the formulation of the same Wikipedia article: "Strictly speaking, the theorem only applies to a class of mathematical functions having a Fourier transform that is zero outside of a finite region of frequencies."

Do analog signals corresponding to practical music pieces have a Fourier transform that is zero outside of a finite region of frequencies? Absolutely not! Because, as another theorem from Fourier analysis proves, only functions of infinite duration can have such a Fourier transform.

Let it slowly sink in. The Sampling Theorem, strictly speaking, is not applicable to analog signals corresponding to practical music pieces. But, obviously, some form of Fourier transform is widely used in audio digital signal processing. What's going on here?

What is actually being used, in discrete form, are variations of the Short-Time Fourier Transform (STFT).

A fragment of a signal, let's say with a duration of 20-25 milliseconds, is taken, then multiplied by a so-called "smoothing window". The resulting function of time is guaranteed to smoothly start as 0, and smoothly end as 0.

Then the windowed fragment is treated, mathematically, as if it were replicated an infinite number of times. Since it is now periodic and of infinite duration, its Fourier Transform reduces to a discrete set of frequencies, of which the discrete transform keeps a limited range.

Then the process repeats with another fragment of the signal, starting 10-12.5 milliseconds later than the previous one. For the purposes of digital filters, this process is sometimes virtually repeated with a shift of just one sample period.

So, in practical applications, digital signal processing uses an approximation of the Fourier Transform. Correspondingly, the Sampling Theorem only works approximately. Most of the time more than well enough. Sometimes not at all.
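The fragment-window-transform-shift procedure described above can be sketched roughly as follows. This is only an illustration in plain Python, not how any particular DSP product does it: the 8 kHz sample rate, the Hann window, the 1 kHz test tone and all function names are my own assumptions, chosen so that a naive transform runs quickly.

```python
import cmath
import math

def hann(n):
    """Hann window of length n; it starts and ends at (essentially) zero."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def dft_magnitudes(frame):
    """Naive O(n^2) discrete Fourier transform, magnitudes of bins 0..n/2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def stft(signal, frame_len, hop):
    """Window each fragment, transform it, then shift by `hop` samples."""
    window = hann(frame_len)
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        fragment = signal[start:start + frame_len]
        windowed = [s * w for s, w in zip(fragment, window)]
        spectra.append(dft_magnitudes(windowed))
    return spectra

# Assumed illustration values (not from the post): a low sample rate
# keeps the naive transform fast.
SAMPLE_RATE = 8000   # Hz
FRAME_LEN = 200      # 25 ms at 8 kHz
HOP = 100            # 12.5 ms shift between fragments

tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(800)]
spectra = stft(tone, FRAME_LEN, HOP)

# Strongest frequency bin of the first fragment, converted to hertz.
peak_bin = max(range(len(spectra[0])), key=spectra[0].__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(len(spectra), "fragments, peak at", peak_hz, "Hz")
```

Note that the frequency resolution here is SAMPLE_RATE / FRAME_LEN = 40 Hz per bin. That is the price of looking at only 25 ms at a time, and it is one way to see why the result is an approximation.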

Now let's consider the issue of sufficient dynamic range. It is oft-cited that the CD format has a dynamic range of 96 dB. Let's see, approximately, how one could come to such a conclusion. Each bit doubles the number of amplitude levels, which corresponds to about 6 dB (a relative level, not dB SPL). So, "obviously", 16 bits correspond to 6 dB x 16 = 96 dB.
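For anyone who wants to check where the 6 dB per bit comes from, here is a minimal sketch. The function names are mine; the only fact used is that doubling an amplitude is a change of 20·log10(2) dB.

```python
import math

def db_per_bit():
    """Level change from doubling the amplitude: 20*log10(2), about 6.02 dB."""
    return 20 * math.log10(2)

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, expressed in dB."""
    return bits * db_per_bit()

for bits in (16, 24):
    print(bits, "bits ->", round(dynamic_range_db(bits), 2), "dB")
```

With the unrounded figure, 16 bits come out slightly above 96 dB, so the rounded "6 dB per bit" used in the rest of this post is close enough for the argument.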

According to https://hub.yamaha.com/audio/music/what-is-dynamic-range-and-why-does-it-matter/:

As a group, classical recordings have the widest dynamic range of any genre. The same study cited above found that recorded classical music typically offers between about 20 dB and 32 dB of dynamic range. While that might seem like a lot, it’s still quite a bit smaller than that of a live symphony orchestra performance, which can be as large as 90 dB.

Technically, that is good news, isn't it? Live symphony orchestra dynamic range is 90 dB. CD dynamic range is presumably 96 dB. 90 < 96. So, CD should be able to reproduce the whole dynamic range of a symphony orchestra, right?

In practice though, we have those presumably stupid music producers and audio engineers, who fell to the Dark Side during the Loudness Wars, and who will only record classical music CDs with 20 dB to 32 dB of dynamic range.

What would happen if they attempted the 90 dB? They would need to allocate 90 / 6 = 15 bits to the dynamic range encoding. For the quietest sound, they'd only be left with 1 bit for encoding it. Wait, what?

Yes, imagine a quiet passage in a symphony, nevertheless involving a dozen instruments, each with a complex multi-overtone spectrum. With frequency slides, amplitude rides, tremolos etc. All of this would need to be recorded with just one bit at 44,100 Hz!

This is the exact equivalent of DSD encoding, only at a sampling frequency 64 times lower. Correspondingly, the highest frequency that we can hope to encode with fidelity similar to DSD will be 44,100 / 2 / 64 = 344.5 Hz. Say goodbye to the "micro expression of transients and micro transients"!
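The arithmetic behind that comparison, as a short sketch. It assumes, as the paragraph above does, that the usable bandwidth of a 1-bit stream scales in proportion to its sampling frequency; noise shaping and other DSD specifics are not modeled, and the variable names are mine.

```python
CD_RATE_HZ = 44_100   # CD sampling frequency
DSD64_RATIO = 64      # DSD64 runs at 64 times the CD sampling frequency

# Sampling frequency of DSD64.
dsd64_rate_hz = CD_RATE_HZ * DSD64_RATIO

# CD bandwidth (half the sampling frequency) scaled down by the same
# factor of 64, under the proportional-scaling assumption above.
scaled_bandwidth_hz = CD_RATE_HZ / 2 / DSD64_RATIO

print(dsd64_rate_hz, "Hz for DSD64")
print(scaled_bandwidth_hz, "Hz for a 1-bit stream at 44.1 kHz")
```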

How different is what the audio engineers are actually doing? Let's say they decided to limit the dynamic range to 30 dB. This corresponds to 30 / 6 = 5 bits. This leaves 11 bits to encode the quietest part of the symphony.

How good are 11 bits? 2 to the power of 11 is 2,048. 1/2,048 = 0.00049, or 0.049%. An average digitization error would be about half of that, roughly 0.025%. Interestingly, this is just below the widely accepted THD threshold defining a hi-fi power amplifier, which is 0.03%.

This is not a coincidence. If they'd allocated fewer bits to the quietest parts of the signal, they would hear noise and distortion in them, similarly to how they'd be able to hear noise and distortion introduced by a low-quality power amplifier.

If they'd wanted to go audiophile quality for the quietest passages, they'd need to up the ante by 3 more bits, to 14 bits, so that digitization noise and distortion would be approximately equal to that of a high-quality DAC, and thus would likely be unnoticeable even on high-quality professional headphones.
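The 11-bit and 14-bit figures above can be reproduced with a few lines. This follows the simplification used in this post, where the typical error is taken as half of one quantization step relative to full scale; it is not a formal THD or noise calculation, and the function names are mine.

```python
def step_fraction(bits):
    """One quantization step as a fraction of full scale."""
    return 1 / 2 ** bits

def half_step_percent(bits):
    """Half of one quantization step, in percent of full scale."""
    return 100 * step_fraction(bits) / 2

for bits in (11, 14):
    print(bits, "bits ->", round(half_step_percent(bits), 4), "%")
```

Going from 11 to 14 bits divides the figure by 8, since each extra bit halves the step.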

So, for faithful reproduction of a symphony we would need 90 / 6 = 15 bits for encoding the dynamic range, and 14 bits for encoding the shape of the signal. 15 + 14 = 29 bits. Uh-oh, but professional ADCs and DACs only encode 24 bits? How could they manage to effectively push to 29?
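The bit budget from the last few paragraphs can be tallied like this. It is just bookkeeping under this post's own model (rounded 6 dB per bit, bits split between "dynamic range" and "signal shape"), and the names are made up for the example.

```python
import math

DB_PER_BIT = 6   # rounded figure used throughout this post
CD_BITS = 16

def bits_for_range(range_db):
    """Bits needed to span a dynamic range, at 6 dB per bit."""
    return math.ceil(range_db / DB_PER_BIT)

def shape_bits_left(total_bits, range_db):
    """Bits left for the shape of the quietest signal."""
    return total_bits - bits_for_range(range_db)

cd_shape_bits_90db = shape_bits_left(CD_BITS, 90)   # full orchestra range
cd_shape_bits_30db = shape_bits_left(CD_BITS, 30)   # typical classical CD
wanted_total_bits = bits_for_range(90) + 14         # 14 "audiophile" shape bits

print(cd_shape_bits_90db, cd_shape_bits_30db, wanted_total_bits)
```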

And here we come to an understanding of why arguably overkill digital formats are desirable. The seemingly excessive amount of information per second inherent in 24/192, DSD128, and especially in 24/384 and DSD256 can be divided between encoding the dynamic range, encoding the shape of the quietest signal, and sampling the signal frequently enough to capture its evolution over shorter periods of time.
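To put numbers on "amount of information per second", here is a rough per-channel comparison of raw bit rates. It assumes DSD128 and DSD256 are 1-bit streams at 128 and 256 times 44.1 kHz, and it ignores container overhead and compression; the dictionary layout is mine.

```python
CD_RATE_HZ = 44_100

# (bits per sample, samples per second), per channel
formats = {
    "CD 16/44.1": (16, CD_RATE_HZ),
    "24/192": (24, 192_000),
    "24/384": (24, 384_000),
    "DSD128": (1, 128 * CD_RATE_HZ),
    "DSD256": (1, 256 * CD_RATE_HZ),
}

# Raw bits per second for each format.
rates = {name: bits * hz for name, (bits, hz) in formats.items()}

for name, rate in rates.items():
    ratio = rate / rates["CD 16/44.1"]
    print(f"{name}: {rate} bit/s per channel, {ratio:.2f}x CD")
```

By this count DSD256 carries 16 times the raw data of CD per channel, and 24/384 roughly 13 times.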

How does all of the above relate to the current thread theme? By virtue of the analog recording and reproduction system, which in principle doesn't place fundamental limits, other than noise and the maximum acceleration of mechanical parts, on either effective bit depth or sampling frequency.

It is commonly accepted that the best analog systems have about 70 dB of dynamic range. Which would roughly correspond to 70 / 6 = 12 bits. This gives an excuse for proponents of CD superiority over LP to claim that this must be so because obviously 16 > 12.

However, in order for the quietest signal to still be distinguishable, it only needs to be 6 dB, or 1 bit, above the noise floor. This leaves the equivalent of 11 bits for dynamic range, which is more than twice the 5 bits of usable CD dynamic range.
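The same bookkeeping applied to the analog case, again only under this post's simplified model (rounded 6 dB per bit, about 70 dB for the best analog systems, 1 bit of headroom above the noise floor). The variable names are mine.

```python
DB_PER_BIT = 6        # rounded figure used throughout this post
ANALOG_RANGE_DB = 70  # commonly cited for the best analog systems
CD_USABLE_BITS = 5    # the 30 dB classical CD example earlier in this post

analog_equiv_bits = round(ANALOG_RANGE_DB / DB_PER_BIT)  # about 12
analog_usable_bits = analog_equiv_bits - 1               # 1 bit above the noise
ratio_to_cd = analog_usable_bits / CD_USABLE_BITS

print(analog_equiv_bits, analog_usable_bits, ratio_to_cd)
```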

Instead of the last 1 bit with which to encode the shape of the quiet signal, a high-quality analog system has many more. The actual number depends on the granularity of the analog medium and its speed, yet the most important fact is that there is no hard stop similar to the one a digital system would have.

So, the analog system would reproduce the quiet passages with higher fidelity in terms of signal shape, superimposed with noticeable noise of course. Yet the human hearing system is capable of filtering out this noise at higher levels of processing in the brain, and enjoying the quiet passage hidden underneath.

Viewed from this perspective, LP has about twice the usable dynamic range of CD, but higher noise and distortion. For classical music especially, this could be a desirable compromise. For some other genres, for instance electronic dance music with an extremely narrow dynamic range and very simple signal shapes, CD could be preferable.

I would expect a classical recording made by multiple microphones sampled at 24/384, or even at 32/384, and delivered in DSD256 after careful mixing and mastering, to be the ultimate one for the time being. As I recall, they produce such recordings in Europe.


Has anyone been able to define well or measure differences between vinyl and digital?
