Focus on 24/192 Misguided?


As I've upgraded my digital front end over the last few years, like most people I've been focused on 24/192 and related 'hi rez' digital playback and music to get the most from my system. However, I read this pretty thought-provoking article on why this may be a very bad idea:
http://people.xiph.org/~xiphmont/demo/neil-young.html

Maybe it's best to just focus on as good a redbook solution as you can, although there seem to be some merits to SACD, if for nothing else than the attention paid to recording quality.
outlier

Showing 6 responses by almarg

On the question of continuous vs. non-continuous waveforms, I think that part of the reason for the disagreement is that the word "continuous" is misleading in this context. No waveform is truly "continuous." Regardless of the nature of the waveform, the Sampling Theorem will only be perfectly accurate (i.e., to 100.00000000...%) when an infinitely long sample record is available, covering the period from the beginning of the universe to the end of time. :-)

Any real-world waveform, whether sinusoidal or not, and "continuous" or not, will not meet that criterion. As a result there will always be some non-zero loss of information, at and near the times when the waveform begins, when it ends, and when it changes character. In theory the spectral content of those transitions extends out to infinity Hertz, although as a practical matter much of the high frequency spectral content of those transitions will be at amplitudes that are utterly negligible.

The information that is lost in those transitions will correspond to the spectral components that lie above the cutoff point of the anti-aliasing filter. The lower the cutoff point of the anti-aliasing filter, and the more abrupt the transitions are in the waveform that is being sampled, the greater the amount of information that will be lost.
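Just to put a rough number on the kind of loss I'm describing, here is a small sketch (Python with NumPy/SciPy; the transient, the filter, and all of the parameters are made up purely for illustration) that band-limits a sharp wideband transient at roughly the cutoff a 44.1 kHz anti-aliasing filter would impose, and measures how much of the transient's energy is removed:

```python
import numpy as np
from scipy import signal

fs = 192_000                     # work at a high rate so the transient is well represented
t = np.arange(0, 0.01, 1 / fs)   # 10 ms window

# A sharp, cymbal-like transient: an exponentially decaying burst of wideband noise
rng = np.random.default_rng(0)
transient = rng.standard_normal(t.size) * np.exp(-t / 0.002)

# Zero-phase FIR lowpass approximating a 44.1 kHz anti-aliasing filter (cutoff ~21 kHz)
aa_filter = signal.firwin(numtaps=255, cutoff=21_000, fs=fs)
band_limited = signal.filtfilt(aa_filter, [1.0], transient)

total_energy = np.sum(transient ** 2)
lost_energy = np.sum((transient - band_limited) ** 2)
print(f"Energy above ~21 kHz removed: {100 * lost_energy / total_energy:.1f}% of total")
```

Of course, a burst of white noise has vastly more ultrasonic energy than any real instrument does, so the percentage this toy example prints greatly overstates what would be lost from actual music; it is only meant to show where the loss occurs.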

Will any of that particular form of information loss be audibly significant when a music waveform is sampled at 44.1 kHz? It's hard to say, and I doubt that empirical assessment (by listening) can yield a meaningful answer considering how many other variables and unknowns are involved in the recording and playback processes. My guess is that it probably has some significance, especially on high frequency transients such as cymbal crashes, but only to a relatively small degree.

Is oversampling plus noise shaping an essentially perfect means of overcoming the problems inherent in sampling just slightly above the Nyquist rate, as the article seems to suggest? It's probably fair to say that it can work pretty well, but IMO it would be hard to argue that it is "essentially perfect." (See the sketch a bit further down for what noise shaping actually does.)

Can the ultrasonic frequency content that is retained by hi rez formats have adverse consequences, as claimed in the article, as a result of intermodulation effects within the system's electronics, or things like crosstalk effects for that matter? It certainly seems conceivable, to a greater or lesser extent depending on the particular components that are in the system.

Will sampling at a higher rate result in sampling that is less accurate, assuming equal cost and comparable design quality? That would seem to be a reasonable expectation. But complex and sophisticated digital signal processing does not come for free either.
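For those not familiar with the terminology, "oversampling plus noise shaping" refers to quantizing coarsely at a much higher sample rate while feeding the quantization error back, so that the error is pushed up to frequencies far above the audio band where a later digital filter can remove it. A minimal first-order sketch of the idea, in Python (my own toy illustration, not anything taken from the article):

```python
import numpy as np

def noise_shaped_quantize(x, step):
    """First-order error-feedback (noise-shaping) quantizer."""
    y = np.empty_like(x)
    err = 0.0
    for n, sample in enumerate(x):
        v = sample - err              # subtract the previous sample's quantization error
        y[n] = step * np.round(v / step)
        err = y[n] - v                # feed the new error back to the next sample
    return y

fs = 2_822_400                        # e.g. 64 x 44.1 kHz oversampling
t = np.arange(0, 0.01, 1 / fs)
x = 0.5 * np.sin(2 * np.pi * 1000 * t)

coarse = noise_shaped_quantize(x, step=0.25)   # deliberately very coarse quantization
# The quantization error now rises with frequency; a lowpass (decimation) filter
# confined to the audio band sees far less of it than plain rounding would leave there.
```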

What does it all add up to? I would have to say that the paper referenced by the OP, and also the Lavry paper, make better cases against hi rez than I would have anticipated, but they are certainly not conclusive as I see it. And given the many tradeoffs and dependencies that are involved, my suspicion is that there will ultimately be no one answer that is inarguably correct.

Best regards,
-- Al
Hi Kijanki,

The only reference to downsampling + upsampling that I recall seeing was in the paragraph headed "clipping" in the lower third of the page, and in footnote 21. He was saying that by taking 192 kHz source material, downsampling it, and then upsampling back to 192 kHz, a sonic comparison could be made between the two 192 kHz signals that would be indicative of the adequacy of the lower sample rate.

Not sure if that is what you are referring to. But in any event the methodology he is describing doesn't make sense to me, because the comparison would not reflect the effects of the sharper anti-aliasing and reconstruction filters that would be required for recording and playback at the lower sample rate.
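For what it's worth, the round trip he describes could be set up along the following lines (a sketch using SciPy's polyphase resampler; the function name is mine, and file handling and the actual listening comparison are omitted):

```python
from math import gcd
from scipy import signal

def round_trip_via_lower_rate(x_192k, low_rate=48_000, high_rate=192_000):
    """Downsample 192 kHz material to a lower rate and back up again,
    so the two 192 kHz versions can be compared against each other."""
    g = gcd(low_rate, high_rate)
    down = signal.resample_poly(x_192k, up=low_rate // g, down=high_rate // g)
    back = signal.resample_poly(down, up=high_rate // g, down=low_rate // g)
    return back
```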

Best regards,
-- Al
Ok, I see what you are referring to. Note this statement:
Oversampling is simple and clever. You may recall from my A Digital Media Primer for Geeks that high sampling rates provide a great deal more space between the highest frequency audio we care about (20kHz) and the Nyquist frequency (half the sampling rate). This allows for simpler, smoother, more reliable analog anti-aliasing filters, and thus higher fidelity. This extra space between is [sic] 20kHz and the Nyquist frequency is essentially just spectral padding for the analog filter.

Because digital filters have few of the practical limitations of an analog filter, we can complete the anti-aliasing process with greater efficiency and precision digitally. The very high rate raw digital signal passes through a digital anti-aliasing filter, which has no trouble fitting a transition band into a tight space. After this further digital anti-aliasing, the extra padding samples are simply thrown away. Oversampled playback approximately works in reverse.
So what distinguishes the two situations you are referring to is that the 192 kHz hi rez format will presumably include a significant amount of ultrasonic audio information, which is what he is saying might have harmful consequences as a result of intermodulation effects in downstream components, while the oversampled redbook data will not include that information, and therefore those effects will not occur.
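To make the mechanism in that quoted passage concrete, here is a toy sketch of 4x oversampling on playback (my own illustration, with made-up parameters): zeros are inserted between the incoming redbook-rate samples and the resulting spectral images are removed by a digital lowpass, so the output runs at a 192 kHz rate but contains nothing above the original ~22 kHz band.

```python
import numpy as np
from scipy import signal

def oversample_4x(x, fs_in=48_000):
    """Toy 4x oversampling on playback: insert zeros between samples, then
    remove the resulting spectral images with a digital lowpass filter."""
    up = 4
    zero_stuffed = np.zeros(len(x) * up)
    zero_stuffed[::up] = x * up                    # gain compensation for the inserted zeros
    lp = signal.firwin(numtaps=255, cutoff=0.9 * fs_in / 2, fs=fs_in * up)
    return signal.filtfilt(lp, [1.0], zero_stuffed)

# The output is at a 192 kHz rate but carries nothing above roughly 22 kHz;
# a native 192 kHz recording, by contrast, may carry content well beyond that.
```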

Best regards,
-- Al
04-20-12: Bombaywalla
Al, Kijanki: I *think* that I might know what the author is intending to say here: To do an A/B comparison, the author would like to level the playing field.... Does this make sense guys?

Hi Bombaywalla,

I think that what he is referring to in note 21 and in the "Clipping" paragraph is a comparison between a 192 kHz hi rez signal, and that same signal downsampled to 44.1 or 48 kHz and then upsampled back to 192 kHz. Both 192 kHz signals would be played back through the same DAC and the same downstream components. If they were to sound different in any way it would presumably mean that the lower sample rate, and/or the downsampling and upsampling processes, degraded the signal. Which signal sounds subjectively better would be irrelevant.

As I indicated earlier, though, it seems to me that the flaw in that methodology is that it does not take into account the sonic effects of the anti-alias and reconstruction filters that would be used if the recording and playback processes were done at the lower sample rate.
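To put a very rough number on how much more demanding the lower rate is in that respect: an anti-aliasing or reconstruction filter for 44.1 kHz sampling has to get from flat at 20 kHz to strongly attenuating by 22.05 kHz, while at 192 kHz the same job can be spread over tens of kilohertz. A sketch using SciPy's Kaiser-window estimate (the attenuation spec and band edges are arbitrary numbers chosen for illustration):

```python
from scipy import signal

def estimated_taps(fs, passband_hz, stopband_hz, atten_db=96):
    """Rough FIR length needed to get from the passband edge to the stopband edge."""
    transition_width = (stopband_hz - passband_hz) / (fs / 2)   # normalized to Nyquist
    numtaps, _beta = signal.kaiserord(atten_db, transition_width)
    return numtaps

print("44.1 kHz:", estimated_taps(44_100, 20_000, 22_050))   # very steep -> long filter
print("192 kHz: ", estimated_taps(192_000, 20_000, 40_000))  # gentle -> short filter
```

The tap counts aren't directly comparable since the two filters run at different sample rates, but the much narrower transition band at 44.1 kHz is what forces the steeper, and potentially more audible, filtering.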

Best regards,
-- Al
04-20-12: Kijanki
Al, I wonder if 24/192 contains any ultrasonic frequencies at all. Why would they leave them in when preparing hi-rez files? Where does this ultrasonic content come from? Again, the notion that 192 kHz sampling is harmful is a little far-fetched.
Hi Kijanki,

What he is referring to is the ultrasonic output of the musical instruments themselves. Yes it would be at very low levels, and with a lot of instruments it would probably not be present to a significant extent at all. But his point, debatable though it may be, is that leaving it in can't do any good, and MIGHT do some harm, depending on the non-linearities that may be present in the playback system.

It would be left in the hi rez recording to avoid introducing a sharp cutoff filter into the signal path, which as you realize is one of the fundamental benefits of high rez.

Along the lines of my earlier comments, I'm skeptical and/or uncertain about a lot of his points, and how they would trade off in terms of significance against the presumable benefits of high rez. But I don't consider his arguments to be outlandish or unreasonable.

Best regards,
-- Al
Excellent post by Kijanki. I agree completely.
04-22-12: Bifwynne
... do you know what the power bandwidth is on ARC amps, particularly the VS-115? Freq. response is approx. 100 kHz, but I don't know if that is the same as power bandwidth.
My understanding is that unless otherwise stated frequency response and bandwidth are usually specified under "small signal" conditions. I believe that for power amplifiers "small signal" is commonly defined to mean 1 watt or 2.83 volts (2.83 volts corresponds to 1 watt into 8 ohms). Full power bandwidth will sometimes be considerably less, in part because in some designs it will be limited by what is called slew rate, which isn't a factor under small signal conditions.
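For anyone curious how slew rate bounds full power bandwidth, the usual relationship for an undistorted full-level sine wave is f_max = SR / (2 * pi * Vpeak). A quick calculation with hypothetical numbers (not ARC's figures, which as noted are not published):

```python
import math

def full_power_bandwidth(slew_rate_v_per_us, v_peak):
    """Highest frequency a sine of amplitude v_peak can reach without
    being slew-rate limited: f_max = SR / (2 * pi * Vpeak)."""
    slew_rate_v_per_s = slew_rate_v_per_us * 1e6
    return slew_rate_v_per_s / (2 * math.pi * v_peak)

# Hypothetical numbers only: a 120 W / 8 ohm amp swings about 44 V peak
v_peak = math.sqrt(2 * 120 * 8)          # ~43.8 V
print(full_power_bandwidth(20, v_peak))  # assuming 20 V/us -> roughly 72 kHz
```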

The specifications at the ARC website do not appear to indicate either full power bandwidth or slew rate for your VS-115, so there isn't enough information to answer your question. As Kijanki indicated, though, high power levels are not required at ultrasonic frequencies, so small signal bandwidth is a more meaningful number than full power bandwidth.

Returning to the question of ultrasonic intermodulation distortion, I'm not sure that bandwidth limitations are directly relevant to the issue, although they might play a role. What is relevant is non-linearity. As long as the amp's output amplitude is linearly proportional to input amplitude, at each of the frequencies for which an ultrasonic spectral component is present, there won't be a problem. Perhaps there will often be a tendency for linearity to degrade at frequencies where the amplifier's small signal frequency response is rolling off, in which case bandwidth would have some relevance to the issue. Or perhaps not; I have no particular knowledge on that question.
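To make the intermodulation mechanism concrete, here is a toy sketch (the frequencies and the amount of non-linearity are made up): two ultrasonic tones passed through a slightly non-linear transfer characteristic produce a difference-frequency product squarely in the audible band, while a perfectly linear stage produces essentially none.

```python
import numpy as np
from scipy import signal

fs = 192_000
t = np.arange(0, 0.5, 1 / fs)

# Two ultrasonic tones, inaudible on their own
x = 0.5 * np.sin(2 * np.pi * 26_000 * t) + 0.5 * np.sin(2 * np.pi * 29_000 * t)

linear = x                          # ideal stage: output strictly proportional to input
nonlinear = x + 0.05 * x ** 2       # mild second-order non-linearity

for name, y in [("linear", linear), ("non-linear", nonlinear)]:
    f, psd = signal.welch(y, fs=fs, nperseg=8192)
    idx = np.argmin(np.abs(f - 3_000))      # 29 kHz - 26 kHz = 3 kHz difference tone
    print(f"{name}: level near 3 kHz = {10 * np.log10(psd[idx] + 1e-30):.1f} dB")
```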

Best regards,
-- Al