The concept of jitter misleads people into thinking that all you need in a digital signal is the correct bits (which are relatively trivial to transmit) with great timing (low jitter), and so all you need is a great clock. That view is far too simple. At least three things matter - the clock, noise and bandwidth.

In a perfect square wave, the horizontal axis is time and the vertical axis is voltage. Assume the clock is perfect, i.e. the vertical signal lines occur at perfectly spaced intervals (the bit rate). Assume the signal sits at 0v when it represents a binary 0, and at 1v when it represents a binary 1. And assume the receiver of this signal decides that a 0 has transitioned to a 1 when the signal rises through the 0.5v level, and that a 1 has transitioned to a 0 when the signal falls through the 0.5v level.

Now imagine that noise is added to the signal. If the frequency of the noise is below the bit rate, the perfect square wave swims on top of a longer, smoother wave. The interesting point is that the timing between data transitions (where those vertical lines pass through 0.5v) is unchanged. So no problem, yet. If the frequency of the noise is above the bit rate, the horizontal lines get fuzzy, and if we combine low frequency noise with high frequency noise the effects combine. Again, the interesting point is that the timing between data transitions is unchanged, provided the noise is not extremely high. So, again, no problem. Noise on its own (as long as the deviations it causes stay materially below 0.5v) is not a problem, and the reason is those vertical lines: noise does not change the spacing between them.
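If you want to see this for yourself, here is a minimal Python sketch (numpy assumed; the sample rate, bit rate and noise levels are arbitrary illustrative choices, not taken from any real interface). It builds a square wave with vertical edges, adds bounded low and high frequency noise, and checks that the 0.5v crossings do not move:

```python
import numpy as np

rng = np.random.default_rng(0)

fs = 1_000_000                    # simulation sample rate (arbitrary)
bit_rate = 10_000                 # 10 kbit/s, so 100 samples per bit
samples_per_bit = fs // bit_rate
bits = rng.integers(0, 2, size=200)
ideal = np.repeat(bits, samples_per_bit).astype(float)   # vertical edges
t = np.arange(ideal.size) / fs

# Low frequency noise (well below the bit rate) plus high frequency fuzz.
# Both are bounded so the total stays well under the 0.5v threshold margin.
low_noise = 0.15 * np.sin(2 * np.pi * 300 * t)           # 300 Hz swim
high_noise = 0.15 * (2 * rng.random(ideal.size) - 1)     # bounded wideband fuzz
noisy = ideal + low_noise + high_noise

def crossing_indices(sig, thresh=0.5):
    """Sample indices where the signal crosses the threshold (either direction)."""
    above = sig >= thresh
    return np.flatnonzero(above[1:] != above[:-1])

# With vertical edges, noise below the threshold margin cannot move the
# crossings: both signals pass through 0.5v at exactly the same samples.
print(np.array_equal(crossing_indices(ideal), crossing_indices(noisy)))  # True
```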
Now imagine there is no noise. Zero noise is impossible, but something else that is impossible is the vertical line on the square wave, because it requires infinite bandwidth: a vertical line implies the signal moves between 0v and 1v in the same instant. Whatever tools we have to transmit a signal, the demands of high bit rate signals are way beyond what those tools can deliver. Think about how your analog cables can mess with sound up to around 20kHz, then think about the enormously wider frequency range required of a digital cable (and optical cables just have a different set of problems, mainly related to reflections). The higher the bit rate, the harder it gets.

When we allow for constrained bandwidth, instead of transitions being instantaneous, the signal goes up a slope when transitioning from 0v to 1v, and down a slope when transitioning from 1v to 0v. If the bandwidth were no more than the bit rate, the signal would be a sine wave. As you add harmonics of that fundamental, the sine wave begins to square out, and to reasonably square out the signal you need several harmonics (the odd multiples - say up to the 7th or more), which is a lot of bandwidth, and even more for higher bit rate signals. Interestingly, in both of these constrained-bandwidth examples, the transitions through 0.5v are still perfectly spaced - even with the sine wave. So still no problem.
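That squaring-out is just Fourier synthesis: a square wave is the sum of the odd harmonics of its fundamental, with the k-th harmonic scaled by 1/k. The sketch below (same illustrative numpy setup as before) builds a band-limited "square" wave from 1, 4 and 8 odd harmonics and confirms that, with no noise, the 0.5v crossings stay evenly spaced even for the pure sine:

```python
import numpy as np

fs = 1_000_000                    # simulation sample rate (arbitrary)
f0 = 1_000.0                      # fundamental of an alternating 0101 pattern
t = np.arange(0, 0.01, 1 / fs)

def bandlimited_square(t, f, n_harmonics):
    """0v-to-1v square wave built from the first n odd harmonics (Fourier series)."""
    s = np.zeros_like(t)
    for k in range(1, 2 * n_harmonics, 2):     # k = 1, 3, 5, ...
        s += np.sin(2 * np.pi * k * f * t) / k
    return 0.5 + (2 / np.pi) * s

def crossing_times(sig, t, thresh=0.5):
    """Threshold-crossing times, with linear interpolation between samples."""
    above = sig >= thresh
    idx = np.flatnonzero(above[1:] != above[:-1])
    frac = (thresh - sig[idx]) / (sig[idx + 1] - sig[idx])
    return t[idx] + frac * (t[1] - t[0])

for n in (1, 4, 8):               # 1 harmonic = pure sine; more = squarer
    spacing = np.diff(crossing_times(bandlimited_square(t, f0, n), t))
    print(f"{n} harmonic(s): spacing deviation = {spacing.std():.2e} s")  # ~0
```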
So far, so good. Everything in line with Amir's crew.
But as I mentioned, a higher bit rate signal (if you think high bit rate files must always sound better) requires even more bandwidth to square out the wave, so in a system with a finite bandwidth limit, a lower bit rate signal will be more accurately represented than a higher bit rate one. On top of that, if you ask anything in a music server to work faster, it will work with less precision - a key trade-off to be aware of when you assume higher bit rates must be better just because the numbers are bigger.

These examples only allow us to conclude that there is no problem if we can achieve zero noise or infinite bandwidth. Each of those goals is unattainable, and the problem becomes apparent when there is both noise and constrained bandwidth. So what happens if we add a low frequency noise component to a frequency-constrained digital audio signal? All of a sudden, the 0.5v points are shifted in time, because the low frequency noise lifts or drops the signal between bits, and shifting the sloped transitions up or down shifts the points where they pass through 0.5v left or right. The greater the amplitude of the noise, and the greater the bandwidth constraint (the gentler the slopes), the greater the effect on timing (jitter).
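You can put a rough number on that: near the threshold, a transition with slope s (volts per second) that gets lifted by a noise voltage Δv crosses 0.5v early or late by roughly Δt ≈ Δv / s. The sketch below (same illustrative numpy setup; the 60 Hz drift amplitude is an invented value) adds a slow "noise" wave to the band-limited square wave from earlier and measures how the edge spacing wobbles as the bandwidth tightens:

```python
import numpy as np

fs = 1_000_000
f0 = 1_000.0
t = np.arange(0, 0.05, 1 / fs)

def bandlimited_square(t, f, n_harmonics):
    s = np.zeros_like(t)
    for k in range(1, 2 * n_harmonics, 2):
        s += np.sin(2 * np.pi * k * f * t) / k
    return 0.5 + (2 / np.pi) * s

def crossing_times(sig, t, thresh=0.5):
    above = sig >= thresh
    idx = np.flatnonzero(above[1:] != above[:-1])
    frac = (thresh - sig[idx]) / (sig[idx + 1] - sig[idx])
    return t[idx] + frac * (t[1] - t[0])

drift = 0.1 * np.sin(2 * np.pi * 60 * t)      # low frequency noise, far below f0

for n in (1, 4, 16):                          # fewer harmonics = tighter bandwidth
    edges = crossing_times(bandlimited_square(t, f0, n) + drift, t)
    jitter = np.diff(edges).std()             # wobble in edge-to-edge spacing
    print(f"{n:2d} harmonics: jitter = {jitter * 1e9:8.1f} ns")
```

With more harmonics the edges get steeper, and the same drift produces far less timing wobble - the slope relationship above in action.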
If we add high frequency noise to a frequency-constrained signal instead, the transition through precisely 0.5v becomes hard for any digital receiver to pin down. If the signal is vertical at the transition, noise does not affect it; but as soon as the transition is not vertical, noise changes the transition point. It is the combination of constrained bandwidth and noise that inevitably creates jitter (variation in data transition timing), regardless of how great the clock is.
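Here is that last point reduced to a single rising edge (a linear ramp stands in for any real band-limited transition; the rise times and noise level are invented for illustration). The steeper the edge, the less the same wideband noise moves the detected 0.5v crossing:

```python
import numpy as np

rng = np.random.default_rng(1)

def edge_crossing_time(rise_time, noise_std, oversample=200):
    """One noisy 0v-to-1v linear-ramp edge; time of the first 0.5v crossing."""
    dt = rise_time / oversample
    t = np.arange(0.0, 3 * rise_time, dt)
    ramp = np.clip((t - rise_time) / rise_time, 0.0, 1.0)   # edge starts at rise_time
    sig = ramp + noise_std * rng.standard_normal(t.size)
    i = int(np.argmax(sig >= 0.5))            # first sample at or above 0.5v
    frac = (0.5 - sig[i - 1]) / (sig[i] - sig[i - 1])
    return t[i - 1] + frac * dt               # sub-sample interpolation

noise_std = 0.05                              # 50mV RMS of wideband noise on a 1v swing
for rise_time in (1e-9, 10e-9, 100e-9):       # steeper edge = closer to vertical
    times = [edge_crossing_time(rise_time, noise_std) for _ in range(2000)]
    print(f"rise time {rise_time * 1e9:5.0f} ns -> "
          f"timing jitter {np.std(times) * 1e12:8.1f} ps")
```

The measured jitter scales with the rise time: noise that is harmless on a near-vertical edge turns into a timing error on a slow one, no matter how good the clock that launched the edge was.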