I don't agree with that, mijostyn. Imaging comes from both volume cues (by far the predominant cue in most multi-channel studio recordings) and timing cues from a proper stereo microphone setup, which is rather uncommon. This is a long post, but it is all relevant.
With good dispersion and non-symmetric toe-in, you can get reasonably accurate volume cues over a wider listening area. That provides two significant mechanisms for localization: 1) relative volume level, and 2) frequency-dependent head shading.
What you can't compensate for is timing, and there are two issues: a) was timing even captured, and b) can timing be conveyed by speakers in a traditional two-channel setup at all, given both the extreme accuracy needed in head placement and the inability to prevent sound from one speaker reaching the opposite ear?
0.1" of head mispositioning = 1.6 degrees of timing inaccuracy
0.5" = 8.2 degrees
2" = 32.7 degrees
So let's say you are sitting 10 feet back from the center line of your speakers, with the speakers at 60 degrees. A 2 degree toe-in difference only represents about 3.8 degrees of image movement, and the movement will be the same for all sounds, i.e. the whole image shifts left or right. If the toe-in is symmetric, 3.8 degrees is equivalent to moving your head left or right about 3". At a 5 degree toe-in difference, you are looking at a 10 degree offset, or about 7" of side-to-side head movement (14 inches total range). You just moved from the best seats to pretty good seats.
Of course, much of this is literally fuzzy anyway. When you have your speakers at 60 degrees, head shading to both ears creates an improper center image. You may have recorded timing information, but because you have no cross-talk cancellation, you get a secondary timing event about 0.2 milliseconds later, which leaves the brain unsure whether that is the event, an echo, etc. The singer (continuous tones) is properly placed, but perhaps a bit fuzzy due to the aforementioned shadowing issues for volume, and the drum hit off to the side gets confused by the false secondary timing event.
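For anyone who wants a rough number for that secondary arrival, here is a minimal sketch. It uses the Woodworth spherical-head approximation, which is my choice of model and not something measured in this thread. The head radius, speed of sound, and function name are all assumptions for illustration. It estimates how much later sound from one speaker reaches the far ear than the near ear when the speaker sits 30 degrees off center (a 60 degree setup).

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 C
HEAD_RADIUS_M = 0.0875      # assumption: typical adult head radius


def woodworth_itd_seconds(azimuth_deg,
                          head_radius_m=HEAD_RADIUS_M,
                          speed_of_sound=SPEED_OF_SOUND_M_S):
    """Far-ear minus near-ear arrival time for a distant source.

    Woodworth spherical-head approximation, intended for
    azimuths between 0 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))


# Each speaker in a 60 degree setup is 30 degrees off the center line.
crosstalk_delay_ms = woodworth_itd_seconds(30.0) * 1000.0
print(f"far-ear delay at 30 degrees: {crosstalk_delay_ms:.2f} ms")
```

Under those assumptions the far-ear arrival lands a fraction of a millisecond after the near-ear arrival. A real head is not a sphere, so treat this as an order-of-magnitude figure only.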
Oh, so it is easy ... ya, no. There is one other huge issue with capturing timing differences using stereo microphones. You are now playing back the same signal, delayed in time, from two speakers. Guess what that does when it hits the head? Filtering! Comb filtering effects will be evident and significant, as the fixed timing delay reinforces or cancels depending on the frequency. Oh, but it gets even better ... I mean worse. Where timing only contributed spatial cues at <1,500 Hz, those new comb filtering effects you generated are now spread across the whole frequency range. You think you widened the stereo image, but really you created an auditory illusion of space that is not representative of the timing recorded. The timing is perceived as a level difference instead. *** Note that now, head accuracy becomes far less critical ***
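To put some numbers on the comb filtering, here is a small sketch of the textbook case: two equal-level copies of the same signal arriving a fixed delay apart. The summed magnitude is 2*|cos(pi * f * delay)|, so cancellations fall at odd multiples of 1/(2 * delay). The function names and the 0.25 ms delay are hypothetical, picked only for illustration. At a real ear the two arrivals are not equal in level, so the notches will be shallower than this idealized version.

```python
import math


def comb_gain_db(freq_hz, delay_s):
    """Gain in dB of a signal summed with an equal-level delayed copy."""
    magnitude = abs(2.0 * math.cos(math.pi * freq_hz * delay_s))
    return 20.0 * math.log10(max(magnitude, 1e-12))  # floor avoids log10(0)


def notch_frequencies(delay_s, max_freq_hz=20000.0):
    """Frequencies up to max_freq_hz where the two copies cancel."""
    notches = []
    k = 0
    while True:
        freq = (2 * k + 1) / (2.0 * delay_s)
        if freq > max_freq_hz:
            break
        notches.append(freq)
        k += 1
    return notches


delay_s = 0.00025  # hypothetical 0.25 ms delay between the two arrivals
for freq in notch_frequencies(delay_s):
    print(f"cancellation near {freq:.0f} Hz")
print(f"gain at 4000 Hz: {comb_gain_db(4000.0, delay_s):.1f} dB")
```

With that example delay the cancellations land well above 1,500 Hz and repeat all the way up the audio band, which is the point: the effect is not confined to the range where timing cues actually work.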
And just to be clear, stereo speakers attempting to reproduce timing can't place the image outside the speakers (see crosstalk above). Also of note, timing only really works at <1,500 Hz, and predominantly at <1,000 Hz. So to all those "phase" "phase" "phase" people: less posting, more learning. And for those buying or making speakers, keep the crossover out of the 200-1,500 Hz range if you can.
So what can be done?
- Signal processing akin to noise cancellation, but in this case, to reduce cross-talk
- Headphones with signal processing to replicate the effects of the body (head shading, reflections, etc.) that are lost without a sound field.