Electrical/mechanical representation of instruments and space


Help, I'm stuck at the juncture of physics, mechanics, electricity, psycho-acoustics, and the magic of music.

I understand that the distinctive sound of a note played by an instrument consists of a fundamental frequency plus a particular combination of overtones in varying amplitudes, and that the combination can be graphed as a particular, nuanced two-dimensional waveform shape. Then you add a second instrument playing, say, a third above the note of the first instrument, and its unique waveform shape represents that instrument's sound. When I'm in the room with both instruments, I hear two instruments because my ear (rather, two ears, separated by the width of my head) can discern that there are two sound sources.

But let's think about recording those sounds with a single microphone. The microphone's diaphragm moves and converts changes in air pressure to an electrical signal. The microphone is hearing a single set of air pressure changes, consisting of a single, combined wave from both instruments. And the air pressure changes occur in two domains, frequency and amplitude (sure, it's a very complicated interaction, but still capable of being graphed in two dimensions). Now we record the sound, converting it to electrical energy, stored in some analog or digital format. Next, we play it back, converting the stored information to electrical and then mechanical energy, manipulating the air pressure in my listening room (let's play it in mono from a single full-range speaker for simplicity).

How can a single waveform, emanating from a single point source, convey the sound of two instruments, maybe even in a convincing 3D space? The speaker conveys amplitude and frequency only, right? So what is it about amplitude or frequency that carries spatial information for two instruments/sound sources? And of course, that is the simplest example I can design. How does a single mechanical system, transmitting only variations in amplitude and frequency, convey an entire orchestra and choir as separate sound sources, each with its unique tonal character?
And then add to that the waveforms of reflected sounds that create a sense of space and position for each of the many sound sources?

77jovian

Showing 3 responses by almarg

Roberttcan 10-26-2019
I don't think that is what the OP is talking about at all. Not even close.

I could be wrong, but I am pretty sure the OP is asking how just one amplitude-varying signal in the time domain can represent a whole orchestra and all its instruments.

The answer to that is it doesn't convey the whole orchestra, the brain extracts all the instruments out of the signal based on pattern recognition.

+1.

I too am pretty certain that Geoff's questions have nothing to do with what the OP is asking.

Regards,
-- Al

@77jovian You may find the following writeup to be instructive. (Coincidentally, btw, as you had also done it uses the example of a flute for illustrative purposes):

http://newt.phys.unsw.edu.au/jw/sound.spectrum.html

Note particularly the figure in the section entitled "Spectra and Harmonics," which depicts the spectrum of a note being played by a flute.

To provide context, a continuous pure sine wave at a single frequency (which is something that cannot be generated by a musical instrument) would appear on this graph as a single very thin vertical line, at a point on the horizontal axis corresponding to the frequency of the sine wave.
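
For anyone who wants to see that concretely, here is a small illustrative sketch of my own (not from the linked page) using Python's numpy. It generates one second of a pure 400 Hz sine and confirms that essentially all of its spectral energy lands in a single FFT bin — the "single very thin vertical line" described above. The sample rate and frequency are arbitrary choices for illustration:

```python
# Illustrative sketch: a pure 400 Hz sine sampled at 8 kHz for exactly
# one second should show all of its spectral energy in one FFT bin.
import numpy as np

fs = 8000                      # sample rate in Hz (arbitrary choice)
t = np.arange(fs) / fs         # one second of time samples
x = np.sin(2 * np.pi * 400 * t)

spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), d=1 / fs)

peak_hz = freqs[np.argmax(spectrum)]
print(peak_hz)                 # the single "thin vertical line" sits at 400.0
```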

The left-most vertical line in the graph (at 400 Hz) represents the "fundamental frequency" of the note being played by the flute. The vertical lines to its right represent the harmonics. The raggedy stuff at lower levels represents the broadband components I referred to earlier. Note this statement in the writeup:

... the spectrum is a continuous, non-zero line, so there is acoustic power at virtually all frequencies. In the case of the flute, this is the breathy or windy sound that is an important part of the characteristic sound of the instrument. In these examples, this broad band component in the spectrum is much weaker than the harmonic components. We shall concentrate below on the harmonic components, but the broad band components are important, too.

Now if a second instrument were playing at the same time, the combined spectrum of the two sounds at a given instant would look like what is shown in the figure for the flute, plus a number of additional vertical lines corresponding to the fundamental and harmonics of the second instrument, with the second instrument's broadband component summed in. ("Summed" in this case refers to something more complex than simple addition, since timing and phase angles are involved; perhaps "combined" would be a better choice of words.) And since our hearing mechanisms can interpret that complex spectrum as coming from two different instruments when we hear them in person, they will do the same when we hear it in our listening room, to the extent that the information is captured, preserved, and reproduced accurately in the recording and playback processes.
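
To make that tangible, here is a hypothetical sketch (my own simplification — real instruments have far richer spectra) in which each "instrument" is just a fundamental plus a few weaker harmonics. The two are summed into one mono waveform, as at the microphone, and the spectrum of that single waveform still contains each instrument's complete set of lines:

```python
# Illustrative sketch: two simplified "instruments" summed into one mono
# signal; the combined spectrum retains both sets of spectral lines.
import numpy as np

fs = 8000
t = np.arange(fs) / fs         # one second at 8 kHz

def instrument(f0, harmonic_levels):
    # Crude harmonic series: fundamental f0 plus harmonics at the
    # given relative amplitudes (a stand-in for a real instrument).
    return sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k, a in enumerate(harmonic_levels))

flute_like = instrument(400, [1.0, 0.5, 0.25])   # lines at 400, 800, 1200 Hz
second = instrument(500, [1.0, 0.4])             # a third above: 500, 1000 Hz

combined = flute_like + second                   # one waveform, as at the mic

spectrum = np.abs(np.fft.rfft(combined))
freqs = np.fft.rfftfreq(len(t), d=1 / fs)
peaks = sorted(float(f) for f in freqs[spectrum > 100])  # bins well above the floor
print(peaks)   # [400.0, 500.0, 800.0, 1000.0, 1200.0] — both series survive
```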

Best regards,
-- Al

Hi 77jovian,

First, kudos on your thoughtful question.

IMO, though, the answer is fairly simple. When we listen to an orchestra, or some other combination of instruments and/or vocalists, what our hearing mechanisms are in fact hearing is a combination of sine waves and broadband sounds ("broadband sounds" being a combination of a vast number of sine waves), both of which of course vary widely from instant to instant in terms of their amplitudes, timings, and phase relationships.
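
To illustrate the parenthetical point — that a broadband sound can be viewed as a combination of a vast number of sine waves — here is a small sketch of my own (the number of sines and frequency range are arbitrary). Summing many equal-amplitude sines with random phases yields a noise-like signal whose energy is spread across the whole band rather than concentrated in a few lines:

```python
# Illustrative sketch: summing many random-phase sines produces a
# broadband, noise-like signal with power at every in-band frequency.
import numpy as np

rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs         # one second at 8 kHz

# Sum 1000 equal-amplitude sines (1 Hz to 1 kHz), each with a random phase.
broadband = np.zeros_like(t)
for f in range(1, 1001):
    broadband += np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi))

spectrum = np.abs(np.fft.rfft(broadband))
bins = np.fft.rfftfreq(len(t), d=1 / fs)
in_band = spectrum[(bins >= 1) & (bins <= 1000)]
print(in_band.min(), in_band.max())   # both ~4000: power at every in-band bin
```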

So to the extent that the recording and reproduction chains are accurate, what is reproduced by the speakers corresponds to those combinations of sine waves and broadband sounds, and our hearing mechanisms respond similarly to how they would respond when listening to live music.

Best regards,
-- Al