Your question seems to me more articulate than the responses it got, so it's kind of hard to know what to say that you haven't already thought of.
I don't know what you mean, however, when you write that "music not only travels on air, it is vibrations on air." Uh...yes, but so what? What would this have to do with the meaning of "air" in the audiophile context?
In the same paragraph, you go on to say: "In orchestral music the instruments get jumbled together to some degree. In other words, there really is not space between instruments, however, they do seem defined within the soup of air that hits my ear." Well put! Very resolving audio systems, playing very fine recordings, can create a spatial image that is actually more vivid than a live performance would be. I've heard many string quartet performances, for instance (kind of an ideal ensemble to judge in terms of instrumental placement, since there are just four instruments spread across the entire "soundstage"). In fact, I play cello. But I can follow individual instruments better on certain string quartet recordings than I have ever been able to in a live performance, even though live I have my eyes to give visual cues about where the sound is coming from.
For what it's worth, I find this effect both exciting and musically relevant: it's easier to grasp complicated counterpoint if you can concentrate on each "voice," and "watching" the sound in a virtual space (with eyes closed!) makes that easier than listening alone.
Finally, I've always understood "air" in the audiophile context to mean something like the natural spaciousness that is palpable when one is present at a performance involving several instruments in a large hall. My living room, where my rig is set up, is large, but not as big as a concert hall! So there's an inevitable cramping of the spaciousness of a live performance when one reproduces it in a listening room. "Air" refers to the simulacral recreation, by whatever means, of that original spaciousness.