the idea is that the best systems tell us what the recording engineer wanted us to hear. sometimes that is live to 2 track, and so the locations of the instruments do match the video or picture. but this simple mic'ing technique might not let us hear the instruments very distinctly. in a perfect world we would have many mics all around to catch the individual instruments and the hall effects, plus perfect mixing, so we get both placement and a cohesive whole. hard to do.
and YouTube recordings have limited bandwidth, even though i agree many are very enjoyable.
a few years ago i attended a live Seattle Symphony concert of a work by a friend of mine. sat mid hall right in the center. row M. i enjoyed the concert, but the sound and detail were muddled. the bass was in and out. later my friend gave me a file of the recording of the concert i attended.
it was much clearer and better laid out on my home system. the elements of the music and the bass were much more distinct and easier to follow and enjoy.
for sure this is an anecdotal case, and not always how it goes. but the live music experience is very inconsistent. at its best it is much better than the reproduced experience. but many times it's not as good in terms of the sonics and understanding.
i do agree that when you add the visuals to the recording there are insights to be found. i have a separate home theater system with Dolby Atmos 9.3.6 speakers and it can result in a fun experience. but for me the best sonic experience is my 2 channel when the performance and the engineering of the recording are top notch.