I've been struck by similar observations while listening, @ahuvia. And I think I agree that small combinations of instruments often seem easier to present nicely. I also agree that this may be why one often hears sparse arrangements at audio shows and in environments designed to show off hi-fi equipment. But what if the space provided by fewer instruments simply makes the music easier for the brain to process, which gives the impression of enhanced fidelity? Or allows subtler details to stand out (a breath here, a brief string buzz there) simply because they're not masked by other sounds, giving an impression of greater detail and clarity? I have no idea if explanations like these are actually true, but I mention them because I'm pretty sure the assumptions you've made to explain this (potential) phenomenon aren't right.
I mean, your speakers aren't fighting to create the sound of the instruments that were playing, right? They're delivering information stored in the recording of the instruments that were playing -- a very different thing. You're worrying that the speakers are attempting to deliver too much information while being inadequate for the task, but in fact the microphones that recorded the instruments have already squashed the entirety of that data into something different, haven't they? Your speakers only have to deliver that information correctly. Your whole system knows nothing of 15 instruments; it only knows up, down, right, left... Or, you know, zeros and ones.
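To make that concrete, here's a toy sketch in Python. It's only an illustration under loud assumptions: each "instrument" is a plain sine wave, the "microphone" is a simple sample-by-sample sum, and the names `tone`, `mix` and `to_pcm16` are made up for this post. Real recordings involve rooms, multiple mics and mastering, none of which is modelled here.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (CD rate, chosen for illustration)

def tone(freq_hz, n_samples, amplitude=1.0):
    """One hypothetical 'instrument': a plain sine wave as a list of floats."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]

def mix(tracks):
    """Sum the tracks sample by sample and scale by the track count.

    This is a crude stand-in for a microphone, which only ever sees
    the combined pressure wave, not the individual sources.
    """
    n = len(tracks)
    return [sum(samples) / n for samples in zip(*tracks)]

def to_pcm16(signal):
    """Quantize floats in [-1, 1] to 16-bit integers: the 'zeros and ones'."""
    return [max(-32768, min(32767, round(s * 32767))) for s in signal]

n_samples = 1000
solo = to_pcm16(mix([tone(220.0, n_samples)]))
ensemble = to_pcm16(mix([tone(110.0 * (k + 1), n_samples) for k in range(15)]))

# Both recordings are the same kind of thing: one number per sample.
print(len(solo), len(ensemble))  # prints: 1000 1000
```

Whether one instrument or fifteen went in, what comes out is the same length and the same format: a single stream of numbers per channel. The playback chain only ever sees that stream. The waveform from the 15-tone mix is more complicated in shape, of course, but there's no "15" anywhere in the data for the speakers to struggle with.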
But as for the original question, I hope someone with much more knowledge than I have can weigh in on why a recording with less information in it seems easier to make sound good.