The even-order distortion of the driver section is cancelled in the primary of the interstage transformer, reducing driver distortion even further.
@lynn_olson Actually if the circuit is fully balanced/differential from input to output, even orders are cancelled at every stage along the way. In this manner distortion is compounded less from stage to stage.
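To put rough numbers on the even-order cancellation, here is a minimal sketch in plain Python. The device curve (`single_ended`) and its coefficients are made up purely for illustration, and the balanced stage is idealized as two perfectly matched devices driven in antiphase with their outputs subtracted. Real tubes are never perfectly matched, so in practice the cancellation is partial rather than total.

```python
import math

def harmonic_amplitudes(transfer, n_harmonics=4, n_samples=1024):
    """Amplitudes of harmonics 1..n_harmonics of transfer(sin(phase)), via a direct DFT."""
    amps = []
    for k in range(1, n_harmonics + 1):
        re = im = 0.0
        for n in range(n_samples):
            phase = 2 * math.pi * n / n_samples
            y = transfer(math.sin(phase))
            re += y * math.cos(k * phase)
            im += y * math.sin(k * phase)
        amps.append(2 * math.hypot(re, im) / n_samples)
    return amps

def single_ended(x):
    # Hypothetical device curve: linear term plus small 2nd- and 3rd-order terms.
    return x + 0.1 * x**2 + 0.02 * x**3

def balanced(x):
    # Two identical devices driven in antiphase, outputs subtracted (idealized matching).
    return 0.5 * (single_ended(x) - single_ended(-x))

se = harmonic_amplitudes(single_ended)
bal = harmonic_amplitudes(balanced)

for k in range(4):
    print(f"harmonic {k + 1}: single-ended {se[k]:.6f}, balanced {bal[k]:.6f}")
```

In this toy model the 2nd harmonic drops out of the balanced stage entirely while the 3rd is left untouched, which is why the 3rd ends up as the dominant product.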
The result is that the 3rd harmonic is the dominant distortion product rather than the 2nd. Many people do not realize that the ear treats the 3rd in much the same way as the 2nd; it's the only odd-ordered harmonic that is musical to the ear. The 3rd is very good at masking higher-ordered harmonics.
BTW, any properly functioning analog tape machine will have a 3rd harmonic as its dominant distortion product as the tape approaches saturation.
Mathematically, this type of distortion can be described as a 'cubic non-linearity' as opposed to the 'quadratic non-linearity' of an SET. As Daniel Cheever pointed out in his paper from 1989, it's important that the harmonics fall off on an exponential curve. Both an SET and a fully balanced amplifier can do this (it's part of the reason people regard SETs as musical despite their many failings). The advantage of a balanced circuit is that the harmonics fall off at a faster exponential rate, so higher-ordered harmonics are at a lower level than seen in an SET; it's inherently lower distortion.
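For anyone who wants to see where the terms 'quadratic' and 'cubic' come from, here is the textbook expansion for a sine input. The symbols $a_1$, $a_2$, $a_3$ (polynomial coefficients of the transfer curve) and $A$ (input amplitude) are my own notation, and the two-term models are idealizations; a real amplifier has additional higher-order terms.

```latex
% Input: x(t) = A \sin\omega t
% Quadratic non-linearity (SET-like): adds a DC shift and a 2nd harmonic
y = a_1 x + a_2 x^2
  = \frac{a_2 A^2}{2} + a_1 A \sin\omega t - \frac{a_2 A^2}{2}\cos 2\omega t

% Cubic non-linearity (balanced): adds a 3rd harmonic and no even orders
y = a_1 x + a_3 x^3
  = \left(a_1 A + \frac{3 a_3 A^3}{4}\right)\sin\omega t - \frac{a_3 A^3}{4}\sin 3\omega t
```

The 2nd-harmonic amplitude scales with $A^2$ and the 3rd with $A^3$, so in these idealized models both fall away quickly as the signal level drops.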
This allows the distortion signature to be innocuous.
The advantage is greater power output with lower distortion. So at any power level an SET can make, in a circuit using the same tubes the PP amp can have vastly lower distortion and so be smoother with greater detail, since distortion obscures detail.
By the way, cascoding the input section is how you get both voltage resistance and a hundredfold reduction in Miller capacitance. Since Miller capacitance in transistors is grossly nonlinear, this is a very good idea. Tubes exhibit Miller capacitance too, but it is an order of magnitude lower, and it is stable and predictable instead of being nonlinear. There are cascode tube circuits as well, but they offer no improvement in linearity (unlike transistors), and are mostly seen in phono preamps and FM tuner input sections. In the tube universe, pentodes behave similarly to a pair of cascoded triodes, and are more commonly used when a cascode is called for.
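A back-of-envelope sketch of the Miller arithmetic, using the standard approximation that a feedback capacitance across an inverting stage appears at the input multiplied by (1 + gain). The component values below are hypothetical, picked only so the numbers come out round; they are not taken from any particular device.

```python
def miller_input_capacitance(c_feedback_pf, stage_gain):
    """Effective input capacitance (pF) due to a feedback capacitance across an inverting stage."""
    return c_feedback_pf * (1.0 + abs(stage_gain))

# Hypothetical values, for illustration only.
C_FEEDBACK_PF = 2.0          # collector-base (or plate-grid) capacitance
GAIN_PLAIN_STAGE = 199.0     # voltage gain of an ordinary common-emitter stage
GAIN_CASCODE_LOWER = 1.0     # lower device of a cascode sees roughly unity voltage gain

plain = miller_input_capacitance(C_FEEDBACK_PF, GAIN_PLAIN_STAGE)
cascoded = miller_input_capacitance(C_FEEDBACK_PF, GAIN_CASCODE_LOWER)
reduction = plain / cascoded

print(f"plain stage: {plain:.0f} pF, cascode: {cascoded:.0f} pF, reduction: {reduction:.0f}x")
```

Because the upper device holds the lower device's output at a nearly fixed voltage, the multiplier collapses from (1 + gain) to about 2, which is where a reduction on the order of a hundredfold comes from when the stage gain is in the hundreds.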
We've been using a differential cascode circuit for decades. It has several advantages over pentode or cascade operation; one obvious one being a reduction in the need for a coupling capacitor. Differential circuits benefit from the devices (tubes in this case) having a lot of gain. That increases the differential effect so distortion cancellation is improved and noise is reduced. A differential cascode circuit can have a lot of gain.
This circuit can have a very high CMRR even in a tube embodiment. It's linear enough open loop that you can run it without feedback (something you can't do with pentodes), but if you want to, it's possible to operate it in ultra-linear mode, where the plate voltage of the top tube is applied through a divider network to the grid of the same tube. You can do that with a pentode too, but you can't run a pentode with zero feedback, and the amount of feedback available in UL mode is limited.
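For reference, CMRR is just the ratio of differential gain to common-mode gain, usually quoted in dB. A minimal sketch follows; the gain figures are hypothetical and are not measurements of the circuit described above.

```python
import math

def cmrr_db(differential_gain, common_mode_gain):
    """Common-mode rejection ratio in dB: 20 * log10(|A_diff / A_cm|)."""
    return 20.0 * math.log10(abs(differential_gain / common_mode_gain))

# Hypothetical gains, for illustration only.
A_DIFF = 100.0   # gain for the wanted (differential) signal
A_CM = 0.01      # gain for noise common to both inputs

rejection = cmrr_db(A_DIFF, A_CM)
print(f"CMRR is about {rejection:.1f} dB")
```

The more differential gain the devices provide relative to what leaks through as common-mode signal, the higher the figure, which is the point made earlier about differential circuits benefiting from high-gain devices.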
Since a cascode circuit has lower distortion, another advantage is that it can be used in a circuit with feedback and generate fewer higher-ordered harmonics than a pentode would.