With respect to audio playback, upsampling and oversampling are essentially the same operation. In the DSP world you can usually find it explained under the topic of multirate interpolation. Zeros are inserted between the incoming samples to bring the data up to the desired sampling rate (zero-padding), and a digital low-pass filter then interpolates the missing values. The interpolation step is not a ‘guess’; it’s more akin to a ‘recovery’ of the sample values you would have had if you had originally sampled the band-limited signal at the higher rate. When the interpolation filter’s impulse response is convolved with the zero-padded data, each output at a padded position is a weighted sum of the neighbouring original samples that fall within the filter’s length, with the filter coefficients as the weights, and that sum is what fills in the zero.
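To make the zero-padding and filtering steps concrete, here is a minimal sketch in plain Python. The function names (windowed_sinc_lowpass, zero_stuff, convolve_centered, upsample), the Hann window, and the default of 63 taps are my own choices for illustration, not something a particular DAC or library necessarily uses.

```python
import math

def windowed_sinc_lowpass(num_taps, cutoff):
    """Hann-windowed sinc low-pass; cutoff is in cycles/sample (0 < cutoff < 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def zero_stuff(samples, factor):
    """Insert factor - 1 zeros after every input sample."""
    out = [0.0] * (len(samples) * factor)
    out[::factor] = samples
    return out

def convolve_centered(signal, taps):
    """Convolve and compensate the filter delay so the output lines up with the input."""
    mid = (len(taps) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = i + mid - k
            if 0 <= j < len(signal):
                acc += h * signal[j]
        out.append(acc)
    return out

def upsample(samples, factor, num_taps=63):
    """Zero-stuff by an integer factor, then low-pass at the original Nyquist frequency."""
    # Gain of `factor` makes up for the energy lost to the inserted zeros.
    taps = [factor * h for h in windowed_sinc_lowpass(num_taps, 0.5 / factor)]
    return convolve_centered(zero_stuff(samples, factor), taps)
```

Calling upsample(x, 4) on a slowly varying sine returns four times as many samples. With this particular filter the original samples pass through unchanged and the padded zeros are replaced by values close to what sampling the sine at the higher rate would have given, apart from some error near the ends of the data where the filter runs out of neighbours.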
Resampling by a non-integer factor doesn’t involve the system clock. It is upsampling (zero-padding) by an integer factor, followed by a low-pass filter that serves as both the interpolation filter and the anti-aliasing filter, followed by downsampling by another integer factor. The resampling ratio is the upsampling factor divided by the downsampling factor. Here is a link to a good overview: https://www.eetimes.com/multirate-dsp-part-2-noninteger-sampling-factors/
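As a rough illustration of the upsample, filter, downsample chain, here is a sketch in plain Python. The names lowpass_taps and resample, the Hann window, and the taps_per_phase parameter are assumptions of mine; production resamplers use more carefully designed filters and polyphase structures.

```python
import math
from fractions import Fraction

def lowpass_taps(num_taps, cutoff):
    """Hann-windowed sinc low-pass; cutoff is in cycles/sample (0 < cutoff < 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def resample(samples, up, down, taps_per_phase=16):
    """Resample by the ratio up/down: zero-stuff by `up`, low-pass, keep every `down`-th sample."""
    num_taps = 2 * taps_per_phase * max(up, down) + 1
    # One filter does both jobs: it removes the images created by zero-stuffing
    # and band-limits the signal ahead of the downsampling step.
    taps = [up * h for h in lowpass_taps(num_taps, 0.5 / max(up, down))]
    mid = (num_taps - 1) // 2
    out = []
    for m in range(len(samples) * up // down):
        center = m * down  # position of this output on the intermediate (zero-stuffed) grid
        acc = 0.0
        for k, h in enumerate(taps):
            j = center + mid - k
            # Only every `up`-th intermediate sample is nonzero, so skip the stuffed zeros.
            if j % up == 0 and 0 <= j // up < len(samples):
                acc += h * samples[j // up]
        out.append(acc)
    return out

# Converting 44.1 kHz material to 48 kHz needs the ratio 48000/44100.
ratio = Fraction(48000, 44100)  # reduces to 160/147
up, down = ratio.numerator, ratio.denominator
```

So a 44.1 kHz to 48 kHz conversion is upsampling by 160 and downsampling by 147, with no clock involved. For a quick experiment, resample(x, 3, 2) is much cheaper to run than the 160/147 case in this naive form.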
The computational load depends on how many coefficients the digital filter has (i.e. its length). The ideal interpolation filter for audio purposes has the sinc function as its impulse response. However, it is impossible to implement exactly because it would need an infinite number of coefficients. A practical digital filter is therefore always an approximation, and there are many approaches and algorithms for constructing these filters. Also note that, in general, the longer the filter, the more delay it introduces into the reproduction process.
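To put rough numbers on the length versus cost and delay trade-off, here is a small sketch. It assumes a symmetric (linear-phase) FIR filter, for which the delay is (N - 1) / 2 samples; minimum-phase designs behave differently. The function names and the 255-tap, 4x oversampling figures are made up for illustration.

```python
import math

def fir_delay_seconds(num_taps, sample_rate):
    """Delay of a symmetric (linear-phase) FIR filter running at sample_rate (Hz)."""
    return (num_taps - 1) / 2 / sample_rate

def macs_per_second(num_taps, sample_rate, up_factor=1):
    """Approximate multiply-accumulates per second at the output rate.

    When the input was zero-stuffed by up_factor, only about
    num_taps / up_factor products per output sample are nonzero.
    """
    return math.ceil(num_taps / up_factor) * sample_rate

# Hypothetical case: 4x oversampling of 44.1 kHz audio with a 255-tap filter.
out_rate = 4 * 44100
delay = fir_delay_seconds(255, out_rate)
load = macs_per_second(255, out_rate, up_factor=4)
```

With these assumed numbers the delay comes out at about 0.72 ms. Doubling the filter length roughly doubles both the delay and the arithmetic per output sample.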
Here is a free online book on DSP that you’ll probably find useful:
Here are some informative videos that explain some of the fundamental concepts of digital video / audio: