That one: https://github.com/espressif/ESP8266_MP3_DECODER
It seems this code cannot accurately hit the common audio sample rates that we're used to.
Asking for 44100 Hz gives 44321 Hz, and asking for 48000 Hz gives 48019 Hz.
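
To put numbers on the mismatch, here is a minimal Python sketch that converts the two requested/actual pairs above into a percentage error and a pitch offset in cents. The helper names `percent_error` and `cents_offset` are made up for illustration; only the four rates come from the measurements above.

```python
import math

def percent_error(requested_hz, actual_hz):
    """Relative deviation of the actual rate from the requested one, in percent."""
    return (actual_hz - requested_hz) / requested_hz * 100.0

def cents_offset(requested_hz, actual_hz):
    """Pitch shift (in cents) when audio plays at actual_hz instead of requested_hz."""
    return 1200.0 * math.log2(actual_hz / requested_hz)

# The two (requested, actual) pairs reported above.
for requested, actual in [(44100, 44321), (48000, 48019)]:
    print(requested, actual,
          f"{percent_error(requested, actual):+.3f} %",
          f"{cents_offset(requested, actual):+.2f} cents")
```

So the 44100 case runs about half a percent fast, while the 48000 case is off by only a few hundredths of a percent.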
From the #defines in the code, it looks like some PLL is available, but the code never uses it.
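
For what it's worth, both figures can be reproduced by a plain integer-divider search. The sketch below is a model, not the repository's code: it assumes a 160 MHz base clock, two integer dividers in the range 1 to 63, and a word length that may be stretched from 16 up to 19 bits per channel. All of these constants and names are assumptions on my part.

```python
BASE_HZ = 160_000_000       # assumed I2S base clock
DIV_MAX = 63                # assumed upper limit of each integer divider
BITS_RANGE = range(16, 20)  # assumed bits per channel: 16..19

def best_rate(target_hz):
    """Brute-force the closest reachable rate; returns (rate, bits, div_a, div_b)."""
    best = None
    for bits in BITS_RANGE:
        for div_a in range(1, DIV_MAX + 1):
            # Only the product of the dividers matters, so start div_b at div_a.
            for div_b in range(div_a, DIV_MAX + 1):
                # Two channels per frame, hence bits * 2 clocks per sample.
                rate = BASE_HZ // (div_a * div_b * bits * 2)
                if best is None or abs(rate - target_hz) < abs(best[0] - target_hz):
                    best = (rate, bits, div_a, div_b)
    return best

for target in (44100, 48000):
    rate, bits, div_a, div_b = best_rate(target)
    print(target, "->", rate, f"(bits={bits}, dividers={div_a}x{div_b})")
```

Under these assumptions the search lands on 44321 for 44100 and 48019 for 48000, which matches the rates reported above, so the error looks like a limit of integer division rather than a bug.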
Is there any public information on how to hit exact audio sample rates?