Code that runs from flash goes through a small SRAM cache. So when the code is not in the cache, it has to be read from flash into the cache before the CPU can execute it. (All automatic. In hardware. Impressive.)
Just maybe this is what you're seeing? Sometimes your code gets aged out of the cache and has to be re-read.
Try to NOT inline the sensitive code.
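Also, do NOT tag it with ICACHE_FLASH_ATTR. (That is the attribute that puts code into flash; ICACHE_RODATA_ATTR is the one for read-only data.)
Then the code goes into a special, fast SRAM that is separate from the cache.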
Might be worth a try. The fast SRAM isn't too big though. (Also 32k? Can't remember.)
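
Here is a rough sketch of what I mean, assuming the ESP8266 non-OS SDK convention where a function without a flash attribute ends up in the fast instruction SRAM. As far as I know, ICACHE_FLASH_ATTR is the SDK macro (from c_types.h) that puts a function into flash. The function names are made up for illustration, and the empty fallback macro is only there so the snippet compiles off-target. Other toolchains (the Arduino core, for example) may reverse the default, so check your linker map to see where things really land.

```c
#include <stdint.h>

/* On the ESP8266 non-OS SDK this macro comes from c_types.h and places a
 * function in the flash-mapped code section. The empty fallback below is an
 * assumption for building off-target only. */
#ifndef ICACHE_FLASH_ATTR
#define ICACHE_FLASH_ATTR
#endif

/* Hypothetical timing-sensitive routine.
 * No ICACHE_FLASH_ATTR, so (under the SDK convention assumed above) it is
 * linked into the fast instruction SRAM instead of going through the cache.
 * noinline (GCC/Clang attribute) keeps the compiler from merging it into a
 * caller that might itself live in flash. */
__attribute__((noinline))
uint32_t pulse_width_ticks(uint32_t start, uint32_t end)
{
    /* Unsigned subtraction still gives the right answer if the tick
     * counter wrapped around between start and end. */
    return end - start;
}

/* Hypothetical non-critical routine.
 * Tagged ICACHE_FLASH_ATTR so it stays in flash and does not use up the
 * small fast SRAM. An occasional cache miss does not matter here. */
uint32_t ICACHE_FLASH_ATTR average_ticks(const uint32_t *samples, uint32_t count)
{
    uint64_t sum = 0;
    uint32_t i;

    if (count == 0) {
        return 0; /* avoid dividing by zero on an empty sample set */
    }
    for (i = 0; i < count; i++) {
        sum += samples[i];
    }
    return (uint32_t)(sum / count);
}
```

Only move the few routines that really are timing-critical; everything else can stay in flash so the fast SRAM does not fill up.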