In my experience working with different MCUs and processors, this comes down to two things: how the drivers that access the flash are written, and the speed at which the hardware can reliably run. We know the flash supports quad data access, but if the clocks and timings are not set up properly, performance can suffer in a very big way. Since you've had some success, I would guess the code is OK, since once it works there shouldn't be a reason to change it.
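To put rough numbers on the clock point, here is a quick back-of-the-envelope sketch. It assumes single data rate (one bit per data line per clock) and ignores command, address, and dummy cycles, so real throughput will be lower. The function name and the clock values are only illustrative, not taken from your setup.

```python
def peak_read_throughput_mb_s(clock_hz: float, data_lines: int) -> float:
    """Upper bound on read throughput in MB/s (1 MB = 1,000,000 bytes).

    Assumes single data rate and ignores command/address/dummy overhead.
    """
    bits_per_second = clock_hz * data_lines  # one bit per line per clock
    return bits_per_second / 8 / 1_000_000


# Compare single-line and quad access at a few hypothetical clock speeds.
for clock_mhz in (10, 50, 100):
    single = peak_read_throughput_mb_s(clock_mhz * 1e6, 1)
    quad = peak_read_throughput_mb_s(clock_mhz * 1e6, 4)
    print(f"{clock_mhz} MHz: single {single:.2f} MB/s, quad {quad:.2f} MB/s")
```

The takeaway is that quad mode at a conservative clock can end up no faster than single-line mode at a high clock, so it is worth checking what clock the controller is actually running at before blaming the driver.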