@fastfinge My results show that a dedicated IPC server performs faster, e.g. the synthDriver is ready for use in just 5 seconds. Response is good even on longer sentences, but this can be attributed to the 4.2m model I'm using. And when I ran the model through a streaming vocoder, the response was surprisingly real-time, suitable for a screen reader. As for voice rate, I'm using a modified version of the good "audiostretchy" pip package. I can't give more details ATM, but I hope this helps in your research.