So this looks like a high-quality, fast, natural, open-source TTS system in Python. A key candidate for an #NVDA #addon. Unfortunately, I find #nvdasr addon development super confusing. Is there a good template to start from or something? github.com/thewh1teagle/kokoro-onnx
Yeah, I am deeply confused about how buffers work, how to indicate when speaking is complete, how to do indexing, and so on. If this is going to be an #NVDA addon, someone else will have to do it.
@fastfinge You need support from the synth for some features. This one doesn't have anything. Once it starts speaking, it blocks until it's done, so you can't interrupt it.
@tspivey That's why you start a session, so the model stays loaded in memory. Then I think you can actually stream output from onnxruntime bit by bit; I'm just not sure how.
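One way to get interruptible speech from a synth that blocks per call might be to feed it one sentence at a time and check a cancel flag between sentences. The sketch below is only an illustration of that idea, not how kokoro-onnx or NVDA actually work: `fake_synthesize` is a hypothetical stand-in for a real blocking TTS call, and `split_sentences` and `stream_speech` are made-up helper names.

```python
import re
import threading
from typing import Callable, Iterator, List


def split_sentences(text: str) -> List[str]:
    """Split text after sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def fake_synthesize(sentence: str) -> bytes:
    """Hypothetical stand-in for a blocking TTS call.

    Returns one zero byte per character instead of real audio.
    """
    return bytes(len(sentence))


def stream_speech(
    text: str,
    synthesize: Callable[[str], bytes],
    cancel: threading.Event,
) -> Iterator[bytes]:
    """Yield audio one sentence at a time, stopping once cancel is set."""
    for sentence in split_sentences(text):
        # Each synthesize() call still blocks, but only for one sentence,
        # so a cancel request takes effect at the next sentence boundary.
        if cancel.is_set():
            return
        yield synthesize(sentence)


if __name__ == "__main__":
    cancel = threading.Event()
    for i, audio in enumerate(stream_speech("One. Two! Three?", fake_synthesize, cancel)):
        print(f"chunk {i}: {len(audio)} bytes")
        if i == 0:
            cancel.set()  # simulate the user interrupting after the first chunk
```

The trade-off is that interruption latency is bounded by the longest sentence rather than the whole utterance, and sentence boundaries are also a natural place to report that an index was reached or that speaking is done.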