🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
Admin
completely blind computer geek, lover of science fiction and fantasy (especially LitRPG). I work in accessibility, but my opinions are my own, not those of my employer. Fandoms: Harry Potter, Discworld, My Little Pony: Friendship is Magic, Buffy, Dead Like Me, Glee, and I'll read fanfic of pretty much anything that crosses over with one of those.
keyoxide: aspe:keyoxide.org:PFAQDLXSBNO7MZRNPUMWWKQ7TQ
Location
Ottawa
Birthday
1987-12-20
Pronouns
he/him (EN)
xmpp fastfinge@im.interfree.ca
keyoxide aspe:keyoxide.org:PFAQDLXSBNO7MZRNPUMWWKQ7TQ
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@amberhinds To be fair to your step dad, Karen and Lee, the popular Australian voices found on most GPS devices as well as on iOS, are some of the nicest text to speech voices ever made for casual listening. They were originally developed by ScanSoft, later purchased by Nuance, then transferred to Cerence Automotive, and finally ended up owned by Microsoft. Largely this is due to whoever was in charge of recording and curating the data back in 2002. They did an excellent job editing and aligning the recordings for use with the concatenative synthesis technology that was available at the time, resulting in the Australian voices sounding noticeably better than all of the other English options, even though they all used the same underlying methods. Because the data capture was so high quality, those voices have remained a step ahead as technology and training methods have improved. The female version of the voice your step dad is almost certainly using is based on this woman: en.wikipedia.org/wiki/Karen_Jacobsen

If all of my favourite, fast and efficient voices were ripped away from me, those Australian voices are probably what I'd revert to. They're not as fast as I would like, but at least they're clear and accurate. Your step dad has good taste.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@phillycodehound I played briefly with this and it seemed okay on the surface, though I'm not looking for work so didn't go deep: github.com/rendercv/rendercv
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@luiscarlosgonzalez @cachondo @FreakyFwoof @amir It has the same problem with speed.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@luiscarlosgonzalez @cachondo @FreakyFwoof @amir I didn't try Kokoro, because it cannot achieve a real time factor of 1 on CPU. By that I mean, to be fit for consideration with a screen reader, a text to speech voice must be able to generate one second of speech in one second or faster. In general, Kokoro takes two seconds to generate one second of speech. So it's not suitable.
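To make the real time factor arithmetic concrete, here is a minimal Python sketch. The function names are hypothetical, and the 2.0 and 1.0 figures are just the rough "two seconds per one second of speech" estimate from the post, not a measured benchmark.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Time spent synthesizing divided by the duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def fast_enough_for_screen_reader(rtf: float) -> bool:
    # Threshold taken from the post: one second of speech must be
    # generated in one second or less, i.e. a real time factor of at most 1.
    return rtf <= 1.0


# Rough figures from the post: about 2 s of CPU time per 1 s of speech.
kokoro_rtf = real_time_factor(2.0, 1.0)
print(kokoro_rtf, fast_enough_for_screen_reader(kokoro_rtf))
```

A lower number is better here: a voice that needs half a second to produce one second of audio has a real time factor of 0.5 and passes the check.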
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@clv1 @jscholes @cachondo @FreakyFwoof @amir The issue is that both of these are effectively concatenative or parametric, rather than formant, systems. So they will never be as intelligible as Eloquence.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@VE3RWJ Shrug. Nobody else has reported that issue. Probably a false positive from Malwarebytes.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@jscholes @cachondo @FreakyFwoof @amir That's my assumption because the only things that really need a 32-bit compatibility layer are speech synthesizers and braille devices.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@FreakyFwoof @cachondo @amir Yeah, you can get AI to modify the 32-bit addon for you. That's how I got the first two Eloquence prototypes; it helped me understand the problem and what approaches would work and what wouldn't. If you give it the 32-bit Orpheus addon, and the 64-bit Eloquence addon, it should be able to understand the working approach to make an addon 64-bit, and make the modifications itself. The reason to give it the 64-bit Eloquence addon as an example is so it doesn't decide to go down the gRPC route and include protobuf and a bunch of other nonsense.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@jscholes @cachondo @FreakyFwoof @amir It was mentioned in the roadmap NVDA released a while back.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@cachondo @jscholes @FreakyFwoof @amir They don't have much choice. A lot of the libraries NVDA depends on are stopping 32-bit support this year.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@jscholes @cachondo @FreakyFwoof @amir My understanding is that when it comes to addons, it's going to require some kind of secure addons API/layer. And it won't be ready for 2026.1, or maybe not even 2026.2.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@FreakyFwoof @cachondo @amir You should be able to get either Gemini or Codex to help you, depending on what AI you have access to. The workflow would be:
1. Download gemini-cli or codex-cli, and get it installed and configured.
2. Clone all of the source code from github.com/fastfinge/eloquence_64/
3. Delete the tts.txt and tts.pdf files, so you don't confuse it with incorrect documentation.
4. Find any API documentation for Orpheus that's available, and add it into the folder.
5. Run codex-cli or gemini-cli, and tell it something like: "Using the information about how to develop NVDA addons you can find in agents.md, and the information about the Orpheus API I've provided in the file Orphius-documentation-filename.txt, I would like you to modify the code in this folder to work with Orpheus instead of Eloquence."

It will go away for five or ten minutes, ask you for permission to read and write the files it's interested in, and then give you something that mostly works. Now, build the addon, run it, and tell it about the errors and problems you have and ask it to fix them. In the case of errors, include the error right from the NVDA log, and for bugs and problems, tell it exactly what it's doing wrong, and exactly what you want it to do instead. Keep doing this until you wind up with a working addon.

Think of AI as a particularly stupid programmer, and you're the manager in charge of the project. You should be able to get this done without paying anyone.
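
The folder preparation in steps 3 and 4 above could be scripted along these lines. This is only a sketch: the function name and the idea of doing it in Python are my own assumptions, it assumes the repository has already been cloned, and it searches the whole folder because the post doesn't say where tts.txt and tts.pdf live.

```python
import shutil
from pathlib import Path

# Files the post says to delete so the AI isn't misled by wrong documentation.
STALE_DOCS = ("tts.txt", "tts.pdf")


def prepare_workspace(repo_dir, api_doc):
    """Remove the stale docs from a cloned repo and copy in the new API docs.

    repo_dir: path to the already-cloned eloquence_64 checkout.
    api_doc:  path to whatever Orpheus API documentation you found.
    Returns the path of the copied documentation file.
    """
    repo_dir = Path(repo_dir)
    api_doc = Path(api_doc)
    for name in STALE_DOCS:
        # Collect matches first so we don't delete while iterating.
        for stale in list(repo_dir.rglob(name)):
            stale.unlink()
    target = repo_dir / api_doc.name
    shutil.copy2(api_doc, target)
    return target
```

After running something like this, start the CLI tool inside the folder and give it the prompt from step 5, naming the documentation file you actually copied in.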
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@cachondo @amir I've heard from a second hand source that they are, yes. But I haven't verified that.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@pixelate @PepperTheVixen If you have a sample of someone talking while chewing gum, you can absolutely make that happen.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@pixelate @PepperTheVixen If you give chatterbox-tts an ASMR recording to clone, you can absolutely get it to make lip smacking noises.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@PepperTheVixen It's grating because, unlike Eloquence and DECtalk, eSpeak only uses formant synthesis for the vowel sounds. For consonants and plosives, it instead uses concatenative recordings based on human speech. That's why even when you switch to a voice that sounds less sharp, the "t", "b", "p", and other sounds are still too sharp. This seems to be the primary cause of the fatigue most people experience while using eSpeak.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@svenja I vaguely remember writing something like that a while ago. The list I linked has all of the games I remember putting in that comment, though.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@Landon205 There are already addons that do that.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@svenja Was it this? gist.github.com/Molitvan/50e3b5060ab9465b1da895155d5c0480
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
The State of Modern AI Text To Speech Systems for Screen Reader Users: The past year has seen an explosion in new text to speech engines based on neural networks, large language models, and machine learning. But has any of this advancement offered anything to those using screen readers? stuff.interfree.ca/2026/01/05/ai-tts-for-screenreaders.html