Note by @fastfinge

Tamas G @Tamasg@mindly.social

5mo

Probably the last build of NV Speech player, ever. Sorry y'all. I'm not sure if this will continue.
Last change-log:
- adopts the new sound from "experimental." This smooths out the voice, which no doubt some people will hate. If you do, just replace speechplayer.dll from an older build and you'll still get your sharper sound.
- huge language pack update to add glottalOpenQuotient set from vowel height, Voice turbulence, and to reduce noisier phonemes.
- removed driver clicking noise when rapid speech chunks occur.
eurpod.com/synths/nvSpeechPlayer-2026-v9.nvda-addon


🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

5mo

@Tamasg Disappointing, but okay. An Eloquence rewrite is a 30-year project. At least. You're stopping for the same reason NV Access themselves stopped: this is a massive undertaking. It's a bit frustrating to me that everyone who takes this on so massively underestimates how hard this will be, then quits when they don't have something in weeks or months. If we're going to succeed we have to be into this for decades.


Tamas G @Tamasg@mindly.social

5mo

@fastfinge so, more of a gradual tuning than a "day and night difference" in how it sounds kind of thing.


🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

5mo

@Tamasg Either that, or a case of building the tools to build the tools to do the thing. The phoneme editor is an excellent, perfect start. But I suspect we're going to need tools to help us tune the Klatt model any further. I don't think AI can get us much closer. But it might be able to help us build a tool to analyze the waveforms of the synths we like. We're probably going to also need a tool to help us tune the pitch/intonation table. If you look at the work of Dr. Susan Hertz, who built Eloquence, she didn't start by building Eloquence. She built SSRS, a system for creating and editing text-to-speech rules. Then she didn't like it and wrote Delta, a more powerful system. Delta was described as a hierarchical system for creating linguistic text-to-speech rules, where every rule could interact with rules on the levels above and below it. Based on her paper specifically on Eloquence, as well as her academic publication history, it looks like her team spent about 20 years writing tools, and then about five years writing Eloquence.


Tamas G @Tamasg@mindly.social

5mo

@fastfinge I just pushed an update that really improves things: no more sharp consonant spikes. This is really shaping up slowly, but it's going to be such an iterative process. V9 / latest master. Personally, I'm happier with the sound than I was with the original Speech Player, maybe that's what should count in the end. I had just this simple goal of reviving it, it wasn't until after I did so that I realized, "Oh this thing didn't even support American English?" that I began to take it on as a project, LOL.


🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

5mo

@Tamasg Yeah, NV Access never really took it beyond the prototype stage. But then a butchered version got added to eSpeak, so now in everyone's memories, we remember Speech Player as having more features than it did. And it's totally getting a bit better with every update. But the problem is, as long as Eloquence keeps working, everyone will just use Eloquence. But if we don't start this work now, when Eloquence is finally gone for good, we're all going to be screwed.


Tamas G @Tamasg@mindly.social

5mo

@fastfinge hey, I created two .py files in the tools folder:
1) klatt_tune_sim.py — the core single-phoneme synth sandbox
A small, self-contained Klatt-style formant synth simulator (16 kHz) that can synthesize one phoneme at a time from your packs/phonemes.yaml parameters. It implements two source models so you can compare voicing behavior.
It also prints a few spectral metrics (centroid and band energy splits) so you can track “brighter/darker” changes numerically, not just by ear.
Input: packs/phonemes.yaml and a --phoneme key (like a, ʃ, t͡s).
Output: optional WAV file + printed metrics.
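To make the "brighter/darker" metrics concrete, here is a minimal sketch of a spectral centroid and a low/mid/high band energy split. It is not the code from klatt_tune_sim.py: the function names, the band edges (1 kHz and 4 kHz), and the power weighting are all assumptions for illustration. The naive DFT keeps it standard-library only; numpy.fft.rfft would be the usual choice for longer signals.

```python
import cmath
import math

SAMPLE_RATE = 16000  # the simulator is described as running at 16 kHz


def power_spectrum(samples):
    """Return power per DFT bin for bins 0..N/2 (naive O(N^2) DFT)."""
    n = len(samples)
    power = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        power.append(abs(acc) ** 2)
    return power


def spectral_metrics(samples, sample_rate=SAMPLE_RATE, band_edges=(1000.0, 4000.0)):
    """Power-weighted centroid in Hz plus low/mid/high energy fractions.

    band_edges are hypothetical split points, not values from the real tool.
    """
    if not samples:
        raise ValueError("need at least one sample")
    power = power_spectrum(samples)
    n = len(samples)
    freqs = [k * sample_rate / n for k in range(len(power))]
    total = sum(power) or 1.0  # avoid dividing by zero on pure silence
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    low_edge, high_edge = band_edges
    split = {"low": 0.0, "mid": 0.0, "high": 0.0}
    for f, p in zip(freqs, power):
        if f < low_edge:
            split["low"] += p / total
        elif f < high_edge:
            split["mid"] += p / total
        else:
            split["high"] += p / total
    return centroid, split


def sine(freq_hz, n=512, sample_rate=SAMPLE_RATE):
    """Test tone standing in for a synthesized phoneme."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


if __name__ == "__main__":
    for label, tone in (("darker", sine(500)), ("brighter", sine(5000))):
        centroid, split = spectral_metrics(tone)
        print(label, round(centroid), {k: round(v, 3) for k, v in split.items()})
```

A higher centroid and a larger "high" fraction both point toward a brighter sound, so logging them before and after a parameter change gives a number to set beside the listening impression.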
2) ipa_klatt_probe.py — a rough phrase “ear-test” harness built on the simulator
What it is:
A helper script that:
1. Gets IPA from eSpeak for a word/phrase (or accepts IPA directly),
2. Applies a few tiny normalization tweaks similar to what you’d do in a pack,
3. Tokenizes the IPA into phoneme keys from packs/phonemes.yaml,
4. Synthesizes the phrase by concatenating per-phoneme audio generated via klatt_tune_sim.py.
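The tokenization in step 3 can be sketched as a greedy longest-match scan, so that a multi-character key such as t͡s wins over a bare t. This is a guess at one reasonable approach, not the actual code in ipa_klatt_probe.py; the function name and the small key set below are hypothetical stand-ins for the keys loaded from packs/phonemes.yaml.

```python
def tokenize_ipa(ipa, phoneme_keys):
    """Split an IPA string into known phoneme keys, longest match first.

    Returns (tokens, unknown), where unknown collects characters that
    matched no key (stress marks, for example).
    """
    max_len = max((len(key) for key in phoneme_keys), default=0)
    tokens, unknown = [], []
    i = 0
    while i < len(ipa):
        # Try the longest candidate first, then shrink one code point at a time.
        for size in range(min(max_len, len(ipa) - i), 0, -1):
            chunk = ipa[i:i + size]
            if chunk in phoneme_keys:
                tokens.append(chunk)
                i += size
                break
        else:
            unknown.append(ipa[i])
            i += 1
    return tokens, unknown


if __name__ == "__main__":
    # Hypothetical key set; "t\u0361s" is t + combining tie bar + s.
    keys = {"t", "s", "a", "\u0283", "t\u0361s"}
    print(tokenize_ipa("\u02c8t\u0361sa\u0283", keys))
```

Reporting the unmatched characters separately, instead of dropping them silently, makes it easier to spot phonemes that a pack does not define yet.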
I hope these tools can help us.


🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

5mo

@Tamasg I think so. Next thing to do is build similar tools for Eloquence, to make doing side-by-side comparisons of the exact same phonemes easier. Then we can rapidly do listening tests, instead of switching back and forth in NVDA, and having to compare the entire system.
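One low-tech way to speed up listening tests is to splice two renderings of the same phoneme into a single A/B file, with a short silence between them. The sketch below assumes both synths have already been rendered to PCM WAV files in the same format; how to capture Eloquence output is not covered here. The function names and file names are hypothetical, and only the standard-library wave module is used.

```python
import math
import os
import struct
import tempfile
import wave


def build_ab_wav(path_a, path_b, out_path, gap_seconds=0.5):
    """Write A, a short silence, then B to out_path. Returns the frame count."""
    with wave.open(path_a, "rb") as wav_a, wave.open(path_b, "rb") as wav_b:
        fmt_a = (wav_a.getnchannels(), wav_a.getsampwidth(), wav_a.getframerate())
        fmt_b = (wav_b.getnchannels(), wav_b.getsampwidth(), wav_b.getframerate())
        if fmt_a != fmt_b:
            raise ValueError(f"format mismatch: {fmt_a} vs {fmt_b}")
        frames_a = wav_a.readframes(wav_a.getnframes())
        frames_b = wav_b.readframes(wav_b.getnframes())
    channels, width, rate = fmt_a
    gap_frames = round(gap_seconds * rate)
    # 8-bit PCM is unsigned, so its silence is 0x80; wider PCM is signed.
    silence = (b"\x80" if width == 1 else b"\x00") * (gap_frames * channels * width)
    with wave.open(out_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(width)
        out.setframerate(rate)
        out.writeframes(frames_a + silence + frames_b)
    return (len(frames_a) + len(silence) + len(frames_b)) // (channels * width)


def write_tone(path, freq_hz, n_frames, rate=16000):
    """Write a mono 16-bit sine tone, standing in for a rendered phoneme."""
    samples = (int(12000 * math.sin(2 * math.pi * freq_hz * t / rate))
               for t in range(n_frames))
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(b"".join(struct.pack("<h", s) for s in samples))


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    a = os.path.join(folder, "synth_a.wav")
    b = os.path.join(folder, "synth_b.wav")
    write_tone(a, 220, 4000)
    write_tone(b, 330, 4000)
    print(build_ab_wav(a, b, os.path.join(folder, "ab.wav")))
```

Refusing mismatched formats, rather than resampling, keeps the comparison honest: any difference heard in the A/B file comes from the synths and not from a conversion step.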
