🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
So this looks like a high quality, fast, natural, and open source TTS system in Python. A key candidate for an NVDA addon. Unfortunately, I find addon development super confusing. Is there a good template to start from or something? github.com/thewh1teagle/kokoro-onnx
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
Here's a much longer example of the quality of speech Kokoro TTS generates. I really do think it might be a decent addon. The weird pauses are because I'm just giving it a big long string, rather than chunking it like I should. It generates this in real time on CPU, and faster on GPU. The code to generate it is as follows:
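# Dependencies: soundfile, kokoro_onnx, and onnxruntime, plus the model and voices files.
import soundfile as sf
from kokoro_onnx import Kokoro
from onnxruntime import InferenceSession

# Providers are listed in priority order: ROCm (AMD GPU) first, then CPU.
session = InferenceSession("kokoro-v0_19.onnx", providers=["ROCMExecutionProvider", "CPUExecutionProvider"])
kokoro = Kokoro.from_session(session, "voices.json")
# The whole passage goes in as one string, with no chunking.
samples, sample_rate = kokoro.create(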
"He wasn't sleeping very well, and he knew the people around him noticed, but he didn't know what to do about it. He had quietly gone to Madame Pomfrey, who had regretfully told him that Dreamless Sleep was highly addicting and that while she could give him the occasional dose, it would have to be spread out enough to prevent it from becoming addicting – meaning he could only take it one night out of every two weeks or so. It was one night more of productive sleep than he'd be getting otherwise, so he still did it, but it didn't help the larger issue. He wasn't under the effects of any nightmare-inducing Curses, potions, or other magical ailments, so there was nothing for Madame Pomfrey to do. The nightmares were coming from his own mind, and she was not a Mind-Healer. She'd offered to try and connect Harry with one, but when Harry discovered that it involved having someone else quite literally entering his mind with magic and helping him sort out things like trauma he couldn't. If Harry couldn't even tell Hermione the extent of what he'd suffered at the Dursley's, he wasn't about to let a stranger into his mind to see it. Let alone the 'adventures' of his Hogwarts years. So the nightmares persisted, and with the poor quality of sleep serving as the first domino, everything else slowly began to fall. His grades weren't slipping yet, but he was struggling with the study schedule Hermione had set out for them and doing his homework took more effort, more energy that he didn't have.", voice="af_sarah", speed=1.0, lang="en-us"
)
sf.write("audio.wav", samples, sample_rate)
print("Created audio.wav")
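The weird pauses come from handing the synth one long string. A minimal sketch of sentence-level chunking, assuming a rough regex split is good enough; `chunk_text` and `max_chars` are hypothetical names and not part of kokoro-onnx:

```python
import re

def chunk_text(text, max_chars=300):
    """Group sentences into chunks of at most max_chars where possible."""
    # Rough heuristic: split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then go through `kokoro.create(...)` as in the code above, with the resulting sample arrays joined before writing. A single sentence longer than `max_chars` is kept whole in this sketch.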
Sean Randall @cachondo@defcon.social
1y
@fastfinge she's quite pleasant. I regret that I have now read so much fanfic that I can't immediately identify that one, though.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo Here's a UK English sample of the same text. It sounds fine to my ear, but I'm not British. Thoughts?
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo Oh, and the fanfic is: Harry Potter and the Art of Getting Your Shit Together — by MsStarryNightSky.
Andre Louis @FreakyFwoof@universeodon.com
1y
@fastfinge @cachondo I don't think this is one @SeveraSnape and I have. Sounds new to me too and that's surprising considering how much I read.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@FreakyFwoof @SeveraSnape @cachondo It's a Harry/Hermione fluff fic, so probably something you'd both enjoy. I'm surprised you don't already have it. Harry Potter and the Art of Getting Your Shit Together
Posted originally on the Archive of Our Own at
archiveofourown.org/works/59310490.
Katy T @SeveraSnape@mastodon.social
1y
@FreakyFwoof @fastfinge @cachondo I don't think we do, but we shall in a minute.
Sean Randall @cachondo@defcon.social
1y
@SeveraSnape @FreakyFwoof @fastfinge still in progress, isn't it?
Andre Louis @FreakyFwoof@universeodon.com
1y
@SeveraSnape @fastfinge @cachondo I platonically love this girl. Always on top of things. Mad respect.
Sean Randall @cachondo@defcon.social
1y
@FreakyFwoof @SeveraSnape @fastfinge the dedication is awesome. I need a way of knowing when the in-progress things are done, seriously. Do I perhaps need to make an AO3 account or something?
Andre Louis @FreakyFwoof@universeodon.com
1y
@cachondo @SeveraSnape @fastfinge I had to, because I follow sooooo many things on both ff net and ao3, so I set up a rule to forward any HP-related emails to a dedicated folder. I set it such that if the emails do not contain the text 'Harry Potter' it immediately deletes them. It's cut down on my inbox clutter 10-fold if not more, and the folder of fics is purely HP-related. I loves it.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo @SeveraSnape @FreakyFwoof Yeah, and get fanficfare configured on a server somewhere. It can monitor an imap account, find emails from FF and AO3, and auto-update your epub files.
Sean Randall @cachondo@defcon.social
1y
@fastfinge @SeveraSnape @FreakyFwoof Blimey, that is clever.
Katy T @SeveraSnape@mastodon.social
1y
@cachondo @fastfinge @FreakyFwoof It does help clean up the inbox.
Katy T @SeveraSnape@mastodon.social
1y
@cachondo @FreakyFwoof @fastfinge For Andre's folder, I've begun putting status in there, I have visions of a changelog like I do for mine... but there's so much I'm doing to his folder right now it would just be crazy. But you would only need an account if you want to get author alerts. And... if you follow a bunch then you would get lots of emails.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@FreakyFwoof @SeveraSnape @cachondo Same. Although apparently she drinks coke in the morning instead of coffee! So I dunno... LOL JK
Katy T @SeveraSnape@mastodon.social
1y
@fastfinge @FreakyFwoof @cachondo hahahahaha!!!! I do indeed, but I drink tea too... does that count toward the good? :)
Andre Louis @FreakyFwoof@universeodon.com
1y
@SeveraSnape @fastfinge @cachondo It undoes a lot of the bad, let's just put it this way haha
Katy T @SeveraSnape@mastodon.social
1y
@FreakyFwoof @fastfinge @cachondo Oh man!!!! hahahahha!
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@SeveraSnape @cachondo @FreakyFwoof Depends. Peppermint? What kind of tea? LOL
Katy T @SeveraSnape@mastodon.social
1y
@fastfinge @cachondo @FreakyFwoof Black tea. Like mint too, though that's not my go-to. I like a bunch of different kinds.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@SeveraSnape @cachondo @FreakyFwoof Coffee is the one true caffeine delivery system, and mint tea is the only acceptable hot drink without caffeine. ROFL
Katy T @SeveraSnape@mastodon.social
1y
@fastfinge @cachondo @FreakyFwoof Coffee puts me right to sleep.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@SeveraSnape @cachondo @FreakyFwoof Really? I hear that's one of the primary symptoms of people having ADHD.
Katy T @SeveraSnape@mastodon.social
1y
@fastfinge @cachondo @FreakyFwoof So I've heard. I don't care to figure that out. I focus just fine on what I need to. hahaha.
Ather Jammoa @atherjammoa@mastodon.social
1y
@SeveraSnape @fastfinge @cachondo @FreakyFwoof You should add a dash of honey. You won't regret it!
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@atherjammoa @SeveraSnape @cachondo @FreakyFwoof That's what I do whenever I have a sore throat.
Ather Jammoa @atherjammoa@mastodon.social
1y
Katy T @SeveraSnape@mastodon.social
1y
JamminJerry @JamminJerry@mastodon.stickbear.me
1y
@SeveraSnape @FreakyFwoof @fastfinge @cachondo I am just waiting to see it show up in a folder somewhere, I want to read that one. sounds like a good one.
Katy T @SeveraSnape@mastodon.social
1y
@JamminJerry @FreakyFwoof @fastfinge @cachondo It's in Andre's under the author folder mentioned.
JamminJerry @JamminJerry@mastodon.stickbear.me
1y
@SeveraSnape @FreakyFwoof @fastfinge @cachondo ah, there it is. thanks for that.
JamminJerry @JamminJerry@mastodon.stickbear.me
1y
@SeveraSnape @FreakyFwoof @fastfinge @cachondo holy cow! those are going to be some very long chapters! around 1.7 meg file, and only 20 chapters. just wow!
Sean Randall @cachondo@defcon.social
1y
@fastfinge it's a very nice neutral sort of an accent. Goes a bit funny on the ends of some words; "one" and "so" are good examples in that sample. But I can see it being a great option for people who want more human-sounding voices.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo So looking at it, it looks like it just uses the phonemes generated by espeak, and passes those to the natural voices. So if you use a voice trained on American English, and ask for en-gb, it'll do it anyway and sound terrible.
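Since the phonemes come from espeak for whatever language is requested and are then handed to whichever voice is chosen, nothing prevents a mismatch. A minimal sketch of a guard, assuming the first letter of voice names such as `af_sarah` marks the accent (`a` for American, `b` for British); both that mapping and the helper `voice_matches_lang` are assumptions, not part of kokoro-onnx:

```python
# Assumed mapping from the first letter of a voice name to a language code.
ACCENT_PREFIX_TO_LANG = {"a": "en-us", "b": "en-gb"}

def voice_matches_lang(voice, lang):
    """Return True/False if the accent prefix is known, None if it is not."""
    expected = ACCENT_PREFIX_TO_LANG.get(voice[:1].lower())
    if expected is None:
        return None  # unknown prefix: the sketch has no opinion
    return expected == lang.lower()
```

A caller could log a warning when this returns False rather than refusing to speak.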
Sean Randall @cachondo@defcon.social
1y
@fastfinge haha that's rather funny.
One of the biggest complaints from users new to screen reading when I taught was the quality of the available voices. The school paid for vocaliser I think but that was as good as it got. I did get a few people onto the neural stuff, but it was in its infancy when I left.
This sounds really smooth in comparison.
Brandon @serrebi@tweesecake.social
1y
@fastfinge I agree it's pretty good
Tamas G @Tamasg@mindly.social
1y
@fastfinge so can it only write direct-to-file, or could it also send raw PCM data to a callback or have a way of reading a buffer it creates with that raw data? NVDA drivers would work infinitely simpler under that model. Sadly no real template for one exists beyond just looking at the code for drivers like DECtalk, Eloquence, Sonata, etc. and basing it off them to see which pattern best fits that synth's way of operating on things.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@Tamasg It returns samples by default. I'm just using a python library to write them to a file.
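Because the samples come back in memory, they could be handed to a player in small buffers instead of being written to a file. A rough sketch, assuming the samples are floats in the range -1.0 to 1.0 and that the consumer wants 16-bit mono PCM; `float_to_pcm16`, `stream_pcm`, and `on_chunk` are hypothetical names:

```python
import struct

def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian signed 16-bit PCM bytes."""
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    return struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))

def stream_pcm(samples, on_chunk, frames_per_chunk=1024):
    """Call on_chunk(bytes) once for each successive slice of the samples."""
    for start in range(0, len(samples), frames_per_chunk):
        on_chunk(float_to_pcm16(samples[start:start + frames_per_chunk]))
```

Here `on_chunk` stands in for whatever feeds the audio device. The same loop works on any sequence that supports `len()` and slicing.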
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@Tamasg @tspivey So it looks like the repo is still super active. For this to be an addon, we want: streaming samples in real time, and indication of speech starting and stopping. Anything else? I can open an issue to ask.
Tamas G @Tamasg@mindly.social
1y
@fastfinge @tspivey I think yeah, a way to inject stop sequences mid-speech as well (so we could call a shut-up or stop from the main thread during playback) - having callbacks for stop can be nice, sometimes we can gather that just on the basis of the audio buffer closing itself if that's done in realtime with speech fragments.
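One way to picture the stop-from-another-thread idea is a playback loop that checks a flag between chunks and reports start and stop through callbacks. This is only a sketch under those assumptions; `InterruptiblePlayer` and its callbacks are hypothetical and not taken from NVDA or kokoro-onnx:

```python
import threading

class InterruptiblePlayer:
    """Feeds audio chunks to a sink until finished or stop() is called."""

    def __init__(self, sink, on_start=None, on_stop=None):
        self._sink = sink          # callable that accepts one audio chunk
        self._on_start = on_start  # called once before the first chunk
        self._on_stop = on_stop    # called once with True if interrupted
        self._stop_event = threading.Event()

    def stop(self):
        """Ask playback to end before the next chunk; safe from any thread."""
        self._stop_event.set()

    def play(self, chunks):
        """Play chunks in order; return True if finished, False if interrupted."""
        self._stop_event.clear()
        if self._on_start:
            self._on_start()
        interrupted = False
        for chunk in chunks:
            if self._stop_event.is_set():
                interrupted = True
                break
            self._sink(chunk)
        if self._on_stop:
            self._on_stop(interrupted)
        return not interrupted
```

Smaller chunks mean the stop request is noticed sooner, at the cost of more calls into the sink.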
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@Tamasg @tspivey Issue is here if anyone wants to chime in: github.com/thewh1teagle/kokoro-onnx/issues/13
Lynette @lynessence@caneandable.social
1y
@fastfinge Oh that’s an 11 labs voice. Nice.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@lynessence Is it? I don't know what their built-in voices are now. Guess that's where they got the training data.
Lynette @lynessence@caneandable.social
1y
@fastfinge Yes. I just listened to a book using that voice in ElevenReader. It’s not bad.
Lynette @lynessence@caneandable.social
1y
@fastfinge I have tried to speed those 11 lab voices up quite a bit using ElevenReader, and I don’t really like the results. Curious to hear what you think.
Serena 🏳️‍🌈 @SerenaTori@dragonscave.space
1y
@fastfinge @FreakyFwoof Yeah, that sounds amazing. I would love to read stuff with that synthesiser.
Andre Louis @FreakyFwoof@universeodon.com
1y
@SerenaTori @fastfinge Now to gently gently request @Tamasg to make it a reality... Haha
Tamas G @Tamasg@mindly.social
1y
@FreakyFwoof @SerenaTori @fastfinge ha. I know very little about how we could get it compiled right in the add-on. (I know there was a discussion of this earlier so if that build process for onnxruntime into the add-on succeeded, would love some basic copy then.) For anyone wanting to try, I think looking at something like the Brailab driver (which is super minimal and in the end all you're really going to use are the getters and setters for the synth driver, the way you do speech is obviously not at all like Brailab), and then crafting in to open the stream might work. But between the latest family emergency, work at Spotify with the new year / new projects, I'm afraid I'll be swamped for a while to give it that truly comparative look. I'd also love to see a test run at how quick it can synthesize speech on slower CPUs especially when that speech is interrupted mid-utterance - how does it handle stopping a stream and loading a new one, is there lots of latency? A simple py test that just throws lots of speech chunks like that, stops, starts, would give us an idea maybe to then know if it's worth turning into a driver just yet.
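The kind of test described here could look something like the following. It is a sketch only: `measure_interrupt_latency` is a hypothetical harness, and `fake_synth` is a stand-in that would need replacing with a real streaming synthesis call before the numbers mean anything.

```python
import time

def measure_interrupt_latency(synthesize, utterances, interrupt_after=1):
    """Time how long each utterance takes to yield its first audio chunk.

    synthesize(text) must return an iterator of audio chunks. Each utterance
    is abandoned after interrupt_after chunks, imitating interrupted speech.
    """
    latencies = []
    for text in utterances:
        started = time.perf_counter()
        first_chunk_at = None
        for index, _chunk in enumerate(synthesize(text), start=1):
            if first_chunk_at is None:
                first_chunk_at = time.perf_counter()
            if index >= interrupt_after:
                break  # drop the rest of this stream and start the next one
        if first_chunk_at is not None:
            latencies.append(first_chunk_at - started)
    return latencies

def fake_synth(text):
    """Stand-in engine: yields one chunk per word after a short delay."""
    for word in text.split():
        time.sleep(0.001)
        yield word.encode()
```

Running it over many short utterances on an older CPU would give a list of time-to-first-audio values to compare between engines.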
Andre Louis @FreakyFwoof@universeodon.com
1y
@Tamasg @SerenaTori @fastfinge Sorry to hear about family emergencies, never nice to deal with. I hope things can be sorted out for the better.

Re slow CPU though, that's where I come in. I am right now even, using an Intel Core i5-3570K from 2012. It runs every synth very well, apart from Piper which it struggles with due to the neural aspect of it. If my machine can run... Whatever you guys end up coming up with (hopefully) then anything else should be a breeze.
Mira 🤞🇧🇬🇭🇺 @tardis@tardis.pw
1y
@FreakyFwoof @Tamasg @SerenaTori @fastfinge I have an even slower one. Yay for countries in the middle of... Well somewhere, and computers from 2009 haha if something can even run on that, I'd be surprised. How's that for a slow processor? It's pretty ancient. The synth sounds nice, yeah, don't like how it reads hashtag, but I guess that's me. There's also something about question marks it clearly missed, but I think it needs to be fed a bigger chunk of text to see if it'll sound better. Otherwise, for the quality, Bleh, either my ears, or something, do not consider it a great quality in the sound terms, but for a TTS, I guess it's good. says the person who daily drives a TTS that came out in 2001. LOL.
Andre Louis @FreakyFwoof@universeodon.com
1y
@tardis @Tamasg @SerenaTori @fastfinge What's your CPU spec then, and your daily synth of choice?
Mira 🤞🇧🇬🇭🇺 @tardis@tardis.pw
1y
@FreakyFwoof @Tamasg @SerenaTori @fastfinge A synth that does English people no good. Haha. And I have a Dell from 2009, it still has a 32-bit Windows 10 version, so it tells you something. :D
Mira 🤞🇧🇬🇭🇺 @tardis@tardis.pw
1y
@FreakyFwoof @Tamasg @SerenaTori @fastfinge I also cannot tell you the full specs. Computer not here, sadly. It has a removable battery though, that gave up a long time ago, then I fell down some stairs while carrying said computer, and the pixels in the screen went poof, and no screen.
Andre Louis @FreakyFwoof@universeodon.com
1y
@fastfinge This is nice. Yeah, this would be a nice addon.
JamminJerry @JamminJerry@mastodon.stickbear.me
1y
@fastfinge oh wow! I really really like that voice! that would be awesome for reading with.