🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
So this looks like a high quality, fast, natural, and open source TTS system in Python. A key candidate for an NVDA addon. Unfortunately, I find addon development super confusing. Is there a good template to start from or something? github.com/thewh1teagle/kokoro-onnx
🍂Melissa🌠 @EmeraldRose@dragonscave.space
1y
@fastfinge Ooh, that does sound good. I'd use that.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
Here's a much longer example of the quality of speech Kokoro TTS generates. I really do think it might be a decent addon. The weird pauses are because I'm just giving it a big long string, rather than chunking it like I should. It generates this in real time on CPU, and faster on GPU. The code to generate it is as follows:
import soundfile as sf
from kokoro_onnx import Kokoro
from onnxruntime import InferenceSession

session = InferenceSession("kokoro-v0_19.onnx", providers=["ROCMExecutionProvider", "CPUExecutionProvider"])
kokoro = Kokoro.from_session(session, "voices.json")
samples, sample_rate = kokoro.create(
"He wasn't sleeping very well, and he knew the people around him noticed, but he didn't know what to do about it. He had quietly gone to Madame Pomfrey, who had regretfully told him that Dreamless Sleep was highly addicting and that while she could give him the occasional dose, it would have to be spread out enough to prevent it from becoming addicting – meaning he could only take it one night out of every two weeks or so. It was one night more of productive sleep than he'd be getting otherwise, so he still did it, but it didn't help the larger issue. He wasn't under the effects of any nightmare-inducing Curses, potions, or other magical ailments, so there was nothing for Madame Pomfrey to do. The nightmares were coming from his own mind, and she was not a Mind-Healer. She'd offered to try and connect Harry with one, but when Harry discovered that it involved having someone else quite literally entering his mind with magic and helping him sort out things like trauma he couldn't. If Harry couldn't even tell Hermione the extent of what he'd suffered at the Dursley's, he wasn't about to let a stranger into his mind to see it. Let alone the 'adventures' of his Hogwarts years. So the nightmares persisted, and with the poor quality of sleep serving as the first domino, everything else slowly began to fall. His grades weren't slipping yet, but he was struggling with the study schedule Hermione had set out for them and doing his homework took more effort, more energy that he didn't have.", voice="af_sarah", speed=1.0, lang="en-us"
)
sf.write("audio.wav", samples, sample_rate)
print("Created audio.wav")
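The post blames the odd pauses on passing one long string instead of chunking it. A minimal sketch of sentence-based chunking follows. The helper name `chunk_text`, the `max_chars` default, and the crude split-on-punctuation rule are all assumptions for illustration, not part of kokoro-onnx.

```python
import re


def chunk_text(text, max_chars=300):
    """Group sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after '.', '!' or '?' followed by whitespace. This is a crude
    # rule and will mis-split abbreviations such as "Dr. Smith".
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then go through the same `kokoro.create(...)` call used above, with the returned sample arrays joined before writing the file. Whether that removes the pauses is untested here.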
Lynette @lynessence@caneandable.social
1y
@fastfinge Oh that’s an 11 labs voice. Nice.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@lynessence Is it? I don't know what their built-in voices are now. Guess that's where they got the training data.
Lynette @lynessence@caneandable.social
1y
@fastfinge Yes. I just listened to a book using that voice in ElevenReader. It’s not bad.
Lynette @lynessence@caneandable.social
1y
@fastfinge I have tried to speed those 11 lab voices up quite a bit using ElevenReader, and I don’t really like the results. Curious to hear what you think.
Brandon @serrebi@tweesecake.social
1y
@fastfinge I agree it's pretty good
Sean Randall @cachondo@defcon.social
1y
@fastfinge she's quite pleasant. I regret that I have now read so much fanfic that I can't immediately identify that one, though.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo Here's a UK English sample of the same text. It sounds fine to my ear, but I'm not British. Thoughts?
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo Oh, and the fanfic is: Harry Potter and the Art of Getting Your Shit Together — by MsStarryNightSky.
Andre Louis @FreakyFwoof@universeodon.com
1y
@fastfinge @cachondo I don't think this is one @SeveraSnape and I have. Sounds new to me too and that's surprising considering how much I read.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@FreakyFwoof @SeveraSnape @cachondo It's a Harry/Hermione fluff fic, so probably something you'd both enjoy. I'm surprised you don't already have it. Harry Potter and the Art of Getting Your Shit Together
Posted originally on the Archive of Our Own at
archiveofourown.org/works/59310490.
Katy T @SeveraSnape@mastodon.social
1y
@FreakyFwoof @fastfinge @cachondo I don't think we do, but we shall in a minute.
Sean Randall @cachondo@defcon.social
1y
@SeveraSnape @FreakyFwoof @fastfinge still in progress, isn't it?
Andre Louis @FreakyFwoof@universeodon.com
1y
@SeveraSnape @fastfinge @cachondo I platonically love this girl. Always on top of things. Mad respect.
Sean Randall @cachondo@defcon.social
1y
@FreakyFwoof @SeveraSnape @fastfinge the dedication is awesome. I need a way of knowing when the in-progress things are done, seriously. Do I perhaps need to make an ao3 account or something?
Andre Louis @FreakyFwoof@universeodon.com
1y
@cachondo @SeveraSnape @fastfinge I had to, because I follow sooooo many things on both ff net and ao3, so I set up a rule to forward any HP-related emails to a dedicated folder. I set it such that if the emails do not contain the text 'Harry Potter' it immediately deletes them. It's cut down on my inbox clutter 10-fold if not more, and the folder of fics is purely HP-related. I loves it.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo @SeveraSnape @FreakyFwoof Yeah, and get fanficfare configured on a server somewhere. It can monitor an imap account, find emails from FF and AO3, and auto-update your epub files.
Sean Randall @cachondo@defcon.social
1y
@fastfinge @SeveraSnape @FreakyFwoof Blimey, that is clever.
Katy T @SeveraSnape@mastodon.social
1y
@cachondo @fastfinge @FreakyFwoof It does help clean up the inbox.
Katy T @SeveraSnape@mastodon.social
1y
@cachondo @FreakyFwoof @fastfinge For Andre's folder, I've begun putting status in there, I have visions of a change log like I do for mine... but there's so much I'm doing to his folder right now it would just be crazy. But you would only need an account if you want to get author alerts. And... if you follow a bunch then you would get lots of emails.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@FreakyFwoof @SeveraSnape @cachondo Same. Although apparently she drinks coke in the morning instead of coffee! So I dunno... LOL JK
Katy T @SeveraSnape@mastodon.social
1y
@fastfinge @FreakyFwoof @cachondo hahahahaha!!!! I do indeed, but I drink tea too... does that count toward the good? :)
Andre Louis @FreakyFwoof@universeodon.com
1y
@SeveraSnape @fastfinge @cachondo It undoes a lot of the bad, let's just put it this way haha
Katy T @SeveraSnape@mastodon.social
1y
@FreakyFwoof @fastfinge @cachondo Oh man!!!! hahahahha!
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@SeveraSnape @cachondo @FreakyFwoof Depends. Peppermint? What kind of tea? LOL
Katy T @SeveraSnape@mastodon.social
1y
@fastfinge @cachondo @FreakyFwoof Black tea. Like mint too, though that's not my go to. I like a bunch of different kinds.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@SeveraSnape @cachondo @FreakyFwoof Coffee is the one true caffeine delivery system, and mint tea is the only acceptable hot drink without caffeine. ROFL
Katy T @SeveraSnape@mastodon.social
1y
@fastfinge @cachondo @FreakyFwoof Coffee puts me right to sleep.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@SeveraSnape @cachondo @FreakyFwoof Really? I hear that's one of the primary symptoms of people having ADHD.
Katy T @SeveraSnape@mastodon.social
1y
@fastfinge @cachondo @FreakyFwoof So I've heard. I don't care to figure that out. I focus just fine on what I need to. hahaha.
Ather Jammoa @atherjammoa@mastodon.social
1y
@SeveraSnape @fastfinge @cachondo @FreakyFwoof You should add a dash of honey. You won't regret it!
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@atherjammoa @SeveraSnape @cachondo @FreakyFwoof That's what I do whenever I have a sore throat.
Ather Jammoa @atherjammoa@mastodon.social
1y
Katy T @SeveraSnape@mastodon.social
1y
JamminJerry @JamminJerry@mastodon.stickbear.me
1y
@SeveraSnape @FreakyFwoof @fastfinge @cachondo I am just waiting to see it show up in a folder somewhere, I want to read that one. sounds like a good one.
Katy T @SeveraSnape@mastodon.social
1y
@JamminJerry @FreakyFwoof @fastfinge @cachondo It's in Andre's under the author folder mentioned.
JamminJerry @JamminJerry@mastodon.stickbear.me
1y
@SeveraSnape @FreakyFwoof @fastfinge @cachondo ah, there it is. thanks for that.
JamminJerry @JamminJerry@mastodon.stickbear.me
1y
@SeveraSnape @FreakyFwoof @fastfinge @cachondo holy cow! those are going to be some very long chapters! around 1.7 meg file, and only 20 chapters. just wow!
Sean Randall @cachondo@defcon.social
1y
@fastfinge it's a very nice neutral sort of an accent. Goes a bit funny on the ends of some words; "one" and "so" are good examples in that sample. But I can see it being a great option for people who want more human-sounding voices.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo So looking at it, it looks like it just uses the phonemes generated by espeak, and passes those to the natural voices. So if you use a voice trained on American English, and ask for en-gb, it'll do it anyway and sound terrible.
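Since the language is only used to pick espeak phonemes, one way to avoid an accidental mismatch is to derive the language from the voice name. The sketch below assumes that a leading "a" marks an American voice and a leading "b" a British one. The thread only confirms the pairing of `af_sarah` with `en-us`, so treat the mapping and the helper name `lang_for_voice` as hypothetical.

```python
# ASSUMPTION: the first letter of a voice name encodes its accent.
# Only af_sarah with "en-us" is confirmed by the thread.
VOICE_PREFIX_TO_LANG = {
    "a": "en-us",
    "b": "en-gb",
}


def lang_for_voice(voice):
    """Return the language code matching a voice name, or raise ValueError."""
    prefix = voice[:1]
    if prefix not in VOICE_PREFIX_TO_LANG:
        raise ValueError(f"Don't know which language matches voice {voice!r}")
    return VOICE_PREFIX_TO_LANG[prefix]
```

Calling `lang_for_voice("af_sarah")` gives `"en-us"`, which could be passed as the `lang` argument instead of a hand-typed value.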
Sean Randall @cachondo@defcon.social
1y
@fastfinge haha that's rather funny.
One of the biggest complaints from users new to screen reading when I taught was the quality of the available voices. The school paid for vocaliser I think but that was as good as it got. I did get a few people onto the neural stuff, but it was in its infancy when I left.
This sounds really smooth in comparison.
Andre Louis @FreakyFwoof@universeodon.com
1y
@fastfinge This is nice. Yeah, this would be a nice addon.
Tamas G @Tamasg@mindly.social
1y
@fastfinge so can it only write direct-to-file, or could it also send raw PCM data to a callback or have a way of reading a buffer it creates with that raw data? NVDA drivers would work infinitely simpler under that model. Sadly no real template for one exists beyond just looking at the code for drivers like DECTalk or Eloquence, Sonata, etc. and basing it off them to see which pattern best fits that synth's way of operating on things.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@Tamasg It returns samples by default. I'm just using a python library to write them to a file.
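If the returned samples are floats in the range -1.0 to 1.0 (an assumption; the thread does not state the sample format), turning them into raw 16-bit PCM bytes for an audio buffer might look like the sketch below. The helper name `float_to_pcm16` is made up for illustration, and the code sticks to the standard library so it runs without numpy.

```python
from array import array


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to signed 16-bit PCM bytes.

    Output uses the machine's native byte order (little-endian on
    typical Windows PCs). Out-of-range values are clipped.
    """
    pcm = array("h")
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))
    return pcm.tobytes()
```

The resulting bytes are the kind of buffer a player that accepts raw PCM would expect, provided it is told the same sample rate that came back from the synthesizer.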
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@Tamasg @tspivey So it looks like the repo is still super active. For this to be an addon, we want: streaming samples in real time, and indication of speech starting and stopping. Anything else? I can open an issue to ask.
Tamas G @Tamasg@mindly.social
1y
@fastfinge @tspivey I think yeah, a way to inject stop sequences mid-speech as well (so we could call a shut-up or stop from the main thread during playback) - having callbacks for stop can be nice, sometimes we can gather that just on the basis of the audio buffer closing itself if that's done in realtime with speech fragments.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@Tamasg @tspivey Issue is here if anyone wants to chime in: github.com/thewh1teagle/kokoro-onnx/issues/13
JamminJerry @JamminJerry@mastodon.stickbear.me
1y
@fastfinge oh wow! I really really like that voice! that would be awesome for reading with.
Serena 🏳️‍🌈 @SerenaTori@dragonscave.space
1y
@fastfinge @FreakyFwoof Yeah, that sounds amazing. I would love to read stuff with that synthesiser.
Andre Louis @FreakyFwoof@universeodon.com
1y
@SerenaTori @fastfinge Now to gently gently request @Tamasg to make it a reality... Haha
Tamas G @Tamasg@mindly.social
1y
@FreakyFwoof @SerenaTori @fastfinge ha. I know very little about how we could get it compiled right in the add-on. (I know there was a discussion of this earlier so if that build process for onnxruntime into the add-on succeeded, would love some basic copy then.) For anyone wanting to try, I think looking at something like the Brailab driver (which is super minimal and in the end all you're really going to use are the getters and setters for the synth driver, the way you do speech is obviously not at all like Brailab), and then crafting in the code to open the stream might work. But between the latest family emergency, work at Spotify with the new year / new projects, I'm afraid I'll be too swamped for a while to give it that truly comparative look. I'd also love to see a test run at how quick it can synthesize speech on slower CPUs especially when that speech is interrupted mid-utterance - how does it handle stopping a stream and loading a new one, is there lots of latency? A simple py test that just throws lots of speech chunks like that, stops, starts, would give us an idea maybe to then know if it's worth turning into a driver just yet.
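The "simple py test" suggested here could start as something like the sketch below: time how long each short chunk takes to come back, since with a blocking synthesizer that is roughly the delay before new speech can begin. `synthesize` is a placeholder for whatever callable wraps the model (for example a lambda around the `kokoro.create` call from earlier in the thread). Both function names are hypothetical.

```python
import time


def measure_latency(synthesize, chunks):
    """Return the wall-clock seconds each synthesize(chunk) call took."""
    timings = []
    for chunk in chunks:
        start = time.perf_counter()
        synthesize(chunk)  # result discarded; only the duration matters here
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a non-empty list of timings to min, mean and max."""
    return {
        "min": min(timings),
        "mean": sum(timings) / len(timings),
        "max": max(timings),
    }
```

Feeding it many short strings, the way rapid arrow-key navigation would, and comparing the `max` figure across machines would give a first idea of whether older CPUs can keep up.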
Andre Louis @FreakyFwoof@universeodon.com
1y
@Tamasg @SerenaTori @fastfinge Sorry to hear about family emergencies, never nice to deal with. I hope things can be sorted out for the better.

Re slow CPU though, that's where I come in. I am right now even, using an Intel Core I5-3570K from 2012. It runs every synth very well, apart from Piper which it struggles with due to the neural aspect of it. If my machine can run... Whatever you guys end up coming up with (hopefully) then anything else should be a breeze.
Mira 🤞🇧🇬🇭🇺 @tardis@tardis.pw
1y
@FreakyFwoof @Tamasg @SerenaTori @fastfinge I have an even slower one. Yay for countries in the middle of... Well somewhere, and computers from 2009 haha if something can even run on that, I'd be surprised. How's that for a slow processor? It's pretty ancient. The synth sounds nice, yeah, don't like how it reads hashtag, but I guess that's me. There's also something about question marks it clearly missed, but I think it needs to be fed a bigger chunk of text to see if it'll sound better. Otherwise, for the quality, Bleh, either my ears, or something, do not consider it a great quality in the sound terms, but for a TTS, I guess it's good. says the person who daily drives a TTS that came out in 2001. LOL.
Andre Louis @FreakyFwoof@universeodon.com
1y
@tardis @Tamasg @SerenaTori @fastfinge What's your CPU spec then, and your daily synth of choice?
Mira 🤞🇧🇬🇭🇺 @tardis@tardis.pw
1y
@FreakyFwoof @Tamasg @SerenaTori @fastfinge A synth that does English people no good. Haha. And I have a Dell from 2009, it still has a 32-bit Windows 10 version, so it tells you something. :D
Mira 🤞🇧🇬🇭🇺 @tardis@tardis.pw
1y
@FreakyFwoof @Tamasg @SerenaTori @fastfinge I also cannot tell you the full specs. Computer not here, sadly. It has a removable battery though, that gave up a long time ago, then I fell down some stairs while carrying said computer, and the pixels in the screen went poof, and no screen.
x0 @x0@dragonscave.space
1y
@fastfinge I wonder if Sonata would try to incorporate it? The trick with stuff like this is you might actually want to use a server process model rather than trying to run it from within NVDA itself.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@x0 Yeah, it does have a ton of dependencies. I will say all of the voices are better than Sonata/piper, IMHO. Even if it does look like they're all eleven labs ripoffs.
Peter Vágner @pvagner@fedi.ml
1y
@fastfinge I am wondering how it compares to #optispeech developed by @mush42
Or which one is more likely to get more support and be preferred.
github.com/mush42/optispeech
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@pvagner @mush42 I'm not sure. I do kind of worry about a tts developed by and for blind people and if it can be kept up to date and maintained.
Peter Vágner @pvagner@fedi.ml
1y
@fastfinge I understand @mush42 has made very significant progress, for example as compared to piper TTS. To me it looks like it's much lighter for both training and using a trained model, even enhancing audio quality and intelligibility in the process. This is just my guess, but with such an achievement it's fine not to limit it to a blind audience exclusively. This is how I am seeing #optispeech. However I haven't played with kokoro TTS, thus I have asked how much you like it, for example when comparing it to something else, perhaps piper TTS if you do know that one.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@pvagner @mush42 I like kokoro much better than piper. It sounds more natural with fewer artifacts.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
For the curious, here are all the available Kokoro English voices reading the same text: share.interfree.ca/app/open/3ca2Pb7oiFL-4PXSMD6qMeT-Pb1wqmSJyid-NP7MAAJS12s?view=1
Zach Bennoui @ZBennoui@dragonscave.space
1y
@fastfinge I heard about this project a few months back when it was still just a Huggingface demo. The model was trained on outputs from proprietary TTS systems including Eleven Labs and Open AI, hence why the quality is so good. Really cool project, and the model is still being worked on.
James Scholes @jscholes@dragonscave.space
1y
@fastfinge I suspect your first big headache will be getting onnxruntime (and any other heavy dependencies) installed into the add-on's environment. Doesn't look like simple pure Python code.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@jscholes You can just do it with pip install --target=. to force pip to install a package and all dependencies to the current directory. Then import from the extension directory. The only issue is I'm not sure if onnxruntime has 32-bit binaries or if I'll need to cross-compile the wheel from source.
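The "import from the extension directory" step could be sketched like this. It assumes the packages were installed into a subfolder (called `deps` here, a made-up name; the post installs into the current directory with `--target=.`) and that the folder path is known to the add-on. The helper `add_vendor_dir` is hypothetical.

```python
import os
import sys


def add_vendor_dir(addon_dir, subdir="deps"):
    """Put addon_dir/subdir at the front of sys.path and return its path.

    Calling it more than once does not add duplicate entries.
    """
    path = os.path.abspath(os.path.join(addon_dir, subdir))
    if path not in sys.path:
        # Front of the list, so the bundled copies win over any others.
        sys.path.insert(0, path)
    return path
```

An add-on would call it with its own folder, for example `add_vendor_dir(os.path.dirname(__file__))`, before importing the bundled packages. Compiled packages still have to match the host Python's version and bitness, which is the 32-bit concern raised in the post.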
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
It's super simple to set up if you want to play. Make a folder for it, change into the folder in your terminal, then do:
pip install --target=. kokoro-onnx soundfile
winget install wget
wget github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/kokoro-v0_19.onnx
wget github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.json

Then you can call it from Python. It supports US and UK English, plus French, Korean, Chinese, and Japanese.
Florian @zersiax@cupoftea.social
1y
@fastfinge outside of the dev guide and addon dev guide on github, not ... really, that I know of. Admittedly, those resources HAVE gotten a bit better as of late
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
Yeah, I am deeply confused about how buffers work and how to indicate when speaking is complete and do indexing and so-on. If this is going to be an addon, someone else will have to do it.
Tyler Spivey @tspivey@dragonscave.space
1y
@fastfinge You need support from the synth for some features. This one doesn't have anything. Once it starts speaking, it blocks until it's done, so you can't interrupt it.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey Wouldn't you just stop playing the samples it gave you?
Tyler Spivey @tspivey@dragonscave.space
1y
@fastfinge That works. But you're still sitting there waiting a few seconds for it to finish generating them.
Tyler Spivey @tspivey@dragonscave.space
1y
@fastfinge Taking this sentence and passing it straight through, it pauses after highly. That's not even that many words. He had quietly gone to Madame Pomfrey, who had regretfully told him that Dreamless Sleep was highly addicting and that while she could give him the occasional dose, it would have to be spread out enough to prevent it from becoming addicting – meaning he could only take it one night out of every two weeks or so.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey Hmmm, I assumed that was just because I was passing an enormous text block with multiple sentences. Hadn't tested with single sentences yet.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey Also, how does NVDA chunk text it passes to a synth? Even that's not really documented anywhere LOL. I think Kokoro inference would need running in its own thread, so the thread could be killed when we wanted to stop speech rather than generating extra samples, and a new thread could be started so you could start the new speech quickly, like when someone's pressing down arrow rapidly. But I don't have the time, and I'm not smart enough.
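One caveat on the threading idea: Python threads cannot be forcibly killed, so a common substitute is a single long-lived worker that drops anything queued before the latest cancel. The sketch below shows that pattern with a generation counter. `SpeechWorker`, `synthesize` and `play` are hypothetical names, and a chunk already being synthesized still runs to completion; only its result is thrown away.

```python
import queue
import threading


class SpeechWorker:
    """Run synthesize(chunk) on a background thread and pass results to play.

    cancel() marks everything queued so far as stale, so it is skipped.
    """

    def __init__(self, synthesize, play):
        self._synthesize = synthesize
        self._play = play
        self._queue = queue.Queue()
        self._generation = 0
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _current(self):
        with self._lock:
            return self._generation

    def speak(self, chunks):
        """Queue chunks, tagged with the generation active right now."""
        generation = self._current()
        for chunk in chunks:
            self._queue.put((generation, chunk))

    def cancel(self):
        """Invalidate all chunks queued before this call."""
        with self._lock:
            self._generation += 1

    def wait(self):
        """Block until every queued item has been handled."""
        self._queue.join()

    def close(self):
        """Stop the worker thread after the queue drains."""
        self._queue.put(None)
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()
            try:
                if item is None:
                    return
                generation, chunk = item
                if generation != self._current():
                    continue  # cancelled before synthesis started
                audio = self._synthesize(chunk)
                # Check again: a cancel may have arrived during synthesis.
                if generation == self._current():
                    self._play(audio)
            finally:
                self._queue.task_done()
```

With short chunks, the wait after a cancel is bounded by the time to synthesize one chunk, which is why chunk size and per-chunk latency matter so much for rapid navigation.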
Tyler Spivey @tspivey@dragonscave.space
1y
@fastfinge It doesn't. It leaves that up to the synth. If you're doing say all, then it tries to split by sentence and does it badly.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey So if I cursor up onto a line with fifty thousand characters, that's why it just dies. Ah.
Tyler Spivey @tspivey@dragonscave.space
1y
@fastfinge Yep. There are workarounds for that, disabling NVDA's processing improves it.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey Yeah, I'm increasingly convinced that @x0 is correct, and this would need to be part of Sonata if this was going to happen at all. They seem to have solved those issues mostly.
Tyler Spivey @tspivey@dragonscave.space
1y
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey That gives me: TypeError: GlobalPlugin.script_toggleX.<locals>.<lambda>() got an unexpected keyword argument 'normalize'
Sean Randall @cachondo@defcon.social
1y
@fastfinge @tspivey my wife and I had an unexpected keyword argument a few years ago. She'd never heard of the word nomenclature.
I ended up with gnomes on a clay chair as a whacky present as a reminder of the utter ridiculousness of the discussion.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo @tspivey Sounds like when I discovered my boss had never heard the word cogitate. Native English speaker, with a master's degree. Go figure.
Sean Randall @cachondo@defcon.social
1y
@fastfinge @tspivey I read an article a while ago about a young American whose dad hadn't heard of a word she'd picked up at college. It wasn't a particularly complicated or unusual word, but much was made of it in this article.

I sometimes wish I had a searchable text file of everything my screen reader ever said.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@cachondo @tspivey Then we could feed it into ollama and have AI search it for us! LOL
Tyler Spivey @tspivey@dragonscave.space
1y
@fastfinge Ok, redownload and that should be fixed.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey Yup, fixed! Are there docs?
Tyler Spivey @tspivey@dragonscave.space
1y
@fastfinge Nope. There's a toggle.txt in the root of the addon, but I don't know how updated that is. This thing has been hacked on over the years.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey Yeah, I can tell. TonyML's earcons addon also breaks a bunch of the features rofl.
Zach Bennoui @ZBennoui@dragonscave.space
1y
@tspivey @fastfinge I'm not sure this is the reason for pausing, but the model has a total context size of 500 characters and will not do well with input longer than that. It may also just be bad training data, sentences not ending with correct punctuation, primarily trained on paragraphs, etc. I’ve trained many TTS models over the last few years and data quality is extremely important, something lacking in most open source TTS systems out there.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@ZBennoui @tspivey I think it's something with the onnx implementation actually. The pytorch version doesn't have this issue. There's an open issue looking into it.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
1y
@tspivey That's why you start a session, so the model stays loaded in memory. Then I think you can actually stream output from onnxruntime byte by byte, I'm just not sure how.