Note by @fastfinge

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

Edit: This is now released. Say all works, though the audio becomes choppy sometimes. But it doesn't crash.
Right! I now have a copy of Eloquence that works on the 64-bit alphas of #NVDA, with the following issues: say all on the web doesn't work (it stops whenever the type of element changes for reasons I don't understand), and dialect switching doesn't work (but it doesn't crash everything anymore). If you want to play, you need to follow the build instructions; I only understand about a quarter of this code and have no intention of actually releasing things that are still broken: github.com/fastfinge/eloquence_64/

James Scholes @jscholes@dragonscave.space

8mo

@fastfinge Was AI involved in the making of this?

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes Oh, also, I used spell check on the build docs. So I guess that counts. And Titet11 is Mexican and more comfortable working in Spanish, so AI translation was involved in communication and some of the comments. If you want to avoid anything that uses AI you need to avoid this, as it couldn't exist otherwise.

James Scholes @jscholes@dragonscave.space

8mo

@fastfinge No, nothing like that. But appreciate the detailed breakdown!

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes Hah no worries. Your question got me thinking about what that even means. Like if my collaborator doesn't speak my language, does that mean I should label the code as AI assisted? If the code started off as entirely human generated, and an AI rewrote it, is it now AI generated? If a human rewrote large parts of what the AI did, when does the code stop being AI? I really don't know.

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes So with more code updates this morning, the thing I'm noticing is that the more rewriting that is done, the less and less code there is from the initial AI rewrite. The AI solution mostly worked, but was over-complicated and multi-threaded where it didn't need to be. We're slowly arriving at code that is both simpler and works better.

James Scholes @jscholes@dragonscave.space

8mo

@fastfinge I suppose I initially asked because of it defaulting to a helper process written in Python, using sockets as the IPC mechanism. Which is very AI, based on what will have been most common in the training data.

But for this sort of thing, I wonder about performance gains from shared memory, COM, or whatever with something other than Python on the other end.
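
For anyone curious what the shared memory option might look like, here is a minimal sketch in Python. It is not taken from the eloquence_64 code. It assumes a hypothetical layout (a 4-byte length field followed by raw PCM bytes) and uses two mmap views of one temporary file in a single process to stand in for the two sides. A real version would map a named region from two separate processes and would need an event or semaphore so the reader knows when data is ready.

```python
import mmap
import struct
import tempfile

BUFFER_SIZE = 4096              # hypothetical size of the shared region
LENGTH = struct.Struct("<I")    # hypothetical 4-byte little-endian length field


def write_audio(view, pcm):
    """Copy PCM bytes into the shared region, then publish their length."""
    if len(pcm) > BUFFER_SIZE - LENGTH.size:
        raise ValueError("audio chunk does not fit in the shared buffer")
    view[LENGTH.size:LENGTH.size + len(pcm)] = pcm
    view[:LENGTH.size] = LENGTH.pack(len(pcm))


def read_audio(view):
    """Read the published length, then return that many PCM bytes."""
    (count,) = LENGTH.unpack(view[:LENGTH.size])
    return bytes(view[LENGTH.size:LENGTH.size + count])


def demo():
    # Two views over the same backing file play the roles of the
    # 32-bit helper (writer) and the 64-bit host (reader).
    with tempfile.TemporaryFile() as backing:
        backing.truncate(BUFFER_SIZE)
        writer = mmap.mmap(backing.fileno(), BUFFER_SIZE)
        reader = mmap.mmap(backing.fileno(), BUFFER_SIZE)
        try:
            write_audio(writer, b"\x01\x00\x02\x00")
            return read_audio(reader)
        finally:
            writer.close()
            reader.close()
```

The possible gain is that audio is not copied through a socket buffer on every chunk. Whether that is noticeable for speech-sized data is an open question here.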

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes So the reason I wanted Python was because I naively thought a lot of the existing code could be reused, as well as some of the learnings from IBM TTS, eloquence threshold, and the sonata voices. That turned out to be entirely wrong. The "correct" way to do this would be to write a 64-bit API compatible wrapper for ECI.dll. But that's way beyond my abilities as a programmer, and AI can't help because we don't have the development headers for ECI.dll to feed it.

feld @feld@friedcheese.us

8mo

@fastfinge @jscholes

"and AI can't help because we don't have the development headers for ECI.dll to feed it."

Ohhh I bet you could get it decompiled with something like IDA Pro and feed it to an AI model and get somewhere

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@feld @jscholes Maybe you could. But I don't really understand low level c++ code and pointers and things well enough to want to use AI for this. When it comes to Python, I can at least understand the code well enough to audit it (even if not to write it myself), and understand the approach. With C++ that wouldn't be the case.

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes Its original code used C++ as the host, with GRPC for communication. I made it retry because I didn't want to deal with the complexity of the GRPC dependencies in an NVDA addon, I don't really understand protobufs, and it just didn't feel any faster.

James Scholes @jscholes@dragonscave.space

8mo

@fastfinge github.com/datajake1999/SAPI5IBMTTS/blob/master/eci.h

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes Those are for version 6.4 of the DLL, and we use 6.1 because 6.4 has a bunch of changes like requiring registry entries for languages and voices that make it not portable, and several annoying bugs. I believe 6.4 also made a bunch of changes around threading. I already ran into issues with this, because the tts.txt is the manual for 6.4, and we need to use 6.1, the last release before IBM took it over.

James Scholes @jscholes@dragonscave.space

8mo

@fastfinge How annoying. I thought there were also header files inside the Voxin packages, but I can't find them at present and I'm not sure if they'd be any more helpful.

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes Extremely annoying. I spent like two hours fighting with that because I'd given the AI the 6.4 manual, and it was doing things the manual said were allowed, but it wasn't working. Took me forever to realize that the DLL is a different version than the included manual.

James Scholes @jscholes@dragonscave.space

8mo

@fastfinge Separately, I 100 percent think GRPC plus Protobuf would have been overkill and, in some scenarios, slower than the current "hand-rolled binary messages over a socket" approach because of overheads.
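
To illustrate what "hand-rolled binary messages over a socket" can mean, here is a minimal sketch in Python. It is not the actual eloquence_64 wire format. The header layout (4-byte payload length plus a 1-byte message type) and the `MSG_SPEAK` constant are assumptions made up for this example.

```python
import socket
import struct

# Hypothetical frame: 4-byte big-endian payload length, 1-byte message type.
HEADER = struct.Struct("!IB")
MSG_SPEAK = 1  # hypothetical message type, not from the real addon


def send_message(sock, msg_type, payload):
    """Send one frame: fixed-size header followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload), msg_type) + payload)


def recv_exact(sock, count):
    """Read exactly `count` bytes, since recv() may return fewer."""
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Read one frame and return (message type, payload)."""
    length, msg_type = HEADER.unpack(recv_exact(sock, HEADER.size))
    return msg_type, recv_exact(sock, length)


def demo():
    # A connected socket pair stands in for the host and helper processes.
    host, helper = socket.socketpair()
    try:
        send_message(host, MSG_SPEAK, "Hello from NVDA".encode("utf-8"))
        return recv_message(helper)
    finally:
        host.close()
        helper.close()
```

The whole protocol is one `struct` format string and a read loop, with no code generation step and no extra dependencies to ship in the addon.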

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes You might be right. I just didn't wanna distribute like five different dependencies with the addon. Then it turned out that NVDA's version of Python doesn't include multiprocessing, even though it's built into Python by default, so I had to annoyingly include .pyd files and library code anyway.

Matt Campbell @matt@toot.cafe

8mo

@jscholes @fastfinge If I were doing it, I'd definitely use something non-Python for the 32-bit helper process. I don't have time to do this though; I'm a bit behind on actual obligations as it is.

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@matt @jscholes Yup. I'm hoping someone will get inspired and do things correctly. I just wanted something that worked, and Python was the easiest way for me to get that done. It's far from the best way.

James Scholes @jscholes@dragonscave.space

8mo

@fastfinge Absolutely. It's comforting that at least one working version exists in the world. @matt

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@jscholes @matt I know davidacm is working on something in Rust, of all things. But I don't know where he's at with that, and I don't know anyone else who wants to work in Rust. So either he'll finish it and maintain it all himself or nothing will come of it. :-)

Matt Campbell @matt@toot.cafe

8mo

@fastfinge @jscholes Rust is what I'd probably use.

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@matt @jscholes Is there really anything to be gained, though? Few enough blind people who care about Eloquence know C++. Even fewer are going to know Rust. So is being one of maybe three people who could maintain it worth whatever performance gains you'd get?

Day Garwood @daygar@tweesecake.social

8mo

@fastfinge @matt @jscholes I'm better at C/C++ than I am with Python. Don't know enough about the Eloquence API though, especially if it does differ significantly between versions.
I probably know even less about IPC and threads, but there's plenty of learning scope in that department.

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@daygar @jscholes @matt If you look at this file, you should be able to understand what's expected from you by both ECI.dll and NVDA. github.com/fastfinge/eloquence_64/blob/master/host_eloquence32.py

Alex Chapman @alexchapman@tweesecake.social

8mo

@daygar @fastfinge @matt @jscholes I suck ass at coding without help from either someone who knows way more, or in most cases, Gemini or ChatGPT. I've used Python as it's super easy to get the program going and things like nuitka and pyinstaller let you compile stuff with a couple of commands. C++ I've heard is way more complex, and if you don't want to deal with the bloat that comes with Visual Pubio, sorry I mean Visual Studio, then your other option is GCC with MinGW, and when I tried to get that working, the installation manager UI is borked with NVDA.

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@alexchapman @daygar @jscholes @matt Meh. You can install the Visual Studio Build Tools from the command line these days. Then just add the workload to VS Code. Getting a development environment set up isn't the hard part. I actually already have one because of unspoken-ng and working with steamaudio. I just am not comfortable in C++ doing anything other than compiling other people's code and making the odd, extremely basic, change.

Alex Chapman @alexchapman@tweesecake.social

8mo

@fastfinge @daygar @jscholes @matt Ah OK.

🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca

8mo

@alexchapman @daygar @jscholes @matt If you ask ChatGPT it will guide you on how to get the Visual C++ compiler and Windows SDK set up on the command line. Just specify that's what you want or it will direct you to the GUI.