What are your pain points, folks? Stuff that you hate doing or dealing with, or problems you can't find a good solution to? Stuff that other people might be frustrated with, too.
I'm looking for a way to make myself valuable to other people, as a way to both help people and also earn an income to feed my family in the process.
One thing I can do really well is create reliable software to automate rote tasks, generate financial/statistical/other reports, or calculate difficult solutions. Think it can't be done without LLMs? I might surprise you!
I can go into more detail about why all the options are bad if you want. But this is the sort of problem that eats years of your life, requires advanced mathematics (digital signal processing at a minimum), and advanced linguistics, on top of being a good systems-level programmer.
@fastfinge I just so happen to be an (unemployed) machine learning researcher by trade, with advanced mathematics, linguistics, and programming skills. Maybe not systems-level programming, but I could probably find someone who does that and work with them.
Given that the first two responses I've gotten were both about accessibility, there might be more of a market for this than you think, and also, it might make a good way to demo my skills even if it isn't paid work.
@hosford42 The source code for DECtalk is out there. Unfortunately, it's...legally dubious at best. It was leaked by an employee back in the day, and now the copyright status of the code is so unclear that nobody can safely use it for anything, but also nobody can demonstrate clear enough ownership to submit a DMCA and get it taken off GitHub. GNUspeech is also pretty close to what's needed, but I don't think it will even compile without all the NeXT development tools. So at best it would only be a base for something else; modernizing it would probably amount to a complete rewrite anyway.
It looks like both DECtalk and DECtalkMini are being actively maintained, with commits as recent as 1 to 2 months ago. I was hoping the copyright for the "mini" version would be unencumbered, but no such luck. It would have to be a re-implementation from scratch using this code as a guide. That's a lot easier than implementing a new system out of nothing, though.
@hosford42 I also have no idea about any associated IP or patents, though. Wouldn't whoever does it need to be able to prove they never saw the original code, just its outputs? Otherwise you're still infringing, aren't you? In this regard, it's probably actually a bad thing that the DECtalk source code is so widely available.
And most of the commits seem to be about just getting it to compile on modern systems with modern toolchains. I dread to think how unsafe closed-source C code written in 1998 is.
@fastfinge hmm, patents do complicate things. I could search and see if there are patents on it, I suppose. Or I could just try to implement from scratch.
@hosford42 If you're going to reimplement something, you might be better off going with gnuspeech, as it's known to be under the GPL. At the least, it gives you a vocal model to improve on, one that was coded with open research in mind, rather than proprietary code probably written for job security.
@hosford42 Also, if you enjoy comparing modern AI efforts with older rule-based text to speech systems, and listening to the AI fail hard, this text is wonderful for that. As far as I'm aware not a single text to speech system, right up to the modern day, can read this one hundred percent correctly. github.com/mym-br/gnuspeech_sa/blob/master/the_chaos.txt
But Eloquence gets the closest, gnuspeech second, eSpeak third, DECtalk fourth, and every AI system I've tried a distant last.
1. It's 150 or so years old, so a few pronunciations have changed a bit.
2. The pronunciations and spellings (and hence some of the apparent mismatches) are UK English, not US English.
At a minimum, you'll have to envision skipping "r"s after vowels at the ends of words for many of these to make sense. As for the rest, I recognized a few of those from past experience with older UK English (e.g. "clerk" with an "a" sound), but a couple left me scratching my head saying "that's how people actually said or spelled it then and there?"
@hosford42 @fastfinge My favorite version of this sort of thing is the kind where the pronunciation doesn't rhyme, but the spelling and the meter make it seem like it should, for example:
The Fates told Odysseus
"If you write to Penelope
Make sure that you use
A waterproof envelope."
@dpnash @hosford42 Right, but most text to speech systems have a UK English setting. And the mistakes they're making are on things much more basic than that. For example, far too many so-called state of the art AI TTS systems can't even pronounce "Susy", "plaid", "fuchsia", and "lieutenants".
@hosford42 @dpnash Compare that with the version of GNU Speech released in 1995. It still messes up "tear" and "live". But once you get past the unnatural voice, it's far more precise. And once you get used to it, much much easier to listen to at an extremely high rate of speed (4x or more) all day. All text to speech advancement from "AI" is just the wow factor of "Wow, it sounds so human!" But pronunciation...you know, the important part of actually reading text...is either the same or worse. With five thousand times the resources.
@hosford42 @dpnash For example, here's Eleven Labs, the billion-dollar voice AI company that's supposed to replace all voice actors forever. I used the voice builder to specifically request Received Pronunciation. That was not at all what I got. Aside from that, notice the incorrect "tear", pronouncing "plaid" as "played", having no idea that "victual" is pronounced "viddle", and a number of other mistakes. I reran it just now, to be as fair as possible. It has not improved.
@fastfinge @hosford42 what a beautiful text. As English is a foreign language to me, I can't even dream of reaching the TTS-models' levels. It would be fun to have speakers of different English dialects record this!