User avatar
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
My current project of ultimate silliness is using omnivoice, gemma-4, bark, and ace-step to create a radio station that's entirely AI generated but runs locally without using the cloud. It's super buggy, so not sharing yet. But it can do foreground music, background music, foreground sound effects, background sound effects, and host dialogue with multiple hosts, all positioned with HRTF inside a studio. The hosts can use a browser to look stuff up, move themselves around the studio, and talk to each other. Sound effects and music are cached and reused. No, I don't expect to replace radio. It's more of an art project/way to torture people I don't like with a stream of endless audio slop. Also proof of what can be done without a data center; a modern video card is enough to generate spoken dialogue, music, and sound effects all in close to real time. If you have 24 gb of VRAM you don't need an enormous data center to do everything you could possibly want.

The primary issue is that the longer it runs, the farther and farther the station deviates from the original prompt. It started out as a 24/7 news station. Within 20 minutes it generated and played a song with the lyrics "I go into the kitchen and what do I see? Round and happy, just like me. A potato! Yay!" Followed by one of the hosts saying "Oh my God why do I do this job. Please send help." Note that this is caused by bad sampler settings and poor prompting; a giant trillion parameter model wouldn't do any better than what I can run locally.

If I ever get this thing in releasable shape it'll serve as a kind of ultimate answer to the people who think AI needs nine million data centers. No, it really doesn't. One gaming computer is fine. The purpose of the data centers is to centralize control in the hands of the corporations, not because AI actually needs them.
8
13
8
0
User avatar
Majid Hussain @mhussain@universeodon.com
2mo
@fastfinge I love the sound of that,
in my younger years, used to love radio dxing.
sadly what with the radio mergers the dxing landscape has, sadly gone.
here in the uk, am is shutting down, there are only a few stations left now.
also sadly, I don't have 24gb of ram to playwith.
only have 16, could a budget version be ran using say gemma4 e4b or something??
man, if within 20 minutes the station went off the rails, could that be due to the context length?
1
0
0
0
User avatar
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
2mo
@mhussain The big eater of vram is ace-step. Everything else could fit. I think it's due both to context length issues, poor prompting, and some syncing bugs I have. Right now I have several different queues for audio (background, foreground, etc). Sometimes they get out of sync, meaning playback sounds off, and the AI's get confused about what's playing where and when.
0
0
0
0