🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
Admin
completely blind computer geek, lover of science fiction and fantasy (especially LitRPG). I work in accessibility, but my opinions are my own, not those of my employer. Fandoms: Harry Potter, Discworld, My Little Pony: Friendship is Magic, Buffy, Dead Like Me, Glee, and I'll read fanfic of pretty much anything that crosses over with one of those.
keyoxide: aspe:keyoxide.org:PFAQDLXSBNO7MZRNPUMWWKQ7TQ
Location
Ottawa
Birthday
1987-12-20
Pronouns
he/him (EN)
xmpp fastfinge@im.interfree.ca
keyoxide aspe:keyoxide.org:PFAQDLXSBNO7MZRNPUMWWKQ7TQ
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@b Yes, the blind community is quite strong, and almost all of us moved over from Twitter when they discontinued the API and broke accessibility. It wasn't like in sighted communities, where some percentage is on Mastodon and some is on Twitter. Nearly a hundred percent of the blindness community that was active on Twitter was forced to move.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@pvd1313 @the5thColumnist @RachelThornSub Depends. I like to start with deepseek-ocr if I have any reason to suspect the image is text. If it is, I can stop there. Otherwise, I move up to something like microsoft/phi-4-multimodal-instruct. If I still care and didn't get enough, llama-3.2-90b-vision-instruct will do the trick for most things. Only if it's charts and graphs that I care about do I need to use the Google or OpenAI models. If it's pornographic, I have to use Grok, because xAI is completely and utterly unhinged and won't refuse anything, no matter what. I run everything either locally where possible, or via the openrouter.ai API. That way it's more private, and I'm only paying for what I use. I usually use this tool: github.com/SigmaNight/basiliskLLM

It supports Ollama, OpenRouter, and any OpenAI-compatible endpoint, and integrates perfectly with the NVDA screen reader.
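
For anyone who wants to try the same escalation flow, here's a minimal sketch against OpenRouter's OpenAI-compatible endpoint. The model slugs and file name are illustrative assumptions; check openrouter.ai for the exact identifiers.

    import base64
    from openai import OpenAI  # pip install openai

    client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

    def describe(path, model):
        # Send the image as a base64 data URL and ask for a factual description.
        with open(path, "rb") as f:
            image = base64.b64encode(f.read()).decode()
        response = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this image accurately."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{image}"}},
                ],
            }],
        )
        return response.choices[0].message.content

    # Cheapest first, escalating only when the previous answer isn't enough.
    for model in ("deepseek/deepseek-ocr",                      # text in images
                  "microsoft/phi-4-multimodal-instruct",        # general scenes
                  "meta-llama/llama-3.2-90b-vision-instruct"):  # harder images
        print(describe("photo.png", model))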
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@pvd1313 @the5thColumnist @RachelThornSub This helps a lot, yes. Though if I know you just AI generated it, I'm probably not even going to keep reading. My AI is almost certainly better than yours, because I use it constantly and have customized the settings to get it to be as accurate as these things are capable of being.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@the5thColumnist @RachelThornSub My suggestion would be to keep it simple. If the reason you posted the photo was that it was a pretty flower, well... that's fine for the alt text. No matter how many words you use, you might not be able to communicate the exact feeling of beauty you experienced. If you could, you'd be a writer, not a photographer. Ask yourself why you posted, and what you want someone to take away from it. If you want them to notice the colour, or the size, or whatever, that's what goes in the alt text.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@RachelThornSub @milkman76 And this is an entirely false argument. These days, an AI specialized in describing images can run on a consumer PC. It's doing zero of the things you're talking about. Apple has done image descriptions locally on its phones for five years now. If you're just tossing images at ChatGPT, you're doing it wrong, the same way as if you gave ChatGPT a CSV file and told it to sort it for you. There are way, way better ways of doing that, ways that get you the result you want quicker, without the resource waste.
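
To make the CSV analogy concrete: sorting a file locally takes a few lines and no LLM at all. A sketch (the file name and column are hypothetical):

    import csv

    # Read the rows, sort by one column, and write the result back out.
    with open("data.csv", newline="") as f:
        reader = csv.DictReader(f)
        rows = sorted(reader, key=lambda r: r["name"])

    with open("sorted.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)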
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@RachelThornSub So as an actual blind user who uses AI regularly... no, not really. If you include AI-generated alt text, the odds are you're not checking it for accuracy. But I might not know that, so I assume the alt text is more accurate than it is. If you don't use any alt text at all, I'll use my own AI tools built into my screen reader to generate it myself, if I care, and I know exactly how accurate or trustworthy those tools may or may not be. This has a few advantages:
1. I'm not just shoving images into ChatGPT or some other enormous LLM. I tend to start with deepseek-ocr, a 3b (3 billion parameter) model. If that turns out not to be useful because the image isn't text, I move up to one of the 90b llama models. For comparison, ChatGPT and Google's LLMs are all 3 trillion parameters or larger. A model specializing in describing images can run on a single video card in a consumer PC. There is no reason to use a giant data center for this task.
2. The AI alt text is only generated if a blind person encounters your image, and cares enough about it to bother. If you're generating AI alt text yourself, and not bothering to check or edit it at all, you're just wasting resources on something that nobody may even read.
3. I have prompts that I've fiddled with over time to get me the most accurate AI descriptions these things can generate (see the sketch at the end of this post). If you're just throwing images at ChatGPT, what it's writing is probably not accurate anyway.

If you as a creator are providing alt text, you're making the implicit promise that it's accurate, and that it attempts to communicate what you meant by posting the image. If you cannot, or don't want to, make that promise to your blind readers, don't just use AI. We can use AI ourselves, thanks. Though it's worth noting that if you're an artist and don't want your image tossed into the AI machine by a blind reader, you'd better be providing alt text. Because if you didn't, and I need or want to understand the image, into the AI it goes.
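
To illustrate point 3, here's the shape of an accuracy-first prompt, run against a small local vision model through the Ollama Python client. The model tag and wording are illustrative, not my exact settings.

    import ollama  # pip install ollama; assumes a local Ollama server is running

    # Tuned for accuracy over flourish: quote text verbatim, admit uncertainty.
    PROMPT = (
        "Describe this image factually. Quote any visible text verbatim. "
        "If you are unsure about something, say so rather than guessing. "
        "Do not speculate about people's emotions or intentions."
    )

    response = ollama.chat(
        model="llama3.2-vision",  # illustrative local vision model tag
        messages=[{"role": "user", "content": PROMPT, "images": ["photo.png"]}],
    )
    print(response["message"]["content"])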
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@cachondo @FreakyFwoof It's not a matter of affording, for me. If it wasn't for NVDA, I'd have to come up with the money for JAWS. So I give that money to NVDA instead. I just never stopped setting money aside for a screen reader.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@FreakyFwoof Yup. The annual JAWS price increase is just a reminder for me to up my NVDA donation.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@munin Dismissing your lived experience is not my intent. But it's not something I can speak to, as it's not an experience I've had, nor one I can imagine. And if we're trying to define what it means to be intelligent, we kind of do need to base it on things the majority of people can understand.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@munin To add something that I think is important: intelligence is something our society prioritizes for its own reasons. I don't think having intelligence gives an entity any more value, or any more right to exist, than an entity that does not have intelligence. And I think the state of intelligence comes and goes; I'm not intelligent when I'm sleeping.

Anyway, interesting conversation! Thanks for engaging.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@munin Sure, you can talk to someone who's extremely high or has dementia. That doesn't mean they currently have access to intelligence. That's the one thing that LLMs have successfully proven: just because it can talk with correct syntax does not mean it thinks. And of course you can have more than one self. Each self is an intelligence of its own. But to have intelligence, you do need at least one self, and that self needs the ability to process experience, and that processing requires something like language. If you don't have all three, what you have is not intelligence: it's a parlour trick (like LLMs that have only language), death (if you have no concept of self), or... I can't even think what a being with language and a sense of self but no ability to experience would be. Forming an idea of self requires experience, so I just don't think that's a possible thing.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@munin But they call it ego death because it's a type of death. And death is ceasing to exist: turning off your intelligence as much as you can. Intelligence requires an ego that is aware of itself, and that persists through time. I would say that's the primary reason an LLM is not intelligent: it lacks ego.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@munin But intelligence requires something happening LOL. And thus some sort of language for it to happen in.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@munin But if they're not expressed in some way, nothing is happening.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@munin You might not know how to express it to others. But you have to be able to express it to yourself. Otherwise, how are you aware of the thought at all? How can you recall it later? Thought isn't some magic incomprehensible thing. It's way more complex than the LLM people admit. But it does require some sort of internal language. Otherwise how can it exist at all?
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@jscholes @jaybird110127 Yup, I love uv. It just feels like huge overkill for anything but the machine I actually develop on. And I think uv needs you to modify the PATH in order to do everything it does, but I could be wrong. Also, don't modern Python packagers do some tree-shaking so they're not including the complete Python environment in every single build? I know that NVDA's environment, to my sorrow, doesn't include multiprocessing, for example, because NVDA doesn't use it. So if I want it in my add-ons, I have to include it myself, even though it's a Python builtin.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@jscholes @jaybird110127 You're assuming I want to install uv, modify the PATH, keep uv up to date, and have multiple Python versions on the machine. And in one particular case, that the machine has an internet connection at all.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@jscholes @jaybird110127 I often package them up if they need to run on a different machine from the machine I developed them on. Saves dealing with setting up a virtual environment and so on.
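
For anyone wondering what I mean by packaging: a one-file PyInstaller build is the usual shape of it (the script name is hypothetical):

    pyinstaller --onefile my_tool.py

That drops a single executable into dist/ that runs on the target machine without Python, a virtual environment, or an internet connection.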
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@munin Right, but words, drawings, and music are all forms of language. "Language" isn't limited to written words, hand gestures, or mouth noises. And without some form of language, a way to put handles on ideas and manipulate them, thought is impossible. What those handles are and how we attach them to ideas differs. But it's still a process that is required for any kind of thought or reasoning.
🇨🇦Samuel Proulx🇨🇦 @fastfinge@interfree.ca
5mo
@MaryAustinBooks I mean, if I went, I'd just spend all my time wishing I was dead, and/or actually die when I have to watch the drunk guy from IT engage in acts of sexual harassment, because my social anxiety is far too deep for me to actually do anything about it. So we might as well just front-load it, right?