Final update: The developer is now on Mastodon via @andrew_guide.
Update: The developer has removed the ability to download Guide until the security issues mentioned in the linked thread are fixed.
Update: this product contains some code flaws that are concerning from a security perspective, beyond just giving control of your computer to an LLM. You might want to read this thread before installing the product: toot.cafe/@matt/114258349401221651
Update: I've exchanged some long emails with Andrew, the lead developer. He's open to dialogue, and moving the project in the right direction: well-scoped single tasks, more granular controls and permissions, etc. He doesn't strike me as an #AI maximalist, "it can and should do everything all the time" kind of guy. He's also investigating deeper screen reader interaction, to let AI do just the things we can't do and that it's best at. I stand by my thoughts that the project isn't yet ready for prime time. But as someone else in the thread said, I don't think it should be written off entirely as yet another "AI will save us from inaccessibility" hype train. There is, in fact, something here if it gets polished and scoped a bit more.
Just tried Guide for fun. It's supposed to be an app to use #AI to help #blind folks get things done. I asked "Where are the best liver and onions in Ottawa?" It:
1. Decided it needed to search the web.
2. Thought that the "stardew access" icon on my desktop was a kind of web browser, so clicked it.
3. Imagined an "accept cookies" dialogue it needed to accept.
4. Decided that didn't work, so looked for Google Chrome (I don't have Chrome installed on that machine).
5. Finally opened Edge from the start menu. By the way, it just...left Stardew open and running. Because apparently having Stardew Valley running in the background is a vital part of finding liver and onions in Ottawa.
6. Opened a random extension from my Edge toolbar (goodlinks).
7. Clicked the address bar and loaded google.com, instead of just doing the search right from the address bar.
8. Got blocked because it couldn't sign into my Google account, even though it could have also searched from the Google homepage.
To be fair to AI, that was the kind of open-ended task AI is terrible at. If I had asked it to check an inaccessible checkbox, or read a screenshot, or something, I'm sure it would have been fine.
Anyway, I'm still better at using a computer than an AI. So is my 87-year-old grandfather, for that matter. www.guideinteraction.com
@fastfinge I've been testing it in beta for about a month now. For specific one-step tasks, such as interfacing with an inaccessible slider or, on one particular site I have to use, selecting search results from an inaccessible drop-down, it has absolutely saved me a lot of time and annoyance. I think the problem is that it isn't really an LLM in the way most people have gotten used to using them.
So I wanted to try Guide on a real accessibility issue. However, it seems that #codeberg has finally fixed their #inaccessible #captcha. Now, if you tab into the #captcha field, you're told what you need to type to get past it. Good job Codeberg! #a11y
@fastfinge @atherjammoa yes it is, especially when it does things like mine has been doing recently. Don’t let it name your real life children. I don’t think you want a child named fuck. It may not have actually said that, but that’s the way Heather pronounced it.
@fastfinge Yup, definitely send in feedback though, he wants to make it better. Maybe giving it a prompt that helps it understand what to do for specific types of requests would help guide it.
@Jage I think the lack of a privacy/"what I'll do with your data" section of the website is a big miss for a product that will have access to do whatever it feels like on my computer. @fastfinge
@jscholes @Jage On a more serious note, I think the interface as presented is just way, way too generalist, and its freedom too unrestricted. Things I'd like to see:
1. It gets prompted with the name of the currently focused app, and the mouse is disallowed from leaving that app window.
2. Bringing it up with alt+ctrl+g gives prompts of what tasks it thinks it could perform inside the current app, instead of the current general "ask me to do anything!"
3. It should have access to the DOM for browsers, so that when it's asked to do something on a webpage it's not taking screenshots and acting only on those all the time.
4. It really, really needs training on NVDA focus mode. It says to turn off speech, but that doesn't solve for it trying to type into an edit field when NVDA isn't in focus mode. It does this constantly.
I still don't think it'd be ready for prime time, but it would be closer.
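To make point 1 a bit more concrete, here's a rough Python sketch of what "the mouse is disallowed from leaving that app window" could look like as a check run before every click. To be clear, `Rect`, `ClickOutsideWindow`, and `guard_click` are names I made up for illustration. This says nothing about how Guide actually works, and fetching the focused window's real bounds from the OS is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Screen rectangle of the focused app window (hypothetical type).

    right and bottom are exclusive, so a window at left=0, right=800
    covers x values 0 through 799.
    """
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


class ClickOutsideWindow(Exception):
    """Raised when the agent proposes a click outside the focused window."""


def guard_click(window: Rect, x: int, y: int) -> tuple[int, int]:
    """Return (x, y) unchanged if it falls inside the window, else refuse.

    Refusing, rather than silently snapping the point to the nearest edge,
    means a confused agent can't end up clicking something it never meant to.
    """
    if not window.contains(x, y):
        raise ClickOutsideWindow(f"click at ({x}, {y}) is outside {window}")
    return (x, y)


# Example: the focused app occupies an 800x600 window at (100, 100).
focused = Rect(left=100, top=100, right=900, bottom=700)
allowed = guard_click(focused, 450, 300)  # inside, passes through

try:
    guard_click(focused, 20, 20)  # e.g. a desktop icon, outside the window
    blocked = False
except ClickOutsideWindow:
    blocked = True
```

With a check like that in place, the stray click on a desktop icon from my test would have been refused instead of launching a game.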
@jscholes @Jage It's also a security thing. Sure, it was pretty funny that it opened Stardew Valley. But if I was using it on my real machine, I can think of a lot of apps I wouldn't want it to unexpectedly launch.