@jscholes @Jage On a more serious note, I think the interface as presented is just way, way too generalist, and its freedom too unrestricted. Things I'd like to see:
1. It gets prompted with the name of the currently focused app, and the mouse is disallowed from leaving that app window.
2. Bringing it up with alt+ctrl+g gives prompts for the tasks it thinks it could perform inside the current app, instead of the current general "ask me to do anything!"
3. It should have access to the DOM for browsers, so that when it's asked to do something on a webpage it's not taking screenshots and acting only on those all the time.
4. It really, really needs training on NVDA focus mode. It says to turn off speech, but that doesn't stop it from trying to type into an edit field when NVDA isn't in focus mode. It does this constantly.
I still don't think it'd be ready for prime time, but it would be closer.