Venture Capitalist at Theory

2 minute read / May 19, 2024

Chatting With Her - The ChatGPT App on Mac

For the past few days, I’ve been using the Mac ChatGPT app OpenAI demonstrated last Monday.

It’s unquestionably the future of human-computer interaction. Conversing with a computer is much more natural than typing. Imagine speaking to a colleague with the entire internet at their disposal, albeit a verbose colleague without much sense of social cues.

With a tap of a keyboard shortcut, the ChatGPT app loads & four little bars reminiscent of Google’s transcription software appear. You can choose from 5 mellifluous voices that speak fluidly & naturally.

The computer is patient with unstructured thoughts. I used it to outline this post. I switched between the strengths & challenges of the product randomly. The software categorized & organized, transforming the rambling mess of sentences to an outline.

Verlyn Klinkenborg would still have some work to do, editing the outline. Just as most LLMs are verbose, so is this app. For initial content review, that’s fine, but after the fifth or sixth iteration, it’s faster to interrupt the speaker by clicking.

A hybrid voice & text mode would improve today’s either/or experience. Seeing the document evolving as the speaker chats would help with editing.

Sometimes the conversation illusion breaks. The voice interrupts & says she couldn’t hear all of my sentences. Or the input speech is too long & the app retries several times, processing smaller chunks of voice, leaving the user unclear on how much of the conversation has been captured. At least, this was my perception.

Other times, the app isn’t patient with an um or a thoughtful pause. I imagine this is a tricky problem : when is a pause an indication to speak or wait?
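To make the pause problem concrete, here is a minimal sketch of the simplest heuristic I can think of : treat the turn as over once silence has lasted longer than a fixed threshold. This is not how the ChatGPT app works, which I don’t know. The function name `end_of_turn_frame`, the energy values, the 20 ms frame size & the thresholds are all hypothetical, chosen only for illustration.

```python
from typing import Optional, Sequence


def end_of_turn_frame(
    energies: Sequence[float],
    frame_ms: int = 20,          # assumed frame length
    silence_level: float = 0.1,  # assumed energy cutoff for "silent"
    max_pause_ms: int = 700,     # assumed patience before replying
) -> Optional[int]:
    """Return the index of the frame where the listener would start replying.

    A frame is silent when its energy is below silence_level. The turn ends
    once silence has lasted max_pause_ms after some speech has been heard.
    Returns None if the speaker never pauses that long.
    """
    silent_ms = 0
    heard_speech = False
    for i, energy in enumerate(energies):
        if energy >= silence_level:
            # Speech resets the silence counter.
            heard_speech = True
            silent_ms = 0
        elif heard_speech:
            silent_ms += frame_ms
            if silent_ms >= max_pause_ms:
                return i
    return None


# Hypothetical utterance: 1 s of speech, a 1 s thoughtful pause,
# 1 s more speech, then 2 s of silence.
speech = [0.8] * 50
pause = [0.0] * 50
tail = [0.0] * 100
utterance = speech + pause + speech + tail

impatient = end_of_turn_frame(utterance, max_pause_ms=700)
patient = end_of_turn_frame(utterance, max_pause_ms=1500)
```

With the 700 ms threshold, the listener jumps in during the thoughtful pause. With 1500 ms, it waits for the second half of the thought, but then sits in silence for a second & a half before replying. No single threshold suits both cases, which is why I suspect this is tricky.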

But overall it’s clear where assistants like this will go : draft an email & send it. Delegate a task in Asana with a due date. Review a web page & summarize it in a blog post with some commentary, then check the grammar & publish it. All through voice.

Years ago, I wrote “The Dawn of the Voice-to-Text Era” & “The Fastest User Interface” on why I think voice is the future dominant user interface.

We’re much closer to that vision than ever.
