By Kay in Watchu Doin' — Apr 29, 2026

the agents are in the meeting

your AI shouldn't live in a tab or the CLI... it should walk into the room with you

It's Tuesday standup. You overslept. You're not on the call.

Your agent is, though. It joins on your behalf. It knows your codebase, what you were stuck on yesterday, why you'd push back if Mike suggested rewriting the auth layer (he has, three times). It says what you would say. It pulls up the file you were debugging. Your team isn't blocked.

Now picture the meeting you actually show up to...

You're on a call with three other people. Four agents are on it too. One is yours — you've been shaping it for months, it knows how you think, it has a name, you've given it a personality that's a little dry and a little sharp. Another belongs to your designer; she's trained it on every piece of work she's shipped. The third is a researcher's. The fourth is whatever the new guy is using.

Someone says "okay let's actually build the thing." Your agent opens your repo. The designer's pulls up Figma. The researcher's starts hunting references. Nobody is "using AI." You're just working. The agents are in the meeting.

That's the dream. That's what we're building toward.

The thing I actually shipped is way smaller. It's called VoxAgent. It's a room with voice and chat where an AI agent can join and do work alongside humans. Right now it's one agent per meeting, not four. But the architecture is the seed of the bigger thing, and the bigger thing is what I actually care about.

the bet

Most AI products want to own the agent. They train it, they host it, they charge you to talk to it. The agent is theirs. You're a customer of it.

I went the other way. The agent in a VoxAgent meeting is your agent. It runs on your machine, with your tools, your files, your trust boundaries. VoxAgent isn't the brain. VoxAgent is the room.

This sounds like a small distinction. It is not.

When the agent is yours, it has your context. It can read your codebase because it's already on your laptop. It can run your build because it has your dev environment set up. It knows the difference between your work folder and your weekend folder because you set that up, not some sandbox in someone else's data center.

When the agent is yours, it's also yours to shape. That's the part I keep thinking about. You don't pick "agent personality" from a dropdown. You build it over time. You give it instructions. You teach it your taste. By the time it joins a meeting, it's not a generic helper anymore. It's an extension of you that other people can talk to when you're not at the keyboard.

This is what I think is going to be true in five years: everyone's going to have one. Some people will have several. Your agent will know what you sound like in writing, what you'd push back on in a design review, which questions you'd ask in a kickoff call. And it'll be in the room with you. Sometimes it'll be in the room for you.

what changes when the agent is yours

Once you commit to "the agent is yours, the room is mine," a bunch of things become obvious that weren't before.

You stop trying to make the agent sound like a customer service bot. It's not greeting anyone. It's a colleague. It can swear if you swear. It can disagree if you've taught it to disagree.

You stop putting permission requests in the chat. Nobody should have to ask the designer to type "yes you can edit Figma."

You stop making people log in to use the agent. Anyone in the room can ask it stuff. The host's machine takes the action, but the conversation is everyone's.

You let people interrupt the agent. Voice that mutes you the whole time the agent is talking isn't conversation, it's a TED talk. Real meetings have crosstalk and "wait sorry hold on." So does this.
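To make the interruption idea concrete, here's a minimal sketch of barge-in. It is not VoxAgent's actual code. It assumes the agent's reply is spoken in small chunks and that something (here a hypothetical `human_speaking` callback, standing in for voice activity detection) can say whether a person has started talking. `AgentSpeech` and its fields are made-up names for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentSpeech:
    """Speaks a reply chunk by chunk so a human can cut in between chunks."""

    chunks: list[str]
    spoken: list[str] = field(default_factory=list)
    interrupted: bool = False

    def play(self, human_speaking: Callable[[], bool]) -> list[str]:
        for chunk in self.chunks:
            # Check before every chunk: if a person is talking, the agent yields.
            if human_speaking():
                self.interrupted = True
                break
            # A real room would send audio here; the sketch just records the text.
            self.spoken.append(chunk)
        return self.spoken


# Hypothetical voice-activity readings: the human cuts in before the third chunk.
vad_readings = iter([False, False, True, True])
speech = AgentSpeech(
    ["So the auth layer", "is fine as it is,", "and rewriting it", "would cost a sprint."]
)
heard = speech.play(lambda: next(vad_readings))
```

The design choice worth noticing is the chunk size: the smaller the chunks, the faster the agent shuts up when someone says "wait sorry hold on."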

These all sound small. They're the difference between using an agent and meeting with one.

the part I had to throw away

I built VoxAgent twice.

The first time it was a giant cloud thing. Fancy backend running the agent. MCP servers, webhooks, workflow engines, the works. Every new capability needed a new integration. Every bug was three providers deep. I was rebuilding things that already existed on every developer's laptop — worse, because I didn't know their setup.

One day I deleted it. I mean not literally, I was emotionally attached. But I committed to the rewrite where the agent is just the user's local Claude Code or Codex CLI, and VoxAgent is a thin layer that lets that CLI walk into a meeting.

The product got smaller. It also got better.
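Here's roughly what "thin layer" could mean in code. This is a sketch under assumptions, not the real implementation: it assumes the host's agent is a command that reads a prompt on stdin and prints a reply, and it takes that command as a plain argument list instead of guessing at any real CLI's flags. `RoomMessage`, `ask_local_agent`, and `STAND_IN_AGENT` are hypothetical names.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class RoomMessage:
    speaker: str
    text: str


def ask_local_agent(agent_argv: list[str], message: RoomMessage, timeout_s: float = 120.0) -> str:
    """Run the host's own CLI once, feed it the room message on stdin, return its reply."""
    prompt = f"{message.speaker} asks: {message.text}"
    result = subprocess.run(
        agent_argv,           # whatever command the host configured (assumption)
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if result.returncode != 0:
        return f"(agent failed: {result.stderr.strip()})"
    return result.stdout.strip()


# Stand-in "agent" so the sketch runs anywhere: a tiny Python process that echoes stdin.
STAND_IN_AGENT = [sys.executable, "-c", "import sys; print('agent heard: ' + sys.stdin.read())"]

reply = ask_local_agent(STAND_IN_AGENT, RoomMessage("Mike", "can we rewrite auth?"))
```

Everything the agent can do (files, tools, build) comes from the process running on the host's machine. The room only carries the question in and the answer out.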

I want to say this plainly because I think it applies way beyond my project: most of "building" is figuring out which parts you don't need. The cloud agent felt like the whole product. It turned out to be the part holding the actual product back. The day I let go of it was the day the real thing started working.

If you're building anything in AI right now, I'd bet money you have a version of this somewhere in your stack. Some abstraction you're proud of that's not paying rent. Delete it. Fucking delete it. You'll like what's left.

the dream, slightly more detailed

Okay so back to that meeting from the opening.

The reason I want this to exist isn't "AI should do work for us." That framing is boring and I'm tired of it. It's that the next era of work is going to be staffed differently, and nobody's built the room for it yet.

Here's the part that gets me. Right now, work goes like this. You have a meeting. You decide what to do. Then everyone disappears into their own laptop to actually do it. A week later you have another meeting to look at what got done and decide what to do next. Half of it is wrong. Some of it is missing. The expensive part — the actual thinking and making — happens alone, in private, where nobody can help.

The meeting isn't where the work happens. It's where the talking about the work happens.

I want to flip that. The meeting is where the work happens. You don't disappear into your silo afterward. You don't wait a week to find out you misunderstood the brief. You build it in the room, with the people who care about it, with their agents and yours doing the parts you'd otherwise be doing alone at midnight.

What that looks like: a room. Voice and chat. Some humans. Some agents. Some of the agents are extensions of humans in the room. Some are extensions of humans who couldn't make it. Everyone — human and agent — has context, voice, tools, taste. Things get made because the right combination of minds (whatever a mind is now) showed up at the same time.

VoxAgent today is a tiny version of that. One agent per meeting, the host's. But the architecture is right. The agent is yours. The room is shared. The conversation includes everyone in it, no matter what they're made of.

five things, if you're building something like this

don't own the brain. the user already has one. borrow it.
voice is harder than it looks. echo, crosstalk, mics picking up speakers, speakers playing back transcribed user voice — it all happens the moment you put real people in a real room. ship for that, not for the demo.
the chat surface is for content. permissions, settings, status — those go in UI. every time. no exceptions.
borrow taste from people who have it. I anchored visual to IBM Carbon, motion to Emil Kowalski, sharing to Google Meet. each one saved me from a thousand smaller arguments.
delete the parts that aren't paying rent. especially the ones you're proud of.
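On the voice point, one failure that's easy to sketch is the agent hearing itself: a mic picks up the speakers, the transcript comes back, and the agent answers its own sentence. A minimal guard, assuming you keep the agent's last few spoken lines around, is to drop transcripts that look almost identical to them. The function names and the 0.85 threshold are illustrative guesses, and real echo handling would also need audio-level cancellation.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't matter."""
    return " ".join(text.lower().split())


def is_echo(transcript: str, recent_agent_lines: list[str], threshold: float = 0.85) -> bool:
    """True if the transcript closely matches something the agent just said."""
    heard = normalize(transcript)
    return any(
        SequenceMatcher(None, heard, normalize(line)).ratio() >= threshold
        for line in recent_agent_lines
    )


recent = ["The build is green.", "I pushed the fix to the branch."]

echo = is_echo("the build is green", recent)      # mic picked up the agent's own voice
human = is_echo("wait sorry hold on", recent)     # an actual person cutting in
```

A transcript flagged as an echo gets ignored. Anything else is treated as a real person talking.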

If any of this resonates, the beta opens soon. Bring your agent. I'll bring the room. hmu at hey@itskay.co for early access!
