May 13, 2026 · Research · 01 · ~8 min read

How gmux changes multi-agent development.

Three pillars — gestures, agent management, visual oversight — wrap a fleet of AI coding agents into a workspace you can actually see. This is the overview.

Apr · prototype
PyPI + AUR shipped
May · Tauri + 4 demos
Installer · soon
v1.0 · public

Running one AI coding agent is fine. The current generation of tools (Claude Code, qalcode2, DeepSeek-TUI) handles single-agent work well.

Running ten is when the problem starts.

Ten agents working in parallel across ten projects is increasingly normal. In plain tmux it's chaos. Agent 7 has been sitting on a permission prompt for an hour. Agent 3 finished its task. Agent 5 errored out silently. You have no idea — because tmux just shows you text boxes.

10+ parallel agents · normal
6 colour-coded states
4 live demos · no install
0 cloud calls · all local

gmux is the workspace shell for that problem. It reads every agent's state directly from its API and presents the fleet through three pillars: gestures as a real input device, agent management that keeps work flowing, visual oversight so you see everything at a glance.

tmux · the substrate
01 · Gestures: hands as input. Pinch, swipe, point, dwell. MediaPipe, on-device.
02 · Management: many agents in flight. Live state, todos, permissions, memory. SSE, not screen scraping.
03 · Oversight: see the whole fleet. Grid, flowchart, phone, projector. Never scan tmux windows again.
Three pillars · one workspace
Fig 1 · The three-pillar architecture. Each pillar fails independently: turn any one off and the others still work.

Pillar one · gestures as a real input device.

Most AI tools treat the keyboard as the only input. gmux adds a second one: your hands. A camera on your laptop runs MediaPipe locally, tracking both hands in real time, translating natural movements into terminal actions.

PINCH: click · drag-scroll
SWIPE: switch agent
POINT: voice toggle
THUMBS: approve permission
OPEN: activate · zoom
Right hand · navigation: swipe, pinch-scroll, pinch-click (cursor inside the workspace)
Left hand · commands: point, thumbs, three-finger jump (verbs, semantic actions)
Five core gestures · two hands · twelve actions
Fig 2 · Gesture vocabulary. Hand role separation prevents accidental input while typing.

The system runs in two modes. Passive mode (default while typing) has a higher confidence threshold and blocks swipes — you don't accidentally switch agents mid-sentence. Active mode (triggered by holding an open palm for 1.5 seconds) accepts every gesture for deliberate navigation.
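The two modes can be sketched as a small gate in front of the gesture recognizer. This is a minimal illustration, not gmux's actual code: the class name, the gesture labels, and both confidence thresholds are assumptions. Only the 1.5-second open-palm hold and the rule that passive mode blocks swipes come from the description above. How the system returns to passive mode is not specified, so the sketch leaves that out.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values for illustration; the text only says passive is stricter.
PASSIVE_MIN_CONFIDENCE = 0.85
ACTIVE_MIN_CONFIDENCE = 0.60
PALM_HOLD_SECONDS = 1.5  # from the text


@dataclass
class GestureGate:
    active: bool = False
    palm_since: Optional[float] = None

    def accept(self, gesture: str, confidence: float, now: float) -> bool:
        """Return True if this gesture should be turned into an action."""
        if gesture == "open_palm":
            # Track how long the palm has been held continuously.
            if self.palm_since is None:
                self.palm_since = now
            elif now - self.palm_since >= PALM_HOLD_SECONDS:
                self.active = True
            return False  # in this sketch the palm only arms active mode
        self.palm_since = None  # any other gesture breaks the hold

        if self.active:
            return confidence >= ACTIVE_MIN_CONFIDENCE
        if gesture == "swipe":
            return False  # passive mode never switches agents
        return confidence >= PASSIVE_MIN_CONFIDENCE
```

A confident swipe is dropped while the gate is passive, and the same swipe is accepted once an open palm has been held long enough to arm active mode.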

Voice rides the same input layer. Left-hand point toggles listening; faster-whisper transcribes on-device; the wake word routes the command to a specific agent pane. Camera sharing uses v4l2loopback so multiple apps can read the same webcam without conflicts.
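Wake-word routing can be pictured as a small parsing step on the transcript. The sketch below assumes the wake word is simply the agent's name spoken first, which the text does not confirm. The function name and return shape are hypothetical.

```python
def route_command(transcript, agent_names):
    """Split '<agent> <command>' into (agent, command).

    Returns None when the first word is not a known agent name.
    Assumes agent_names are lowercase.
    """
    words = transcript.strip().split(None, 1)
    if not words:
        return None
    # Drop trailing punctuation the transcriber may attach to the name.
    head = words[0].strip(",.:;!?").lower()
    if head not in agent_names:
        return None
    command = words[1].strip() if len(words) > 1 else ""
    return head, command
```

With agents named "volkus" and "planner", the transcript "Volkus, run the tests" would route "run the tests" to volkus, while a transcript with no leading agent name is ignored.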


Pillar two · agent management that keeps work flowing.

Ten agents in tmux is chaos because tmux is oblivious to what's running inside its panes. It just shows rectangles of text. Permission prompts fire silently. Agents finish without telling you.

gmux reads agent state directly from each agent's HTTP API (qalcode2's /event SSE stream, opencode's /session/status). No pattern-matching, no screen-scraping — exact state transitions pushed as they happen.
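To show what consuming a pushed state stream looks like, here is a minimal Server-Sent Events parser that follows the standard SSE framing: `data:` lines, with a blank line ending each event. The payload shape is an assumption. The sketch pretends each event is a JSON object with `agent` and `state` keys, which may not match what qalcode2 actually emits on /event.

```python
import json


def iter_sse_events(lines):
    """Yield one decoded JSON object per SSE event from an iterable of text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # The SSE format strips one optional leading space from the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)


def track_states(events):
    """Keep the latest state per agent (assumed 'agent' and 'state' keys)."""
    states = {}
    for event in events:
        states[event["agent"]] = event["state"]
    return states
```

In practice the lines would come from a streaming HTTP response decoded to text. Because each transition arrives as its own event, the consumer never has to guess state from what is printed in a pane.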

working · waiting · ! permission · done · idle · error

Those six colours are what tmux shows you for every window, every second. The status bar becomes a fleet dashboard:

gmux 2:◉ volkus 6/8 3:● planner 3/5 4:! research 2/4 5:◉ deploy 8/8 6:○ fish
Fig 3 · tmux status bar · live Window 4 needs you right now. You see it without opening it.

The numbers (6/8, 3/5) are live todo progress pulled from each agent's session. Permission prompts get a separate orange marker — when window 4 fires !, you know to look without scanning every window.
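The status line in Fig 3 can be reproduced from per-window state with a few lines of formatting. This is a sketch under assumptions: the glyph-to-state mapping is inferred from the figures rather than documented (working and done appear to share a glyph and differ only by colour, which is omitted here), and the `Window` type is hypothetical.

```python
from typing import NamedTuple, Optional, Tuple

# Inferred from the figures; treat this mapping as an assumption.
GLYPHS = {
    "working": "◉",
    "waiting": "●",
    "permission": "!",
    "done": "◉",
    "idle": "○",
    "error": "✗",
}


class Window(NamedTuple):
    index: int
    name: str
    state: str
    todos: Optional[Tuple[int, int]] = None  # (completed, total), if any


def render_status(windows):
    """Build a one-line fleet summary in the style of Fig 3."""
    parts = ["gmux"]
    for w in windows:
        segment = f"{w.index}:{GLYPHS[w.state]} {w.name}"
        if w.todos is not None:
            segment += f" {w.todos[0]}/{w.todos[1]}"
        parts.append(segment)
    return " ".join(parts)
```

Feeding it the five windows from Fig 3 yields the same text as the figure, with the todo counter omitted for a window such as `fish` that has no agent session.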

Beyond state, gmux carries the memory of each agent — a context layer that injects ~600 tokens of structural and episodic context into each new pane. The agent knows the codebase architecture, recent decisions, what other agents are doing, before its first message.
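One way to picture a fixed context budget is greedy packing: take memory entries in priority order and keep each one that still fits. The sketch below is illustrative only. The function names are hypothetical, the entries are assumed to arrive most-important-first, and the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer.

```python
def estimate_tokens(text):
    """Very rough token estimate: about four characters per token."""
    return max(1, len(text) // 4)


def build_context(entries, budget_tokens=600):
    """Pack entries (most important first) into a token budget.

    Returns the joined context string and the estimated tokens used.
    """
    chosen = []
    used = 0
    for text in entries:
        cost = estimate_tokens(text)
        if used + cost > budget_tokens:
            continue  # skip entries that do not fit, keep trying smaller ones
        chosen.append(text)
        used += cost
    return "\n".join(chosen), used
```

Skipping an oversized entry instead of stopping lets a later, shorter decision or fact still make it into the pane's starting context.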


Pillar three · visual oversight for the whole fleet.

Knowing the state of every agent is necessary but not sufficient. You also need to see what each one is touching, where attention is moving, and whether multiple agents are converging on the same file.

gmux ships four oversight views, each tuned for a different question.

Multi-agent monitor (8 agents): volkus working 6/8 · planner waiting 3/5 · research ! permission 2/4 · deepseek working 7/9 · deploy ✓ done 5/5 · gemini idle · haiku working 1/3 · claude-3 ✗ error 2/6
Agent flowchart (watching volkus, 6/8 todos): volkus → src/ → gesture-engine.js (WRITE)
Memory panel (episodic + semantic):
decision · planner · 2 days ago: Installer paused until 5 criteria pass
decision · research · 3 days ago: Option A overlay killed by Wayland
decision · volkus · 4 days ago: SSE not screen-scraping for state
fact · deepseek · 1 day ago: DeepSeek-TUI +21,752 stars this week
Phone companion (vol↓ cycle · vol↑ PTT): volkus working · research ! tap to approve · planner waiting
Fig 4 · Four oversight views in one diagram. Multi-agent grid · flowchart · memory · phone: same data, different lenses.

The multi-agent grid is the default — every agent as a card, sorted by urgency (permission first, then waiting, then working, then idle). The flowchart takes a single agent and shows the path of its attention: agent → folder → file with a coloured pulse-line and a running timer. The memory panel shows decisions, facts, and episodes across the workspace, filterable by agent or memory type. The phone companion puts the urgent subset in your pocket — volume keys cycle agents, push-to-talk to send commands.
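The grid's urgency ordering amounts to a sort with a fixed rank per state. A minimal sketch follows, assuming each agent is a dict with a `state` key. The text ranks only four states, so placing the remaining ones (done, error) after idle is an assumption made here for completeness.

```python
# Order taken from the text: permission first, then waiting, working, idle.
URGENCY_ORDER = ("permission", "waiting", "working", "idle")
_RANK = {state: i for i, state in enumerate(URGENCY_ORDER)}


def sort_cards(agents):
    """Return agents sorted by urgency; unranked states go last (assumption).

    sorted() is stable, so agents in the same state keep their input order.
    """
    return sorted(agents, key=lambda a: _RANK.get(a["state"], len(URGENCY_ORDER)))
```

Because the sort is stable, cards within one state do not reshuffle between refreshes as long as the input order is steady.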


What ships now.

working today

The Python terminal stack. pip install gmux && gmux --status-only gives you live AI state in your tmux status bar — no camera, no mic, no ceremony. AUR package available: paru -S gmux.

Four browser demos. Multi-agent monitor, agent flowchart, memory panel, phone companion — all on mock data, no backend required.

in progress

The Tauri desktop app. Real PTY, gesture canvas, live data flow. Three of five ship-criteria green; voice port and qalcode2 push-patch are the remaining blockers. Installer paused until all five are green.

gmux-brain memory wiring. The three-layer architecture (structural / episodic / workspace) is built but not yet registered in opencode.json. Registering it is a roughly 30-minute task, and it gives every agent ~600 tokens of context for free.

Try it.

All four demos run in your browser on simulated data with no backend. They use the same UI code that ships in the desktop app — just disconnected from the real data sources.

Open a demo — ↗ Multi-agent monitor ↗ Agent flowchart ↗ Memory panel 📱 Phone companion

gmux is an open project. MIT licensed.
Install the working parts today: pip install gmux or paru -S gmux.
