The stack we built

gmux is six implementations stacked on top of one another, each adding a layer of the three-pillar workspace: gestures, agent management, visual oversight. Below is the map. The shipped layers are the foundation. The work-in-progress layers are how the desktop app becomes real.

Fig 1 · The six implementations, stacked. Green = shipped · orange = in progress · grey = dropped or concept only.

The rest of this article walks each layer in order, with a note for each one about which of the three pillars it serves.

The six layers in order.

01 · April 2026 · Concept only

The assembled prototype.

Before a single gmux repo existed, every piece was already in MASTER_PROJECTS/: wake-word voice routing, MediaPipe face tracking, four-agent orchestration. Symlinking them showed all the technology existed. The work was integration and UX.

management · gestures

02 · April → May · Shipped

The Python terminal stack.

A Python daemon that reads every qalcode2/opencode instance's SSE stream and writes live state to tmux's status bar. Six-state colour code, todo progress, permission markers. Available now: pip install gmux or paru -S gmux.

management · oversight
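As a rough illustration of how live state could reach the status bar, here is a minimal sketch. It assumes the daemon formats one coloured segment per pane with tmux style markers and pushes the result with `tmux set-option`. The colour choices, the segment layout, and both function names are hypothetical, and only three of the six states are shown because the others are not named here.

```python
import subprocess

# Hypothetical state -> tmux colour mapping. Only the states named in this
# article are listed; the colours are assumptions, not gmux's real palette.
STATE_COLOURS = {
    "working": "yellow",
    "waiting": "green",
    "permission": "red",
}


def render_segment(pane: str, state: str, done: int, total: int) -> str:
    """Format one pane as a coloured tmux status segment with todo progress."""
    colour = STATE_COLOURS.get(state, "default")
    return f"#[fg={colour}]{pane} {state} {done}/{total}#[default]"


def push_status(segments: list[str]) -> None:
    """Write the joined segments to the right side of tmux's status bar."""
    subprocess.run(
        ["tmux", "set-option", "-g", "status-right", " ".join(segments)],
        check=True,
    )


print(render_segment("api", "working", 2, 5))
# #[fg=yellow]api working 2/5#[default]
```

`push_status` needs a running tmux server; `render_segment` is pure and can be checked on its own.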

03 · early May · Abandoned

The transparent overlay.

First attempt at a visual layer: a chromeless floating window with a gesture canvas drawn over the terminal. Wayland killed it: there is no API to query another window's pixel position, so the overlay couldn't align with tmux panes. Decision logged; move to Option B, the Tauri desktop app (layer 04).

gestures

04 · mid May · In progress

The Tauri desktop app.

Tauri (Rust + WebKit) owns the terminal directly — spawns tmux via a real PTY, xterm.js renders it, gesture canvas layers on top. PTY working, sidebar with 14 panes live, three of five ship-criteria green. Voice port and qalcode2 push-patch are the remaining blockers.

gestures · management · oversight

05 · May · Built, not wired

The memory layer (gmux-brain).

An MCP server that routes queries across three memory types: structural (graphify — what calls what), episodic (kalarc-memory — why X was decided), workspace (gmux native — what other agents are doing). Injects ~600 tokens of context into every new agent pane.

management
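To make the routing idea concrete, here is a deliberately naive sketch. The real routing rules in gmux-brain are not described here, so the keyword cues, the fallback to workspace memory, and the word-count stand-in for tokens are all assumptions made for illustration only.

```python
# Hypothetical keyword cues per memory type; not gmux-brain's actual rules.
ROUTES = {
    "structural": ("calls", "imports", "depends", "defined"),
    "episodic": ("why", "decided", "decision"),
    "workspace": ("agent", "pane", "working on"),
}


def route(query: str) -> str:
    """Pick a memory type from keyword cues, falling back to workspace."""
    q = query.lower()
    for memory, cues in ROUTES.items():
        if any(cue in q for cue in cues):
            return memory
    return "workspace"


def trim_to_budget(snippets: list[str], budget: int = 600) -> list[str]:
    """Keep snippets in order until a rough budget is spent.

    One whitespace-separated word is counted as one token, which is a crude
    assumption; a real tokenizer would give different counts.
    """
    kept, used = [], 0
    for snippet in snippets:
        cost = len(snippet.split())
        if used + cost > budget:
            break
        kept.append(snippet)
        used += cost
    return kept


print(route("What calls parse_config?"))      # structural
print(route("Why was the overlay dropped?"))  # episodic
```

The budget default of 600 mirrors the approximate context size mentioned above; everything else is a placeholder.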

06 · May 13 · Live

The product face — gmux.ai.

Single-file landing on Cloudflare Pages plus a Worker handling interest votes. Four live demos ship under /demo/: multi-agent monitor, agent flowchart, memory panel, phone companion. All run on mock data — no backend required.

oversight

The decision that made the management pillar work.

The first version of the Python stack detected agent state by pattern-matching terminal output — look for the qalcode2 prompt (waiting), look for a spinner character (working), look for "Continue?" (permission). It worked. Then it stopped working.

A model outputting a long response with ❯ in it would flip the indicator to "waiting" mid-stream. A code block with spinner characters meant "working" when the pane was idle. The approach was inherently fragile — guessing state from visual artifacts.
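The failure is easy to reproduce with a toy scraper. This sketch is not the original detector; the check order and the braille spinner frames are assumptions chosen to show the same class of misfire.

```python
# Spinner frames are an assumption; many CLIs use this braille set.
SPINNER_FRAMES = "⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏"


def scrape_state(pane_text: str) -> str:
    """Guess agent state from visible characters in the pane."""
    if "Continue?" in pane_text:
        return "permission"
    if "❯" in pane_text:
        return "waiting"
    if any(frame in pane_text for frame in SPINNER_FRAMES):
        return "working"
    return "unknown"


# A model mid-response that happens to print a prompt glyph in a shell example.
streaming = "Run it like this:\n  ❯ npm test\nThen check the"
print(scrape_state(streaming))  # waiting, although the agent is still working
```

Any text the model can print is text the scraper can mistake for a signal, which is why no amount of pattern tuning fixes it.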

qalcode2 exposes a /session/status endpoint and an /event SSE stream. Every state transition (agent started thinking, finished, permission prompt fired, agent responded) comes through as a structured event. Subscribing to the stream made detection exact and instant.

No pattern-matching. No screen-scraping. State transitions pushed as they happen.

This is the architecture that keeps the management pillar honest. Every visible indicator — the status bar colour, the sidebar dot, the todo progress, the permission marker — comes from a real event, not an inferred one.
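A minimal sketch of the event-driven approach, assuming each SSE message carries a JSON payload in its `data:` field. The event type names, the `session` key, and the transition table are hypothetical; the real qalcode2 payloads are not shown in this article.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE message (blank line = end)."""
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        if line == "":
            if data:
                yield json.loads("\n".join(data))
                data = []
            continue
        field, _, value = line.partition(":")
        if field == "data":
            data.append(value[1:] if value.startswith(" ") else value)


# Hypothetical event types mapped to the indicator state they produce.
TRANSITIONS = {
    "agent.thinking": "working",
    "agent.finished": "waiting",
    "permission.asked": "permission",
    "permission.answered": "working",
}


def apply_event(states: dict[str, str], event: dict) -> None:
    """Update the per-session state table from one structured event."""
    new_state = TRANSITIONS.get(event.get("type"))
    if new_state is not None:
        states[event["session"]] = new_state


stream = [
    'data: {"type": "agent.thinking", "session": "pane-3"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"type": "permission.asked", "session": "pane-3"}\n',
    "\n",
]
states: dict[str, str] = {}
for event in parse_sse(stream):
    apply_event(states, event)
print(states)  # {'pane-3': 'permission'}
```

In a live daemon the lines would come from a streaming HTTP response on /event rather than a list. Nothing in the model's output can change the table, because only structured events are read.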

Two cameras solve the gestures pillar.

A real webcam can only be opened by one process at a time. The gesture engine wants it. So does the browser when you want a demo. The fix: v4l2loopback — a kernel module creating a virtual camera device.

Fig 2 · Camera broker architecture. Nothing reads /dev/video0 except the broker. Everything else reads /dev/video2.

A background ffmpeg process reads /dev/video0 exclusively and writes to /dev/video2. The gesture engine and browser apps both read from the virtual device. No "camera in use" conflicts.
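The broker can be sketched as a thin wrapper around one ffmpeg invocation. This assumes the virtual device already exists (for example via `modprobe v4l2loopback video_nr=2`) and that a `yuv420p` pixel format suits the consumers. The exact flags gmux uses are not given here, and the function names are hypothetical.

```python
import subprocess


def broker_command(real: str = "/dev/video0",
                   virtual: str = "/dev/video2") -> list[str]:
    """Build an ffmpeg argv that copies frames from the real camera
    to the v4l2loopback device."""
    return [
        "ffmpeg",
        "-f", "v4l2", "-i", real,   # sole reader of the physical camera
        "-pix_fmt", "yuv420p",      # assumed format; adjust per consumer
        "-f", "v4l2", virtual,      # writer side of the loopback device
    ]


def start_broker() -> subprocess.Popen:
    """Launch the broker in the background; the caller owns the process."""
    return subprocess.Popen(broker_command())


print(" ".join(broker_command()))
```

Keeping the argv in a pure function makes the device paths easy to override and the command easy to inspect without touching the camera.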

How the oversight pillar shows attention.

The agent monitor's flowchart view is how visual oversight pays off. Each agent's current attention becomes a vertical chain: agent → folder → folder → file. The active edge pulses in the agent's colour with a running timer attached.
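The chain itself is just the agent followed by the path components of the file it is touching. A minimal sketch, assuming relative POSIX paths; the function name and the sample path are hypothetical.

```python
from pathlib import PurePosixPath


def attention_chain(agent: str, file_path: str) -> list[str]:
    """Return the nodes of the chain: agent, each folder, then the file."""
    return [agent, *PurePosixPath(file_path).parts]


print(" → ".join(attention_chain("agent-2", "src/auth/session.py")))
# agent-2 → src → auth → session.py
```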

Fig 3 · Two views in the flowchart. Default = one agent's attention chain. Overview = whole-fleet web with conflict detection.

The default view watches one agent. Click an agent in the rail to follow it, or toggle "auto-switch" to always follow whichever pane you're focused on in gmux. The overview view (⊞ ALL) shows every agent at once arranged in a ring; their active files orbit around them inside dotted "territories". When two agents touch the same file, the file glows — surfacing potential merge conflicts before they happen.
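The conflict check reduces to inverting the agent-to-files map and keeping files with more than one owner. The demo itself is vanilla JS; this Python sketch only illustrates the logic, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def shared_files(active: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each file touched by two or more agents to those agents."""
    touched_by: dict[str, list[str]] = defaultdict(list)
    for agent, files in active.items():
        for path in files:
            touched_by[path].append(agent)
    return {
        path: sorted(agents)
        for path, agents in touched_by.items()
        if len(agents) > 1
    }


active = {
    "agent-1": {"src/api.py", "src/db.py"},
    "agent-2": {"src/db.py"},
    "agent-3": {"README.md"},
}
print(shared_files(active))  # {'src/db.py': ['agent-1', 'agent-2']}
```

Every key in the result is a file that would glow in the overview.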

Pure SVG, vanilla JS, zero dependencies. Tauri-ready as a second-monitor window. Try it: /demo/monitor/.

Where we are now.

ships today

Python terminal stack (layer 02) — PyPI + AUR. Live AI state in tmux status bar today: pip install gmux && gmux --status-only.

Four browser demos (layer 06) — all running on mock data, no install. /demo/, /demo/monitor/, /demo/memory/, /demo/phone/.

close to shipping

Tauri desktop app (layer 04) — three of five ship-criteria green. Voice port (move from :8765 to :8770) and qalcode2 push-patch are the remaining blockers.

gmux-brain wiring (layer 05) — built and documented. A 30-minute config change to opencode.json unlocks ~600 tokens of context for every new agent pane.

won't be shipped

Transparent overlay (layer 03) — Wayland made it impossible to align the overlay with tmux panes reliably. Tauri (layer 04) replaces it by owning the terminal directly. Decision logged in DECISIONS.md May 2026.

Try the stack.

Open a demo: Multi-agent monitor · Agent flowchart · Memory panel · Phone companion

All layers MIT licensed.
Today: pip install gmux · paru -S gmux
