This article is the complete reference for what gmux ships, what's in progress, and what's deliberately not. Grouped by the three pillars — gestures, agent management, visual oversight — plus the foundation layer underneath.
Gestures.
Hands as a real input device, alongside the keyboard. MediaPipe local, no cloud, no API key. Voice rides the same layer.
21 landmarks per hand, both hands, ~60 fps. 7.5MB model ships with the package; CDN fallback if missing.
OPEN_PALM · FIST · POINT · PEACE · PINCH. Plus THUMBS_UP. Geometric classifier, no ML inference at recognition time.
Right hand = navigation (swipe, pinch-scroll, pinch-click). Left hand = commands (point, thumbs, three-finger jump). Prevents typing-time false fires.
Passive (typing, 0.90 threshold, swipes blocked). Active (1.5s open palm, 0.72 threshold, all gestures live). 3-frame confirmation prevents twitch.
Hold fingertip over a session pill for 900ms — fill-bar animates, selects on completion. Useful for projector mode where pinch is hard.
Local STT daemon on port 8770. tiny/base/small/medium model options. Energy-based VAD, 0.8s silence gap before transcribe.
"next window", "approve", "split horizontal", "run tests", "jump red", "window 1–9". Anything else routes as chat to focused agent.
Background ffmpeg reads /dev/video0 exclusively, writes to virtual /dev/video2. Many readers, no "camera in use" conflicts.
Desktop gesture bridges via D-Bus (KDE) and hyprctl (Hyprland). Open palm + swipe = workspace switching. 800ms cooldown between actions.
2xl typography, single-agent fullscreen, swipe to cycle. 4-corner homography calibration. Activate: gmux start --projector.
Killed by Wayland — no way to align an overlay with another window's pixel coordinates. Replaced by Tauri owning the terminal.
Per-agent crab avatars animated by state. Extracted from main UI — competed with state dot and pane glow for attention. Code preserved.
Agent management.
How gmux knows what every agent is doing, what they need, and what they remember. SSE-based, not screen-scraping.
One listener per qalcode2/opencode instance on /event?directory=.... State transitions pushed instantly. Reconnects every 5s on disconnect.
◉ working · ● waiting · ! permission · ◆ done · ─ no AI · ○ idle · ✗ error. Plus ^! for sub-agent permission (Task tool child).
Real todos[] array per session pulled from /session/:id/todo. Progress bar + count on every card (e.g. 6/8 tasks).
Permission events surface as orange ! across status bar, sidebar, phone. Approve/once/always/reject via gesture, voice, click, or volume key.
A tmux window can hold many panes. The tab shows the most urgent state across them all: error → permission → waiting → working → done → idle.
psutil tree walk per pane every 10s — RAM, CPU%, uptime, child processes (bun, opencode, node...).
Aggregator thread sums per-pane: token_in, token_out, token_reasoning, cache_read/write. Cost USD computed client-side (OpenCode bug).
Last 30 SSE tool events per pane (read · write · edit · bash · glob). Renders as a horizontal timeline in the Hardware tab.
tmux-resurrect extensions: re-launch qalcode2 agents in correct CWD on restart. 30s atomic save of window names. 0.3s stagger between launches.
MCP router across: structural (graphify), episodic (kalarc-memory), workspace (gmux native). 30-min wire-up unlocks 600-token context.
Multiple spellings ("kalarc"/"qalarc"/"calarc"...) routed to the active pane. Longest match wins. Then voice command lands in the right agent.
v3.5 — new-agent modal prefills the working directory from the last agent in the same session. Persists to localStorage.
Visual oversight.
Four lenses on the same agent data. Each tuned for a different question. All live in your browser right now.
Default view — every agent as a card, sorted by urgency. Todo bars, model name, RAM/CPU, last terminal line, expandable Hardware tab.
Second-monitor view. agent → folder → file with coloured pulse-line and running timer. Default = one agent. Overview = all agents in a ring.
Four views: Activity / Memory / Files (heatmap) / Graph. Filter by memory type (episodic/semantic/procedural/shared) or by agent.
iOS-styled mock. Vol↓ cycles agents, Vol↑ push-to-talk. Tap to approve/reject permission. Agents · Voice · Controls · Settings.
Live every 1s. Window tabs: 2:◉ volkus 6/8. Right-side attention summary: !3 ●6 ✋ 📷 🎤. Auto-sources your existing tmux.conf.
Each tmux window is a tab in the Tauri app: state colour, name, todo count. Click to switch. Drag to reorder. Insert-above on drop.
Right-side panel showing real conversation history from /session/:id/message. Per-pane lazy fetch with 3s throttle. Send → routes to focused agent.
Agent monitor's ⊞ ALL view — every agent in a ring, files orbit around them inside dotted "territories". Shared files glow orange (potential conflict).
Per-agent t/s sparkline in the perf strip. Infrastructure done; real data feeds need verification.
Every demo on gmux.ai has a banner identifying it as mock data + linking to the other three demos + back to gmux.ai.
What everything sits on.
The infrastructure layers that aren't a pillar by themselves but make every pillar possible.
pip install gmux or paru -S gmux. monitor.py + pane_status.py + bridge.py + session_restore.py + tui.py.
Rust + WebKit. portable-pty Rust crate spawns tmux. xterm.js renders. Three sidecars auto-start: monitor, voice, session-saver.
portable-pty Rust crate, 40 rows × 200 cols. Full ANSI, full terminfo. PTY output → Tauri events → xterm.js. Keystrokes → invoke → PTY.
WebSocket on :8767 + HTTP on :8768. Pane layout API (pixel rects per tmux pane). Voice command POST. PWA manifest.
Cloudflare Pages + Worker. Four browser demos on mock data. Interest counter with non-clean wobble. Articles area with this content.
Written, working, deliberately not shipping until all 5 ship-criteria pass. Two remaining: voice port (:8770) and qalcode2 push patch.
The honest position.
Four demos on mock data. Open any of them — no install, no backend, no sign-in. They use the same UI code that ships in the desktop app.
The Python terminal stack. pip install gmux && gmux --status-only
adds live AI state to your tmux status bar right now. No camera. No mic. No ceremony.
The Tauri desktop app. Three of five ship-criteria green. Voice port and qalcode2 push-patch are the remaining work — both achievable in a focused afternoon.
gmux-brain wiring. Built and documented. A 30-minute config change to
opencode.json unlocks 600 tokens of context for every new agent.
Try every demo.
All features MIT licensed. Article generated May 13, 2026 from a complete audit of six gmux repositories.