EdgeVox: Voice agents for robots.
Sub-second voice pipeline. Plug-and-play harness. Fully on-device.

A small framework with a wide surface — the harness, the voice pipeline, the robot bridge, and a reference desktop app.
The harness provides @tool and @skill decorators, LLMAgent with handoffs, nine composable workflows (Sequence, Fallback, Loop, Parallel, Router, Supervisor, Orchestrator, Retry, Timeout), and cancellable skills with GoalHandle.
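To make the idea of composable workflows concrete, here is a minimal, framework-free sketch in plain Python. It does not use the EdgeVox API: the names sequence, fallback, retry, and Step are hypothetical stand-ins chosen for illustration, and the real Sequence, Fallback, and Retry workflows may have different signatures and semantics.

```python
from typing import Callable

# Hypothetical step type for this sketch: text in, text out.
Step = Callable[[str], str]


def sequence(*steps: Step) -> Step:
    """Run steps in order, feeding each output into the next step."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run


def fallback(*steps: Step) -> Step:
    """Try each step in turn; return the first result that does not raise."""
    def run(text: str) -> str:
        last_error = None
        for step in steps:
            try:
                return step(text)
            except Exception as exc:
                last_error = exc
        raise RuntimeError("all fallback steps failed") from last_error
    return run


def retry(step: Step, attempts: int = 3) -> Step:
    """Call a step up to `attempts` times, re-raising the final failure."""
    def run(text: str) -> str:
        for attempt in range(attempts):
            try:
                return step(text)
            except Exception:
                if attempt == attempts - 1:
                    raise
        raise ValueError("attempts must be at least 1")
    return run


# Illustrative steps (not real robot skills).
def flaky_grasp(command: str) -> str:
    raise RuntimeError("gripper busy")


def safe_grasp(command: str) -> str:
    return f"grasped after: {command}"


calls = {"n": 0}


def sometimes_ok(command: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return command.upper()


plan = sequence(str.strip, fallback(flaky_grasp, safe_grasp))
result = plan("  pick up the red cube ")
retried = retry(sometimes_ok, attempts=3)("ok")
```

Because every combinator returns another Step, workflows nest freely: a retry can wrap a fallback, which can sit inside a sequence.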
The voice pipeline streams STT → LLM → TTS with a sub-second latency target on consumer GPUs. It supports 16 languages, 56 voices, and four TTS backends (Kokoro · Piper · Supertonic · PyThaiTTS). VAD barge-in halts speech mid-phrase.
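Barge-in can be pictured as a race between speech playback and a voice-activity signal. The sketch below uses only asyncio from the standard library and is not EdgeVox code: speak, speak_with_barge_in, and the voice_detected event are hypothetical names, and the sleeps stand in for real audio playback and a real VAD.

```python
import asyncio


async def speak(words, spoken):
    """Pretend TTS: 'play' one word at a time and record what was spoken."""
    for word in words:
        spoken.append(word)
        await asyncio.sleep(0.05)  # stand-in for audio playback time


async def speak_with_barge_in(words, voice_detected):
    """Speak until done, or stop early when the VAD event fires."""
    spoken = []
    tts = asyncio.create_task(speak(words, spoken))
    vad = asyncio.create_task(voice_detected.wait())
    done, pending = await asyncio.wait(
        {tts, vad}, return_when=asyncio.FIRST_COMPLETED
    )
    # Cancel whichever task lost the race and wait for it to unwind.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    interrupted = vad in done
    return spoken, interrupted


async def demo():
    voice_detected = asyncio.Event()
    # Simulate the user starting to talk 120 ms into playback.
    asyncio.get_running_loop().call_later(0.12, voice_detected.set)
    words = "the quick brown fox jumps over the lazy dog again".split()
    return await speak_with_barge_in(words, voice_detected)


spoken, interrupted = asyncio.run(demo())
```

Cancelling the playback task, rather than polling a flag between words, is what lets the interruption land mid-phrase instead of at the next sentence boundary.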
The robot bridge is ROS2-native: voice + robot_state + agent_event, TF2, Nav2 cmd_vel, and an execute_skill action server. ToyWorld · IR-SIM · MuJoCo Franka · Unitree G1/H1 · external Gazebo/Isaac.
RookApp is the reference PySide6 build — Qt UI + LLMAgent + llama.cpp + Stockfish in one Python process. No browser, no web server, no Tauri. The framework runs end-user products, not just demos.
One pip install, one process, one warm laptop. Each shot below is a real screen from this repo.




Every edgevox-agent invocation composes with --ros2 for the full topic surface.
edgevox # voice pipeline TUI
edgevox-agent robot-panda --text-mode # MuJoCo Franka pick-and-place
edgevox-agent robot-irsim --text-mode # IR-SIM 2D navigation
edgevox-agent robot-humanoid --simple-ui # Unitree G1 humanoid (auto-fetched)
edgevox-agent robot-external --text-mode # drive any external ROS2 sim / robot
edgevox-chess-robot # RookApp: PySide6 desktop chess partner

Any edgevox-agent invocation composes with --ros2 to publish /edgevox/robot_state + /agent_event, accept cmd_vel / goal_pose + text_input, and expose the execute_skill action.
A small set of rules the codebase actually enforces — not aspirations.