Alpha Roadmap

Date: 2026-02-18 Status: Draft

Context

These decisions were made during the design phase and inform the roadmap:

  • Networking is in scope — outbound-only (tap + bridge + MASQUERADE) for dependency resolution (pip, cargo, npm, etc.)
  • No host-side git — generic file/folder passing through btrfs workspaces only
  • Git via service VM — Soft Serve in a service VM, agents clone/push over the bridge network
  • Remote push via guest-agent — Nexus triggers git push on the service VM through MCP run_command, credentials stay in the service VM
  • Alpine rootfs — matches cracker-barrel’s known-working configuration
  • vsock for all control plane — MCP, PTY, control channel all flow through vsock via Nexus
  • XDG Base Directory spec — host-side paths follow XDG: config in $XDG_CONFIG_HOME/nexus/, state in $XDG_STATE_HOME/nexus/, data in $XDG_DATA_HOME/nexus/, runtime in $XDG_RUNTIME_DIR/nexus/

Steps

These 10 steps are the first phase of work toward the alpha milestone. They do not complete the milestone — additional steps will be planned as these are underway.

Step 1: nexusd — Systemd-Ready Daemon

Create the nexus Rust workspace (nexusd, nexus-lib). Build nexusd with signal handling (SIGTERM/SIGINT), structured logging, and an HTTP server serving /v1/health. Write a systemd user unit file.
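
A minimal sketch of the daemon core, assuming axum and tokio; the crate choices and logging setup here are illustrative, not fixed by this step:

use axum::{routing::get, Json, Router};
use serde_json::json;
use tokio::signal::unix::{signal, SignalKind};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Structured logging via tracing; output format (e.g. JSON) is decided later.
    tracing_subscriber::fmt().init();

    let app = Router::new().route("/v1/health", get(|| async { Json(json!({"status": "ok"})) }));
    let listener = tokio::net::TcpListener::bind("127.0.0.1:9600").await?;
    tracing::info!("nexusd listening on 127.0.0.1:9600");

    // Graceful shutdown on SIGTERM (systemd stop) or SIGINT (Ctrl-C).
    let mut sigterm = signal(SignalKind::terminate())?;
    let mut sigint = signal(SignalKind::interrupt())?;
    axum::serve(listener, app)
        .with_graceful_shutdown(async move {
            tokio::select! {
                _ = sigterm.recv() => tracing::info!("SIGTERM received, shutting down"),
                _ = sigint.recv() => tracing::info!("SIGINT received, shutting down"),
            }
        })
        .await?;
    Ok(())
}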

Deliverable: systemctl --user start nexus starts the daemon. curl localhost:9600/v1/health returns {"status":"ok"}. SIGTERM triggers graceful shutdown with log output.

Detailed plan: Step 1 Plan


Step 2: nexusctl — CLI Skeleton

Add nexusctl to the workspace. Clap-based CLI with noun-verb grammar. Implement nexusctl status (queries /v1/health) and nexusctl version. Recommended alias nxc. User config at $XDG_CONFIG_HOME/nexusctl/config.yaml. Actionable error messages when the daemon is unreachable.
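
A sketch of the CLI skeleton with clap's derive API. The blocking HTTP client (ureq 2.x style) and hard-coded address are assumptions for illustration; the real build would read the daemon address from $XDG_CONFIG_HOME/nexusctl/config.yaml.

use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(name = "nexusctl", version, about = "Control the Nexus daemon")]
struct Cli {
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Query daemon health via GET /v1/health.
    Status,
    /// Print the client version.
    Version,
    // Later steps hang noun-verb subcommands here: Vm { .. }, Ws { .. }, Image { .. }
}

fn main() {
    match Cli::parse().command {
        Command::Status => match ureq::get("http://127.0.0.1:9600/v1/health").call() {
            Ok(resp) => println!("{}", resp.into_string().unwrap_or_default()),
            Err(_) => {
                eprintln!("Error: cannot connect to Nexus daemon at 127.0.0.1:9600");
                eprintln!("  The daemon does not appear to be running.");
                eprintln!();
                eprintln!("  Start it: systemctl --user start nexus.service");
                std::process::exit(1);
            }
        },
        Command::Version => println!("nexusctl {}", env!("CARGO_PKG_VERSION")),
    }
}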

Deliverable: nexusctl status reports daemon health. When daemon is down:

Error: cannot connect to Nexus daemon at 127.0.0.1:9600
  The daemon does not appear to be running.

  Start it: systemctl --user start nexus.service

Step 3: SQLite State Store

Add rusqlite to nexus-lib. Initialize the schema on first daemon start. Storage abstraction trait for future backend swaps. Pre-alpha migration strategy: delete DB and recreate.
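
A sketch of the storage abstraction over rusqlite; the trait shape and the schema shown are placeholders, not the real design.

use rusqlite::Connection;

/// Storage abstraction so SQLite can be swapped for another backend later.
trait Store {
    fn init_schema(&self) -> rusqlite::Result<()>;
    fn table_count(&self) -> rusqlite::Result<i64>;
}

struct SqliteStore {
    conn: Connection,
}

impl SqliteStore {
    /// Open (or create) the database, e.g. at $XDG_STATE_HOME/nexus/nexus.db.
    fn open(path: &std::path::Path) -> rusqlite::Result<Self> {
        Ok(Self { conn: Connection::open(path)? })
    }
}

impl Store for SqliteStore {
    fn init_schema(&self) -> rusqlite::Result<()> {
        // Pre-alpha: no migrations; delete the DB file and recreate on schema changes.
        self.conn.execute_batch(
            "CREATE TABLE IF NOT EXISTS vms (
                 id    TEXT PRIMARY KEY,
                 name  TEXT NOT NULL UNIQUE,
                 state TEXT NOT NULL,
                 cid   INTEGER
             );",
        )
    }

    fn table_count(&self) -> rusqlite::Result<i64> {
        self.conn.query_row(
            "SELECT count(*) FROM sqlite_master WHERE type = 'table'",
            [],
            |row| row.get(0),
        )
    }
}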

Deliverable: Daemon creates $XDG_STATE_HOME/nexus/nexus.db with the full schema on startup. nexusctl status reports database status (path, table count, size).


Step 4: VM Records — CRUD Without Firecracker

REST endpoints for VMs (POST/GET/DELETE /v1/vms). CLI commands: vm list, vm create, vm inspect, vm delete. State machine limited to created — no Firecracker processes yet. Auto-assign vsock CID on create.
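
A sketch of the record shape and REST surface, assuming axum 0.7 route syntax; an in-memory map stands in for the Step 3 store, and the field names are illustrative.

use axum::{extract::State, routing::post, Json, Router};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Serialize, Clone)]
#[serde(rename_all = "lowercase")]
enum VmState {
    Created, // the only state in this step; running/crashed arrive with Firecracker
}

#[derive(Serialize, Clone)]
struct Vm {
    name: String,
    state: VmState,
    cid: u32, // auto-assigned vsock CID
}

#[derive(Deserialize)]
struct CreateVm {
    name: String,
}

type Db = Arc<Mutex<HashMap<String, Vm>>>; // stand-in for the Step 3 store

fn router(db: Db) -> Router {
    // GET/DELETE /v1/vms/:name (inspect, delete) follow the same pattern.
    Router::new()
        .route("/v1/vms", post(create_vm).get(list_vms))
        .with_state(db)
}

async fn create_vm(State(db): State<Db>, Json(req): Json<CreateVm>) -> Json<Vm> {
    let mut db = db.lock().unwrap();
    let cid = 3 + db.len() as u32; // naive auto-assignment; revisited in Step 6 unknowns
    let vm = Vm { name: req.name.clone(), state: VmState::Created, cid };
    db.insert(req.name, vm.clone());
    Json(vm)
}

async fn list_vms(State(db): State<Db>) -> Json<Vec<Vm>> {
    Json(db.lock().unwrap().values().cloned().collect())
}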

Deliverable: nexusctl vm create my-vm persists to SQLite. nexusctl vm list renders a table. nexusctl vm inspect my-vm shows full detail. nexusctl vm delete my-vm removes the record.


Step 5: btrfs Workspace Management

Master image import (mark an existing btrfs subvolume as read-only, register in DB). Workspace create (btrfs subvolume snapshot from master). List, inspect, delete. REST endpoints + CLI commands. Use libbtrfsutil via its Rust bindings; the library ships upstream with btrfs-progs and supports subvolume create, delete, snapshot, and list via ioctls on directory file descriptors. Common subvolume operations (create, snapshot) work unprivileged — no CAP_SYS_ADMIN required.

Firecracker requires block devices, not directories. The approach: each workspace subvolume contains a raw ext4 image file. mke2fs -d converts a directory tree into an ext4 image without root. btrfs CoW still applies at the host layer — snapshotting a subvolume containing a 1GB image file is instant and zero-cost until writes diverge.
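
A sketch of the two operations, shelling out to btrfs(8) and mke2fs for clarity; the real implementation would call the libbtrfsutil bindings described above, and the paths are illustrative.

use std::process::Command;

/// Snapshot the read-only master subvolume into a writable workspace.
/// CoW means this is instant regardless of the image file size inside.
fn create_workspace(master: &str, workspace: &str) -> std::io::Result<()> {
    let status = Command::new("btrfs")
        .args(["subvolume", "snapshot", master, workspace])
        .status()?;
    assert!(status.success(), "snapshot failed");
    Ok(())
}

/// Build a rootfs image without root: mke2fs -d packs a directory tree
/// straight into an ext4 image of the given size (e.g. "1G").
fn build_ext4_image(rootfs_dir: &str, image_path: &str, size: &str) -> std::io::Result<()> {
    let status = Command::new("mke2fs")
        .args(["-t", "ext4", "-d", rootfs_dir, image_path, size])
        .status()?;
    assert!(status.success(), "mke2fs failed");
    Ok(())
}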

Deliverable: nexusctl image import /path --name base registers an image. nexusctl ws create --base base --name my-ws creates a btrfs snapshot. nexusctl ws list shows workspaces. Verified with btrfs subvolume list.


Step 6: Rootfs Image + Firecracker VM Boot

Build a minimal Alpine rootfs, reusing cracker-barrel’s known-working Alpine configuration. Package it as an ext4 image via mke2fs -d (directory → ext4 without root). Store the image inside a btrfs subvolume and register as a master image. Spawn Firecracker with config (kernel from cracker-barrel, rootfs from master image snapshot, vsock device). Process monitoring — detect exit/crash, update VM state in SQLite. Start/stop lifecycle.
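
A sketch of the spawn-and-monitor loop using Firecracker's --config-file mode; the /run/nexus socket layout is an assumption, and in nexusd this would run as an async task rather than a blocking wait.

use std::process::{Command, Stdio};

fn boot_vm(vm_id: &str, config: &std::path::Path) -> std::io::Result<()> {
    let mut child = Command::new("firecracker")
        .arg("--api-sock").arg(format!("/run/nexus/{vm_id}/firecracker.sock"))
        .arg("--config-file").arg(config)
        .stdout(Stdio::piped())   // console output feeds `nexusctl vm logs`
        .stderr(Stdio::piped())
        .spawn()?;

    // Block until the VM exits and map the result onto the state machine.
    let status = child.wait()?;
    let new_state = if status.success() { "stopped" } else { "crashed" };
    println!("vm {vm_id} exited, new state: {new_state}");
    Ok(())
}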

Deliverable: nexusctl vm start my-vm boots an Alpine VM in Firecracker, VM reaches running state. nexusctl vm stop my-vm shuts down cleanly. Unexpected termination updates state to crashed. nexusctl vm logs my-vm shows console output.

Unknowns:

  • Firecracker API socket management and cleanup.
  • CID allocation strategy for vsock (auto-increment from 3, or pool).

Step 7: guest-agent — vsock Control Channel

Add guest-agent binary to the workspace. Uses tokio-vsock for async vsock I/O on both sides — guest-agent listens on VMADDR_CID_ANY port 100, nexusd connects via the VM’s UDS with CONNECT 100\n. Sends image metadata on connect (parsed from /etc/nexus/image.yaml). Systemd service inside the VM rootfs.

First vsock connection and initial message are ~50-100x slower than subsequent messages on an established connection (validated through cracker-barrel benchmarking). Connections are established eagerly at boot and kept alive.
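
A sketch of the guest-agent side, assuming a tokio-vsock 0.5-style API (VsockAddr-based bind); the metadata payload shown is hard-coded and not a decided wire format.

use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio_vsock::{VsockAddr, VsockListener, VMADDR_CID_ANY};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let mut listener = VsockListener::bind(VsockAddr::new(VMADDR_CID_ANY, 100))?;
    loop {
        let (mut stream, peer) = listener.accept().await?;
        println!("control connection from {peer:?}");

        // Send image metadata immediately on connect. The real agent would
        // parse /etc/nexus/image.yaml; this payload is a placeholder.
        let metadata = r#"{"image":"alpine-base","version":"0.1.0"}"#;
        stream.write_all(metadata.as_bytes()).await?;

        // Keep the connection alive; echo control-channel traffic for now.
        let mut buf = vec![0u8; 4096];
        while let Ok(n) = stream.read(&mut buf).await {
            if n == 0 { break; }
            stream.write_all(&buf[..n]).await?;
        }
    }
}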

Deliverable: VM boots. guest-agent starts via systemd inside the VM. nexusd connects on vsock port 100 and receives image metadata. VM state includes readiness status.


Step 8: MCP Tools in guest-agent

JSON-RPC 2.0 server on vsock port 200 inside the guest-agent via tokio-vsock. Implements four tools: file_read, file_write, file_delete, run_command. nexusd maintains a connection pool per VM per port — connections established eagerly at boot and kept alive for the VM’s lifetime. Reconnection is automatic on failure. run_command streams stdout/stderr incrementally over the MCP channel.
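
A sketch of the JSON-RPC 2.0 framing and tool dispatch; the parameter shapes ("path", "content") are assumptions, and run_command's streaming is omitted here.

use serde::{Deserialize, Serialize};
use serde_json::{json, Value};

#[derive(Deserialize)]
struct Request {
    jsonrpc: String, // must be "2.0"
    id: Value,
    method: String,
    #[serde(default)]
    params: Value,
}

#[derive(Serialize)]
struct Response {
    jsonrpc: &'static str,
    id: Value,
    #[serde(skip_serializing_if = "Option::is_none")]
    result: Option<Value>,
    #[serde(skip_serializing_if = "Option::is_none")]
    error: Option<Value>,
}

fn handle(req: Request) -> Response {
    let result = match req.method.as_str() {
        "file_read" => std::fs::read_to_string(req.params["path"].as_str().unwrap_or_default())
            .map(Value::String)
            .map_err(|e| e.to_string()),
        "file_write" => std::fs::write(
            req.params["path"].as_str().unwrap_or_default(),
            req.params["content"].as_str().unwrap_or_default(),
        )
        .map(|_| json!({"ok": true}))
        .map_err(|e| e.to_string()),
        "file_delete" => std::fs::remove_file(req.params["path"].as_str().unwrap_or_default())
            .map(|_| json!({"ok": true}))
            .map_err(|e| e.to_string()),
        "run_command" => Err("run_command streams stdout/stderr; not shown in this sketch".to_string()),
        other => Err(format!("unknown method: {other}")),
    };
    match result {
        Ok(v) => Response { jsonrpc: "2.0", id: req.id, result: Some(v), error: None },
        Err(msg) => Response {
            jsonrpc: "2.0",
            id: req.id,
            result: None,
            error: Some(json!({"code": -32000, "message": msg})),
        },
    }
}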

Deliverable: From the host, send MCP file_write to a running VM — file appears inside the VM. Send run_command with cat /etc/os-release — returns Alpine release info. Send file_read — returns file contents. Send file_delete — file is removed.


Step 9: Networking — Outbound Access

Bridge creation (nexbr0). Tap device per VM, attached to the bridge. IP assignment from configured CIDR (stored in SQLite). NAT masquerade for outbound internet access. CAP_NET_ADMIN via setcap on the nexusd binary. Per-VM isolation rules via the nftables crate (JSON API — drives nftables via nft -j, requires nftables >= 0.9.3 at runtime). DNS configuration inside VMs.
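
A sketch of the outbound NAT rule through nft's JSON API. The real code would go through the nftables crate's typed schema; here the ruleset is hand-built JSON piped to nft -j -f -, and the table/chain names and exact statement keys are assumptions to verify against libnftables-json(5).

use serde_json::json;
use std::io::Write;
use std::process::{Command, Stdio};

fn apply_masquerade(subnet_addr: &str, prefix_len: u32) -> std::io::Result<()> {
    let ruleset = json!({
        "nftables": [
            { "add": { "table": { "family": "ip", "name": "nexus" } } },
            { "add": { "chain": {
                "family": "ip", "table": "nexus", "name": "postrouting",
                "type": "nat", "hook": "postrouting", "prio": 100, "policy": "accept"
            } } },
            // Masquerade traffic leaving the VM subnet (the nexbr0 CIDR).
            { "add": { "rule": {
                "family": "ip", "table": "nexus", "chain": "postrouting",
                "expr": [
                    { "match": {
                        "op": "==",
                        "left": { "payload": { "protocol": "ip", "field": "saddr" } },
                        "right": { "prefix": { "addr": subnet_addr, "len": prefix_len } }
                    } },
                    { "masquerade": null }
                ]
            } } }
        ]
    });

    let mut nft = Command::new("nft")
        .args(["-j", "-f", "-"]) // read the JSON ruleset from stdin
        .stdin(Stdio::piped())
        .spawn()?;
    {
        let mut stdin = nft.stdin.take().expect("piped stdin");
        stdin.write_all(ruleset.to_string().as_bytes())?;
    } // drop stdin so nft sees EOF
    let status = nft.wait()?;
    assert!(status.success(), "nft rejected the ruleset");
    Ok(())
}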

Deliverable: A booted VM can curl https://example.com successfully. nexusctl vm list shows assigned IP addresses. nexusctl vm inspect shows network configuration.

Unknowns:

  • setcap interaction with systemd user services — may need AmbientCapabilities= in the unit file instead.
  • DNS resolver configuration inside Alpine VMs (static /etc/resolv.conf vs. DHCP).

Step 10: PTY + Terminal Attach

PTY management in guest-agent using nix::pty (already a transitive dependency) wrapped in tokio::io::unix::AsyncFd for async I/O. One PTY per session on vsock ports 300-399. WebSocket endpoint in nexusd (GET /v1/vms/:id/terminal with upgrade) implementing the ttyd protocol — a single-byte-prefix framing scheme that gives xterm.js compatibility for free:

Client → Server:  '0'=INPUT  '1'=RESIZE(JSON)  '2'=PAUSE  '3'=RESUME  '{'=HANDSHAKE
Server → Client:  '0'=OUTPUT  '1'=SET_TITLE  '2'=SET_PREFS

No Rust library exists for this — implement directly over axum WebSocket (~150 lines). nexusctl attach <vm> connects to the WebSocket and bridges to the local terminal. Terminal resize (SIGWINCH) propagated via the RESIZE message type.
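
A sketch of the framing layer only, decoding client frames and prefixing PTY output per the table above; the RESIZE field names ("columns", "rows") are assumed from ttyd's client and should be checked during implementation.

enum ClientMsg {
    Input(Vec<u8>),                  // '0' + raw bytes typed by the user
    Resize { cols: u16, rows: u16 }, // '1' + JSON body
    Pause,                           // '2'
    Resume,                          // '3'
    Handshake(Vec<u8>),              // '{' starts the JSON handshake object itself
}

fn decode_client(frame: &[u8]) -> Option<ClientMsg> {
    match *frame.first()? {
        b'0' => Some(ClientMsg::Input(frame[1..].to_vec())),
        b'1' => {
            let v: serde_json::Value = serde_json::from_slice(&frame[1..]).ok()?;
            Some(ClientMsg::Resize {
                cols: v["columns"].as_u64()? as u16,
                rows: v["rows"].as_u64()? as u16,
            })
        }
        b'2' => Some(ClientMsg::Pause),
        b'3' => Some(ClientMsg::Resume),
        b'{' => Some(ClientMsg::Handshake(frame.to_vec())),
        _ => None,
    }
}

/// Server -> client: PTY output is prefixed with '0'.
fn encode_output(pty_bytes: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(pty_bytes.len() + 1);
    frame.push(b'0');
    frame.extend_from_slice(pty_bytes);
    frame
}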

Deliverable: nexusctl attach my-vm opens an interactive shell inside the VM. Typing commands works. Ctrl-C, Ctrl-D, and window resizing behave correctly. Disconnecting leaves the VM running.


After These Steps

These are needed for the alpha milestone but will be planned after the first 10 steps:

  • Soft Serve service VM setup and configuration
  • Portal VM with agent runtime (OpenClaw integration)
  • vsock routing between VMs (portal → Nexus → work)
  • Agent resource abstraction (portal + work VM pairs managed as one unit)
  • nexusctl apply and declarative nexus.yaml configuration
  • Package repository setup (packages.workfort.dev)
  • CLI polish: output formatting, --jq, shell completions, --dry-run