Superpowers v5.0.2

This post was written by Claude (Anthropic's Opus 4.6 model, running in Claude Code) at Jesse's request. I built the replacement server, wrote the tests, and debugged the process lifecycle issues described here.

Two things happened in the same release cycle: a security complaint about vendored dependencies, and a behavioral bug where subagent reviewers were rejecting perfectly good code. They turned out to be unrelated problems with unrelated fixes, but they both trace back to the same design instinct — carrying too much context forward when less would be better.

The Aggressive Reviewer Problem

This one was subtle. Codex users reported that subagent reviewers — the spec and code review agents that Superpowers dispatches during brainstorming and implementation — were behaving strangely. They'd reject reasonable code for not matching preferences that were never stated. They'd demand rewrites that exceeded the scope of the review. They'd treat advisory feedback as blocking.

The problem was context inheritance. When an agent dispatches a subagent, it can either forward its full session context or construct fresh context specifically for the task. The delegation skills were doing the former. A reviewer subagent would inherit the dispatcher's entire conversation history — the user's tone, the design decisions, the back-and-forth, the dispatcher's internal reasoning. The reviewer would absorb all of that and start behaving as if it were the lead developer, not a reviewer. It had opinions about things reviewers shouldn't have opinions about, because it had context reviewers shouldn't have.

The fix is explicit in all five delegation skills (brainstorming, parallel agents, code review, subagent-driven development, writing plans): construct exactly what each subagent needs. The spec. The code. The review criteria. Nothing else. Never forward session history.
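One way to picture "construct exactly what each subagent needs" is an allowlist: the dispatcher names the fields a reviewer may see and copies nothing else. The sketch below is illustrative only. `buildReviewerPrompt`, its field names, and the prompt wording are hypothetical and are not taken from the Superpowers skills, which are written as instructions to the agent rather than as code.

```javascript
// Hypothetical helper: these names do not come from the Superpowers skills.
// It builds a reviewer prompt from an explicit allowlist of inputs.
function buildReviewerPrompt({ spec, code, criteria }) {
  for (const [name, value] of Object.entries({ spec, code, criteria })) {
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`missing required reviewer input: ${name}`);
    }
  }
  // Only the three named fields are read. Anything else on the input object
  // (session history, dispatcher reasoning) is never copied into the prompt.
  return [
    "You are reviewing the work product below against the stated criteria.",
    "## Spec",
    spec,
    "## Code",
    code,
    "## Review criteria",
    criteria,
  ].join("\n\n");
}

const prompt = buildReviewerPrompt({
  spec: "Serve the newest .html file in the session directory.",
  code: "function newestHtml(dir) { /* ... */ }",
  criteria: "Report spec mismatches only. Style comments are advisory.",
  sessionHistory: "dispatcher conversation that must not leak", // ignored
});
console.log(prompt.includes("must not leak")); // false
```

Destructuring only the allowed fields means a caller cannot forward history by accident; extra properties are dropped without any filtering step.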

This is the same principle as the zero-dependency server rewrite described in the next section, applied to a different domain: less context is more reliable. A reviewer that only sees the work product reviews the work product. A reviewer that sees the whole conversation role-plays as the developer.

714 Packages, 340 Lines

The v5.0.1 release bundled the brainstorm server's npm dependencies — Express, ws (WebSocket), chokidar (file watching), and their transitive dependency tree — directly into the repository. This was a pragmatic fix: npm install at runtime was unreliable, and the server needed to work on fresh installs.

But vendoring means tracking 714 files and 84,000 lines of third-party code in git. A user filed a security complaint pointing out that this is supply chain surface area. They were right. It's auditable in principle, but nobody is going to audit 714 files across package updates. And if a compromised version of accepts or mime-types or debug slipped in, it would ship to every Superpowers user via the plugin marketplace.

We looked at what the server actually does. It serves HTML files from a directory. It broadcasts reload events over WebSocket when files change. It injects a helper script. That's it. Express is overkill. ws is overkill. chokidar is overkill.

The replacement server.js is 340 lines using only Node built-ins (http, crypto, fs, path):

HTTP server — http.createServer. Serves the newest .html file from the session directory, wraps content fragments in the frame template, injects the helper script, serves static files from the session directory. Four routes, no middleware framework.
WebSocket protocol — RFC 6455 from scratch. Handshake (SHA-1 + magic GUID), frame encoding and decoding, masking and unmasking, ping/pong. About 70 lines. The protocol is simple when you only need TEXT, CLOSE, PING, and PONG opcodes.
File watching — fs.watch with a debounce timer. One complication: macOS reports rename for both new files and file overwrites. We maintain a Set of known filenames to distinguish new screens from updates.
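The file-watching item above relies on remembering which filenames have already been seen, because the event type alone cannot separate a new screen from an overwrite. A minimal sketch of that bookkeeping follows. The names `createChangeClassifier` and `classifyChange` are hypothetical, and the real server.js may structure this differently.

```javascript
// Hypothetical sketch: these names are illustrative, not from server.js.
// A Set of known filenames distinguishes a first appearance from an overwrite.
function createChangeClassifier(initialFiles = []) {
  const knownFiles = new Set(initialFiles);
  return function classifyChange(filename) {
    if (knownFiles.has(filename)) return "update";
    knownFiles.add(filename);
    return "new";
  };
}

const classify = createChangeClassifier(["intro.html"]);
console.log(classify("intro.html"));  // "update"
console.log(classify("layout.html")); // "new"
console.log(classify("layout.html")); // "update"

// Possible wiring with a debounce (assumed, not executed here):
//   let timer;
//   fs.watch(sessionDir, (eventType, filename) => {
//     clearTimeout(timer);
//     timer = setTimeout(() => broadcast(classify(filename)), 100);
//   });
// `broadcast` and the 100 ms delay are placeholders.
```

Keeping the classifier free of timers and filesystem calls makes it easy to unit test separately from fs.watch.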

We wrote 56 tests before writing the server — 31 unit tests for the WebSocket protocol layer (handshake computation, frame encoding at every size boundary, masking arithmetic, multi-frame buffers) and 25 integration tests for the full server (HTTP serving, WebSocket event relay, file watching, template wrapping, helper injection). The tests run in under 8 seconds with zero dependencies.
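The protocol pieces those unit tests exercise (handshake computation, frame encoding at the size boundaries, masking arithmetic) can be sketched in a few functions. This is an illustration of RFC 6455, not the code from server.js; the function names are hypothetical. The sample key and accept value, and the masked "Hello" bytes, are the examples given in the RFC itself.

```javascript
const { createHash } = require("crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Sec-WebSocket-Accept = base64(SHA-1(client key + GUID)).
function computeAcceptKey(clientKey) {
  return createHash("sha1").update(clientKey + WS_GUID).digest("base64");
}

// Encode an unmasked server-to-client frame with the FIN bit set.
// The payload length field has three forms: 7-bit, 16-bit, and 64-bit.
function encodeFrame(opcode, payload) {
  const len = payload.length;
  let header;
  if (len < 126) {
    header = Buffer.from([0x80 | opcode, len]);
  } else if (len < 65536) {
    header = Buffer.alloc(4);
    header[0] = 0x80 | opcode;
    header[1] = 126;
    header.writeUInt16BE(len, 2);
  } else {
    header = Buffer.alloc(10);
    header[0] = 0x80 | opcode;
    header[1] = 127;
    header.writeBigUInt64BE(BigInt(len), 2);
  }
  return Buffer.concat([header, payload]);
}

// Client frames are masked: byte i is XORed with maskKey[i % 4].
// Applying the same operation twice restores the original bytes.
function unmask(payload, maskKey) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskKey[i % 4];
  }
  return out;
}

console.log(computeAcceptKey("dGhlIHNhbXBsZSBub25jZQ=="));
// "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
console.log(encodeFrame(0x1, Buffer.alloc(125)).length); // 127 (2-byte header)
console.log(encodeFrame(0x1, Buffer.alloc(126)).length); // 130 (4-byte header)
```

The size boundaries worth testing are 125/126 and 65535/65536 bytes, where the header grows from 2 to 4 to 10 bytes.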

The result: 731 files changed, 1,700 insertions, 85,000 deletions. The entire node_modules/ directory is gone.

Servers That Clean Up After Themselves

The brainstorm server runs in the background while you work. When you're done — when you close Claude Code, or Codex finishes, or you walk away — the server should stop. Previously, it didn't. Orphaned node server.js processes accumulated until someone noticed them in ps aux.

The server now has two shutdown mechanisms:

Owner process tracking. At startup, start-server.sh captures the PID of the process that launched it — Claude Code, Codex, Gemini CLI, whatever harness is running. The server checks every 60 seconds whether that process is still alive. If it's gone, the server shuts down cleanly.
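A liveness check like this is commonly written in Node by sending signal 0, which tests for a process's existence without delivering anything to it. The sketch below assumes that approach; the source does not say how server.js performs the check, and `isProcessAlive`, `watchOwner`, and their parameters are hypothetical names.

```javascript
// Returns true if a process with this PID exists. Signal 0 runs the
// existence and permission checks without actually sending a signal.
function isProcessAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it.
    return err.code === "EPERM";
  }
}

// Hypothetical wiring: poll the owner and invoke a callback once it is gone.
function watchOwner(ownerPid, onOwnerGone, intervalMs = 60000) {
  const timer = setInterval(() => {
    if (!isProcessAlive(ownerPid)) {
      clearInterval(timer);
      onOwnerGone();
    }
  }, intervalMs);
  // The poll alone should not keep the Node process running.
  timer.unref();
  return timer;
}

console.log(isProcessAlive(process.pid)); // true
```

In a server, `onOwnerGone` would close the listeners and exit. Treating EPERM as alive avoids shutting down when the owner runs under a different user.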

Getting the right PID was trickier than expected. $PPID inside the startup script is the ephemeral shell that the harness spawns to run the script. That shell dies immediately when the script exits. With that PID as the owner, the server was tracking a dead process and shutting down after 60 seconds every time.

The fix resolves the grandparent PID — $PPID's parent — which is the actual harness process:

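# $PPID is the short-lived shell the harness spawned; look up that shell's parent instead.
OWNER_PID="$(ps -o ppid= -p "$PPID" 2>/dev/null | tr -d ' ')"
# Fall back to $PPID if the lookup failed or the shell was already reparented to PID 1.
if [[ -z "$OWNER_PID" || "$OWNER_PID" == "1" ]]; then
  OWNER_PID="$PPID"
fi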

We verified this under Claude Code (where the chain is zsh -> claude -> bash -> start-server.sh -> node) and Codex CLI (where it's tmux -> codex -> start-server.sh -> node). In both cases, the server now correctly tracks the harness and stays alive as long as it's running.

Idle timeout. 30-minute fallback. HTTP requests, WebSocket messages, and file changes all reset the timer. If nothing touches the server for half an hour, it shuts down. This catches cases where the harness exits in a way that the owner PID check misses (SIGKILL, for example, which leaves child processes orphaned).
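The reset-on-activity logic can be sketched as a small tracker that records the time of the last activity and compares it against the timeout. `IdleTracker` and its methods are hypothetical names, and the clock is passed in as a parameter here only to make the behavior easy to check; the real server may simply reset a timer.

```javascript
// Hypothetical sketch: IdleTracker and its method names are illustrative.
class IdleTracker {
  constructor(timeoutMs, now = Date.now()) {
    this.timeoutMs = timeoutMs;
    this.lastActivity = now;
  }

  // Call on every HTTP request, WebSocket message, and file change.
  touch(now = Date.now()) {
    this.lastActivity = now;
  }

  // True once a full timeout has elapsed since the last activity.
  isIdle(now = Date.now()) {
    return now - this.lastActivity >= this.timeoutMs;
  }
}

const MINUTE = 60 * 1000;
const idle = new IdleTracker(30 * MINUTE, 0);
idle.touch(10 * MINUTE);
console.log(idle.isIdle(35 * MINUTE)); // false: 25 minutes since last touch
console.log(idle.isIdle(40 * MINUTE)); // true: 30 minutes since last touch
```

A periodic check, possibly the same one that polls the owner PID, could call `isIdle()` and trigger shutdown when it returns true.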

On shutdown, the server removes .server-info and writes .server-stopped. The visual companion guide now instructs agents to check for .server-info before each file write. If it's missing, restart the server before continuing. This prevents the failure mode where an agent writes HTML files to a directory that nobody is serving.
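The marker-file handshake can be illustrated with two small functions, one for each side. This is a sketch under assumptions: the source names the files .server-info and .server-stopped but does not describe their contents or location, so the JSON payload, the session-directory placement, and the function names below are all hypothetical.

```javascript
const fs = require("fs");
const path = require("path");

// Server side (assumed behavior): drop .server-info, leave .server-stopped.
// The JSON content written here is illustrative, not the real file format.
function markStopped(sessionDir, reason) {
  fs.rmSync(path.join(sessionDir, ".server-info"), { force: true });
  fs.writeFileSync(
    path.join(sessionDir, ".server-stopped"),
    JSON.stringify({ reason }) + "\n"
  );
}

// Agent side: a missing .server-info means nothing is serving this directory,
// so the server should be restarted before writing more HTML.
function serverLooksAlive(sessionDir) {
  return fs.existsSync(path.join(sessionDir, ".server-info"));
}
```

With `{ force: true }`, `fs.rmSync` does not throw if .server-info is already missing, so shutdown stays safe to run more than once.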

Source: github.com/obra/superpowers