mirror of
https://github.com/compiler-explorer/compiler-explorer.git
synced 2026-05-16 12:33:04 -04:00
Driving the staging endpoint with realistic LLM scenarios surfaced
four real UX problems. All four are fixed together because they
interlock: an agent's first call to `list_compilers` should always
return something useful, and the result should be readable.
1. **Hard cap on lean responses (200 items)**. Without a cap, a call
like `list_compilers({language: "c++"})` returns ~88KB / 4500 lines
(every arch × every major × every variant) and is rejected by host
MCP clients with a too-large error. Worse: `list_compilers()` with
no language returns ~459KB / 23,800 lines. Even lean mode (id+name
only) doesn't fit. Truncate at 200 entries with a hint that names
the total and recommends specific narrowing knobs (`match`,
`language`, `instructionSet`, `latestPerMajor`).
2. **Strip ANSI escape codes** from compile output. gcc/clang colour
their diagnostics; the web UI parses them but an MCP/LLM consumer
sees noise like `\x1b[01;31m\x1b[Kerror:`. Reuse the existing
`filterEscapeSequences` helper (now exported from lib/utils.ts) in
`truncateLines`, so it covers stdout, stderr, AND buildResult /
execResult streams in one place.
3. **Add `instructionSet` filter** to `list_compilers`. The original
tester wanted to ask "what's the newest x86-64 GCC" without having
to know the exact `match` substring. Now the cleanest call is
`{language: "c++", instructionSet: "amd64", latestPerMajor: true}`:
a small, structured query that returns the answer the agent
actually wanted.
4. **Schema descriptions** updated: the `match` doc now points at the
new `instructionSet` parameter as the preferred answer to the
"newest X for arch Y" question. Lean-cap hint surfaces all four
refinement knobs explicitly.
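
For illustration, the lean cap and the ANSI stripping described above could look roughly like the sketch below. It is not the code in this change: `LeanCompiler`, `LeanResponse`, `capLeanResponse`, `stripAnsi` and the hint wording are hypothetical names chosen for the example, and `stripAnsi` only stands in for the existing `filterEscapeSequences` helper, whose body is not reproduced here. Only the 200-item cap and the four knob names come from the description above.

```typescript
// Illustrative sketch only: all type and function names here are assumptions,
// not the project's actual implementation.
interface LeanCompiler {
    id: string;
    name: string;
}

interface LeanResponse {
    compilers: LeanCompiler[];
    total: number;
    truncated: boolean;
    hint?: string;
}

// Cap value taken from the commit description.
const LEAN_CAP = 200;

// Truncate an oversized lean listing and tell the caller how to narrow it.
function capLeanResponse(all: LeanCompiler[], cap: number = LEAN_CAP): LeanResponse {
    if (all.length <= cap) {
        // Small enough: return everything, with no hint.
        return {compilers: all, total: all.length, truncated: false};
    }
    return {
        compilers: all.slice(0, cap),
        total: all.length,
        truncated: true,
        hint:
            `Showing ${cap} of ${all.length} compilers. Narrow the query with ` +
            '`match`, `language`, `instructionSet`, or `latestPerMajor`.',
    };
}

// Matches CSI sequences such as ESC[01;31m (colour) and ESC[K (erase line).
const CSI_SEQUENCE = /\x1b\[[0-9;]*[A-Za-z]/g;

// Hypothetical stand-in for `filterEscapeSequences`; handles CSI sequences only.
function stripAnsi(text: string): string {
    return text.replace(CSI_SEQUENCE, '');
}
```

An agent that receives `truncated: true` can read the total from the hint and retry with a narrower query, rather than hitting the host client's too-large error.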
97/97 MCP + release-track tests pass; 6 new tests cover the lean cap
(auto-degrade truncation, forceLean truncation, no-cap-when-fits),
ANSI stripping, instructionSet filter, and the canonical
"instructionSet + latestPerMajor" combination.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>