2 Commits

Author SHA1 Message Date
Alan Tse
33fe84532e Add save_trace MCP tool for snapshotting live or loaded captures.
- save_worker binding: wraps Worker::Write under
  Worker::ObtainLockForMainThread() so live instances yield their
  receive thread cooperatively for the save's duration — the same
  pattern View::Save uses in the GUI.
- save_trace MCP tool: defaults to async_mode=True for multi-GB
  traces; reuses the existing Task/executor machinery so callers
  poll via the task tool. Path resolution mirrors load_capture.
- manual/tracy.tex: add save_trace bullet to the MCP tool list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:23:31 -07:00
Alan Tse
9a7233ced5 Add MCP server for AI-assisted trace analysis (#1347)
* Add MCP server for AI-assisted trace analysis.

Introduce an optional Model Context Protocol (MCP) server that lets AI
assistants analyze Tracy captures and live sessions through Tracy's own
server engine. The server runs as a Python sidecar and talks to the
existing C++ analysis code through new pybind11 bindings.

- python/bindings/ServerModule.cpp: TracyServerBindings module exposing
  Worker, file I/O, zones, GPU zones, frame data, plots, messages, locks,
  source locations, and summary statistics (zone/GPU child stats, frame
  timing, etc.).
- python/CMakeLists.txt: builds and installs TracyServerBindings alongside
  TracyClientBindings.
- extra/mcp/tracy_mcp.py: FastMCP SSE singleton with dynamic port
  discovery, PID-file based singleton detection, session-isolated worker
  instances, synchronous and background eval, task polling, and a
  shutdown tool to release the .pyd lock during development.
- extra/mcp/start_mcp.sh, .gitignore: launcher with local override hook;
  ignores generated port/pid files.
- manual/tracy.md: documents building, running, and integrating the
  server with an AI assistant.
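
As a rough illustration of the dynamic port discovery mentioned above, the sketch below asks the OS for a free port and publishes it through a small file that a client can read back. The function names and the port-file format are assumptions made for this example and are not taken from tracy_mcp.py.

```python
import socket
from pathlib import Path
from typing import Optional


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0.

    The port is released when the socket closes, so another process could
    grab it before the server rebinds; a real server would keep the
    socket open or retry on failure.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def publish_port(port_file: Path, port: int) -> None:
    """Record the chosen port so later clients can find the server."""
    port_file.write_text(f"{port}\n", encoding="ascii")


def discover_port(port_file: Path) -> Optional[int]:
    """Return the published port, or None if the file is missing or malformed."""
    try:
        return int(port_file.read_text(encoding="ascii").strip())
    except (FileNotFoundError, ValueError):
        return None
```

A launcher could call `discover_port` first and start a new server only when it returns `None`. Checking that the recorded process is still alive is a separate step not shown here.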

* Improve Tracy MCP cold-start guidance.

Cold-start usability testing showed an LLM agent burned ~7 exploratory
calls discovering the ctx object model, time-unit conventions, and join
keys before producing useful analysis. Surface that information up front
through MCP resources and entry-point tool guidance.

- extra/mcp/eval_guide.md: new bindings-layer reference covering the
  Worker object graph (zone / GPU zone / frame / thread / message /
  plot / lock / memory entry points), nanosecond time units, ZoneStats
  field semantics including self-time via get_child_zone_stats, the
  opaque 'name (addr)[arch] <srcloc_id>' key format, and worked
  examples translating common queries into ctx Python.
- extra/mcp/tracy_mcp.py: expose system.prompt.md and eval_guide.md as
  MCP resources (tracy://prompt and tracy://eval-guide) so external
  agents and Tracy Assist share the same guidance source. Resource
  content is re-read per request — edits propagate without a server
  restart.
- Point the load_capture and live_connect return values, and the eval
  tool description, at these resources so the agent reads them before
  its first eval rather than introspecting blind.
- Expand load_capture docstring: name the path parameter explicitly,
  show Windows path syntax, and direct agents to list_captures plus
  TRACY_CAPTURES_DIR for capture discovery.
- Probe is_connected() briefly after Worker construction in
  live_connect and surface an actionable error on silent handshake
  failures (typically a Tracy client/server version mismatch or
  TRACY_ON_DEMAND) instead of returning misleading success.

Reduces a fresh agent's cold-start overhead from ~7 exploratory calls
to 4; the remaining 4 are unavoidable harness/schema-fetch overhead
rather than API-design friction.
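
The is_connected() probe described in the list above could look roughly like this. It is a sketch under assumptions: `wait_until`, `probe_handshake`, and `FakeWorker` are hypothetical names, `FakeWorker` only imitates the bound Worker for demonstration, and the 2-second default is an illustrative choice.

```python
import time
from typing import Callable, Optional


def wait_until(predicate: Callable[[], bool], timeout_s: float,
               interval_s: float = 0.05) -> bool:
    """Poll predicate until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


class FakeWorker:
    """Hypothetical stand-in for the bound Worker; connects after N polls."""

    def __init__(self, connects_after: Optional[int]):
        self._connects_after = connects_after
        self._polls = 0

    def is_connected(self) -> bool:
        self._polls += 1
        return (self._connects_after is not None
                and self._polls >= self._connects_after)


def probe_handshake(worker, timeout_s: float = 2.0) -> dict:
    """Return success only once the worker reports a live connection."""
    if wait_until(worker.is_connected, timeout_s):
        return {"ok": True}
    return {
        "ok": False,
        "error": (f"No handshake within {timeout_s:.1f} s; check for a "
                  "client/server protocol version mismatch or TRACY_ON_DEMAND."),
    }
```

With this shape, `probe_handshake(FakeWorker(3))` succeeds on the third poll, while `probe_handshake(FakeWorker(None), timeout_s=0.1)` returns the error dictionary instead of a misleading success.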

* Detect Tracy protocol mismatches via UDP broadcast pre-flight.

Tracy clients announce themselves on UDP port 8086 every ~3 seconds with
a BroadcastMessage carrying the protocol version, listen port, and
program name (public/common/TracyProtocol.hpp). The Tracy GUI reads this
and refuses to attempt a TCP connection on protocol mismatch, surfacing
a precise error. live_connect previously had no equivalent check, so a
mismatch produced an opaque 2-second handshake timeout with no
diagnostic about what was wrong.

- Add a broadcast parser handling versions 0-3, with variable-length
  programName (Tracy sends only the actual name + null terminator on
  the wire, not the full 64-byte buffer).
- Add a non-blocking UDP listener that binds 8086 with SO_REUSEADDR
  and waits up to 3.5s, long enough to span at least one beat of the
  ~3s broadcast cadence.
- Read our bindings' ProtocolVersion at startup by parsing
  TracyProtocol.hpp, so the comparison stays in sync with the build
  without new C++ wiring.
- live_connect runs the broadcast pre-flight before constructing
  Worker. On a matched listen_port with a differing protocol_version,
  it returns a single-line error naming the program, both versions,
  and the remediation, without ever opening a TCP connection. If no
  matching broadcast arrives, it falls through to the existing
  handshake probe, which now reports any other broadcasts seen as a
  hint (helpful when the target uses a non-default port).
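
The pre-flight steps above can be sketched as follows. This is a minimal illustration, not the code in tracy_mcp.py. It assumes that a version-3 BroadcastMessage is laid out little-endian as uint16 broadcastVersion, uint16 listenPort, uint32 protocolVersion, uint64 pid, int32 activeTime, followed by a NUL-terminated program name, and that the header defines the version as `ProtocolVersion = <number>`. Both should be checked against public/common/TracyProtocol.hpp. Versions 0-2 are not handled here, and all function names are hypothetical.

```python
import re
import socket
import struct
import time
from typing import List, NamedTuple, Optional

# Assumed version-3 wire layout; verify against TracyProtocol.hpp.
_HEADER = struct.Struct("<HHIQi")


class Announcement(NamedTuple):
    listen_port: int
    protocol_version: int
    program_name: str


def read_protocol_version(header_text: str) -> Optional[int]:
    """Extract ProtocolVersion from header source text (assumed form)."""
    match = re.search(r"\bProtocolVersion\s*=\s*(\d+)", header_text)
    return int(match.group(1)) if match else None


def parse_broadcast(data: bytes) -> Optional[Announcement]:
    """Parse one datagram; return None for short or non-v3 packets."""
    if len(data) <= _HEADER.size:
        return None
    bcast_ver, port, proto_ver, _pid, _active = _HEADER.unpack_from(data)
    if bcast_ver != 3:
        return None
    # Only the name plus its NUL terminator is on the wire.
    name = data[_HEADER.size:].split(b"\0", 1)[0].decode("utf-8", "replace")
    return Announcement(port, proto_ver, name)


def listen_for_broadcasts(port: int = 8086,
                          timeout_s: float = 3.5) -> List[Announcement]:
    """Collect every parsable announcement seen before the deadline."""
    seen: List[Announcement] = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        deadline = time.monotonic() + timeout_s
        while (remaining := deadline - time.monotonic()) > 0:
            sock.settimeout(remaining)
            try:
                data, _addr = sock.recvfrom(2048)
            except socket.timeout:
                break
            parsed = parse_broadcast(data)
            if parsed is not None:
                seen.append(parsed)
    return seen


def check_mismatch(seen: List[Announcement], target_port: int,
                   our_version: int) -> Optional[str]:
    """Return a one-line error if the target announces another protocol."""
    for ann in seen:
        if ann.listen_port == target_port and ann.protocol_version != our_version:
            return (f"{ann.program_name} speaks protocol {ann.protocol_version} "
                    f"but these bindings speak {our_version}; rebuild one side "
                    "so the versions match.")
    return None
```

A caller would run `check_mismatch(listen_for_broadcasts(), target_port, our_version)` before constructing the Worker and skip the TCP attempt whenever a message comes back.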

* Add MCP Server section to LaTeX manual.

The markdown manual is auto-generated from the LaTeX source; add the
corresponding \subsection{MCP Server} so the two stay in sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Remove hand-written MCP section from tracy.md.

tracy.md is generated from tracy.tex via latex2md.sh. The MCP section
was previously written by hand directly in the markdown; now that the
LaTeX source has been updated, the markdown section should be
regenerated by running latex2md.sh rather than maintained manually.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-01 16:17:55 +02:00