diff --git a/sequence-diagram-skill/.gitignore b/sequence-diagram-skill/.gitignore new file mode 100644 index 0000000..7b40247 --- /dev/null +++ b/sequence-diagram-skill/.gitignore @@ -0,0 +1,10 @@ +# autoresearch session +autoresearch.jsonl +autoresearch.ideas.md + +# temp +.tmp_* +*.tmp + +# OS +.DS_Store diff --git a/sequence-diagram-skill/README.md b/sequence-diagram-skill/README.md new file mode 100644 index 0000000..c78b201 --- /dev/null +++ b/sequence-diagram-skill/README.md @@ -0,0 +1,80 @@ +# Sequence Diagram Skill — Autoresearch + +Optimizes a pi skill for generating Mermaid sequence diagrams from +Elixir/Phoenix codebases, using [pi-autoresearch](https://github.com/davebcn87/pi-autoresearch). + +## The Problem + +Small local models (Qwen3.5-35B-A3B) produce great sequence diagrams for +well-represented languages (C#, Java) but go off the rails with Elixir/Phoenix — +sidetracking into imaginary code reviews instead of finishing the diagram. + +## How It Works + +The autoresearch loop mutates `skill/SKILL.md`, runs it against 3 scenarios +from a real Phoenix codebase (Firehose), and scores with **zero-judge-model +bash evals**: + +| Eval | Check | Tool | +|------|-------|------| +| has_diagram | Output has `` ```mermaid `` + `sequenceDiagram` | grep | +| diagram_parseable | Valid mermaid syntax (participants + messages) | grep / mmdc | +| uses_real_modules | ≥2 actual module names from codebase | grep | +| uses_real_functions | ≥1 actual function name | grep | +| no_sidetracking | No review/critique language | grep against blocklist | +| concise | Under 3000 chars | wc | + +3 tasks × 6 evals = 18 max score. + +## Setup + +1. Clone the Firehose repo into `workspace/`: + ```bash + git clone https://gitea.apps.sustainabledelivery.com/mostalive/firehose workspace + ``` + +2. Make scripts executable: + ```bash + chmod +x autoresearch.sh autoresearch.checks.sh scripts/*.sh + ``` + +3. 
Configure model access in `scripts/config.env`: + - Local: leave `SSH_TARGET` empty, have pi configured with your model + - Remote: set `SSH_TARGET=analyst@your-host` and `SSH_PORT=2222` + +4. Init git and start: + ```bash + git init && git add -A && git commit -m "initial" + pi + # then: /autoresearch + ``` + +## Project Structure + +``` +sequence-diagram-skill/ +├── autoresearch.md # Session doc (pi reads this) +├── autoresearch.sh # Benchmark runner +├── autoresearch.checks.sh # Sanity checks on SKILL.md +├── skill/ +│ └── SKILL.md # THE FILE BEING OPTIMIZED +├── benchmark/ +│ └── tasks.jsonl # 3 test scenarios +├── scripts/ +│ ├── config.env # Endpoint config +│ ├── run_one.sh # Run pi with skill + single task +│ ├── score.sh # Score a single output (6 binary evals) +│ └── sidetrack_blocklist.txt # Phrases that indicate off-task behavior +└── workspace/ # Clone of Firehose repo (mounted/symlinked) +``` + +## Mutation Ideas for the Agent + +The autoresearch agent only edits `skill/SKILL.md`. Good mutations include: + +- Stronger "do not review" constraints +- Explicit Elixir/Phoenix vocabulary hints (NimblePublisher, module attributes) +- Output format enforcement (ONLY the mermaid block, nothing else) +- Step-by-step process instructions (read router first, then controller, etc.) +- Short generic example of a good sequence diagram +- Negative examples ("do NOT include suggestions or improvements") diff --git a/sequence-diagram-skill/autoresearch.checks.sh b/sequence-diagram-skill/autoresearch.checks.sh new file mode 100755 index 0000000..ec3e2ba --- /dev/null +++ b/sequence-diagram-skill/autoresearch.checks.sh @@ -0,0 +1,57 @@ +#!/usr/bin/env bash +set -euo pipefail + +# ─── autoresearch.checks.sh ───────────────────────────────────────────────── +# Backpressure checks for the sequence diagram skill. 
+# ───────────────────────────────────────────────────────────────────────────── + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +SKILL_FILE="${SCRIPT_DIR}/skill/SKILL.md" +ERRORS=0 + +# 1. Skill exists and is non-empty +if [[ ! -s "$SKILL_FILE" ]]; then + echo "FAIL: skill/SKILL.md is missing or empty" + ERRORS=$((ERRORS + 1)) +fi + +# 2. Skill is not trivially short +CHAR_COUNT=$(wc -c < "$SKILL_FILE" 2>/dev/null || echo "0") +if (( CHAR_COUNT < 200 )); then + echo "FAIL: skill/SKILL.md is only ${CHAR_COUNT} chars (min: 200)" + ERRORS=$((ERRORS + 1)) +fi + +# 3. Skill is not too long (rough token proxy: 1500 tokens ≈ 6000 chars) +if (( CHAR_COUNT > 6000 )); then + echo "FAIL: skill/SKILL.md is ${CHAR_COUNT} chars (max: ~6000)" + ERRORS=$((ERRORS + 1)) +fi + +# 4. Skill must contain "sequenceDiagram" or "sequence diagram" (it's a diagram skill) +if ! grep -qi 'sequence.diagram' "$SKILL_FILE" 2>/dev/null; then + echo "FAIL: skill/SKILL.md doesn't mention sequence diagrams" + ERRORS=$((ERRORS + 1)) +fi + +# 5. Skill must NOT contain Firehose-specific code (no overfitting) +for term in "BlogController" "EngineeringBlog" "Firehose" "blogex" "priv/blog"; do + if grep -q "$term" "$SKILL_FILE" 2>/dev/null; then + echo "FAIL: skill/SKILL.md contains codebase-specific term '${term}'" + ERRORS=$((ERRORS + 1)) + fi +done + +# 6. Valid UTF-8 +if ! iconv -f utf-8 -t utf-8 "$SKILL_FILE" > /dev/null 2>&1; then + echo "FAIL: skill/SKILL.md contains invalid UTF-8" + ERRORS=$((ERRORS + 1)) +fi + +if (( ERRORS > 0 )); then + echo "Checks FAILED with ${ERRORS} error(s)" + exit 1 +else + echo "All checks passed. Skill: ${CHAR_COUNT} chars." 
+ exit 0 +fi diff --git a/sequence-diagram-skill/autoresearch.md b/sequence-diagram-skill/autoresearch.md new file mode 100644 index 0000000..e60214f --- /dev/null +++ b/sequence-diagram-skill/autoresearch.md @@ -0,0 +1,96 @@ +# Autoresearch: Sequence Diagram Skill for Elixir/Phoenix + +## Objective + +Optimize a pi skill (`skill/SKILL.md`) that generates Mermaid sequence diagrams +from Elixir/Phoenix codebases. The skill is used with a local Qwen3.5-35B-A3B +model running on CPU. The primary failure mode is **sidetracking** — the model +abandons the diagram task and starts reviewing/critiquing the code instead. + +## Primary Metric + +**score** — higher is better (0–18 scale, sum of 6 binary evals × 3 test inputs). + +## Secondary Metrics + +- **sidetrack_count** — number of test runs containing review/critique language (lower is better) +- **parse_count** — number of outputs that contain a parseable sequenceDiagram (higher is better) + +## Architecture + +Pi runs the skill against the Firehose codebase (mounted in the workspace) using +the target model. Scoring is done by bash scripts — no judge model needed. + +## The Codebase Under Test + +**Firehose** — a Phoenix blogging platform with a monorepo structure: + +- `app/` — Phoenix web app (OTP app: `:firehose`) + - `lib/firehose_web/router.ex` — routes + - `lib/firehose_web/controllers/blog_controller.ex` — blog actions + - `lib/firehose_web/controllers/page_controller.ex` — homepage + - `lib/firehose/blogs/` — blog context modules (EngineeringBlog, ReleaseNotes) +- `blogex/` — sibling library for compile-time blog engine + - `lib/blogex/blog.ex` — `use Blogex.Blog` macro (NimblePublisher) + - `lib/blogex/components.ex` — Phoenix function components (post_meta, tag_list, etc.) + - `lib/blogex/router.ex` — API/feed routes + +**Key architectural fact:** Blogex uses NimblePublisher. All blog posts are compiled +into BEAM module attributes at build time. There is NO runtime file I/O for reading +posts. 
Functions like `all_posts/0`, `get_post!/1`, `posts_by_tag/1` read from +`@posts` module attributes. This is the #1 thing models get wrong. + +## Test Inputs (3 scenarios) + +### 1. Click tag on post (small) +"Generate a sequence diagram for: a user on a blog post page clicks a tag link +(e.g., 'elixir'). Trace the full request from browser through to rendered response." + +### 2. Show homepage (small) +"Generate a sequence diagram for: a user visits the homepage (GET /). +Trace from browser through to rendered HTML." + +### 3. Add blog post on disk (larger, crosses compile/runtime boundary) +"Generate a sequence diagram for: a developer creates a new markdown file in +priv/blog/engineering/. Trace what happens from file creation through to the +post being visible on the blog. Include the compile-time and runtime phases." + +## Eval Criteria (6 binary checks) + +1. **has_diagram** — output contains `` ```mermaid `` and `sequenceDiagram` +2. **diagram_parseable** — the mermaid block is syntactically valid +3. **uses_real_modules** — output mentions at least 2 of: BlogController, EngineeringBlog, ReleaseNotes, Blogex, Router, PageController, Layouts +4. **uses_real_functions** — output mentions at least 1 of: posts_by_tag, get_post!, all_posts, paginate, resolve_blog, render, recent_posts +5. **no_sidetracking** — output does NOT contain code review language (see blocklist) +6. **concise** — total output is under 3000 characters + +## Files in Scope + +| File | Agent may edit? | +|------|-----------------| +| `skill/SKILL.md` | ✅ YES — the only file the agent modifies | +| `benchmark/tasks.jsonl` | ❌ NO | +| `scripts/score.sh` | ❌ NO | +| `scripts/run_one.sh` | ❌ NO | +| `scripts/sidetrack_blocklist.txt` | ❌ NO | +| `autoresearch.sh` | ❌ NO | +| `autoresearch.checks.sh` | ❌ NO | + +## Constraints + +- SKILL.md must stay under 1500 tokens. +- SKILL.md must NOT contain any code from the Firehose codebase (no overfitting). 
+- SKILL.md must remain generic — it should work for any Elixir/Phoenix codebase, + not just Firehose. + +## What Has Been Tried + +(autoresearch fills this in) + +## Dead Ends + +(autoresearch fills this in) + +## Key Wins + +(autoresearch fills this in) diff --git a/sequence-diagram-skill/autoresearch.sh b/sequence-diagram-skill/autoresearch.sh new file mode 100755 index 0000000..d5b7846 --- /dev/null +++ b/sequence-diagram-skill/autoresearch.sh @@ -0,0 +1,101 @@ +#!/usr/bin/env bash +set -euo pipefail + +# ─── autoresearch.sh ───────────────────────────────────────────────────────── +# Benchmark script for sequence diagram skill optimization. +# Runs all 3 test inputs, scores each, outputs METRIC lines. +# ───────────────────────────────────────────────────────────────────────────── + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +source "${SCRIPT_DIR}/scripts/config.env" 2>/dev/null || true + +# Defaults +SSH_TARGET="${SSH_TARGET:-}" +SSH_PORT="${SSH_PORT:-2222}" +export TASK_TIMEOUT="${TASK_TIMEOUT:-180}" + +# ─── Pre-checks ────────────────────────────────────────────────────────────── + +SKILL_FILE="${SCRIPT_DIR}/skill/SKILL.md" +if [[ ! -s "$SKILL_FILE" ]]; then + echo "ERROR: skill/SKILL.md is missing or empty" + exit 1 +fi + +SKILL_CHARS=$(wc -c < "$SKILL_FILE") +echo "Skill: ${SKILL_CHARS} chars" + +TASKS_FILE="${SCRIPT_DIR}/benchmark/tasks.jsonl" +if [[ ! 
-f "$TASKS_FILE" ]]; then + echo "ERROR: benchmark/tasks.jsonl not found" + exit 1 +fi + +echo "────────────────────────────────────────────────────" + +# ─── Run all tasks ─────────────────────────────────────────────────────────── + +TMPDIR=$(mktemp -d) +TOTAL_SCORE=0 +SIDETRACK_COUNT=0 +PARSE_COUNT=0 +TASK_COUNT=0 + +START_TIME=$(date +%s) + +while IFS= read -r line; do + TASK_ID=$(echo "$line" | jq -r '.id') + TASK_PROMPT=$(echo "$line" | jq -r '.prompt') + TASK_COUNT=$((TASK_COUNT + 1)) + + OUTPUT_FILE="${TMPDIR}/${TASK_ID}.txt" + SCORE_FILE="${TMPDIR}/${TASK_ID}.json" + + echo " [${TASK_COUNT}/3] ${TASK_ID}..." + + # Run the task + bash "${SCRIPT_DIR}/scripts/run_one.sh" \ + "$TASK_PROMPT" \ + "$OUTPUT_FILE" \ + "$SSH_TARGET" \ + "$SSH_PORT" + + # Score it + SCORE_JSON=$(bash "${SCRIPT_DIR}/scripts/score.sh" "$OUTPUT_FILE") + echo "$SCORE_JSON" > "$SCORE_FILE" + + # Extract scores + TASK_SCORE=$(echo "$SCORE_JSON" | jq -r '.score') + TASK_SIDETRACK=$(echo "$SCORE_JSON" | jq -r '.no_sidetracking') + TASK_PARSE=$(echo "$SCORE_JSON" | jq -r '.diagram_parseable') + TASK_CHARS=$(echo "$SCORE_JSON" | jq -r '.char_count') + + TOTAL_SCORE=$((TOTAL_SCORE + TASK_SCORE)) + + if (( TASK_SIDETRACK == 0 )); then + SIDETRACK_COUNT=$((SIDETRACK_COUNT + 1)) + fi + + if (( TASK_PARSE == 1 )); then + PARSE_COUNT=$((PARSE_COUNT + 1)) + fi + + echo " score=${TASK_SCORE}/6 sidetrack=$(( 1 - TASK_SIDETRACK )) parseable=${TASK_PARSE} chars=${TASK_CHARS}" + +done < "$TASKS_FILE" + +END_TIME=$(date +%s) +TOTAL_SECONDS=$((END_TIME - START_TIME)) + +# ─── Cleanup ───────────────────────────────────────────────────────────────── + +rm -rf "$TMPDIR" + +# ─── Output METRIC lines ──────────────────────────────────────────────────── + +echo "" +echo "METRIC score=${TOTAL_SCORE}" +echo "METRIC sidetrack_count=${SIDETRACK_COUNT}" +echo "METRIC parse_count=${PARSE_COUNT}" +echo "METRIC total_seconds=${TOTAL_SECONDS}" +echo "METRIC skill_chars=${SKILL_CHARS}" diff --git 
a/sequence-diagram-skill/benchmark/tasks.jsonl b/sequence-diagram-skill/benchmark/tasks.jsonl new file mode 100644 index 0000000..abfd16c --- /dev/null +++ b/sequence-diagram-skill/benchmark/tasks.jsonl @@ -0,0 +1,3 @@ +{"id": "click-tag", "prompt": "Generate a sequence diagram for: a user on a blog post page clicks a tag link (e.g., 'elixir'). Trace the full HTTP request from browser through the Phoenix router, controller, domain modules, templates, and back to the browser. The codebase is in /home/analyst/workspace/. Read the relevant source files first."} +{"id": "show-homepage", "prompt": "Generate a sequence diagram for: a user visits the homepage (GET /). Trace from the browser's HTTP request through the Phoenix router, controller, template rendering, layout wrapping, and back to the browser. The codebase is in /home/analyst/workspace/. Read the relevant source files first."} +{"id": "add-post", "prompt": "Generate a sequence diagram for: a developer creates a new markdown file in priv/blog/engineering/ and the post becomes visible on the blog. Trace what happens including the compile-time phase (NimblePublisher, module recompilation) and the runtime request phase. The codebase is in /home/analyst/workspace/. Read the relevant source files first."} diff --git a/sequence-diagram-skill/scripts/config.env b/sequence-diagram-skill/scripts/config.env new file mode 100644 index 0000000..6b60d19 --- /dev/null +++ b/sequence-diagram-skill/scripts/config.env @@ -0,0 +1,10 @@ +# ─── config.env ────────────────────────────────────────────────────────────── +# Leave SSH_TARGET empty to run pi locally (e.g., on your Mac). +# Set it to use the remote pi container. 
+ +# Remote pi container (leave empty for local) +SSH_TARGET="" +SSH_PORT=2222 + +# Timeout per task (seconds) +TASK_TIMEOUT=180 diff --git a/sequence-diagram-skill/scripts/run_one.sh b/sequence-diagram-skill/scripts/run_one.sh new file mode 100755 index 0000000..b34b95c --- /dev/null +++ b/sequence-diagram-skill/scripts/run_one.sh @@ -0,0 +1,58 @@ +#!/usr/bin/env bash +set -euo pipefail + +# ─── run_one.sh ────────────────────────────────────────────────────────────── +# Run pi with the sequence-diagram skill on a single task. +# Usage: ./scripts/run_one.sh <task_prompt> <output_file> [ssh_target] [ssh_port] +# +# If ssh_target is provided, runs remotely via SSH into the pi container. +# Otherwise runs pi locally. +# ───────────────────────────────────────────────────────────────────────────── + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +PROJECT_DIR="$(dirname "$SCRIPT_DIR")" + +TASK_PROMPT="$1" +OUTPUT_FILE="$2" +SSH_TARGET="${3:-}" +SSH_PORT="${4:-2222}" +TIMEOUT="${TASK_TIMEOUT:-180}" + +SKILL_FILE="${PROJECT_DIR}/skill/SKILL.md" + +if [[ ! 
-f "$SKILL_FILE" ]]; then + echo "ERROR: skill/SKILL.md not found" >&2 + exit 1 +fi + +SKILL_CONTENT=$(cat "$SKILL_FILE") + +# Build the full prompt: skill instructions + task +FULL_PROMPT="## Skill Instructions + +${SKILL_CONTENT} + +## Task + +${TASK_PROMPT}" + +if [[ -n "$SSH_TARGET" ]]; then + # ─── Remote: SSH into pi container ─────────────────────────────────── + PAYLOAD=$(jq -n --arg prompt "$FULL_PROMPT" '{"prompt": $prompt}') + + ssh -p "$SSH_PORT" \ + -o StrictHostKeyChecking=no \ + -o ConnectTimeout=10 \ + -o BatchMode=yes \ + "$SSH_TARGET" \ + "run-task --stdin --mode print --thinking off --timeout $TIMEOUT" \ + <<< "$PAYLOAD" > "$OUTPUT_FILE" 2>/dev/null +else + # ─── Local: run pi directly ────────────────────────────────────────── + timeout "${TIMEOUT}s" pi \ + --mode print \ + --no-session \ + --no-extensions \ + --thinking none \ + -p "$FULL_PROMPT" > "$OUTPUT_FILE" 2>/dev/null || true +fi diff --git a/sequence-diagram-skill/scripts/score.sh b/sequence-diagram-skill/scripts/score.sh new file mode 100755 index 0000000..1fc2417 --- /dev/null +++ b/sequence-diagram-skill/scripts/score.sh @@ -0,0 +1,109 @@ +#!/usr/bin/env bash +set -euo pipefail + +# ─── score.sh ──────────────────────────────────────────────────────────────── +# Score a single diagram output against 6 binary evals. +# Usage: ./scripts/score.sh <output_file> +# Prints a JSON line with pass/fail for each eval and total score. +# ───────────────────────────────────────────────────────────────────────────── + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +OUTPUT_FILE="$1" + +if [[ ! 
-f "$OUTPUT_FILE" ]]; then + echo '{"error": "file not found", "score": 0}' + exit 0 +fi + +CONTENT=$(cat "$OUTPUT_FILE") +CHAR_COUNT=${#CONTENT} + +# ─── Eval 1: has_diagram ───────────────────────────────────────────────────── +# Output contains a mermaid fenced block with sequenceDiagram +has_diagram=0 +if echo "$CONTENT" | grep -q '```mermaid' && echo "$CONTENT" | grep -q 'sequenceDiagram'; then + has_diagram=1 +fi + +# ─── Eval 2: diagram_parseable ─────────────────────────────────────────────── +# Extract the mermaid block and check basic syntax +diagram_parseable=0 +if (( has_diagram == 1 )); then + # Extract mermaid block + MERMAID_BLOCK=$(echo "$CONTENT" | sed -n '/```mermaid/,/```/p' | sed '1d;$d') + + if [[ -n "$MERMAID_BLOCK" ]]; then + # Basic syntax checks: + # - Has "sequenceDiagram" keyword + # - Has at least one "participant" line + # - Has at least one "->>", "-->>", or "->" message line + has_keyword=$(echo "$MERMAID_BLOCK" | grep -c 'sequenceDiagram' || true) + has_participant=$(echo "$MERMAID_BLOCK" | grep -c 'participant' || true) + has_message=$(echo "$MERMAID_BLOCK" | grep -cE '\->>|-->>|\->' || true) + + if (( has_keyword > 0 && has_participant > 0 && has_message > 0 )); then + diagram_parseable=1 + fi + fi + + # If mmdc (mermaid CLI) is available, use it for real validation + if command -v mmdc &> /dev/null && (( diagram_parseable == 1 )); then + TMPFILE=$(mktemp /tmp/mermaid_XXXXXX.mmd) + echo "$MERMAID_BLOCK" > "$TMPFILE" + if mmdc -i "$TMPFILE" -o "${TMPFILE}.svg" > /dev/null 2>&1; then + diagram_parseable=1 + else + diagram_parseable=0 + fi + rm -f "$TMPFILE" "${TMPFILE}.svg" + fi +fi + +# ─── Eval 3: uses_real_modules ─────────────────────────────────────────────── +# Diagram mentions at least 2 real modules from the Firehose codebase +uses_real_modules=0 +module_count=0 +for module in BlogController EngineeringBlog ReleaseNotes Blogex Router PageController Layouts; do + if echo "$CONTENT" | grep -qi "$module"; then + module_count=$((module_count + 1)) + fi 
+done +if (( module_count >= 2 )); then + uses_real_modules=1 +fi + +# ─── Eval 4: uses_real_functions ───────────────────────────────────────────── +# Diagram mentions at least 1 real function from the codebase +uses_real_functions=0 +for func in posts_by_tag get_post all_posts paginate resolve_blog render recent_posts; do + if echo "$CONTENT" | grep -qi "$func"; then + uses_real_functions=1 + break + fi +done + +# ─── Eval 5: no_sidetracking ──────────────────────────────────────────────── +# Output does NOT contain code review / critique language +no_sidetracking=1 +BLOCKLIST="${SCRIPT_DIR}/sidetrack_blocklist.txt" +if [[ -f "$BLOCKLIST" ]]; then + while IFS= read -r phrase; do + phrase=$(echo "$phrase" | xargs) # trim whitespace + if [[ -n "$phrase" ]] && echo "$CONTENT" | grep -qi "$phrase"; then + no_sidetracking=0 + break + fi + done < "$BLOCKLIST" +fi + +# ─── Eval 6: concise ──────────────────────────────────────────────────────── +# Total output under 3000 characters +concise=0 +if (( CHAR_COUNT < 3000 )); then + concise=1 +fi + +# ─── Total ─────────────────────────────────────────────────────────────────── +score=$((has_diagram + diagram_parseable + uses_real_modules + uses_real_functions + no_sidetracking + concise)) + +echo "{\"score\":${score},\"has_diagram\":${has_diagram},\"diagram_parseable\":${diagram_parseable},\"uses_real_modules\":${uses_real_modules},\"uses_real_functions\":${uses_real_functions},\"no_sidetracking\":${no_sidetracking},\"concise\":${concise},\"char_count\":${CHAR_COUNT}}" diff --git a/sequence-diagram-skill/scripts/sidetrack_blocklist.txt b/sequence-diagram-skill/scripts/sidetrack_blocklist.txt new file mode 100644 index 0000000..58b233c --- /dev/null +++ b/sequence-diagram-skill/scripts/sidetrack_blocklist.txt @@ -0,0 +1,23 @@ +potential issue +consider using +should be +could be improved +recommend +suggestion +improvement +code review +refactor +best practice +security concern +vulnerability +error handling could +missing 
error +you might want +it would be better +note that this +be aware that +one concern +problematic +anti-pattern +smell +technical debt diff --git a/sequence-diagram-skill/skill/SKILL.md b/sequence-diagram-skill/skill/SKILL.md new file mode 100644 index 0000000..39f9962 --- /dev/null +++ b/sequence-diagram-skill/skill/SKILL.md @@ -0,0 +1,54 @@ +--- +name: sequence-diagram +description: Generate a Mermaid sequence diagram showing message flow across module boundaries for an Elixir/Phoenix interaction. Use when asked to diagram, trace, or visualize a user interaction, request flow, or feature path through the codebase. +--- + +# Sequence Diagram Skill + +Generate a Mermaid `sequenceDiagram` that traces a specific user interaction +across module boundaries in an Elixir/Phoenix codebase. + +## Your Task + +Given a description of an interaction (e.g., "user clicks a tag on a blog post") +and access to the source files, produce a Mermaid sequence diagram that accurately +shows the message flow between modules. + +## Process + +1. **Identify the entry point.** What triggers this interaction? (HTTP request, + LiveView event, PubSub message, etc.) +2. **Read the router** to find which controller/live module handles the route. +3. **Read the controller/live module** to find which functions are called and + which domain modules they delegate to. +4. **Read the domain modules** to understand what they return and how. +5. **Read templates/components** if the rendering path matters. +6. **Emit the diagram.** Use `sequenceDiagram` with participants named after + actual modules. Show function calls as messages. + +## Output Format + +Respond with ONLY a fenced Mermaid code block. No preamble, no explanation, +no code review, no suggestions. Just the diagram. + +```mermaid +sequenceDiagram + participant Browser + participant Router as MyAppWeb.Router + ... +``` + +## Rules + +- **Participants must be real modules** from the codebase. Never invent modules. 
+- **Messages must be real function calls** or HTTP verbs. Use the actual function + names you found in the source (e.g., `blog.posts_by_tag(tag)`, not "get posts"). +- **Show the return path.** Responses flow back: module returns data, controller + renders, browser receives HTML. +- **Distinguish compile-time from runtime.** If a module uses NimblePublisher + or module attributes, the data is compiled into the BEAM — there is no runtime + file I/O. Show this as a note, not as a message to the filesystem. +- **Stay on task.** Do NOT review the code. Do NOT suggest improvements. Do NOT + mention potential issues. Your only job is the diagram. +- **Keep it readable.** Use `Note over` for context. Use short aliases for + long module names in the participant declaration.