diff --git a/sequence-diagram-skill/.gitignore b/sequence-diagram-skill/.gitignore
new file mode 100644
index 0000000..7b40247
--- /dev/null
+++ b/sequence-diagram-skill/.gitignore
@@ -0,0 +1,10 @@
+# autoresearch session
+autoresearch.jsonl
+autoresearch.ideas.md
+
+# temp
+.tmp_*
+*.tmp
+
+# OS
+.DS_Store
diff --git a/sequence-diagram-skill/README.md b/sequence-diagram-skill/README.md
new file mode 100644
index 0000000..c78b201
--- /dev/null
+++ b/sequence-diagram-skill/README.md
@@ -0,0 +1,80 @@
+# Sequence Diagram Skill — Autoresearch
+
+Optimizes a pi skill for generating Mermaid sequence diagrams from
+Elixir/Phoenix codebases, using [pi-autoresearch](https://github.com/davebcn87/pi-autoresearch).
+
+## The Problem
+
+Small local models (Qwen3.5-35B-A3B) produce great sequence diagrams for
+well-represented languages (C#, Java) but go off the rails with Elixir/Phoenix —
+sidetracking into imaginary code reviews instead of finishing the diagram.
+
+## How It Works
+
+The autoresearch loop mutates `skill/SKILL.md`, runs it against 3 scenarios
+from a real Phoenix codebase (Firehose), and scores with **zero-judge-model
+bash evals**:
+
+| Eval | Check | Tool |
+|------|-------|------|
+| has_diagram | Output has `` ```mermaid `` + `sequenceDiagram` | grep |
+| diagram_parseable | Valid mermaid syntax (participants + messages) | grep / mmdc |
+| uses_real_modules | ≥2 actual module names from codebase | grep |
+| uses_real_functions | ≥1 actual function name | grep |
+| no_sidetracking | No review/critique language | grep against blocklist |
+| concise | Under 3000 chars | bash string length |
+
+3 tasks × 6 evals = 18 max score.
+
+## Setup
+
+1. Clone the Firehose repo into `workspace/`:
+   ```bash
+   git clone https://gitea.apps.sustainabledelivery.com/mostalive/firehose workspace
+   ```
+
+2. Make scripts executable:
+   ```bash
+   chmod +x autoresearch.sh autoresearch.checks.sh scripts/*.sh
+   ```
+
+3. Configure model access in `scripts/config.env`:
+   - Local: leave `SSH_TARGET` empty, have pi configured with your model
+   - Remote: set `SSH_TARGET=analyst@your-host` and `SSH_PORT=2222`
+
+4. Init git and start:
+   ```bash
+   git init && git add -A && git commit -m "initial"
+   pi
+   # then: /autoresearch
+   ```
+
+## Project Structure
+
+```
+sequence-diagram-skill/
+├── autoresearch.md           # Session doc (pi reads this)
+├── autoresearch.sh           # Benchmark runner
+├── autoresearch.checks.sh    # Sanity checks on SKILL.md
+├── skill/
+│   └── SKILL.md              # THE FILE BEING OPTIMIZED
+├── benchmark/
+│   └── tasks.jsonl           # 3 test scenarios
+├── scripts/
+│   ├── config.env            # Endpoint config
+│   ├── run_one.sh            # Run pi with skill + single task
+│   ├── score.sh              # Score a single output (6 binary evals)
+│   └── sidetrack_blocklist.txt  # Phrases that indicate off-task behavior
+└── workspace/                # Clone of Firehose repo (mounted/symlinked)
+```
+
+## Mutation Ideas for the Agent
+
+The autoresearch agent only edits `skill/SKILL.md`. Good mutations include:
+
+- Stronger "do not review" constraints
+- Explicit Elixir/Phoenix vocabulary hints (NimblePublisher, module attributes)
+- Output format enforcement (ONLY the mermaid block, nothing else)
+- Step-by-step process instructions (read router first, then controller, etc.)
+- Short generic example of a good sequence diagram
+- Negative examples ("do NOT include suggestions or improvements")
diff --git a/sequence-diagram-skill/autoresearch.checks.sh b/sequence-diagram-skill/autoresearch.checks.sh
new file mode 100755
index 0000000..ec3e2ba
--- /dev/null
+++ b/sequence-diagram-skill/autoresearch.checks.sh
@@ -0,0 +1,57 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# ─── autoresearch.checks.sh ─────────────────────────────────────────────────
+# Backpressure checks for the sequence diagram skill.
+# ─────────────────────────────────────────────────────────────────────────────
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SKILL_FILE="${SCRIPT_DIR}/skill/SKILL.md"
+ERRORS=0
+
+# 1. Skill exists and is non-empty
+if [[ ! -s "$SKILL_FILE" ]]; then
+    echo "FAIL: skill/SKILL.md is missing or empty"
+    ERRORS=$((ERRORS + 1))
+fi
+
+# 2. Skill is not trivially short
+CHAR_COUNT=$(wc -c < "$SKILL_FILE" 2>/dev/null | tr -d '[:space:]' || echo "0")
+if (( CHAR_COUNT < 200 )); then
+    echo "FAIL: skill/SKILL.md is only ${CHAR_COUNT} chars (min: 200)"
+    ERRORS=$((ERRORS + 1))
+fi
+
+# 3. Skill is not too long (rough token proxy: 1500 tokens ≈ 6000 chars)
+if (( CHAR_COUNT > 6000 )); then
+    echo "FAIL: skill/SKILL.md is ${CHAR_COUNT} chars (max: ~6000)"
+    ERRORS=$((ERRORS + 1))
+fi
+
+# 4. Skill must contain "sequenceDiagram" or "sequence diagram" (it's a diagram skill)
+if ! grep -qiE 'sequence ?diagram' "$SKILL_FILE" 2>/dev/null; then
+    echo "FAIL: skill/SKILL.md doesn't mention sequence diagrams"
+    ERRORS=$((ERRORS + 1))
+fi
+
+# 5. Skill must NOT contain Firehose-specific code (no overfitting)
+for term in "BlogController" "EngineeringBlog" "Firehose" "blogex" "priv/blog"; do
+    if grep -q "$term" "$SKILL_FILE" 2>/dev/null; then
+        echo "FAIL: skill/SKILL.md contains codebase-specific term '${term}'"
+        ERRORS=$((ERRORS + 1))
+    fi
+done
+
+# 6. Valid UTF-8
+if ! iconv -f utf-8 -t utf-8 "$SKILL_FILE" > /dev/null 2>&1; then
+    echo "FAIL: skill/SKILL.md contains invalid UTF-8"
+    ERRORS=$((ERRORS + 1))
+fi
+
+if (( ERRORS > 0 )); then
+    echo "Checks FAILED with ${ERRORS} error(s)"
+    exit 1
+else
+    echo "All checks passed. Skill: ${CHAR_COUNT} chars."
+    exit 0
+fi
diff --git a/sequence-diagram-skill/autoresearch.md b/sequence-diagram-skill/autoresearch.md
new file mode 100644
index 0000000..e60214f
--- /dev/null
+++ b/sequence-diagram-skill/autoresearch.md
@@ -0,0 +1,96 @@
+# Autoresearch: Sequence Diagram Skill for Elixir/Phoenix
+
+## Objective
+
+Optimize a pi skill (`skill/SKILL.md`) that generates Mermaid sequence diagrams
+from Elixir/Phoenix codebases. The skill is used with a local Qwen3.5-35B-A3B
+model running on CPU. The primary failure mode is **sidetracking** — the model
+abandons the diagram task and starts reviewing/critiquing the code instead.
+
+## Primary Metric
+
+**score** — higher is better (0–18 scale, sum of 6 binary evals × 3 test inputs).
+
+## Secondary Metrics
+
+- **sidetrack_count** — number of test runs containing review/critique language (lower is better)
+- **parse_count** — number of outputs that contain a parseable sequenceDiagram (higher is better)
+
+## Architecture
+
+Pi runs the skill against the Firehose codebase (mounted in the workspace) using
+the target model. Scoring is done by bash scripts — no judge model needed.
+
+## The Codebase Under Test
+
+**Firehose** — a Phoenix blogging platform with a monorepo structure:
+
+- `app/` — Phoenix web app (OTP app: `:firehose`)
+  - `lib/firehose_web/router.ex` — routes
+  - `lib/firehose_web/controllers/blog_controller.ex` — blog actions
+  - `lib/firehose_web/controllers/page_controller.ex` — homepage
+  - `lib/firehose/blogs/` — blog context modules (EngineeringBlog, ReleaseNotes)
+- `blogex/` — sibling library for compile-time blog engine
+  - `lib/blogex/blog.ex` — `use Blogex.Blog` macro (NimblePublisher)
+  - `lib/blogex/components.ex` — Phoenix function components (post_meta, tag_list, etc.)
+  - `lib/blogex/router.ex` — API/feed routes
+
+**Key architectural fact:** Blogex uses NimblePublisher. All blog posts are compiled
+into BEAM module attributes at build time. There is NO runtime file I/O for reading
+posts. Functions like `all_posts/0`, `get_post!/1`, `posts_by_tag/1` read from
+`@posts` module attributes. This is the #1 thing models get wrong.
+
+## Test Inputs (3 scenarios)
+
+### 1. Click tag on post (small)
+"Generate a sequence diagram for: a user on a blog post page clicks a tag link
+(e.g., 'elixir'). Trace the full request from browser through to rendered response."
+
+### 2. Show homepage (small)
+"Generate a sequence diagram for: a user visits the homepage (GET /).
+Trace from browser through to rendered HTML."
+
+### 3. Add blog post on disk (larger, crosses compile/runtime boundary)
+"Generate a sequence diagram for: a developer creates a new markdown file in
+priv/blog/engineering/. Trace what happens from file creation through to the
+post being visible on the blog. Include the compile-time and runtime phases."
+
+## Eval Criteria (6 binary checks)
+
+1. **has_diagram** — output contains `` ```mermaid `` and `sequenceDiagram`
+2. **diagram_parseable** — the mermaid block contains `sequenceDiagram`, at least one `participant`, and at least one message arrow (also validated with `mmdc` when it is installed)
+3. **uses_real_modules** — output mentions at least 2 of: BlogController, EngineeringBlog, ReleaseNotes, Blogex, Router, PageController, Layouts
+4. **uses_real_functions** — output mentions at least 1 of: posts_by_tag, get_post!, all_posts, paginate, resolve_blog, render, recent_posts
+5. **no_sidetracking** — output does NOT contain code review language (see blocklist)
+6. **concise** — total output is under 3000 characters
+
+## Files in Scope
+
+| File | Agent may edit? |
+|------|-----------------|
+| `skill/SKILL.md` | ✅ YES — the only file the agent modifies |
+| `benchmark/tasks.jsonl` | ❌ NO |
+| `scripts/score.sh` | ❌ NO |
+| `scripts/run_one.sh` | ❌ NO |
+| `scripts/sidetrack_blocklist.txt` | ❌ NO |
+| `autoresearch.sh` | ❌ NO |
+| `autoresearch.checks.sh` | ❌ NO |
+
+## Constraints
+
+- SKILL.md must stay under 1500 tokens.
+- SKILL.md must NOT contain any code from the Firehose codebase (no overfitting).
+- SKILL.md must remain generic — it should work for any Elixir/Phoenix codebase,
+  not just Firehose.
+
+## What Has Been Tried
+
+(autoresearch fills this in)
+
+## Dead Ends
+
+(autoresearch fills this in)
+
+## Key Wins
+
+(autoresearch fills this in)
diff --git a/sequence-diagram-skill/autoresearch.sh b/sequence-diagram-skill/autoresearch.sh
new file mode 100755
index 0000000..d5b7846
--- /dev/null
+++ b/sequence-diagram-skill/autoresearch.sh
@@ -0,0 +1,101 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# ─── autoresearch.sh ─────────────────────────────────────────────────────────
+# Benchmark script for sequence diagram skill optimization.
+# Runs all 3 test inputs, scores each, outputs METRIC lines.
+# ─────────────────────────────────────────────────────────────────────────────
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "${SCRIPT_DIR}/scripts/config.env" 2>/dev/null || true
+
+# Defaults
+SSH_TARGET="${SSH_TARGET:-}"
+SSH_PORT="${SSH_PORT:-2222}"
+export TASK_TIMEOUT="${TASK_TIMEOUT:-180}"
+
+# ─── Pre-checks ──────────────────────────────────────────────────────────────
+
+SKILL_FILE="${SCRIPT_DIR}/skill/SKILL.md"
+if [[ ! -s "$SKILL_FILE" ]]; then
+    echo "ERROR: skill/SKILL.md is missing or empty"
+    exit 1
+fi
+
+SKILL_CHARS=$(wc -c < "$SKILL_FILE" | tr -d '[:space:]')
+echo "Skill: ${SKILL_CHARS} chars"
+
+TASKS_FILE="${SCRIPT_DIR}/benchmark/tasks.jsonl"
+if [[ ! -f "$TASKS_FILE" ]]; then
+    echo "ERROR: benchmark/tasks.jsonl not found"
+    exit 1
+fi
+
+echo "────────────────────────────────────────────────────"
+
+# ─── Run all tasks ───────────────────────────────────────────────────────────
+
+TMPDIR=$(mktemp -d)
+TOTAL_SCORE=0
+SIDETRACK_COUNT=0
+PARSE_COUNT=0
+TASK_COUNT=0
+
+START_TIME=$(date +%s)
+
+while IFS= read -r line; do
+    TASK_ID=$(echo "$line" | jq -r '.id')
+    TASK_PROMPT=$(echo "$line" | jq -r '.prompt')
+    TASK_COUNT=$((TASK_COUNT + 1))
+    
+    OUTPUT_FILE="${TMPDIR}/${TASK_ID}.txt"
+    SCORE_FILE="${TMPDIR}/${TASK_ID}.json"
+    
+    echo "  [${TASK_COUNT}/3] ${TASK_ID}..."
+    
+    # Run the task (stdin from /dev/null so it cannot consume the task list)
+    bash "${SCRIPT_DIR}/scripts/run_one.sh" \
+        "$TASK_PROMPT" \
+        "$OUTPUT_FILE" \
+        "$SSH_TARGET" \
+        "$SSH_PORT" < /dev/null
+    
+    # Score it
+    SCORE_JSON=$(bash "${SCRIPT_DIR}/scripts/score.sh" "$OUTPUT_FILE")
+    echo "$SCORE_JSON" > "$SCORE_FILE"
+    
+    # Extract scores
+    TASK_SCORE=$(echo "$SCORE_JSON" | jq -r '.score')
+    TASK_SIDETRACK=$(echo "$SCORE_JSON" | jq -r '.no_sidetracking')
+    TASK_PARSE=$(echo "$SCORE_JSON" | jq -r '.diagram_parseable')
+    TASK_CHARS=$(echo "$SCORE_JSON" | jq -r '.char_count')
+    
+    TOTAL_SCORE=$((TOTAL_SCORE + TASK_SCORE))
+    
+    if (( TASK_SIDETRACK == 0 )); then
+        SIDETRACK_COUNT=$((SIDETRACK_COUNT + 1))
+    fi
+    
+    if (( TASK_PARSE == 1 )); then
+        PARSE_COUNT=$((PARSE_COUNT + 1))
+    fi
+    
+    echo "    score=${TASK_SCORE}/6 sidetrack=$(( 1 - TASK_SIDETRACK )) parseable=${TASK_PARSE} chars=${TASK_CHARS}"
+    
+done < "$TASKS_FILE"
+
+END_TIME=$(date +%s)
+TOTAL_SECONDS=$((END_TIME - START_TIME))
+
+# ─── Cleanup ─────────────────────────────────────────────────────────────────
+
+rm -rf "$TMPDIR"
+
+# ─── Output METRIC lines ────────────────────────────────────────────────────
+
+echo ""
+echo "METRIC score=${TOTAL_SCORE}"
+echo "METRIC sidetrack_count=${SIDETRACK_COUNT}"
+echo "METRIC parse_count=${PARSE_COUNT}"
+echo "METRIC total_seconds=${TOTAL_SECONDS}"
+echo "METRIC skill_chars=${SKILL_CHARS}"
diff --git a/sequence-diagram-skill/benchmark/tasks.jsonl b/sequence-diagram-skill/benchmark/tasks.jsonl
new file mode 100644
index 0000000..abfd16c
--- /dev/null
+++ b/sequence-diagram-skill/benchmark/tasks.jsonl
@@ -0,0 +1,3 @@
+{"id": "click-tag", "prompt": "Generate a sequence diagram for: a user on a blog post page clicks a tag link (e.g., 'elixir'). Trace the full HTTP request from browser through the Phoenix router, controller, domain modules, templates, and back to the browser. The codebase is in /home/analyst/workspace/. Read the relevant source files first."}
+{"id": "show-homepage", "prompt": "Generate a sequence diagram for: a user visits the homepage (GET /). Trace from the browser's HTTP request through the Phoenix router, controller, template rendering, layout wrapping, and back to the browser. The codebase is in /home/analyst/workspace/. Read the relevant source files first."}
+{"id": "add-post", "prompt": "Generate a sequence diagram for: a developer creates a new markdown file in priv/blog/engineering/ and the post becomes visible on the blog. Trace what happens including the compile-time phase (NimblePublisher, module recompilation) and the runtime request phase. The codebase is in /home/analyst/workspace/. Read the relevant source files first."}
diff --git a/sequence-diagram-skill/scripts/config.env b/sequence-diagram-skill/scripts/config.env
new file mode 100644
index 0000000..6b60d19
--- /dev/null
+++ b/sequence-diagram-skill/scripts/config.env
@@ -0,0 +1,10 @@
+# ─── config.env ──────────────────────────────────────────────────────────────
+# Leave SSH_TARGET empty to run pi locally (e.g., on your Mac).
+# Set it to use the remote pi container.
+
+# Remote pi container (leave empty for local)
+SSH_TARGET=""
+SSH_PORT=2222
+
+# Timeout per task (seconds)
+TASK_TIMEOUT=180
diff --git a/sequence-diagram-skill/scripts/run_one.sh b/sequence-diagram-skill/scripts/run_one.sh
new file mode 100755
index 0000000..b34b95c
--- /dev/null
+++ b/sequence-diagram-skill/scripts/run_one.sh
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# ─── run_one.sh ──────────────────────────────────────────────────────────────
+# Run pi with the sequence-diagram skill on a single task.
+# Usage: ./scripts/run_one.sh <task_prompt> <output_file> [ssh_target] [ssh_port]
+#
+# If ssh_target is provided, runs remotely via SSH into the pi container.
+# Otherwise runs pi locally.
+# ─────────────────────────────────────────────────────────────────────────────
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
+
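+TASK_PROMPT="${1:?usage: run_one.sh <task_prompt> <output_file> [ssh_target] [ssh_port]}"
+OUTPUT_FILE="${2:?usage: run_one.sh <task_prompt> <output_file> [ssh_target] [ssh_port]}"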
+SSH_TARGET="${3:-}"
+SSH_PORT="${4:-2222}"
+TIMEOUT="${TASK_TIMEOUT:-180}"
+
+SKILL_FILE="${PROJECT_DIR}/skill/SKILL.md"
+
+if [[ ! -f "$SKILL_FILE" ]]; then
+    echo "ERROR: skill/SKILL.md not found" >&2
+    exit 1
+fi
+
+SKILL_CONTENT=$(cat "$SKILL_FILE")
+
+# Build the full prompt: skill instructions + task
+FULL_PROMPT="## Skill Instructions
+
+${SKILL_CONTENT}
+
+## Task
+
+${TASK_PROMPT}"
+
+if [[ -n "$SSH_TARGET" ]]; then
+    # ─── Remote: SSH into pi container ───────────────────────────────────
+    PAYLOAD=$(jq -n --arg prompt "$FULL_PROMPT" '{"prompt": $prompt}')
+    
+    ssh -p "$SSH_PORT" \
+        -o StrictHostKeyChecking=no \
+        -o ConnectTimeout=10 \
+        -o BatchMode=yes \
+        "$SSH_TARGET" \
+        "run-task --stdin --mode print --thinking off --timeout $TIMEOUT" \
+        <<< "$PAYLOAD" > "$OUTPUT_FILE" 2>/dev/null || true
+else
+    # ─── Local: run pi directly (needs GNU `timeout`; stock macOS lacks it) ──
+    timeout "${TIMEOUT}s" pi \
+        --mode print \
+        --no-session \
+        --no-extensions \
+        --thinking none \
+        -p "$FULL_PROMPT" > "$OUTPUT_FILE" 2>/dev/null || true
+fi
diff --git a/sequence-diagram-skill/scripts/score.sh b/sequence-diagram-skill/scripts/score.sh
new file mode 100755
index 0000000..1fc2417
--- /dev/null
+++ b/sequence-diagram-skill/scripts/score.sh
@@ -0,0 +1,109 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# ─── score.sh ────────────────────────────────────────────────────────────────
+# Score a single diagram output against 6 binary evals.
+# Usage: ./scripts/score.sh <output_file>
+# Prints a JSON line with pass/fail for each eval and total score.
+# ─────────────────────────────────────────────────────────────────────────────
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+OUTPUT_FILE="$1"
+
+if [[ ! -f "$OUTPUT_FILE" ]]; then
+    echo '{"error": "file not found", "score": 0}' 
+    exit 0
+fi
+
+CONTENT=$(cat "$OUTPUT_FILE")
+CHAR_COUNT=${#CONTENT}
+
+# ─── Eval 1: has_diagram ─────────────────────────────────────────────────────
+# Output contains a mermaid fenced block with sequenceDiagram
+has_diagram=0
+if grep -q '```mermaid' <<< "$CONTENT" && grep -q 'sequenceDiagram' <<< "$CONTENT"; then
+    has_diagram=1
+fi
+
+# ─── Eval 2: diagram_parseable ───────────────────────────────────────────────
+# Extract the mermaid block and check basic syntax
+diagram_parseable=0
+if (( has_diagram == 1 )); then
+    # Extract mermaid block
+    MERMAID_BLOCK=$(echo "$CONTENT" | sed -n '/```mermaid/,/```/p' | sed '1d;$d')
+    
+    if [[ -n "$MERMAID_BLOCK" ]]; then
+        # Basic syntax checks:
+        # - Has "sequenceDiagram" keyword
+        # - Has at least one "participant" line  
+        # - Has at least one "->>", "-->>", or "->" message line
+        has_keyword=$(echo "$MERMAID_BLOCK" | grep -c 'sequenceDiagram' || true)
+        has_participant=$(echo "$MERMAID_BLOCK" | grep -c 'participant' || true)
+        has_message=$(echo "$MERMAID_BLOCK" | grep -cE '\->>|-->>|\->' || true)
+        
+        if (( has_keyword > 0 && has_participant > 0 && has_message > 0 )); then
+            diagram_parseable=1
+        fi
+    fi
+    
+    # If mmdc (mermaid CLI) is available, use it for real validation
+    if command -v mmdc &> /dev/null && (( diagram_parseable == 1 )); then
+        TMPFILE=$(mktemp /tmp/mermaid_XXXXXX.mmd)
+        echo "$MERMAID_BLOCK" > "$TMPFILE"
+        if mmdc -i "$TMPFILE" -o /dev/null 2>/dev/null; then
+            diagram_parseable=1
+        else
+            diagram_parseable=0
+        fi
+        rm -f "$TMPFILE"
+    fi
+fi
+
+# ─── Eval 3: uses_real_modules ───────────────────────────────────────────────
+# Diagram mentions at least 2 real modules from the Firehose codebase
+uses_real_modules=0
+module_count=0
+for module in BlogController EngineeringBlog ReleaseNotes Blogex Router PageController Layouts; do
+    if grep -qi "$module" <<< "$CONTENT"; then
+        module_count=$((module_count + 1))
+    fi
+done
+if (( module_count >= 2 )); then
+    uses_real_modules=1
+fi
+
+# ─── Eval 4: uses_real_functions ─────────────────────────────────────────────
+# Diagram mentions at least 1 real function from the codebase
+uses_real_functions=0
+for func in posts_by_tag get_post all_posts paginate resolve_blog render recent_posts; do
+    if grep -qi "$func" <<< "$CONTENT"; then
+        uses_real_functions=1
+        break
+    fi
+done
+
+# ─── Eval 5: no_sidetracking ────────────────────────────────────────────────
+# Output does NOT contain code review / critique language
+no_sidetracking=1
+BLOCKLIST="${SCRIPT_DIR}/sidetrack_blocklist.txt"
+if [[ -f "$BLOCKLIST" ]]; then
+    while IFS= read -r phrase; do
+        phrase=$(echo "$phrase" | xargs)  # trim whitespace
+        if [[ -n "$phrase" ]] && grep -qiF -- "$phrase" <<< "$CONTENT"; then
+            no_sidetracking=0
+            break
+        fi
+    done < "$BLOCKLIST"
+fi
+
+# ─── Eval 6: concise ────────────────────────────────────────────────────────
+# Total output under 3000 characters
+concise=0
+if (( CHAR_COUNT < 3000 )); then
+    concise=1
+fi
+
+# ─── Total ───────────────────────────────────────────────────────────────────
+score=$((has_diagram + diagram_parseable + uses_real_modules + uses_real_functions + no_sidetracking + concise))
+
+echo "{\"score\":${score},\"has_diagram\":${has_diagram},\"diagram_parseable\":${diagram_parseable},\"uses_real_modules\":${uses_real_modules},\"uses_real_functions\":${uses_real_functions},\"no_sidetracking\":${no_sidetracking},\"concise\":${concise},\"char_count\":${CHAR_COUNT}}"
diff --git a/sequence-diagram-skill/scripts/sidetrack_blocklist.txt b/sequence-diagram-skill/scripts/sidetrack_blocklist.txt
new file mode 100644
index 0000000..58b233c
--- /dev/null
+++ b/sequence-diagram-skill/scripts/sidetrack_blocklist.txt
@@ -0,0 +1,23 @@
+potential issue
+consider using
+should be
+could be improved
+recommend
+suggestion
+improvement
+code review
+refactor
+best practice
+security concern
+vulnerability
+error handling could
+missing error
+you might want
+it would be better
+note that this
+be aware that
+one concern
+problematic
+anti-pattern
+smell
+technical debt
diff --git a/sequence-diagram-skill/skill/SKILL.md b/sequence-diagram-skill/skill/SKILL.md
new file mode 100644
index 0000000..39f9962
--- /dev/null
+++ b/sequence-diagram-skill/skill/SKILL.md
@@ -0,0 +1,54 @@
+---
+name: sequence-diagram
+description: Generate a Mermaid sequence diagram showing message flow across module boundaries for an Elixir/Phoenix interaction. Use when asked to diagram, trace, or visualize a user interaction, request flow, or feature path through the codebase.
+---
+
+# Sequence Diagram Skill
+
+Generate a Mermaid `sequenceDiagram` that traces a specific user interaction
+across module boundaries in an Elixir/Phoenix codebase.
+
+## Your Task
+
+Given a description of an interaction (e.g., "user clicks a tag on a blog post")
+and access to the source files, produce a Mermaid sequence diagram that accurately
+shows the message flow between modules.
+
+## Process
+
+1. **Identify the entry point.** What triggers this interaction? (HTTP request,
+   LiveView event, PubSub message, etc.)
+2. **Read the router** to find which controller/live module handles the route.
+3. **Read the controller/live module** to find which functions are called and
+   which domain modules they delegate to.
+4. **Read the domain modules** to understand what they return and how.
+5. **Read templates/components** if the rendering path matters.
+6. **Emit the diagram.** Use `sequenceDiagram` with participants named after
+   actual modules. Show function calls as messages.
+
+## Output Format
+
+Respond with ONLY a fenced Mermaid code block. No preamble, no explanation,
+no code review, no suggestions. Just the diagram.
+
+```mermaid
+sequenceDiagram
+    participant Browser
+    participant Router as MyAppWeb.Router
+    ...
+```
+
+## Rules
+
+- **Participants must be real modules** from the codebase. Never invent modules.
+- **Messages must be real function calls** or HTTP verbs. Use the actual function
+  names you found in the source (e.g., `Accounts.get_user!(id)`, not "get user").
+- **Show the return path.** Responses flow back: module returns data, controller
+  renders, browser receives HTML.
+- **Distinguish compile-time from runtime.** If a module uses NimblePublisher
+  or module attributes, the data is compiled into the BEAM — there is no runtime
+  file I/O. Show this as a note, not as a message to the filesystem.
+- **Stay on task.** Do NOT review the code. Do NOT suggest improvements. Do NOT
+  mention potential issues. Your only job is the diagram.
+- **Keep it readable.** Use `Note over` for context. Use short aliases for
+  long module names in the participant declaration.