%{
title: "Coding agent generates its own extensions",
author: "Willem van den Ende",
tags: ~w(ai loops),
description: "Handwritten note about generating extensions for the coding agent you are in a session with, while working on something else. Engineer solutions in the moment.",
published: true
}
---

_This post was written longhand. Conversion to text was done with MyScript Notes; I made some minor manual edits to correct words and explain a few things (in parentheses)._

I see a few people writing about sharing their struggles with large language models, saying we are all still figuring this out.

For me it is easier to do at the moment of a small success.

The other day I tried to develop an extension for [Pi](https://shittycodingagent.ai) that catches the model when it goes off the rails.

I now have a local model that is fast and can call tools (search, run tests, edit code, etc.). It does, however, frequently show some of the usual quirks of model-assisted coding:

- Replace production code that works with code that throws an exception
- Write if statements in tests
- Add fallbacks for things that can't fail
- Find "problems" in code that works (passes tests + other checks, works for the user, etc.)

Long term, the solution is probably to work in small steps. But these steps come from experience.

Catching the problem when it happens, by simply matching some words, is a starting point for that: scan for keywords in any edits and prompt the user for permission. Just abort when the session is not interactive.

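A minimal sketch of that idea in TypeScript, kept free of any agent API so it can be tested on its own. The phrase list and the names `findFlaggedPhrases` and `decideAction` are assumptions made up for illustration; they are not taken from the extension described in this post, and nothing here calls Pi.

```typescript
// Hypothetical phrase list; tune it to the quirks your own model shows.
const FLAGGED_PHRASES: string[] = [
  "throw new Error",
  "not implemented",
  "fallback",
];

type Action = "allow" | "ask" | "abort";

// Return every flagged phrase that occurs in the edit text (case-insensitive).
function findFlaggedPhrases(
  editText: string,
  phrases: string[] = FLAGGED_PHRASES,
): string[] {
  const haystack = editText.toLowerCase();
  return phrases.filter((phrase) => haystack.includes(phrase.toLowerCase()));
}

// Let clean edits through, ask a human when something matched,
// and abort when nobody is there to answer.
function decideAction(flagged: string[], interactive: boolean): Action {
  if (flagged.length === 0) return "allow";
  return interactive ? "ask" : "abort";
}

const sampleEdit = 'function save() { throw new Error("Not implemented"); }';
const sampleFlags = findFlaggedPhrases(sampleEdit);
console.log(sampleFlags, decideAction(sampleFlags, true), decideAction(sampleFlags, false));
```

Plain substring matching will produce false positives (a comment that merely mentions a fallback, for example). For a first version that only asks for permission, that seems an acceptable trade.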
Looks simple, so I let a more powerful but slower local model figure out how to build an extension. Every Pi session opens with an invitation:

> Pi can explain its own features and look up its docs. Ask it how to use or extend Pi.

After some iteration we had a plan and Pi generated a plausible-looking extension.

I tested it manually, in Pi. Nothing happened. Back to the drawing board.

I went through quite a few iterations, compared with sample code, looked into the Pi API.
No luck.

Eventually I installed the sample extension. That worked. Then I deleted most of my extension and added some logging. I could see something.

I learned quite a bit about Pi and its extension mechanism.

It looks like only the last "UI notification" gets shown for any extension point (e.g. a tool call or system startup). I am not yet sure if this is by design or not.

I did take away that, here too, I want to work test-first for the parts that do not interface with the agent directly. The feedback loop is just too slow otherwise.

This also required experimentation. I did not want to set up a separate project for an extension that is little more than an idea. But I do want tests. So I asked a model again. The suggestion was to use Deno, because it has testing built in. Some more fiddling followed:

- Get [deno](https://deno.com/) to work in the [nono](https://nono.sh/cli) sandbox
- Learn that Pi auto-loads anything in the extensions folder. If you put a test there, Pi crashes. (The test does not have the method that defines an extension. All files in `.pi/agent/extensions` must have it.)
- Learn that "domain" files also don't work there. (I wanted to keep the extension files thin and the testable functions separate, so that the tests don't depend on `pi` and its types.)

So eventually I ended up with

```
- .pi / test
      / core
      / extensions
```

`core` contains the functional cores; `test` tests the core. `extensions` is a thin integration with Pi that uses the core. (the Deno project lives in )

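One way to picture that split, as a sketch under stated assumptions. The `Host` interface below is a hypothetical stand-in invented for this example and is not Pi's real extension API; `needsReview`, `guardEdit` and `fakeHost` are likewise made-up names. A real prompt would probably be asynchronous; it is synchronous here to keep the example short.

```typescript
// Hypothetical stand-in for whatever the agent offers; NOT Pi's real API.
interface Host {
  interactive: boolean;
  confirm(message: string): boolean;
}

// core/: a pure function on plain strings, no agent types involved.
function needsReview(editText: string): boolean {
  return /\bfallback\b/i.test(editText);
}

// extensions/: a thin shell that wires the core to the host.
// Returns true when the edit may proceed.
function guardEdit(editText: string, host: Host): boolean {
  if (!needsReview(editText)) return true; // nothing suspicious
  if (!host.interactive) return false; // nobody to ask, so abort
  return host.confirm("This edit adds a fallback. Apply it?");
}

// test/: a fake host that records prompts, so checks run without an agent.
function fakeHost(interactive: boolean, answer: boolean): Host & { prompts: string[] } {
  const prompts: string[] = [];
  return {
    interactive,
    prompts,
    confirm(message: string): boolean {
      prompts.push(message);
      return answer;
    },
  };
}

const demoHost = fakeHost(true, false);
console.log(guardEdit("add a fallback value", demoHost), demoHost.prompts.length);
```

Because the core only sees strings and the shell only sees the small `Host` interface, the checks never import anything from the agent, which keeps the feedback loop short.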
This was clear enough that the slow, dense model could build a second extension and performance metrics in chat, with relatively little guidance after a few iterations on a plan.

I haven't looked at the code in detail yet. Not out of principle, but because it is late, and I want to write down my trial and error before I forget.

Afterword
---

I hope you enjoyed this slowly written note. I have added the [handwritten draft](/images/blog/2026/coding-agent-generates-extensions-handwritten.pdf) as a PDF.

I found that writing in longhand helped me slow down and step away from the slot machine that wishcraft can sometimes be.