edited blogpost

Firehose Bot 2026-04-19 23:05:00 +01:00
parent efc50d4ba8
commit 3c592bdbc7
4 changed files with 83 additions and 104 deletions

@@ -1,104 +0,0 @@
%{
title: "Coding agent generates its' own extensions",
author: "Willem van den Ende",
tags: ~w(ai loops),
description: "Handwritten note",
published: false
}
---
I see a few people writing about
sharing struggles with Llm's.
for me it is easier to do at
the moment of a small success.
The challenge with writing about
my experiments is that it gets meta pretty quickly.
therefore I am going to leave out
a bunch of things, including this commentary.
the other day I tried to develop on
extension for pi - the shitty coding agent, that would stop a model when it
goes off the rails.
I now have a local model that
is fast, can call tools (edit Files, run test, etc) edit code etc.
It does, however, perform some model assisted coding quinks frequently:
- replace production code that Works With throwing an exception
- write if statements in tests
- add fallbacks for things that can't fail
- Find "problems" in that works
(passes tests + other checks, works for
the user etc)
Long form the Solution probably is to work
in small Steps. But these steps come From
experience.
Catching the hovel when it happens by
simply matching Some Lords is a starting point for that: scan for key words in
any edits on a prompt the user for permission, or abort when the session is not interactive.
Looks simple, so I let a more powerful
but slower local model figure out how to build an extensions - pi has a system prompt for that -. After cone iteration
we had a plan and pi generated d
plausible looking extension.
I tested it manually, in pi. Nothing happened. Back to the drawing board.
I had quite a few iterations, compared with
sample code, looked into the pi API,
70 luck.
Eventually I installed the sample extension. that worked. then I deleted most of my
extension, added some logging - I could
see sore thing.
I learned while a bit about pi and its extension mechanism.
It looks like only the last "UI notification" gets shown for any exension point (e.g. a fool call or system startup).
I am not get sure if this is by design or not.
I did take away the, here too, I wont to work test-first for parts that do not interface
with the agent directly. The feedback loop
is just too slow.
this also ren vired experimentation. I did not want to set up a separate project for
an extension that is little more than an idea. But I do want tests
to asked a model again. and suggestion was
to use Deno, because that has testing built
in. Some more Fiddlig Adowed:
- get Dena to work in the Sandbox
- Learn that pi auto loads any thing in
the excusions folder. If you put a test there,
pi crashes
- learn that" domain" files also don't work there.
So eventually I ended up with
```
- .pi/ test
/ core
/ extensions
```
Core contains the functional cores tests ter the core, exertion, is a thin integration with
pi that uses the core.
this was clear enough that the slow, dense model could build a second extension & performance metrics in chat.) with relatively little guidance after iterations on a plan.
I haven't looked at the code bet. not
out of principle, but because it is late,
and I want to write down my trial and error before I forget.
TODO link pdf
TODO add image from scan (in downloads)

@@ -0,0 +1,83 @@
%{
title: "Coding agent generates its own extensions",
author: "Willem van den Ende",
tags: ~w(ai loops),
description: "Handwritten note about generating extensions for a coding agent while in a session with it on something else. Engineer solutions in the moment.",
published: true
}
---
(This post was written longhand. Conversion to text was done with MyScript Notes; I made some minor manual edits to correct words and explain a few things (in parentheses). See the Afterword for what happened since writing this.)
I see a few people writing about sharing struggles with large language models, saying we are all still figuring this out.
For me it is easier to do at the moment of a small success.
The other day I tried to develop an extension for [Pi](https://shittycodingagent.ai) that would stop a model when it goes off the rails.
I now have a local model that is fast and can call tools (search, run tests, edit code etc.). It does, however, frequently show some model-assisted coding quirks:
- Replace production code that works with throwing an exception
- Write if statements in tests
- Add fallbacks for things that can't fail
- Find "problems" in code that works (passes tests + other checks, works for the user etc.)
Long term, the solution probably is to work in small steps. But these steps come from experience.
Catching the problem when it happens by simply matching some words is a starting point for that: scan for keywords in any edits and prompt the user for permission. Just abort when the session is not interactive.
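A minimal sketch of that idea, kept free of any Pi types so it can be tested on its own. Every name here (`SUSPECT_PHRASES`, `findSuspectPhrases`, `decide`) and the phrase list itself are assumptions made for illustration; this is not the extension that was generated, and it does not use Pi's API.

```typescript
// Hypothetical names throughout; nothing here is Pi's actual API.
type Decision = "allow" | "ask" | "abort";

// Example phrases only; a real list would grow from observed failures.
const SUSPECT_PHRASES: readonly string[] = [
  "not implemented",
  "fallback",
  "should never happen",
];

// Return every suspect phrase that occurs in the edit text (case-insensitive).
function findSuspectPhrases(
  editText: string,
  phrases: readonly string[] = SUSPECT_PHRASES,
): string[] {
  const haystack = editText.toLowerCase();
  return phrases.filter((phrase) => haystack.includes(phrase.toLowerCase()));
}

// Clean edits pass. Suspect edits prompt the user when the session is
// interactive and abort when it is not.
function decide(editText: string, interactive: boolean): Decision {
  if (findSuspectPhrases(editText).length === 0) return "allow";
  return interactive ? "ask" : "abort";
}
```

Plain substring matching will produce false positives (a variable that happens to be called `fallbackColor`, say), which is why the interactive path asks instead of blocking outright.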
Looks simple, so I let a more powerful but slower local model figure out how to build an extension. Every Pi session opens with an invitation:
> Pi can explain its own features and look up its docs. Ask it how to use or extend Pi.
After some iteration we had a plan and Pi generated a plausible-looking extension.
I tested it manually, in Pi. Nothing happened. Back to the drawing board.
I had quite a few iterations, compared with sample code, looked into the Pi API.
No luck.
Eventually I installed the sample extension. That worked. Then I deleted most of my
extension and added some logging. Now I could see something.
I learned quite a bit about Pi and its extension mechanism.
It looks like only the last "UI notification" gets shown for any extension point (e.g. a tool call or system startup). I am not yet sure if this is by design or not.
I did take away that, here too, I want to work test-first for parts that do not interface
with the agent directly. The feedback loop is just too slow otherwise.
This also required experimentation. I did not want to set up a separate project for an extension that is little more than an idea. But I do want tests. So I asked a model again. The suggestion was to use Deno, because that has testing built in. Some more fiddling followed:
- Get [deno](https://deno.com/) to work in the [nono](https://nono.sh/cli) sandbox
- Learn that Pi auto-loads anything in the extensions folder. If you put a test there, Pi crashes. (The test does not have the method that defines an extension. All files in `.pi/agent/extensions` must have it.)
- Learn that "domain" files also don't work there. (I wanted to have the extension files thin and the testable functions separate, so that the tests don't depend on `pi` and its types.)
![handwritten draft of the paragraph below](/images/blog/2026/local-llm-handwriting.png)
So eventually I ended up with
```
- .pi / test
/ core
/ extensions
```
`Core` contains the functional cores; `test` tests the core. `Extensions` is a thin integration with Pi that uses the core. (the Deno project lives in )
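The split can be sketched roughly as follows. This is an illustration under assumptions, not the generated code: `AgentUi` is a hypothetical stand-in for whatever Pi actually hands to an extension, and `describeMatches`, `reportMatches` and `fakeUi` are made-up names. In the layout above, the first function would live in `core` and the second in `extensions`.

```typescript
// --- core: plain functions, no agent types, quick to test ---
function describeMatches(matches: readonly string[]): string {
  if (matches.length === 0) return "";
  return `Suspicious edit, matched: ${matches.join(", ")}`;
}

// --- shell: the only part that would talk to the agent ---
// `AgentUi` is a hypothetical interface, not Pi's real one.
interface AgentUi {
  notify(message: string): void;
}

// Send at most one notification; report whether anything was sent.
function reportMatches(ui: AgentUi, matches: readonly string[]): boolean {
  const message = describeMatches(matches);
  if (message === "") return false;
  ui.notify(message);
  return true;
}

// A recording fake stands in for the agent, so the shell can be
// exercised without starting a session.
function fakeUi(): AgentUi & { messages: string[] } {
  const messages: string[] = [];
  return {
    messages,
    notify: (message: string) => {
      messages.push(message);
    },
  };
}
```

A `Deno.test` case in the `test` folder can then call `describeMatches` directly, which keeps the slow manual check inside the agent for the thin shell only.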
This was clear enough that the slow, dense model could build a second extension and performance metrics in chat, with relatively little guidance after iterations on a plan.
I haven't looked at the code yet (in detail). Not out of principle, but because it is late,
and I want to write down my trial and error before I forget.
Afterword
---
I hope you enjoyed this slowly written note. I have added the
[handwritten draft](/images/blog/2026/coding-agent-generates-extensions-handwritten.pdf) as pdf.
I found writing in longhand helped me slow down and step away from the slot machine that wishcraft can be sometimes.

Binary file not shown. Size: 178 KiB