as it happens - draft, needs editing

2026-04-18 10:22:21 +01:00 · 2026-04-18 10:22:21 +01:00 · efc50d4ba8
commit efc50d4ba8
parent 6eee29a9f2
1 changed files with 104 additions and 0 deletions
--- a/app/priv/blog/engineering/2026/04-18-coding-agent-generates-its-own-extensions.md
+++ b/app/priv/blog/engineering/2026/04-18-coding-agent-generates-its-own-extensions.md
@ -0,0 +1,104 @@
 %{
  title: "Coding agent generates its own extensions",
  author: "Willem van den Ende",
  tags: ~w(ai loops),
  description: "Handwritten note",
  published: false
 }
 ---
 I see a few people writing about
 sharing struggles with LLMs.
 For me it is easier to do at
 the moment of a small success.
 The challenge with writing about
 my experiments is that it gets meta pretty quickly.
 Therefore I am going to leave out
 a bunch of things, including this commentary.
 The other day I tried to develop an
 extension for pi - the shitty coding agent, that would stop a model when it
 goes off the rails.
 I now have a local model that
 is fast, can call tools (edit files, run tests, etc.), edit code etc.
 It does, however, frequently show some model-assisted coding quirks:
 - replace production code that works with throwing an exception
 - write if statements in tests
 - add fallbacks for things that can't fail
 - find "problems" in code that works
 	(passes tests + other checks, works for
 		the user etc.)
 Long term the solution probably is to work
 in small steps. But these steps come from
 experience.
 Catching the model when it happens by
 simply matching some words is a starting point for that: scan for keywords in
 any edits and prompt the user for permission, or abort when the session is not interactive.
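 A minimal sketch of that idea, as a pure function that knows nothing about pi. All names and patterns here (`scanEdit`, `decide`, `SUSPICIOUS`) are made up for illustration and are not part of pi's extension API; the real keyword list would come from watching the model misbehave.

```typescript
// Hypothetical sketch: names and patterns are illustrative, not pi's API.
type Decision = "allow" | "prompt" | "abort";

interface Match {
  label: string;
  line: number; // 1-based line number within the edit text
}

// Assumed example rules; tune these to the quirks you actually see.
const SUSPICIOUS: { label: string; pattern: RegExp }[] = [
  { label: "throws instead of implementing", pattern: /throw new Error\(/ },
  { label: "fallback", pattern: /\bfallback\b/i },
];

// Scan the text of an edit line by line and collect every rule that matches.
function scanEdit(editText: string, rules = SUSPICIOUS): Match[] {
  const matches: Match[] = [];
  editText.split("\n").forEach((text, index) => {
    for (const rule of rules) {
      if (rule.pattern.test(text)) {
        matches.push({ label: rule.label, line: index + 1 });
      }
    }
  });
  return matches;
}

// Clean edits pass; suspicious ones need a human, or abort if nobody is there.
function decide(matches: Match[], interactive: boolean): Decision {
  if (matches.length === 0) return "allow";
  return interactive ? "prompt" : "abort";
}
```

 Keeping the scan and the decision separate from the agent means both can be checked in milliseconds, without starting a session.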
 Looks simple, so I let a more powerful
 but slower local model figure out how to build an extension - pi has a system prompt for that. After one iteration
 we had a plan and pi generated a
 plausible looking extension.
 I tested it manually, in pi. Nothing happened. Back to the drawing board.
 I had quite a few iterations, compared with
 sample code, looked into the pi API,
 no luck.
 Eventually I installed the sample extension. That worked. Then I deleted most of my
 extension, added some logging - I could
 see something.
 I learned quite a bit about pi and its extension mechanism.
 It looks like only the last "UI notification" gets shown for any extension point (e.g. a tool call or system startup).
 I am not yet sure if this is by design or not.
 I did take away that, here too, I want to work test-first for parts that do not interface
 with the agent directly. The feedback loop
 is just too slow.
 This also required experimentation. I did not want to set up a separate project for
 an extension that is little more than an idea. But I do want tests.
 So I asked a model again, and its suggestion was
 to use Deno, because that has testing built
 in. Some more fiddling followed:
 - get Deno to work in the sandbox
 - learn that pi auto-loads anything in
 	the extensions folder. If you put a test there,
 	pi crashes
 - learn that "domain" files also don't work there.
 So eventually I ended up with
 ```
 .pi/
 	test/
 	core/
 	extensions/
 ```
 Core contains the functional core, test the tests for the core, and extensions is a thin integration with
 pi that uses the core.
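 To make the split concrete, here is a rough sketch of a thin shell around a pure core function. The `Host` interface, `makeEditGuard` and `fakeHost` are hypothetical names invented for this example; they stand in for whatever pi really exposes to extensions, and a real confirmation prompt would most likely be asynchronous.

```typescript
// Hypothetical names throughout; this is not pi's extension API.
type Decision = "allow" | "prompt" | "abort";

// The small surface the shell is assumed to need from the agent host.
interface Host {
  interactive: boolean;
  confirm(message: string): boolean; // simplified to synchronous
}

// Thin shell: takes a pure core function and maps its decisions to host calls.
// Returns true when the edit may proceed.
function makeEditGuard(
  core: (editText: string, interactive: boolean) => Decision,
  host: Host,
): (editText: string) => boolean {
  return (editText) => {
    const decision = core(editText, host.interactive);
    if (decision === "allow") return true;
    if (decision === "abort") return false;
    return host.confirm("Suspicious edit, apply it anyway?");
  };
}

// A fake host stands in for the agent, so the shell can be tested without it.
function fakeHost(
  interactive: boolean,
  answer: boolean,
): Host & { asked: string[] } {
  const asked: string[] = [];
  return {
    interactive,
    asked,
    confirm(message: string) {
      asked.push(message); // record the prompt so a test can inspect it
      return answer;
    },
  };
}
```

 With the fake host the guard runs outside the agent, and under Deno a check like that could be wrapped in `Deno.test`. Only the real `Host` implementation would have to live in the extensions folder.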
 This was clear enough that the slow, dense model could build a second extension (performance metrics in chat) with relatively little guidance after iterations on a plan.
 I haven't looked at the code yet. Not
 out of principle, but because it is late,
 and I want to write down my trial and error before I forget.
 TODO link pdf
 TODO add image from scan (in downloads)