edited blogpost
parent efc50d4ba8
commit 3c592bdbc7
@@ -1,104 +0,0 @@
%{
title: "Coding agent generates its' own extensions",
author: "Willem van den Ende",
tags: ~w(ai loops),
description: "Handwritten note",
published: false
}
---

I see a few people writing about
sharing struggles with Llm's.

for me it is easier to do at
the moment of a small success.

The challenge with writing about
my experiments is that it gets meta pretty quickly.

therefore I am going to leave out
a bunch of things, including this commentary.

the other day I tried to develop on
extension for pi - the shitty coding agent, that would stop a model when it
goes off the rails.

I now have a local model that
is fast, can call tools (edit Files, run test, etc) edit code etc.

It does, however, perform some model assisted coding quinks frequently:

- replace production code that Works With throwing an exception

- write if statements in tests
- add fallbacks for things that can't fail

- Find "problems" in that works
(passes tests + other checks, works for
the user etc)

Long form the Solution probably is to work
in small Steps. But these steps come From
experience.

Catching the hovel when it happens by
simply matching Some Lords is a starting point for that: scan for key words in
any edits on a prompt the user for permission, or abort when the session is not interactive.

Looks simple, so I let a more powerful
but slower local model figure out how to build an extensions - pi has a system prompt for that -. After cone iteration
we had a plan and pi generated d
plausible looking extension.

I tested it manually, in pi. Nothing happened. Back to the drawing board.

I had quite a few iterations, compared with
sample code, looked into the pi API,
70 luck.

Eventually I installed the sample extension. that worked. then I deleted most of my
extension, added some logging - I could
see sore thing.

I learned while a bit about pi and its extension mechanism.

It looks like only the last "UI notification" gets shown for any exension point (e.g. a fool call or system startup).

I am not get sure if this is by design or not.

I did take away the, here too, I wont to work test-first for parts that do not interface
with the agent directly. The feedback loop
is just too slow.

this also ren vired experimentation. I did not want to set up a separate project for
an extension that is little more than an idea. But I do want tests

to asked a model again. and suggestion was
to use Deno, because that has testing built
in. Some more Fiddlig Adowed:

- get Dena to work in the Sandbox
- Learn that pi auto loads any thing in
the excusions folder. If you put a test there,
pi crashes
- learn that" domain" files also don't work there.

So eventually I ended up with

```
- .pi/ test
/ core
/ extensions
```

Core contains the functional cores tests ter the core, exertion, is a thin integration with
pi that uses the core.

this was clear enough that the slow, dense model could build a second extension & performance metrics in chat.) with relatively little guidance after iterations on a plan.

I haven't looked at the code bet. not
out of principle, but because it is late,
and I want to write down my trial and error before I forget.

TODO link pdf
TODO add image from scan (in downloads)
@@ -0,0 +1,83 @@
%{
title: "Coding agent generates its own extensions",
author: "Willem van den Ende",
tags: ~w(ai loops),
description: "Handwritten note about generating extensions for a coding agent while you are in a session with it on something else. Engineer solutions in the moment.",
published: true
}
---

(This post was written longhand. Conversion to text was done with MyScript Notes. I made some minor manual edits to correct words and to explain a few things (in parentheses). See the Afterword for what happened since writing this.)


I see a few people writing about sharing struggles with large language models, saying we are all still figuring this out.

For me it is easier to do at the moment of a small success.

The other day I tried to develop an extension for [Pi](https://shittycodingagent.ai) that would stop a model when it goes off the rails.

I now have a local model that is fast and can call tools (search, run tests, edit code, etc.). It does, however, frequently show some quirks of model-assisted coding:

- Replace working production code with code that throws an exception
- Write if statements in tests
- Add fallbacks for things that can't fail
- Find "problems" in code that works (passes tests and other checks, works for the user, etc.)

Long term the solution probably is to work in small steps. But these steps come from experience.

Catching the problem when it happens by simply matching some words is a starting point for that: scan for keywords in any edits and prompt the user for permission. Just abort when the session is not interactive.
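
The keyword check itself needs nothing from the agent, so it can be sketched as plain functions. This is a minimal sketch, not the extension from this post: the phrase list and the names `RISKY_PHRASES`, `findRiskyPhrases` and `decide` are made up for illustration, and nothing here uses Pi's API.

```typescript
// Hypothetical functional core: no dependency on Pi or its types.
// The phrase list is illustrative only; tune it to your own model's habits.
const RISKY_PHRASES: readonly string[] = [
  "not implemented",
  "fallback",
  "throw new error",
];

type Decision = "allow" | "ask" | "abort";

// Return every risky phrase that occurs in the edit, case-insensitively.
function findRiskyPhrases(editText: string): string[] {
  const haystack = editText.toLowerCase();
  return RISKY_PHRASES.filter((phrase) => haystack.includes(phrase));
}

// Clean edits pass. Flagged edits need a human, or stop when nobody is there.
function decide(editText: string, interactive: boolean): Decision {
  if (findRiskyPhrases(editText).length === 0) return "allow";
  return interactive ? "ask" : "abort";
}
```

Plain substring matching will produce false positives (a legitimate `fallback` variable, for instance), which is why an interactive session asks for permission instead of refusing outright.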

Looks simple, so I let a more powerful but slower local model figure out how to build an extension. Every Pi session opens with an invitation:

> Pi can explain its own features and look up its docs. Ask it how to use or extend Pi.

After some iteration we had a plan and Pi generated a plausible-looking extension.

I tested it manually, in Pi. Nothing happened. Back to the drawing board.

I went through quite a few iterations, compared with sample code, looked into the Pi API.
No luck.

Eventually I installed the sample extension. That worked. Then I deleted most of my
extension, added some logging, and I could see something.

I learned quite a bit about Pi and its extension mechanism.

It looks like only the last "UI notification" gets shown for any extension point (e.g. a tool call or system startup). I am not yet sure if this is by design or not.
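
If only the last notification survives, one possible workaround is to collect messages and send them as a single notification. This is a sketch of that idea, not something I have confirmed against Pi: `NotificationBatch` and the `Notify` callback type are hypothetical, and the callback stands in for whatever the host offers for showing a message.

```typescript
// Hypothetical helper, not part of Pi: collect messages, emit one notification.
type Notify = (message: string) => void;

class NotificationBatch {
  private readonly messages: string[] = [];

  // Remember a message instead of showing it right away.
  add(message: string): void {
    this.messages.push(message);
  }

  // Send everything collected so far as a single message, then reset.
  flush(notify: Notify): void {
    if (this.messages.length === 0) return;
    notify(this.messages.join("\n"));
    this.messages.length = 0;
  }
}
```

An extension would call `add` wherever it wants to report something and `flush` once at the end of the extension point.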

I did take away that, here too, I want to work test-first for parts that do not interface
with the agent directly. The feedback loop is just too slow otherwise.

This also required experimentation. I did not want to set up a separate project for an extension that is little more than an idea. But I do want tests. So I asked a model again. The suggestion was to use Deno, because that has testing built in. Some more fiddling followed:

- Get [deno](https://deno.com/) to work in the [nono](https://nono.sh/cli) sandbox
- Learn that Pi auto-loads anything in the extensions folder. If you put a test there, Pi crashes. (The test does not have the method that defines an extension. All files in `.pi/agent/extensions` must have it.)
- Learn that "domain" files also don't work there. (I wanted to keep the extension files thin and the testable functions separate, so that the tests don't depend on `pi` and its types.)

![handwritten draft of the paragraph below](/images/blog/2026/local-llm-handwriting.png)


So eventually I ended up with

```
- .pi / test
      / core
      / extensions
```

`Core` contains the functional cores, `test` tests the core. `Extensions` is a thin integration with Pi that uses the core. (the Deno project lives in )
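
To make the split concrete, here is a sketch of what a core function and its thin wrapper could look like. Everything in it is an assumption for illustration: the file names in the comments, the tokens-per-second metric, and especially the `Host` interface, which only stands in for whatever Pi hands to an extension and is not Pi's real API.

```typescript
// core/metrics.ts (hypothetical): pure functions, no Pi imports.
function tokensPerSecond(tokenCount: number, elapsedMs: number): number {
  if (elapsedMs <= 0) return 0; // avoid dividing by zero on instant replies
  return (tokenCount / elapsedMs) * 1000;
}

function formatMetrics(tokenCount: number, elapsedMs: number): string {
  const rate = tokensPerSecond(tokenCount, elapsedMs).toFixed(1);
  return `${tokenCount} tokens, ${rate} tok/s`;
}

// extensions/metrics.ts (hypothetical): the only part that touches the host.
// `Host` is a stand-in for the agent's extension interface, NOT Pi's real API.
interface Host {
  notify(message: string): void;
}

function reportMetrics(host: Host, tokenCount: number, elapsedMs: number): void {
  host.notify(formatMetrics(tokenCount, elapsedMs));
}
```

The tests only need the core functions and, at most, a fake `Host` that records what it was told. With Deno the checks would sit inside `Deno.test(...)` blocks in the `test` folder, outside the folder that Pi auto-loads.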

This was clear enough that the slow, dense model could build a second extension and performance metrics in chat, with relatively little guidance after iterations on a plan.

I haven't looked at the code in detail yet. Not out of principle, but because it is late,
and I want to write down my trial and error before I forget.

Afterword
---

I hope you enjoyed this slowly written note. I have added the
[handwritten draft](/images/blog/2026/coding-agent-generates-extensions-handwritten.pdf) as PDF.

I found that writing in longhand helped me slow down and step away from the slot machine that wishcraft can be sometimes.
Binary file not shown.
BIN
app/priv/static/images/blog/2026/local-llm-handwriting.png
Normal file
Binary file not shown.
After: Size: 178 KiB