
Agentic Scratch Memory using Pensieve

Agentic coding is a phenomenal experience. I’ve been working with AI coding agents for a year now, primarily using Claude Code after trying a variety of models. But there’s one well-known problem: they have a ‘context window’, a limited memory for what they hold in their ‘agentic minds’. Context is nearly everything, because it guides them to approach a task from the right perspective and captures the small details that wire a project together, keeping the process focused and relevant.

What constitutes context? It could be the code already written, high-level architectural ideas, pitfalls to avoid, or observations about the domain the problem lives in. “Code” alone does not capture everything. My earlier approach was to write extensive specs and capture them as docs, which I then had to reference constantly while working on a task. It wasn’t scaling, and it wasn’t enough. The agents frequently forgot key details I had decided were essential and had documented; they weren’t bringing that knowledge to bear, and the result was weird spaghetti: code that solved the problem at hand but was inconsistent with earlier pieces. I solved most of this by following rigid structures (like templates) for different parts of the codebase, which nearly eliminated the inconsistency, but not entirely.

Depending on the programming stack you use, there’s still information loss about the domain under consideration unless you meticulously document it in the project. The agent wasn’t really “learning”. (This is not a meta discussion about what “learning” is and whether agents really do it.) My insight was that they lack “knowledge”: the kind we bring to bear when we learn a codebase and the domain it operates in, the little trickles of insight that accumulate into a knowledge repository. What they needed was a way to build this “knowledge” themselves, through working on a project, and to bring it to bear on the problem at hand.

How did I solve this before agents? The answer was actually simple: structured notes, a scratchpad filled with entries about different aspects of the codebase or domain. For example, I documented data flow, persistence methods, and design choices, the things that come up all the time in artisan coding. These notes acted as a personal memory aid, helping me recall critical details quickly and consistently during development.

But what do I mean by structured notes? You need to distinguish between free-form writing and structured writing. Free-form writing about a problem solved in a project would be a couple of paragraphs explaining what was solved. But depending on how good you are at capturing all the relevant details, the free-form approach becomes lossy and inconsistent. So I follow a system where I design a template for each type of learning. For instance, a “problem solved” template might capture the problem, the root cause, and what was learned. A “design principles” template might have fields for the principle itself (limited to 150 characters), the rationale behind it, and what anti-pattern it prevents. The template forces me to list the relevant details. And I design little templates for other things, like “patterns discovered” or “workarounds learned”. These are, by necessity, brief and distinct from project-level documentation. Having a structure enhances consistency and makes the gathered knowledge genuinely valuable.
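To make the idea concrete, here is a minimal sketch of what such templates might look like if expressed as code. The class names, fields, and the 150-character limit follow the examples above, but this is purely illustrative; it is not Pensieve’s actual schema.

```python
from dataclasses import dataclass

# Illustrative templates only -- not Pensieve's real data model.

@dataclass
class ProblemSolved:
    problem: str     # what went wrong
    root_cause: str  # why it went wrong
    lesson: str      # what was learned

@dataclass
class DesignPrinciple:
    principle: str   # the principle itself, kept deliberately short
    rationale: str   # why it holds in this codebase
    prevents: str    # the anti-pattern it guards against

    MAX_PRINCIPLE_LEN = 150  # hard limit keeps entries terse and scannable

    def __post_init__(self) -> None:
        # Enforcing the limit at creation time is what makes the
        # discipline stick: an over-long entry is rejected outright.
        if len(self.principle) > self.MAX_PRINCIPLE_LEN:
            raise ValueError(
                f"principle exceeds {self.MAX_PRINCIPLE_LEN} characters"
            )
```

The point of the hard limit is that a field either fits the template or the entry is rejected; there is no drifting into free-form prose.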

Now, how do I make the agents follow such a system? Define templates, and have them used whenever the context dictates, ideally as seamlessly as jotting down a thought.


Enter Pensieve, a scratchpad for AI agents: a command-line interface (CLI) tool that agents can use to define templates, record notes in them, and recall them as needed. Agents are very good at using CLI-based tools, and with a little prompting they discovered how to use it on their own. The agents designed templates in collaboration with me for each project I was working on and started filling them in as we tackled tasks together. Slowly, as we started new sessions, the agent began recollecting past learnings, informing the task at hand and making it more guided and precise than before. I no longer needed to repeat the same information over and over.
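Why does a CLI suit agents so well? Plain commands in, plain text out. The following toy sketch shows the define/record/recall pattern described above as a tiny Python CLI; every command name and flag here is hypothetical, invented for illustration, and bears no relation to Pensieve’s actual interface.

```python
import argparse
import json
import sys
from pathlib import Path

# Toy illustration of the template/record/recall loop.
# NOT Pensieve's interface -- all commands here are made up.

STORE = Path("scratchpad.json")

def load() -> dict:
    if STORE.exists():
        return json.loads(STORE.read_text())
    return {"templates": {}, "notes": []}

def save(db: dict) -> None:
    STORE.write_text(json.dumps(db, indent=2))

def main(argv=None) -> None:
    parser = argparse.ArgumentParser(prog="scratchpad")
    sub = parser.add_subparsers(dest="cmd", required=True)

    define = sub.add_parser("define")     # declare a template's fields
    define.add_argument("name")
    define.add_argument("fields", nargs="+")

    record = sub.add_parser("record")     # fill a template in
    record.add_argument("template")
    record.add_argument("pairs", nargs="+")  # field=value pairs

    sub.add_parser("recall")              # dump all notes as JSON lines

    args = parser.parse_args(argv)
    db = load()

    if args.cmd == "define":
        db["templates"][args.name] = args.fields
    elif args.cmd == "record":
        fields = dict(pair.split("=", 1) for pair in args.pairs)
        # The template acts as a contract: reject incomplete notes.
        missing = set(db["templates"][args.template]) - fields.keys()
        if missing:
            sys.exit(f"missing fields: {sorted(missing)}")
        db["notes"].append({"template": args.template, **fields})
    elif args.cmd == "recall":
        for note in db["notes"]:
            print(json.dumps(note))

    save(db)

if __name__ == "__main__":
    main()
```

Even this toy version shows the key property: the template is enforced at the point of capture, so the agent cannot record a half-filled note.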

But why exactly is this better than using CLAUDE.md or docs in the repo? Because those are unstructured, and agents follow their instructions inconsistently. The more you put in that kind of memory, the less likely the agent is to find and use any given piece. And without explicit limits, the agents generated verbose, pointless documentation that they themselves did not follow. Forcing the agents to create structured notes with well-defined fields, and LIMITING their length as needed, produced far more disciplined knowledge capture.

Besides templates, Pensieve also encourages agents to link entries with each other. Nothing is lost forever — a note can be “augmented,” “superseded,” or “deprecated” by another note. In most cases this is sufficient, and the agent is encouraged to follow these links when inspecting a note. This allows the agent to crawl through memory AS NEEDED for the task at hand, digging deeper and deeper into memory, building a “graph of sorts” in its agentic mind.
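The link-following behavior described above can be sketched in a few lines. The link kinds (“augments”, “supersedes”) and the note structure below are taken from the prose, but the representation is my own illustration, not Pensieve’s storage format.

```python
# Illustrative only -- not Pensieve's actual note format.
# Each note carries outgoing links: (kind, target_note_id) pairs.
notes = {
    1: {"text": "Use optimistic locking for order updates",
        "links": []},
    2: {"text": "Optimistic locking caused retry storms under load",
        "links": [("augments", 1)]},
    3: {"text": "Switched orders to queue-serialized writes",
        "links": [("supersedes", 2)]},
}

def crawl(note_id, seen=None):
    """Follow links depth-first, yielding each reachable note once.

    This is the 'digging deeper into memory' behavior: starting from
    one note, the agent can walk the whole chain of superseded and
    augmented entries, so nothing is ever lost outright.
    """
    seen = seen if seen is not None else set()
    if note_id in seen:
        return
    seen.add(note_id)
    yield note_id, notes[note_id]["text"]
    for _kind, target in notes[note_id]["links"]:
        yield from crawl(target, seen)
```

Starting from the newest note, the crawl surfaces the entire history behind it, which is exactly the “graph of sorts” the agent builds in context.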

You can check out demos of how an agent uses it. There’s also a standalone version of the CLI tool available, and if you’re using Claude Code, you can install it as a plugin from the Cittamaya marketplace.