Memory and Intelligence in LLMs

The intelligence shown by LLMs is interesting.

Here, by “memory” I mean context window size, though the two are not equivalent.

And by “intelligence” I don’t mean human intelligence, which quickly leads into metaphysical debates. Perhaps “cognitive ability to solve the task at hand” is a reasonably narrow definition.

At the advent of LLMs, the available context window was limited and took the longest to scale. This led to really funny problems when LLMs were tasked with long-running conversations or fed documents that ate up their context window. Halfway through an implementation, they would forget key details discussed earlier or declare partially complete tasks 💯 done.

I noticed that Gemini, which was the first to introduce longer contexts, actually fared poorly. The longer context simply led to a “dilution” of attention to key details, and to meandering.

Claude Opus, though, appears to have nailed it. The longer context helps it keep long-running sessions on track, with all the important learnings in “focus”. I suspect this is due to the nifty feature where it constantly produces mini-summaries, called “Insights”, and recaps within a session.
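The general idea behind such recaps can be sketched in a few lines. This is only an illustrative “rolling summary” pattern, not Claude’s actual Insights mechanism; the function names and the toy summarizer here are my own placeholders, and in practice the summarization step would itself be an LLM call.

```python
def summarize(turns):
    # Placeholder summarizer: a real system would call an LLM here.
    # We keep the first line of each old turn as a stand-in recap.
    return "RECAP: " + " | ".join(t.splitlines()[0] for t in turns)

def compact(history, max_turns=4, keep_recent=2):
    """Fold older turns into one recap once history exceeds max_turns.

    The recap stays in context, so key details from early in the
    session survive even after the raw turns are dropped.
    """
    if len(history) <= max_turns:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent

history = [f"turn {i}: details" for i in range(1, 7)]
compacted = compact(history)
# compacted is one recap line followed by the two most recent turns
```

The design choice worth noting is that the recap replaces the oldest turns rather than the whole history, so the model keeps both a compressed long-term memory and the verbatim recent exchange.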

This feature pairs beautifully with my pensieve Claude plugin for even longer-running projects.