I know everybody seems to want the agent to remember every conversation they've ever had with it, but I just don't see the value in that. In fact, it seems to hurt productivity to have the agent second-guessing me based on something I said yesterday. Every time I've used any memory system, the agent gets distracted from the current task by previous conversations and branches of development, often commingling unrelated projects (I work on code for work, open source projects, a bunch of unrelated side projects, etc.) and trying to satisfy requirements that don't make sense.
I've stopped trying to achieve general "memory". I just ask the agent to thoroughly, but concisely, document each project. If it writes developer documentation and a development plan/roadmap, as though a person was going to have to get up to speed and start working on the project, it provides all the information the agent needs tomorrow or next week to pick up where we left off.
The agent is not my friend. I don't need it to remember my birthday or the nasty thing I said about React last week. I need it to document what anyone, agent or human, would need to know to get productive in a particular repo, with no previous knowledge of the project.
Good, concise developer and user documentation and a plan with checklists solve every problem people seem to think "memory" will solve: it tells the agent what tech stack to use (we hashed it out in planning), it tells it what commands it needs to run and test the app, it covers the static analysis tools in use (which formalizes code style, etc. in a way a vague comment I made a month ago cannot), and it is cheap. Markdown files are the native tongue of agents. No MCP, no skills, no API needed. Just read the file. It works for any agent, any model, and any human just getting started with the project.
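To make the "plan with checklists" idea concrete, here is a minimal sketch of how an agent (or a script) could pick up where it left off from a Markdown plan. The checklist syntax (`- [ ]` / `- [x]`), the function name `next_open_task`, and the sample plan are assumptions for illustration, not something prescribed above.

```python
import re

# Matches task-list items such as "- [ ] Add login page" or "* [x] Done".
TASK = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")

def next_open_task(plan_text):
    """Return the first unchecked checklist item in a Markdown plan, or None."""
    for line in plan_text.splitlines():
        m = TASK.match(line)
        if m and m.group(1) == " ":
            return m.group(2).strip()
    return None

# Hypothetical plan file contents.
plan = """# Roadmap
- [x] Pick tech stack
- [ ] Add lint and test commands to README
- [ ] Implement export feature
"""

print(next_open_task(plan))
```

Nothing here needs a memory service: the state lives in the file, so any agent or human can read the same plan and find the same next step.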
Basically, I think memory makes agents dumber and less useful. I want it to focus on the task at hand.
I prefer ticketing systems for AI. I don't care that it forgets what I did last week; I just need it to be able to compact its own memory and grab the next task once done.
I'm ambivalent about that. I've seen people use beads, and they're just making busy work for the agents, splitting stuff up into tiny tasks that could have been one-shotted as part of the larger plan. They seem to just enjoy making thinky machine go brrr, even when it makes the work take longer and burn a lot more tokens.
I tend to think developing with agents should look a lot like managing a human (like, I use feature-branch development with PRs and review them, even on my own projects that have no other devs and don't need a paper trail for security audit purposes), so I theoretically can get down with an issue-based process, but thus far I haven't seen it done in a way that isn't just making busy work for agents.
It strikes me as funny how we want to get super AI intelligence but keep trying to anthropomorphize every aspect of AI to make it more "human". IMHO, if we keep doing it we will create Human AI with all the errors and deficiencies humans have.
Do you think humans don't have perfect memory because it's hard to achieve and millions of years of evolution haven't been able to? Or because it's convenient to forget in order to prioritize the more important recent information?
It's obviously the latter: a system that 'remembers everything perfectly' is probably not optimal in most senses. Mortality is a property of both life and artificial systems; forcing the same retention policy on new information and old information probably does so at the expense of lifespan or stability.
I only use a decay function to see how "hot" a chunk is - not for forgetting old ones. What concerns me more are memory chunks with errors in them - they need to be corrected/removed by some other mechanism, not by decay (since they might get retrieved often).
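A rough sketch of what "decay for hotness, not for forgetting" could look like. The exponential half-life form, the function name `hotness`, and the sample chunks are assumptions; the comment above does not say which decay function is actually used.

```python
def hotness(access_times, now, half_life=3600.0):
    """Sum of exponentially decayed accesses; recent accesses count more.

    The half-life (in seconds) is an assumed parameter for illustration.
    """
    return sum(0.5 ** ((now - t) / half_life) for t in access_times)

# Hypothetical chunks mapped to the times (in seconds) they were accessed.
chunks = {
    "build-commands": [0.0, 7000.0],
    "old-design-note": [100.0],
}
now = 7200.0

# Decay only affects ranking; no chunk is ever deleted by it.
ranked = sorted(chunks, key=lambda name: hotness(chunks[name], now), reverse=True)
print(ranked)
```

Note that a wrong chunk that gets retrieved often would stay hot under this scheme, which is exactly why correcting or removing it has to be a separate mechanism.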
On the other "biological memory" post in so many weeks, I pointed out that the decay rate shouldn't be based on a real clock but on the lifetime of its use within the coding session. Elsewise your memory fades even when there's no process change (e.g., the coder goes on vacation). I'm not going to check whether that's true here, but it seems like a naive first assumption that's a failure of conceptualization.
The other comment is that spatial memory is probably a better trigger for memory, so if you're not tracking where the coding session starts, the folders it visits, etc., then you're not really providing a good associative footpath for the assistant to retrieve what's important for any given project.
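A minimal sketch of the difference, assuming decay is measured in "ticks" of agent activity rather than seconds. `UsageClock`, `decayed_weight`, and the 50-tick half-life are all hypothetical names and values chosen for illustration.

```python
class UsageClock:
    """Counts agent actions instead of wall-clock seconds (hypothetical helper)."""

    def __init__(self):
        self.ticks = 0

    def tick(self):
        self.ticks += 1
        return self.ticks

def decayed_weight(last_used_tick, clock, half_life_ticks=50):
    """Weight halves every `half_life_ticks` actions since the memory was last used."""
    age = clock.ticks - last_used_tick
    return 0.5 ** (age / half_life_ticks)

clock = UsageClock()
stored_at = clock.tick()  # memory written during some agent action

# The coder goes on vacation here: no actions happen, so no ticks and no decay.
before = decayed_weight(stored_at, clock)

# Back at work: fifty more agent actions take place.
for _ in range(50):
    clock.tick()
after = decayed_weight(stored_at, clock)

print(before, after)
```

With a wall clock, the weight would have dropped during the vacation even though nothing about the project changed; with a usage clock it only drops as work actually happens.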
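One way to read "spatial memory" is to key each note by the directory the session was in when it was written, then rank notes by path overlap with the current directory. This is only a sketch under that assumption; `shared_depth`, `recall`, and the sample notes are made up for illustration, and paths are taken relative to a common root.

```python
from pathlib import PurePosixPath

def shared_depth(a, b):
    """Number of leading path components two directories have in common."""
    n = 0
    for x, y in zip(PurePosixPath(a).parts, PurePosixPath(b).parts):
        if x != y:
            break
        n += 1
    return n

def recall(notes, cwd, limit=2):
    """Rank notes by how much of their recorded directory overlaps with cwd."""
    scored = [(shared_depth(path, cwd), text) for path, text in notes]
    scored = [s for s in scored if s[0] > 0]  # drop notes from unrelated trees
    scored.sort(key=lambda s: s[0], reverse=True)
    return [text for _, text in scored[:limit]]

# Hypothetical notes: (directory where the note was recorded, note text).
notes = [
    ("work/billing/api", "Billing API uses pytest"),
    ("work/billing", "Deploys go through staging first"),
    ("side/game", "Game uses a fixed timestep"),
]

print(recall(notes, "work/billing/api/handlers"))
```

Notes from an unrelated side project never surface while working in the billing repo, which keeps separate projects from bleeding into each other.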
And a neural network is really just a composed, non-linear parameterized function that maps input vectors to output vectors. Sometimes metaphors or analogies do contribute something valuable.
me: "Hi AI, can you debug this SQL Statement?"
ai: "Well, based on your passion for garden hoses and extensive research of refrigerators, I'm going to guess you really want to discuss that"
What I do now is preserve all my Claude Code conversations and set the context from there.
This allows me to curate memory and it’s been the best way so far.
You said it cuts token usage by 84%, but isn't that typical for any chunked RAG system?
And why did you specifically choose to test against the LoMoCo dataset when there are a lot of issues with it and it's very easy to cheat?
https://en.wikipedia.org/wiki/Forgetting_curve