Building Agents That Never Forget

A first-principles walk through agent memory: from Python lists to markdown files to vector search to graph-vector hybrids, and finally, a clean, open-source solution for all of this.



An LLM is stateless by design. Every API call starts fresh. The "memory" you feel when chatting with ChatGPT is an illusion created by re-sending the entire conversation history with every request.

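A minimal sketch of that illusion (`fake_llm` is a hypothetical stand-in for a real chat-completion API): the model keeps nothing between calls, so the client re-sends the whole history every turn.

```python
def fake_llm(messages):
    """Stand-in for a chat-completion call; reports how much context it saw."""
    return f"(reply based on {len(messages)} messages)"

history = [{"role": "system", "content": "You are a helpful agent."}]

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    reply = fake_llm(history)  # the ENTIRE history goes out on every call
    history.append({"role": "assistant", "content": reply})
    return reply

chat("My name is Ada.")
print(chat("What is my name?"))  # answerable only because turn 1 was re-sent
```

Drop the re-send and the second question becomes unanswerable: the "memory" lives entirely in the client's list, not in the model.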

That trick works for casual chat. It falls apart the moment you try to build a real agent.


Here are seven failure modes that show up the instant you skip memory:


  1. Context amnesia: the agent asks for information you already gave it
  2. Zero personalization: every interaction feels generic
  3. Multi-step task failure: intermediate state silently drops mid-task
  4. Repeated mistakes: no episodic recall means the same errors, forever
  5. No knowledge accumulation: every session starts from scratch
  6. Hallucination from gaps: when context overflows, the model invents
  7. Identity collapse: no continuity, no trust
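A toy mitigation for failure mode 1 (context amnesia) hints at where the rest of the article is headed: persist facts outside the context window and consult them before asking again. `MemoryStore` and `ask_user_or_memory` are illustrative names, not from any library.

```python
class MemoryStore:
    """A trivially simple persistent fact store (a dict standing in for disk)."""

    def __init__(self):
        self.facts = {}

    def remember(self, key, value):
        self.facts[key] = value

    def recall(self, key):
        return self.facts.get(key)

def ask_user_or_memory(memory, key, ask_fn):
    """Only bother the user if the fact is not already stored."""
    cached = memory.recall(key)
    if cached is not None:
        return cached
    value = ask_fn(key)  # falls through to the user only on a true miss
    memory.remember(key, value)
    return value
```

The second time the agent needs `key`, it never reaches the user; that single indirection is the seed of every memory architecture discussed below.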

The obvious response is "throw more context at it." That's why 128K and 200K token windows feel like they should solve everything.


They don't.


Accuracy drops by over 30% when relevant information sits in the middle of a long context. This is the well-documented "lost in the middle" effect.

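One common mitigation, sketched here with an illustrative function name: interleave retrieval results so the highest-ranked documents land at the edges of the context, pushing the least relevant material into the middle where "lost in the middle" hurts least.

```python
def edge_first_order(docs_by_rank):
    """Place rank 1 first, rank 2 last, rank 3 second, rank 4 next-to-last..."""
    front, back = [], []
    for i, doc in enumerate(docs_by_rank):
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]

print(edge_first_order(["d1", "d2", "d3", "d4", "d5"]))
# the least relevant document ("d5") ends up in the middle
```

Reordering is a band-aid, not a fix: it spends the same tokens, just in better positions.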

Context is a shared budget: system prompts, retrieved docs, conversation history, and output all fight for the same tokens.

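A back-of-the-envelope sketch of that shared budget (all numbers and the trimming policy are illustrative): whatever the window size, the pieces must sum to at most the window, and something has to give when they don't.

```python
WINDOW = 128_000  # tokens

budget = {
    "system_prompt": 2_000,
    "retrieved_docs": 60_000,
    "history": 50_000,
    "output_reserve": 8_000,
}

def fits(budget, window=WINDOW):
    return sum(budget.values()) <= window

def trim_history(budget, window=WINDOW):
    """Shrink history first when over budget -- a common (and lossy) policy."""
    overflow = sum(budget.values()) - window
    if overflow > 0:
        budget["history"] = max(0, budget["history"] - overflow)
    return budget
```

Even the "fitting" allocation above uses 120K of 128K; one large retrieval result forces history to be thrown away, which is exactly how multi-step tasks silently lose state.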

Even at 100K tokens, the abs...
