Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents

arXiv: 2604.22085 · PDF

Authors: Seyed Moein Abtahi, Rasa Rahnema, Hetkumar Patel, Neel Patel, Majid Fekri, Tara Khani

Affiliations: Moorcheh AI, EdgeAI Innovations

Primary category: cs.AI · All: cs.AI

Matched keywords: large language model, agent, agentic, retrieval, inference, latency

TL;DR

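Memanto pairs a typed semantic memory with Moorcheh's information-theoretic retrieval engine and reports state-of-the-art accuracy of 89.8% on LongMemEval and 87.1% on LoCoMo, using a single retrieval query and no ingestion latency.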

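Key points

Challenges the prevailing assumption that agent memory must rely on knowledge-graph complexity.
Proposes Memanto, a general-purpose agentic memory layer that reaches SOTA with a single retrieval query.
Core components: a typed semantic memory schema with 13 categories, automatic conflict resolution, and temporal versioning.
Built on Moorcheh's Information Theoretic Search: no indexing, sub-90 ms latency, and zero ingestion latency.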

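Method

Memanto defines a unified typed semantic memory schema with 13 predefined memory categories, which replaces LLM-driven entity extraction and explicit graph schema maintenance. On the write path it supports automatic conflict resolution and temporal versioning. On the read path it connects to Moorcheh's no-indexing, information-theoretic semantic database and fetches results with a single query, removing the need for a multi-query pipeline.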
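The write path described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Memanto's implementation: the summary does not name the 13 categories, so the three `MemoryType` values are hypothetical, as are the `TypedMemoryStore` class, its method names, and the "newest write supersedes the previous value" conflict policy. Retrieval through Moorcheh is not modeled at all.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class MemoryType(Enum):
    # Hypothetical categories; the source says there are 13 but does not list them.
    PREFERENCE = "preference"
    FACT = "fact"
    EVENT = "event"


@dataclass
class MemoryVersion:
    value: str
    written_at: datetime
    superseded_at: Optional[datetime] = None  # None means "still current"


@dataclass
class TypedMemoryStore:
    # Maps (type, subject) to its version history, oldest first.
    _records: dict = field(default_factory=dict)

    def write(self, mtype: MemoryType, subject: str, value: str,
              now: Optional[datetime] = None) -> MemoryVersion:
        now = now or datetime.now(timezone.utc)
        history = self._records.setdefault((mtype, subject), [])
        if history:
            current = history[-1]
            if current.value == value:
                return current  # same value, nothing to resolve
            # Assumed policy: the newest write wins, the old value is kept as history.
            current.superseded_at = now
        version = MemoryVersion(value, now)
        history.append(version)
        return version

    def current(self, mtype: MemoryType, subject: str) -> Optional[str]:
        history = self._records.get((mtype, subject), [])
        return history[-1].value if history else None

    def as_of(self, mtype: MemoryType, subject: str, when: datetime) -> Optional[str]:
        # Temporal lookup: return the value that was valid at time `when`.
        for v in self._records.get((mtype, subject), []):
            if v.written_at <= when and (v.superseded_at is None or when < v.superseded_at):
                return v.value
        return None
```

Keying records by `(type, subject)` is what lets a conflict be detected without entity extraction or a graph: two writes to the same typed slot with different values are a conflict by construction. Calling `write` twice for the same slot and then `as_of` with a timestamp between the two writes returns the earlier value, while `current` returns the later one.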

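Experiments

Benchmarks: LongMemEval and LoCoMo, two long-horizon memory evaluation suites. Baselines cover existing hybrid graph systems and vector-based memory systems. The primary metric is accuracy, supplemented by operational dimensions: latency (<90 ms), ingestion cost, and number of retrieval queries. A five-stage progressive ablation additionally quantifies the contribution of each architectural component.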
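A progressive ablation is usually read as a sequence of marginal gains: each stage adds one component, and the component's contribution is the accuracy difference from the previous stage. The sketch below shows only that bookkeeping. The function name `marginal_gains` is hypothetical, and the stage labels and accuracy values in the usage note are placeholders, not numbers from the paper, which this summary does not report per stage.

```python
from typing import List, Tuple


def marginal_gains(stages: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return (stage_name, gain_over_previous_stage) for every stage after the first.

    `stages` must be ordered the way components were added in the ablation.
    """
    if len(stages) < 2:
        return []
    gains = []
    for (_, prev_acc), (name, acc) in zip(stages, stages[1:]):
        gains.append((name, acc - prev_acc))
    return gains
```

For example, with placeholder values `[("stage 1", 50.0), ("stage 2", 60.0), ("stage 3", 75.0)]` the function attributes 10.0 points to stage 2 and 15.0 points to stage 3. The attribution depends on the order in which components are added, which is a known limitation of progressive (as opposed to leave-one-out) ablations.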

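Results

Memanto scores 89.8% on LongMemEval and 87.1% on LoCoMo, surpassing every evaluated hybrid graph and vector baseline. It does so with a single retrieval query, zero ingestion cost, and substantially lower operational complexity, while retrieval latency stays under 90 ms.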

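Why it matters

For long-horizon agents and multi-session assistants, memory is a current deployment bottleneck. Memanto indicates that high-fidelity memory is achievable without a graph, which simplifies ingestion, lowers latency and cost, and offers an easier-to-scale alternative stack for production-grade agentic memory.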

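Relation to prior work

The work sits between recent hybrid semantic graph memory (for example, GraphRAG-style approaches) and pure vector-retrieval memory. It continues the long-horizon memory evaluation line of LongMemEval and LoCoMo and benchmarks directly against graph-based and vector-based agent memory systems.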

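Open questions

How well the 13-category schema covers real workloads and generalizes across domains.
The scaling curve of the Moorcheh engine at larger data sizes and higher concurrency is not disclosed.
Failure modes of conflict resolution under complex temporal semantics, and agreement with human judgment, are not evaluated.
Compatibility with real-time tool use and multimodal memory integration remains to be verified.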

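Original abstract (translated from the Chinese version)

The shift from stateless language model inference to persistent, multi-session autonomous agents has revealed memory as the primary architectural bottleneck in deploying production-grade agentic systems. Existing approaches rely heavily on hybrid semantic graph architectures, which impose considerable computational overhead during both ingestion and retrieval. Such systems typically require large language model mediated entity extraction, explicit graph schema maintenance, and multi-query retrieval pipelines. This paper presents Memanto, a general-purpose memory layer for agentic artificial intelligence that challenges the prevailing assumption that knowledge-graph complexity is necessary for high-fidelity agent memory. Memanto integrates a typed semantic memory schema with thirteen predefined memory categories, an automatic conflict resolution mechanism, and temporal versioning. These components are backed by Moorcheh's Information Theoretic Search engine, a no-indexing semantic database that provides deterministic retrieval with sub-90-millisecond latency while eliminating ingestion latency. In systematic benchmarking on the LongMemEval and LoCoMo evaluation suites, Memanto achieves state-of-the-art accuracy of 89.8% and 87.1%, respectively. These results surpass all evaluated hybrid graph and vector-based systems while requiring only a single retrieval query, incurring no ingestion cost, and maintaining substantially lower operational complexity. The paper also presents a five-stage progressive ablation study that quantifies the contribution of each architectural component, and discusses the implications for scalable deployment of agentic memory systems.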

Paper figures

Figure 1: Page 2 (rendered)

Figure 2: Page 3 (rendered)

Figure 3: Page 4 (rendered)