2026-04-23 Paper Digest
Automated digest of 10 arXiv papers on agent / LLM / AI infra submitted in the last 24h, analysed with Claude Code.
1. Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models
arXiv: 2604.21896 · cs.AI · relevance score 23
Nemobot is an interactive agentic environment that uses LLMs to build and deploy game-playing agents across Shannon’s taxonomy, spanning dictionary-based, solvable, heuristic, and learning-based games, aiming toward self-programming AI.
2. Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows
arXiv: 2604.21816 · cs.AI · relevance score 23
Tool Attention is a middleware layer that replaces MCP’s eager schema injection with intent-gated, lazy schema loading — cutting per-turn tool tokens by 95% in simulation and arguing that protocol efficiency, not context length, is the real bottleneck for scalable agentic systems.
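The core idea can be sketched in a few lines. This is a minimal illustrative mock, not the paper's implementation: the registry, keyword-based gate, and token counts are all invented here, and the paper's gating would be a learned intent model rather than keyword matching.

```python
# Toy sketch of intent-gated, lazy tool-schema loading, contrasted with
# eager per-turn injection of every schema. All names and numbers are
# illustrative, not from the paper.

TOOL_SCHEMAS = {
    "web_search": {"desc": "Search the web", "schema_tokens": 450},
    "file_read": {"desc": "Read a local file", "schema_tokens": 300},
    "calculator": {"desc": "Evaluate arithmetic", "schema_tokens": 120},
}

# Cheap keyword lists stand in for a learned intent-gating model.
INTENT_KEYWORDS = {
    "web_search": ["search", "look up", "find online"],
    "file_read": ["read file", "load file"],
    "calculator": ["compute", "sum", "multiply"],
}

def gate_tools(user_turn: str) -> list:
    """Return only the tools whose intent keywords match this turn."""
    turn = user_turn.lower()
    return [
        name
        for name, keywords in INTENT_KEYWORDS.items()
        if any(kw in turn for kw in keywords)
    ]

def build_context(user_turn: str) -> dict:
    """Load full schemas lazily, only for gated-in tools."""
    active = gate_tools(user_turn)
    eager_cost = sum(t["schema_tokens"] for t in TOOL_SCHEMAS.values())
    lazy_cost = sum(TOOL_SCHEMAS[n]["schema_tokens"] for n in active)
    return {"tools": active, "eager_tokens": eager_cost, "lazy_tokens": lazy_cost}
```

With the numbers above, a turn like "please compute the sum of 2 and 3" gates in only the calculator, injecting 120 schema tokens instead of the eager 870, which is the flavor of per-turn savings the paper reports at larger scale.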
3. Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents
arXiv: 2604.22085 · cs.AI · relevance score 20
Memanto is a memory layer for long-horizon LLM agents that replaces knowledge-graph pipelines with a typed semantic schema plus an information-theoretic retrieval engine, hitting 89.8% on LongMemEval and 87.1% on LoCoMo with single-query retrieval and no ingestion cost.
4. Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation
arXiv: 2604.21536 · cs.IR · relevance score 20
The paper proposes a knowledge distillation method that transfers LLM-generated textual user profiles into sequential recommender systems, enhancing user semantic understanding without incurring LLM inference costs at serving time.
5. MambaCSP: Hybrid-Attention State Space Models for Hardware-Efficient Channel State Prediction
arXiv: 2604.21957 · cs.IT · relevance score 20
MambaCSP replaces Transformer/LLM backbones for channel state prediction with a hybrid Mamba SSM augmented by lightweight patch-mixer attention, achieving 9–12% accuracy gains and up to 3× throughput over LLM baselines in MISO-OFDM simulations.
6. Trust but Verify: Introducing DAVinCI – A Framework for Dual Attribution and Verification in Claim Inference for Language Models
arXiv: 2604.21193 · cs.AI · relevance score 20
DAVinCI is a two-stage framework that combines claim attribution (to internal model components and external sources) with entailment-based verification and confidence calibration, improving factual reliability of LLM outputs by 5–20% over verification-only baselines on FEVER and CLIMATE-FEVER.
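The two-stage "attribute, then verify" shape of the pipeline can be mocked as follows. Everything here is a stand-in: the word-overlap attribution and the toy entailment score replace what in the paper would be learned attribution and entailment models with calibrated confidences.

```python
# Hedged sketch of a dual attribution + verification pipeline.
# Word overlap stands in for learned attribution and entailment scoring.

def attribute(claim: str, sources: list) -> str:
    """Stage 1: attribute the claim to the source with the most word overlap."""
    claim_words = set(claim.lower().split())
    return max(sources, key=lambda s: len(claim_words & set(s.lower().split())))

def verify(claim: str, source: str) -> float:
    """Stage 2: toy entailment score = fraction of claim words found in source."""
    claim_words = set(claim.lower().split())
    source_words = set(source.lower().split())
    return len(claim_words & source_words) / max(len(claim_words), 1)

def check_claim(claim: str, sources: list, threshold: float = 0.5) -> dict:
    """Run both stages and emit a supported/unsupported verdict."""
    source = attribute(claim, sources)
    score = verify(claim, source)
    return {"source": source, "score": score, "supported": score >= threshold}
```

The point of the two-stage split is that attribution narrows the evidence before verification scores it, so a verification-only baseline that scores against all sources at once is the natural comparison.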
7. Emergent Strategic Reasoning Risks in AI: A Taxonomy-Driven Evaluation Framework
arXiv: 2604.22119 · cs.AI · relevance score 19
This paper introduces ESRRSim, a taxonomy-driven agentic framework for evaluating Emergent Strategic Reasoning Risks (ESRRs) in LLMs—behaviors like deception, evaluation gaming, and reward hacking. Across 11 reasoning LLMs, detection rates vary from 14.45% to 72.72%.
8. Lightweight Retrieval-Augmented Generation and Large Language Model-Based Modeling for Scalable Patient-Trial Matching
arXiv: 2604.22061 · cs.CL · relevance score 19
The paper proposes a lightweight framework combining RAG with LLM-based representation modeling for scalable patient-trial matching, matching end-to-end LLM performance on multiple public and real-world clinical datasets at substantially lower computational cost.
9. LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs
arXiv: 2604.22050 · cs.LG · relevance score 19
LayerBoost is a layer-aware attention reduction method that uses sensitivity analysis to selectively apply softmax attention, linear sliding-window attention, or no attention per layer, with quality recovered via a lightweight 10M-token distillation. It improves throughput by up to 68% at high concurrency while preserving quality.
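The per-layer assignment step can be sketched as a simple ranking rule. This is an assumption-laden illustration: the sensitivity scores, fractions, and three-way split below are invented, while the paper derives them from its own sensitivity analysis and recovers quality via distillation.

```python
# Illustrative layer-aware attention assignment: rank layers by a
# sensitivity score, keep full softmax attention for the most sensitive,
# downgrade mid-ranked layers to sliding-window, drop attention elsewhere.
# Scores and fractions are made up for the example.

def assign_attention(sensitivities, softmax_frac=0.5, window_frac=0.25):
    """Return one of 'softmax' / 'sliding_window' / 'none' per layer."""
    n = len(sensitivities)
    order = sorted(range(n), key=lambda i: sensitivities[i], reverse=True)
    n_soft = round(n * softmax_frac)
    n_win = round(n * window_frac)
    plan = ["none"] * n
    for rank, layer in enumerate(order):
        if rank < n_soft:
            plan[layer] = "softmax"
        elif rank < n_soft + n_win:
            plan[layer] = "sliding_window"
    return plan
```

For example, with per-layer sensitivities `[0.9, 0.1, 0.5, 0.3]` and the default fractions, the two most sensitive layers keep softmax attention, the next keeps a sliding window, and the least sensitive layer loses attention entirely.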
10. Enhancing Online Recruitment with Category-Aware MoE and LLM-based Data Augmentation
arXiv: 2604.21264 · cs.AI · relevance score 19
The paper proposes an LLM-enhanced Person-Job Fit (PJF) system combining chain-of-thought data augmentation for low-quality job descriptions with a category-aware Mixture of Experts module to better distinguish similar candidate-job pairs, yielding measurable gains in offline metrics and online A/B tests.