2026-04-23 Paper Digest
Automated digest of 10 arXiv papers on agent / LLM / AI infra submitted in the last 24h, analysed with Claude Code.
1. Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models
arXiv: 2604.21896 · cs.AI · relevance score 23
Nemobot is an interactive agentic environment that uses LLMs to build and deploy game-playing agents across Shannon’s taxonomy, spanning dictionary-based, solvable, heuristic, and learning-based games, aiming toward self-programming AI.
2. Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows
arXiv: 2604.21816 · cs.AI · relevance score 23
Tool Attention is a middleware layer that replaces MCP’s eager schema injection with intent-gated, lazy schema loading — cutting per-turn tool tokens by 95% in simulation and arguing that protocol efficiency, not context length, is the real bottleneck for scalable agentic systems.
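The core idea can be sketched in a few lines. This is a minimal illustrative mock, not the paper's implementation: the registry, keyword-based gate, and token counts are all invented here, and the paper's gating would be a learned intent model rather than keyword matching.

```python
# Toy sketch of intent-gated, lazy tool-schema loading, contrasted with
# eager per-turn injection of every schema. All names and numbers are
# illustrative, not from the paper.

TOOL_SCHEMAS = {
    "web_search": {"desc": "Search the web", "schema_tokens": 450},
    "file_read": {"desc": "Read a local file", "schema_tokens": 300},
    "calculator": {"desc": "Evaluate arithmetic", "schema_tokens": 120},
}

# Cheap keyword lists stand in for a learned intent-gating model.
INTENT_KEYWORDS = {
    "web_search": ["search", "look up", "find online"],
    "file_read": ["read file", "load file"],
    "calculator": ["compute", "sum", "multiply"],
}

def gate_tools(user_turn: str) -> list:
    """Return only the tools whose intent keywords match this turn."""
    turn = user_turn.lower()
    return [
        name
        for name, keywords in INTENT_KEYWORDS.items()
        if any(kw in turn for kw in keywords)
    ]

def build_context(user_turn: str) -> dict:
    """Load full schemas lazily, only for gated-in tools."""
    active = gate_tools(user_turn)
    eager_cost = sum(t["schema_tokens"] for t in TOOL_SCHEMAS.values())
    lazy_cost = sum(TOOL_SCHEMAS[n]["schema_tokens"] for n in active)
    return {"tools": active, "eager_tokens": eager_cost, "lazy_tokens": lazy_cost}
```

With the numbers above, a turn like "please compute the sum of 2 and 3" gates in only the calculator, injecting 120 schema tokens instead of the eager 870, which is the flavor of per-turn savings the paper reports at larger scale.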
3. Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents
arXiv: 2604.22085 · cs.AI · relevance score 20
Memanto is a memory layer for long-horizon LLM agents that replaces knowledge-graph pipelines with a typed semantic schema plus an information-theoretic retrieval engine, hitting 89.8% on LongMemEval and 87.1% on LoCoMo with single-query retrieval and no ingestion cost.
4. Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation
arXiv: 2604.21536 · cs.IR · relevance score 20
The paper proposes a knowledge distillation method that transfers LLM-generated textual user profiles into sequential recommender systems, enhancing user semantic understanding without incurring LLM inference costs at serving time.
5. MambaCSP: Hybrid-Attention State Space Models for Hardware-Efficient Channel State Prediction
arXiv: 2604.21957 · cs.IT · relevance score 20
MambaCSP replaces Transformer/LLM backbones for channel state prediction with a hybrid Mamba SSM augmented by lightweight patch-mixer attention, achieving 9–12% accuracy gains and up to 3× throughput over LLM baselines in MISO-OFDM simulations.
6. Trust but Verify: Introducing DAVinCI – A Framework for Dual Attribution and Verification in Claim Inference for Language Models
arXiv: 2604.21193 · cs.AI · relevance score 20
DAVinCI is a two-stage framework that combines claim attribution (to internal model components and external sources) with entailment-based verification and confidence calibration, improving factual reliability of LLM outputs by 5–20% over verification-only baselines on FEVER and CLIMATE-FEVER.
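The two-stage "attribute, then verify" shape of the pipeline can be mocked as follows. Everything here is a stand-in: the word-overlap attribution and the toy entailment score replace what in the paper would be learned attribution and entailment models with calibrated confidences.

```python
# Hedged sketch of a dual attribution + verification pipeline.
# Word overlap stands in for learned attribution and entailment scoring.

def attribute(claim: str, sources: list) -> str:
    """Stage 1: attribute the claim to the source with the most word overlap."""
    claim_words = set(claim.lower().split())
    return max(sources, key=lambda s: len(claim_words & set(s.lower().split())))

def verify(claim: str, source: str) -> float:
    """Stage 2: toy entailment score = fraction of claim words found in source."""
    claim_words = set(claim.lower().split())
    source_words = set(source.lower().split())
    return len(claim_words & source_words) / max(len(claim_words), 1)

def check_claim(claim: str, sources: list, threshold: float = 0.5) -> dict:
    """Run both stages and emit a supported/unsupported verdict."""
    source = attribute(claim, sources)
    score = verify(claim, source)
    return {"source": source, "score": score, "supported": score >= threshold}
```

The point of the two-stage split is that attribution narrows the evidence before verification scores it, so a verification-only baseline that scores against all sources at once is the natural comparison.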
7. Emergent Strategic Reasoning Risks in AI: A Taxonomy-Driven Evaluation Framework
arXiv: 2604.22119 · cs.AI · relevance score 19
This paper introduces ESRRSim, a taxonomy-driven agentic framework for evaluating Emergent Strategic Reasoning Risks (ESRRs) in LLMs—behaviors like deception, evaluation gaming, and reward hacking. Across 11 reasoning LLMs, detection rates vary from 14.45% to 72.72%.
8. Lightweight Retrieval-Augmented Generation and Large Language Model-Based Modeling for Scalable Patient-Trial Matching
arXiv: 2604.22061 · cs.CL · relevance score 19
The paper proposes a lightweight framework combining RAG with LLM-based representation modeling for scalable patient-trial matching, matching end-to-end LLM performance on multiple public and real-world clinical datasets at substantially lower computational cost.
9. LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs
arXiv: 2604.22050 · cs.LG · relevance score 19
LayerBoost is a layer-aware attention reduction method that uses sensitivity analysis to selectively apply softmax attention, linear sliding-window attention, or no attention per layer, with quality recovered via a lightweight 10M-token distillation. It improves throughput by up to 68% at high concurrency while preserving quality.
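The per-layer assignment step can be sketched as a simple ranking rule. This is an assumption-laden illustration: the sensitivity scores, fractions, and three-way split below are invented, while the paper derives them from its own sensitivity analysis and recovers quality via distillation.

```python
# Illustrative layer-aware attention assignment: rank layers by a
# sensitivity score, keep full softmax attention for the most sensitive,
# downgrade mid-ranked layers to sliding-window, drop attention elsewhere.
# Scores and fractions are made up for the example.

def assign_attention(sensitivities, softmax_frac=0.5, window_frac=0.25):
    """Return one of 'softmax' / 'sliding_window' / 'none' per layer."""
    n = len(sensitivities)
    order = sorted(range(n), key=lambda i: sensitivities[i], reverse=True)
    n_soft = round(n * softmax_frac)
    n_win = round(n * window_frac)
    plan = ["none"] * n
    for rank, layer in enumerate(order):
        if rank < n_soft:
            plan[layer] = "softmax"
        elif rank < n_soft + n_win:
            plan[layer] = "sliding_window"
    return plan
```

For example, with per-layer sensitivities `[0.9, 0.1, 0.5, 0.3]` and the default fractions, the two most sensitive layers keep softmax attention, the next keeps a sliding window, and the least sensitive layer loses attention entirely.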
10. Enhancing Online Recruitment with Category-Aware MoE and LLM-based Data Augmentation
arXiv: 2604.21264 · cs.AI · relevance score 19
The paper proposes an LLM-enhanced Person-Job Fit (PJF) system combining chain-of-thought data augmentation for low-quality job descriptions with a category-aware Mixture of Experts module to better distinguish similar candidate-job pairs, yielding measurable gains in offline metrics and online A/B tests.