2026-04-22 Paper Digest

Automated digest of 10 arXiv papers on agent / LLM / AI infra submitted in the last 24h, analysed with Claude Code.

1. SAKE: Self-aware Knowledge Exploitation-Exploration for Grounded Multimodal Named Entity Recognition

arXiv: 2604.20146 · cs.IR · relevance score 32

SAKE is an end-to-end agentic framework for Grounded Multimodal Named Entity Recognition (GMNER) that blends internal MLLM knowledge with external retrieval via self-aware reasoning, deciding when to invoke search tools to handle long-tailed and unseen entities on social media.

Read detailed analysis →

2. HaS: Accelerating RAG through Homology-Aware Speculative Retrieval

arXiv: 2604.20452 · cs.IR · relevance score 26

HaS accelerates Retrieval-Augmented Generation by speculatively retrieving from a restricted scope, then validating candidates via “homologous query re-identification” — checking whether the incoming query matches a previously-seen one. This bypasses full-database search for repeat-like queries, cutting latency 24–37% with 1–2% accuracy loss.

Read detailed analysis →

3. Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelligent Systems

arXiv: 2604.20795 · cs.AI · relevance score 22

The paper proposes a hybrid architecture augmenting LLMs with an external RDF/OWL ontological memory layer, automatically constructed from heterogeneous sources, to enable persistent, verifiable, and semantically grounded reasoning beyond vector-based RAG.

Read detailed analysis →

4. Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models

arXiv: 2604.20994 · cs.CR · relevance score 21

This paper introduces Function Hijacking Attacks (FHA), a novel adversarial technique that manipulates agentic LLMs’ tool selection to force invocation of attacker-chosen functions, achieving 70-100% attack success rates across five models on the BFCL benchmark, largely independent of query semantics.

Read detailed analysis →

5. Cooperative Profiles Predict Multi-Agent LLM Team Performance in AI for Science Workflows

arXiv: 2604.20658 · cs.CL · relevance score 21

Authors benchmark 35 open-weight LLMs on six behavioral-economics games and show that the resulting “cooperative profiles” predict downstream team performance in AI-for-Science workflows under shared budget constraints, offering a cheap diagnostic for multi-agent deployment.

Read detailed analysis →

6. FASER: Fine-Grained Phase Management for Speculative Decoding in Dynamic LLM Serving

arXiv: 2604.20503 · cs.DC · relevance score 21

FASER is a fine-grained speculative-decoding scheduler for dynamic LLM serving that tunes speculative length per request, prunes rejected tokens early, and spatially overlaps draft and verification phases, yielding up to 53% higher throughput and 1.92× lower latency over SOTA in vLLM.

Read detailed analysis →

7. Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving

arXiv: 2604.20183 · cs.CL · relevance score 20

DCM-Agent is a training-free framework that resolves structural ambiguity in LLM-based optimization problem solving by maintaining dual clusters of historical solutions (modeling + coding), distilled into Approach/Checklist/Pitfall knowledge, and using them for memory-augmented inference.

Read detailed analysis →

8. EvoAgent: An Evolvable Agent Framework with Skill Learning and Multi-Agent Delegation

arXiv: 2604.20133 · cs.AI · relevance score 19

EvoAgent is an evolvable LLM agent framework combining structured skill learning, hierarchical sub-agent delegation, and a three-layer memory. On real-world foreign-trade tasks with GPT5.2, it lifts a five-dimensional LLM-as-Judge score by ~28%.

Read detailed analysis →

9. Agentic AI for Personalized Physiotherapy: A Multi-Agent Framework for Generative Video Training and Real-Time Pose Correction

arXiv: 2604.21154 · cs.AI · relevance score 19

Proposes a four-agent system that parses clinical notes, generates patient-specific exercise videos, tracks poses in real time, and delivers corrective feedback for at-home physiotherapy. The paper is largely architectural, presenting a prototype and evaluation plan rather than clinical results.

Read detailed analysis →

10. Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks

arXiv: 2604.20987 · cs.AI · relevance score 19

COSPLAY is a co-evolution framework pairing an LLM decision agent with a learnable skill bank: the decision agent retrieves skills to act, while a skill-pipeline agent mines reusable skills from unlabeled rollouts. An 8B model beats four frontier LLM baselines by >25% average reward on six game environments.

Read detailed analysis →