2026-04-28 Paper Digest

206 arXiv papers on agent / LLM / AI infra submitted that day matched our topic filter. 10 were hand-picked by Claude — using title + authors + affiliations — and received a full Claude-generated analysis; the remaining 196 are listed at the bottom.

1. FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training

arXiv: 2604.24013 · cs.LG · Claude pick

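FlashOverlap decomposes Reduce-Scatter and All-Gather into asynchronous P2P communication and schedules the sharded computation adaptively per rank, so that computing the last chunk of data no longer depends on communication. This removes the tail latency of data-partitioning overlap schemes: on an MLP with TP=4 and (b,s,d)=(32,4096,4096), communication overhead drops from 43.8 ms to 0.1 ms (a 99.8% reduction).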

Read detailed analysis →
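
As a quick sanity check on the headline number for FlashOverlap, the reported reduction follows directly from the two latencies quoted in the summary. The helper name `percent_reduction` is ours for illustration and does not come from the paper.

```python
def percent_reduction(before_ms: float, after_ms: float) -> float:
    """Relative drop from before_ms to after_ms, as a percentage."""
    return 100.0 * (before_ms - after_ms) / before_ms


# Latencies quoted in the summary: 43.8 ms before, 0.1 ms after.
reduction = percent_reduction(43.8, 0.1)
print(f"{reduction:.1f}%")  # 99.8%
```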

2. Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling

arXiv: 2604.24715 · cs.CL · Claude pick

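HyLo is a training recipe for upcycling a pretrained Transformer into a hybrid long-context model that combines MLA with Mamba2/GDN. Through staged long-context training and teacher distillation, it extends usable context to 32× and cuts the KV cache by more than 90%, and it significantly outperforms existing upcycling baselines such as Zebra-Llama on RULER.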

Read detailed analysis →

3. The Chameleon’s Limit: Investigating Persona Collapse and Homogenization in Large Language Models

arXiv: 2604.24698 · cs.CL · Claude pick

Ten LLMs asked to role-play 1,144 richly specified personas collapse into a narrow behavioral mode — agents converge despite distinct profiles. A geometric framework (Coverage, Uniformity, Complexity on a Behavioral Trait Matrix) plus item-level diagnostics shows collapse is multi-axis and task-contingent, and that the highest-fidelity models produce the most stereotyped populations.

Read detailed analysis →

4. Stabilizing Efficient Reasoning with Step-Level Advantage Selection

arXiv: 2604.24003 · cs.CL · Claude pick

Step-level Advantage Selection (SAS) zeros advantages for low-confidence steps in correct GRPO rollouts and high-confidence steps in verifier-failed rollouts, stabilizing short-context post-training. On five math benchmarks it lifts Pass@1 by 0.86 points over the strongest length-aware baseline while cutting reasoning length by 16.3%.

Read detailed analysis →
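
The SAS masking rule in the summary can be sketched in a few lines. This is a minimal illustration under our own assumptions, not the authors' implementation: the `Rollout` type, the fixed `threshold`, and the choice to broadcast one sequence-level advantage to every step are all hypothetical, and the summary does not say how step confidence is computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rollout:
    step_confidences: List[float]  # one confidence score per reasoning step (assumed given)
    advantage: float               # sequence-level advantage from GRPO
    is_correct: bool               # verifier outcome for the whole rollout


def select_step_advantages(rollout: Rollout, threshold: float = 0.5) -> List[float]:
    """Zero the advantage on steps that SAS, as summarized, excludes from the update."""
    selected = []
    for conf in rollout.step_confidences:
        low_conf = conf < threshold  # hypothetical fixed cutoff
        if rollout.is_correct and low_conf:
            selected.append(0.0)     # correct rollout: drop low-confidence steps
        elif not rollout.is_correct and not low_conf:
            selected.append(0.0)     # failed rollout: drop high-confidence steps
        else:
            selected.append(rollout.advantage)
    return selected


correct = Rollout(step_confidences=[0.9, 0.2, 0.7], advantage=1.0, is_correct=True)
failed = Rollout(step_confidences=[0.9, 0.2], advantage=-1.0, is_correct=False)
print(select_step_advantages(correct))  # [1.0, 0.0, 1.0]
print(select_step_advantages(failed))   # [0.0, -1.0]
```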

5. PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model

arXiv: 2604.24443 · cs.AI · Claude pick

Vision-Language Models (VLMs) have demonstrated strong performance on textbook-style physics problems, yet they frequently fail when confronted with dynamic real-world scenarios that require temporal consistency and causal reasoning across frames. We identify two fundamental challenges underlying these failures: (1) spatio-temporal identity drift, where objects lose their physical identity across successive frames and break causal chains, and (2) volatility of inference-time insights, where a model may occasionally produce correct physical reasoning but never consolidates it for future reuse.

Read detailed analysis →

6. Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer

arXiv: 2604.24062 · cs.AI · Claude pick

Using the OpenLock paradigm, the authors show that four frontier models (GPT-5.2, Claude-4.5-Sonnet, Gemini-3-Flash, DeepSeek-V3.2) can discover causal structures as efficiently as humans in text, but—unlike humans—fail to transfer Common Cause / Common Effect schemas to new environments until after an initial grounding solution, and are hurt rather than helped by visual input.

Read detailed analysis →

7. Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

arXiv: 2604.24512 · cs.AI · Claude pick

The paper formalizes the Attention Latch — a failure where multi-turn LLM agents stay anchored to stale goals — and proposes SSRP, an Architect/Executive split that auto-synthesizes per-task SOPs. On MultiWOZ 2.2 (9K trajectories), SSRP lifts GPT-5.4 from 0.1% to 71.6% on 3-hop semantic hijacking.

Read detailed analysis →

8. DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference

arXiv: 2604.24647 · cs.CL · Claude pick

DepthKV reallocates a fixed global KV-cache budget non-uniformly across transformer layers based on per-layer sensitivity to pruning, using InfoNCE-derived importance scores. At 60% global pruning, it consistently beats uniform pruning (e.g., H₂O) across summarization, QA, and GSM-∞ reasoning on Gemma-7B, LLaMA-3.1-8B, and Qwen2.5-7B.

Read detailed analysis →
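
To make "a fixed global budget, spent non-uniformly across layers" concrete, here is a minimal sketch of one possible allocation rule. It assumes positive per-layer sensitivity scores are already available and splits the budget in proportion to them; the proportional rule, the function name, and the example scores are our assumptions, since the summary does not specify how DepthKV turns its importance scores into budgets.

```python
from typing import List


def allocate_layer_budgets(sensitivities: List[float], seq_len: int,
                           global_keep_ratio: float) -> List[int]:
    """Split a global KV budget across layers in proportion to sensitivity.

    No layer can keep more than seq_len entries; budget that a capped layer
    cannot use is redistributed among the remaining layers.
    """
    num_layers = len(sensitivities)
    remaining = int(round(global_keep_ratio * seq_len * num_layers))
    budgets = [0] * num_layers
    active = list(range(num_layers))
    while active and remaining > 0:
        weight_sum = sum(sensitivities[i] for i in active)
        shares = {i: remaining * sensitivities[i] / weight_sum for i in active}
        capped = [i for i in active if shares[i] >= seq_len]
        if not capped:
            for i in active:
                budgets[i] = int(shares[i])  # flooring may leave a few entries unallocated
            break
        for i in capped:
            budgets[i] = seq_len
            remaining -= seq_len
            active.remove(i)
    return budgets


# 60% global pruning means keeping 40% of entries overall.
print(allocate_layer_budgets([1.0, 1.0, 2.0], seq_len=100, global_keep_ratio=0.4))
# [30, 30, 60]
```

Uniform pruning would give every layer 40 entries here; the sketch spends the same 120 entries but favors the layer marked as most sensitive.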

9. BitRL: Reinforcement Learning with 1-bit Quantized Language Models for Resource-Constrained Edge Deployment

arXiv: 2604.24273 · cs.LG · Claude pick

BitRL freezes a 2B-parameter BitNet b1.58 backbone (ternary weights {−1,0,+1}) and trains only small (~50K-param) PPO policy/value heads, yielding RL agents that retain 85–98% of FP16 performance with 10–16× memory reduction and 3–5× energy savings on a Raspberry Pi 4.

Read detailed analysis →
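
For readers unfamiliar with ternary weights, the sketch below shows absmean quantization, the scheme usually described for BitNet b1.58: scale by the mean absolute weight, round, and clip to {−1, 0, +1}. The summary does not say which quantizer BitRL's backbone uses, so treat this as an assumed illustration; `ternarize` and the sample weights are ours.

```python
from typing import List, Tuple


def ternarize(weights: List[float], eps: float = 1e-8) -> Tuple[List[int], float]:
    """Map float weights to {-1, 0, +1} plus one shared scale (absmean scheme)."""
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


q, scale = ternarize([0.9, -0.05, 0.4, -1.2])
print(q)  # [1, 0, 1, -1]
```

Each weight is then stored as one of three values, with a single float scale kept per tensor to approximately recover magnitudes.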

10. AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents

arXiv: 2604.24039 · cs.LG · Claude pick

AgenticCache caches 2-gram plan transitions for LLM-driven embodied agents, serving most planning decisions from a local cache while a background LLM updater asynchronously validates and corrects entries. Across 4 multi-agent benchmarks × 3 GPT-5 scales, it lifts success rate by 22% on average, cuts latency 65%, and reduces tokens 50%.

Read detailed analysis →
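
The cache-then-validate pattern described for AgenticCache can be sketched as follows. This is a toy under stated assumptions, not the paper's system: the class `BigramPlanCache`, the string-valued plan steps, and the `planner` callable standing in for the LLM are hypothetical, and the background updater is simulated by an explicit `run_updater()` call rather than a real asynchronous worker.

```python
from collections import deque
from typing import Callable, Deque, Dict


class BigramPlanCache:
    """Cache previous-step -> next-step transitions and validate them later."""

    def __init__(self, planner: Callable[[str], str]):
        self.planner = planner                 # stand-in for the LLM planner
        self.transitions: Dict[str, str] = {}  # 2-gram table: prev step -> next step
        self.pending: Deque[str] = deque()     # served keys awaiting validation
        self.hits = 0
        self.misses = 0

    def next_step(self, prev_step: str) -> str:
        if prev_step in self.transitions:
            self.hits += 1
            self.pending.append(prev_step)     # validate off the critical path
            return self.transitions[prev_step]
        self.misses += 1
        step = self.planner(prev_step)         # slow path: ask the planner
        self.transitions[prev_step] = step
        return step

    def run_updater(self) -> int:
        """Re-check served entries against the planner; return how many changed."""
        corrected = 0
        while self.pending:
            prev_step = self.pending.popleft()
            fresh = self.planner(prev_step)
            if self.transitions.get(prev_step) != fresh:
                self.transitions[prev_step] = fresh
                corrected += 1
        return corrected


plan = {"goto_kitchen": "open_fridge", "open_fridge": "grab_milk"}
cache = BigramPlanCache(lambda prev: plan.get(prev, "idle"))
first = cache.next_step("goto_kitchen")   # miss: planner is called
second = cache.next_step("goto_kitchen")  # hit: served from the cache
plan["goto_kitchen"] = "close_door"       # the right answer changes
corrected = cache.run_updater()           # updater repairs the stale entry
third = cache.next_step("goto_kitchen")
print(first, second, corrected, third)    # open_fridge open_fridge 1 close_door
```

The point of the design is that a cache hit returns immediately, while the cost of checking the entry is paid later by the updater.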

Other matched papers

These papers matched the same topic keywords but were not among Claude’s top-N deep-analysis picks.

Green Shielding: A User-Centric Approach Towards Trustworthy AI · cs.CL · arXiv 2604.24700 · score 27 — large language model, llm, agent, agentic, rag, serving
EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce · cs.CL · arXiv 2604.23993 · score 27 — llm, agent, agentic, multi-agent, retrieval, reasoning
JigsawRL: Assembling RL Pipelines for Efficient LLM Post-Training · cs.LG · arXiv 2604.23838 · score 25 — llm, agent, agentic, rag, parallelism, gpu
Kwai Summary Attention Technical Report · cs.CL · arXiv 2604.24432 · score 24 — large language model, agent, agentic, reasoning, inference, kv cache
Defusing the Trigger: Plug-and-Play Defense for Backdoored LLMs via Tail-Risk Intrinsic Geometric Smoothing · cs.CR · arXiv 2604.24162 · score 24 — large language model, llm, rag, reasoning, inference, serving
RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation · cs.SE · arXiv 2604.24218 · score 23 — large language model, llm, agent, agentic, multi-agent, rag
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis · cs.CL · arXiv 2604.24198 · score 26 — large language model, llm, agent, agentic, reasoning, inference
Constraint-Guided Multi-Agent Decompilation for Executable Binary Recovery · cs.SE · arXiv 2604.23940 · score 21 — llm, agent, agentic, multi-agent, rag, compiler
GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems · cs.CR · arXiv 2604.24477 · score 20 — large language model, llm, agent, multi-agent, inference
Agentic Witnessing: Pragmatic and Scalable TEE-Enabled Privacy-Preserving Auditing · cs.CR · arXiv 2604.24203 · score 20 — llm, agent, agentic, rag, reasoning, serving
Skill Retrieval Augmentation for Agentic AI · cs.CL · arXiv 2604.24594 · score 19 — large language model, llm, agent, agentic, retrieval
DPEPO: Diverse Parallel Exploration Policy Optimization for LLM-based Agents · cs.CL · arXiv 2604.24320 · score 19 — large language model, llm, agent, rag, reasoning, fine-tun
FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data · cs.AI · arXiv 2604.24572 · score 18 — llm, agent, agentic, multi-agent, reasoning
Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus · cs.AI · arXiv 2604.24473 · score 26 — llm, agent, agentic, retrieval, rag, reasoning
Leveraging LLMs for Multi-File DSL Code Generation: An Industrial Case Study · cs.SE · arXiv 2604.24678 · score 21 — large language model, llm, rag, serving, fine-tun
Strategic Bidding in 6G Spectrum Auctions with Large Language Models · cs.GT · arXiv 2604.24156 · score 17 — large language model, llm, agent, rag, reasoning
MEMCoder: Multi-dimensional Evolving Memory for Private-Library-Oriented Code Generation · cs.SE · arXiv 2604.24222 · score 24 — large language model, llm, retrieval, rag, inference
Latency and Cost of Multi-Agent Intelligent Tutoring at Scale · cs.CY · arXiv 2604.24110 · score 16 — llm, agent, multi-agent, throughput, latency
The Pragmatic Persona: Discovering LLM Persona through Bridging Inference · cs.CL · arXiv 2604.24079 · score 20 — large language model, llm, rag, reasoning, inference
LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People · cs.AI · arXiv 2604.23970 · score 16 — llm, agent, agentic, multi-agent
XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation · cs.AI · arXiv 2604.24623 · score 15 — large language model, llm, retrieval, rag, reasoning
SEARCH-R: Structured Entity-Aware Retrieval with Chain-of-Reasoning Navigator for Multi-hop Question Answering · cs.CL · arXiv 2604.24515 · score 15 — large language model, llm, retrieval, reasoning, fine-tun
OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents · cs.CL · arXiv 2604.24348 · score 15 — large language model, llm, agent, latency
Generating Place-Based Compromises Between Two Points of View · cs.CL · arXiv 2604.24536 · score 14 — large language model, llm, reasoning, inference
SeaEvo: Advancing Algorithm Discovery with Strategy Space Evolution · cs.CL · arXiv 2604.24372 · score 14 — llm, agent, retrieval, ai system
ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems · cs.AI · arXiv 2604.23878 · score 14 — llm, agent, rag, ai system
Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models · cs.IR · arXiv 2604.24608 · score 13 — large language model, llm, rag, attention
MEG-RAG: Quantifying Multi-modal Evidence Grounding for Evidence Selection in RAG · cs.CL · arXiv 2604.24564 · score 13 — large language model, llm, retrieval, rag
Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations · cs.AI · arXiv 2604.24562 · score 13 — large language model, llm, rag, reasoning
Why AI Harms Can’t Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality · cs.CY · arXiv 2604.24519 · score 13 — large language model, llm, ai system
A Multi-Dimensional Audit of Politically Aligned Large Language Models · cs.CL · arXiv 2604.24429 · score 13 — large language model, llm, reasoning, fine-tun
Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion · cs.LG · arXiv 2604.24351 · score 13 — rag, inference, serving, kv-cache
MultiDx: A Multi-Source Knowledge Integration Framework towards Diagnostic Reasoning · cs.CL · arXiv 2604.24186 · score 13 — large language model, llm, rag, reasoning
Coverage-Based Calibration for Post-Training Quantization via Weighted Set Cover over Outlier Channels · cs.LG · arXiv 2604.24008 · score 13 — large language model, rag, quantization, gpu, post-train
Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning · cs.LG · arXiv 2604.23987 · score 13 — large language model, llm, rag, fine-tun
What Did They Mean? How LLMs Resolve Ambiguous Social Situations across Perspectives and Roles · cs.HC · arXiv 2604.23942 · score 13 — large language model, llm, serving
Generative Synthetic Data for Causal Inference: Pitfalls, Remedies, and Opportunities · stat.ME · arXiv 2604.23904 · score 13 — llm, rag, inference, serving
Evaluation of Prompt Injection Defenses in Large Language Models · cs.CR · arXiv 2604.23887 · score 13 — large language model, llm, ai system
Knowledge Vector of Logical Reasoning in Large Language Models · cs.CL · arXiv 2604.23877 · score 13 — large language model, llm, rag, reasoning
Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models · cs.LG · arXiv 2604.24708 · score 12 — rag, serving, attention, gpu, scheduler
Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft · cs.AI · arXiv 2604.24697 · score 12 — llm, agent, ai system
The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications · cs.AI · arXiv 2604.24668 · score 12 — llm, agent, agentic
Evaluating whether AI models would sabotage AI safety research · cs.AI · arXiv 2604.24618 · score 12 — llm, agent, rag, reasoning
Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models · cs.CR · arXiv 2604.24542 · score 12 — large language model, llm, inference
From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills · cs.CL · arXiv 2604.24026 · score 12 — llm, agent, rag, reasoning
QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems · cs.AI · arXiv 2604.24021 · score 12 — llm, multi-agent, ai system
Fix Initial Codes and Iteratively Refine Textual Directions Toward Safe Multi-Turn Code Correction · cs.LG · arXiv 2604.23989 · score 12 — large language model, llm, inference
TSAssistant: A Human-in-the-Loop Agentic Framework for Automated Target Safety Assessment · cs.CL · arXiv 2604.23938 · score 12 — agent, agentic, multi-agent
Inverting Foundation Models of Brain Function with Simulation-Based Inference · cs.LG · arXiv 2604.23865 · score 12 — large language model, llm, inference
Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination · cs.CL · arXiv 2604.24690 · score 11 — large language model, llm, reasoning
Benchmarking Source-Sensitive Reasoning in Turkish: Humans and LLMs under Evidential Trust Manipulation · cs.CL · arXiv 2604.24665 · score 11 — large language model, llm, reasoning
K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology · cs.CL · arXiv 2604.24645 · score 11 — large language model, agent, reasoning
STELLAR-E: a Synthetic, Tailored, End-to-end LLM Application Rigorous Evaluator · cs.AI · arXiv 2604.24544 · score 11 — large language model, llm, rag
A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations · cs.CR · arXiv 2604.24468 · score 11 — large language model, llm, fine-tun
Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation · cs.CL · arXiv 2604.24361 · score 11 — large language model, llm, rag
AdapTime: Enabling Adaptive Temporal Reasoning in Large Language Models · cs.CL · arXiv 2604.24175 · score 11 — large language model, llm, reasoning
TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training · cs.DC · arXiv 2604.24088 · score 11 — llm, parallelism, quantization, throughput
QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering · cs.CV · arXiv 2604.24052 · score 11 — large language model, llm, rag
Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs · cs.AI · arXiv 2604.23949 · score 11 — large language model, llm, rag
SMSI: System Model Security Inference: Automated Threat Modeling for Cyber-Physical Systems · cs.CR · arXiv 2604.23905 · score 11 — llm, retrieval, inference, fine-tun
LLM-Augmented Traffic Signal Control with LSTM-Based Traffic State Prediction and Safety-Constrained Decision Support · cs.AI · arXiv 2604.23902 · score 11 — large language model, llm, reasoning
ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation · cs.AI · arXiv 2604.23853 · score 11 — llm, agent, tool use
One Size Fits None: Heuristic Collapse in LLM Investment Advice · cs.CL · arXiv 2604.23837 · score 11 — large language model, llm, reasoning
Resource-Lean Lexicon Induction for German Dialects · cs.CL · arXiv 2604.23824 · score 11 — large language model, llm, retrieval
Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters · cs.AI · arXiv 2604.24710 · score 10 — llm, agent, rag
GradMAP: Gradient-Based Multi-Agent Proximal Learning for Grid-Edge Flexibility · cs.LG · arXiv 2604.24549 · score 10 — agent, multi-agent, gpu
DPRM: A Plug-in Doob h transform-induced Token-Ordering Module for Diffusion Language Models · cs.LG · arXiv 2604.24357 · score 10 — llm, rag, reasoning, post-train
The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers · cs.CY · arXiv 2604.24155 · score 10 — agent, reasoning, ai system
Improving Robustness of Tabular Retrieval via Representational Stability · cs.CL · arXiv 2604.24040 · score 10 — retrieval, rag, serving, transformer
Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents · cs.AI · arXiv 2604.23990 · score 10 — agent, rag, serving
GAMED.AI: A Hierarchical Multi-Agent Framework for Automated Educational Game Generation · cs.AI · arXiv 2604.23947 · score 10 — agent, multi-agent, reasoning
Defective Task Descriptions in LLM-Based Code Generation: Detection and Analysis · cs.SE · arXiv 2604.24703 · score 9 — large language model, llm
AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents · cs.CR · arXiv 2604.24657 · score 9 — large language model, agent
Zero-shot Large Language Models for Automatic Readability Assessment · cs.CL · arXiv 2604.24470 · score 9 — large language model, llm
Can You Make It Sound Like You? Post-Editing LLM-Generated Text for Personal Style · cs.CL · arXiv 2604.24444 · score 9 — large language model, llm
Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment · cs.LG · arXiv 2604.24178 · score 9 — large language model, llm
Progressive Approximation in Deep Residual Networks: Theory and Validation · cs.LG · arXiv 2604.24154 · score 9 — llm, inference, transformer
An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress · cs.AI · arXiv 2604.24076 · score 9 — large language model, llm
A2DEPT: Large Language Model-Driven Automated Algorithm Design via Evolutionary Program Trees · cs.AI · arXiv 2604.24043 · score 9 — large language model, llm
Poster: ClawdGo: Endogenous Security Awareness Training for Autonomous AI Agents · cs.CR · arXiv 2604.24020 · score 9 — agent, rag, inference
IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models · cs.HC · arXiv 2604.24002 · score 9 — agent, reasoning, inference
When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models · cs.LG · arXiv 2604.23994 · score 9 — llm, inference, attention
Representational Curvature Modulates Behavioral Uncertainty in Large Language Models · cs.AI · arXiv 2604.23985 · score 9 — large language model, llm
Translate or Simplify First: An Analysis of Cross-lingual Text Simplification in English and French · cs.CL · arXiv 2604.23844 · score 9 — large language model, llm
Scalable Production Scheduling: Linear Complexity via Unified Homogeneous Graphs · cs.LG · arXiv 2604.23841 · score 9 — agent, inference, latency
Less Is More: Engineering Challenges of On-Device Small Language Model Integration in a Mobile Application · cs.SE · arXiv 2604.24636 · score 8 — llm, rag, latency
Interoceptive machine framework: Toward interoception-inspired regulatory architectures in artificial intelligence · cs.AI · arXiv 2604.24527 · score 8 — agent, ai system
Measuring Successful Cooperation in Human-AI Teamwork: Development and Validation of the Perceived Cooperativity and Teaming Perception Scales · cs.HC · arXiv 2604.24461 · score 8 — llm, agent
Characterizing Vision-Language-Action Models across XPUs: Constraints and Acceleration for On-Robot Deployment · cs.RO · arXiv 2604.24447 · score 8 — inference, parallelism, gpu
Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering · cs.CL · arXiv 2604.24334 · score 8 — retrieval, rag, serving
Adaptive ToR: Complexity-Aware Tree-Based Retrieval for Pareto-Optimal Multi-Intent NLU · cs.AI · arXiv 2604.24219 · score 8 — llm, retrieval, latency
Right-to-Act: A Pre-Execution Non-Compensatory Decision Protocol for AI Systems · cs.AI · arXiv 2604.24153 · score 8 — serving, ai system
An Analysis of the Coordination Gap between Joint and Modular Learning for Job Shop Scheduling with Transportation Resources · cs.AI · arXiv 2604.24117 · score 8 — agent, multi-agent
FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost · cs.LG · arXiv 2604.24073 · score 8 — rag, distributed training, gpu
DeepTaxon: An Interpretable Retrieval-Augmented Multimodal Framework for Unified Species Identification and Discovery · cs.CV · arXiv 2604.24029 · score 8 — retrieval, reasoning, chain-of-thought, fine-tun
Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions · cs.AI · arXiv 2604.23924 · score 8 — agent, agentic
MarketBench: Evaluating AI Agents as Market Participants · cs.AI · arXiv 2604.23897 · score 8 — llm, agent
Geometry Preserving Loss Functions Promote Improved Adaptation of Blackbox Generative Model · cs.LG · arXiv 2604.23888 · score 8 — rag, serving, fine-tun
Graph Memory Transformer (GMT) · cs.LG · arXiv 2604.23862 · score 8 — serving, attention, transformer
Does Machine Unlearning Preserve Clinical Safety? A Risk Analysis for Medical Image Classification · cs.AI · arXiv 2604.23854 · score 8 — serving, attention, fine-tun
The Last Human-Written Paper: Agent-Native Research Artifacts · cs.LG · arXiv 2604.24658 · score 7 — agent, compiler
CF-VLA: Efficient Coarse-to-Fine Action Generation for Vision-Language-Action Policies · cs.CV · arXiv 2604.24622 · score 7 — rag, inference, latency
Global Context or Local Detail? Adaptive Visual Grounding for Hallucination Mitigation · cs.CV · arXiv 2604.24396 · score 7 — rag, inference, attention
AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation · cs.RO · arXiv 2604.24086 · score 7 — inference, latency, fine-tun
Architectural Isolation as a Timing Safety Primitive for Edge AI Medical Devices: Controlled Experimental Evidence on a Shared-Silicon Platform · cs.AR · arXiv 2604.23831 · score 7 — inference, gpu, latency
Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents · cs.AI · arXiv 2604.24686 · score 6 — agent, rag
Meta-CoT: Enhancing Granularity and Generalization in Image Editing · cs.CV · arXiv 2604.24625 · score 6 — rag, reasoning, chain-of-thought
GSC-QEMit: A Telemetry-Driven Hierarchical Forecast-and-Bandit Framework for Adaptive Quantum Error Mitigation · quant-ph · arXiv 2604.24551 · score 6 — rag, serving
Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI · cs.CV · arXiv 2604.24492 · score 6 — latency, fine-tun, post-train
Incisor: Ex Ante Cloud Instance Selection for HPC Jobs · cs.DC · arXiv 2604.24464 · score 6 — llm, reasoning
Certified geometric robustness – Super-DeepG · cs.AI · arXiv 2604.24379 · score 6 — rag, reasoning, gpu
Learning Evidence of Depression Symptoms via Prompt Induction · cs.CL · arXiv 2604.24376 · score 6 — llm, fine-tun
SAGE: Sparse Adaptive Guidance for Dependency-Aware Tabular Data Generation · cs.LG · arXiv 2604.24368 · score 6 — llm, rag
SolarTformer: A Transformer Based Deep Learning Approach for Short Term Solar Power Forecasting · cs.LG · arXiv 2604.24306 · score 6 — rag, attention, transformer
Multi-Dimensional Evaluation of Sustainable City Trips with LLM-as-a-Judge and Human-in-the-Loop · cs.AI · arXiv 2604.24158 · score 6 — llm, reasoning
Leveraging Human Feedback for Semantically-Relevant Skill Discovery · cs.LG · arXiv 2604.24127 · score 6 — agent, rag
Psychologically-Grounded Graph Modeling for Interpretable Depression Detection · cs.CL · arXiv 2604.24126 · score 6 — llm, attention
Factual and Edit-Sensitive Graph-to-Sequence Generation via Graph-Aware Adaptive Noising · cs.CL · arXiv 2604.24104 · score 6 — llm, fine-tun
Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B · cs.CL · arXiv 2604.24070 · score 6 — llm, fine-tun
TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents · cs.LG · arXiv 2604.24005 · score 6 — agent, reasoning
Hindsight Preference Optimization for Financial Time Series Advisory · cs.LG · arXiv 2604.23988 · score 6 — llm, reasoning
Quantum Knowledge Graph: Modeling Context-Dependent Triplet Validity · cs.CL · arXiv 2604.23972 · score 6 — llm, reasoning
Do Quantum Transformers Help? A Systematic VQC Architecture Comparison on Tabular Benchmarks · quant-ph · arXiv 2604.23931 · score 6 — rag, attention, transformer
Gromov-Wasserstein Methods for Multi-View Relational Embedding and Clustering · cs.LG · arXiv 2604.23912 · score 6 — rag, serving
Learning Selective LLM Autonomy from Copilot Feedback in Enterprise Customer Support Workflows · cs.CL · arXiv 2604.23855 · score 6 — llm, rag
Contextual Linear Activation Steering of Language Models · cs.CL · arXiv 2604.24693 · score 5 — large language model
MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational and Progressive Information Chaining · cs.CL · arXiv 2604.24374 · score 5 — inference, attention
Machine-Learning-Based Classification of Radio Frequency Building Loss · cs.LG · arXiv 2604.24143 · score 5 — rag, inference
PeeriScope: A Multi-Faceted Framework for Evaluating Peer Review Quality · cs.CL · arXiv 2604.24071 · score 5 — large language model
Integrative neurocybernetic modeling in the era of large-scale neuroscience · q-bio.NC · arXiv 2604.23903 · score 5 — rag, inference
Conflict-Aware Harmonized Rotational Gradient for Multiscale Kinetic Regimes · cs.LG · arXiv 2604.24745 · score 4 — serving
Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling · cs.AI · arXiv 2604.24717 · score 4 — attention, transformer
Fraud Detection in Cryptocurrency Markets with Spatio-Temporal Graph Neural Networks · cs.LG · arXiv 2604.24590 · score 4 — attention, transformer
A systematic evaluation of vision-language models for observational astronomical reasoning tasks · cs.AI · arXiv 2604.24589 · score 4 — reasoning, attention
Dialysis Risk Prediction and Treatment Effect Estimation for AKI patients using Longitudinal Electronic Health Records · cs.LG · arXiv 2604.24547 · score 4 — rag, transformer
Understanding the Limits of Automated Evaluation for Code Review Bots in Practice · cs.SE · arXiv 2604.24525 · score 4 — llm
ARETE: Attention-based Rasterized Encoding for Topology Estimation using HSV-transformed Crowdsourced Vehicle Fleet Data · cs.CV · arXiv 2604.24353 · score 4 — attention, transformer
Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training · cs.LG · arXiv 2604.24350 · score 4 — rag, attention
Semantic Segmentation for Histopathology using Learned Regularization based on Global Proportions · eess.IV · arXiv 2604.24347 · score 4 — rag, transformer
Exact, Efficient, and Reliable Multi-Objective and Multi-Constrained IoT Workflow Scheduling in Edge-Hub-Cloud Cyber-Physical Systems · cs.DC · arXiv 2604.24340 · score 4 — rag, latency
Perfecting Aircraft Maneuvers with Reinforcement Learning · cs.LG · arXiv 2604.24338 · score 4 — agent
X-NegoBox: An Explainable Privacy-Budget Negotiation Framework for Secure Peer-to-Peer Energy Data Exchange · cs.CR · arXiv 2604.24326 · score 4 — serving
Differentiable Faithfulness Alignment for Cross-Model Circuit Transfer · cs.CL · arXiv 2604.24302 · score 4 — retrieval, reasoning
Latent-Hysteresis Graph ODEs: Modeling Coupled Topology-Feature Evolution via Continuous Phase Transitions · cs.LG · arXiv 2604.24293 · score 4 — serving
RowHammer Vulnerability Counter (RVC): Redefining RowHammer Detection with Victim-Centric Tracking · cs.CR · arXiv 2604.24287 · score 4 — rag, latency
Deep Learning-Enabled Dissolved Oxygen Sensing in Biofouling Environments for Ocean Monitoring · eess.IV · arXiv 2604.24236 · score 4 — rag, transformer
CMGL: Confidence-guided Multi-omics Graph Learning for Cancer Subtype Classification · cs.LG · arXiv 2604.24201 · score 4 — rag, fine-tun
IRIS: Interleaved Reinforcement with Incremental Staged Curriculum for Cross-Lingual Mathematical Reasoning · cs.CL · arXiv 2604.24114 · score 4 — reasoning, fine-tun
BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning · cs.CL · arXiv 2604.24089 · score 4 — serving
How Sensitive Are Safety Benchmarks to Judge Configuration Choices? · cs.CL · arXiv 2604.24074 · score 4 — llm
AgentPulse: A Continuous Multi-Signal Framework for Evaluating AI Agents in Deployment · cs.AI · arXiv 2604.24038 · score 4 — agent
FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection · cs.LG · arXiv 2604.24012 · score 4 — serving
Adaptive-Distribution Randomized Neural Networks for PDEs: A Low-Dimensional Distribution-Learning Framework · math.NA · arXiv 2604.23999 · score 4 — serving
DecompKAN: Decomposed Patch-KAN for Long-Term Time Series Forecasting · cs.LG · arXiv 2604.23968 · score 4 — attention, transformer
Crystal structure prediction using graph neural combinatorial optimization · cs.LG · arXiv 2604.23921 · score 4 — rag, gpu
Cardiac Stability Theory: An Axiomatically Grounded Framework for Continuous Cardiac Health Monitoring via Smartphone Photoplethysmography · cs.LG · arXiv 2604.23876 · score 4 — transformer, latency
Exploring Audio Hallucination in Egocentric Video Understanding · cs.CV · arXiv 2604.23860 · score 4 — llm
Focus on What Matters: Two-Stage ROI-Aware Refinement for Anatomy-Preserving Fetal Ultrasound Reconstruction · cs.CV · arXiv 2604.23839 · score 4 — serving
Cortex-Inspired Continual Learning: Unsupervised Instantiation and Recovery of Functional Task Networks · cs.LG · arXiv 2604.24637 · score 3 — inference
MIMIC: A Generative Multimodal Foundation Model for Biomolecules · cs.AI · arXiv 2604.24506 · score 3 — inference
Compilation and Execution of an Embeddable YOLO-NAS on the VTA · cs.AR · arXiv 2604.24455 · score 3 — compiler
SPLIT: Separating Physical-Contact via Latent Arithmetic in Image-Based Tactile Sensors · cs.RO · arXiv 2604.24449 · score 3 — inference
Scaling Properties of Continuous Diffusion Spoken Language Models · cs.CL · arXiv 2604.24416 · score 3 — inference
Model-Free Inference of Investor Preferences: A Relative Entropy IRL Approach · cs.LG · arXiv 2604.24280 · score 3 — inference
Speech Enhancement Based on Drifting Models · cs.SD · arXiv 2604.24199 · score 3 — inference
Learning to Think from Multiple Thinkers · cs.LG · arXiv 2604.24737 · score 2 — chain-of-thought
Déjà Vu Packing: Optimizing FPGA Logic Clustering Runtime via Pattern Memoization · cs.AR · arXiv 2604.24649 · score 2 — rag
NeSyCat: A Monad-Based Categorical Semantics of the Neurosymbolic ULLER Framework · cs.AI · arXiv 2604.24612 · score 2 — reasoning
Hierarchical Behaviour Spaces · cs.AI · arXiv 2604.24558 · score 2 — reasoning
SpotVista: Availability-Aware Recommendation System for Reliable and Cost-Efficient Multi-Node Spot Instances · cs.DC · arXiv 2604.24548 · score 2 — rag
A Reward-Free Viewpoint on Multi-Objective Reinforcement Learning · cs.LG · arXiv 2604.24532 · score 2 — rag
SceneSelect: Selective Learning for Trajectory Scene Classification and Expert Scheduling · cs.LG · arXiv 2604.24514 · score 2 — rag
Modeling Behavioral Intensity and Transitions for Generative Recommendation · cs.IR · arXiv 2604.24472 · score 2 — attention
All That Glitters Is Not Audio: Rethinking Text Priors and Audio Reliance in Audio-Language Evaluation · cs.SD · arXiv 2604.24401 · score 2 — rag
Few-Shot Cross-Device Transfer for Quantum Noise Modeling on Real Hardware · quant-ph · arXiv 2604.24397 · score 2 — fine-tun
PathMoG: A Pathway-Centric Modular Graph Neural Network for Multi-Omics Survival Prediction · cs.LG · arXiv 2604.24371 · score 2 — attention
See Further, Think Deeper: Advancing VLM’s Reasoning Ability with Low-level Visual Cues and Reflection · cs.CV · arXiv 2604.24339 · score 2 — reasoning
Mitigating Error Amplification in Fast Adversarial Training · cs.LG · arXiv 2604.24332 · score 2 — rag
Unconstrained Multi-view Human Pose Estimation with Algebraic Priors · cs.CV · arXiv 2604.24312 · score 2 — transformer
IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Extreme Convective Radar Nowcasting · cs.LG · arXiv 2604.24224 · score 2 — attention
Seeing Is No Longer Believing: Frontier Image Generation Models, Synthetic Visual Evidence, and Real-World Risk · cs.CL · arXiv 2604.24197 · score 2 — reasoning
MemeScouts@LT-EDI 2026: Asking the Right Questions – Prompted Weak Supervision for Meme Hate Speech Detection · cs.CL · arXiv 2604.24179 · score 2 — reasoning
A Divergence-Based Method for Weighting and Averaging Model Predictions · stat.ML · arXiv 2604.24172 · score 2 — rag
Unfolding an Atomistic World: Atomistic Simulation of Reactor Pressure Vessel Steel Across Year-and-Meter Scales · cs.DC · arXiv 2604.24091 · score 2 — rag
A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws · cs.LG · arXiv 2604.24037 · score 2 — rag
KubePACS: Kubernetes Cluster Using Performant, Highly Available, and Cost Efficient Spot Instances · cs.DC · arXiv 2604.24027 · score 2 — rag
Geometry-Aware Offline-to-Online Learning in Linear Contextual Bandits · cs.LG · arXiv 2604.24016 · score 2 — rag
SDSL-Solver: Scalable Distributed Sparse Linear Solvers for Large-Scale Interior Point Methods · cs.DC · arXiv 2604.23979 · score 2 — rag
Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction · cs.LG · arXiv 2604.23964 · score 2 — attention
Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach · cs.CV · arXiv 2604.23953 · score 2 — rag
KOMBO: Korean Character Representations Based on the Combination Rules of Subcharacters · cs.CL · arXiv 2604.23948 · score 2 — rag
Sliced-Regularized Optimal Transport · stat.ML · arXiv 2604.23944 · score 2 — rag
Quasi-Quadratic Gradient: A New Direction for Accelerating the BFGS Method in Quasi-Newton Optimization · math.OC · arXiv 2604.23922 · score 2 — rag
Machine Learning and Deep Learning Models for Short Term Electricity Price Forecasting in Australia’s National Electricity Market · cs.LG · arXiv 2604.23908 · score 2 — transformer
Learning Interpretable PDE Representations for Generative Reconstructions with Structured Sparsity · cs.LG · arXiv 2604.23867 · score 2 — rag
Domain-Filtered Knowledge Graphs from Sparse Autoencoder Features · cs.AI · arXiv 2604.23829 · score 2 — reasoning
