2026-04-28 Paper Digest

206 arXiv papers on agent / LLM / AI infra submitted that day matched our topic filter. 10 were hand-picked by Claude — using title + authors + affiliations — and received a full Claude-generated analysis; the remaining 196 are listed at the bottom.

1. FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training

arXiv: 2604.24013 · cs.LG · Claude pick

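FlashOverlap decomposes Reduce-Scatter and All-Gather into asynchronous P2P communication and adaptively schedules per-shard computation by rank, so that computation on the last chunk of data no longer depends on communication. This eliminates the tail latency of data-splitting schemes: on an MLP with TP=4 and (b,s,d)=(32,4096,4096), communication overhead drops from 43.8 ms to 0.1 ms (a 99.8% reduction).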


2. Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling

arXiv: 2604.24715 · cs.CL · Claude pick

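HyLo is a training recipe for upcycling a pretrained Transformer into a hybrid long-context model that combines MLA with Mamba2/GDN. Through staged long-context training and teacher distillation, it extends usable context to 32× and cuts KV cache by more than 90%, significantly outperforming existing upcycling baselines such as Zebra-Llama on RULER.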


3. The Chameleon’s Limit: Investigating Persona Collapse and Homogenization in Large Language Models

arXiv: 2604.24698 · cs.CL · Claude pick

Ten LLMs asked to role-play 1,144 richly specified personas collapse into a narrow behavioral mode — agents converge despite distinct profiles. A geometric framework (Coverage, Uniformity, Complexity on a Behavioral Trait Matrix) plus item-level diagnostics shows collapse is multi-axis and task-contingent, and that the highest-fidelity models produce the most stereotyped populations.


4. Stabilizing Efficient Reasoning with Step-Level Advantage Selection

arXiv: 2604.24003 · cs.CL · Claude pick

Step-level Advantage Selection (SAS) zeros advantages for low-confidence steps in correct GRPO rollouts and high-confidence steps in verifier-failed rollouts, stabilizing short-context post-training. On five math benchmarks it lifts Pass@1 by 0.86 points over the strongest length-aware baseline while cutting reasoning length by 16.3%.
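
The selection rule in this summary can be sketched in a few lines of Python. This is only an illustration of the masking idea as described above, not the authors' implementation: the function name, the per-step confidence inputs, and the fixed `threshold` are all assumptions, and the paper may define confidence and the cutoff differently.

```python
from typing import List


def select_step_advantages(
    rollout_advantage: float,
    step_confidences: List[float],
    is_correct: bool,
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[float]:
    """Return one advantage per reasoning step, zeroing the masked steps."""
    selected = []
    for conf in step_confidences:
        if is_correct:
            # Correct rollout: low-confidence steps receive zero advantage.
            masked = conf < threshold
        else:
            # Verifier-failed rollout: high-confidence steps receive zero advantage.
            masked = conf >= threshold
        selected.append(0.0 if masked else rollout_advantage)
    return selected


# A correct rollout keeps its advantage only on the confident steps.
print(select_step_advantages(1.0, [0.9, 0.2, 0.7], is_correct=True))
# A failed rollout keeps its (negative) advantage only on the unsure step.
print(select_step_advantages(-1.0, [0.9, 0.2, 0.7], is_correct=False))
```

In both cases the unmasked steps simply inherit the rollout-level advantage, which is how GRPO-style methods usually broadcast a sequence-level signal; that broadcast is also an assumption here.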


5. PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model

arXiv: 2604.24443 · cs.AI · Claude pick

Vision-Language Models (VLMs) have demonstrated strong performance on textbook-style physics problems, yet they frequently fail when confronted with dynamic real-world scenarios that require temporal consistency and causal reasoning across frames. We identify two fundamental challenges underlying these failures: (1) spatio-temporal identity drift, where objects lose their physical identity across successive frames and break causal chains, and (2) volatility of inference-time insights, where a model may occasionally produce correct physical reasoning but never consolidates it for future reuse.


6. Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer

arXiv: 2604.24062 · cs.AI · Claude pick

Using the OpenLock paradigm, the authors show that four frontier models (GPT-5.2, Claude-4.5-Sonnet, Gemini-3-Flash, DeepSeek-V3.2) can discover causal structures as efficiently as humans in text, but—unlike humans—fail to transfer Common Cause / Common Effect schemas to new environments until after an initial grounding solution, and are hurt rather than helped by visual input.


7. Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

arXiv: 2604.24512 · cs.AI · Claude pick

The paper formalizes the Attention Latch — a failure where multi-turn LLM agents stay anchored to stale goals — and proposes SSRP, an Architect/Executive split that auto-synthesizes per-task SOPs. On MultiWOZ 2.2 (9K trajectories), SSRP lifts GPT-5.4 from 0.1% to 71.6% on 3-hop semantic hijacking.


8. DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference

arXiv: 2604.24647 · cs.CL · Claude pick

DepthKV reallocates a fixed global KV-cache budget non-uniformly across transformer layers based on per-layer sensitivity to pruning, using InfoNCE-derived importance scores. At 60% global pruning, it consistently beats uniform pruning (e.g., H₂O) across summarization, QA, and GSM-∞ reasoning on Gemma-7B, LLaMA-3.1-8B, and Qwen2.5-7B.
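
To make "non-uniform allocation of a fixed global budget" concrete, here is a minimal Python sketch. It assumes a simple proportional rule (each layer keeps cache entries in proportion to its sensitivity score); the summary does not say which allocation rule DepthKV actually uses, and `allocate_kv_budget` and its parameters are hypothetical names introduced for illustration.

```python
from typing import List


def allocate_kv_budget(
    sensitivities: List[float],  # per-layer sensitivity scores (assumed given)
    tokens_per_layer: int,       # KV entries per layer before pruning
    prune_ratio: float,          # global fraction of entries to drop, e.g. 0.6
) -> List[int]:
    """Split one global KV budget across layers in proportion to sensitivity."""
    num_layers = len(sensitivities)
    total_budget = round((1.0 - prune_ratio) * tokens_per_layer * num_layers)
    total_sensitivity = sum(sensitivities)

    # Proportional share per layer, never more than the layer actually holds.
    budgets = [
        min(tokens_per_layer, int(total_budget * s / total_sensitivity))
        for s in sensitivities
    ]

    # Hand out what rounding and capping left over, most sensitive layers first.
    leftover = total_budget - sum(budgets)
    by_sensitivity = sorted(
        range(num_layers), key=lambda i: sensitivities[i], reverse=True
    )
    for i in by_sensitivity:
        if leftover <= 0:
            break
        extra = min(leftover, tokens_per_layer - budgets[i])
        budgets[i] += extra
        leftover -= extra
    return budgets


# Four layers, 1000 entries each, 60% global pruning: the sensitive first
# layer keeps more than the uniform 400 entries, the others keep fewer.
budgets = allocate_kv_budget([3.0, 1.0, 1.0, 1.0], tokens_per_layer=1000, prune_ratio=0.6)
print(budgets, sum(budgets))
```

The global total is the same as under uniform pruning; only its distribution across layers changes.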


9. BitRL: Reinforcement Learning with 1-bit Quantized Language Models for Resource-Constrained Edge Deployment

arXiv: 2604.24273 · cs.LG · Claude pick

BitRL freezes a 2B-parameter BitNet b1.58 backbone (ternary weights {−1,0,+1}) and trains only small (~50K-param) PPO policy/value heads, yielding RL agents that retain 85–98% of FP16 performance with 10–16× memory reduction and 3–5× energy savings on a Raspberry Pi 4.


10. AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents

arXiv: 2604.24039 · cs.LG · Claude pick

AgenticCache caches 2-gram plan transitions for LLM-driven embodied agents, serving most planning decisions from a local cache while a background LLM updater asynchronously validates and corrects entries. Across 4 multi-agent benchmarks × 3 GPT-5 scales, it lifts success rate by 22% on average, cuts latency by 65%, and reduces token usage by 50%.
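
A toy version of a 2-gram transition cache might look like the Python sketch below. It is an assumption-heavy illustration, not the paper's system: the class and method names are hypothetical, the cache key is just the previous plan step, and the background updater is modeled as a plain synchronous `revalidate` call rather than an asynchronous worker.

```python
from typing import Callable, Dict


class PlanTransitionCache:
    """Maps the previous plan step to the next one (a 2-gram transition)."""

    def __init__(self, planner: Callable[[str], str]) -> None:
        self._planner = planner  # stand-in for the slow LLM planner
        self._next_step: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def next_step(self, prev_step: str) -> str:
        cached = self._next_step.get(prev_step)
        if cached is not None:
            self.hits += 1  # served locally, no planner call
            return cached
        self.misses += 1
        step = self._planner(prev_step)
        self._next_step[prev_step] = step
        return step

    def revalidate(self, prev_step: str) -> bool:
        """Re-ask the planner and correct the entry; True if it changed."""
        fresh = self._planner(prev_step)
        changed = self._next_step.get(prev_step) != fresh
        self._next_step[prev_step] = fresh
        return changed


# Usage with a dictionary-backed planner stub in place of a real LLM.
policy = {"goto_kitchen": "open_fridge", "open_fridge": "grab_milk"}
cache = PlanTransitionCache(lambda prev: policy[prev])
cache.next_step("goto_kitchen")  # miss: falls through to the planner
cache.next_step("goto_kitchen")  # hit: answered from the cache
print(cache.hits, cache.misses)
```

Separating `next_step` from `revalidate` mirrors the split in the summary between fast cache lookups and slower correction of cached entries.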


Other matched papers

These papers matched the same topic keywords but were not among Claude’s top-N deep-analysis picks.

  1. Green Shielding: A User-Centric Approach Towards Trustworthy AI · cs.CL · arXiv 2604.24700 · score 27 · large language model, llm, agent, agentic, rag, serving
  2. EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce · cs.CL · arXiv 2604.23993 · score 27 · llm, agent, agentic, multi-agent, retrieval, reasoning
  3. JigsawRL: Assembling RL Pipelines for Efficient LLM Post-Training · cs.LG · arXiv 2604.23838 · score 25 · llm, agent, agentic, rag, parallelism, gpu
  4. Kwai Summary Attention Technical Report · cs.CL · arXiv 2604.24432 · score 24 · large language model, agent, agentic, reasoning, inference, kv cache
  5. Defusing the Trigger: Plug-and-Play Defense for Backdoored LLMs via Tail-Risk Intrinsic Geometric Smoothing · cs.CR · arXiv 2604.24162 · score 24 · large language model, llm, rag, reasoning, inference, serving
  6. RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation · cs.SE · arXiv 2604.24218 · score 23 · large language model, llm, agent, agentic, multi-agent, rag
  7. Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis · cs.CL · arXiv 2604.24198 · score 26 · large language model, llm, agent, agentic, reasoning, inference
  8. Constraint-Guided Multi-Agent Decompilation for Executable Binary Recovery · cs.SE · arXiv 2604.23940 · score 21 · llm, agent, agentic, multi-agent, rag, compiler
  9. GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems · cs.CR · arXiv 2604.24477 · score 20 · large language model, llm, agent, multi-agent, inference
  10. Agentic Witnessing: Pragmatic and Scalable TEE-Enabled Privacy-Preserving Auditing · cs.CR · arXiv 2604.24203 · score 20 · llm, agent, agentic, rag, reasoning, serving
  11. Skill Retrieval Augmentation for Agentic AI · cs.CL · arXiv 2604.24594 · score 19 · large language model, llm, agent, agentic, retrieval
  12. DPEPO: Diverse Parallel Exploration Policy Optimization for LLM-based Agents · cs.CL · arXiv 2604.24320 · score 19 · large language model, llm, agent, rag, reasoning, fine-tun
  13. FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data · cs.AI · arXiv 2604.24572 · score 18 · llm, agent, agentic, multi-agent, reasoning
  14. Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus · cs.AI · arXiv 2604.24473 · score 26 · llm, agent, agentic, retrieval, rag, reasoning
  15. Leveraging LLMs for Multi-File DSL Code Generation: An Industrial Case Study · cs.SE · arXiv 2604.24678 · score 21 · large language model, llm, rag, serving, fine-tun
  16. Strategic Bidding in 6G Spectrum Auctions with Large Language Models · cs.GT · arXiv 2604.24156 · score 17 · large language model, llm, agent, rag, reasoning
  17. MEMCoder: Multi-dimensional Evolving Memory for Private-Library-Oriented Code Generation · cs.SE · arXiv 2604.24222 · score 24 · large language model, llm, retrieval, rag, inference
  18. Latency and Cost of Multi-Agent Intelligent Tutoring at Scale · cs.CY · arXiv 2604.24110 · score 16 · llm, agent, multi-agent, throughput, latency
  19. The Pragmatic Persona: Discovering LLM Persona through Bridging Inference · cs.CL · arXiv 2604.24079 · score 20 · large language model, llm, rag, reasoning, inference
  20. LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People · cs.AI · arXiv 2604.23970 · score 16 · llm, agent, agentic, multi-agent
  21. XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation · cs.AI · arXiv 2604.24623 · score 15 · large language model, llm, retrieval, rag, reasoning
  22. SEARCH-R: Structured Entity-Aware Retrieval with Chain-of-Reasoning Navigator for Multi-hop Question Answering · cs.CL · arXiv 2604.24515 · score 15 · large language model, llm, retrieval, reasoning, fine-tun
  23. OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents · cs.CL · arXiv 2604.24348 · score 15 · large language model, llm, agent, latency
  24. Generating Place-Based Compromises Between Two Points of View · cs.CL · arXiv 2604.24536 · score 14 · large language model, llm, reasoning, inference
  25. SeaEvo: Advancing Algorithm Discovery with Strategy Space Evolution · cs.CL · arXiv 2604.24372 · score 14 · llm, agent, retrieval, ai system
  26. ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems · cs.AI · arXiv 2604.23878 · score 14 · llm, agent, rag, ai system
  27. Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models · cs.IR · arXiv 2604.24608 · score 13 · large language model, llm, rag, attention
  28. MEG-RAG: Quantifying Multi-modal Evidence Grounding for Evidence Selection in RAG · cs.CL · arXiv 2604.24564 · score 13 · large language model, llm, retrieval, rag
  29. Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations · cs.AI · arXiv 2604.24562 · score 13 · large language model, llm, rag, reasoning
  30. Why AI Harms Can’t Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality · cs.CY · arXiv 2604.24519 · score 13 · large language model, llm, ai system
  31. A Multi-Dimensional Audit of Politically Aligned Large Language Models · cs.CL · arXiv 2604.24429 · score 13 · large language model, llm, reasoning, fine-tun
  32. Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion · cs.LG · arXiv 2604.24351 · score 13 · rag, inference, serving, kv-cache
  33. MultiDx: A Multi-Source Knowledge Integration Framework towards Diagnostic Reasoning · cs.CL · arXiv 2604.24186 · score 13 · large language model, llm, rag, reasoning
  34. Coverage-Based Calibration for Post-Training Quantization via Weighted Set Cover over Outlier Channels · cs.LG · arXiv 2604.24008 · score 13 · large language model, rag, quantization, gpu, post-train
  35. Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning · cs.LG · arXiv 2604.23987 · score 13 · large language model, llm, rag, fine-tun
  36. What Did They Mean? How LLMs Resolve Ambiguous Social Situations across Perspectives and Roles · cs.HC · arXiv 2604.23942 · score 13 · large language model, llm, serving
  37. Generative Synthetic Data for Causal Inference: Pitfalls, Remedies, and Opportunities · stat.ME · arXiv 2604.23904 · score 13 · llm, rag, inference, serving
  38. Evaluation of Prompt Injection Defenses in Large Language Models · cs.CR · arXiv 2604.23887 · score 13 · large language model, llm, ai system
  39. Knowledge Vector of Logical Reasoning in Large Language Models · cs.CL · arXiv 2604.23877 · score 13 · large language model, llm, rag, reasoning
  40. Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models · cs.LG · arXiv 2604.24708 · score 12 · rag, serving, attention, gpu, scheduler
  41. Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft · cs.AI · arXiv 2604.24697 · score 12 · llm, agent, ai system
  42. The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications · cs.AI · arXiv 2604.24668 · score 12 · llm, agent, agentic
  43. Evaluating whether AI models would sabotage AI safety research · cs.AI · arXiv 2604.24618 · score 12 · llm, agent, rag, reasoning
  44. Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models · cs.CR · arXiv 2604.24542 · score 12 · large language model, llm, inference
  45. From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills · cs.CL · arXiv 2604.24026 · score 12 · llm, agent, rag, reasoning
  46. QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems · cs.AI · arXiv 2604.24021 · score 12 · llm, multi-agent, ai system
  47. Fix Initial Codes and Iteratively Refine Textual Directions Toward Safe Multi-Turn Code Correction · cs.LG · arXiv 2604.23989 · score 12 · large language model, llm, inference
  48. TSAssistant: A Human-in-the-Loop Agentic Framework for Automated Target Safety Assessment · cs.CL · arXiv 2604.23938 · score 12 · agent, agentic, multi-agent
  49. Inverting Foundation Models of Brain Function with Simulation-Based Inference · cs.LG · arXiv 2604.23865 · score 12 · large language model, llm, inference
  50. Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination · cs.CL · arXiv 2604.24690 · score 11 · large language model, llm, reasoning
  51. Benchmarking Source-Sensitive Reasoning in Turkish: Humans and LLMs under Evidential Trust Manipulation · cs.CL · arXiv 2604.24665 · score 11 · large language model, llm, reasoning
  52. K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology · cs.CL · arXiv 2604.24645 · score 11 · large language model, agent, reasoning
  53. STELLAR-E: a Synthetic, Tailored, End-to-end LLM Application Rigorous Evaluator · cs.AI · arXiv 2604.24544 · score 11 · large language model, llm, rag
  54. A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations · cs.CR · arXiv 2604.24468 · score 11 · large language model, llm, fine-tun
  55. Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation · cs.CL · arXiv 2604.24361 · score 11 · large language model, llm, rag
  56. AdapTime: Enabling Adaptive Temporal Reasoning in Large Language Models · cs.CL · arXiv 2604.24175 · score 11 · large language model, llm, reasoning
  57. TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training · cs.DC · arXiv 2604.24088 · score 11 · llm, parallelism, quantization, throughput
  58. QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering · cs.CV · arXiv 2604.24052 · score 11 · large language model, llm, rag
  59. Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs · cs.AI · arXiv 2604.23949 · score 11 · large language model, llm, rag
  60. SMSI: System Model Security Inference: Automated Threat Modeling for Cyber-Physical Systems · cs.CR · arXiv 2604.23905 · score 11 · llm, retrieval, inference, fine-tun
  61. LLM-Augmented Traffic Signal Control with LSTM-Based Traffic State Prediction and Safety-Constrained Decision Support · cs.AI · arXiv 2604.23902 · score 11 · large language model, llm, reasoning
  62. ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation · cs.AI · arXiv 2604.23853 · score 11 · llm, agent, tool use
  63. One Size Fits None: Heuristic Collapse in LLM Investment Advice · cs.CL · arXiv 2604.23837 · score 11 · large language model, llm, reasoning
  64. Resource-Lean Lexicon Induction for German Dialects · cs.CL · arXiv 2604.23824 · score 11 · large language model, llm, retrieval
  65. Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters · cs.AI · arXiv 2604.24710 · score 10 · llm, agent, rag
  66. GradMAP: Gradient-Based Multi-Agent Proximal Learning for Grid-Edge Flexibility · cs.LG · arXiv 2604.24549 · score 10 · agent, multi-agent, gpu
  67. DPRM: A Plug-in Doob h transform-induced Token-Ordering Module for Diffusion Language Models · cs.LG · arXiv 2604.24357 · score 10 · llm, rag, reasoning, post-train
  68. The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers · cs.CY · arXiv 2604.24155 · score 10 · agent, reasoning, ai system
  69. Improving Robustness of Tabular Retrieval via Representational Stability · cs.CL · arXiv 2604.24040 · score 10 · retrieval, rag, serving, transformer
  70. Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents · cs.AI · arXiv 2604.23990 · score 10 · agent, rag, serving
  71. GAMED.AI: A Hierarchical Multi-Agent Framework for Automated Educational Game Generation · cs.AI · arXiv 2604.23947 · score 10 · agent, multi-agent, reasoning
  72. Defective Task Descriptions in LLM-Based Code Generation: Detection and Analysis · cs.SE · arXiv 2604.24703 · score 9 · large language model, llm
  73. AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents · cs.CR · arXiv 2604.24657 · score 9 · large language model, agent
  74. Zero-shot Large Language Models for Automatic Readability Assessment · cs.CL · arXiv 2604.24470 · score 9 · large language model, llm
  75. Can You Make It Sound Like You? Post-Editing LLM-Generated Text for Personal Style · cs.CL · arXiv 2604.24444 · score 9 · large language model, llm
  76. Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment · cs.LG · arXiv 2604.24178 · score 9 · large language model, llm
  77. Progressive Approximation in Deep Residual Networks: Theory and Validation · cs.LG · arXiv 2604.24154 · score 9 · llm, inference, transformer
  78. An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress · cs.AI · arXiv 2604.24076 · score 9 · large language model, llm
  79. A2DEPT: Large Language Model-Driven Automated Algorithm Design via Evolutionary Program Trees · cs.AI · arXiv 2604.24043 · score 9 · large language model, llm
  80. Poster: ClawdGo: Endogenous Security Awareness Training for Autonomous AI Agents · cs.CR · arXiv 2604.24020 · score 9 · agent, rag, inference
  81. IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models · cs.HC · arXiv 2604.24002 · score 9 · agent, reasoning, inference
  82. When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models · cs.LG · arXiv 2604.23994 · score 9 · llm, inference, attention
  83. Representational Curvature Modulates Behavioral Uncertainty in Large Language Models · cs.AI · arXiv 2604.23985 · score 9 · large language model, llm
  84. Translate or Simplify First: An Analysis of Cross-lingual Text Simplification in English and French · cs.CL · arXiv 2604.23844 · score 9 · large language model, llm
  85. Scalable Production Scheduling: Linear Complexity via Unified Homogeneous Graphs · cs.LG · arXiv 2604.23841 · score 9 · agent, inference, latency
  86. Less Is More: Engineering Challenges of On-Device Small Language Model Integration in a Mobile Application · cs.SE · arXiv 2604.24636 · score 8 · llm, rag, latency
  87. Interoceptive machine framework: Toward interoception-inspired regulatory architectures in artificial intelligence · cs.AI · arXiv 2604.24527 · score 8 · agent, ai system
  88. Measuring Successful Cooperation in Human-AI Teamwork: Development and Validation of the Perceived Cooperativity and Teaming Perception Scales · cs.HC · arXiv 2604.24461 · score 8 · llm, agent
  89. Characterizing Vision-Language-Action Models across XPUs: Constraints and Acceleration for On-Robot Deployment · cs.RO · arXiv 2604.24447 · score 8 · inference, parallelism, gpu
  90. Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering · cs.CL · arXiv 2604.24334 · score 8 · retrieval, rag, serving
  91. Adaptive ToR: Complexity-Aware Tree-Based Retrieval for Pareto-Optimal Multi-Intent NLU · cs.AI · arXiv 2604.24219 · score 8 · llm, retrieval, latency
  92. Right-to-Act: A Pre-Execution Non-Compensatory Decision Protocol for AI Systems · cs.AI · arXiv 2604.24153 · score 8 · serving, ai system
  93. An Analysis of the Coordination Gap between Joint and Modular Learning for Job Shop Scheduling with Transportation Resources · cs.AI · arXiv 2604.24117 · score 8 · agent, multi-agent
  94. FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost · cs.LG · arXiv 2604.24073 · score 8 · rag, distributed training, gpu
  95. DeepTaxon: An Interpretable Retrieval-Augmented Multimodal Framework for Unified Species Identification and Discovery · cs.CV · arXiv 2604.24029 · score 8 · retrieval, reasoning, chain-of-thought, fine-tun
  96. Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions · cs.AI · arXiv 2604.23924 · score 8 · agent, agentic
  97. MarketBench: Evaluating AI Agents as Market Participants · cs.AI · arXiv 2604.23897 · score 8 · llm, agent
  98. Geometry Preserving Loss Functions Promote Improved Adaptation of Blackbox Generative Model · cs.LG · arXiv 2604.23888 · score 8 · rag, serving, fine-tun
  99. Graph Memory Transformer (GMT) · cs.LG · arXiv 2604.23862 · score 8 · serving, attention, transformer
  100. Does Machine Unlearning Preserve Clinical Safety? A Risk Analysis for Medical Image Classification · cs.AI · arXiv 2604.23854 · score 8 · serving, attention, fine-tun
  101. The Last Human-Written Paper: Agent-Native Research Artifacts · cs.LG · arXiv 2604.24658 · score 7 · agent, compiler
  102. CF-VLA: Efficient Coarse-to-Fine Action Generation for Vision-Language-Action Policies · cs.CV · arXiv 2604.24622 · score 7 · rag, inference, latency
  103. Global Context or Local Detail? Adaptive Visual Grounding for Hallucination Mitigation · cs.CV · arXiv 2604.24396 · score 7 · rag, inference, attention
  104. AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation · cs.RO · arXiv 2604.24086 · score 7 · inference, latency, fine-tun
  105. Architectural Isolation as a Timing Safety Primitive for Edge AI Medical Devices: Controlled Experimental Evidence on a Shared-Silicon Platform · cs.AR · arXiv 2604.23831 · score 7 · inference, gpu, latency
  106. Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents · cs.AI · arXiv 2604.24686 · score 6 · agent, rag
  107. Meta-CoT: Enhancing Granularity and Generalization in Image Editing · cs.CV · arXiv 2604.24625 · score 6 · rag, reasoning, chain-of-thought
  108. GSC-QEMit: A Telemetry-Driven Hierarchical Forecast-and-Bandit Framework for Adaptive Quantum Error Mitigation · quant-ph · arXiv 2604.24551 · score 6 · rag, serving
  109. Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI · cs.CV · arXiv 2604.24492 · score 6 · latency, fine-tun, post-train
  110. Incisor: Ex Ante Cloud Instance Selection for HPC Jobs · cs.DC · arXiv 2604.24464 · score 6 · llm, reasoning
  111. Certified geometric robustness – Super-DeepG · cs.AI · arXiv 2604.24379 · score 6 · rag, reasoning, gpu
  112. Learning Evidence of Depression Symptoms via Prompt Induction · cs.CL · arXiv 2604.24376 · score 6 · llm, fine-tun
  113. SAGE: Sparse Adaptive Guidance for Dependency-Aware Tabular Data Generation · cs.LG · arXiv 2604.24368 · score 6 · llm, rag
  114. SolarTformer: A Transformer Based Deep Learning Approach for Short Term Solar Power Forecasting · cs.LG · arXiv 2604.24306 · score 6 · rag, attention, transformer
  115. Multi-Dimensional Evaluation of Sustainable City Trips with LLM-as-a-Judge and Human-in-the-Loop · cs.AI · arXiv 2604.24158 · score 6 · llm, reasoning
  116. Leveraging Human Feedback for Semantically-Relevant Skill Discovery · cs.LG · arXiv 2604.24127 · score 6 · agent, rag
  117. Psychologically-Grounded Graph Modeling for Interpretable Depression Detection · cs.CL · arXiv 2604.24126 · score 6 · llm, attention
  118. Factual and Edit-Sensitive Graph-to-Sequence Generation via Graph-Aware Adaptive Noising · cs.CL · arXiv 2604.24104 · score 6 · llm, fine-tun
  119. Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B · cs.CL · arXiv 2604.24070 · score 6 · llm, fine-tun
  120. TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents · cs.LG · arXiv 2604.24005 · score 6 · agent, reasoning
  121. Hindsight Preference Optimization for Financial Time Series Advisory · cs.LG · arXiv 2604.23988 · score 6 · llm, reasoning
  122. Quantum Knowledge Graph: Modeling Context-Dependent Triplet Validity · cs.CL · arXiv 2604.23972 · score 6 · llm, reasoning
  123. Do Quantum Transformers Help? A Systematic VQC Architecture Comparison on Tabular Benchmarks · quant-ph · arXiv 2604.23931 · score 6 · rag, attention, transformer
  124. Gromov-Wasserstein Methods for Multi-View Relational Embedding and Clustering · cs.LG · arXiv 2604.23912 · score 6 · rag, serving
  125. Learning Selective LLM Autonomy from Copilot Feedback in Enterprise Customer Support Workflows · cs.CL · arXiv 2604.23855 · score 6 · llm, rag
  126. Contextual Linear Activation Steering of Language Models · cs.CL · arXiv 2604.24693 · score 5 · large language model
  127. MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational and Progressive Information Chaining · cs.CL · arXiv 2604.24374 · score 5 · inference, attention
  128. Machine-Learning-Based Classification of Radio Frequency Building Loss · cs.LG · arXiv 2604.24143 · score 5 · rag, inference
  129. PeeriScope: A Multi-Faceted Framework for Evaluating Peer Review Quality · cs.CL · arXiv 2604.24071 · score 5 · large language model
  130. Integrative neurocybernetic modeling in the era of large-scale neuroscience · q-bio.NC · arXiv 2604.23903 · score 5 · rag, inference
  131. Conflict-Aware Harmonized Rotational Gradient for Multiscale Kinetic Regimes · cs.LG · arXiv 2604.24745 · score 4 · serving
  132. Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling · cs.AI · arXiv 2604.24717 · score 4 · attention, transformer
  133. Fraud Detection in Cryptocurrency Markets with Spatio-Temporal Graph Neural Networks · cs.LG · arXiv 2604.24590 · score 4 · attention, transformer
  134. A systematic evaluation of vision-language models for observational astronomical reasoning tasks · cs.AI · arXiv 2604.24589 · score 4 · reasoning, attention
  135. Dialysis Risk Prediction and Treatment Effect Estimation for AKI patients using Longitudinal Electronic Health Records · cs.LG · arXiv 2604.24547 · score 4 · rag, transformer
  136. Understanding the Limits of Automated Evaluation for Code Review Bots in Practice · cs.SE · arXiv 2604.24525 · score 4 · llm
  137. ARETE: Attention-based Rasterized Encoding for Topology Estimation using HSV-transformed Crowdsourced Vehicle Fleet Data · cs.CV · arXiv 2604.24353 · score 4 · attention, transformer
  138. Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training · cs.LG · arXiv 2604.24350 · score 4 · rag, attention
  139. Semantic Segmentation for Histopathology using Learned Regularization based on Global Proportions · eess.IV · arXiv 2604.24347 · score 4 · rag, transformer
  140. Exact, Efficient, and Reliable Multi-Objective and Multi-Constrained IoT Workflow Scheduling in Edge-Hub-Cloud Cyber-Physical Systems · cs.DC · arXiv 2604.24340 · score 4 · rag, latency
  141. Perfecting Aircraft Maneuvers with Reinforcement Learning · cs.LG · arXiv 2604.24338 · score 4 · agent
  142. X-NegoBox: An Explainable Privacy-Budget Negotiation Framework for Secure Peer-to-Peer Energy Data Exchange · cs.CR · arXiv 2604.24326 · score 4 · serving
  143. Differentiable Faithfulness Alignment for Cross-Model Circuit Transfer · cs.CL · arXiv 2604.24302 · score 4 · retrieval, reasoning
  144. Latent-Hysteresis Graph ODEs: Modeling Coupled Topology-Feature Evolution via Continuous Phase Transitions · cs.LG · arXiv 2604.24293 · score 4 · serving
  145. RowHammer Vulnerability Counter (RVC): Redefining RowHammer Detection with Victim-Centric Tracking · cs.CR · arXiv 2604.24287 · score 4 · rag, latency
  146. Deep Learning-Enabled Dissolved Oxygen Sensing in Biofouling Environments for Ocean Monitoring · eess.IV · arXiv 2604.24236 · score 4 · rag, transformer
  147. CMGL: Confidence-guided Multi-omics Graph Learning for Cancer Subtype Classification · cs.LG · arXiv 2604.24201 · score 4 · rag, fine-tun
  148. IRIS: Interleaved Reinforcement with Incremental Staged Curriculum for Cross-Lingual Mathematical Reasoning · cs.CL · arXiv 2604.24114 · score 4 · reasoning, fine-tun
  149. BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning · cs.CL · arXiv 2604.24089 · score 4 · serving
  150. How Sensitive Are Safety Benchmarks to Judge Configuration Choices? · cs.CL · arXiv 2604.24074 · score 4 · llm
  151. AgentPulse: A Continuous Multi-Signal Framework for Evaluating AI Agents in Deployment · cs.AI · arXiv 2604.24038 · score 4 · agent
  152. FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection · cs.LG · arXiv 2604.24012 · score 4 · serving
  153. Adaptive-Distribution Randomized Neural Networks for PDEs: A Low-Dimensional Distribution-Learning Framework · math.NA · arXiv 2604.23999 · score 4 · serving
  154. DecompKAN: Decomposed Patch-KAN for Long-Term Time Series Forecasting · cs.LG · arXiv 2604.23968 · score 4 · attention, transformer
  155. Crystal structure prediction using graph neural combinatorial optimization · cs.LG · arXiv 2604.23921 · score 4 · rag, gpu
  156. Cardiac Stability Theory: An Axiomatically Grounded Framework for Continuous Cardiac Health Monitoring via Smartphone Photoplethysmography · cs.LG · arXiv 2604.23876 · score 4 · transformer, latency
  157. Exploring Audio Hallucination in Egocentric Video Understanding · cs.CV · arXiv 2604.23860 · score 4 · llm
  158. Focus on What Matters: Two-Stage ROI-Aware Refinement for Anatomy-Preserving Fetal Ultrasound Reconstruction · cs.CV · arXiv 2604.23839 · score 4 · serving
  159. Cortex-Inspired Continual Learning: Unsupervised Instantiation and Recovery of Functional Task Networks · cs.LG · arXiv 2604.24637 · score 3 · inference
  160. MIMIC: A Generative Multimodal Foundation Model for Biomolecules · cs.AI · arXiv 2604.24506 · score 3 · inference
  161. Compilation and Execution of an Embeddable YOLO-NAS on the VTA · cs.AR · arXiv 2604.24455 · score 3 · compiler
  162. SPLIT: Separating Physical-Contact via Latent Arithmetic in Image-Based Tactile Sensors · cs.RO · arXiv 2604.24449 · score 3 · inference
  163. Scaling Properties of Continuous Diffusion Spoken Language Models · cs.CL · arXiv 2604.24416 · score 3 · inference
  164. Model-Free Inference of Investor Preferences: A Relative Entropy IRL Approach · cs.LG · arXiv 2604.24280 · score 3 · inference
  165. Speech Enhancement Based on Drifting Models · cs.SD · arXiv 2604.24199 · score 3 · inference
  166. Learning to Think from Multiple Thinkers · cs.LG · arXiv 2604.24737 · score 2 · chain-of-thought
  167. Déjà Vu Packing: Optimizing FPGA Logic Clustering Runtime via Pattern Memoization · cs.AR · arXiv 2604.24649 · score 2 · rag
  168. NeSyCat: A Monad-Based Categorical Semantics of the Neurosymbolic ULLER Framework · cs.AI · arXiv 2604.24612 · score 2 · reasoning
  169. Hierarchical Behaviour Spaces · cs.AI · arXiv 2604.24558 · score 2 · reasoning
  170. SpotVista: Availability-Aware Recommendation System for Reliable and Cost-Efficient Multi-Node Spot Instances · cs.DC · arXiv 2604.24548 · score 2 · rag
  171. A Reward-Free Viewpoint on Multi-Objective Reinforcement Learning · cs.LG · arXiv 2604.24532 · score 2 · rag
  172. SceneSelect: Selective Learning for Trajectory Scene Classification and Expert Scheduling · cs.LG · arXiv 2604.24514 · score 2 · rag
  173. Modeling Behavioral Intensity and Transitions for Generative Recommendation · cs.IR · arXiv 2604.24472 · score 2 · attention
  174. All That Glitters Is Not Audio: Rethinking Text Priors and Audio Reliance in Audio-Language Evaluation · cs.SD · arXiv 2604.24401 · score 2 · rag
  175. Few-Shot Cross-Device Transfer for Quantum Noise Modeling on Real Hardware · quant-ph · arXiv 2604.24397 · score 2 · fine-tun
  176. PathMoG: A Pathway-Centric Modular Graph Neural Network for Multi-Omics Survival Prediction · cs.LG · arXiv 2604.24371 · score 2 · attention
  177. See Further, Think Deeper: Advancing VLM’s Reasoning Ability with Low-level Visual Cues and Reflection · cs.CV · arXiv 2604.24339 · score 2 · reasoning
  178. Mitigating Error Amplification in Fast Adversarial Training · cs.LG · arXiv 2604.24332 · score 2 · rag
  179. Unconstrained Multi-view Human Pose Estimation with Algebraic Priors · cs.CV · arXiv 2604.24312 · score 2 · transformer
  180. IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Extreme Convective Radar Nowcasting · cs.LG · arXiv 2604.24224 · score 2 · attention
  181. Seeing Is No Longer Believing: Frontier Image Generation Models, Synthetic Visual Evidence, and Real-World Risk · cs.CL · arXiv 2604.24197 · score 2 · reasoning
  182. MemeScouts@LT-EDI 2026: Asking the Right Questions – Prompted Weak Supervision for Meme Hate Speech Detection · cs.CL · arXiv 2604.24179 · score 2 · reasoning
  183. A Divergence-Based Method for Weighting and Averaging Model Predictions · stat.ML · arXiv 2604.24172 · score 2 · rag
  184. Unfolding an Atomistic World: Atomistic Simulation of Reactor Pressure Vessel Steel Across Year-and-Meter Scales · cs.DC · arXiv 2604.24091 · score 2 · rag
  185. A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws · cs.LG · arXiv 2604.24037 · score 2 · rag
  186. KubePACS: Kubernetes Cluster Using Performant, Highly Available, and Cost Efficient Spot Instances · cs.DC · arXiv 2604.24027 · score 2 · rag
  187. Geometry-Aware Offline-to-Online Learning in Linear Contextual Bandits · cs.LG · arXiv 2604.24016 · score 2 · rag
  188. SDSL-Solver: Scalable Distributed Sparse Linear Solvers for Large-Scale Interior Point Methods · cs.DC · arXiv 2604.23979 · score 2 · rag
  189. Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction · cs.LG · arXiv 2604.23964 · score 2 · attention
  190. Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach · cs.CV · arXiv 2604.23953 · score 2 · rag
  191. KOMBO: Korean Character Representations Based on the Combination Rules of Subcharacters · cs.CL · arXiv 2604.23948 · score 2 · rag
  192. Sliced-Regularized Optimal Transport · stat.ML · arXiv 2604.23944 · score 2 · rag
  193. Quasi-Quadratic Gradient: A New Direction for Accelerating the BFGS Method in Quasi-Newton Optimization · math.OC · arXiv 2604.23922 · score 2 · rag
  194. Machine Learning and Deep Learning Models for Short Term Electricity Price Forecasting in Australia’s National Electricity Market · cs.LG · arXiv 2604.23908 · score 2 · transformer
  195. Learning Interpretable PDE Representations for Generative Reconstructions with Structured Sparsity · cs.LG · arXiv 2604.23867 · score 2 · rag
  196. Domain-Filtered Knowledge Graphs from Sparse Autoencoder Features · cs.AI · arXiv 2604.23829 · score 2 · reasoning