2026-04-28 Paper Digest
206 arXiv papers on agent / LLM / AI infra submitted that day matched our topic filter. 10 were hand-picked by Claude — using title + authors + affiliations — and received a full Claude-generated analysis; the remaining 196 are listed at the bottom.
1. FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training
arXiv: 2604.24013 · cs.LG · Claude pick
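FlashOverlap decomposes Reduce-Scatter and All-Gather into asynchronous P2P communication and adaptively schedules per-shard computation by rank, so that computing the final chunk of data no longer depends on communication. This removes the tail latency of data-partitioning overlap schemes: on an MLP with TP=4 and (b,s,d)=(32,4096,4096), communication overhead drops from 43.8 ms to 0.1 ms (a 99.8% reduction).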
2. Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling
arXiv: 2604.24715 · cs.CL · Claude pick
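HyLo is a training recipe that upcycles a pretrained Transformer into a hybrid long-context model combining MLA with Mamba2/GDN. Through staged long-context training and teacher distillation, it extends the usable context length to 32×, cuts the KV cache by more than 90%, and significantly outperforms existing upcycling baselines such as Zebra-Llama on RULER.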
3. The Chameleon’s Limit: Investigating Persona Collapse and Homogenization in Large Language Models
arXiv: 2604.24698 · cs.CL · Claude pick
Ten LLMs asked to role-play 1,144 richly specified personas collapse into a narrow behavioral mode — agents converge despite distinct profiles. A geometric framework (Coverage, Uniformity, Complexity on a Behavioral Trait Matrix) plus item-level diagnostics shows collapse is multi-axis and task-contingent, and that the highest-fidelity models produce the most stereotyped populations.
4. Stabilizing Efficient Reasoning with Step-Level Advantage Selection
arXiv: 2604.24003 · cs.CL · Claude pick
Step-level Advantage Selection (SAS) zeros advantages for low-confidence steps in correct GRPO rollouts and high-confidence steps in verifier-failed rollouts, stabilizing short-context post-training. On five math benchmarks it lifts Pass@1 by 0.86 points over the strongest length-aware baseline while cutting reasoning length by 16.3%.
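The selection rule in the SAS summary above can be sketched in a few lines. This is only an illustration of the masking logic as the digest describes it, not the authors' implementation: the function name `select_step_advantages`, the per-step `confidences` input, and the `threshold` that separates low from high confidence are all assumptions made for the example.

```python
from typing import List


def select_step_advantages(
    advantages: List[float],
    confidences: List[float],
    rollout_correct: bool,
    threshold: float = 0.5,  # hypothetical cut-off between "low" and "high" confidence
) -> List[float]:
    """Zero the advantages of steps that the selection rule excludes.

    Correct rollout: low-confidence steps are zeroed.
    Verifier-failed rollout: high-confidence steps are zeroed.
    """
    if len(advantages) != len(confidences):
        raise ValueError("advantages and confidences must have the same length")

    selected = []
    for advantage, confidence in zip(advantages, confidences):
        low_confidence = confidence < threshold
        drop = low_confidence if rollout_correct else not low_confidence
        selected.append(0.0 if drop else advantage)
    return selected


# A correct rollout keeps only its confident steps.
kept_when_correct = select_step_advantages([1.0, 1.0, 1.0], [0.9, 0.2, 0.7], True)
# A failed rollout keeps only its unconfident steps.
kept_when_failed = select_step_advantages([-1.0, -1.0, -1.0], [0.9, 0.2, 0.7], False)
```

How confidence is measured per step and how the cut-off is chosen are left open here; the digest summary does not specify either.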
5. PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model
arXiv: 2604.24443 · cs.AI · Claude pick
Vision-Language Models (VLMs) have demonstrated strong performance on textbook-style physics problems, yet they frequently fail when confronted with dynamic real-world scenarios that require temporal consistency and causal reasoning across frames. We identify two fundamental challenges underlying these failures: (1) spatio-temporal identity drift, where objects lose their physical identity across successive frames and break causal chains, and (2) volatility of inference-time insights, where a model may occasionally produce correct physical reasoning but never consolidates it for future reuse.
6. Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer
arXiv: 2604.24062 · cs.AI · Claude pick
Using the OpenLock paradigm, the authors show that four frontier models (GPT-5.2, Claude-4.5-Sonnet, Gemini-3-Flash, DeepSeek-V3.2) can discover causal structures as efficiently as humans in text, but—unlike humans—fail to transfer Common Cause / Common Effect schemas to new environments until after an initial grounding solution, and are hurt rather than helped by visual input.
7. Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols
arXiv: 2604.24512 · cs.AI · Claude pick
The paper formalizes the Attention Latch — a failure where multi-turn LLM agents stay anchored to stale goals — and proposes SSRP, an Architect/Executive split that auto-synthesizes per-task SOPs. On MultiWOZ 2.2 (9K trajectories), SSRP lifts GPT-5.4 from 0.1% to 71.6% on 3-hop semantic hijacking.
8. DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference
arXiv: 2604.24647 · cs.CL · Claude pick
DepthKV reallocates a fixed global KV-cache budget non-uniformly across transformer layers based on per-layer sensitivity to pruning, using InfoNCE-derived importance scores. At 60% global pruning, it consistently beats uniform pruning (e.g., H₂O) across summarization, QA, and GSM-∞ reasoning on Gemma-7B, LLaMA-3.1-8B, and Qwen2.5-7B.
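To make the idea of a non-uniform per-layer budget concrete, here is a minimal sketch that splits a fixed global number of kept KV entries across layers in proportion to a sensitivity score. The proportional rule, the function name `allocate_layer_budgets`, and the treatment of sensitivities as ready-made positive numbers are assumptions for illustration; the paper's actual allocation rule and its InfoNCE-derived scores are not reproduced here. Note that 60% global pruning corresponds to `keep_ratio=0.4`.

```python
from typing import List


def allocate_layer_budgets(
    sensitivities: List[float], seq_len: int, keep_ratio: float
) -> List[int]:
    """Split a global KV budget across layers, proportional to sensitivity.

    No layer may keep more than seq_len entries; budget that a capped layer
    cannot use is redistributed to the remaining layers.
    """
    if not sensitivities or any(s <= 0 for s in sensitivities):
        raise ValueError("sensitivities must be non-empty and positive")
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must lie in [0, 1]")

    num_layers = len(sensitivities)
    total = round(num_layers * seq_len * keep_ratio)  # global budget in entries
    shares = [0.0] * num_layers
    active = list(range(num_layers))
    remaining = float(total)

    # Proportional split with a per-layer cap (water-filling).
    while active:
        weight = sum(sensitivities[i] for i in active)
        saturated = [i for i in active if remaining * sensitivities[i] / weight > seq_len]
        if not saturated:
            for i in active:
                shares[i] = remaining * sensitivities[i] / weight
            break
        for i in saturated:
            shares[i] = float(seq_len)
        remaining -= seq_len * len(saturated)
        active = [i for i in active if i not in saturated]

    # Round down, then hand leftover entries to the largest fractional parts.
    budgets = [int(share) for share in shares]
    leftover = max(total - sum(budgets), 0)
    by_fraction = sorted(range(num_layers), key=lambda i: shares[i] - budgets[i], reverse=True)
    for i in by_fraction[:leftover]:
        budgets[i] += 1
    return budgets


uniform = allocate_layer_budgets([2.0, 2.0, 2.0, 2.0], seq_len=10, keep_ratio=0.4)
skewed = allocate_layer_budgets([1.0, 1.0, 8.0], seq_len=100, keep_ratio=0.6)
```

With equal sensitivities the result reduces to uniform pruning; with a skewed profile the most sensitive layer keeps its full cache and the other layers absorb the pruning.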
9. BitRL: Reinforcement Learning with 1-bit Quantized Language Models for Resource-Constrained Edge Deployment
arXiv: 2604.24273 · cs.LG · Claude pick
BitRL freezes a 2B-parameter BitNet b1.58 backbone (ternary weights {−1,0,+1}) and trains only small (~50K-param) PPO policy/value heads, yielding RL agents that retain 85–98% of FP16 performance with 10–16× memory reduction and 3–5× energy savings on a Raspberry Pi 4.
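For readers unfamiliar with ternary weights, the sketch below maps a list of floating-point weights to {−1, 0, +1} using absmean scaling, which is the quantization rule described for BitNet b1.58 as far as we understand it. The helper name `ternarize` and the plain-list representation are assumptions for the example; nothing here is taken from the BitRL code, and the frozen backbone with trainable PPO heads is not modeled.

```python
from typing import List, Tuple


def ternarize(weights: List[float], eps: float = 1e-8) -> Tuple[List[int], float]:
    """Quantize weights to {-1, 0, +1} with absmean scaling.

    Returns the ternary weights and the scale needed to approximately
    reconstruct the originals (weight ~= ternary * scale).
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round to the nearest integer, then clip into the ternary range.
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


ternary, scale = ternarize([0.9, -0.05, 0.4, -1.2])
reconstructed = [t * scale for t in ternary]
```

Small weights collapse to 0 and large ones saturate at ±1, which is why only about 1.58 bits (log2 of 3) are needed per weight.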
10. AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents
arXiv: 2604.24039 · cs.LG · Claude pick
AgenticCache caches 2-gram plan transitions for LLM-driven embodied agents, serving most planning decisions from a local cache while a background LLM updater asynchronously validates and corrects entries. Across 4 multi-agent benchmarks × 3 GPT-5 scales, it lifts success rate by 22% on average, cuts latency 65%, and reduces tokens 50%.
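A toy version of the caching idea in the AgenticCache summary: store transitions from the current plan step to the next one, answer from the cache when possible, and fall back to a planner on a miss. Everything named here (`BigramPlanCache`, `next_step`, `correct`, the string-valued plan steps, the stand-in planner) is hypothetical, and the background updater is reduced to a synchronous `correct` call so the example stays deterministic; the real system is described as validating entries asynchronously.

```python
from collections import Counter, defaultdict
from typing import Callable, DefaultDict, List


class BigramPlanCache:
    """Cache of (current step -> next step) transitions with a planner fallback."""

    def __init__(self, planner: Callable[[str], str]) -> None:
        self._planner = planner  # stands in for the slow LLM planner
        self._next_counts: DefaultDict[str, Counter] = defaultdict(Counter)
        self.hits = 0
        self.misses = 0

    def next_step(self, current_step: str) -> str:
        counts = self._next_counts.get(current_step)
        if counts:
            # Cache hit: serve the most frequently observed next step.
            self.hits += 1
            return counts.most_common(1)[0][0]
        # Cache miss: ask the planner and remember its answer.
        self.misses += 1
        step = self._planner(current_step)
        self._next_counts[current_step][step] += 1
        return step

    def correct(self, current_step: str, validated_next: str) -> None:
        """Overwrite an entry, as a background validator might after re-checking it."""
        self._next_counts[current_step] = Counter({validated_next: 1})


planner_calls: List[str] = []


def toy_planner(step: str) -> str:
    planner_calls.append(step)
    return "open_fridge"


cache = BigramPlanCache(toy_planner)
first = cache.next_step("goto_kitchen")   # miss, planner is called
second = cache.next_step("goto_kitchen")  # hit, served locally
cache.correct("goto_kitchen", "pick_cup")
third = cache.next_step("goto_kitchen")   # hit, corrected entry
```

The latency and token savings reported in the summary would come from the hit path, which never touches the planner.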
Other matched papers
These papers matched the same topic keywords but were not among Claude’s top-N deep-analysis picks.
- Green Shielding: A User-Centric Approach Towards Trustworthy AI · cs.CL · arXiv 2604.24700 · score 27 · keywords: large language model, llm, agent, agentic, rag, serving
- EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce · cs.CL · arXiv 2604.23993 · score 27 · keywords: llm, agent, agentic, multi-agent, retrieval, reasoning
- JigsawRL: Assembling RL Pipelines for Efficient LLM Post-Training · cs.LG · arXiv 2604.23838 · score 25 · keywords: llm, agent, agentic, rag, parallelism, gpu
- Kwai Summary Attention Technical Report · cs.CL · arXiv 2604.24432 · score 24 · keywords: large language model, agent, agentic, reasoning, inference, kv cache
- Defusing the Trigger: Plug-and-Play Defense for Backdoored LLMs via Tail-Risk Intrinsic Geometric Smoothing · cs.CR · arXiv 2604.24162 · score 24 · keywords: large language model, llm, rag, reasoning, inference, serving
- RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation · cs.SE · arXiv 2604.24218 · score 23 · keywords: large language model, llm, agent, agentic, multi-agent, rag
- Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis · cs.CL · arXiv 2604.24198 · score 26 · keywords: large language model, llm, agent, agentic, reasoning, inference
- Constraint-Guided Multi-Agent Decompilation for Executable Binary Recovery · cs.SE · arXiv 2604.23940 · score 21 · keywords: llm, agent, agentic, multi-agent, rag, compiler
- GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems · cs.CR · arXiv 2604.24477 · score 20 · keywords: large language model, llm, agent, multi-agent, inference
- Agentic Witnessing: Pragmatic and Scalable TEE-Enabled Privacy-Preserving Auditing · cs.CR · arXiv 2604.24203 · score 20 · keywords: llm, agent, agentic, rag, reasoning, serving
- Skill Retrieval Augmentation for Agentic AI · cs.CL · arXiv 2604.24594 · score 19 · keywords: large language model, llm, agent, agentic, retrieval
- DPEPO: Diverse Parallel Exploration Policy Optimization for LLM-based Agents · cs.CL · arXiv 2604.24320 · score 19 · keywords: large language model, llm, agent, rag, reasoning, fine-tun
- FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data · cs.AI · arXiv 2604.24572 · score 18 · keywords: llm, agent, agentic, multi-agent, reasoning
- Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus · cs.AI · arXiv 2604.24473 · score 26 · keywords: llm, agent, agentic, retrieval, rag, reasoning
- Leveraging LLMs for Multi-File DSL Code Generation: An Industrial Case Study · cs.SE · arXiv 2604.24678 · score 21 · keywords: large language model, llm, rag, serving, fine-tun
- Strategic Bidding in 6G Spectrum Auctions with Large Language Models · cs.GT · arXiv 2604.24156 · score 17 · keywords: large language model, llm, agent, rag, reasoning
- MEMCoder: Multi-dimensional Evolving Memory for Private-Library-Oriented Code Generation · cs.SE · arXiv 2604.24222 · score 24 · keywords: large language model, llm, retrieval, rag, inference
- Latency and Cost of Multi-Agent Intelligent Tutoring at Scale · cs.CY · arXiv 2604.24110 · score 16 · keywords: llm, agent, multi-agent, throughput, latency
- The Pragmatic Persona: Discovering LLM Persona through Bridging Inference · cs.CL · arXiv 2604.24079 · score 20 · keywords: large language model, llm, rag, reasoning, inference
- LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People · cs.AI · arXiv 2604.23970 · score 16 · keywords: llm, agent, agentic, multi-agent
- XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation · cs.AI · arXiv 2604.24623 · score 15 · keywords: large language model, llm, retrieval, rag, reasoning
- SEARCH-R: Structured Entity-Aware Retrieval with Chain-of-Reasoning Navigator for Multi-hop Question Answering · cs.CL · arXiv 2604.24515 · score 15 · keywords: large language model, llm, retrieval, reasoning, fine-tun
- OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents · cs.CL · arXiv 2604.24348 · score 15 · keywords: large language model, llm, agent, latency
- Generating Place-Based Compromises Between Two Points of View · cs.CL · arXiv 2604.24536 · score 14 · keywords: large language model, llm, reasoning, inference
- SeaEvo: Advancing Algorithm Discovery with Strategy Space Evolution · cs.CL · arXiv 2604.24372 · score 14 · keywords: llm, agent, retrieval, ai system
- ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems · cs.AI · arXiv 2604.23878 · score 14 · keywords: llm, agent, rag, ai system
- Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models · cs.IR · arXiv 2604.24608 · score 13 · keywords: large language model, llm, rag, attention
- MEG-RAG: Quantifying Multi-modal Evidence Grounding for Evidence Selection in RAG · cs.CL · arXiv 2604.24564 · score 13 · keywords: large language model, llm, retrieval, rag
- Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations · cs.AI · arXiv 2604.24562 · score 13 · keywords: large language model, llm, rag, reasoning
- Why AI Harms Can’t Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality · cs.CY · arXiv 2604.24519 · score 13 · keywords: large language model, llm, ai system
- A Multi-Dimensional Audit of Politically Aligned Large Language Models · cs.CL · arXiv 2604.24429 · score 13 · keywords: large language model, llm, reasoning, fine-tun
- Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion · cs.LG · arXiv 2604.24351 · score 13 · keywords: rag, inference, serving, kv-cache
- MultiDx: A Multi-Source Knowledge Integration Framework towards Diagnostic Reasoning · cs.CL · arXiv 2604.24186 · score 13 · keywords: large language model, llm, rag, reasoning
- Coverage-Based Calibration for Post-Training Quantization via Weighted Set Cover over Outlier Channels · cs.LG · arXiv 2604.24008 · score 13 · keywords: large language model, rag, quantization, gpu, post-train
- Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning · cs.LG · arXiv 2604.23987 · score 13 · keywords: large language model, llm, rag, fine-tun
- What Did They Mean? How LLMs Resolve Ambiguous Social Situations across Perspectives and Roles · cs.HC · arXiv 2604.23942 · score 13 · keywords: large language model, llm, serving
- Generative Synthetic Data for Causal Inference: Pitfalls, Remedies, and Opportunities · stat.ME · arXiv 2604.23904 · score 13 · keywords: llm, rag, inference, serving
- Evaluation of Prompt Injection Defenses in Large Language Models · cs.CR · arXiv 2604.23887 · score 13 · keywords: large language model, llm, ai system
- Knowledge Vector of Logical Reasoning in Large Language Models · cs.CL · arXiv 2604.23877 · score 13 · keywords: large language model, llm, rag, reasoning
- Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models · cs.LG · arXiv 2604.24708 · score 12 · keywords: rag, serving, attention, gpu, scheduler
- Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft · cs.AI · arXiv 2604.24697 · score 12 · keywords: llm, agent, ai system
- The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications · cs.AI · arXiv 2604.24668 · score 12 · keywords: llm, agent, agentic
- Evaluating whether AI models would sabotage AI safety research · cs.AI · arXiv 2604.24618 · score 12 · keywords: llm, agent, rag, reasoning
- Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models · cs.CR · arXiv 2604.24542 · score 12 · keywords: large language model, llm, inference
- From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills · cs.CL · arXiv 2604.24026 · score 12 · keywords: llm, agent, rag, reasoning
- QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems · cs.AI · arXiv 2604.24021 · score 12 · keywords: llm, multi-agent, ai system
- Fix Initial Codes and Iteratively Refine Textual Directions Toward Safe Multi-Turn Code Correction · cs.LG · arXiv 2604.23989 · score 12 · keywords: large language model, llm, inference
- TSAssistant: A Human-in-the-Loop Agentic Framework for Automated Target Safety Assessment · cs.CL · arXiv 2604.23938 · score 12 · keywords: agent, agentic, multi-agent
- Inverting Foundation Models of Brain Function with Simulation-Based Inference · cs.LG · arXiv 2604.23865 · score 12 · keywords: large language model, llm, inference
- Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination · cs.CL · arXiv 2604.24690 · score 11 · keywords: large language model, llm, reasoning
- Benchmarking Source-Sensitive Reasoning in Turkish: Humans and LLMs under Evidential Trust Manipulation · cs.CL · arXiv 2604.24665 · score 11 · keywords: large language model, llm, reasoning
- K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology · cs.CL · arXiv 2604.24645 · score 11 · keywords: large language model, agent, reasoning
- STELLAR-E: a Synthetic, Tailored, End-to-end LLM Application Rigorous Evaluator · cs.AI · arXiv 2604.24544 · score 11 · keywords: large language model, llm, rag
- A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations · cs.CR · arXiv 2604.24468 · score 11 · keywords: large language model, llm, fine-tun
- Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation · cs.CL · arXiv 2604.24361 · score 11 · keywords: large language model, llm, rag
- AdapTime: Enabling Adaptive Temporal Reasoning in Large Language Models · cs.CL · arXiv 2604.24175 · score 11 · keywords: large language model, llm, reasoning
- TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training · cs.DC · arXiv 2604.24088 · score 11 · keywords: llm, parallelism, quantization, throughput
- QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering · cs.CV · arXiv 2604.24052 · score 11 · keywords: large language model, llm, rag
- Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs · cs.AI · arXiv 2604.23949 · score 11 · keywords: large language model, llm, rag
- SMSI: System Model Security Inference: Automated Threat Modeling for Cyber-Physical Systems · cs.CR · arXiv 2604.23905 · score 11 · keywords: llm, retrieval, inference, fine-tun
- LLM-Augmented Traffic Signal Control with LSTM-Based Traffic State Prediction and Safety-Constrained Decision Support · cs.AI · arXiv 2604.23902 · score 11 · keywords: large language model, llm, reasoning
- ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation · cs.AI · arXiv 2604.23853 · score 11 · keywords: llm, agent, tool use
- One Size Fits None: Heuristic Collapse in LLM Investment Advice · cs.CL · arXiv 2604.23837 · score 11 · keywords: large language model, llm, reasoning
- Resource-Lean Lexicon Induction for German Dialects · cs.CL · arXiv 2604.23824 · score 11 · keywords: large language model, llm, retrieval
- Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters · cs.AI · arXiv 2604.24710 · score 10 · keywords: llm, agent, rag
- GradMAP: Gradient-Based Multi-Agent Proximal Learning for Grid-Edge Flexibility · cs.LG · arXiv 2604.24549 · score 10 · keywords: agent, multi-agent, gpu
- DPRM: A Plug-in Doob h transform-induced Token-Ordering Module for Diffusion Language Models · cs.LG · arXiv 2604.24357 · score 10 · keywords: llm, rag, reasoning, post-train
- The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers · cs.CY · arXiv 2604.24155 · score 10 · keywords: agent, reasoning, ai system
- Improving Robustness of Tabular Retrieval via Representational Stability · cs.CL · arXiv 2604.24040 · score 10 · keywords: retrieval, rag, serving, transformer
- Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents · cs.AI · arXiv 2604.23990 · score 10 · keywords: agent, rag, serving
- GAMED.AI: A Hierarchical Multi-Agent Framework for Automated Educational Game Generation · cs.AI · arXiv 2604.23947 · score 10 · keywords: agent, multi-agent, reasoning
- Defective Task Descriptions in LLM-Based Code Generation: Detection and Analysis · cs.SE · arXiv 2604.24703 · score 9 · keywords: large language model, llm
- AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents · cs.CR · arXiv 2604.24657 · score 9 · keywords: large language model, agent
- Zero-shot Large Language Models for Automatic Readability Assessment · cs.CL · arXiv 2604.24470 · score 9 · keywords: large language model, llm
- Can You Make It Sound Like You? Post-Editing LLM-Generated Text for Personal Style · cs.CL · arXiv 2604.24444 · score 9 · keywords: large language model, llm
- Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment · cs.LG · arXiv 2604.24178 · score 9 · keywords: large language model, llm
- Progressive Approximation in Deep Residual Networks: Theory and Validation · cs.LG · arXiv 2604.24154 · score 9 · keywords: llm, inference, transformer
- An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress · cs.AI · arXiv 2604.24076 · score 9 · keywords: large language model, llm
- A2DEPT: Large Language Model-Driven Automated Algorithm Design via Evolutionary Program Trees · cs.AI · arXiv 2604.24043 · score 9 · keywords: large language model, llm
- Poster: ClawdGo: Endogenous Security Awareness Training for Autonomous AI Agents · cs.CR · arXiv 2604.24020 · score 9 · keywords: agent, rag, inference
- IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models · cs.HC · arXiv 2604.24002 · score 9 · keywords: agent, reasoning, inference
- When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models · cs.LG · arXiv 2604.23994 · score 9 · keywords: llm, inference, attention
- Representational Curvature Modulates Behavioral Uncertainty in Large Language Models · cs.AI · arXiv 2604.23985 · score 9 · keywords: large language model, llm
- Translate or Simplify First: An Analysis of Cross-lingual Text Simplification in English and French · cs.CL · arXiv 2604.23844 · score 9 · keywords: large language model, llm
- Scalable Production Scheduling: Linear Complexity via Unified Homogeneous Graphs · cs.LG · arXiv 2604.23841 · score 9 · keywords: agent, inference, latency
- Less Is More: Engineering Challenges of On-Device Small Language Model Integration in a Mobile Application · cs.SE · arXiv 2604.24636 · score 8 · keywords: llm, rag, latency
- Interoceptive machine framework: Toward interoception-inspired regulatory architectures in artificial intelligence · cs.AI · arXiv 2604.24527 · score 8 · keywords: agent, ai system
- Measuring Successful Cooperation in Human-AI Teamwork: Development and Validation of the Perceived Cooperativity and Teaming Perception Scales · cs.HC · arXiv 2604.24461 · score 8 · keywords: llm, agent
- Characterizing Vision-Language-Action Models across XPUs: Constraints and Acceleration for On-Robot Deployment · cs.RO · arXiv 2604.24447 · score 8 · keywords: inference, parallelism, gpu
- Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering · cs.CL · arXiv 2604.24334 · score 8 · keywords: retrieval, rag, serving
- Adaptive ToR: Complexity-Aware Tree-Based Retrieval for Pareto-Optimal Multi-Intent NLU · cs.AI · arXiv 2604.24219 · score 8 · keywords: llm, retrieval, latency
- Right-to-Act: A Pre-Execution Non-Compensatory Decision Protocol for AI Systems · cs.AI · arXiv 2604.24153 · score 8 · keywords: serving, ai system
- An Analysis of the Coordination Gap between Joint and Modular Learning for Job Shop Scheduling with Transportation Resources · cs.AI · arXiv 2604.24117 · score 8 · keywords: agent, multi-agent
- FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost · cs.LG · arXiv 2604.24073 · score 8 · keywords: rag, distributed training, gpu
- DeepTaxon: An Interpretable Retrieval-Augmented Multimodal Framework for Unified Species Identification and Discovery · cs.CV · arXiv 2604.24029 · score 8 · keywords: retrieval, reasoning, chain-of-thought, fine-tun
- Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions · cs.AI · arXiv 2604.23924 · score 8 · keywords: agent, agentic
- MarketBench: Evaluating AI Agents as Market Participants · cs.AI · arXiv 2604.23897 · score 8 · keywords: llm, agent
- Geometry Preserving Loss Functions Promote Improved Adaptation of Blackbox Generative Model · cs.LG · arXiv 2604.23888 · score 8 · keywords: rag, serving, fine-tun
- Graph Memory Transformer (GMT) · cs.LG · arXiv 2604.23862 · score 8 · keywords: serving, attention, transformer
- Does Machine Unlearning Preserve Clinical Safety? A Risk Analysis for Medical Image Classification · cs.AI · arXiv 2604.23854 · score 8 · keywords: serving, attention, fine-tun
- The Last Human-Written Paper: Agent-Native Research Artifacts · cs.LG · arXiv 2604.24658 · score 7 · keywords: agent, compiler
- CF-VLA: Efficient Coarse-to-Fine Action Generation for Vision-Language-Action Policies · cs.CV · arXiv 2604.24622 · score 7 · keywords: rag, inference, latency
- Global Context or Local Detail? Adaptive Visual Grounding for Hallucination Mitigation · cs.CV · arXiv 2604.24396 · score 7 · keywords: rag, inference, attention
- AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation · cs.RO · arXiv 2604.24086 · score 7 · keywords: inference, latency, fine-tun
- Architectural Isolation as a Timing Safety Primitive for Edge AI Medical Devices: Controlled Experimental Evidence on a Shared-Silicon Platform · cs.AR · arXiv 2604.23831 · score 7 · keywords: inference, gpu, latency
- Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents · cs.AI · arXiv 2604.24686 · score 6 · keywords: agent, rag
- Meta-CoT: Enhancing Granularity and Generalization in Image Editing · cs.CV · arXiv 2604.24625 · score 6 · keywords: rag, reasoning, chain-of-thought
- GSC-QEMit: A Telemetry-Driven Hierarchical Forecast-and-Bandit Framework for Adaptive Quantum Error Mitigation · quant-ph · arXiv 2604.24551 · score 6 · keywords: rag, serving
- Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI · cs.CV · arXiv 2604.24492 · score 6 · keywords: latency, fine-tun, post-train
- Incisor: Ex Ante Cloud Instance Selection for HPC Jobs · cs.DC · arXiv 2604.24464 · score 6 · keywords: llm, reasoning
- Certified geometric robustness – Super-DeepG · cs.AI · arXiv 2604.24379 · score 6 · keywords: rag, reasoning, gpu
- Learning Evidence of Depression Symptoms via Prompt Induction · cs.CL · arXiv 2604.24376 · score 6 · keywords: llm, fine-tun
- SAGE: Sparse Adaptive Guidance for Dependency-Aware Tabular Data Generation · cs.LG · arXiv 2604.24368 · score 6 · keywords: llm, rag
- SolarTformer: A Transformer Based Deep Learning Approach for Short Term Solar Power Forecasting · cs.LG · arXiv 2604.24306 · score 6 · keywords: rag, attention, transformer
- Multi-Dimensional Evaluation of Sustainable City Trips with LLM-as-a-Judge and Human-in-the-Loop · cs.AI · arXiv 2604.24158 · score 6 · keywords: llm, reasoning
- Leveraging Human Feedback for Semantically-Relevant Skill Discovery · cs.LG · arXiv 2604.24127 · score 6 · keywords: agent, rag
- Psychologically-Grounded Graph Modeling for Interpretable Depression Detection · cs.CL · arXiv 2604.24126 · score 6 · keywords: llm, attention
- Factual and Edit-Sensitive Graph-to-Sequence Generation via Graph-Aware Adaptive Noising · cs.CL · arXiv 2604.24104 · score 6 · keywords: llm, fine-tun
- Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B · cs.CL · arXiv 2604.24070 · score 6 · keywords: llm, fine-tun
- TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents · cs.LG · arXiv 2604.24005 · score 6 · keywords: agent, reasoning
- Hindsight Preference Optimization for Financial Time Series Advisory · cs.LG · arXiv 2604.23988 · score 6 · keywords: llm, reasoning
- Quantum Knowledge Graph: Modeling Context-Dependent Triplet Validity · cs.CL · arXiv 2604.23972 · score 6 · keywords: llm, reasoning
- Do Quantum Transformers Help? A Systematic VQC Architecture Comparison on Tabular Benchmarks · quant-ph · arXiv 2604.23931 · score 6 · keywords: rag, attention, transformer
- Gromov-Wasserstein Methods for Multi-View Relational Embedding and Clustering · cs.LG · arXiv 2604.23912 · score 6 · keywords: rag, serving
- Learning Selective LLM Autonomy from Copilot Feedback in Enterprise Customer Support Workflows · cs.CL · arXiv 2604.23855 · score 6 · keywords: llm, rag
- Contextual Linear Activation Steering of Language Models · cs.CL · arXiv 2604.24693 · score 5 · keywords: large language model
- MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational and Progressive Information Chaining · cs.CL · arXiv 2604.24374 · score 5 · keywords: inference, attention
- Machine-Learning-Based Classification of Radio Frequency Building Loss · cs.LG · arXiv 2604.24143 · score 5 · keywords: rag, inference
- PeeriScope: A Multi-Faceted Framework for Evaluating Peer Review Quality · cs.CL · arXiv 2604.24071 · score 5 · keywords: large language model
- Integrative neurocybernetic modeling in the era of large-scale neuroscience · q-bio.NC · arXiv 2604.23903 · score 5 · keywords: rag, inference
- Conflict-Aware Harmonized Rotational Gradient for Multiscale Kinetic Regimes · cs.LG · arXiv 2604.24745 · score 4 · keywords: serving
- Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling · cs.AI · arXiv 2604.24717 · score 4 · keywords: attention, transformer
- Fraud Detection in Cryptocurrency Markets with Spatio-Temporal Graph Neural Networks · cs.LG · arXiv 2604.24590 · score 4 · keywords: attention, transformer
- A systematic evaluation of vision-language models for observational astronomical reasoning tasks · cs.AI · arXiv 2604.24589 · score 4 · keywords: reasoning, attention
- Dialysis Risk Prediction and Treatment Effect Estimation for AKI patients using Longitudinal Electronic Health Records · cs.LG · arXiv 2604.24547 · score 4 · keywords: rag, transformer
- Understanding the Limits of Automated Evaluation for Code Review Bots in Practice · cs.SE · arXiv 2604.24525 · score 4 · keywords: llm
- ARETE: Attention-based Rasterized Encoding for Topology Estimation using HSV-transformed Crowdsourced Vehicle Fleet Data · cs.CV · arXiv 2604.24353 · score 4 · keywords: attention, transformer
- Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training · cs.LG · arXiv 2604.24350 · score 4 · keywords: rag, attention
- Semantic Segmentation for Histopathology using Learned Regularization based on Global Proportions · eess.IV · arXiv 2604.24347 · score 4 · keywords: rag, transformer
- Exact, Efficient, and Reliable Multi-Objective and Multi-Constrained IoT Workflow Scheduling in Edge-Hub-Cloud Cyber-Physical Systems · cs.DC · arXiv 2604.24340 · score 4 · keywords: rag, latency
- Perfecting Aircraft Maneuvers with Reinforcement Learning · cs.LG · arXiv 2604.24338 · score 4 · keywords: agent
- X-NegoBox: An Explainable Privacy-Budget Negotiation Framework for Secure Peer-to-Peer Energy Data Exchange · cs.CR · arXiv 2604.24326 · score 4 · keywords: serving
- Differentiable Faithfulness Alignment for Cross-Model Circuit Transfer · cs.CL · arXiv 2604.24302 · score 4 · keywords: retrieval, reasoning
- Latent-Hysteresis Graph ODEs: Modeling Coupled Topology-Feature Evolution via Continuous Phase Transitions · cs.LG · arXiv 2604.24293 · score 4 · keywords: serving
- RowHammer Vulnerability Counter (RVC): Redefining RowHammer Detection with Victim-Centric Tracking · cs.CR · arXiv 2604.24287 · score 4 · keywords: rag, latency
- Deep Learning-Enabled Dissolved Oxygen Sensing in Biofouling Environments for Ocean Monitoring · eess.IV · arXiv 2604.24236 · score 4 · keywords: rag, transformer
- CMGL: Confidence-guided Multi-omics Graph Learning for Cancer Subtype Classification · cs.LG · arXiv 2604.24201 · score 4 · keywords: rag, fine-tun
- IRIS: Interleaved Reinforcement with Incremental Staged Curriculum for Cross-Lingual Mathematical Reasoning · cs.CL · arXiv 2604.24114 · score 4 · keywords: reasoning, fine-tun
- BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning · cs.CL · arXiv 2604.24089 · score 4 · keywords: serving
- How Sensitive Are Safety Benchmarks to Judge Configuration Choices? · cs.CL · arXiv 2604.24074 · score 4 · keywords: llm
- AgentPulse: A Continuous Multi-Signal Framework for Evaluating AI Agents in Deployment · cs.AI · arXiv 2604.24038 · score 4 · keywords: agent
- FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection · cs.LG · arXiv 2604.24012 · score 4 · keywords: serving
- Adaptive-Distribution Randomized Neural Networks for PDEs: A Low-Dimensional Distribution-Learning Framework · math.NA · arXiv 2604.23999 · score 4 · keywords: serving
- DecompKAN: Decomposed Patch-KAN for Long-Term Time Series Forecasting · cs.LG · arXiv 2604.23968 · score 4 · keywords: attention, transformer
- Crystal structure prediction using graph neural combinatorial optimization · cs.LG · arXiv 2604.23921 · score 4 · keywords: rag, gpu
- Cardiac Stability Theory: An Axiomatically Grounded Framework for Continuous Cardiac Health Monitoring via Smartphone Photoplethysmography · cs.LG · arXiv 2604.23876 · score 4 · keywords: transformer, latency
- Exploring Audio Hallucination in Egocentric Video Understanding · cs.CV · arXiv 2604.23860 · score 4 · keywords: llm
- Focus on What Matters: Two-Stage ROI-Aware Refinement for Anatomy-Preserving Fetal Ultrasound Reconstruction · cs.CV · arXiv 2604.23839 · score 4 · keywords: serving
- Cortex-Inspired Continual Learning: Unsupervised Instantiation and Recovery of Functional Task Networks · cs.LG · arXiv 2604.24637 · score 3 · keywords: inference
- MIMIC: A Generative Multimodal Foundation Model for Biomolecules · cs.AI · arXiv 2604.24506 · score 3 · keywords: inference
- Compilation and Execution of an Embeddable YOLO-NAS on the VTA · cs.AR · arXiv 2604.24455 · score 3 · keywords: compiler
- SPLIT: Separating Physical-Contact via Latent Arithmetic in Image-Based Tactile Sensors · cs.RO · arXiv 2604.24449 · score 3 · keywords: inference
- Scaling Properties of Continuous Diffusion Spoken Language Models · cs.CL · arXiv 2604.24416 · score 3 · keywords: inference
- Model-Free Inference of Investor Preferences: A Relative Entropy IRL Approach · cs.LG · arXiv 2604.24280 · score 3 · keywords: inference
- Speech Enhancement Based on Drifting Models · cs.SD · arXiv 2604.24199 · score 3 · keywords: inference
- Learning to Think from Multiple Thinkers · cs.LG · arXiv 2604.24737 · score 2 · keywords: chain-of-thought
- Déjà Vu Packing: Optimizing FPGA Logic Clustering Runtime via Pattern Memoization · cs.AR · arXiv 2604.24649 · score 2 · keywords: rag
- NeSyCat: A Monad-Based Categorical Semantics of the Neurosymbolic ULLER Framework · cs.AI · arXiv 2604.24612 · score 2 · keywords: reasoning
- Hierarchical Behaviour Spaces · cs.AI · arXiv 2604.24558 · score 2 · keywords: reasoning
- SpotVista: Availability-Aware Recommendation System for Reliable and Cost-Efficient Multi-Node Spot Instances · cs.DC · arXiv 2604.24548 · score 2 · keywords: rag
- A Reward-Free Viewpoint on Multi-Objective Reinforcement Learning · cs.LG · arXiv 2604.24532 · score 2 · keywords: rag
- SceneSelect: Selective Learning for Trajectory Scene Classification and Expert Scheduling · cs.LG · arXiv 2604.24514 · score 2 · keywords: rag
- Modeling Behavioral Intensity and Transitions for Generative Recommendation · cs.IR · arXiv 2604.24472 · score 2 · keywords: attention
- All That Glitters Is Not Audio: Rethinking Text Priors and Audio Reliance in Audio-Language Evaluation · cs.SD · arXiv 2604.24401 · score 2 · keywords: rag
- Few-Shot Cross-Device Transfer for Quantum Noise Modeling on Real Hardware · quant-ph · arXiv 2604.24397 · score 2 · keywords: fine-tun
- PathMoG: A Pathway-Centric Modular Graph Neural Network for Multi-Omics Survival Prediction · cs.LG · arXiv 2604.24371 · score 2 · keywords: attention
- See Further, Think Deeper: Advancing VLM’s Reasoning Ability with Low-level Visual Cues and Reflection · cs.CV · arXiv 2604.24339 · score 2 · keywords: reasoning
- Mitigating Error Amplification in Fast Adversarial Training · cs.LG · arXiv 2604.24332 · score 2 · keywords: rag
- Unconstrained Multi-view Human Pose Estimation with Algebraic Priors · cs.CV · arXiv 2604.24312 · score 2 · keywords: transformer
- IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Extreme Convective Radar Nowcasting · cs.LG · arXiv 2604.24224 · score 2 · keywords: attention
- Seeing Is No Longer Believing: Frontier Image Generation Models, Synthetic Visual Evidence, and Real-World Risk · cs.CL · arXiv 2604.24197 · score 2 · keywords: reasoning
- MemeScouts@LT-EDI 2026: Asking the Right Questions – Prompted Weak Supervision for Meme Hate Speech Detection · cs.CL · arXiv 2604.24179 · score 2 · keywords: reasoning
- A Divergence-Based Method for Weighting and Averaging Model Predictions · stat.ML · arXiv 2604.24172 · score 2 · keywords: rag
- Unfolding an Atomistic World: Atomistic Simulation of Reactor Pressure Vessel Steel Across Year-and-Meter Scales · cs.DC · arXiv 2604.24091 · score 2 · keywords: rag
- A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws · cs.LG · arXiv 2604.24037 · score 2 · keywords: rag
- KubePACS: Kubernetes Cluster Using Performant, Highly Available, and Cost Efficient Spot Instances · cs.DC · arXiv 2604.24027 · score 2 · keywords: rag
- Geometry-Aware Offline-to-Online Learning in Linear Contextual Bandits · cs.LG · arXiv 2604.24016 · score 2 · keywords: rag
- SDSL-Solver: Scalable Distributed Sparse Linear Solvers for Large-Scale Interior Point Methods · cs.DC · arXiv 2604.23979 · score 2 · keywords: rag
- Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction · cs.LG · arXiv 2604.23964 · score 2 · keywords: attention
- Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach · cs.CV · arXiv 2604.23953 · score 2 · keywords: rag
- KOMBO: Korean Character Representations Based on the Combination Rules of Subcharacters · cs.CL · arXiv 2604.23948 · score 2 · keywords: rag
- Sliced-Regularized Optimal Transport · stat.ML · arXiv 2604.23944 · score 2 · keywords: rag
- Quasi-Quadratic Gradient: A New Direction for Accelerating the BFGS Method in Quasi-Newton Optimization · math.OC · arXiv 2604.23922 · score 2 · keywords: rag
- Machine Learning and Deep Learning Models for Short Term Electricity Price Forecasting in Australia’s National Electricity Market · cs.LG · arXiv 2604.23908 · score 2 · keywords: transformer
- Learning Interpretable PDE Representations for Generative Reconstructions with Structured Sparsity · cs.LG · arXiv 2604.23867 · score 2 · keywords: rag
- Domain-Filtered Knowledge Graphs from Sparse Autoencoder Features · cs.AI · arXiv 2604.23829 · score 2 · keywords: reasoning