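2026-04-28 Paper Digest

206 arXiv papers matched the agent / LLM / AI infrastructure topics for this day. Claude selected 10 of them based on title, authors, and affiliations and generated a detailed analysis for each; the remaining 196 are listed at the end.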

1. FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training

arXiv: 2604.24013 · cs.LG · Claude's pick

The rapid growth in the size of large language models has necessitated the partitioning of computational workloads across accelerators such as GPUs, TPUs, and NPUs. However, these parallelization strategies incur substantial data communication overhead, significantly hindering computational efficiency.

Read the full analysis →


2. Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling

arXiv: 2604.24715 · cs.CL · Claude's pick

Hybrid sequence models that combine efficient Transformer components with linear sequence modeling blocks are a promising alternative to pure Transformers, but most are still pretrained from scratch and therefore fail to reuse existing Transformer checkpoints. We study upcycling as a practical path to convert pretrained Transformer LLMs into hybrid architectures while preserving short-context quality and improving long-context capability.

Read the full analysis →


3. The Chameleon’s Limit: Investigating Persona Collapse and Homogenization in Large Language Models

arXiv: 2604.24698 · cs.CL · Claude's pick

Applications based on large language models (LLMs), such as multi-agent simulations, require population diversity among agents. We identify a pervasive failure mode we term Persona Collapse: agents each assigned a distinct profile nonetheless converge into a narrow behavioral mode, producing a homogeneous simulated population.

Read the full analysis →


4. Stabilizing Efficient Reasoning with Step-Level Advantage Selection

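arXiv: 2604.24003 · cs.CL · Claude's pick

In 4K short-context GRPO post-training, a step-level confidence derived from token log-probs is used to zero-mask advantages inside each rollout, which stabilizes training and compresses reasoning length.

Read the full analysis →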
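
The step-level masking idea in the summary above can be illustrated with a toy sketch. This is not the paper's method: the summary does not say how confidence is defined or in which direction the mask applies. The sketch assumes confidence is the mean token log-prob of a step and that steps below a threshold get zero advantage. The names `step_confidence`, `mask_advantages`, and `threshold` are hypothetical.

```python
from typing import List, Tuple


def step_confidence(token_logprobs: List[float], span: Tuple[int, int]) -> float:
    """Mean token log-prob over one step (an assumed definition of confidence)."""
    start, end = span
    return sum(token_logprobs[start:end]) / (end - start)


def mask_advantages(
    token_logprobs: List[float],
    step_spans: List[Tuple[int, int]],
    advantage: float,
    threshold: float,
) -> List[float]:
    """Spread a rollout-level advantage over tokens, zeroing low-confidence steps.

    Assumption: a step keeps the advantage only if its confidence is at or
    above `threshold`; the actual selection rule in the paper may differ.
    """
    masked = [0.0] * len(token_logprobs)
    for start, end in step_spans:
        if step_confidence(token_logprobs, (start, end)) >= threshold:
            for i in range(start, end):
                masked[i] = advantage
    return masked


# Three steps covering five tokens; the middle step has low confidence.
logprobs = [-0.1, -0.2, -2.0, -3.0, -0.3]
spans = [(0, 2), (2, 4), (4, 5)]
per_token = mask_advantages(logprobs, spans, advantage=1.5, threshold=-1.0)
```

Tokens in masked steps contribute no policy gradient, which is the general effect a zero-valued advantage mask has in GRPO-style objectives.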


5. PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model

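arXiv: 2604.24443 · cs.AI · Claude's pick

PhysNote lets a VLM externalize and evolve its physical reasoning knowledge through self-generated "Knowledge Notes". Combined with spatio-temporal canonicalization and iterative verification by an InfoAgent, it reaches 56.68% accuracy on the PhysBench test set.

Read the full analysis →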


6. Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer

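arXiv: 2604.24062 · cs.AI · Claude's pick

Using the OpenLock paradigm to compare humans with GPT-5.2 / Claude-4.5 / Gemini-3-Flash / DeepSeek-V3.2, the study finds that models can match or exceed humans within a single environment, but transfer of causal structure across environments only takes effect after "environment grounding", so transfer is delayed.

Read the full analysis →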


7. Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

arXiv: 2604.24512 · cs.AI · Claude's pick

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations has emerged as an architectural bottleneck. We identify and formalize a systemic failure mode termed the Attention Latch in decoder-only autoregressive Transformers.

Read the full analysis →


8. DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference

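arXiv: 2604.24647 · cs.CL · Claude's pick

DepthKV observes that Transformer layers differ markedly in their sensitivity to KV cache pruning. It allocates a fixed global budget non-uniformly across layers according to representation metrics such as InfoNCE, and consistently outperforms uniform pruning on summarization, QA, and mathematical reasoning tasks.

Read the full analysis →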
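
To make "non-uniform allocation under a fixed global budget" concrete, here is a small sketch. It is not DepthKV's algorithm: the summary does not describe how layer scores map to budgets. The sketch assumes each layer already has a positive sensitivity score (placeholder numbers below) and splits the budget proportionally, using largest-remainder rounding so the per-layer budgets sum exactly to the global one. The name `allocate_budget` is hypothetical.

```python
from typing import List


def allocate_budget(sensitivity: List[float], total_budget: int) -> List[int]:
    """Split `total_budget` KV entries across layers in proportion to sensitivity.

    Assumption: proportional allocation; scores are positive. Rounding uses the
    largest-remainder rule so the result sums exactly to `total_budget`.
    """
    total = sum(sensitivity)
    raw = [s / total * total_budget for s in sensitivity]
    budgets = [int(r) for r in raw]  # floor of each share
    leftover = total_budget - sum(budgets)
    # Hand the remaining entries to the layers with the largest fractional part.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - budgets[i], reverse=True)
    for i in order[:leftover]:
        budgets[i] += 1
    return budgets


# Placeholder scores for three layers; a uniform split would give each the same share.
layer_budgets = allocate_budget([1.0, 3.0, 4.0], total_budget=80)
```

More sensitive layers keep more cache entries, while the total kept across layers stays the same as under a uniform split.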


9. BitRL: Reinforcement Learning with 1-bit Quantized Language Models for Resource-Constrained Edge Deployment

arXiv: 2604.24273 · cs.LG · Claude's pick

The deployment of intelligent reinforcement learning (RL) agents on resource-constrained edge devices remains a fundamental challenge due to the substantial memory, computational, and energy requirements of modern deep learning systems. While large language models (LLMs) have emerged as powerful architectures for decision-making agents, their multi-billion parameter scale confines them to cloud-based deployment, raising concerns about latency, privacy, and connectivity dependence.

Read the full analysis →


10. AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents

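arXiv: 2604.24039 · cs.LG · Claude's pick

AgenticCache exploits the "plan locality" of embodied tasks: a 2-gram plan cache plus a background asynchronous LLM updater lets the agent avoid calling the LLM at every step. Across four multi-agent benchmarks it averages +22% success rate, -65% latency, and -50% tokens.

Read the full analysis →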
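
A rough sketch of what a 2-gram plan cache could look like follows. It is an illustration under assumptions, not AgenticCache's implementation: the cache maps the previous plan step to its most frequently observed successor, and the LLM is called only on a cache miss. The background asynchronous updater mentioned in the summary is left out, and `BigramPlanCache`, `plan_next`, and `call_llm` are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Optional


class BigramPlanCache:
    """Counts (previous step -> next step) pairs and returns the most common successor."""

    def __init__(self) -> None:
        self._counts: Dict[str, Counter] = defaultdict(Counter)

    def update(self, prev_step: str, next_step: str) -> None:
        self._counts[prev_step][next_step] += 1

    def lookup(self, prev_step: str) -> Optional[str]:
        followers = self._counts.get(prev_step)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


def plan_next(cache: BigramPlanCache, prev_step: str, call_llm: Callable[[str], str]) -> str:
    """Serve the next step from the cache; fall back to the (stand-in) LLM on a miss."""
    cached = cache.lookup(prev_step)
    if cached is not None:
        return cached
    planned = call_llm(prev_step)
    cache.update(prev_step, planned)
    return planned
```

In this simplified version the cache is filled synchronously on a miss; a design closer to the summary would refresh entries from a background task so the agent never blocks on the LLM.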


Other papers matched that day

These papers hit the same topic keywords but were not selected by Claude for the top-N in-depth analysis.

  1. Green Shielding: A User-Centric Approach Towards Trustworthy AI · cs.CL · arXiv 2604.24700 · score 27 · large language model, llm, agent, agentic, rag, serving
  2. EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce · cs.CL · arXiv 2604.23993 · score 27 · llm, agent, agentic, multi-agent, retrieval, reasoning
  3. JigsawRL: Assembling RL Pipelines for Efficient LLM Post-Training · cs.LG · arXiv 2604.23838 · score 25 · llm, agent, agentic, rag, parallelism, gpu
  4. Kwai Summary Attention Technical Report · cs.CL · arXiv 2604.24432 · score 24 · large language model, agent, agentic, reasoning, inference, kv cache
  5. Defusing the Trigger: Plug-and-Play Defense for Backdoored LLMs via Tail-Risk Intrinsic Geometric Smoothing · cs.CR · arXiv 2604.24162 · score 24 · large language model, llm, rag, reasoning, inference, serving
  6. RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation · cs.SE · arXiv 2604.24218 · score 23 · large language model, llm, agent, agentic, multi-agent, rag
  7. Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis · cs.CL · arXiv 2604.24198 · score 26 · large language model, llm, agent, agentic, reasoning, inference
  8. Constraint-Guided Multi-Agent Decompilation for Executable Binary Recovery · cs.SE · arXiv 2604.23940 · score 21 · llm, agent, agentic, multi-agent, rag, compiler
  9. GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems · cs.CR · arXiv 2604.24477 · score 20 · large language model, llm, agent, multi-agent, inference
  10. Agentic Witnessing: Pragmatic and Scalable TEE-Enabled Privacy-Preserving Auditing · cs.CR · arXiv 2604.24203 · score 20 · llm, agent, agentic, rag, reasoning, serving
  11. Skill Retrieval Augmentation for Agentic AI · cs.CL · arXiv 2604.24594 · score 19 · large language model, llm, agent, agentic, retrieval
  12. DPEPO: Diverse Parallel Exploration Policy Optimization for LLM-based Agents · cs.CL · arXiv 2604.24320 · score 19 · large language model, llm, agent, rag, reasoning, fine-tun
  13. FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data · cs.AI · arXiv 2604.24572 · score 18 · llm, agent, agentic, multi-agent, reasoning
  14. Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus · cs.AI · arXiv 2604.24473 · score 26 · llm, agent, agentic, retrieval, rag, reasoning
  15. Leveraging LLMs for Multi-File DSL Code Generation: An Industrial Case Study · cs.SE · arXiv 2604.24678 · score 21 · large language model, llm, rag, serving, fine-tun
  16. Strategic Bidding in 6G Spectrum Auctions with Large Language Models · cs.GT · arXiv 2604.24156 · score 17 · large language model, llm, agent, rag, reasoning
  17. MEMCoder: Multi-dimensional Evolving Memory for Private-Library-Oriented Code Generation · cs.SE · arXiv 2604.24222 · score 24 · large language model, llm, retrieval, rag, inference
  18. Latency and Cost of Multi-Agent Intelligent Tutoring at Scale · cs.CY · arXiv 2604.24110 · score 16 · llm, agent, multi-agent, throughput, latency
  19. The Pragmatic Persona: Discovering LLM Persona through Bridging Inference · cs.CL · arXiv 2604.24079 · score 20 · large language model, llm, rag, reasoning, inference
  20. LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People · cs.AI · arXiv 2604.23970 · score 16 · llm, agent, agentic, multi-agent
  21. XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation · cs.AI · arXiv 2604.24623 · score 15 · large language model, llm, retrieval, rag, reasoning
  22. SEARCH-R: Structured Entity-Aware Retrieval with Chain-of-Reasoning Navigator for Multi-hop Question Answering · cs.CL · arXiv 2604.24515 · score 15 · large language model, llm, retrieval, reasoning, fine-tun
  23. OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents · cs.CL · arXiv 2604.24348 · score 15 · large language model, llm, agent, latency
  24. Generating Place-Based Compromises Between Two Points of View · cs.CL · arXiv 2604.24536 · score 14 · large language model, llm, reasoning, inference
  25. SeaEvo: Advancing Algorithm Discovery with Strategy Space Evolution · cs.CL · arXiv 2604.24372 · score 14 · llm, agent, retrieval, ai system
  26. ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems · cs.AI · arXiv 2604.23878 · score 14 · llm, agent, rag, ai system
  27. Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models · cs.IR · arXiv 2604.24608 · score 13 · large language model, llm, rag, attention
  28. MEG-RAG: Quantifying Multi-modal Evidence Grounding for Evidence Selection in RAG · cs.CL · arXiv 2604.24564 · score 13 · large language model, llm, retrieval, rag
  29. Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations · cs.AI · arXiv 2604.24562 · score 13 · large language model, llm, rag, reasoning
  30. Why AI Harms Can’t Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality · cs.CY · arXiv 2604.24519 · score 13 · large language model, llm, ai system
  31. A Multi-Dimensional Audit of Politically Aligned Large Language Models · cs.CL · arXiv 2604.24429 · score 13 · large language model, llm, reasoning, fine-tun
  32. Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion · cs.LG · arXiv 2604.24351 · score 13 · rag, inference, serving, kv-cache
  33. MultiDx: A Multi-Source Knowledge Integration Framework towards Diagnostic Reasoning · cs.CL · arXiv 2604.24186 · score 13 · large language model, llm, rag, reasoning
  34. Coverage-Based Calibration for Post-Training Quantization via Weighted Set Cover over Outlier Channels · cs.LG · arXiv 2604.24008 · score 13 · large language model, rag, quantization, gpu, post-train
  35. Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning · cs.LG · arXiv 2604.23987 · score 13 · large language model, llm, rag, fine-tun
  36. What Did They Mean? How LLMs Resolve Ambiguous Social Situations across Perspectives and Roles · cs.HC · arXiv 2604.23942 · score 13 · large language model, llm, serving
  37. Generative Synthetic Data for Causal Inference: Pitfalls, Remedies, and Opportunities · stat.ME · arXiv 2604.23904 · score 13 · llm, rag, inference, serving
  38. Evaluation of Prompt Injection Defenses in Large Language Models · cs.CR · arXiv 2604.23887 · score 13 · large language model, llm, ai system
  39. Knowledge Vector of Logical Reasoning in Large Language Models · cs.CL · arXiv 2604.23877 · score 13 · large language model, llm, rag, reasoning
  40. Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models · cs.LG · arXiv 2604.24708 · score 12 · rag, serving, attention, gpu, scheduler
  41. Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft · cs.AI · arXiv 2604.24697 · score 12 · llm, agent, ai system
  42. The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications · cs.AI · arXiv 2604.24668 · score 12 · llm, agent, agentic
  43. Evaluating whether AI models would sabotage AI safety research · cs.AI · arXiv 2604.24618 · score 12 · llm, agent, rag, reasoning
  44. Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models · cs.CR · arXiv 2604.24542 · score 12 · large language model, llm, inference
  45. From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills · cs.CL · arXiv 2604.24026 · score 12 · llm, agent, rag, reasoning
  46. QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems · cs.AI · arXiv 2604.24021 · score 12 · llm, multi-agent, ai system
  47. Fix Initial Codes and Iteratively Refine Textual Directions Toward Safe Multi-Turn Code Correction · cs.LG · arXiv 2604.23989 · score 12 · large language model, llm, inference
  48. TSAssistant: A Human-in-the-Loop Agentic Framework for Automated Target Safety Assessment · cs.CL · arXiv 2604.23938 · score 12 · agent, agentic, multi-agent
  49. Inverting Foundation Models of Brain Function with Simulation-Based Inference · cs.LG · arXiv 2604.23865 · score 12 · large language model, llm, inference
  50. Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination · cs.CL · arXiv 2604.24690 · score 11 · large language model, llm, reasoning
  51. Benchmarking Source-Sensitive Reasoning in Turkish: Humans and LLMs under Evidential Trust Manipulation · cs.CL · arXiv 2604.24665 · score 11 · large language model, llm, reasoning
  52. K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology · cs.CL · arXiv 2604.24645 · score 11 · large language model, agent, reasoning
  53. STELLAR-E: a Synthetic, Tailored, End-to-end LLM Application Rigorous Evaluator · cs.AI · arXiv 2604.24544 · score 11 · large language model, llm, rag
  54. A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations · cs.CR · arXiv 2604.24468 · score 11 · large language model, llm, fine-tun
  55. Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation · cs.CL · arXiv 2604.24361 · score 11 · large language model, llm, rag
  56. AdapTime: Enabling Adaptive Temporal Reasoning in Large Language Models · cs.CL · arXiv 2604.24175 · score 11 · large language model, llm, reasoning
  57. TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training · cs.DC · arXiv 2604.24088 · score 11 · llm, parallelism, quantization, throughput
  58. QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering · cs.CV · arXiv 2604.24052 · score 11 · large language model, llm, rag
  59. Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs · cs.AI · arXiv 2604.23949 · score 11 · large language model, llm, rag
  60. SMSI: System Model Security Inference: Automated Threat Modeling for Cyber-Physical Systems · cs.CR · arXiv 2604.23905 · score 11 · llm, retrieval, inference, fine-tun
  61. LLM-Augmented Traffic Signal Control with LSTM-Based Traffic State Prediction and Safety-Constrained Decision Support · cs.AI · arXiv 2604.23902 · score 11 · large language model, llm, reasoning
  62. ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation · cs.AI · arXiv 2604.23853 · score 11 · llm, agent, tool use
  63. One Size Fits None: Heuristic Collapse in LLM Investment Advice · cs.CL · arXiv 2604.23837 · score 11 · large language model, llm, reasoning
  64. Resource-Lean Lexicon Induction for German Dialects · cs.CL · arXiv 2604.23824 · score 11 · large language model, llm, retrieval
  65. Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters · cs.AI · arXiv 2604.24710 · score 10 · llm, agent, rag
  66. GradMAP: Gradient-Based Multi-Agent Proximal Learning for Grid-Edge Flexibility · cs.LG · arXiv 2604.24549 · score 10 · agent, multi-agent, gpu
  67. DPRM: A Plug-in Doob h transform-induced Token-Ordering Module for Diffusion Language Models · cs.LG · arXiv 2604.24357 · score 10 · llm, rag, reasoning, post-train
  68. The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers · cs.CY · arXiv 2604.24155 · score 10 · agent, reasoning, ai system
  69. Improving Robustness of Tabular Retrieval via Representational Stability · cs.CL · arXiv 2604.24040 · score 10 · retrieval, rag, serving, transformer
  70. Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents · cs.AI · arXiv 2604.23990 · score 10 · agent, rag, serving
  71. GAMED.AI: A Hierarchical Multi-Agent Framework for Automated Educational Game Generation · cs.AI · arXiv 2604.23947 · score 10 · agent, multi-agent, reasoning
  72. Defective Task Descriptions in LLM-Based Code Generation: Detection and Analysis · cs.SE · arXiv 2604.24703 · score 9 · large language model, llm
  73. AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents · cs.CR · arXiv 2604.24657 · score 9 · large language model, agent
  74. Zero-shot Large Language Models for Automatic Readability Assessment · cs.CL · arXiv 2604.24470 · score 9 · large language model, llm
  75. Can You Make It Sound Like You? Post-Editing LLM-Generated Text for Personal Style · cs.CL · arXiv 2604.24444 · score 9 · large language model, llm
  76. Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment · cs.LG · arXiv 2604.24178 · score 9 · large language model, llm
  77. Progressive Approximation in Deep Residual Networks: Theory and Validation · cs.LG · arXiv 2604.24154 · score 9 · llm, inference, transformer
  78. An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress · cs.AI · arXiv 2604.24076 · score 9 · large language model, llm
  79. A2DEPT: Large Language Model-Driven Automated Algorithm Design via Evolutionary Program Trees · cs.AI · arXiv 2604.24043 · score 9 · large language model, llm
  80. Poster: ClawdGo: Endogenous Security Awareness Training for Autonomous AI Agents · cs.CR · arXiv 2604.24020 · score 9 · agent, rag, inference
  81. IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models · cs.HC · arXiv 2604.24002 · score 9 · agent, reasoning, inference
  82. When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models · cs.LG · arXiv 2604.23994 · score 9 · llm, inference, attention
  83. Representational Curvature Modulates Behavioral Uncertainty in Large Language Models · cs.AI · arXiv 2604.23985 · score 9 · large language model, llm
  84. Translate or Simplify First: An Analysis of Cross-lingual Text Simplification in English and French · cs.CL · arXiv 2604.23844 · score 9 · large language model, llm
  85. Scalable Production Scheduling: Linear Complexity via Unified Homogeneous Graphs · cs.LG · arXiv 2604.23841 · score 9 · agent, inference, latency
  86. Less Is More: Engineering Challenges of On-Device Small Language Model Integration in a Mobile Application · cs.SE · arXiv 2604.24636 · score 8 · llm, rag, latency
  87. Interoceptive machine framework: Toward interoception-inspired regulatory architectures in artificial intelligence · cs.AI · arXiv 2604.24527 · score 8 · agent, ai system
  88. Measuring Successful Cooperation in Human-AI Teamwork: Development and Validation of the Perceived Cooperativity and Teaming Perception Scales · cs.HC · arXiv 2604.24461 · score 8 · llm, agent
  89. Characterizing Vision-Language-Action Models across XPUs: Constraints and Acceleration for On-Robot Deployment · cs.RO · arXiv 2604.24447 · score 8 · inference, parallelism, gpu
  90. Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering · cs.CL · arXiv 2604.24334 · score 8 · retrieval, rag, serving
  91. Adaptive ToR: Complexity-Aware Tree-Based Retrieval for Pareto-Optimal Multi-Intent NLU · cs.AI · arXiv 2604.24219 · score 8 · llm, retrieval, latency
  92. Right-to-Act: A Pre-Execution Non-Compensatory Decision Protocol for AI Systems · cs.AI · arXiv 2604.24153 · score 8 · serving, ai system
  93. An Analysis of the Coordination Gap between Joint and Modular Learning for Job Shop Scheduling with Transportation Resources · cs.AI · arXiv 2604.24117 · score 8 · agent, multi-agent
  94. FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost · cs.LG · arXiv 2604.24073 · score 8 · rag, distributed training, gpu
  95. DeepTaxon: An Interpretable Retrieval-Augmented Multimodal Framework for Unified Species Identification and Discovery · cs.CV · arXiv 2604.24029 · score 8 · retrieval, reasoning, chain-of-thought, fine-tun
  96. Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions · cs.AI · arXiv 2604.23924 · score 8 · agent, agentic
  97. MarketBench: Evaluating AI Agents as Market Participants · cs.AI · arXiv 2604.23897 · score 8 · llm, agent
  98. Geometry Preserving Loss Functions Promote Improved Adaptation of Blackbox Generative Model · cs.LG · arXiv 2604.23888 · score 8 · rag, serving, fine-tun
  99. Graph Memory Transformer (GMT) · cs.LG · arXiv 2604.23862 · score 8 · serving, attention, transformer
  100. Does Machine Unlearning Preserve Clinical Safety? A Risk Analysis for Medical Image Classification · cs.AI · arXiv 2604.23854 · score 8 · serving, attention, fine-tun
  101. The Last Human-Written Paper: Agent-Native Research Artifacts · cs.LG · arXiv 2604.24658 · score 7 · agent, compiler
  102. CF-VLA: Efficient Coarse-to-Fine Action Generation for Vision-Language-Action Policies · cs.CV · arXiv 2604.24622 · score 7 · rag, inference, latency
  103. Global Context or Local Detail? Adaptive Visual Grounding for Hallucination Mitigation · cs.CV · arXiv 2604.24396 · score 7 · rag, inference, attention
  104. AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation · cs.RO · arXiv 2604.24086 · score 7 · inference, latency, fine-tun
  105. Architectural Isolation as a Timing Safety Primitive for Edge AI Medical Devices: Controlled Experimental Evidence on a Shared-Silicon Platform · cs.AR · arXiv 2604.23831 · score 7 · inference, gpu, latency
  106. Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents · cs.AI · arXiv 2604.24686 · score 6 · agent, rag
  107. Meta-CoT: Enhancing Granularity and Generalization in Image Editing · cs.CV · arXiv 2604.24625 · score 6 · rag, reasoning, chain-of-thought
  108. GSC-QEMit: A Telemetry-Driven Hierarchical Forecast-and-Bandit Framework for Adaptive Quantum Error Mitigation · quant-ph · arXiv 2604.24551 · score 6 · rag, serving
  109. Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI · cs.CV · arXiv 2604.24492 · score 6 · latency, fine-tun, post-train
  110. Incisor: Ex Ante Cloud Instance Selection for HPC Jobs · cs.DC · arXiv 2604.24464 · score 6 · llm, reasoning
  111. Certified geometric robustness – Super-DeepG · cs.AI · arXiv 2604.24379 · score 6 · rag, reasoning, gpu
  112. Learning Evidence of Depression Symptoms via Prompt Induction · cs.CL · arXiv 2604.24376 · score 6 · llm, fine-tun
  113. SAGE: Sparse Adaptive Guidance for Dependency-Aware Tabular Data Generation · cs.LG · arXiv 2604.24368 · score 6 · llm, rag
  114. SolarTformer: A Transformer Based Deep Learning Approach for Short Term Solar Power Forecasting · cs.LG · arXiv 2604.24306 · score 6 · rag, attention, transformer
  115. Multi-Dimensional Evaluation of Sustainable City Trips with LLM-as-a-Judge and Human-in-the-Loop · cs.AI · arXiv 2604.24158 · score 6 · llm, reasoning
  116. Leveraging Human Feedback for Semantically-Relevant Skill Discovery · cs.LG · arXiv 2604.24127 · score 6 · agent, rag
  117. Psychologically-Grounded Graph Modeling for Interpretable Depression Detection · cs.CL · arXiv 2604.24126 · score 6 · llm, attention
  118. Factual and Edit-Sensitive Graph-to-Sequence Generation via Graph-Aware Adaptive Noising · cs.CL · arXiv 2604.24104 · score 6 · llm, fine-tun
  119. Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B · cs.CL · arXiv 2604.24070 · score 6 · llm, fine-tun
  120. TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents · cs.LG · arXiv 2604.24005 · score 6 · agent, reasoning
  121. Hindsight Preference Optimization for Financial Time Series Advisory · cs.LG · arXiv 2604.23988 · score 6 · llm, reasoning
  122. Quantum Knowledge Graph: Modeling Context-Dependent Triplet Validity · cs.CL · arXiv 2604.23972 · score 6 · llm, reasoning
  123. Do Quantum Transformers Help? A Systematic VQC Architecture Comparison on Tabular Benchmarks · quant-ph · arXiv 2604.23931 · score 6 · rag, attention, transformer
  124. Gromov-Wasserstein Methods for Multi-View Relational Embedding and Clustering · cs.LG · arXiv 2604.23912 · score 6 · rag, serving
  125. Learning Selective LLM Autonomy from Copilot Feedback in Enterprise Customer Support Workflows · cs.CL · arXiv 2604.23855 · score 6 · llm, rag
  126. Contextual Linear Activation Steering of Language Models · cs.CL · arXiv 2604.24693 · score 5 · large language model
  127. MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational and Progressive Information Chaining · cs.CL · arXiv 2604.24374 · score 5 · inference, attention
  128. Machine-Learning-Based Classification of Radio Frequency Building Loss · cs.LG · arXiv 2604.24143 · score 5 · rag, inference
  129. PeeriScope: A Multi-Faceted Framework for Evaluating Peer Review Quality · cs.CL · arXiv 2604.24071 · score 5 · large language model
  130. Integrative neurocybernetic modeling in the era of large-scale neuroscience · q-bio.NC · arXiv 2604.23903 · score 5 · rag, inference
  131. Conflict-Aware Harmonized Rotational Gradient for Multiscale Kinetic Regimes · cs.LG · arXiv 2604.24745 · score 4 · serving
  132. Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling · cs.AI · arXiv 2604.24717 · score 4 · attention, transformer
  133. Fraud Detection in Cryptocurrency Markets with Spatio-Temporal Graph Neural Networks · cs.LG · arXiv 2604.24590 · score 4 · attention, transformer
  134. A systematic evaluation of vision-language models for observational astronomical reasoning tasks · cs.AI · arXiv 2604.24589 · score 4 · reasoning, attention
  135. Dialysis Risk Prediction and Treatment Effect Estimation for AKI patients using Longitudinal Electronic Health Records · cs.LG · arXiv 2604.24547 · score 4 · rag, transformer
  136. Understanding the Limits of Automated Evaluation for Code Review Bots in Practice · cs.SE · arXiv 2604.24525 · score 4 · llm
  137. ARETE: Attention-based Rasterized Encoding for Topology Estimation using HSV-transformed Crowdsourced Vehicle Fleet Data · cs.CV · arXiv 2604.24353 · score 4 · attention, transformer
  138. Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training · cs.LG · arXiv 2604.24350 · score 4 · rag, attention
  139. Semantic Segmentation for Histopathology using Learned Regularization based on Global Proportions · eess.IV · arXiv 2604.24347 · score 4 · rag, transformer
  140. Exact, Efficient, and Reliable Multi-Objective and Multi-Constrained IoT Workflow Scheduling in Edge-Hub-Cloud Cyber-Physical Systems · cs.DC · arXiv 2604.24340 · score 4 · rag, latency
  141. Perfecting Aircraft Maneuvers with Reinforcement Learning · cs.LG · arXiv 2604.24338 · score 4 · agent
  142. X-NegoBox: An Explainable Privacy-Budget Negotiation Framework for Secure Peer-to-Peer Energy Data Exchange · cs.CR · arXiv 2604.24326 · score 4 · serving
  143. Differentiable Faithfulness Alignment for Cross-Model Circuit Transfer · cs.CL · arXiv 2604.24302 · score 4 · retrieval, reasoning
  144. Latent-Hysteresis Graph ODEs: Modeling Coupled Topology-Feature Evolution via Continuous Phase Transitions · cs.LG · arXiv 2604.24293 · score 4 · serving
  145. RowHammer Vulnerability Counter (RVC): Redefining RowHammer Detection with Victim-Centric Tracking · cs.CR · arXiv 2604.24287 · score 4 · rag, latency
  146. Deep Learning-Enabled Dissolved Oxygen Sensing in Biofouling Environments for Ocean Monitoring · eess.IV · arXiv 2604.24236 · score 4 · rag, transformer
  147. CMGL: Confidence-guided Multi-omics Graph Learning for Cancer Subtype Classification · cs.LG · arXiv 2604.24201 · score 4 · rag, fine-tun
  148. IRIS: Interleaved Reinforcement with Incremental Staged Curriculum for Cross-Lingual Mathematical Reasoning · cs.CL · arXiv 2604.24114 · score 4 · reasoning, fine-tun
  149. BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning · cs.CL · arXiv 2604.24089 · score 4 · serving
  150. How Sensitive Are Safety Benchmarks to Judge Configuration Choices? · cs.CL · arXiv 2604.24074 · score 4 · llm
  151. AgentPulse: A Continuous Multi-Signal Framework for Evaluating AI Agents in Deployment · cs.AI · arXiv 2604.24038 · score 4 · agent
  152. FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection · cs.LG · arXiv 2604.24012 · score 4 · serving
  153. Adaptive-Distribution Randomized Neural Networks for PDEs: A Low-Dimensional Distribution-Learning Framework · math.NA · arXiv 2604.23999 · score 4 · serving
  154. DecompKAN: Decomposed Patch-KAN for Long-Term Time Series Forecasting · cs.LG · arXiv 2604.23968 · score 4 · attention, transformer
  155. Crystal structure prediction using graph neural combinatorial optimization · cs.LG · arXiv 2604.23921 · score 4 · rag, gpu
  156. Cardiac Stability Theory: An Axiomatically Grounded Framework for Continuous Cardiac Health Monitoring via Smartphone Photoplethysmography · cs.LG · arXiv 2604.23876 · score 4 · transformer, latency
  157. Exploring Audio Hallucination in Egocentric Video Understanding · cs.CV · arXiv 2604.23860 · score 4 · llm
  158. Focus on What Matters: Two-Stage ROI-Aware Refinement for Anatomy-Preserving Fetal Ultrasound Reconstruction · cs.CV · arXiv 2604.23839 · score 4 · serving
  159. Cortex-Inspired Continual Learning: Unsupervised Instantiation and Recovery of Functional Task Networks · cs.LG · arXiv 2604.24637 · score 3 · inference
  160. MIMIC: A Generative Multimodal Foundation Model for Biomolecules · cs.AI · arXiv 2604.24506 · score 3 · inference
  161. Compilation and Execution of an Embeddable YOLO-NAS on the VTA · cs.AR · arXiv 2604.24455 · score 3 · compiler
  162. SPLIT: Separating Physical-Contact via Latent Arithmetic in Image-Based Tactile Sensors · cs.RO · arXiv 2604.24449 · score 3 · inference
  163. Scaling Properties of Continuous Diffusion Spoken Language Models · cs.CL · arXiv 2604.24416 · score 3 · inference
  164. Model-Free Inference of Investor Preferences: A Relative Entropy IRL Approach · cs.LG · arXiv 2604.24280 · score 3 · inference
  165. Speech Enhancement Based on Drifting Models · cs.SD · arXiv 2604.24199 · score 3 · inference
  166. Learning to Think from Multiple Thinkers · cs.LG · arXiv 2604.24737 · score 2 · chain-of-thought
  167. Déjà Vu Packing: Optimizing FPGA Logic Clustering Runtime via Pattern Memoization · cs.AR · arXiv 2604.24649 · score 2 · rag
  168. NeSyCat: A Monad-Based Categorical Semantics of the Neurosymbolic ULLER Framework · cs.AI · arXiv 2604.24612 · score 2 · reasoning
  169. Hierarchical Behaviour Spaces · cs.AI · arXiv 2604.24558 · score 2 · reasoning
  170. SpotVista: Availability-Aware Recommendation System for Reliable and Cost-Efficient Multi-Node Spot Instances · cs.DC · arXiv 2604.24548 · score 2 · rag
  171. A Reward-Free Viewpoint on Multi-Objective Reinforcement Learning · cs.LG · arXiv 2604.24532 · score 2 · rag
  172. SceneSelect: Selective Learning for Trajectory Scene Classification and Expert Scheduling · cs.LG · arXiv 2604.24514 · score 2 · rag
  173. Modeling Behavioral Intensity and Transitions for Generative Recommendation · cs.IR · arXiv 2604.24472 · score 2 · attention
  174. All That Glitters Is Not Audio: Rethinking Text Priors and Audio Reliance in Audio-Language Evaluation · cs.SD · arXiv 2604.24401 · score 2 · rag
  175. Few-Shot Cross-Device Transfer for Quantum Noise Modeling on Real Hardware · quant-ph · arXiv 2604.24397 · score 2 · fine-tun
  176. PathMoG: A Pathway-Centric Modular Graph Neural Network for Multi-Omics Survival Prediction · cs.LG · arXiv 2604.24371 · score 2 · attention
  177. See Further, Think Deeper: Advancing VLM’s Reasoning Ability with Low-level Visual Cues and Reflection · cs.CV · arXiv 2604.24339 · score 2 · reasoning
  178. Mitigating Error Amplification in Fast Adversarial Training · cs.LG · arXiv 2604.24332 · score 2 · rag
  179. Unconstrained Multi-view Human Pose Estimation with Algebraic Priors · cs.CV · arXiv 2604.24312 · score 2 · transformer
  180. IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Extreme Convective Radar Nowcasting · cs.LG · arXiv 2604.24224 · score 2 · attention
  181. Seeing Is No Longer Believing: Frontier Image Generation Models, Synthetic Visual Evidence, and Real-World Risk · cs.CL · arXiv 2604.24197 · score 2 · reasoning
  182. MemeScouts@LT-EDI 2026: Asking the Right Questions – Prompted Weak Supervision for Meme Hate Speech Detection · cs.CL · arXiv 2604.24179 · score 2 · reasoning
  183. A Divergence-Based Method for Weighting and Averaging Model Predictions · stat.ML · arXiv 2604.24172 · score 2 · rag
  184. Unfolding an Atomistic World: Atomistic Simulation of Reactor Pressure Vessel Steel Across Year-and-Meter Scales · cs.DC · arXiv 2604.24091 · score 2 · rag
  185. A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws · cs.LG · arXiv 2604.24037 · score 2 · rag
  186. KubePACS: Kubernetes Cluster Using Performant, Highly Available, and Cost Efficient Spot Instances · cs.DC · arXiv 2604.24027 · score 2 · rag
  187. Geometry-Aware Offline-to-Online Learning in Linear Contextual Bandits · cs.LG · arXiv 2604.24016 · score 2 · rag
  188. SDSL-Solver: Scalable Distributed Sparse Linear Solvers for Large-Scale Interior Point Methods · cs.DC · arXiv 2604.23979 · score 2 · rag
  189. Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction · cs.LG · arXiv 2604.23964 · score 2 · attention
  190. Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach · cs.CV · arXiv 2604.23953 · score 2 · rag
  191. KOMBO: Korean Character Representations Based on the Combination Rules of Subcharacters · cs.CL · arXiv 2604.23948 · score 2 · rag
  192. Sliced-Regularized Optimal Transport · stat.ML · arXiv 2604.23944 · score 2 · rag
  193. Quasi-Quadratic Gradient: A New Direction for Accelerating the BFGS Method in Quasi-Newton Optimization · math.OC · arXiv 2604.23922 · score 2 · rag
  194. Machine Learning and Deep Learning Models for Short Term Electricity Price Forecasting in Australia’s National Electricity Market · cs.LG · arXiv 2604.23908 · score 2 · transformer
  195. Learning Interpretable PDE Representations for Generative Reconstructions with Structured Sparsity · cs.LG · arXiv 2604.23867 · score 2 · rag
  196. Domain-Filtered Knowledge Graphs from Sparse Autoencoder Features · cs.AI · arXiv 2604.23829 · score 2 · reasoning