2026-04-28 Paper Digest

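On this day, 206 arXiv papers matched the agent / LLM / AI infrastructure topics. Claude picked 10 of them based on title, authors, and affiliations and generated a detailed analysis for each; the remaining 196 are listed at the end.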

1. FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training

arXiv: 2604.24013 · cs.LG · Claude's pick

The rapid growth in the size of large language models has necessitated the partitioning of computational workloads across accelerators such as GPUs, TPUs, and NPUs. However, these parallelization strategies incur substantial data communication overhead, significantly hindering computational efficiency.

Read the full analysis →

2. Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling

arXiv: 2604.24715 · cs.CL · Claude's pick

Hybrid sequence models that combine efficient Transformer components with linear sequence modeling blocks are a promising alternative to pure Transformers, but most are still pretrained from scratch and therefore fail to reuse existing Transformer checkpoints. We study upcycling as a practical path to convert pretrained Transformer LLMs into hybrid architectures while preserving short-context quality and improving long-context capability.

Read the full analysis →

3. The Chameleon’s Limit: Investigating Persona Collapse and Homogenization in Large Language Models

arXiv: 2604.24698 · cs.CL · Claude's pick

Applications based on large language models (LLMs), such as multi-agent simulations, require population diversity among agents. We identify a pervasive failure mode we term Persona Collapse: agents each assigned a distinct profile nonetheless converge into a narrow behavioral mode, producing a homogeneous simulated population.

Read the full analysis →

4. Stabilizing Efficient Reasoning with Step-Level Advantage Selection

arXiv: 2604.24003 · cs.CL · Claude's pick

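In 4K short-context GRPO post-training, a step-level confidence score derived from token log-probs is used to zero-mask advantages inside each rollout, which stabilizes training and shortens reasoning length.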
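
The masking idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the paper's implementation: the step segmentation, the use of mean token log-prob as the confidence score, the threshold `tau`, and the rule that low-confidence steps are the ones zeroed out are all hypothetical choices made here.

```python
from typing import List

def step_confidence(token_logprobs: List[float]) -> float:
    """Mean token log-prob of one non-empty reasoning step (assumed confidence measure)."""
    return sum(token_logprobs) / len(token_logprobs)

def mask_advantages(steps: List[List[float]], advantage: float, tau: float) -> List[float]:
    """Return one advantage per token; tokens in steps below tau get 0.0 (assumed rule)."""
    per_token: List[float] = []
    for step in steps:
        keep = step_confidence(step) >= tau
        per_token.extend([advantage if keep else 0.0] * len(step))
    return per_token

# One rollout split into three steps of token log-probs (made-up numbers).
rollout = [[-0.1, -0.2], [-2.0, -3.0, -1.0], [-0.3]]
masked = mask_advantages(rollout, advantage=0.5, tau=-1.0)
```

Tokens whose advantage is zeroed contribute no policy-gradient signal, so only the selected steps of the rollout drive the update.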

Read the full analysis →

5. PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model

arXiv: 2604.24443 · cs.AI · Claude's pick

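PhysNote lets a VLM externalize and evolve its physical reasoning knowledge through self-generated "Knowledge Notes". Combined with spatiotemporal canonicalization and iterative verification by InfoAgent, it reaches 56.68% accuracy on the PhysBench test set.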

Read the full analysis →

6. Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer

arXiv: 2604.24062 · cs.AI · Claude's pick

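Using the OpenLock paradigm to compare humans with GPT-5.2/Claude-4.5/Gemini-3-Flash/DeepSeek-V3.2, the study finds that models can match or exceed humans within a single environment, but cross-environment transfer of causal structure only takes effect after "environment grounding", which shows up as delayed transfer.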

Read the full analysis →

7. Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

arXiv: 2604.24512 · cs.AI · Claude's pick

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations has emerged as an architectural bottleneck. We identify and formalize a systemic failure mode termed the Attention Latch in decoder-only autoregressive Transformers.

Read the full analysis →

8. DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference

arXiv: 2604.24647 · cs.CL · Claude's pick

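DepthKV observes that Transformer layers differ markedly in their sensitivity to KV cache pruning. It allocates a fixed global budget non-uniformly across layers according to representation metrics such as InfoNCE, and consistently outperforms uniform pruning on summarization, QA, and mathematical reasoning tasks.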
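
A minimal sketch of what a layer-wise non-uniform allocation under a fixed global budget could look like. The proportional rule, the function name `allocate_kv_budget`, and the sensitivity scores are assumptions for illustration; the summary above does not say how DepthKV turns its metrics into per-layer budgets.

```python
from typing import List

def allocate_kv_budget(sensitivity: List[float], total_budget: int) -> List[int]:
    """Split total_budget KV entries across layers in proportion to sensitivity."""
    total = sum(sensitivity)
    raw = [s * total_budget / total for s in sensitivity]
    budgets = [int(r) for r in raw]  # floor each layer's share
    # Give the entries lost to flooring to the layers with the largest remainders.
    leftover = total_budget - sum(budgets)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - budgets[i], reverse=True)
    for i in order[:leftover]:
        budgets[i] += 1
    return budgets

# Hypothetical per-layer sensitivity scores (higher = more harmed by pruning).
scores = [1, 4, 3, 2]
budgets = allocate_kv_budget(scores, total_budget=1000)
uniform = [1000 // len(scores)] * len(scores)  # baseline: same budget per layer
```

Both `budgets` and `uniform` spend the same 1000 entries in total; the only difference is that the non-uniform split keeps more of the cache in the layers assumed to be more sensitive.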

Read the full analysis →

9. BitRL: Reinforcement Learning with 1-bit Quantized Language Models for Resource-Constrained Edge Deployment

arXiv: 2604.24273 · cs.LG · Claude's pick

The deployment of intelligent reinforcement learning (RL) agents on resource-constrained edge devices remains a fundamental challenge due to the substantial memory, computational, and energy requirements of modern deep learning systems. While large language models (LLMs) have emerged as powerful architectures for decision-making agents, their multi-billion parameter scale confines them to cloud-based deployment, raising concerns about latency, privacy, and connectivity dependence.

Read the full analysis →

10. AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents

arXiv: 2604.24039 · cs.LG · Claude's pick

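AgenticCache exploits the "plan locality" of embodied tasks: a 2-gram plan cache plus a background asynchronous LLM updater lets the agent avoid calling the LLM at every step. Across four multi-agent benchmarks it reports, on average, +22% success rate, -65% latency, and -50% tokens.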
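
The cache-plus-updater pattern might look roughly like the sketch below. It rests on several assumptions: "2-gram" is read here as a mapping from the previous plan step to the most frequently observed next step, the background updater is simulated by an explicit `drain_updates` call rather than a real thread, and `fake_llm_plan` is a hypothetical stand-in for an LLM call. None of these names come from the paper.

```python
from collections import Counter, defaultdict, deque
from typing import Callable, Deque, Dict, Optional

class BigramPlanCache:
    """Maps the previous plan step to counts of observed next steps."""

    def __init__(self) -> None:
        self.counts: Dict[str, Counter] = defaultdict(Counter)

    def update(self, prev_step: str, next_step: str) -> None:
        self.counts[prev_step][next_step] += 1

    def lookup(self, prev_step: str) -> Optional[str]:
        nexts = self.counts.get(prev_step)  # .get avoids creating empty entries
        return nexts.most_common(1)[0][0] if nexts else None

def act(prev_step: str, cache: BigramPlanCache, pending: Deque[str], fallback: str) -> str:
    """Use the cached next step if present; otherwise queue a request and fall back."""
    hit = cache.lookup(prev_step)
    if hit is None:
        pending.append(prev_step)  # resolved later, off the critical path
        return fallback
    return hit

def drain_updates(cache: BigramPlanCache, pending: Deque[str],
                  llm_plan: Callable[[str], str]) -> None:
    """Stand-in for the background updater: fill the cache from queued misses."""
    while pending:
        prev = pending.popleft()
        cache.update(prev, llm_plan(prev))

def fake_llm_plan(prev_step: str) -> str:
    """Hypothetical planner replacing a real LLM call."""
    return {"goto_kitchen": "pick_cup"}.get(prev_step, "explore")

cache, pending = BigramPlanCache(), deque()
first = act("goto_kitchen", cache, pending, fallback="wait")   # cache miss
drain_updates(cache, pending, fake_llm_plan)
second = act("goto_kitchen", cache, pending, fallback="wait")  # cache hit
```

In this toy run the first call misses and returns the fallback while the request is queued; once the updater has run, the same context is served from the cache without another planner call.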

Read the full analysis →

Other papers matched that day

These papers hit the same topic keywords but were not selected by Claude for the top-N in-depth analysis.

Green Shielding: A User-Centric Approach Towards Trustworthy AI · cs.CL · arXiv 2604.24700 · score 27 — large language model, llm, agent, agentic, rag, serving
EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce · cs.CL · arXiv 2604.23993 · score 27 — llm, agent, agentic, multi-agent, retrieval, reasoning
JigsawRL: Assembling RL Pipelines for Efficient LLM Post-Training · cs.LG · arXiv 2604.23838 · score 25 — llm, agent, agentic, rag, parallelism, gpu
Kwai Summary Attention Technical Report · cs.CL · arXiv 2604.24432 · score 24 — large language model, agent, agentic, reasoning, inference, kv cache
Defusing the Trigger: Plug-and-Play Defense for Backdoored LLMs via Tail-Risk Intrinsic Geometric Smoothing · cs.CR · arXiv 2604.24162 · score 24 — large language model, llm, rag, reasoning, inference, serving
RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation · cs.SE · arXiv 2604.24218 · score 23 — large language model, llm, agent, agentic, multi-agent, rag
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis · cs.CL · arXiv 2604.24198 · score 26 — large language model, llm, agent, agentic, reasoning, inference
Constraint-Guided Multi-Agent Decompilation for Executable Binary Recovery · cs.SE · arXiv 2604.23940 · score 21 — llm, agent, agentic, multi-agent, rag, compiler
GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems · cs.CR · arXiv 2604.24477 · score 20 — large language model, llm, agent, multi-agent, inference
Agentic Witnessing: Pragmatic and Scalable TEE-Enabled Privacy-Preserving Auditing · cs.CR · arXiv 2604.24203 · score 20 — llm, agent, agentic, rag, reasoning, serving
Skill Retrieval Augmentation for Agentic AI · cs.CL · arXiv 2604.24594 · score 19 — large language model, llm, agent, agentic, retrieval
DPEPO: Diverse Parallel Exploration Policy Optimization for LLM-based Agents · cs.CL · arXiv 2604.24320 · score 19 — large language model, llm, agent, rag, reasoning, fine-tun
FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data · cs.AI · arXiv 2604.24572 · score 18 — llm, agent, agentic, multi-agent, reasoning
Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus · cs.AI · arXiv 2604.24473 · score 26 — llm, agent, agentic, retrieval, rag, reasoning
Leveraging LLMs for Multi-File DSL Code Generation: An Industrial Case Study · cs.SE · arXiv 2604.24678 · score 21 — large language model, llm, rag, serving, fine-tun
Strategic Bidding in 6G Spectrum Auctions with Large Language Models · cs.GT · arXiv 2604.24156 · score 17 — large language model, llm, agent, rag, reasoning
MEMCoder: Multi-dimensional Evolving Memory for Private-Library-Oriented Code Generation · cs.SE · arXiv 2604.24222 · score 24 — large language model, llm, retrieval, rag, inference
Latency and Cost of Multi-Agent Intelligent Tutoring at Scale · cs.CY · arXiv 2604.24110 · score 16 — llm, agent, multi-agent, throughput, latency
The Pragmatic Persona: Discovering LLM Persona through Bridging Inference · cs.CL · arXiv 2604.24079 · score 20 — large language model, llm, rag, reasoning, inference
LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People · cs.AI · arXiv 2604.23970 · score 16 — llm, agent, agentic, multi-agent
XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation · cs.AI · arXiv 2604.24623 · score 15 — large language model, llm, retrieval, rag, reasoning
SEARCH-R: Structured Entity-Aware Retrieval with Chain-of-Reasoning Navigator for Multi-hop Question Answering · cs.CL · arXiv 2604.24515 · score 15 — large language model, llm, retrieval, reasoning, fine-tun
OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents · cs.CL · arXiv 2604.24348 · score 15 — large language model, llm, agent, latency
Generating Place-Based Compromises Between Two Points of View · cs.CL · arXiv 2604.24536 · score 14 — large language model, llm, reasoning, inference
SeaEvo: Advancing Algorithm Discovery with Strategy Space Evolution · cs.CL · arXiv 2604.24372 · score 14 — llm, agent, retrieval, ai system
ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems · cs.AI · arXiv 2604.23878 · score 14 — llm, agent, rag, ai system
Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models · cs.IR · arXiv 2604.24608 · score 13 — large language model, llm, rag, attention
MEG-RAG: Quantifying Multi-modal Evidence Grounding for Evidence Selection in RAG · cs.CL · arXiv 2604.24564 · score 13 — large language model, llm, retrieval, rag
Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations · cs.AI · arXiv 2604.24562 · score 13 — large language model, llm, rag, reasoning
Why AI Harms Can’t Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality · cs.CY · arXiv 2604.24519 · score 13 — large language model, llm, ai system
A Multi-Dimensional Audit of Politically Aligned Large Language Models · cs.CL · arXiv 2604.24429 · score 13 — large language model, llm, reasoning, fine-tun
Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion · cs.LG · arXiv 2604.24351 · score 13 — rag, inference, serving, kv-cache
MultiDx: A Multi-Source Knowledge Integration Framework towards Diagnostic Reasoning · cs.CL · arXiv 2604.24186 · score 13 — large language model, llm, rag, reasoning
Coverage-Based Calibration for Post-Training Quantization via Weighted Set Cover over Outlier Channels · cs.LG · arXiv 2604.24008 · score 13 — large language model, rag, quantization, gpu, post-train
Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning · cs.LG · arXiv 2604.23987 · score 13 — large language model, llm, rag, fine-tun
What Did They Mean? How LLMs Resolve Ambiguous Social Situations across Perspectives and Roles · cs.HC · arXiv 2604.23942 · score 13 — large language model, llm, serving
Generative Synthetic Data for Causal Inference: Pitfalls, Remedies, and Opportunities · stat.ME · arXiv 2604.23904 · score 13 — llm, rag, inference, serving
Evaluation of Prompt Injection Defenses in Large Language Models · cs.CR · arXiv 2604.23887 · score 13 — large language model, llm, ai system
Knowledge Vector of Logical Reasoning in Large Language Models · cs.CL · arXiv 2604.23877 · score 13 — large language model, llm, rag, reasoning
Scalable Hyperparameter-Divergent Ensemble Training with Automatic Learning Rate Exploration for Large Models · cs.LG · arXiv 2604.24708 · score 12 — rag, serving, attention, gpu, scheduler
Can Current Agents Close the Discovery-to-Application Gap? A Case Study in Minecraft · cs.AI · arXiv 2604.24697 · score 12 — llm, agent, ai system
The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications · cs.AI · arXiv 2604.24668 · score 12 — llm, agent, agentic
Evaluating whether AI models would sabotage AI safety research · cs.AI · arXiv 2604.24618 · score 12 — llm, agent, rag, reasoning
Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models · cs.CR · arXiv 2604.24542 · score 12 — large language model, llm, inference
From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills · cs.CL · arXiv 2604.24026 · score 12 — llm, agent, rag, reasoning
QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems · cs.AI · arXiv 2604.24021 · score 12 — llm, multi-agent, ai system
Fix Initial Codes and Iteratively Refine Textual Directions Toward Safe Multi-Turn Code Correction · cs.LG · arXiv 2604.23989 · score 12 — large language model, llm, inference
TSAssistant: A Human-in-the-Loop Agentic Framework for Automated Target Safety Assessment · cs.CL · arXiv 2604.23938 · score 12 — agent, agentic, multi-agent
Inverting Foundation Models of Brain Function with Simulation-Based Inference · cs.LG · arXiv 2604.23865 · score 12 — large language model, llm, inference
Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination · cs.CL · arXiv 2604.24690 · score 11 — large language model, llm, reasoning
Benchmarking Source-Sensitive Reasoning in Turkish: Humans and LLMs under Evidential Trust Manipulation · cs.CL · arXiv 2604.24665 · score 11 — large language model, llm, reasoning
K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology · cs.CL · arXiv 2604.24645 · score 11 — large language model, agent, reasoning
STELLAR-E: a Synthetic, Tailored, End-to-end LLM Application Rigorous Evaluator · cs.AI · arXiv 2604.24544 · score 11 — large language model, llm, rag
A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations · cs.CR · arXiv 2604.24468 · score 11 — large language model, llm, fine-tun
Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation · cs.CL · arXiv 2604.24361 · score 11 — large language model, llm, rag
AdapTime: Enabling Adaptive Temporal Reasoning in Large Language Models · cs.CL · arXiv 2604.24175 · score 11 — large language model, llm, reasoning
TACO: Efficient Communication Compression of Intermediate Tensors for Scalable Tensor-Parallel LLM Training · cs.DC · arXiv 2604.24088 · score 11 — llm, parallelism, quantization, throughput
QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering · cs.CV · arXiv 2604.24052 · score 11 — large language model, llm, rag
Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs · cs.AI · arXiv 2604.23949 · score 11 — large language model, llm, rag
SMSI: System Model Security Inference: Automated Threat Modeling for Cyber-Physical Systems · cs.CR · arXiv 2604.23905 · score 11 — llm, retrieval, inference, fine-tun
LLM-Augmented Traffic Signal Control with LSTM-Based Traffic State Prediction and Safety-Constrained Decision Support · cs.AI · arXiv 2604.23902 · score 11 — large language model, llm, reasoning
ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation · cs.AI · arXiv 2604.23853 · score 11 — llm, agent, tool use
One Size Fits None: Heuristic Collapse in LLM Investment Advice · cs.CL · arXiv 2604.23837 · score 11 — large language model, llm, reasoning
Resource-Lean Lexicon Induction for German Dialects · cs.CL · arXiv 2604.23824 · score 11 — large language model, llm, retrieval
Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters · cs.AI · arXiv 2604.24710 · score 10 — llm, agent, rag
GradMAP: Gradient-Based Multi-Agent Proximal Learning for Grid-Edge Flexibility · cs.LG · arXiv 2604.24549 · score 10 — agent, multi-agent, gpu
DPRM: A Plug-in Doob h transform-induced Token-Ordering Module for Diffusion Language Models · cs.LG · arXiv 2604.24357 · score 10 — llm, rag, reasoning, post-train
The Alignment Target Problem: Divergent Moral Judgments of Humans, AI Systems, and Their Designers · cs.CY · arXiv 2604.24155 · score 10 — agent, reasoning, ai system
Improving Robustness of Tabular Retrieval via Representational Stability · cs.CL · arXiv 2604.24040 · score 10 — retrieval, rag, serving, transformer
Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents · cs.AI · arXiv 2604.23990 · score 10 — agent, rag, serving
GAMED.AI: A Hierarchical Multi-Agent Framework for Automated Educational Game Generation · cs.AI · arXiv 2604.23947 · score 10 — agent, multi-agent, reasoning
Defective Task Descriptions in LLM-Based Code Generation: Detection and Analysis · cs.SE · arXiv 2604.24703 · score 9 — large language model, llm
AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents · cs.CR · arXiv 2604.24657 · score 9 — large language model, agent
Zero-shot Large Language Models for Automatic Readability Assessment · cs.CL · arXiv 2604.24470 · score 9 — large language model, llm
Can You Make It Sound Like You? Post-Editing LLM-Generated Text for Personal Style · cs.CL · arXiv 2604.24444 · score 9 — large language model, llm
Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment · cs.LG · arXiv 2604.24178 · score 9 — large language model, llm
Progressive Approximation in Deep Residual Networks: Theory and Validation · cs.LG · arXiv 2604.24154 · score 9 — llm, inference, transformer
An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress · cs.AI · arXiv 2604.24076 · score 9 — large language model, llm
A2DEPT: Large Language Model-Driven Automated Algorithm Design via Evolutionary Program Trees · cs.AI · arXiv 2604.24043 · score 9 — large language model, llm
Poster: ClawdGo: Endogenous Security Awareness Training for Autonomous AI Agents · cs.CR · arXiv 2604.24020 · score 9 — agent, rag, inference
IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models · cs.HC · arXiv 2604.24002 · score 9 — agent, reasoning, inference
When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models · cs.LG · arXiv 2604.23994 · score 9 — llm, inference, attention
Representational Curvature Modulates Behavioral Uncertainty in Large Language Models · cs.AI · arXiv 2604.23985 · score 9 — large language model, llm
Translate or Simplify First: An Analysis of Cross-lingual Text Simplification in English and French · cs.CL · arXiv 2604.23844 · score 9 — large language model, llm
Scalable Production Scheduling: Linear Complexity via Unified Homogeneous Graphs · cs.LG · arXiv 2604.23841 · score 9 — agent, inference, latency
Less Is More: Engineering Challenges of On-Device Small Language Model Integration in a Mobile Application · cs.SE · arXiv 2604.24636 · score 8 — llm, rag, latency
Interoceptive machine framework: Toward interoception-inspired regulatory architectures in artificial intelligence · cs.AI · arXiv 2604.24527 · score 8 — agent, ai system
Measuring Successful Cooperation in Human-AI Teamwork: Development and Validation of the Perceived Cooperativity and Teaming Perception Scales · cs.HC · arXiv 2604.24461 · score 8 — llm, agent
Characterizing Vision-Language-Action Models across XPUs: Constraints and Acceleration for On-Robot Deployment · cs.RO · arXiv 2604.24447 · score 8 — inference, parallelism, gpu
Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering · cs.CL · arXiv 2604.24334 · score 8 — retrieval, rag, serving
Adaptive ToR: Complexity-Aware Tree-Based Retrieval for Pareto-Optimal Multi-Intent NLU · cs.AI · arXiv 2604.24219 · score 8 — llm, retrieval, latency
Right-to-Act: A Pre-Execution Non-Compensatory Decision Protocol for AI Systems · cs.AI · arXiv 2604.24153 · score 8 — serving, ai system
An Analysis of the Coordination Gap between Joint and Modular Learning for Job Shop Scheduling with Transportation Resources · cs.AI · arXiv 2604.24117 · score 8 — agent, multi-agent
FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost · cs.LG · arXiv 2604.24073 · score 8 — rag, distributed training, gpu
DeepTaxon: An Interpretable Retrieval-Augmented Multimodal Framework for Unified Species Identification and Discovery · cs.CV · arXiv 2604.24029 · score 8 — retrieval, reasoning, chain-of-thought, fine-tun
Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions · cs.AI · arXiv 2604.23924 · score 8 — agent, agentic
MarketBench: Evaluating AI Agents as Market Participants · cs.AI · arXiv 2604.23897 · score 8 — llm, agent
Geometry Preserving Loss Functions Promote Improved Adaptation of Blackbox Generative Model · cs.LG · arXiv 2604.23888 · score 8 — rag, serving, fine-tun
Graph Memory Transformer (GMT) · cs.LG · arXiv 2604.23862 · score 8 — serving, attention, transformer
Does Machine Unlearning Preserve Clinical Safety? A Risk Analysis for Medical Image Classification · cs.AI · arXiv 2604.23854 · score 8 — serving, attention, fine-tun
The Last Human-Written Paper: Agent-Native Research Artifacts · cs.LG · arXiv 2604.24658 · score 7 — agent, compiler
CF-VLA: Efficient Coarse-to-Fine Action Generation for Vision-Language-Action Policies · cs.CV · arXiv 2604.24622 · score 7 — rag, inference, latency
Global Context or Local Detail? Adaptive Visual Grounding for Hallucination Mitigation · cs.CV · arXiv 2604.24396 · score 7 — rag, inference, attention
AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation · cs.RO · arXiv 2604.24086 · score 7 — inference, latency, fine-tun
Architectural Isolation as a Timing Safety Primitive for Edge AI Medical Devices: Controlled Experimental Evidence on a Shared-Silicon Platform · cs.AR · arXiv 2604.23831 · score 7 — inference, gpu, latency
Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents · cs.AI · arXiv 2604.24686 · score 6 — agent, rag
Meta-CoT: Enhancing Granularity and Generalization in Image Editing · cs.CV · arXiv 2604.24625 · score 6 — rag, reasoning, chain-of-thought
GSC-QEMit: A Telemetry-Driven Hierarchical Forecast-and-Bandit Framework for Adaptive Quantum Error Mitigation · quant-ph · arXiv 2604.24551 · score 6 — rag, serving
Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI · cs.CV · arXiv 2604.24492 · score 6 — latency, fine-tun, post-train
Incisor: Ex Ante Cloud Instance Selection for HPC Jobs · cs.DC · arXiv 2604.24464 · score 6 — llm, reasoning
Certified geometric robustness – Super-DeepG · cs.AI · arXiv 2604.24379 · score 6 — rag, reasoning, gpu
Learning Evidence of Depression Symptoms via Prompt Induction · cs.CL · arXiv 2604.24376 · score 6 — llm, fine-tun
SAGE: Sparse Adaptive Guidance for Dependency-Aware Tabular Data Generation · cs.LG · arXiv 2604.24368 · score 6 — llm, rag
SolarTformer: A Transformer Based Deep Learning Approach for Short Term Solar Power Forecasting · cs.LG · arXiv 2604.24306 · score 6 — rag, attention, transformer
Multi-Dimensional Evaluation of Sustainable City Trips with LLM-as-a-Judge and Human-in-the-Loop · cs.AI · arXiv 2604.24158 · score 6 — llm, reasoning
Leveraging Human Feedback for Semantically-Relevant Skill Discovery · cs.LG · arXiv 2604.24127 · score 6 — agent, rag
Psychologically-Grounded Graph Modeling for Interpretable Depression Detection · cs.CL · arXiv 2604.24126 · score 6 — llm, attention
Factual and Edit-Sensitive Graph-to-Sequence Generation via Graph-Aware Adaptive Noising · cs.CL · arXiv 2604.24104 · score 6 — llm, fine-tun
Distilling Self-Consistency into Verbal Confidence: A Pre-Registered Negative Result and Post-Hoc Rescue on Gemma 3 4B · cs.CL · arXiv 2604.24070 · score 6 — llm, fine-tun
TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents · cs.LG · arXiv 2604.24005 · score 6 — agent, reasoning
Hindsight Preference Optimization for Financial Time Series Advisory · cs.LG · arXiv 2604.23988 · score 6 — llm, reasoning
Quantum Knowledge Graph: Modeling Context-Dependent Triplet Validity · cs.CL · arXiv 2604.23972 · score 6 — llm, reasoning
Do Quantum Transformers Help? A Systematic VQC Architecture Comparison on Tabular Benchmarks · quant-ph · arXiv 2604.23931 · score 6 — rag, attention, transformer
Gromov-Wasserstein Methods for Multi-View Relational Embedding and Clustering · cs.LG · arXiv 2604.23912 · score 6 — rag, serving
Learning Selective LLM Autonomy from Copilot Feedback in Enterprise Customer Support Workflows · cs.CL · arXiv 2604.23855 · score 6 — llm, rag
Contextual Linear Activation Steering of Language Models · cs.CL · arXiv 2604.24693 · score 5 — large language model
MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational and Progressive Information Chaining · cs.CL · arXiv 2604.24374 · score 5 — inference, attention
Machine-Learning-Based Classification of Radio Frequency Building Loss · cs.LG · arXiv 2604.24143 · score 5 — rag, inference
PeeriScope: A Multi-Faceted Framework for Evaluating Peer Review Quality · cs.CL · arXiv 2604.24071 · score 5 — large language model
Integrative neurocybernetic modeling in the era of large-scale neuroscience · q-bio.NC · arXiv 2604.23903 · score 5 — rag, inference
Conflict-Aware Harmonized Rotational Gradient for Multiscale Kinetic Regimes · cs.LG · arXiv 2604.24745 · score 4 — serving
Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling · cs.AI · arXiv 2604.24717 · score 4 — attention, transformer
Fraud Detection in Cryptocurrency Markets with Spatio-Temporal Graph Neural Networks · cs.LG · arXiv 2604.24590 · score 4 — attention, transformer
A systematic evaluation of vision-language models for observational astronomical reasoning tasks · cs.AI · arXiv 2604.24589 · score 4 — reasoning, attention
Dialysis Risk Prediction and Treatment Effect Estimation for AKI patients using Longitudinal Electronic Health Records · cs.LG · arXiv 2604.24547 · score 4 — rag, transformer
Understanding the Limits of Automated Evaluation for Code Review Bots in Practice · cs.SE · arXiv 2604.24525 · score 4 — llm
ARETE: Attention-based Rasterized Encoding for Topology Estimation using HSV-transformed Crowdsourced Vehicle Fleet Data · cs.CV · arXiv 2604.24353 · score 4 — attention, transformer
Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training · cs.LG · arXiv 2604.24350 · score 4 — rag, attention
Semantic Segmentation for Histopathology using Learned Regularization based on Global Proportions · eess.IV · arXiv 2604.24347 · score 4 — rag, transformer
Exact, Efficient, and Reliable Multi-Objective and Multi-Constrained IoT Workflow Scheduling in Edge-Hub-Cloud Cyber-Physical Systems · cs.DC · arXiv 2604.24340 · score 4 — rag, latency
Perfecting Aircraft Maneuvers with Reinforcement Learning · cs.LG · arXiv 2604.24338 · score 4 — agent
X-NegoBox: An Explainable Privacy-Budget Negotiation Framework for Secure Peer-to-Peer Energy Data Exchange · cs.CR · arXiv 2604.24326 · score 4 — serving
Differentiable Faithfulness Alignment for Cross-Model Circuit Transfer · cs.CL · arXiv 2604.24302 · score 4 — retrieval, reasoning
Latent-Hysteresis Graph ODEs: Modeling Coupled Topology-Feature Evolution via Continuous Phase Transitions · cs.LG · arXiv 2604.24293 · score 4 — serving
RowHammer Vulnerability Counter (RVC): Redefining RowHammer Detection with Victim-Centric Tracking · cs.CR · arXiv 2604.24287 · score 4 — rag, latency
Deep Learning-Enabled Dissolved Oxygen Sensing in Biofouling Environments for Ocean Monitoring · eess.IV · arXiv 2604.24236 · score 4 — rag, transformer
CMGL: Confidence-guided Multi-omics Graph Learning for Cancer Subtype Classification · cs.LG · arXiv 2604.24201 · score 4 — rag, fine-tun
IRIS: Interleaved Reinforcement with Incremental Staged Curriculum for Cross-Lingual Mathematical Reasoning · cs.CL · arXiv 2604.24114 · score 4 — reasoning, fine-tun
BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning · cs.CL · arXiv 2604.24089 · score 4 — serving
How Sensitive Are Safety Benchmarks to Judge Configuration Choices? · cs.CL · arXiv 2604.24074 · score 4 — llm
AgentPulse: A Continuous Multi-Signal Framework for Evaluating AI Agents in Deployment · cs.AI · arXiv 2604.24038 · score 4 — agent
FedSLoP: Memory-Efficient Federated Learning with Low-Rank Gradient Projection · cs.LG · arXiv 2604.24012 · score 4 — serving
Adaptive-Distribution Randomized Neural Networks for PDEs: A Low-Dimensional Distribution-Learning Framework · math.NA · arXiv 2604.23999 · score 4 — serving
DecompKAN: Decomposed Patch-KAN for Long-Term Time Series Forecasting · cs.LG · arXiv 2604.23968 · score 4 — attention, transformer
Crystal structure prediction using graph neural combinatorial optimization · cs.LG · arXiv 2604.23921 · score 4 — rag, gpu
Cardiac Stability Theory: An Axiomatically Grounded Framework for Continuous Cardiac Health Monitoring via Smartphone Photoplethysmography · cs.LG · arXiv 2604.23876 · score 4 — transformer, latency
Exploring Audio Hallucination in Egocentric Video Understanding · cs.CV · arXiv 2604.23860 · score 4 — llm
Focus on What Matters: Two-Stage ROI-Aware Refinement for Anatomy-Preserving Fetal Ultrasound Reconstruction · cs.CV · arXiv 2604.23839 · score 4 — serving
Cortex-Inspired Continual Learning: Unsupervised Instantiation and Recovery of Functional Task Networks · cs.LG · arXiv 2604.24637 · score 3 — inference
MIMIC: A Generative Multimodal Foundation Model for Biomolecules · cs.AI · arXiv 2604.24506 · score 3 — inference
Compilation and Execution of an Embeddable YOLO-NAS on the VTA · cs.AR · arXiv 2604.24455 · score 3 — compiler
SPLIT: Separating Physical-Contact via Latent Arithmetic in Image-Based Tactile Sensors · cs.RO · arXiv 2604.24449 · score 3 — inference
Scaling Properties of Continuous Diffusion Spoken Language Models · cs.CL · arXiv 2604.24416 · score 3 — inference
Model-Free Inference of Investor Preferences: A Relative Entropy IRL Approach · cs.LG · arXiv 2604.24280 · score 3 — inference
Speech Enhancement Based on Drifting Models · cs.SD · arXiv 2604.24199 · score 3 — inference
Learning to Think from Multiple Thinkers · cs.LG · arXiv 2604.24737 · score 2 — chain-of-thought
Déjà Vu Packing: Optimizing FPGA Logic Clustering Runtime via Pattern Memoization · cs.AR · arXiv 2604.24649 · score 2 — rag
NeSyCat: A Monad-Based Categorical Semantics of the Neurosymbolic ULLER Framework · cs.AI · arXiv 2604.24612 · score 2 — reasoning
Hierarchical Behaviour Spaces · cs.AI · arXiv 2604.24558 · score 2 — reasoning
SpotVista: Availability-Aware Recommendation System for Reliable and Cost-Efficient Multi-Node Spot Instances · cs.DC · arXiv 2604.24548 · score 2 — rag
A Reward-Free Viewpoint on Multi-Objective Reinforcement Learning · cs.LG · arXiv 2604.24532 · score 2 — rag
SceneSelect: Selective Learning for Trajectory Scene Classification and Expert Scheduling · cs.LG · arXiv 2604.24514 · score 2 — rag
Modeling Behavioral Intensity and Transitions for Generative Recommendation · cs.IR · arXiv 2604.24472 · score 2 — attention
All That Glitters Is Not Audio: Rethinking Text Priors and Audio Reliance in Audio-Language Evaluation · cs.SD · arXiv 2604.24401 · score 2 — rag
Few-Shot Cross-Device Transfer for Quantum Noise Modeling on Real Hardware · quant-ph · arXiv 2604.24397 · score 2 — fine-tun
PathMoG: A Pathway-Centric Modular Graph Neural Network for Multi-Omics Survival Prediction · cs.LG · arXiv 2604.24371 · score 2 — attention
See Further, Think Deeper: Advancing VLM’s Reasoning Ability with Low-level Visual Cues and Reflection · cs.CV · arXiv 2604.24339 · score 2 — reasoning
Mitigating Error Amplification in Fast Adversarial Training · cs.LG · arXiv 2604.24332 · score 2 — rag
Unconstrained Multi-view Human Pose Estimation with Algebraic Priors · cs.CV · arXiv 2604.24312 · score 2 — transformer
IMPA-Net: Meteorology-Aware Multi-Scale Attention and Dynamic Loss for Extreme Convective Radar Nowcasting · cs.LG · arXiv 2604.24224 · score 2 — attention
Seeing Is No Longer Believing: Frontier Image Generation Models, Synthetic Visual Evidence, and Real-World Risk · cs.CL · arXiv 2604.24197 · score 2 — reasoning
MemeScouts@LT-EDI 2026: Asking the Right Questions – Prompted Weak Supervision for Meme Hate Speech Detection · cs.CL · arXiv 2604.24179 · score 2 — reasoning
A Divergence-Based Method for Weighting and Averaging Model Predictions · stat.ML · arXiv 2604.24172 · score 2 — rag
Unfolding an Atomistic World: Atomistic Simulation of Reactor Pressure Vessel Steel Across Year-and-Meter Scales · cs.DC · arXiv 2604.24091 · score 2 — rag
A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws · cs.LG · arXiv 2604.24037 · score 2 — rag
KubePACS: Kubernetes Cluster Using Performant, Highly Available, and Cost Efficient Spot Instances · cs.DC · arXiv 2604.24027 · score 2 — rag
Geometry-Aware Offline-to-Online Learning in Linear Contextual Bandits · cs.LG · arXiv 2604.24016 · score 2 — rag
SDSL-Solver: Scalable Distributed Sparse Linear Solvers for Large-Scale Interior Point Methods · cs.DC · arXiv 2604.23979 · score 2 — rag
Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction · cs.LG · arXiv 2604.23964 · score 2 — attention
Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach · cs.CV · arXiv 2604.23953 · score 2 — rag
KOMBO: Korean Character Representations Based on the Combination Rules of Subcharacters · cs.CL · arXiv 2604.23948 · score 2 — rag
Sliced-Regularized Optimal Transport · stat.ML · arXiv 2604.23944 · score 2 — rag
Quasi-Quadratic Gradient: A New Direction for Accelerating the BFGS Method in Quasi-Newton Optimization · math.OC · arXiv 2604.23922 · score 2 — rag
Machine Learning and Deep Learning Models for Short Term Electricity Price Forecasting in Australia’s National Electricity Market · cs.LG · arXiv 2604.23908 · score 2 — transformer
Learning Interpretable PDE Representations for Generative Reconstructions with Structured Sparsity · cs.LG · arXiv 2604.23867 · score 2 — rag
Domain-Filtered Knowledge Graphs from Sparse Autoencoder Features · cs.AI · arXiv 2604.23829 · score 2 — reasoning
