2026-04-27 Paper Digest

On this date, 307 arXiv papers matched the agent / LLM / AI infrastructure topics. Claude Code generated detailed analyses for 10 of them; the remaining 297 are listed at the end.

1. Guess-Verify-Refine: Data-Aware Top-K for Sparse-Attention Decoding on Blackwell via Temporal Correlation

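arXiv: 2604.22312 · cs.DC · relevance score 27

GVR is a data-aware exact Top-K algorithm for Blackwell GPUs. It exploits temporal correlation between decoding steps to speed up Top-K selection in DeepSeek Sparse Attention, averaging a 1.88× speedup for the operator alone and improving end-to-end TPOT by up to 7.52%.

Read the full analysis →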
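
To make the guess / verify / refine idea concrete, here is a minimal CPU sketch of an exact top-k that reuses the previous decoding step's winners as a guess. It is an illustration under assumptions, not the paper's Blackwell kernel: the function name `guess_verify_refine_top_k`, the threshold rule, and the fallback path are all hypothetical.

```python
import heapq


def guess_verify_refine_top_k(scores, prev_top_idx, k):
    """Return the sorted indices of the k largest scores (exact result).

    Hypothetical sketch: prev_top_idx holds the top-k indices from the
    previous decoding step and is assumed to contain valid indices.
    """
    prev = set(prev_top_idx)
    if len(prev) < k:
        # No usable guess (for example, the first step): scan everything.
        candidates = range(len(scores))
    else:
        # Guess: the smallest current score among last step's winners.
        # At least k entries reach it, so it cannot exceed the true
        # k-th largest score.
        threshold = min(scores[i] for i in prev)
        # Verify: keep every entry at or above the guessed threshold.
        # This set is guaranteed to contain the true top-k.
        candidates = [i for i, s in enumerate(scores) if s >= threshold]
    # Refine: exact selection over the (usually much smaller) candidate set.
    return sorted(heapq.nlargest(k, candidates, key=lambda i: scores[i]))


scores = [0.1, 0.9, 0.3, 0.8, 0.5, 0.7]
print(guess_verify_refine_top_k(scores, [1, 3, 4], 3))  # [1, 3, 5]
```

When consecutive steps are strongly correlated, the guessed threshold sits close to the true k-th score and the candidate set stays small; a poor guess only enlarges the candidate set and never changes the result.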

2. How Do AI Agents Spend Your Money? Analyzing and Predicting Token Consumption in Agentic Coding Tasks

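arXiv: 2604.22750 · cs.CL · relevance score 24

The first systematic study of token consumption in agentic coding tasks: it analyzes 8 frontier LLMs on SWE-bench Verified and finds that agentic tasks consume 1000 times as many tokens as ordinary coding tasks, and that models cannot accurately predict their own consumption.

Read the full analysis →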

3. Behavioral Canaries: Auditing Private Retrieved Context Usage in RL Fine-Tuning

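arXiv: 2604.22191 · cs.CR · relevance score 23

Proposes Behavioral Canaries, a new auditing mechanism for RL fine-tuning (RLFT): by injecting preference pairs that combine a document trigger with stylized feedback, it detects whether legally protected retrieved documents were used for training without authorization.

Read the full analysis →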

4. Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

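arXiv: 2604.22748 · cs.AI · relevance score 24

Proposes a two-dimensional "levels × laws" taxonomy that classifies agentic world models by three capability levels (L1/L2/L3) and four law domains (physical, digital, social, scientific), surveys 400+ works, and offers recommendations on evaluation and governance.

Read the full analysis →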

5. Preference Heads in Large Language Models: A Mechanistic Framework for Interpretable Personalization

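arXiv: 2604.22345 · cs.CL · relevance score 30

The paper proposes Differential Preference Steering (DPS), which uses causal analysis to locate sparse Preference Heads in LLMs and, at inference time, contrasts the logits produced with and without those heads to achieve training-free, interpretable personalization.

Read the full analysis →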
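
The logit contrast described above can be pictured with a small sketch. The summary does not give DPS's exact formula, so the combination rule below (amplifying the difference between the full model and a head-ablated model by a factor `alpha`) is an assumption borrowed from generic contrastive decoding, and the names `contrast_logits` and `alpha` are hypothetical.

```python
import math


def contrast_logits(full_logits, ablated_logits, alpha=1.0):
    """Push the next-token logits in the direction the preference heads add.

    full_logits: logits with all heads active.
    ablated_logits: logits with the candidate preference heads disabled.
    alpha: assumed steering strength; 0 returns the full model's logits.
    """
    return [f + alpha * (f - a) for f, a in zip(full_logits, ablated_logits)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


full = [2.0, 1.0, 0.0]      # toy logits with the heads active
ablated = [1.0, 1.0, 0.0]   # toy logits with the heads disabled
steered = contrast_logits(full, ablated, alpha=1.0)
print(steered)              # [3.0, 1.0, 0.0]
print(softmax(steered))
```

Only tokens whose logits change when the heads are ablated get shifted, which is what makes this kind of steering easy to inspect token by token.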

6. Aligning Dense Retrievers with LLM Utility via Distillation

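arXiv: 2604.22722 · cs.IR · relevance score 16

UAE aligns a dense retriever with the LLM's utility distribution via distillation, using a Utility-Modulated InfoNCE objective so that the bi-encoder mimics the perplexity-reduction signal. On QASPER it substantially outperforms BGE-Base and is 180× faster than LLM re-ranking.

Read the full analysis →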
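
The summary does not spell out the Utility-Modulated InfoNCE loss, so the following is only a generic sketch of distilling a utility signal into a retriever: per-passage utilities (for example, perplexity reduction) are turned into a soft target distribution, and the retriever's similarity scores are trained to match it with cross-entropy. The function names, both temperatures, and the soft-target formulation are assumptions, not the paper's definition.

```python
import math


def log_softmax(xs, temperature=1.0):
    """Numerically stable log-softmax of xs / temperature."""
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def utility_weighted_infonce(sim_scores, utilities,
                             tau_retriever=0.05, tau_utility=1.0):
    """Cross-entropy between a utility-derived target and retriever scores.

    sim_scores: query-passage similarities from the bi-encoder.
    utilities: assumed LLM utility per passage (higher is more helpful).
    """
    targets = [math.exp(lp) for lp in log_softmax(utilities, tau_utility)]
    log_probs = log_softmax(sim_scores, tau_retriever)
    return -sum(t * lp for t, lp in zip(targets, log_probs))


utilities = [2.0, 0.1, 0.0]   # toy utilities: passage 0 helps the LLM most
aligned = utility_weighted_infonce([0.9, 0.2, 0.1], utilities)
misaligned = utility_weighted_infonce([0.1, 0.2, 0.9], utilities)
print(aligned < misaligned)   # True
```

With a one-hot target this reduces to the standard InfoNCE loss over one positive and in-batch negatives; the soft target is what lets graded utility shape the ranking.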

7. Emergent Strategic Reasoning Risks in AI: A Taxonomy-Driven Evaluation Framework

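arXiv: 2604.22119 · cs.AI · relevance score 23

The paper proposes the ESRR risk taxonomy and the ESRRSim automated evaluation framework to systematically measure strategic reasoning risks in LLMs; across 11 models, detection rates range from 14.45% to 72.72%.

Read the full analysis →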

8. QuantClaw: Precision Where It Matters for OpenClaw

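arXiv: 2604.22577 · cs.AI · relevance score 21

QuantClaw is a plug-and-play precision-routing plugin for the OpenClaw agent system. It dynamically assigns quantization precision according to task characteristics, saving up to 21.4% in cost and cutting latency by up to 15.7% relative to a GLM-5 FP8 baseline.

Read the full analysis →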

9. Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

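arXiv: 2604.21952 · cs.LG · relevance score 23

Proposes a hardware-software co-design methodology for accelerating multimodal foundation models (MFMs), spanning compression, inference optimization, and dedicated accelerators, validated on medical MFMs and code generation tasks.

Read the full analysis →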

10. Learning to Communicate: Toward End-to-End Optimization of Multi-Agent Language Systems

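arXiv: 2604.21794 · cs.AI · relevance score 22

DiffMAS treats latent communication between agents (the KV cache) as a learnable component and optimizes the reasoning chain end to end with parameter-efficient supervised training. It outperforms both single-agent and text-based multi-agent baselines on math, scientific QA, code, and commonsense benchmarks.

Read the full analysis →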

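Other papers matched on this date

These papers matched the same topic keywords but did not make the top-N in-depth analysis. They are sorted by relevance score in descending order.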

Nemobot Games: Crafting Strategic AI Gaming Agents for Interactive Learning with Large Language Models · cs.AI · arXiv 2604.21896 · score 23 — large language model, llm, agent, agentic, rag, reasoning
Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows · cs.AI · arXiv 2604.21816 · score 23 — large language model, llm, agent, agentic, reasoning, attention
Sovereign Agentic Loops: Decoupling AI Reasoning from Execution in Real-World Systems · cs.CR · arXiv 2604.22136 · score 21 — large language model, llm, agent, agentic, reasoning, latency
Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents · cs.AI · arXiv 2604.22085 · score 20 — large language model, agent, agentic, retrieval, inference, latency
Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation · cs.IR · arXiv 2604.21536 · score 20 — large language model, llm, reasoning, inference, serving, fine-tun
MambaCSP: Hybrid-Attention State Space Models for Hardware-Efficient Channel State Prediction · cs.IT · arXiv 2604.21957 · score 20 — large language model, llm, inference, attention, transformer, throughput
GR-Evolve: Design-Adaptive Global Routing via LLM-Driven Algorithm Evolution · cs.AR · arXiv 2604.22234 · score 19 — large language model, llm, agent, agentic, rag
Lightweight Retrieval-Augmented Generation and Large Language Model-Based Modeling for Scalable Patient-Trial Matching · cs.CL · arXiv 2604.22061 · score 19 — large language model, llm, retrieval, reasoning, serving, fine-tun
LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs · cs.LG · arXiv 2604.22050 · score 19 — llm, inference, serving, attention, transformer, throughput
Enhancing Online Recruitment with Category-Aware MoE and LLM-based Data Augmentation · cs.AI · arXiv 2604.21264 · score 23 — large language model, llm, rag, chain-of-thought, mixture of experts, moe
A Task Decomposition and Planning Framework for Efficient LLM Inference in AI-Enabled WiFi-Offload Networks · cs.DC · arXiv 2604.21399 · score 18 — large language model, llm, rag, reasoning, inference, latency
SparKV: Overhead-Aware KV Cache Loading for Efficient On-Device LLM Inference · cs.NI · arXiv 2604.21231 · score 18 — large language model, llm, inference, kv cache, latency
Bridging the Long-Tail Gap: Robust Retrieval-Augmented Relation Completion via Multi-Stage Paraphrase Infusion · cs.CL · arXiv 2604.22261 · score 17 — large language model, llm, retrieval, rag, reasoning, fine-tun
Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers · cs.CR · arXiv 2604.21700 · score 17 — large language model, llm, rag, serving, fine-tun
Reasoning Primitives in Hybrid and Non-Hybrid LLMs · cs.CL · arXiv 2604.21454 · score 17 — large language model, llm, retrieval, reasoning, attention, transformer
Large Language Models Decide Early and Explain Later · cs.CL · arXiv 2604.22266 · score 16 — large language model, rag, reasoning, chain-of-thought, inference, latency
ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression · cs.IR · arXiv 2604.22180 · score 16 — large language model, llm, retrieval, inference, latency
Reliable Self-Harm Risk Screening via Adaptive Multi-Agent LLM Systems · cs.LG · arXiv 2604.22154 · score 16 — llm, agent, multi-agent, ai system
HiCrew: Hierarchical Reasoning for Long-Form Video Understanding via Question-Aware Multi-Agent Collaboration · cs.AI · arXiv 2604.21444 · score 16 — agent, multi-agent, rag, reasoning, serving
Spatial Metaphors for LLM Memory: A Critical Analysis of the MemPalace Architecture · cs.AI · arXiv 2604.21284 · score 16 — large language model, llm, retrieval, rag, inference
Can QPP Choose the Right Query Variant? Evaluating Query Variant Selection for RAG Pipelines · cs.IR · arXiv 2604.22661 · score 15 — large language model, llm, retrieval, rag, latency
SpikingBrain2.0: Brain-Inspired Foundation Models for Efficient Long-Context and Cross-Platform Inference · cs.LG · arXiv 2604.22575 · score 15 — llm, inference, quantization, attention, transformer, gpu
CGC: Compositional Grounded Contrast for Fine-Grained Multi-Image Understanding · cs.CV · arXiv 2604.22498 · score 15 — large language model, llm, reasoning, chain-of-thought, attention
Evaluating LLM-Based Goal Extraction in Requirements Engineering: Prompting Strategies and Their Limitations · cs.SE · arXiv 2604.22207 · score 15 — large language model, llm, retrieval, rag, chain-of-thought
How Large Language Models Balance Internal Knowledge with User and Document Assertions · cs.CL · arXiv 2604.22193 · score 15 — large language model, llm, rag, fine-tun, post-train
Voice Under Revision: Large Language Models and the Normalization of Personal Narrative · cs.CL · arXiv 2604.22142 · score 15 — large language model, llm, reasoning, serving
PrivUn: Unveiling Latent Ripple Effects and Shallow Forgetting in Privacy Unlearning · cs.LG · arXiv 2604.22076 · score 15 — large language model, llm, retrieval, rag, fine-tun
Transient Turn Injection: Exposing Stateless Multi-Turn Vulnerabilities in Large Language Models · cs.CR · arXiv 2604.21860 · score 15 — large language model, llm, agent, rag
Language as a Latent Variable for Reasoning Optimization · cs.CL · arXiv 2604.21593 · score 15 — llm, reasoning, chain-of-thought, inference, serving
AgenticQwen: Training Small Agentic Language Models with Dual Data Flywheels for Industrial-Scale Tool Use · cs.CL · arXiv 2604.21590 · score 15 — agent, agentic, tool use, reasoning, latency
OptiVerse: A Comprehensive Benchmark towards Optimization Problem Solving · cs.CL · arXiv 2604.21510 · score 15 — large language model, llm, agent, reasoning
Efficient Agent Evaluation via Diversity-Guided User Simulation · cs.AI · arXiv 2604.21480 · score 15 — large language model, llm, agent, rag
ReaGeo: Reasoning-Enhanced End-to-End Geocoding with LLMs · cs.AI · arXiv 2604.21357 · score 15 — large language model, llm, retrieval, reasoning, chain-of-thought
CARE: Counselor-Aligned Response Engine for Online Mental-Health Support · cs.CL · arXiv 2604.21352 · score 15 — large language model, llm, agent, fine-tun
From Research Question to Scientific Workflow: Leveraging Agentic AI for Science Automation · cs.AI · arXiv 2604.21910 · score 14 — llm, agent, agentic, rag
Process Supervision via Verbal Critique Improves Reasoning in Large Language Models · cs.CL · arXiv 2604.21611 · score 14 — large language model, llm, reasoning, inference
Thinking Without Words: Efficient Latent Reasoning with Abstract Chain-of-Thought · cs.CL · arXiv 2604.22709 · score 13 — rag, reasoning, chain-of-thought, inference, fine-tun, post-train
SOLAR-RL: Semi-Online Long-horizon Assignment Reinforcement Learning · cs.LG · arXiv 2604.22558 · score 13 — large language model, llm, agent
RouteLMT: Learned Sample Routing for Hybrid LLM Translation Deployment · cs.CL · arXiv 2604.22520 · score 13 — large language model, llm, serving
CNSL-bench: Benchmarking the Sign Language Understanding Capabilities of MLLMs on Chinese National Sign Language · cs.CL · arXiv 2604.22367 · score 13 — large language model, llm, rag, reasoning
CAP: Controllable Alignment Prompting for Unlearning in LLMs · cs.LG · arXiv 2604.21251 · score 13 — large language model, llm, serving
Ethics Testing: Proactive Identification of Generative AI System Harms · cs.SE · arXiv 2604.22089 · score 13 — large language model, llm, ai system
TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale · cs.CL · arXiv 2604.21889 · score 13 — large language model, llm, throughput, latency
A Multimodal Text- and Graph-Based Approach for Open-Domain Event Extraction from Documents · cs.CL · arXiv 2604.21885 · score 13 — large language model, llm, reasoning, attention
GS-Quant: Granular Semantic and Generative Structural Quantization for Knowledge Graph Completion · cs.AI · arXiv 2604.21649 · score 13 — large language model, llm, reasoning, quantization
DryRUN: On the Role of Public Tests in LLM-Driven Code Generation · cs.SE · arXiv 2604.21598 · score 13 — large language model, llm, multi-agent
CoFEE: Reasoning Control for LLM-Based Feature Discovery · cs.AI · arXiv 2604.21584 · score 13 — large language model, llm, rag, reasoning
A Metamorphic Testing Approach to Diagnosing Memorization in LLM-Based Program Repair · cs.SE · arXiv 2604.21579 · score 13 — large language model, llm, serving
Measuring Opinion Bias and Sycophancy via LLM-based Coercion · cs.CL · arXiv 2604.21564 · score 13 — large language model, llm, agent
Job Skill Extraction via LLM-Centric Multi-Module Framework · cs.CL · arXiv 2604.21525 · score 13 — large language model, llm, retrieval, fine-tun
When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors · cs.CL · arXiv 2604.21255 · score 13 — llm, agent, tool-use, reasoning
VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation · cs.CL · arXiv 2604.21375 · score 12 — llm, agent, agentic
Introducing Background Temperature to Characterise Hidden Randomness in Large Language Models · cs.AI · arXiv 2604.22411 · score 12 — large language model, llm, inference
When Does LLM Self-Correction Help? A Control-Theoretic Markov Diagnostic and Verify-First Intervention · cs.AI · arXiv 2604.22273 · score 12 — llm, agent, agentic
A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism, Governance, and Dynamics in Complex Societies · cs.CY · arXiv 2604.22227 · score 12 — rag, serving, transformer, ai system
Sound Agentic Science Requires Adversarial Experiments · cs.AI · arXiv 2604.22080 · score 12 — llm, agent, agentic
Read the Paper, Write the Code: Agentic Reproduction of Social-Science Results · cs.AI · arXiv 2604.21965 · score 12 — llm, agent, agentic
StructMem: Structured Memory for Long-Horizon Behavior in LLMs · cs.CL · arXiv 2604.21748 · score 12 — llm, agent, rag, reasoning
AI-Gram: When Visual Agents Interact in a Social Network · cs.AI · arXiv 2604.21446 · score 12 — llm, agent, multi-agent
Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning · cs.LG · arXiv 2604.21327 · score 12 — large language model, rag, reasoning, inference
CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents · cs.CR · arXiv 2604.21308 · score 12 — llm, agent, retrieval, reasoning
GraphLeap: Decoupling Graph Construction and Convolution for Vision GNN Acceleration on FPGA · cs.CV · arXiv 2604.21290 · score 12 — inference, parallelism, transformer, gpu, fine-tun
Strategic Heterogeneous Multi-Agent Architecture for Cost-Effective Code Vulnerability Detection · cs.CR · arXiv 2604.21282 · score 12 — llm, agent, multi-agent
Do LLM Decoders Listen Fairly? Benchmarking How Language Model Priors Shape Bias in Speech Recognition · cs.CL · arXiv 2604.21276 · score 12 — large language model, llm, inference
Hyperloop Transformers · cs.LG · arXiv 2604.21254 · score 12 — llm, quantization, transformer, latency, post-train
Rethinking Math Reasoning Evaluation: A Robust LLM-as-a-Judge Framework Beyond Symbolic Rigidity · cs.AI · arXiv 2604.22597 · score 11 — large language model, llm, reasoning
Learning Evidence Highlighting for Frozen LLMs · cs.CL · arXiv 2604.22565 · score 11 — large language model, llm, reasoning
FeatEHR-LLM: Leveraging Large Language Models for Feature Engineering in Electronic Health Records · cs.LG · arXiv 2604.22534 · score 11 — large language model, llm, rag
Superminds Test: Actively Evaluating Collective Intelligence of Agent Society via Probing Agents · cs.AI · arXiv 2604.22452 · score 11 — large language model, agent, reasoning
SSG: Logit-Balanced Vocabulary Partitioning for LLM Watermarking · cs.CR · arXiv 2604.22438 · score 11 — large language model, llm, reasoning
Context-Fidelity Boosting: Enhancing Faithful Generation through Watermark-Inspired Decoding · cs.CL · arXiv 2604.22335 · score 11 — large language model, llm, attention
Dynamically Acquiring Text Content to Enable the Classification of Lesser-known Entities for Real-world Tasks · cs.CL · arXiv 2604.22325 · score 11 — large language model, llm, rag
BLAST: Benchmarking LLMs with ASP-based Structured Testing · cs.LO · arXiv 2604.22306 · score 11 — large language model, llm, attention
Tell Me Why: Designing an Explainable LLM-based Dialogue System for Student Problem Behavior Diagnosis · cs.CL · arXiv 2604.22237 · score 11 — large language model, llm, fine-tun
Recognition Without Authorization: LLMs and the Moral Order of Online Advice · cs.CY · arXiv 2604.22143 · score 11 — large language model, llm, rag
Reliability Auditing for Downstream LLM tasks in Psychiatry: LLM-Generated Hospitalization Risk Scores · cs.LG · arXiv 2604.22063 · score 11 — large language model, llm, reasoning
Call-Chain-Aware LLM-Based Test Generation for Java Projects · cs.SE · arXiv 2604.22046 · score 11 — large language model, llm, rag
Shared Lexical Task Representations Explain Behavioral Variability In LLMs · cs.CL · arXiv 2604.22027 · score 11 — large language model, llm, attention
Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions · cs.CL · arXiv 2604.21871 · score 11 — large language model, llm, reasoning
Thinking with Reasoning Skills: Fewer Tokens, More Accuracy · cs.AI · arXiv 2604.21764 · score 11 — llm, reasoning, chain-of-thought, inference
Promoting Simple Agents: Ensemble Methods for Event-Log Prediction · cs.LG · arXiv 2604.21629 · score 11 — agent, inference, transformer, latency
Separable Expert Architecture: Toward Privacy-Preserving LLM Personalization via Composable Adapters and Deletable User Proxies · cs.AI · arXiv 2604.21571 · score 11 — llm, inference, serving
Unbiased Prevalence Estimation with Multicalibrated LLMs · cs.AI · arXiv 2604.21549 · score 11 — large language model, llm, rag
VARestorer: One-Step VAR Distillation for Real-World Image Super-Resolution · cs.CV · arXiv 2604.21450 · score 11 — rag, inference, attention, transformer, fine-tun
Decoupled Travel Planning with Behavior Forest · cs.LG · arXiv 2604.21354 · score 11 — large language model, llm, reasoning
Symbolic Grounding Reveals Representational Bottlenecks in Abstract Visual Reasoning · cs.AI · arXiv 2604.21346 · score 11 — large language model, llm, reasoning
Ideological Bias in LLMs’ Economic Causal Reasoning · cs.AI · arXiv 2604.21334 · score 11 — large language model, llm, reasoning
Can MLLMs “Read” What is Missing? · cs.AI · arXiv 2604.21277 · score 11 — large language model, llm, rag
Unlocking the Power of Large Language Models for Multi-table Entity Matching · cs.CL · arXiv 2604.21238 · score 11 — large language model, llm, rag
Beyond N-gram: Data-Aware X-GRAM Extraction for Efficient Embedding Parameter Scaling · cs.CL · arXiv 2604.21724 · score 10 — retrieval, rag, serving, attention
Navigating Large-Scale Document Collections: MuDABench for Multi-Document Analytical QA · cs.CL · arXiv 2604.22239 · score 10 — multi-agent, retrieval, rag, reasoning
Sum-of-Checks: Structured Reasoning for Surgical Safety with Large Vision-Language Models · cs.LG · arXiv 2604.22156 · score 10 — rag, reasoning, chain-of-thought, ai system
Removing Sandbagging in LLMs by Training with Weak Supervision · cs.LG · arXiv 2604.22082 · score 10 — llm, ai system, fine-tun
Source-Modality Monitoring in Vision-Language Models · cs.CL · arXiv 2604.22038 · score 10 — agent, agentic, retrieval
AEL: Agent Evolving Learning for Open-Ended Environments · cs.CL · arXiv 2604.21725 · score 10 — llm, agent, retrieval
GeoMind: An Agentic Workflow for Lithology Classification with Reasoned Tool Invocation · cs.AI · arXiv 2604.21501 · score 10 — agent, agentic, reasoning
FairQE: Multi-Agent Framework for Mitigating Gender Bias in Translation Quality Estimation · cs.AI · arXiv 2604.21420 · score 10 — llm, multi-agent, reasoning
Representational Harms in LLM-Generated Narratives Against Global Majority Nationalities · cs.CL · arXiv 2604.22749 · score 9 — large language model, llm
Dharma, Data and Deception: An LLM-Powered Rhetorical Analysis of Cow-Urine Health Claims on YouTube · cs.CL · arXiv 2604.22606 · score 9 — large language model, llm
From Natural Language to Verified Code: Toward AI Assisted Problem-to-Code Generation with Dafny-Based Formal Verification · cs.SE · arXiv 2604.22601 · score 9 — large language model, llm
Controllable Spoken Dialogue Generation: An LLM-Driven Grading System for K-12 Non-Native English Learners · cs.CL · arXiv 2604.22542 · score 9 — large language model, llm
HGQ-LUT: Fast LUT-Aware Training and Efficient Architectures for DNN Inference · cs.AR · arXiv 2604.22293 · score 9 — inference, quantization, gpu, latency
How LLMs Detect and Correct Their Own Errors: The Role of Internal Confidence Signals · cs.LG · arXiv 2604.22271 · score 9 — large language model, llm
A Probabilistic Framework for Hierarchical Goal Recognition · cs.SC · arXiv 2604.22256 · score 9 — agent, reasoning, inference
When AI Speaks, Whose Values Does It Express? A Cross-Cultural Audit of Individualism-Collectivism Bias in Large Language Models · cs.CL · arXiv 2604.22153 · score 9 — large language model, ai system
SHAPE: Unifying Safety, Helpfulness and Pedagogy for Educational LLMs · cs.CL · arXiv 2604.22134 · score 9 — large language model, llm
PermaFrost-Attack: Stealth Pretraining Seeding(SPS) for planting Logic Landmines During LLM Training · cs.LG · arXiv 2604.22117 · score 9 — large language model, llm
Spontaneous Persuasion: An Audit of Model Persuasiveness in Everyday Conversations · cs.HC · arXiv 2604.22109 · score 9 — large language model, llm
When Cow Urine Cures Constipation on YouTube: Limits of LLMs in Detecting Culture-specific Health Misinformation · cs.CL · arXiv 2604.22002 · score 9 — large language model, llm
Evaluation of Automatic Speech Recognition Using Generative Large Language Models · cs.CL · arXiv 2604.21928 · score 9 — large language model, llm
Revisiting Non-Verbatim Memorization in Large Language Models: The Role of Entity Surface Forms · cs.CL · arXiv 2604.21882 · score 9 — large language model, llm
Leveraging SIMD for Accelerating Large-number Arithmetic · cs.DC · arXiv 2604.21566 · score 9 — rag, parallelism, throughput, latency
MISTY: High-Throughput Motion Planning via Mixer-based Single-step Drifting · cs.RO · arXiv 2604.21489 · score 9 — inference, attention, throughput, latency
Differentially Private De-identification of Dutch Clinical Notes: A Comparative Evaluation · cs.CR · arXiv 2604.21421 · score 9 — large language model, llm
Time, Causality, and Observability Failures in Distributed AI Inference Systems · cs.AI · arXiv 2604.21361 · score 9 — inference, ai system, throughput
When Bigger Isn’t Better: A Comprehensive Fairness Evaluation of Political Bias in Multi-News Summarisation · cs.CL · arXiv 2604.21309 · score 9 — large language model, llm
ReCAPA: Hierarchical Predictive Correction to Mitigate Cascading Failures · cs.AI · arXiv 2604.21232 · score 9 — large language model, agent
EngramaBench: Evaluating Long-Term Conversational Memory with Structured Graph Retrieval · cs.CL · arXiv 2604.21229 · score 9 — large language model, retrieval, reasoning
Aggregate vs. Personalized Judges in Business Idea Evaluation: Evidence from Expert Disagreement · cs.CL · arXiv 2604.22517 · score 8 — llm, rag, reasoning
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company · cs.AI · arXiv 2604.22446 · score 8 — agent, multi-agent
Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets · cs.CL · arXiv 2604.22294 · score 8 — llm, rag, reasoning
ReLeVAnT: Relevance Lexical Vectors for Accurate Legal Text Classification · cs.CL · arXiv 2604.22292 · score 8 — llm, retrieval, rag
An LLM-Driven Closed-Loop Autonomous Learning Framework for Robots Facing Uncovered Tasks in Open Environments · cs.RO · arXiv 2604.22199 · score 8 — llm, rag, reasoning
Hardware-Software Co-Design for Event-Driven SNN Deployment on Low-Cost Neuromorphic FPGAs · cs.AR · arXiv 2604.22179 · score 8 — serving, gpu, latency
FlashSpread: IO-Aware GPU Simulation of Non-Markovian Epidemic Dynamics via Kernel Fusion · cs.DC · arXiv 2604.22092 · score 8 — rag, gpu, cuda, throughput
When Quotes Crumble: Detecting Transient Mechanical Liquidity Erosion in Limit Order Books · cs.LG · arXiv 2604.21993 · score 8 — agent, multi-agent
When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs · cs.CV · arXiv 2604.21911 · score 8 — rag, serving, fine-tun
Low-Rank Adaptation Redux for Large Models · cs.LG · arXiv 2604.21905 · score 8 — serving, fine-tun, post-train
TraceScope: Interactive URL Triage via Decoupled Checklist Adjudication · cs.CR · arXiv 2604.21840 · score 8 — llm, agent
Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs · cs.CL · arXiv 2604.21751 · score 8 — llm, rag, fine-tun
Agentic AI-assisted coding offers a unique opportunity to instill epistemic grounding during software development · cs.SE · arXiv 2604.21744 · score 8 — agent, agentic
Risk-Aware and Stable Edge Server Selection Under Network Latency SLOs · cs.DC · arXiv 2604.21483 · score 8 — rag, serving, latency
CSC: Turning the Adversary’s Poison against Itself · cs.CR · arXiv 2604.21416 · score 8 — rag, serving, fine-tun
Adaptive Head Budgeting for Efficient Multi-Head Attention · cs.LG · arXiv 2604.22583 · score 7 — inference, attention, transformer
From Local to Cluster: A Unified Framework for Causal Discovery with Latent Variables · cs.LG · arXiv 2604.22416 · score 7 — rag, reasoning, inference
Fast Neural-Network Approximation of Active Target Search Under Uncertainty · cs.LG · arXiv 2604.22254 · score 7 — agent, inference
GenMatter: Perceiving Physical Objects with Generative Matter Models · cs.CV · arXiv 2604.22160 · score 7 — inference, serving
Who Audits the Auditor? Tamper-Proof Fraud Detection with Blockchain-Anchored Explainable ML · cs.CR · arXiv 2604.22096 · score 7 — rag, inference, latency
Incentivizing Neuro-symbolic Language-based Reasoning in VLMs via Reinforcement Learning · cs.CL · arXiv 2604.22062 · score 7 — reasoning, inference, gpu
Kernel Contracts: A Specification Language for ML Kernel Correctness Across Heterogeneous Silicon · cs.LG · arXiv 2604.22032 · score 7 — attention, cuda, compiler
EVENT5Ws: A Large Dataset for Open-Domain Event Extraction from Documents · cs.CL · arXiv 2604.21890 · score 7 — large language model, rag
From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation · cs.CL · arXiv 2604.21716 · score 7 — large language model, rag
BioMiner: A Multi-modal System for Automated Mining of Protein-Ligand Bioactivity Data from Literature · cs.AI · arXiv 2604.21508 · score 7 — large language model, reasoning
How English Print Media Frames Human-Elephant Conflicts in India · cs.AI · arXiv 2604.21496 · score 7 — large language model, transformer
How Supply Chain Dependencies Complicate Bias Measurement and Accountability Attribution in AI Hiring Applications · cs.CY · arXiv 2604.22679 · score 6 — rag, ai system
BERAG: Bayesian Ensemble Retrieval-Augmented Generation for Knowledge-based Visual Question Answering · cs.CL · arXiv 2604.22678 · score 6 — retrieval, rag, fine-tun
Chamelio: A Fast Shared Cloud Network Stack for Isolated Tenant-Defined Protocols · cs.NI · arXiv 2604.22603 · score 6 — serving, latency
HubRouter: A Pluggable Sub-Quadratic Routing Primitive for Hybrid Sequence Models · cs.LG · arXiv 2604.22442 · score 6 — attention, transformer, throughput
AgentSearchBench: A Benchmark for AI Agent Search in the Wild · cs.AI · arXiv 2604.22436 · score 6 — agent, retrieval
STEM: Structure-Tracing Evidence Mining for Knowledge Graphs-Driven Retrieval-Augmented Generation · cs.CL · arXiv 2604.22282 · score 6 — retrieval, rag, reasoning
Accelerating Intra-Node GPU-to-GPU Communication Through Multi-Path Transfers with CUDA Graphs · cs.DC · arXiv 2604.22228 · score 6 — rag, gpu, cuda
Verbal Confidence Saturation in 3-9B Open-Weight Instruction-Tuned LLMs: A Pre-Registered Psychometric Validity Screen · cs.CL · arXiv 2604.22215 · score 6 — llm, reasoning
Where Should LoRA Go? Component-Type Placement in Hybrid Language Models · cs.CL · arXiv 2604.22127 · score 6 — attention, transformer, fine-tun
An End-to-End Ukrainian RAG for Local Deployment. Optimized Hybrid Search and Lightweight Generation · cs.CL · arXiv 2604.22095 · score 6 — retrieval, rag, fine-tun
Outcome Rewards Do Not Guarantee Verifiable or Causally Important Reasoning · cs.CL · arXiv 2604.22074 · score 6 — reasoning, chain-of-thought, post-train
Universal Transformers Need Memory: Depth-State Trade-offs in Adaptive Recursive Reasoning · cs.LG · arXiv 2604.21999 · score 6 — reasoning, attention, transformer
A Multi-Stage Warm-Start Deep Learning Framework for Unit Commitment · eess.SY · arXiv 2604.21891 · score 6 — rag, attention, transformer
SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning · cs.CL · arXiv 2604.21782 · score 6 — llm, fine-tun
PrismaDV: Automated Task-Aware Data Unit Test Generation · cs.LG · arXiv 2604.21765 · score 6 — rag, ai system
A-IC3: Learning-Guided Adaptive Inductive Generalization for Hardware Model Checking · cs.LO · arXiv 2604.21688 · score 6 — agent, attention
Task-specific Subnetwork Discovery in Reinforcement Learning for Autonomous Underwater Navigation · cs.LG · arXiv 2604.21640 · score 6 — agent, rag
A-THENA: Early Intrusion Detection for IoT with Time-Aware Hybrid Encoding and Network-Specific Augmentation · cs.CR · arXiv 2604.21623 · score 6 — rag, transformer, latency
A Kernel Nonconformity Score for Multivariate Conformal Prediction · stat.ML · arXiv 2604.21595 · score 6 — rag, serving
Attention-based multiple instance learning for predominant growth pattern prediction in lung adenocarcinoma wsi using foundation models · cs.CV · arXiv 2604.21530 · score 6 — rag, attention, fine-tun
Sub-Token Routing in LoRA for Adaptation and Query-Aware KV Compression · cs.LG · arXiv 2604.21335 · score 6 — serving, transformer
Planning Beyond Text: Graph-based Reasoning for Complex Narrative Generation · cs.CL · arXiv 2604.21253 · score 6 — llm, reasoning
Microarchitectural Co-Optimization for Sustained Throughput of RISC-V Multi-Lane Chaining Vector Processors · cs.AR · arXiv 2604.22314 · score 5 — parallelism, throughput
Exploiting pre-optimized kernels with polyhedral transformations for CGRA compilation · cs.AR · arXiv 2604.22297 · score 5 — rag, parallelism
Anatomy-Aware Unsupervised Detection and Localization of Retinal Abnormalities in Optical Coherence Tomography · cs.CV · arXiv 2604.22139 · score 5 — rag, inference
Learning Coverage- and Power-Optimal Transmitter Placement from Building Maps: A Comparative Study of Direct and Indirect Neural Approaches · cs.LG · arXiv 2604.22056 · score 5 — rag, inference
Null-Space Flow Matching for MIMO Channel Estimation in Latency-Constrained Systems · cs.IT · arXiv 2604.22005 · score 5 — inference, latency
Efficient Logic Gate Networks for Video Copy Detection · cs.CV · arXiv 2604.21694 · score 5 — inference, throughput
Fine-Grained Perspectives: Modeling Explanations with Annotator-Specific Rationales · cs.CL · arXiv 2604.21667 · score 5 — inference, fine-tun
Cross-Domain Data Selection and Augmentation for Automatic Compliance Detection · cs.CL · arXiv 2604.21469 · score 5 — retrieval, inference
Channel-Free Human Activity Recognition via Inductive-Bias-Aware Fusion Design for Heterogeneous IoT Sensor Environments · cs.LG · arXiv 2604.21369 · score 5 — rag, inference
Rethinking XAI Evaluation: A Human-Centered Audit of Shapley Benchmarks in High-Stakes Settings · cs.LG · arXiv 2604.22662 · score 4 — rag, latency
Quality-Driven Selective Mutation for Deep Learning · cs.SE · arXiv 2604.22640 · score 4 — serving
Adversarial Malware Generation in Linux ELF Binaries via Semantic-Preserving Transformations · cs.CR · arXiv 2604.22639 · score 4 — serving
Explanation of Dynamic Physical Field Predictions using WassersteinGrad: Application to Autoregressive Weather Forecasting · stat.ML · arXiv 2604.22580 · score 4 — rag, reasoning
ArmSSL: Adversarial Robust Black-Box Watermarking for Self-Supervised Learning Pre-trained Encoders · cs.CR · arXiv 2604.22550 · score 4 — serving
Different Strokes for Different Folks: Writer Identification for Historical Arabic Manuscripts · cs.CV · arXiv 2604.22515 · score 4 — attention, fine-tun
Hidden Failure Modes of Gradient Modification under Adam in Continual Learning, and Adaptive Decoupled Moment Routing as a Repair · cs.LG · arXiv 2604.22407 · score 4 — serving
SOC-ICNN: From Polyhedral to Conic Geometry for Learning Convex Surrogate Functions · cs.LG · arXiv 2604.22355 · score 4 — serving
A Nationwide Japanese Medical Claims Foundation Model: Balancing Model Scaling and Task-Specific Computational Efficiency · cs.LG · arXiv 2604.22348 · score 4 — rag, transformer
TabSCM: A practical Framework for Generating Realistic Tabular Data · cs.LG · arXiv 2604.22337 · score 4 — llm
CLARITY: A Framework and Benchmark for Conversational Language Ambiguity and Unanswerability in Interactive NL2SQL Systems · cs.CL · arXiv 2604.22313 · score 4 — llm
Semantic Error Correction and Decoding for Short Block Channel Codes · cs.IT · arXiv 2604.22269 · score 4 — transformer, latency
Towards Safe Mobility: A Unified Transportation Foundation Model enabled by Open-Ended Vision-Language Dataset · cs.CV · arXiv 2604.22260 · score 4 — reasoning, attention
Algorithmic Feature Highlighting for Human-AI Decision-Making · cs.GT · arXiv 2604.22236 · score 4 — agent
UniSonate: A Unified Model for Speech, Music, and Sound Effect Generation with Text Instructions · eess.AS · arXiv 2604.22209 · score 4 — rag, transformer
Optimal sequential decision-making for error propagation mitigation in digital twins · cs.LG · arXiv 2604.22168 · score 4 — serving
Logistic Bandits with $\tilde{O}(\sqrt{dT})$ Regret without Context Diversity Assumptions · cs.LG · arXiv 2604.22161 · score 4 — agent
Dissociating Decodability and Causal Use in Bracket-Sequence Transformers · cs.CL · arXiv 2604.22128 · score 4 — attention, transformer
GICC: A High-Performance Runtime for GPU-Initiated Communication and Coordination in Modern HPC Systems · cs.DC · arXiv 2604.22126 · score 4 — gpu, latency
Do Not Imitate, Reinforce: Iterative Classification via Belief Refinement · cs.LG · arXiv 2604.22110 · score 4 — agent
Knowledge-driven Augmentation and Retrieval for Integrative Temporal Adaptation · cs.CL · arXiv 2604.22098 · score 4 — retrieval, rag
Optimal Question Selection from a Large Question Bank for Clinical Field Recovery in Conversational Psychiatric Intake · cs.CL · arXiv 2604.22067 · score 4 — llm
Foundation models for discovering robust biomarkers of neurological disorders from dynamic functional connectivity · q-bio.NC · arXiv 2604.22018 · score 4 — attention, fine-tun
Seeing Fast and Slow: Learning the Flow of Time in Videos · cs.CV · arXiv 2604.21931 · score 4 — reasoning, attention
MathDuels: Evaluating LLMs as Problem Posers and Solvers · cs.CL · arXiv 2604.21916 · score 4 — llm
SPAC: Automating FPGA-based Network Switches with Protocol Adaptive Customization · cs.NI · arXiv 2604.21881 · score 4 — throughput, latency
Locating acts of mechanistic reasoning in student team conversations with mechanistic machine learning · physics.ed-ph · arXiv 2604.21870 · score 4 — rag, reasoning
Alignment has a Fantasia Problem · cs.AI · arXiv 2604.21827 · score 4 — ai system
Who Defines “Best”? Towards Interactive, User-Defined Evaluation of LLM Leaderboards · cs.AI · arXiv 2604.21769 · score 4 — llm
AUDITA: A New Dataset to Audit Humans vs. AI Skill at Audio QA · cs.CL · arXiv 2604.21766 · score 4 — rag, reasoning
Bridging the Training-Deployment Gap: Gated Encoding and Multi-Scale Refinement for Efficient Quantization-Aware Image Enhancement · cs.AI · arXiv 2604.21743 · score 4 — quantization, post-train
Fairness under uncertainty in sequential decisions · cs.LG · arXiv 2604.21711 · score 4 — serving
Evaluating Post-hoc Explanations of the Transformer-based Genome Language Model DNABERT-2 · cs.LG · arXiv 2604.21690 · score 4 — attention, transformer
To See the Unseen: on the Generalization Ability of Transformers in Symbolic Reasoning · cs.AI · arXiv 2604.21632 · score 4 — reasoning, transformer
On the Role of Preprocessing and Memristor Dynamics in Reservoir Computing for Image Classification · cs.NE · arXiv 2604.21602 · score 4 — quantization, attention
A systematic review of generative AI usage for IT project management · cs.SE · arXiv 2604.21958 · score 4 — agent
UKP_Psycontrol at SemEval-2026 Task 2: Modeling Valence and Arousal Dynamics from Text · cs.CL · arXiv 2604.21534 · score 4 — llm
Architectures for Robust Self-Organizing Energy Systems under Information and Control Constraints · cs.MA · arXiv 2604.21529 · score 4 — agent
From Tokens to Concepts: Leveraging SAE for SPLADE · cs.IR · arXiv 2604.21511 · score 4 — retrieval, rag
Generalizing Numerical Reasoning in Table Data through Operation Sketches and Self-Supervised Learning · cs.LG · arXiv 2604.21495 · score 4 — reasoning, fine-tun
Dynamical Priors as a Training Objective in Reinforcement Learning · cs.LG · arXiv 2604.21464 · score 4 — agent
Brief chatbot interactions produce lasting changes in human moral values · cs.AI · arXiv 2604.21430 · score 4 — agent
SemanticAgent: A Semantics-Aware Framework for Text-to-SQL Data Synthesis · cs.AI · arXiv 2604.21414 · score 4 — reasoning, fine-tun
VG-CoT: Towards Trustworthy Visual Reasoning via Grounded Chain-of-Thought · cs.CV · arXiv 2604.21396 · score 4 — reasoning, chain-of-thought
Supervised Learning Has a Necessary Geometric Blind Spot: Theory, Consequences, and Minimal Repair · cs.LG · arXiv 2604.21395 · score 4 — rag, fine-tun
mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code · cs.LG · arXiv 2604.21365 · score 4 — llm
Beyond Single Plots: A Benchmark for Question Answering on Multi-Charts · cs.CL · arXiv 2604.21344 · score 4 — llm
Exploring the Role of Synthetic Data Augmentation in Controllable Human-Centric Video Generation · cs.CV · arXiv 2604.21291 · score 4 — serving
Optimizing High-Throughput Distributed Data Pipelines for Reproducible Deep Learning at Scale · cs.DC · arXiv 2604.21275 · score 4 — gpu, throughput
Trustworthy Clinical Decision Support Using Meta-Predicates and Domain-Specific Languages · cs.AI · arXiv 2604.21263 · score 4 — serving
Towards Adaptive Continual Model Merging via Manifold-Aware Expert Evolution · cs.LG · arXiv 2604.22464 · score 3 — moe
Preserve Support, Not Correspondence: Dynamic Routing for Offline Reinforcement Learning · cs.LG · arXiv 2604.22229 · score 3 — inference
Multimodal Diffusion to Mutually Enhance Polarized Light and Low Resolution EBSD Data · eess.IV · arXiv 2604.22212 · score 3 — inference
Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning · cs.LG · arXiv 2604.22031 · score 3 — inference
Bounding the Black Box: A Statistical Certification Framework for AI Risk Regulation · cs.AI · arXiv 2604.21854 · score 3 — inference
Ramen: Robust Test-Time Adaptation of Vision-Language Models with Active Sample Selection · cs.CV · arXiv 2604.21728 · score 3 — inference
Causal Disentanglement for Full-Reference Image Quality Assessment · cs.CV · arXiv 2604.21654 · score 3 — inference
Suppressing the Erasure Error of Fusion Operation in Photonic Quantum Computing · quant-ph · arXiv 2604.21475 · score 3 — compiler
Tempered Sequential Monte Carlo for Trajectory and Policy Optimization with Differentiable Dynamics · cs.LG · arXiv 2604.21456 · score 3 — inference
Even More Guarantees for Variational Inference in the Presence of Symmetries · cs.LG · arXiv 2604.21407 · score 3 — inference
Cross-Entropy Is Load-Bearing: A Pre-Registered Scope Test of the K-Way Energy Probe on Bidirectional Predictive Coding · cs.CL · arXiv 2604.21286 · score 3 — inference
Calibeating Prediction-Powered Inference · stat.ML · arXiv 2604.21260 · score 3 — inference
Neural Recovery of Historical Lexical Structure in Bantu Languages from Modern Data · cs.LG · arXiv 2604.22730 · score 2 — transformer
Zero-Shot Morphological Discovery in Low-Resource Bantu Languages via Cross-Lingual Transfer and Unsupervised Clustering · cs.LG · arXiv 2604.22723 · score 2 — rag
CRAFT: Clustered Regression for Adaptive Filtering of Training data · cs.CL · arXiv 2604.22693 · score 2 — fine-tun
Operational Feature Fingerprints of Graph Datasets via a White-Box Signal-Subspace Probe · cs.LG · arXiv 2604.22676 · score 2 — rag
Detecting Concept Drift in Evolving Malware Families Using Rule-Based Classifier Representations · cs.CR · arXiv 2604.22629 · score 2 — rag
Adversarial Co-Evolution of Malware and Detection Models: A Bilevel Optimization Perspective · cs.CR · arXiv 2604.22569 · score 2 — rag
Cross-Stage Coherence in Hierarchical Driving VQA: Explicit Baselines and Learned Gated Context Projectors · cs.CV · arXiv 2604.22560 · score 2 — reasoning
QDTraj: Exploration of Diverse Trajectory Primitives for Articulated Objects Robotic Manipulation · cs.RO · arXiv 2604.22551 · score 2 — rag
Multi-output Extreme Spatial Model for Complex Aircraft Production Systems · stat.AP · arXiv 2604.22548 · score 2 — rag
Measuring and Mitigating Persona Distortions from AI Writing Assistance · cs.CL · arXiv 2604.22503 · score 2 — rag
Decoding High-Dimensional Finger Motion from EMG Using Riemannian Features and RNNs · cs.LG · arXiv 2604.22499 · score 2 — rag
FedSPDnet: Geometry-Aware Federated Deep Learning with SPDnet · stat.ML · arXiv 2604.22494 · score 2 — rag
On the Hybrid Nature of ABPMS Process Frames and its Implications on Automated Process Discovery · cs.AI · arXiv 2604.22455 · score 2 — rag
Beyond Land Surface Temperature: Explainable Spatial Machine Learning Reveals Urban Morphology Effects on Human-Centric Heat Stress · cs.LG · arXiv 2604.22433 · score 2 — gpu
A comprehensive evaluation of spatial co-execution on GPUs using MPS and MIG technologies · cs.DC · arXiv 2604.22430 · score 2 — gpu
CognitiveTwin: Robust Multi-Modal Digital Twins for Predicting Cognitive Decline in Alzheimer’s Disease · cs.AI · arXiv 2604.22428 · score 2 — transformer
Distance-Misaligned Training in Graph Transformers and Adaptive Graph-Aware Control · cs.LG · arXiv 2604.22413 · score 2 — transformer
Conformalized Super Learner · stat.ML · arXiv 2604.22391 · score 2 — rag
Pack only the essentials: Adaptive dictionary learning for kernel ridge regression · stat.ML · arXiv 2604.22386 · score 2 — rag
Revisiting Neural Activation Coverage for Uncertainty Estimation · cs.LG · arXiv 2604.22360 · score 2 — rag
ChangeQuery: Advancing Remote Sensing Change Analysis for Natural and Human-Induced Disasters from Visual Detection to Semantic Understanding · cs.CV · arXiv 2604.22333 · score 2 — reasoning
AutoINV: Automated Invariant Generation Framework for Formal Verification on High-Level Synthesis Designs · cs.AR · arXiv 2604.22285 · score 2 — rag
TTS-PRISM: A Perceptual Reasoning and Interpretable Speech Model for Fine-Grained Diagnosis · cs.CL · arXiv 2604.22225 · score 2 — reasoning
From Global to Local: Rethinking CLIP Feature Aggregation for Person Re-Identification · cs.CV · arXiv 2604.22190 · score 2 — rag
FixV2W: Correcting Invalid CVE-CWE Mappings with Knowledge Graph Embeddings · cs.CR · arXiv 2604.22176 · score 2 — rag
Fine-Grained Analysis of Shared Syntactic Mechanisms in Language Models · cs.CL · arXiv 2604.22166 · score 2 — attention
Wiggle and Go! System Identification for Zero-Shot Dynamic Rope Manipulation · cs.RO · arXiv 2604.22102 · score 2 — rag
Generating Synthetic Malware Samples Using Generative AI · cs.LG · arXiv 2604.22084 · score 2 — rag
Shard the Gradient, Scale the Model: Serverless Federated Aggregation via Gradient Partitioning · cs.DC · arXiv 2604.22072 · score 2 — rag
EgoMAGIC- An Egocentric Video Field Medicine Dataset for Training Perception Algorithms · cs.CV · arXiv 2604.22036 · score 2 — rag
Fine-Tuning Regimes Define Distinct Continual Learning Problems · cs.LG · arXiv 2604.21927 · score 2 — fine-tun
A Scale-Adaptive Framework for Joint Spatiotemporal Super-Resolution with Diffusion Models · cs.LG · arXiv 2604.21903 · score 2 — attention
GiVA: Gradient-Informed Bases for Vector-Based Adaptation · cs.CL · arXiv 2604.21901 · score 2 — fine-tun
Revealing Geography-Driven Signals in Zone-Level Claim Frequency Models: An Empirical Study using Environmental and Visual Predictors · stat.ML · arXiv 2604.21893 · score 2 — transformer
Addressing Image Authenticity When Cameras Use Generative AI · cs.CV · arXiv 2604.21879 · score 2 — rag
Replay-buffer engineering for noise-robust quantum circuit optimization · quant-ph · arXiv 2604.21863 · score 2 — rag
On the algebra of Koopman eigenfunctions and on some of their infinities · math.DS · arXiv 2604.21825 · score 2 — rag
Divide-then-Diagnose: Weaving Clinician-Inspired Contexts for Ultra-Long Capsule Endoscopy Videos · cs.CV · arXiv 2604.21814 · score 2 — reasoning
Inferring High-Level Events from Timestamped Data: Complexity and Medical Applications · cs.AI · arXiv 2604.21793 · score 2 — reasoning
Compliance Moral Hazard and the Backfiring Mandate · cs.GT · arXiv 2604.21789 · score 2 — rag
Enabling and Inhibitory Pathways of University Students’ Willingness to Disclose AI Use: A Cognition-Affect-Conation Perspective · cs.AI · arXiv 2604.21733 · score 2 — rag
Towards Universal Tabular Embeddings: A Benchmark Across Data Tasks · cs.LG · arXiv 2604.21696 · score 2 — retrieval
Geometric Monomial (GEM): a family of rational 2N-differentiable activation functions · cs.LG · arXiv 2604.21677 · score 2 — transformer
Large-Scale Data Parallelization of Product Quantization and Inverted Indexing Using Dask · cs.LG · arXiv 2604.21645 · score 2 — quantization
Geometric Characterisation and Structured Trajectory Surrogates for Clinical Dataset Condensation · cs.LG · arXiv 2604.21638 · score 2 — rag
Finding Meaning in Embeddings: Concept Separation Curves · cs.CL · arXiv 2604.21555 · score 2 — rag
The CriticalSet problem: Identifying Critical Contributors in Bipartite Dependency Networks · cs.AI · arXiv 2604.21537 · score 2 — rag
Seeing Isn’t Believing: Uncovering Blind Spots in Evaluator Vision-Language Models · cs.CV · arXiv 2604.21523 · score 2 — reasoning
Satisfying Rationality Postulates of Structured Argumentation Through Deductive Support – Technical Report · cs.AI · arXiv 2604.21515 · score 2 — reasoning
Drug Synergy Prediction via Residual Graph Isomorphism Networks and Attention Mechanisms · cs.LG · arXiv 2604.21473 · score 2 — attention
Research on the efficiency of data loading and storage in Data Lakehouse architectures for the formation of analytical data systems · cs.DC · arXiv 2604.21449 · score 2 — rag
Decoupled DiLoCo for Resilient Distributed Pre-training · cs.CL · arXiv 2604.21428 · score 2 — rag
A Green-Integral-Constrained Neural Solver with Stochastic Physics-Informed Regularization · cs.LG · arXiv 2604.21411 · score 2 — gpu
Conjecture and Inquiry: Quantifying Software Performance Requirements via Interactive Retrieval-Augmented Preference Elicitation · cs.SE · arXiv 2604.21380 · score 2 — retrieval
MKJ at SemEval-2026 Task 9: A Comparative Study of Generalist, Specialist, and Ensemble Strategies for Multilingual Polarization · cs.CL · arXiv 2604.21370 · score 2 — rag
Evaluating AI Meeting Summaries with a Reusable Cross-Domain Pipeline · cs.AI · arXiv 2604.21345 · score 2 — rag
MiMIC: Mitigating Visual Modality Collapse in Universal Multimodal Retrieval While Avoiding Semantic Misalignment · cs.CV · arXiv 2604.21326 · score 2 — retrieval
Listen and Chant Before You Read: The Ladder of Beauty in LM Pre-Training · cs.CL · arXiv 2604.21265 · score 2 — transformer
Improving Performance in Classification Tasks with LCEN and the Weighted Focal Differentiable MCC Loss · cs.LG · arXiv 2604.21252 · score 2 — rag
Learning Dynamic Representations and Policies from Multimodal Clinical Time-Series with Informative Missingness · cs.LG · arXiv 2604.21235 · score 2 — rag
