-
May 29, 2026
When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems
-
May 29, 2026
SAAS: Self-Aware Reinforcement Learning for Over-Search Mitigation in Agentic Search
-
May 29, 2026
RewardFlow: Topology-Aware Reward Propagation on State Graphs for Agentic RL with Large Language Models
-
May 29, 2026
ToolSpec: Accelerating Tool Calling via Schema-Aware and Retrieval-Augmented Speculative Decoding
-
May 29, 2026
SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding
-
May 29, 2026
GrepSeek: Training Search Agents for Direct Corpus Interaction
-
May 29, 2026
RTP-LLM: High-Performance Alibaba LLM Inference Engine
-
May 29, 2026
Tiny Brains, Giant Impact: Uncovering the Keystone Neurons of LLM with Just a Few Prompts
-
May 29, 2026
The Curse of Helpfulness: Inverse Scaling Law in Robustness to Distractor Instructions via DistractionIF
-
May 29, 2026
Reasoning and Tool-use Compete in Agentic RL: From Quantifying Interference to Disentangled Tuning
-
April 28, 2026
AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents
-
April 28, 2026
BitRL: Reinforcement Learning with 1-bit Quantized Language Models for Resource-Constrained Edge Deployment
-
April 28, 2026
DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference
-
April 28, 2026
Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols
-
April 28, 2026
Grounding Before Generalizing: How AI Differs from Humans in Causal Transfer
-
April 28, 2026
PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model
-
April 28, 2026
Stabilizing Efficient Reasoning with Step-Level Advantage Selection
-
April 28, 2026
The Chameleon's Limit: Investigating Persona Collapse and Homogenization in Large Language Models
-
April 28, 2026
Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling
-
April 28, 2026
FlashOverlap: Minimizing Tail Latency in Communication Overlap for Distributed LLM Training