Filter, Then Reweight: Rethinking Optimization Granularity in On-Policy Distillation Paper • 2606.02684 • Published 14 days ago • 16
Rethinking Continual Experience Internalization for Self-Evolving LLM Agents Paper • 2606.04703 • Published 12 days ago • 23
Rethinking Continual Experience Internalization for Self-Evolving LLM Agents Paper • 2606.04703 • Published 12 days ago • 23
Rethinking Continual Experience Internalization for Self-Evolving LLM Agents Paper • 2606.04703 • Published 12 days ago • 23
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published Apr 14 • 110
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published Apr 14 • 110
AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents Paper • 2603.14465 • Published Mar 15 • 23
AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents Paper • 2603.14465 • Published Mar 15 • 23
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation Paper • 2602.12125 • Published Feb 12 • 67
Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning Paper • 2506.07851 • Published Jun 9, 2025