Wenkai Yang

Keven16

https://keven980716.github.io/

keven980716

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 6 days ago

Filter, Then Reweight: Rethinking Optimization Granularity in On-Policy Distillation

Paper • 2606.02684 • Published 14 days ago • 16

authored a paper 9 days ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published 12 days ago • 23

upvoted a paper 10 days ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published 12 days ago • 23

submitted a paper to Daily Papers 10 days ago

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Paper • 2606.04703 • Published 12 days ago • 23

New activity in Keven16/Qwen3-4B-Non-Thinking-RL-Math-Step500 about 1 month ago

What is the data source used for training this model?

#1 opened about 1 month ago by KouShi2

authored a paper about 2 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 110

upvoted a paper 2 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 110

authored a paper 3 months ago

AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

Paper • 2603.14465 • Published Mar 15 • 23

updated a dataset 3 months ago

Keven16/OPSD-Example-Data

Viewer • Updated Mar 18 • 49.1k • 45

published a dataset 3 months ago

Keven16/OPSD-Example-Data

Viewer • Updated Mar 18 • 49.1k • 45

upvoted a paper 3 months ago

AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

Paper • 2603.14465 • Published Mar 15 • 23

updated 2 models 3 months ago

Keven16/Qwen3-4B-Non-Thinking-RL-Code-Step300

4B • Updated Mar 16 • 150

Keven16/Qwen3-4B-Non-Thinking-RL-Math-Step500

4B • Updated Mar 16 • 1.31k

published 2 models 3 months ago

Keven16/Qwen3-4B-Non-Thinking-RL-Code-Step300

4B • Updated Mar 16 • 150

Keven16/Qwen3-4B-Non-Thinking-RL-Math-Step500

4B • Updated Mar 16 • 1.31k

liked a dataset 3 months ago

LulaCola/AgentProcessBench

Viewer • Updated Mar 18 • 1k • 256 • 15

authored 2 papers 4 months ago

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

Paper • 2602.12125 • Published Feb 12 • 67

Learning to Focus: Causal Attention Distillation via Gradient-Guided Token Pruning

Paper • 2506.07851 • Published Jun 9, 2025

updated a dataset 4 months ago

Keven16/G-OPD-Training-Data

Viewer • Updated Feb 17 • 134k • 1.44k • 2

published a dataset 4 months ago

Keven16/G-OPD-Training-Data

Viewer • Updated Feb 17 • 134k • 1.44k • 2
