Zhiding Yu

Zhiding

https://research.nvidia.com/person/zhiding-yu

Chrisding

Recent Activity

updated a model 11 days ago

nvidia/LocateAnything-3B

liked a model 12 days ago

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

authored a paper 15 days ago

Neural Eulerian Scene Flow Fields

upvoted a paper 23 days ago

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Paper • 2605.27365 • Published 24 days ago • 143

upvoted a paper 4 months ago

PhyCritic: Multimodal Critic Models for Physical AI

Paper • 2602.11124 • Published Feb 11 • 55

upvoted a paper 5 months ago

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Paper • 2601.09708 • Published Jan 14 • 56

upvoted a paper 7 months ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 128

upvoted 2 articles 8 months ago

Welcome the NVIDIA Llama Nemotron Nano VLM to HF Mirror Hub

Article • nvidia • Jun 27, 2025 • 31

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

Article • nvidia • Aug 11, 2025 • 76

upvoted 2 papers about 1 year ago

AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs

Paper • 2506.05328 • Published Jun 5, 2025 • 21

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Paper • 2504.15271 • Published Apr 21, 2025 • 69

upvoted 2 collections over 1 year ago

Deepseek Papers

Collection

Deepseek papers collection • 31 items • Updated 4 days ago • 351

QLIP

Collection

QLIP is a family of image tokenizers with SOTA reconstruction quality and zero-shot image understanding. • 3 items • Updated 7 days ago • 12

upvoted a paper over 1 year ago

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models

Paper • 2501.14818 • Published Jan 20, 2025 • 9

upvoted a paper almost 2 years ago

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28, 2024 • 88