Yin Cui

Research Scientist at NVIDIA

About

I am Yin Cui (崔崟 in Chinese, pronounced /yin tsui/), a research scientist at NVIDIA. Before joining NVIDIA, I was a research scientist at Google. I obtained my Ph.D. in Computer Science from Cornell University and Cornell Tech in 2019, advised by Professor Serge Belongie. Together with the team, I received the PAMI Mark Everingham Prize (2023) for the COCO dataset. My current research interests are generative AI and multimodal learning.

Industry Research

Cosmos World Foundation Model Platform for Physical AI
NVIDIA: Yin Cui (core contributor)
[Paper] [Website] [Code] [Hugging Face] [Project Page] [Blog] [Video] [Model API]
[Jensen Huang Keynote at CES 2025]

Edify 3D: Scalable High-Quality 3D Asset Generation
NVIDIA: Yin Cui (core contributor)
[Paper] [Website] [Video] [Model API]

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
NVIDIA: Yin Cui (core contributor)
[Paper] [Website] [Video] [Image Model API] [360 HDRi Model API]

GenUSD: 3D Scene Generation Made Easy
Jiashu Xu, Yunhao Ge, Yifan Ding, Yin Cui, Chen-Hsuan Lin, Xiaohui Zeng, Zekun Hao, Zhaoshuo Li, Donglai Xiang, Qianli Ma, Fangyin Wei, JP Lewis, Qinsheng Zhang, Seungjun Nah, Arun Mallya, Jingyi Jin, Hanzi Mao, Yen-Chen Lin, Pooya Jannaty, Tsung-Yi Lin, Ming-Yu Liu
ACM SIGGRAPH Real-Time Live! 2024
[Live Demo at SIGGRAPH 2024] [Blog] [Paper] [Video]

Selected Publications

VideoGLUE: Video General Understanding Evaluation of Foundation Models
Liangzhe Yuan, Nitesh Bharadwaj Gundavarapu, Long Zhao, Hao Zhou, Yin Cui, Lu Jiang, Xuan Yang, Menglin Jia, Tobias Weyand, Luke Friedman, Mikhail Sirotenko, Huisheng Wang, Florian Schroff, Hartwig Adam, Ming-Hsuan Yang, Ting Liu, Boqing Gong
TMLR 2024
[Paper] [Code]

Why Fine-grained Labels in Pretraining Benefit Generalization?
Guan Zhe Hong, Yin Cui, Ariel Fuxman, Stanley Chan, Enming Luo
TMLR 2024
[Paper]

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui
CVPR 2024
[Paper] [Website]

Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception
Hassan Akbari, Dan Kondratyuk, Yin Cui, Rachel Hornung, Huisheng Wang, Hartwig Adam
NeurIPS 2023
[Paper] [Code]

DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
Xiuye Gu, Yin Cui, Jonathan Huang, Abdullah Rashwan, Xuan Yang, Xingyi Zhou, Golnaz Ghiasi, Weicheng Kuo, Huizhong Chen, Liang-Chieh Chen, David A Ross
NeurIPS 2023
[Paper] [Objects365 Instance Segmentation Dataset]

A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
James Urquhart Allingham, Jie Ren, Michael W Dusenberry, Jeremiah Zhe Liu, Xiuye Gu, Yin Cui, Dustin Tran, Balaji Lakshminarayanan
ICML 2023
[Paper] [Code]

F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models
Weicheng Kuo, Yin Cui, Xiuye Gu, AJ Piergiovanni, Anelia Angelova
ICLR 2023
[Paper] [Website] [Code]

Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi, Xiuye Gu, Yin Cui, Tsung-Yi Lin
ECCV 2022
[Paper] [Code]

Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu, Tsung-Yi Lin, Weicheng Kuo, Yin Cui
ICLR 2022
[Paper] [Code]

Surrogate Gap Minimization Improves Sharpness-Aware Training
Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha C. Dvornek, Sekhar Tatikonda, James S. Duncan, Ting Liu
ICLR 2022
[Paper] [Website] [Code in PyTorch] [Code in JAX]

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong
NeurIPS 2021
[Paper] [Code]

Spatiotemporal Contrastive Video Representation Learning
Rui Qian*, Tianjian Meng*, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui
CVPR 2021
[Paper] [Code]

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi*, Yin Cui*, Aravind Srinivas*, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph
CVPR 2021 (Oral)
[Paper] [Code]

Rethinking Pre-training and Self-training
Barret Zoph*, Golnaz Ghiasi*, Tsung-Yi Lin*, Yin Cui, Hanxiao Liu, Ekin D. Cubuk, Quoc V. Le
NeurIPS 2020 (Oral)
[Paper] [Code]

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset
Menglin Jia*, Mengyun Shi*, Mikhail Sirotenko*, Yin Cui*, Claire Cardie, Bharath Hariharan, Hartwig Adam, Serge Belongie
ECCV 2020 (Oral)
[Paper] [Website] [Code] [Kaggle Challenge]

Class-Balanced Loss Based on Effective Number of Samples
Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge Belongie
CVPR 2019
[Paper] [Code]

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie
CVPR 2018
[Paper] [Code] [Data] [TensorFlow Hub]

The iNaturalist Species Classification and Detection Dataset
Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, Serge Belongie
CVPR 2018 (Spotlight)
[Paper] [Code and Data] [TensorFlow Object Detection API] [Blog]

Collaborative Metric Learning
Cheng-Kang Hsieh, Longqi Yang, Yin Cui, Tsung-Yi Lin, Serge Belongie, Deborah Estrin
WWW 2017
[Paper] [Code]

Learning Deep Representations for Ground-to-Aerial Geolocalization
Tsung-Yi Lin, Yin Cui, Serge Belongie, James Hays
CVPR 2015 (Oral)
[Paper] [Data]

Miscellaneous

Professional Activities

  • Area Chair of ICCV 2025, ICLR 2025, NeurIPS 2024, ICLR 2024, NeurIPS 2023, ICCV 2023, WACV 2023
  • Action Editor of TMLR
  • Senior Program Committee (SPC) Member of AAAI 2022, AAAI 2023
  • Guest Editor of IJCV Special Issue on Open-World Visual Recognition
  • Reviewer of CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, TPAMI, IJCV
  • Organizing Committee of ImageNet and COCO Visual Recognition Workshop at ICCV 2015, ECCV 2016
  • Organizing Committee of Joint Workshop of the COCO and Places Challenges at ICCV 2017
  • Organizing Committee of Joint COCO and Mapillary Recognition Challenge Workshop at ECCV 2018, ICCV 2019
  • Organizing Committee of Joint COCO and LVIS Recognition Challenge Workshop at ECCV 2020
  • Organizing Committee of Fine-Grained Visual Categorization Workshop at CVPR 2017, CVPR 2018, CVPR 2019
  • Organizing Committee of Large-scale Scene Understanding Workshop (COCO Captioning Challenge) at CVPR 2015

Selected Honors

  • PAMI Mark Everingham Prize (2023)
  • McMullen Fellowship (2014 - 2015)
  • Edwin Howard Armstrong Memorial Award (2014)
  • Wei Family Private Foundation Special Scholarship (2013)
  • National Scholarship (2010)