I am Yin Cui (崔崟 in Chinese, pronounced /yin tsui/), a research scientist at NVIDIA. Before joining NVIDIA, I was a research scientist at Google. I obtained my Ph.D. in Computer Science from Cornell University and Cornell Tech in 2019, advised by Professor Serge Belongie. Together with the team, I received the PAMI Mark Everingham Prize (2023) for the COCO dataset. My current research interests are generative AI and multimodal learning.
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui
CVPR 2024
[arXiv]
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception
Hassan Akbari, Dan Kondratyuk, Yin Cui, Rachel Hornung, Huisheng Wang, Hartwig Adam
NeurIPS 2023
[arXiv]
DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
Xiuye Gu, Yin Cui, Jonathan Huang, Abdullah Rashwan, Xuan Yang, Xingyi Zhou, Golnaz Ghiasi, Weicheng Kuo, Huizhong Chen, Liang-Chieh Chen, David A Ross
NeurIPS 2023
[arXiv]
Module-wise Adaptive Distillation for Multimodality Foundation Models
Chen Liang, Jiahui Yu, Ming-Hsuan Yang, Matthew Brown, Yin Cui, Tuo Zhao, Boqing Gong, Tianyi Zhou
NeurIPS 2023
[arXiv]
Unified Visual Relationship Detection with Vision and Language Models
Long Zhao, Liangzhe Yuan, Boqing Gong, Yin Cui, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu
ICCV 2023
[arXiv]
A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
James Urquhart Allingham, Jie Ren, Michael W Dusenberry, Jeremiah Zhe Liu, Xiuye Gu, Yin Cui, Dustin Tran, Balaji Lakshminarayanan
ICML 2023
[arXiv]
F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models
Weicheng Kuo, Yin Cui, Xiuye Gu, AJ Piergiovanni, Anelia Angelova
ICLR 2023
[arXiv] [Website]
Scaling Open-Vocabulary Image Segmentation with Image-Level Labels
Golnaz Ghiasi, Xiuye Gu, Yin Cui, Tsung-Yi Lin
ECCV 2022
[arXiv] [Code]
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu, Tsung-Yi Lin, Weicheng Kuo, Yin Cui
ICLR 2022
[arXiv] [Code] [Demo]
Surrogate Gap Minimization Improves Sharpness-Aware Training
Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha C. Dvornek, Sekhar Tatikonda, James S. Duncan, Ting Liu
ICLR 2022
[arXiv] [Website] [Code (in PyTorch)] [Models (in JAX)]
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong
NeurIPS 2021
[arXiv] [Code]
Spatiotemporal Contrastive Video Representation Learning
Rui Qian*, Tianjian Meng*, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui
CVPR 2021
[arXiv] [Code]
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi*, Yin Cui*, Aravind Srinivas*, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph
CVPR 2021 (Oral)
[arXiv] [Code]
Rethinking Pre-training and Self-training
Barret Zoph*, Golnaz Ghiasi*, Tsung-Yi Lin*, Yin Cui, Hanxiao Liu, Ekin D. Cubuk, Quoc V. Le
NeurIPS 2020 (Oral)
[arXiv] [Code]
Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset
Menglin Jia*, Mengyun Shi*, Mikhail Sirotenko*, Yin Cui*, Claire Cardie, Bharath Hariharan, Hartwig Adam, Serge Belongie
ECCV 2020 (Oral)
[Website] [arXiv] [Code] [Kaggle Challenge]
SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization
Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Golnaz Ghiasi, Mingxing Tan, Yin Cui, Quoc V. Le, Xiaodan Song
CVPR 2020
[arXiv] [Code] [Google AI Blog]
Class-Balanced Loss Based on Effective Number of Samples
Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge Belongie
CVPR 2019
[arXiv] [Code] [Poster]
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie
CVPR 2018
[arXiv] [Data] [Code] [Poster] [TensorFlow Hub]
The iNaturalist Species Classification and Detection Dataset
Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, Serge Belongie
CVPR 2018 (Spotlight)
[arXiv] [Data] [TensorFlow Object Detection API] [Google AI Blog] [TechCrunch]
Collaborative Metric Learning
Cheng-Kang Hsieh, Longqi Yang, Yin Cui, Tsung-Yi Lin, Serge Belongie, Deborah Estrin
WWW 2017
[PDF] [Code] [Slides]
Learning Deep Representations for Ground-to-Aerial Geolocalization
Tsung-Yi Lin, Yin Cui, Serge Belongie, James Hays
CVPR 2015 (Oral)
[PDF] [Data] [Extended Abstract] [Poster]