Projects

Amazon Nova Multimodal Embeddings

Vision Modality Lead | Amazon AGI | 2025

Amazon Nova Multimodal Embeddings is Amazon's first unified embedding model supporting text, documents, images, video, and audio through a single model, enabling customers to unlock insights from unstructured data. Led the design and implementation of the vision modality training pipeline across all stages, achieving state-of-the-art performance on image, video, and document retrieval tasks.

Amazon Nova 2.0

Video Pretraining Data Lead | Amazon AGI | 2025

Amazon Nova 2.0 Lite and Omni are Amazon's flagship multimodal large language models, with improved reasoning and multimodal processing. Led the development of a synthetic video caption generation pipeline, producing diverse video pretraining data that contributed to significant video understanding improvements over Nova 1.0.

Amazon Nova

Contributor | Amazon AGI | 2024–2025

Amazon Nova is Amazon's family of foundation models. Contributed to the development of the Nova model family as part of the AGI Foundations team.

Academic Service

Peer reviewer for major conferences including CVPR, ICCV, ECCV, NeurIPS, AAAI, WACV, and ACL. Recognized as an Outstanding Reviewer (top 5%) at CVPR 2025.