Projects

Amazon Nova Multimodal Embeddings

Vision Modality Lead | Amazon AGI | 2025

Amazon Nova Multimodal Embeddings is Amazon’s first unified embedding model supporting text, documents, images, video, and audio through a single model, enabling customers to unlock insights from unstructured data. Led the design and implementation of the vision modality training pipeline across all stages, achieving state-of-the-art performance on image, video, and document retrieval tasks.

Amazon Nova 2.0

Video Pretraining Data Lead | Amazon AGI | 2025

Amazon Nova 2.0 Lite and Omni are Amazon’s improved flagship multimodal large language models with enhanced performance on reasoning and multimodal processing. Led the development of a video synthetic caption generation pipeline, producing diverse video pretraining data that contributed to significant video understanding improvements over Nova 1.0.

Amazon Nova

Contributor | Amazon AGI | 2024–2025

Amazon Nova is Amazon’s family of foundation models. Contributed to the development of the Nova model family as part of the AGI Foundations team. Co-author of “The Amazon Nova Family of Models: Technical Report and Model Card.”

Academic Service

Peer reviewer for major conferences including CVPR, ICCV, ECCV, NeurIPS, AAAI, WACV, and ACL. Recognized as an Outstanding Reviewer (top 5%) at CVPR 2025.

Zhikang Zhang (张智康)
