Job Description
Are you excited about developing agentic AI, LLM, and computer vision models that revolutionize Amazon's Fulfillment network? Are you looking for opportunities to apply state-of-the-art AI on real-world

- 5+ years of relevant, broad research experience after a PhD degree or equivalent qualification
- Track record of first-author publications at top-tier peer-reviewed conferences (NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL, EMNLP) or patents in machine learning domains
- Expert-level programming proficiency in Python with production-quality code standards, plus working knowledge of C++ for performance-critical applications; deep technical expertise with PyTorch and proficiency with the modern ML stack (Pandas, NumPy, scikit-learn, Hugging Face Transformers)
- Proven ability to independently scope, design, and execute end-to-end ML projects from research through production deployment, including ownership of model monitoring, maintenance, and iterative improvement
- Proven expertise in modern deep learning architecture design, including transformers, diffusion models, and neural architecture search, with hands-on experience in designing and training self-supervised learning paradigms, training optimization techniques (distributed training across multi-node GPU clusters, mixed precision, gradient accumulation, parallelism strategies using DeepSpeed, FSDP, or Megatron-LM), and model compression methods (quantization, pruning, distillation)
- Proven experience pre-training and fine-tuning large language models (GPT, LLaMA, Claude) and vision-language models (CLIP, LLaVA, Qwen)
- Proven experience developing agentic AI systems deployed to production using state-of-the-art frameworks (LangChain, Strands, etc.), with proven ability to design multi-agent workflows, tool-augmented reasoning systems, RAG systems, and advanced prompt engineering techniques (chain-of-thought, few-shot, RLHF, DPO)
- Extensive knowledge and proven production experience across multiple ML domains, including computer vision (object detection, segmentation, 3D vision, depth estimation, point cloud processing), natural language processing (text generation, information extraction), and multimodal learning
- Strong understanding of ML systems design, including model serving infrastructure, A/B testing frameworks, feature stores, and MLOps best practices such as annotation pipeline design, active learning pipelines, and AutoML/hyperparameter optimization techniques