Senior Machine Learning Engineer (Diffusion Models)

INFO

Salary: $150,000 - $200,000 per annum

Job Type: Permanent

Senior Multimodal Engineer (Visual Language Models & Diffusion Models)
ONSITE NEW YORK (5 days/week)
$200,000 + Equity + Benefits

Are you passionate about building state-of-the-art AI systems that tackle real-world problems? A rapidly growing, venture-backed Series A eCommerce AI startup is looking for a Senior Multimodal Engineer with deep experience in computer vision and diffusion models to help revolutionize the way product data, images, and user interactions are understood and leveraged in online retail.

THE COMPANY

This company is reinventing how large-scale eCommerce platforms utilize AI to optimize customer journeys, automate catalog management, and generate rich product experiences. With $30M+ in funding, a team of 50+ (including 20+ engineers), and partnerships with several Fortune 500 retailers, they're using advanced ML and multi-modal AI to streamline everything from image-based search to intelligent product tagging and auto-generated marketing content.

Their AI agents tackle some of the most pressing challenges in the space, including:

  • Visual product understanding and search

  • Automated attribute extraction and catalog enrichment

  • AI-powered content generation and personalization

  • Visual Q&A and recommendation systems

THE ROLE

As a Senior Multimodal Engineer, you'll be at the forefront of designing and deploying cutting-edge models that combine visual and textual understanding to power a new generation of intelligent eCommerce tools. A key area of focus will be diffusion models: developing, fine-tuning, and scaling them for both generative and discriminative tasks.

You'll work closely with a cross-functional team of ML engineers, backend developers, and product leaders to embed multimodal intelligence into real-world, user-facing applications.

YOU WILL:

  • Design, train, and deploy CV and generative models (with a focus on diffusion models) for a wide variety of use cases, including:

    • Product image generation/enhancement

    • Visual similarity search

    • Automated product tagging/classification

    • Visual Q&A and content synthesis

  • Build and optimize scalable ML pipelines for training, evaluation, and deployment

  • Work with proprietary eCommerce datasets and develop intelligent data labeling and augmentation strategies

  • Implement the latest advancements in transformer-based CV models and vision-language architectures (e.g., CLIP, BLIP, DINO, SAM)

  • Stay current with innovations in LLMs, multi-modal learning, and RAG pipelines, integrating them where applicable

  • Collaborate on architectural decisions, contribute to research efforts, and shape the long-term AI roadmap

✅ YOUR SKILLS & EXPERIENCE

Must-haves:

  • MS or PhD in Computer Science, Computer Vision, Machine Learning, or a related field

  • 5+ years of experience developing and deploying computer vision systems in production

  • Expertise in diffusion models (e.g., Stable Diffusion, Imagen, DALL·E) and their application to visual data

  • Strong programming skills in Python, with deep experience in PyTorch, TensorFlow, and OpenCV

  • Solid understanding of feature extraction, model evaluation, and scalable training practices

  • Familiarity with vision transformers, multi-modal architectures, and image-text pretraining

Nice-to-haves:

  • Experience working with eCommerce data, product catalogs, or user-behavior datasets

  • Exposure to LLM integrations, RAG systems, or hybrid generative-discriminative pipelines

  • Knowledge of synthetic data generation and augmentation strategies

BENEFITS & PERKS

  • Competitive salary ($200K) + meaningful equity

  • 5-day/week in-office role based in New York City (Midtown Manhattan HQ)

  • Comprehensive medical, dental, and vision insurance (100% covered plan available)

  • Unlimited PTO and generous parental leave

  • Annual L&D budget and support for conference attendance

  • Daily catered lunches and regular team offsites

  • A collaborative, fast-paced environment where you'll work with some of the brightest minds in AI and eCommerce

READY TO APPLY?

If you're excited to build next-gen AI systems that redefine how consumers interact with online retail, and you thrive in an innovative, high-performance environment, we'd love to hear from you!

CONTACT

Luc Simpson-Kent

Senior Recruitment Consultant
