Senior AI Engineer, Data Infrastructure, Multimodal Models (100% Remote)

Framework Ventures, United States

About the Role

We’re seeking experienced AI infrastructure engineers to design and implement robust, scalable pipelines for massive data workloads. Join Tether’s applied research team, where you’ll contribute to high-impact projects that run across thousands of GPUs and drive cutting-edge video generation foundation development.

Responsibilities:

  • Build and scale high-throughput data infrastructure optimized for video and multimodal content processing across large GPU clusters (e.g., H100/H200).
  • Design core preprocessing algorithms for video, audio, text, and image modalities, enabling efficient extraction, synchronization, and normalization of temporal data.
  • Build automated acquisition pipelines for sourcing large-scale video datasets, handling diverse formats, frame rates, annotations, and embedded audio.
  • Architect robust systems for scalable evaluation and annotation, including prompt-based scoring, perceptual metrics, caption generation, and retrieval-based diagnostics.
  • Collaborate with model researchers to co-design video model architectures (e.g., DiTs, VAEs, spatio-temporal transformers) and training schedules across pretraining and fine-tuning stages.
  • Optimize distributed data loading and pipeline throughput for training at scale, ensuring robustness across model variants and modality combinations.
  • Manage infrastructure to support experiment tracking, model versioning, and cross-team deployment workflows, integrating with production and research platforms.
  • Support backend engineering across research, product, and creative teams to ensure seamless integration of data and model workflows from prototyping to inference.

Qualifications:

  • Proficient in Python, with strong programming skills across backend, infrastructure, and data tooling domains.
  • Strong software engineering experience, including 2+ years working with petabyte-scale data pipelines and systems across thousands of GPUs.
  • Proven ability to architect and maintain large-scale distributed systems for data processing and delivery.
  • Deep expertise in orchestration frameworks such as Kubernetes and SLURM, with hands-on experience deploying and managing high-throughput workloads.

Preferred Qualifications:

  • Practical experience building pipelines and infrastructure with visual and multimodal datasets, including image/video pipelines.
  • Experience building video foundation infrastructure pipelines and workflows in collaboration with LLM and/or video foundation research and engineering teams is a strong advantage.

Responsibilities

  • Build and scale high-throughput data infrastructure for video and multimodal processing
  • Architect robust systems for scalable evaluation and annotation
  • Collaborate with model researchers on video model architectures and training schedules

Qualifications

  • Bachelor's/Master's in CS or related field
  • 2+ years of experience with petabyte-scale data pipelines
  • Strong Python and backend engineering skills

Required Skills

Python, distributed systems, Kubernetes, SLURM, data pipelines

Keywords

AI infrastructure, multimodal data, video processing, distributed systems, GPU clusters

Job Overview

Date Posted: 4 days ago
Location: United States
Job Type: Full-time
Work Mode: Remote
Experience: 2+ years
Category: Health information technology, Artificial intelligence, AI infrastructure, Multimodal data processing

About the Company

Framework Ventures (venture capital & private equity)
San Francisco, United States
23 employees