At Voio, we’re rebuilding medical imaging for the people who rely on it every day. The data is immense, the challenges are real, and the impact is direct. If you want your work to matter — to clinicians, to patients, to the field itself — you’ll fit right in.
Role Description
We’re redefining how radiologists work. Today, medical imaging is slowed by fragmented tools — one system to view scans, another to dictate, a third to search patient context. We’re building a unified system that connects them all: fast, intelligent, and deeply intuitive.
Our AI models grew out of years of research at UC Berkeley and UCSF, but our mission goes far beyond the lab — we’re building real-world systems that push the frontier of applied medical AI. Every line of code here helps doctors move faster, see more clearly, and focus on care, not clicks.
Responsibilities
We’re looking for an ML Ops Engineer to build and scale the systems that power Voio’s medical AI infrastructure. You’ll design reliable, high-performance pipelines for model training, inference, and deployment — ensuring our foundation models move seamlessly from research to production.
You’ll work closely with ML researchers, backend engineers, and clinical teams to develop efficient, secure, and observable production environments for real-world use.
What You’ll Do
Deploy and maintain GPU inference systems using Triton Inference Server and TensorRT.
Build and optimize CI/CD pipelines for model training, testing, validation, and rollout.
Tune PyTorch deployments for latency, memory, and throughput efficiency.
Design observability and monitoring tools to track inference performance, drift, and uptime.
Benchmark and profile models in production, driving continuous improvements across the stack.
Partner with research and product teams to operationalize models for real-time clinical workflows.
Qualifications & Requirements
4+ years of experience in ML Ops, infrastructure, or distributed systems.
Proficiency with Triton Inference Server, TensorRT, and PyTorch optimization.
Strong background in GPU-based debugging, profiling, and system tuning.
Experience with Docker, Kubernetes, and cloud deployment (AWS or GCP).
Ability to operate independently and make clear, impact-driven tradeoffs.
Desired Characteristics & Attributes
Experience with edge inference on Jetson, Orin, or equivalent hardware.
Familiarity with DICOM, HL7, or healthcare data standards.
Prior exposure to regulated or safety-critical ML systems.
Who Thrives Here
We hire for clarity, ownership, and judgment.
The ideal engineer:
Thinks in systems. Sees beyond individual tasks to how everything connects.
Executes with precision. Moves quickly without sacrificing long-term quality.
Owns outcomes. Takes responsibility across design, build, and delivery.
Builds with purpose. Writes code that improves lives, not just benchmarks.
Why Join Us
You’ll work directly with leading engineers, clinicians, and researchers from UC Berkeley and UCSF — building products that didn’t exist before. If you want to shape how AI enters the clinic, and you care about craft as much as impact, this is your team.