Location: San Francisco, CA (Onsite preferred, Remote considered for exceptional candidates)
Type: Full-time
Visa Sponsorship: Available for candidates already based in the U.S.
Compensation: Competitive salary + 0.5%–2% equity
Start Date: ASAP
Outspeed powers emotionally intelligent voice companions and agents with memory, redefining how humans interact with machines through real-time, expressive, and persistent voice interfaces.
We’re solving some of the toughest problems in machine learning and systems engineering, from low-latency inference and scaling to multi-user conversational memory and emotion modeling. If you’re excited by the frontier of conversational AI, you’ll be at home here.
Founded in 2024 and based in San Francisco, we're a tight-knit team of 4 building at the intersection of speech, emotion, and intelligence.
Learn more: outspeed.com
Own and scale key ML systems: speech models, memory, emotional synthesis, real-time transformers
Optimize inference latency and throughput for streaming models
Architect data pipelines for fine-tuning and continual learning
Collaborate across voice UX, product, and backend infra to ship intelligent, responsive agents
Push the limits of what's possible in conversational AI—from prototype to production
2–7 years of experience as an ML engineer, especially in real-time ML systems (voice, video, or interactive apps)
Hands-on fluency with PyTorch, CUDA, and pre-trained transformer models
Proven experience optimizing streaming model inference performance
Experience with voice interfaces or emotion-aware synthesis (bonus for hands-on work with Bark, Tortoise, or similar models)
Strong data engineering instincts: designing and building data processing pipelines
Familiarity with inference engines such as vLLM or SGLang
Deep interest in expressive AI, latency-sensitive systems, and emotional computing
A degree in CS or related field from a top-tier university (preferred)
Prior experience at AI-native companies such as Runway ML, Descript, Anyscale, or DeepMind
Open-source contributions (share your GitHub!)
Previous startup/founding experience or hunger for 0→1 building
Passion for real-time voice UX, multi-modal agents, and persistent memory architectures
Not a fit for this role:
15–20 year veterans with unclear startup intent or urgency
Big tech lifers (e.g., 7+ years at FAANG with no startup exposure)
Candidates with certification-heavy resumes and minimal build experience
Anyone not ready to start within 1 month
Location: Preferably SF-based or open to relocating. Remote is OK for exceptional global candidates.
Pace: Expect ~60-hour weeks. We value high energy and deep focus, with flexibility when it matters.
Team: Tiny but mighty. You’ll be the 5th team member.