Subquadratic

API Engineering Lead

About Subquadratic

Subquadratic is an AI infrastructure and research company building a new class of LLMs on a proprietary post-transformer architecture. While the major labs focus on incremental transformer improvements, we're pushing foundational change at the model architecture level, enabling large-context, multimodal inference that scales efficiently where transformers can't.

About the Role

Subquadratic is looking for a highly technical API Engineering Lead to own and drive our API roadmap. You’ll lead architecture, implementation, and performance improvements across our real-time and batch speech APIs. This role sits at the intersection of backend engineering, real-time streaming, and ML productionization.


What You’ll Do

  • Own the end-to-end design, architecture, and evolution of Subquadratic’s customer-facing and backend APIs
  • Architect for Scale & Speed: Design and evolve the global control plane for Subquadratic’s real-time (WebSocket/gRPC) and batch APIs.
  • Productionize ML at the Edge: Build the high-throughput inference layer that wraps our speech models. Optimize for millisecond-level cold starts and efficient GPU utilization.
  • Reliability & Observability: Implement distributed tracing (OpenTelemetry) and define strict SLOs for streaming stability (e.g., jitter, connection drop rates).
  • Security by Design: Lead the implementation of enterprise-grade security (API keys, OAuth2, rate limiting) and compliance controls (SOC2, data retention) for sensitive audio data.
  • Define API standards, versioning, authentication, and usage metering
  • Partner with Research, Product, and Infra to deliver new capabilities
  • Mentor engineers and set best practices for API engineering


What You Bring

  • 5–8 years of backend/systems engineering experience
  • Strong experience building and running production APIs
  • Deep proficiency in Go or Rust (Python is great for modeling, but our hot path needs systems-level performance).
  • Experience with streaming systems (WebSocket, gRPC, real-time audio/video)
  • ML Infrastructure Experience: Familiarity with model serving (Triton, TorchServe, Ray) or orchestrating GPU workloads on Kubernetes/AWS.
  • Deep understanding of distributed systems, performance tuning, and async I/O
  • Cloud Native Fluency: Hands-on experience with AWS (EKS/ECS, Lambda, ElastiCache).


Bonus

  • Background in speech, audio, DSP, or ML inference pipelines
  • Experience building SDKs, developer tooling, or API billing/usage systems
  • Early-stage startup experience


Success Looks Like

  • APIs that are fast, reliable, and developer-friendly
  • Seamless integration of new models and features
  • Strong engineering foundation that supports rapid iteration
  • <300 ms end-to-end latency (P95) for voice-to-voice interactions.
  • A "One-Line Integration" experience that developers rave about.


Compensation & Benefits

  • Competitive base salary
  • Equity participation
  • Comprehensive health, dental, and vision coverage
  • Flexible paid time off


Subquadratic is proud to be an equal-opportunity employer. We are committed to building a diverse and inclusive culture that celebrates authenticity to win as one. We do not discriminate on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, disability, protected veteran status, citizenship or immigration status, or any other legally protected characteristic.


Subquadratic uses E-Verify to confirm employment eligibility in compliance with federal law. For more information, please visit https://www.e-verify.gov.


Please note: We do not accept unsolicited resumes from recruiters or employment agencies and will not be responsible for any fees related to unsolicited resumes.

Engineering

Remote (United States)
