About Dispensed
At Dispensed, we are passionate about empowering individuals to reach their full potential by supporting better health outcomes. We believe that access to innovative and alternative therapies can transform lives. Through our telehealth platform, we facilitate patient access to affordable, efficient, and reliable alternative medicine services across Australia, New Zealand, and the UK.
About The Role
Dispensed delivers prescriptions and clinical consultations to patients across Australia, New Zealand, and the UK, and the reliability of that platform is not an abstract engineering concern: when it degrades, patients lose access to healthcare. As a Senior SRE, you will own the operational health of a platform in active transition, consolidating a legacy Django system into a modern Next.js and Supabase architecture on AWS, and you will have genuine influence over how reliability is designed into that new foundation from the start. This is a role where you will define SLO frameworks, shape observability architecture, and lead the kind of post-incident work that produces lasting systemic change rather than tactical patches. If you want to work at the intersection of serious engineering craft and meaningful patient outcomes, and to build practices that a growing team will rely on for years, this is that role.
What You'll Own
- Define and maintain SLO and error budget frameworks across multiple services, working directly with product engineers to make reliability expectations concrete and actionable rather than aspirational.
- Design and evolve the observability architecture across the platform, ensuring the engineering team has genuine insight into system behaviour during the Django-to-Next.js migration and beyond.
- Identify systemic gaps in monitoring, alerting, and incident response before they surface as patient-facing incidents, and drive the work required to close them.
- Lead post-incident reviews that go beyond immediate fixes, producing changes to architecture, runbooks, on-call processes, or delivery practices that reduce the likelihood and impact of recurrence.
- Write infrastructure-as-code and automation that sets a quality bar for the team, reviewing infrastructure contributions from product engineers and junior SREs with direct, specific feedback.
- Keep product engineering teams unblocked on reliability concerns by being a visible, proactive partner in delivery: attending design conversations, raising reliability risks early, and pushing back constructively when decisions create patient risk without a conscious trade-off.
- Improve how the team operates on reliability over time, including on-call processes, reliability review checkpoints in the delivery cycle, and the quality of documentation product engineers use to understand what is expected of their services.
What You'll Need
Required:
- 6+ years in SRE, DevOps, or backend roles with production ownership.
- Experience operating and improving the reliability of distributed, customer-facing systems.
- Strong cloud and infrastructure-as-code experience (AWS, Terraform, or similar).
- Hands-on experience with SLOs, SLIs, and error budgets.
- Solid observability experience (metrics, logging, tracing).
- Experience leading incidents and post-incident reviews that drive systemic change.
- Strong scripting/programming skills (e.g. Python, Go, TypeScript).
- Ability to identify risks early and influence cross-team engineering decisions.
- Clear communication and documentation skills.
Highly Valued:
- Experience supporting system migrations or major architectural changes.
- Experience in regulated or high-availability environments.
- Experience improving on-call practices or mentoring engineers.
What We Offer
- Work from anywhere in Australia. 🌍
- A competitive salary and awesome benefits package. 💰
- A supportive and positive work environment. 🌟
- Opportunities to grow and develop your career. 📈
- Opportunity to transform lives through alternative medicine. 💡