Site Reliability Engineer Lead

About PDQ

PDQ, founded in Salt Lake City, UT, USA, makes device management simple, secure, and Pretty Damn Quick through our products Deploy, Inventory, Connect, Detect, SimpleMDM and SmartDeploy. IT teams use our products to reduce complexity, improve efficiency, and enhance control in their unique environments. We are backed by TA Associates and Berkshire Partners, top-tier global private equity firms.

 

PDQ's Core Values: Honesty, Ownership, Collaboration and Improvement 


Job Description:


Before you apply, please note:

  • At this time, qualified candidates for this role may reside in any of the following US states: AR, AZ, CO, CT, FL, GA, ID, IL, IN, KY, MD, MI, MN, MO, NC, NH, OK, OR, TN, TX, UT, VA, WA, WI.

As PDQ's first Lead Site Reliability Engineer, you’ll define and scale the foundations of observability, reliability, and performance engineering across our platform. This is a high-impact, hands-on leadership role that blends technical depth with organizational influence.


You will:

  • Design, implement, and maintain observability and monitoring systems that ensure application stability, performance, and scale. This is a high-ownership, greenfield opportunity.
  • Establish and own service level objectives (SLOs), service level indicators (SLIs), and service level agreements (SLAs) across key systems.
  • Collaborate with engineering leaders to develop scalable, proactive monitoring and alerting for new and existing features.
  • Drive incident management best practices — tooling, runbooks, on-call processes, incident response coordination, and executive communication.
  • Lead synthetic testing and load testing initiatives to ensure production scale and stability.
  • Advocate for performance, reliability, and operational excellence across the engineering org.
  • Mentor engineers and influence architectural decisions related to system resiliency and uptime.


Who you are:

  • You’re a builder and a leader who thrives on improving systems at scale and setting the standard for reliability across teams.
  • 5+ years of experience in SRE, DevOps, or Infrastructure Engineering, with at least 2 years in a lead or strategic role.
  • Proven experience scaling observability platforms and driving SRE principles org-wide.
  • Deep experience with Prometheus, PromQL, Grafana, and ideally GroundCover.
  • Strong familiarity with Google Cloud Platform (GCP) or similar cloud environments.
  • A track record of creating robust incident response and postmortem practices.
  • The ability to plan for scale, reduce toil, and prioritize reliability as a shared responsibility across engineering.
  • Excellent collaboration and communication skills — you can work across teams and influence without authority.


PDQ Perks & Benefits:


PDQ offers all of the great perks and benefits you'd expect from working at a very cool tech company, and even some you might not expect, including:

 

  • 4-Day Work Week
  • Managers who champion professional development 
  • 100% Premium Coverage for medical, dental and vision for you and your dependents
  • 100% Premium Coverage for Short Term Disability, Long Term Disability, Life, and AD&D Insurance
  • Company Match of the first 6% of your employee deferrals 
  • Flexible Paid Time Off Policy that treats you like the adult that you are
  • Health Savings Account (HSA) and wellness incentives
  • Quarterly Company Values Award (team member nominated)


PDQ is proud to be an equal opportunity workplace and does not discriminate on the basis of sex, race, color, age, pregnancy, sexual orientation, gender identity or expression, religion, national origin, ancestry, citizenship, marital status, military or veteran status, genetic information, disability status, or any other characteristic protected by federal, provincial, state, or local law. If you would like to request a reasonable accommodation for a medical condition or disability during any part of the application process, please contact hr@pdq.com.


The majority of PDQ's full-time roles do not qualify for sponsorship of employment visas such as the H-1B visa. This applies to scenarios where a candidate might possess temporary work authorization during their schooling or after graduation (e.g., CPT, OPT), but would require H-1B visa sponsorship within a few years of employment to retain eligibility for employment. 


Engineering

Remote (United States)
