About Albert Invent
Albert Invent is a cutting-edge AI-driven software company headquartered in Oakland, California, on a mission to empower scientists and innovators in chemistry and materials science to invent the future faster. Every day, scientists in 30+ countries use Albert to accelerate R&D with AI trained like a chemist, bringing better products to market sooner.
Job Description
Drive the design, automation, and reliability of Albert Invent’s core platform to support scalable, high-performance AI applications. Collaborate with Product Engineering and SRE teams to ensure security, resiliency, and developer productivity. Own end-to-end service operability and mentor engineers in building robust, automated systems.
Responsibilities:
- Act as a passionate representative of the Albert product and brand.
- Work closely with Product Engineering and other stakeholders to plan and deliver core platform capabilities that enable scalability, reliability, and developer productivity.
- Work with the Site Reliability Engineering (SRE) team on shared full-stack ownership of a collection of services and/or technology areas.
- Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of all microservices.
- Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance.
- Serve as the authority for end-to-end performance and operability.
- Demonstrate a clear understanding of automation and orchestration principles.
- Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
- Use a deep understanding of service topology and dependencies to troubleshoot issues and define mitigations.
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
- Very strong coding expertise in Python or Node.js with 4+ years of hands-on experience.
- 4+ years of software engineering experience, with at least 2 years in an SRE role focused on automation.
- Solid experience with IaC (Infrastructure as Code), preferably using Terraform.
- Solid expertise in cloud infrastructure (AWS) and platform technologies, including microservices, APIs, and distributed systems.
- Hands-on experience with an observability stack, including centralized log management, metrics, and tracing.
- Familiarity with CI/CD tools like CircleCI and performance testing using k6.
- A desire to bring more automation and standards to an Engineering organization.
- A desire to build high-performance, low-latency APIs (< 200 ms).
- Ability to work in a fast-paced environment and learn from peers and leaders.
- Ability to lead technically, mentor other engineers, and help grow the team through active participation in recruiting and related activities.
Good to Have:
- Experience with Kubernetes and container orchestration.
- Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry, Datadog, etc.).
- Experience building internal developer platforms (IDPs) or reusable frameworks for engineering teams.
- Exposure to ML infrastructure or data engineering workflows.
- Experience working in compliance-heavy environments (SOC 2, HIPAA, etc.).
Why Join Albert Invent
- Joining Albert Invent means becoming part of a mission-driven, fast-growing global team at the intersection of AI, data, and advanced materials science.
- You will collaborate with world-class scientists and technologists to redefine how new materials are discovered, developed, and brought to market.
- The culture is built on curiosity, collaboration, and ownership, with a strong focus on learning and impact.
- You will have the opportunity to work on cutting-edge AI tools that accelerate real-world R&D and solve global challenges, from sustainability to advanced manufacturing, while growing your career in a high-energy environment.