FlexAI

Senior DevOps Engineer/SRE

About FlexAI


Build and Deploy AI the right way, anywhere.


The FlexAI Compute Infrastructure Platform provides an "end-to-end AI compute layer" for running and managing workloads across any cloud, any GPU, and any deployment model (public, hybrid, or on-prem). It brings together "1-click simplicity" for users with "enterprise-grade orchestration, security, and automation" under the hood.


Founded by Brijesh Tripathi, who brings experience from Nvidia, Apple, Tesla, Intel and Zoox, FlexAI is not just building a product – we’re shaping the future of AI. Our teams are strategically distributed across Paris, Silicon Valley, and Bangalore, united by a shared mission: to deliver more compute with less complexity.

If you're passionate about shaping the future of artificial intelligence, driving innovation, and contributing to a sustainable and inclusive AI ecosystem, FlexAI is the place for you!

Position Overview:

FlexAI is seeking a skilled and motivated Senior DevOps/SRE Engineer to join our PaaS Team. As part of this team, you will play a pivotal role in building and maintaining the infrastructure that powers FlexAI's cutting-edge PaaS (Platform as a Service) system. Our PaaS Cloud Service is designed to let customers run workloads seamlessly across various architectures, reliably and efficiently. The product is currently in Beta testing with select clients, offering a unique opportunity to contribute to a platform that is poised to redefine industry standards. Join us in this critical phase as we refine our solution for broader release.


What you’ll do:

• Design, implement, and maintain CI/CD pipelines to support the efficient delivery and deployment of our Beta product, ensuring a seamless customer experience.

• Develop and manage infrastructure as code (IaC) using tools like Terraform, enabling scalable and repeatable infrastructure that supports our PaaS goals.

• Implement and manage containerization and orchestration tools (e.g., Docker, Kubernetes) to ensure scalable deployment across various architectures.

• Monitor and optimize system performance, proactively identifying and resolving bottlenecks to maintain reliability and efficiency during Beta testing and beyond.

• Collaborate with software developers and backend engineers to ensure the seamless integration and performance of backend services within our PaaS infrastructure.

• Ensure system reliability and availability by implementing best practices in monitoring, alerting, and incident response, particularly as we scale our Beta product.

• Troubleshoot and resolve infrastructure issues promptly to minimize downtime and maintain customer trust.

• Collaborate with security teams to ensure infrastructure meets security best practices and compliance requirements, especially in a multi-architecture environment.

• Automate routine tasks to improve efficiency and reduce manual intervention, focusing on maintaining the flexibility and reliability of our PaaS offerings.


What you’ll need:

• Bachelor's or higher degree in Computer Science, Software Engineering, or a related field.

• Proven experience as a DevOps or SRE Engineer, with a strong focus on automation, scalability, and reliability within PaaS environments.

• Familiarity with cloud-native technologies, including container runtimes such as Docker and cluster schedulers such as Kubernetes, is a must.

• Strong proficiency in scripting languages (e.g., Python, Bash) and familiarity with programming languages such as Go or Rust.

• Experience with cloud platforms (AWS, Azure, GCP) and infrastructure services, especially in supporting PaaS solutions.

• Proficiency in containerization and orchestration tools (e.g., Docker, Kubernetes) with experience in managing multi-architecture deployments.

• Hands-on experience with infrastructure as code (IaC) tools like Terraform, supporting scalable and reliable infrastructure.

• Strong understanding of CI/CD pipelines and automated testing methodologies.

• Excellent problem-solving and troubleshooting skills, especially in the context of Beta testing and production environments.

• Excellent collaboration and communication skills to work effectively with cross-functional teams.

• Entrepreneurial & start-up mindset!

Note: Familiarity with AI model training is a significant advantage!


Engineering

Bangalore, India
