Job Description

What You Will Do

  • Cloud infrastructure architecture and operations (Azure and AWS)
  • Container orchestration at scale (Kubernetes)
  • Data platform infrastructure (Spark/Databricks)
  • Platform observability, performance optimization, and cost management
  • Infrastructure automation and self-service capabilities
  • Incident response and system resilience

Major Initiatives You'll Drive

  • Observability: Build comprehensive monitoring, logging, and tracing systems that provide actionable insights into platform health and performance.
  • Production Scalability & Reliability: Ensure our platform scales efficiently to meet customer demand while maintaining high availability and performance.
  • Optimization: Drive FinOps practices and cost-aware architecture decisions across our cloud infrastructure.

Responsibilities

  • Cloud infrastructure architecture and operations (Azure and AWS)
  • Container orchestration at scale (Kubernetes)
  • Data platform infrastructure (Spark/Databricks)
  • Platform observability, performance optimization, and cost management
  • Infrastructure automation and self-service capabilities
  • Incident response and system resilience

Qualifications

  • 4-6+ years working with production systems at scale
  • Hands-on experience with cloud infrastructure (Azure or AWS preferred)
  • Programming ability (Python, Rust, Go, Bash, or others)
  • Understanding of distributed systems, reliability, and observability
  • Experience with infrastructure automation and configuration management
  • Demonstrated ability to work effectively in remote environments

Required Skills

  • Experience operating Kubernetes in production environments
  • Familiarity with data platform infrastructure (Spark, Databricks, or similar)
  • Multi-cloud experience (Azure and AWS)
  • Hands-on experience with observability tools and practices
  • Track record of improving system reliability and scalability
  • Experience with cloud cost optimization and FinOps practices
  • Infrastructure as Code tools (Terraform, Pulumi, or similar)

Preferred Skills

  • Leadership experience or potential
  • ML/AI infrastructure experience
  • Security and compliance knowledge
  • Incident management and on-call practices
  • Performance tuning and capacity planning
