Monday, October 27, 2025
XM

Site Reliability Engineer

Posted: 4 days ago

Job Description

Site Reliability Engineers (SRE) - Multiple Openings

The Role:

You will join a team working with Observability, Escalations, Post-mortems, Correction of Errors, and other practices that will contribute to the company's goal of cloud resiliency. You will be responsible for driving processes around reliability, best practices, cultural change, and enforcement of these practices.

The main responsibilities of the position include:

  • Honor and practice the Resiliency pillar of the Well Architected Framework in all tasks and responsibilities
  • Conduct Chaos Engineering experiments and relevant exercises to improve resiliency and fault-tolerance
  • Research workloads for migrating to the cloud with minimal disruption and impact
  • Monitor cloud migration projects to ensure seamless transitions
  • Design, consult, re-platform, and re-factor the observability of current cloud infrastructure
  • Coordinate with other IT departments and teams regarding observability for both individual and organizational needs
  • Regularly assess cloud deployments for compliance with the company’s standards and best practices
  • Investigate and correct areas where observability is lagging
  • Stay up to date and provide training on new and current technologies, services, tools, methodologies, and practices
  • Occasionally participate in service capacity planning, software performance analysis, and system tuning
  • Mentor colleagues in technical skills and knowledge
  • Analyze, oversee, and remediate the company’s resiliency
  • Participate in on-call support 24/7 based on a rotation schedule

Main requirements:

  • BSc/MSc degree in Computer Science or related field
  • 5+ years of cloud services experience, with at least 3 years on AWS cloud
  • 3+ years of experience in SRE or a similar role
  • Experience with monitoring, APM, logging, and notification tools
  • Familiarity with incident, problem and change management procedures and practices
  • Advanced knowledge of SRE practices and methods
  • Understanding and practice of Service Levels
  • Strong troubleshooting skills and the ability to mentor others
  • Extensive experience with Kubernetes and related technologies, services, and ecosystem
  • Advanced knowledge of CI/CD, Infrastructure as Code (IaC) concepts and tools, especially HCL Terraform and AWS CloudFormation
  • Experience with versioning tools like Git
  • Strong organizational and documentation skills
  • Exceptional time management and research abilities
  • Advanced Linux, networking, and scripting skills

The following will be considered an advantage:

  • Experience with platforms like Kafka (MSK)
  • Experience with RDBMSs, particularly Postgres and MySQL
  • Knowledge of scripting languages such as Python or Go

Benefit from:

  • Attractive remuneration package
  • Private health insurance
  • Corporate pension fund
  • Intellectually stimulating work environment
  • Continuous personal development and international training opportunities

The Hiring Experience: What Awaits You

  • Show Your Skills – Online Technical Challenge
  • Let’s Connect – Intro Chat with Talent Acquisition
  • Deep Dive – First Interview with Your Future Team
  • Final Connection – Final Interview

All applications will be treated with strict confidentiality!

