Job Description

Join our team as a Site Reliability Engineer, where you will focus on cloud infrastructure, containerization, and monitoring using Kubernetes and Microsoft Azure. You will work closely with clients to ensure robust observability and efficient deployment pipelines. Apply now to contribute to maintaining and improving distributed systems at scale.

Responsibilities

  • Create and manage containerized applications using Docker or Podman
  • Deploy and maintain Kubernetes resource manifests in clusters such as Kind, GKE, or AKS
  • Implement and monitor Prometheus agents to observe infrastructure and application metrics
  • Troubleshoot and analyze logs to identify and resolve system events and issues
  • Develop and maintain Azure DevOps CI/CD pipelines and GitOps deployment workflows
  • Collaborate with teams to improve system reliability and deployment automation
  • Manage infrastructure as code using Terraform and other tools
  • Configure and maintain observability tools and alerting systems
  • Ensure compliance with client constraints and security standards
  • Participate in incident response and root cause analysis
  • Document system configurations, processes, and procedures
  • Support continuous improvement of deployment and monitoring practices

Requirements

  • Hands-on programming experience of at least 2 years
  • Proficiency in at least one scripting language
  • Experience with Kubernetes container orchestration
  • Knowledge of at least one cloud provider including Microsoft Azure or Google Cloud Platform
  • Familiarity with Prometheus or similar monitoring tools for observability
  • Experience with Azure DevOps CI/CD pipelines or GitOps tools like Helm and ArgoCD
  • Understanding of distributed systems troubleshooting and log analysis
  • Practical skills in containerization using Docker or Podman
  • Experience creating and managing Kubernetes resource manifests
  • Ability to deploy and monitor Prometheus agents
  • Knowledge of infrastructure as code tools such as Terraform
  • Strong problem-solving and analytical skills
  • Effective communication and teamwork abilities
  • English proficiency at B2 level or higher

We offer

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn

