Omilia

Senior Site Reliability Engineer

Posted: 1 minute ago

Job Description

We are looking for a Senior Site Reliability Engineer with Cloud platform experience. This individual will be part of a team responsible for operating and maintaining production clusters and developing our observability solutions. They will collaborate with team members to develop automation strategies, build monitoring and alerting, and ensure overall platform reliability. Your goal will be to become an integral part of the team, making every challenge of the platform your own and solving it accordingly.

Responsibilities

  • Ensure platform reliability and availability across production and pre-production environments through proactive monitoring, alerting, and automation
  • Act as first responder for incidents and contribute to problem management and root cause analysis
  • Support the development team's efforts towards reliability, creating a solid reliability culture within the development lifecycle
  • Develop troubleshooting documentation for production support resources
  • Collaborate with Engineering teams to develop optimised and productive runbooks, operational documentation, and automation of operational tasks
  • Collaborate with development and cloud engineering teams to embed reliability and performance into the software delivery lifecycle
  • Design, implement, and evolve observability solutions (metrics, logs, traces, dashboards) using tools such as Prometheus, Grafana, and ELK
  • Participate in on-call rotations and continuously improve alert quality and response processes
  • Champion a culture of reliability, performance, and continuous improvement across teams

Requirements

  • Bachelor's Degree or MS in Engineering or equivalent
  • Experience in operating at least one container orchestration cluster (Kubernetes, Docker Swarm)
  • Experience developing or maintaining software for production services at scale
  • Experience with ELK
  • Experience with AWS
  • Experience with Grafana/Prometheus stack
  • Strong scripting skills (Bash, Python or Go)
  • Excellent communication skills
  • Thinking out of the box and anticipating challenges. It is imperative that we are not simply reactive; we must expect challenges and question technologies, procedures and thinking already in place. You will be expected to constantly review and challenge at all levels
  • Versatility. We work with agile/lean methods. We'd much rather iterate and learn than assume we know all the answers
  • Being a team player. You don't (always) work in isolation and are excited by the thought of using your team whilst involving product, experience design, engineering, and more in the process

Will be considered as a plus:

  • Telephony knowledge (SIP, VoIP)
  • Experience in Linux Administration (RedHat, CentOS, AL)
  • Working knowledge of Configuration Management tools (Terraform, Ansible)
  • Experience with TCP/IP and general networking concepts
  • RDBMS knowledge (MySQL, Postgres)
  • NoSQL knowledge (Redis)

Benefits

  • Fixed compensation
  • Long-term employment with the working days vacation
  • Professional development and growth (courses, training, etc.)
  • Being part of successful cutting-edge technology products that are making a global impact in the service industry
  • Proficient and fun-to-work-with colleagues
  • Apple gear

Omilia is proud to be an equal opportunity employer and is dedicated to fostering a diverse and inclusive workplace. We believe that embracing diversity in all its forms enriches our workplace and drives our collective success. We are committed to creating an environment where everyone feels welcomed, valued, and empowered to contribute their unique perspectives. All eligible candidates will be given consideration for employment without regard to factors such as race, color, religion, gender, gender identity or expression, sexual orientation, national origin, heredity, disability, age, or veteran status.

Job Application Tips

  • Tailor your resume to highlight relevant experience for this position
  • Write a compelling cover letter that addresses the specific requirements
  • Research the company culture and values before applying
  • Prepare examples of your work that demonstrate your skills
  • Follow up on your application after a reasonable time period
