SQUIRE

Senior Site Reliability Engineer (SRE)

Posted: Oct 29, 2025

Job Description

Senior AI Site Reliability Engineer

Overview

As a Senior AI Site Reliability Engineer, you will bring an AI-first mindset to solving classic reliability challenges. You'll design, prototype, and deploy intelligent automation that improves observability, incident response, performance tuning, and operational efficiency across SQUIRE's platform. This role is highly cross-functional; you'll collaborate with engineering, infrastructure, and product teams to identify where AI can create leverage, then build and scale those solutions into production.

Who we are

SQUIRE is the leading business management system designed for the needs of barbers, shop owners, and their communities. We believe the pursuit of artistry and autonomy should not be restricted by the complexities of running a business. With SQUIRE, we provide custom-branded tools, resources, and guidance to help barbers of all stages and experience levels attract and retain more customers, efficiently manage their shop operations, and increase their revenue.

Founded in ****, SQUIRE is trusted by barbers in 4,000+ shops in more than a thousand cities around the globe. From streamlined booking and opening new shops to real-time earning dashboards and building lasting customer relationships, SQUIRE supports shop owners in seamlessly bridging the gap between their personal craft and business goals. SQUIRE enables barbers everywhere to unlock their full potential both as artists and as entrepreneurs.

For more information, please visit getsquire.com or download the SQUIRE app from the App or Play Store.

Responsibilities

  • Develop and deploy AI/ML-driven solutions for monitoring, anomaly detection, and predictive alerting to improve system reliability and reduce MTTR
  • Use AI techniques to optimize capacity planning, autoscaling, and resource utilization across distributed systems
  • Automate repetitive operational tasks with intelligent agents and large-scale data analysis
  • Integrate LLMs and generative AI into incident response, post-mortem analysis, and business continuity
  • Partner with platform and product engineering teams to embed AI-based observability into services from the ground up
  • Continuously evaluate new AI/ML methods and tools to expand SQUIRE's AI-driven SRE capabilities
  • Drive a culture of experimentation: build prototypes, run pilots, measure results, and productionize what works
  • Mentor engineers on applying AI approaches to reliability problems; help establish standards and best practices

Requirements And Qualifications

  • 5+ years of experience in Site Reliability Engineering, DevOps, or related roles
  • Proven experience using AI/ML (supervised learning, anomaly detection, LLMs, etc.) to solve operational or reliability problems
  • Strong background in distributed systems, cloud infrastructure (AWS preferred), and container orchestration (Docker, ECS, Elastic Beanstalk)
  • Proficiency with observability stacks (Datadog, Sentry, Prometheus, etc.)
  • Solid programming/scripting skills in Python, Go, or similar, with experience integrating ML/AI libraries and APIs
  • Hands-on with automation frameworks and infrastructure as code (Terraform, CloudFormation, etc.)
  • Excellent analytical and problem-solving skills, with the ability to innovate in operational domains
  • Strong communication and collaboration skills across technical and non-technical stakeholders
  • English proficiency is a must. You will be interacting with English-speaking coworkers
  • Must be based in Buenos Aires
  • Availability to work on-site in our office in CABA two days a week (Tuesdays and Thursdays)

Nice to have

  • Familiarity with generative AI/LLM deployment (e.g., for operational assistants, automated runbooks)
  • Experience with predictive scaling, proactive fault detection, or automated incident management systems
  • Contributions to AI-Ops / MLOps tooling or open source reliability projects
  • Background in applying AI to security operations or compliance monitoring

Interview Accommodations

SQUIRE is committed to working with and providing reasonable assistance to individuals with physical and mental disabilities. If you are an individual with a disability requiring an accommodation to apply for an open position, please email your request to ****** and someone on our team will respond to your request.

Equal Employment Opportunity

SQUIRE provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. This applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.

Pay Transparency Nondiscrimination Provision

SQUIRE will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is in response to a formal complaint or charge, in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by the employer, or consistent with the contractor's legal duty to furnish information.
