Cloudinary

Staff Backend Engineer – Platform

Posted: 2 minutes ago

Job Description

We are looking for a Staff Backend Engineer who is passionate about building platforms at scale, loves challenging engineering problems, and enjoys empowering other engineers to move faster with confidence.

As a member of Cloudinary’s Backend Platform Team, you will own and evolve the backend platform: core shared services (for example, service mesh, shared locking, multi-tenant fairness and rate-limiting), the tooling and processes behind our Software Development Lifecycle (CI/CD pipelines, development environments, release workflows), and AI-powered operational tooling for production visibility and incident response. You’ll work closely with DevOps, Architecture and Product engineering teams to turn these needs into reliable, scalable, and reusable platform capabilities.

Responsibilities

  • Design, build, and own core platform services that support Cloudinary’s backend services at scale (e.g., service mesh, shared locking, fairness and rate-limiting, and other shared infrastructure components)
  • Lead end-to-end engineering initiatives - from discovery and architecture, through implementation and rollout, to observability and ongoing operations
  • Improve developer experience and productivity, evolving development environments, Blueprint projects, and frameworks that streamline service creation (e.g., Go services, AWS Lambda)
  • Drive the evolution of our AI-powered operational tooling and agents, helping design, build, and maintain systems that analyze, evaluate, and assist in resolving production and on-call issues
  • Advance deployment and operational excellence, driving improvements in reliability, performance, and safety of our deployment and release lifecycle
  • Provide technical leadership and mentorship, influencing platform strategy and engineering best practices across teams and partnering closely with DevOps and product engineering

Technical Skills & Experience

  • 10+ years of experience in backend or platform engineering, including designing and building production systems at scale
  • Strong hands-on experience with Golang, or significant experience with another backend language with a strong desire and ability to ramp up on Go
  • Experience working with AWS and cloud-native architectures, including services such as EC2, S3, SQS, Kinesis, EKS, Lambda, Aurora, and core concepts like IAM, VPC, networking, and autoscaling
  • Proven experience designing and operating systems at scale - thinking about high availability, multi-tenancy, throughput, latency, cost, and graceful degradation rather than just correctness in small environments
  • Practical experience with Docker and containerized workloads
  • Experience with distributed systems and service-to-service communication (e.g., service meshes, RPC, concurrency, resiliency patterns)
  • Experience with monitoring and observability tools, such as Kibana, Coralogix, Datadog, CloudWatch, CloudTrail, Rollbar, Athena, or similar
  • Proven technical leadership: driving complex projects, making architectural decisions, and aligning stakeholders across teams
  • Comfortable both designing and building services from scratch and working productively in large, existing codebases
  • Comfortable working with Ruby in production systems as part of a multi-language backend stack

Soft Skills & Ways Of Working

  • Great team player and communicator - easy to collaborate with, able to explain complex technical topics clearly to different audiences
  • Autodidact and curious, not shy about asking questions to fully understand ideas, requirements, and systems
  • Able to actively engage with other teams, understand their workflows and pain points, and translate them into practical platform solutions
  • Open-minded and collaborative, able to consider and accept other people’s ideas, even when they contradict your own
  • Growth mindset - driven to learn and improve rather than assume you already know it all
  • Comfortable mentoring and guiding other engineers, giving constructive feedback and helping raise the bar for engineering quality
  • Embraces the use of AI tools and workflows in day-to-day work, looking for ways to leverage AI to increase productivity, quality, and operational excellence rather than resisting it

Nice to have

  • Experience with at least two of Ruby, Node.js or Python and their relevant web frameworks in production systems
  • Experience with Kubernetes and container orchestration
  • Experience with ArgoCD or other GitOps deployment tools
  • Background in building internal developer platforms or shared infrastructure for other teams
  • Experience designing or integrating AI/LLM-powered operational tools or agents (for observability, incident response, or developer productivity)

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
