Senior/Lead Platform Engineer (Databricks)
Posted: 2 days ago
Job Description
Role Overview
We are seeking a Senior/Lead Platform Engineer to take ownership of the design, implementation and operation of our core data, analytics and ML infrastructure. This role spans platform architecture, DevSecOps, DataOps and ML infrastructure, and requires a combination of strategic thought leadership and hands-on execution. You will build, integrate and operate platforms on AWS and Databricks, enabling scalable, secure, production-grade ML/AI solutions.

Key Responsibilities
- Architect and implement end-to-end data and ML platforms on AWS and Databricks: data lakes, warehouses, streaming and batch pipelines, and model training/deployment infrastructure.
- Lead DevSecOps and DataOps practices: infrastructure as code (IaC), CI/CD pipelines for data and ML workflows, and secure multi-account/multi-region cloud operations.
- Integrate AWS services (e.g., S3, Redshift, Kinesis, Lambda, EKS/ECS) with the Databricks runtime, Delta Lake, Unity Catalog, etc. to build scalable, performant pipelines.
- Build and operate ML infrastructure: training clusters, model versioning, an MLOps toolchain (e.g., MLflow), model monitoring and observability, and automatic retraining workflows.
- Establish data governance, lineage, quality and observability standards across data pipelines and ML workflows.
- Mentor engineering teams, define architectural best practices and guide the implementation of high-scale data/ML systems.
- Optimize system performance, cost and scalability; diagnose and resolve large-scale production issues.
- Continuously evaluate new tools and technologies in cloud, data platform, DevSecOps and ML infrastructure, and apply them to drive innovation.

Requirements
- 7+ years of experience in data platform architecture, cloud/ML infrastructure engineering or related roles.
- Deep technical expertise in Databricks and AWS: demonstrated ability to design, integrate and operate solutions spanning both platforms.
- Strong hands-on implementation skills: you will not just design but also build, deploy and operate the platform.
- Proven track record of building and operating scalable ML/AI platforms in production (model training and deployment).
- Expertise in Apache Spark, Delta Lake and modern data pipeline frameworks (batch and streaming).
- Strong background in infrastructure as code (Terraform, CloudFormation), CI/CD for data/ML, and DevSecOps practices.
- Proficiency in Python and SQL; familiarity with Scala or an equivalent language is a plus.
- Experience with data governance, data lineage, observability and MLOps frameworks (e.g., MLflow, Airflow, dbt).
- Bonus: experience in fintech, regulated industries or high-security environments.

Benefits
- Performance bonus up to 2 months
- 13th-month salary, pro rata
- 15 days of annual leave + 3 days of sick leave + 1 day of birthday leave + 1 day of Christmas leave
- Meal and parking allowances covered by the company
- Full benefits and salary rank during probation
- Insurance as required by Vietnamese labor law, plus premium health care for you and your family with no seniority requirement
- SMART goals and clear career opportunities (technical seminars, conferences and career talks): we focus on your development
- Values-driven, international working environment and agile culture
- Overseas travel opportunities for training and work
- Internal hackathons and company events (team building, coffee run, blue card...)
- Work-life balance: 40 hours per week, Monday to Friday