Job Description

We are seeking a Principal Data Engineer with 5-7 years of hands-on experience and a strong background in real-time and batch data processing, containerization, and cloud-based data orchestration. This role is ideal for someone who is passionate about building robust, scalable, and efficient data pipelines and who thrives in agile, collaborative environments.

Key Responsibilities

- Design, build, and maintain real-time data pipelines using streaming frameworks such as Kafka, Apache Flink, and Spark Structured Streaming
- Develop batch processing workflows with Apache Spark (PySpark)
- Orchestrate and schedule data workflows using orchestration frameworks such as Apache Airflow and Azure Data Factory
- Containerize applications using Docker, manage deployments with Helm, and run them on Kubernetes
- Implement modern storage solutions using open formats such as Parquet, Delta Lake, and Apache Iceberg
- Build high-performance analytics engines using tools like Trino or Presto
- Collaborate with DevOps to manage infrastructure with Terraform and integrate with CI/CD pipelines via Azure DevOps
- Ensure data quality and consistency using tools like Great Expectations
- Write modular, well-tested, and maintainable Python and SQL code
- Develop an observability layer to monitor and optimize performance across data pipelines
- Participate in agile ceremonies and contribute to sprint planning and reviews

Required Skills & Experience

- Advanced Python programming with a strong focus on modular and testable code
- Strong knowledge of SQL and experience working with large-scale datasets
- Hands-on experience with at least one major cloud platform (Azure preferred)
- Solid experience with real-time data processing (Kafka, Flink, or Spark Streaming)
- Expertise in Apache Spark (PySpark) for batch processing
- Experience implementing lakehouse architectures and working with columnar storage (e.g., ClickHouse)
- Proficiency in using Azure Data Factory or Apache Airflow for data orchestration
- Experience in building APIs to expose large datasets
- Solid experience with Docker, Kubernetes, and Helm
- Familiarity with data lake open formats such as Parquet, Delta Lake, and Iceberg
- Basic experience with Terraform for infrastructure provisioning
- Practical experience with data quality frameworks (e.g., Great Expectations)
- Comfort working in agile development teams
- Proven ability in debugging and performance tuning of streaming and batch data jobs
- Experience with AI-driven tools (e.g., text-to-SQL) is a plus

We have an amazing team of 700+ individuals working on highly innovative enterprise projects and products. Our customer base includes Fortune 100 retail and CPG companies, leading store chains, fast-growth fintech, and multiple Silicon Valley startups.

What makes Confiz stand out is our focus on processes and culture. Confiz is ISO 9001:2015 (QMS), ISO 27001:2022 (ISMS), ISO 20000-1:2018 (ITSM), and ISO 14001:2015 (EMS) certified. We have a vibrant culture of learning through collaboration and making the workplace fun.

People who work with us work with cutting-edge technologies while contributing to the company's success as well as their own.

To know more about Confiz Limited, visit: https://www.linkedin.com/company/confiz-pakistan/
