Data Engineer
Posted: 21 hours ago
Job Description
Location: Hybrid role based in Colombia, preferably in Bogotá, Medellín, or Cartagena
Years of experience: 6+ years of experience in data engineering with cloud technologies
Requirement: Minimum English level B2

JOB OVERVIEW
We are seeking an experienced Data Engineer with strong skills in Python, SQL, AWS, and CI/CD practices, particularly Infrastructure as Code using AWS CDK. This role focuses on designing and implementing serverless data pipelines and cloud infrastructure to build end-to-end data solutions that extract, transform, and catalog data from various sources. You will collaborate closely with data science teams to ensure data accessibility and quality for their analytics and modeling initiatives.

RESPONSIBILITIES

Infrastructure Development
- AWS CDK Implementation: Design and deploy complete data infrastructure using AWS CDK with Python (see the illustrative sketch after the requirements below)
- Storage Solutions: Implement optimized S3 data lakes with appropriate partitioning and file formats

Data Pipeline Engineering
- API Integration: Develop robust data extraction processes from external APIs and data sources
- ETL Processes: Create efficient data transformation workflows using AWS Glue
- Data Cataloging: Implement automated schema discovery and metadata management with AWS Glue Crawlers
- Spring Boot: Build and maintain backend services and APIs to support data pipelines and integration needs

DevOps & CI/CD
- CI/CD: Implement automated deployment pipelines for data infrastructure
- Infrastructure as Code: Maintain version-controlled, reproducible infrastructure deployments

Data Governance & Security
- Lake Formation: Implement fine-grained access controls and data governance policies
- Permissions Management: Design secure, role-based access patterns for data assets

TECHNICAL REQUIREMENTS

Core Skills (Mandatory)
- AWS CDK/CloudFormation (critical): 3+ years of hands-on experience building production infrastructure
- Python: 5+ years of development experience, particularly for data processing
- AWS Glue: ETL job creation, crawlers, data catalog management
- CI/CD: Experience with automated deployment pipelines
- Docker & Kubernetes: Containerization and orchestration for scalable deployment environments

AWS Services Expertise Required
- AWS Lambda: Serverless function development for data processing
- Amazon S3: Data lake design, partitioning strategies, lifecycle management
- Amazon Athena: Query optimization, performance tuning
- AWS Lake Formation: Data governance, permissions, and access control
- EventBridge & Step Functions: Workflow orchestration (preferred)

Data Engineering Skills
- Data Cataloging: Experience with metadata management and automated schema discovery
- ETL/ELT Processes: Design and implementation of efficient data transformation workflows
- API Integration: REST API consumption and error handling
- Data Science Collaboration: Experience working with data science teams to provide clean, accessible datasets and support model infrastructure deployment to production

SOFT SKILLS
- Strong analytical thinking and problem-solving abilities
- Excellent communication and collaboration skills across technical and non-technical teams
- Ability to self-manage and thrive in dynamic environments
- Curiosity and a mindset of continuous learning
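For candidates wondering what "AWS CDK with Python" work on this kind of team looks like in practice, the sketch below is a minimal, hypothetical example, not the employer's actual codebase: a single CDK (v2) stack that provisions an S3 raw-data bucket, a Glue Data Catalog database, and a Glue crawler for automated schema discovery. All names (DataLakeStack, raw_data, the raw/ prefix) are illustrative assumptions.

```python
# Minimal, illustrative AWS CDK (v2) stack -- hypothetical names throughout.
from aws_cdk import App, Stack, RemovalPolicy, aws_s3 as s3, aws_glue as glue, aws_iam as iam
from constructs import Construct


class DataLakeStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Raw zone of the data lake: versioned, encrypted S3 bucket
        raw_bucket = s3.Bucket(
            self,
            "RawDataBucket",
            versioned=True,
            encryption=s3.BucketEncryption.S3_MANAGED,
            removal_policy=RemovalPolicy.RETAIN,
        )

        # Glue Data Catalog database that the crawler will populate
        catalog_db = glue.CfnDatabase(
            self,
            "RawCatalogDatabase",
            catalog_id=self.account,
            database_input=glue.CfnDatabase.DatabaseInputProperty(name="raw_data"),
        )

        # IAM role the crawler assumes, with read access to the raw bucket
        crawler_role = iam.Role(
            self,
            "CrawlerRole",
            assumed_by=iam.ServicePrincipal("glue.amazonaws.com"),
            managed_policies=[
                iam.ManagedPolicy.from_aws_managed_policy_name(
                    "service-role/AWSGlueServiceRole"
                )
            ],
        )
        raw_bucket.grant_read(crawler_role)

        # Crawler that discovers schemas under the raw/ prefix and registers tables
        crawler = glue.CfnCrawler(
            self,
            "RawDataCrawler",
            role=crawler_role.role_arn,
            database_name="raw_data",
            targets=glue.CfnCrawler.TargetsProperty(
                s3_targets=[
                    glue.CfnCrawler.S3TargetProperty(
                        path=f"s3://{raw_bucket.bucket_name}/raw/"
                    )
                ]
            ),
        )
        # Ensure the catalog database is created before the crawler
        crawler.add_dependency(catalog_db)


app = App()
DataLakeStack(app, "DataLakeStack")
app.synth()
```

In a CI/CD pipeline of the kind described above, a stack like this would typically be deployed with `cdk deploy` (after a one-time `cdk bootstrap`), keeping the infrastructure version-controlled and reproducible.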
Job Application Tips
- Tailor your resume to highlight relevant experience for this position
- Write a compelling cover letter that addresses the specific requirements
- Research the company culture and values before applying
- Prepare examples of your work that demonstrate your skills
- Follow up on your application after a reasonable time period