High Five

Artificial Intelligence Engineer

Posted: 1 day ago

Job Description

HighFive is hiring on behalf of a Singapore-based AI startup that is looking for a versatile Data & AI Engineer with 4–7 years of experience. The role involves building, deploying, and maintaining end-to-end data pipelines to support downstream GenAI applications. You will design data models and transformations, develop scalable ETL/ELT workflows, and work in a fast-paced environment within the AI agent ecosystem.

Key Responsibilities

Data Modeling & Pipeline Development

  • Automate data ingestion from diverse sources (databases, APIs, files, SharePoint/document management tools, URLs). Most files are expected to be unstructured documents in different file formats, containing tables, charts, process flows, schedules, construction layouts/drawings, etc.
  • Own the chunking strategy, embedding, and indexing of all unstructured and structured data for efficient retrieval by downstream RAG/agent systems
  • Build, test, and maintain robust ETL/ELT workflows using Spark (batch & streaming)
  • Define and implement logical/physical data models and schemas. Develop schema mapping and data dictionary artifacts for cross-system consistency

GenAI Integration

  • Instrument data pipelines to surface real-time context into LLM prompts
  • Implement prompt engineering and RAG for varied workflows within the RE/Construction industry vertical

Observability & Governance

  • Implement monitoring, alerting, and logging (data quality, latency, errors)
  • Apply access controls and data privacy safeguards (e.g., Unity Catalog, IAM)

CI/CD & Automation

  • Develop automated testing, versioning, and deployment (Azure DevOps, GitHub Actions, Prefect/Airflow)
  • Maintain reproducible environments with infrastructure as code (Terraform, ARM templates)

Required Skills & Experience

  • 5 years in Data Engineering or a similar role, with at least 12-24 months of exposure to building pipelines for unstructured data extraction, including document processing with OCR, cloud-native solutions, chunking, indexing, etc. for downstream consumption by RAG/GenAI applications
  • Proficiency in Python, dlt for ETL/ELT pipelines, DuckDB or equivalent tools for in-process analytics, and DVC for managing large files efficiently
  • Solid SQL skills and experience designing and scaling relational databases. Familiarity with non-relational, column-based databases is preferred
  • Familiarity with Prefect (preferred) or a comparable tool (e.g., Azure Data Factory)
  • Proficiency with the Azure ecosystem, including production experience with Azure services
  • Familiarity with RAG indexing, chunking, and storage across file types for efficient retrieval
  • Strong DevOps/Git workflows and CI/CD (CircleCI / Azure DevOps)
  • Experience deploying ML artifacts using MLflow, Docker, or Kubernetes is good to have

Bonus Skillsets

  • Experience with computer-vision-based extraction or with building ML models for production
  • Knowledge of agentic AI system design: memory, tools, context, orchestration
  • Knowledge of data governance, privacy laws (GDPR), and enterprise security patterns

We are an early-stage startup, so you are expected to wear many hats and work outside your comfort zone, with real and direct impact in production. If you think you are a good fit for this fast-paced environment, please apply.

