Meril

Data Engineer – Financial Infrastructure & Analytics

Posted: 6 days ago

Job Description

About the Role

As a Quantitative Data Engineer, you will be the backbone of the data ecosystem powering our quantitative research, trading, and AI-driven strategies. You will design, build, and maintain the high-performance data infrastructure that enables low-latency, high-fidelity access to market, fundamental, and alternative data across multiple asset classes.

This role bridges quant engineering, data systems, and research enablement, ensuring that our researchers and traders have fast, reliable, and well-documented datasets for analysis and live trading. You'll be part of a cross-functional team working at the intersection of finance, machine learning, and distributed systems.

Responsibilities

  • Architect and maintain scalable ETL pipelines for ingesting and transforming terabytes of structured, semi-structured, and unstructured market and alternative data.
  • Design time-series-optimized data stores and streaming frameworks to support low-latency data access for both backtesting and live trading.
  • Develop ingestion frameworks integrating vendor feeds (Bloomberg, Refinitiv, Polygon, Quandl, etc.), exchange data, and internal execution systems.
  • Collaborate with quantitative researchers and ML teams to ensure data accuracy, feature availability, and schema evolution aligned with modeling needs.
  • Implement data quality checks, validation pipelines, and version control mechanisms for all datasets.
  • Monitor and optimize distributed compute environments (Spark, Flink, Ray, or Dask) for performance and cost efficiency.
  • Automate workflows using orchestration tools (Airflow, Prefect, Dagster) for reliability and reproducibility; a minimal sketch follows this list.
  • Establish best practices for metadata management, lineage tracking, and documentation.
  • Contribute to internal libraries and SDKs for seamless data access by trading and research applications.
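To make the orchestration responsibility concrete, here is a minimal sketch of a nightly ETL DAG, assuming a recent Airflow (2.4+). The DAG id, schedule, and task bodies are illustrative assumptions, not details from this posting.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_ticks():
        ...  # placeholder: pull the previous day's files from a vendor feed

    def validate_ticks():
        ...  # placeholder: schema, dedupe, and null-rate checks

    def load_to_lake():
        ...  # placeholder: append validated data to the versioned lake

    with DAG(
        dag_id="vendor_ticks_daily",        # hypothetical pipeline name
        start_date=datetime(2025, 1, 1),
        schedule="0 2 * * *",               # nightly, after market close
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        extract = PythonOperator(task_id="extract", python_callable=extract_ticks)
        check = PythonOperator(task_id="validate", python_callable=validate_ticks)
        load = PythonOperator(task_id="load", python_callable=load_to_lake)

        extract >> check >> load            # linear dependency chain

Retries and idempotent, catchup-disabled runs are what make a pipeline like this reproducible; Prefect and Dagster express the same flow with their own primitives.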
In Trading Firms, Data Engineers Typically:

  • Build real-time data streaming systems to capture market ticks, order books, and execution signals (a minimal consumer sketch follows the Qualifications section).
  • Manage versioned historical data lakes for backtesting and model training.
  • Handle multi-venue data normalization (different exchanges and instruments).
  • Integrate alternative datasets (satellite imagery, news sentiment, ESG, supply-chain data).
  • Work closely with quant researchers to convert raw data into research-ready features.
  • Optimize pipelines for ultra-low latency, where milliseconds can impact P&L.
  • Implement data observability frameworks to ensure uptime and quality.
  • Collaborate with DevOps and infra engineers to scale storage, caching, and compute.

Tech Stack

  • Languages: Python, SQL, Scala, Go, Rust (optional, for HFT pipelines)
  • Data Processing: Apache Spark, Flink, Ray, Dask, Pandas, Polars
  • Workflow Orchestration: Apache Airflow, Prefect, Dagster
  • Databases & Storage: PostgreSQL, ClickHouse, DuckDB, Elasticsearch, Redis
  • Data Lakes: Delta Lake, Iceberg, Hudi, Parquet
  • Streaming: Kafka, Redpanda, Pulsar
  • Cloud & Infra: AWS (S3, EMR, Lambda), GCP, Azure, Kubernetes
  • Version Control & Lineage: DVC, MLflow, Feast, Great Expectations
  • Visualization / Monitoring: Grafana, Prometheus, Superset, Datadog
  • Tools for Finance: kdb+/q (for tick data), InfluxDB, QuestDB

What You Will Gain

  • End-to-end ownership of core data infrastructure in a high-impact, mission-critical domain.
  • Deep exposure to quantitative research workflows, market microstructure, and real-time trading systems.
  • Collaboration with elite quantitative researchers, traders, and ML scientists.
  • Hands-on experience with cutting-edge distributed systems and time-series data technologies.
  • A culture that emphasizes technical excellence, autonomy, and experimentation.

Qualifications

  • Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
  • 2+ years of experience building and maintaining production-grade data pipelines.
  • Proficiency in Python and SQL, and with frameworks such as Airflow, Spark, or Flink.
  • Familiarity with cloud storage and compute (S3, GCS, EMR, Dataproc) and versioned data lakes (Delta, Iceberg).
  • Experience with financial datasets, tick-level data, or high-frequency time series is a strong plus.
  • Strong understanding of data modeling, schema design, and performance optimization.
  • Excellent communication skills and the ability to support multidisciplinary teams.
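The real-time streaming work described above is, at its simplest, a durable consumer that batches ticks before flushing them downstream. Here is a minimal sketch using the confluent-kafka Python client; the broker address, topic, group id, and batch size are all hypothetical.

    import json

    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",   # hypothetical broker
        "group.id": "tick-archiver",             # hypothetical consumer group
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["equities.ticks"])       # hypothetical topic

    batch = []
    try:
        while True:
            msg = consumer.poll(timeout=1.0)
            if msg is None:
                continue                         # no message this interval
            if msg.error():
                continue                         # real code would log/alert
            batch.append(json.loads(msg.value()))
            if len(batch) >= 10_000:
                # Real code would normalize venue/symbol fields here and
                # flush the batch to Parquet in the historical data lake.
                batch.clear()
    finally:
        consumer.close()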

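Turning raw data into "research-ready features" often starts with bar aggregation. A sketch with recent Polars (from the stack above), assuming a tick file with ts (datetime), symbol, price, and size columns, all hypothetical names:

    import polars as pl

    # Load tick data and sort by timestamp, as dynamic grouping requires.
    ticks = pl.read_parquet("ticks.parquet").sort("ts")

    # Roll raw ticks up into one-minute OHLCV bars per symbol.
    bars = (
        ticks
        .group_by_dynamic("ts", every="1m", group_by="symbol")
        .agg(
            pl.col("price").first().alias("open"),
            pl.col("price").max().alias("high"),
            pl.col("price").min().alias("low"),
            pl.col("price").last().alias("close"),
            pl.col("size").sum().alias("volume"),
        )
    )
    print(bars.head())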
