Monday, October 27, 2025

Job Description

Digital Brain builds AI agents and a digital workforce platform that automates complex business workflows end-to-end. We design, ship, and operate agentic assistants and tool-using systems that deliver measurable impact for real customers. You’ll join an engineering-led team where autonomy, learning, and delivery define how we work.

Role Purpose

You’ll design, build, and operate multi-agent systems that run at scale, with low latency, high reliability, and strong accuracy. This is an end-to-end ownership role: from planning, tool use, and retrieval to evaluation and production operations.

What You’ll Do

  • Develop stateful agentic services in Python (FastAPI): planning, memory, routing, and tool use/orchestration.
  • Design multi-step flows with LangGraph/LangChain; build robust fallback and error-handling strategies.
  • Architect RAG pipelines with hybrid retrieval, re-ranking, advanced chunking, and optimized vector DBs (Qdrant).
  • Operate and optimize LLM model stacks (API or self-hosted), balancing latency, quality, and cost.
  • Implement smart model routing, caching, and observability (logging, tracing, rate-limiting, retries, circuit breakers, and SLOs).
  • Build evaluation and feedback loops using LLM-as-judge and scenario testing.
  • Ship production systems with CI/CD, testing, and infrastructure as code.
  • Collaborate closely with product, ML, and customer teams to turn ambiguous goals into real outcomes.

Must-Have Qualifications

  • 2–4+ years building and shipping ML/NLP or backend systems.
  • Strong Python fundamentals and clean, test-driven, production-ready development.
  • Hands-on LangGraph/LangChain (or similar) experience.
  • Deep understanding of RAG architecture and experience with vector databases (Qdrant).
  • Expertise in FastAPI and modern backend design.
  • Cloud experience (Azure preferred; AWS/GCP a plus) with Docker/Kubernetes.
  • Proficiency in CI/CD pipelines and automated testing for LLM applications.
  • Strong grasp of monitoring, performance tracking, and production reliability.
  • Up-to-date on LLM research and emerging tools; bias towards action and shipping.

Nice to Have

  • Built and operated multi-agent systems in production (coordination, planning, memory, tool orchestration).
  • Familiar with vLLM or SGLang for high-throughput inference.
  • Experience with Langfuse (tracing, evaluation dashboards).
  • MLOps: GitHub Actions / GitLab CI / Jenkins; MLflow or similar.
  • Voice systems: ASR/TTS, telephony, and real-time streaming pipelines.
  • Hybrid retrieval (BM25 + dense) and re-ranking pipelines.
  • Guardrails and safety: prompt injection, policy validation, and redaction.
  • Fine-tuning or PEFT (LoRA), distillation.

Interview Process

  • Intro Chat (15–30 min)
  • Technical Deep Dive (90 min)
  • Final Decision & Offer

If you want to work on agentic systems that power real businesses, apply now and help shape the next generation of AI infrastructure.
