Ceva, Inc.

Deep Learning Software Engineer (Quantization & Pruning)


Job Description

About The Team & Product

LiteML is CEVA’s Python-based package for model quantization and pruning that enables efficient deployment of neural networks on CEVA NPUs and embedded targets. The team builds algorithms, tooling, and developer experience that bridge research prototypes and production-ready SDK components used by internal groups and customers.

Role Overview

You will design, implement, and productize LiteML features (post-training quantization, QAT hooks, structured/unstructured pruning, calibration, model-accuracy recovery) with a strong emphasis on correctness, performance, and developer ergonomics. You’ll collaborate closely with Architecture, Embedded SW/SDK, and Research, and engage externally with academia and customers to validate approaches and guide the roadmap.

What You’ll Do (Responsibilities)

  • Own LiteML features end-to-end: design APIs, implement core algorithms in Python, write robust tests, package and release to internal and external users.
  • Quantization & pruning R&D → product: evaluate techniques (e.g., PTQ, QAT, per-tensor/per-channel, mixed-precision policies, structured pruning), benchmark on reference models, and harden for production.
  • Model accuracy & performance: build calibration flows, sensitivity analyses, and fallback strategies; create reproducible experiments and dashboards to quantify accuracy/latency/size trade-offs.
  • Developer experience: craft clean CLI/SDK surfaces, docs, and examples; streamline install/upgrade paths and Python packaging (wheels, dependency management, versioning, release notes).
  • Cross-functional integration: align with Architecture on kernel/ops capabilities and numeric formats; with Embedded SW on runtime constraints; and with Research on algorithm selection and evaluation.
  • External collaboration: support academic partnerships and key customers: gather requirements, reproduce issues, propose fixes, and fold learnings back into the product.
  • Quality & compliance: uphold Ceva coding standards, CI/CD, code reviews, and documentation templates; contribute to internal competency forums and knowledge sharing.

What Success Looks Like (6–12 Months)

  • Product impact: LiteML releases add measurable value, namely reduced model size/latency with maintained accuracy on internal reference suites and early-access customer models.
  • Engineering excellence: high test coverage, stable APIs, clean deprecation policy, and well-documented examples that lower time-to-first-success for partners.
  • Collaboration: strong cadence with Architecture, Embedded SW, and Research; external feedback loops from academia/customers regularly inform your backlog.
  • Operational maturity: streamlined packaging and CI release flow; clear benchmark reports and artifacts per release.

Requirements

  • 3–6 years hands-on software experience building ML tooling, frameworks, or ML-adjacent infrastructure.
  • Strong Python engineering skills (type hints, packaging, testing with pytest, performance profiling, virtual environments).
  • C/C++ knowledge for operator kernels, performance-critical paths, or runtime integration.
  • Solid understanding of deep learning fundamentals and model deployment concerns (numerical effects of quantization, pruning criteria, calibration, accuracy/performance trade-offs).
  • Experience working with Git and modern review/CI workflows on Linux.
  • B.Sc. or higher in Computer Science / Software Engineering / Computer Engineering from a university.

Nice to Have (Big Advantages)

  • Practical experience with quantization (PTQ/QAT) and pruning (magnitude, channel/filter, structured sparsity) in PyTorch or TensorFlow; hands-on with ONNX or TFLite conversions.
  • Exposure to embedded / NPU deployment flows (e.g., operator coverage, calibration sets, integer-only inference, memory/layout constraints).
  • Python packaging at scale: building wheels, manylinux, dependency pinning, semantic versioning, release automation.
  • Familiarity with model-profiling and accuracy tooling; experience creating reproducible benchmarks and internal dashboards/notebooks.
