Job Description
About the Role
Phonologies is seeking a hands-on ML Engineer who bridges data engineering and machine learning, designing and implementing production-ready ML pipelines that are reliable, scalable, and automation-driven.
You'll own the end-to-end workflow: from data transformation and pipeline design to model packaging, fine-tuning preparation, and deployment automation.
This is not a Data Scientist or MLOps role. It is a data-focused engineering position for someone who understands ML systems deeply, builds robust data workflows, and develops the platforms that power AI in production.
Role & Responsibilities
Machine Learning Pipelines & Automation:
- Design, deploy, and maintain end-to-end ML pipelines.
- Build data transformation tiers (Bronze, Silver, Gold) to enable structured, reusable workflows.
- Automate retraining, validation, and deployment using Airflow, Kubeflow, or equivalent tools.
- Implement Role-Based Access Control (RBAC) and enterprise-grade authorization.
API & Platform Architecture:
- Develop robust APIs for model serving, metadata access, and pipeline orchestration.
- Participate in platform design and architecture reviews, contributing to scalable ML system blueprints.
- Create monitoring and observability frameworks for performance and reliability.
Cloud & Deployment:
- Deploy pipelines across cloud (AWS, Azure, GCP) and on-prem environments, ensuring scalability and reproducibility.
- Collaborate with DevOps and platform teams to optimize compute, storage, and workflow orchestration.
Collaboration & Integration:
- Work with Data Scientists to productionize models, and with Data Engineers to optimize feature pipelines.
- Integrate Firebase workflows or data event triggers where required.
Preferred Candidate Profile
Experience: 5+ years in Data Engineering or ML Engineering, with a proven track record of:
- building data workflows and ML pipelines
- packaging and deploying models using Docker & CI/CD
- designing platform and API architecture for ML systems
Technical Skills:
- Programming & ML: Python, SQL, scikit-learn, XGBoost, LightGBM
- Data Engineering & Cloud Pipelines: Large-scale preprocessing, containerized ETL (Docker, Airflow, Kubernetes), workflow automation
- Data Streaming & Integration: Apache Kafka, micro-batch and real-time ingestion
- ML Lifecycle & Orchestration: MLflow, CI/CD, DagsHub, Databricks, A/B testing, modular ML system design
- API & Platform Development: FastAPI, Flask, RESTful APIs, architecture planning
- Data Governance, Privacy, Security & Access Control: Schema registry, lineage tracking, secure data handling, audit logging, RBAC
- AutoML & Optimization: PyCaret, H2O.ai, Google AutoML
- Model Monitoring & Automation: Drift detection, retraining workflows, Airflow / Kubeflow automation
Education: Bachelor's or Master's degree in Computer Science, Machine Learning, or Information Systems.
Communication & Collaboration: Translating technical concepts, telling the business story behind the work, and delivering across functions.