AI Platform Engineer
DeepLight
Posted 17 hours ago
Employment Type
Full Time
Location
Abu Dhabi
Job Description
Responsibilities
- Design, deploy and operate on-premises Kubernetes clusters tailored for AI/ML workloads
- Build and maintain infrastructure-as-code (Terraform, Helm, Ansible) for reproducible environments
- Implement GPU scheduling, resource management and scaling strategies for large-model training and inference
- Develop and optimise AI/ML pipelines and MLOps workflows for training, deployment and monitoring
- Manage containerisation, networking and storage solutions to meet performance and reliability goals
- Implement observability, monitoring and logging (Prometheus, Grafana, ELK) for platform health and cost control
- Automate operational processes and CI/CD for model lifecycle and platform components
- Collaborate with data scientists and ML engineers to deliver high-performance compute environments
- Ensure platform security, compliance and operational reliability in an on-prem environment
Requirements
- On-prem Kubernetes
- Infrastructure as code
- GPU scheduling
- Container orchestration
- CI/CD automation
- Linux administration
- Docker experience
- MLOps workflows
- Monitoring & logging
- Python / Bash scripting
- Networking fundamentals
- Consultancy experience
Preferred Qualifications
- Hands-on experience with TensorFlow, PyTorch, Hugging Face, Ray or Kubeflow
- Proven track record supporting large-model training and inference workloads
- Familiarity with storage solutions for high-throughput GPU workloads (NVMe, Ceph, NFS)
- Experience with cluster capacity planning and cost optimisation for on-prem AI infrastructure
- Strong understanding of security best practices for enterprise AI platforms
- Previous experience delivering platform solutions in a consultancy or client-facing role
Benefits
- Competitive salary and performance bonuses
- Comprehensive health insurance coverage
- Professional development and certification support
- Annual leave and paid leave provisions
- International exposure and travel opportunities (flights)
- Flexible working arrangements (hybrid)
- Career advancement opportunities in a growing AI consultancy
About the Company
DeepLight is a specialist AI and data consultancy delivering intelligent enterprise systems across multiple industries, with particular depth in the financial services and environmental sectors. We combine expertise in data science, statistical modelling, AI/ML technologies, workflow automation and systems integration to implement practical, high-impact solutions for clients. This role is based in Abu Dhabi and focuses on building robust, secure on-prem AI platforms for large-scale ML and generative AI workloads.