AI Platform Engineer

DeepLight

Posted 17 hours ago

Employment Type

Full Time

Location

Abu Dhabi

Job Description

Responsibilities

  • Design, deploy and operate on-premises Kubernetes clusters tailored for AI/ML workloads
  • Build and maintain infrastructure-as-code (Terraform, Helm, Ansible) for reproducible environments
  • Implement GPU scheduling, resource management and scaling strategies for large-model training and inference
  • Develop and optimise AI/ML pipelines and MLOps workflows for training, deployment and monitoring
  • Manage containerisation, networking and storage solutions to meet performance and reliability goals
  • Implement observability, monitoring and logging (Prometheus, Grafana, ELK) for platform health and cost control
  • Automate operational processes and CI/CD for model lifecycle and platform components
  • Collaborate with data scientists and ML engineers to deliver high-performance compute environments
  • Ensure platform security, compliance and operational reliability in an on-prem environment

Requirements

  • On-prem Kubernetes
  • Infrastructure as code
  • GPU scheduling
  • Container orchestration
  • CI/CD automation
  • Linux administration
  • Docker experience
  • MLOps workflows
  • Monitoring & logging
  • Python / Bash scripting
  • Networking fundamentals
  • Consultancy experience

Preferred Qualifications

  • Hands-on experience with TensorFlow, PyTorch, Hugging Face, Ray or Kubeflow
  • Proven track record supporting large-model training and inference workloads
  • Familiarity with storage solutions for high-throughput GPU workloads (NVMe, Ceph, NFS)
  • Experience with cluster capacity planning and cost optimisation for on-prem AI infrastructure
  • Strong understanding of security best practices for enterprise AI platforms
  • Previous experience delivering platform solutions in a consultancy or client-facing role

Benefits

  • Competitive salary and performance bonuses
  • Comprehensive health insurance coverage
  • Professional development and certification support
  • Annual leave and paid leave provisions
  • International exposure and travel opportunities (flights)
  • Flexible working arrangements (hybrid)
  • Career advancement opportunities in a growing AI consultancy

About the Company

DeepLight is a specialist AI and data consultancy delivering intelligent enterprise systems across multiple industries, with particular depth in the financial services and environmental sectors. We combine expertise in data science, statistical modelling, AI/ML technologies, workflow automation and systems integration to implement practical, high-impact solutions for clients. This role is based in Abu Dhabi and focuses on building robust, secure on-prem AI platforms that support large-scale ML and generative AI workloads.
