Platform Engineer

DeepLight · Dubai

Hybrid: Dubai · Full Time · Mid Level, Senior · Information Technology
Posted 10 months ago

Job description

Responsibilities

  • Design, implement, and maintain secure, scalable, and highly available AWS cloud infrastructure for real-time AI workloads and enterprise applications.
  • Implement Infrastructure as Code (IaC) using tools like Terraform to automate provisioning, deployment, monitoring, and management of cloud infrastructure.
  • Enforce strict security and compliance practices, including identity and access management, data protection, encryption, and vulnerability scanning for AI data pipelines.
  • Monitor, analyze, and optimize system performance, ensuring reliability and operational efficiency for ML model training and inference workloads.
  • Develop and manage CI/CD pipelines to streamline deployments, including integration with MLOps workflows for rapid delivery of AI applications.
  • Collaborate with software engineering, data, and operations teams to gather requirements, implement solutions, troubleshoot, and support the full deployment lifecycle.
  • Create and update detailed documentation for system architecture, deployment workflows, configuration standards, and operational processes.

Requirements

  • Proven experience as a Platform or Infrastructure Engineer (or similar), with a strong track record in cloud infrastructure for scalable applications and AI workloads.
  • Hands-on experience with a variety of cloud platforms and services (compute, storage, networking, identity management, container orchestration) in public cloud, hybrid, or on-prem environments.
  • Proficiency in Docker, Kubernetes, microservices architectures, and distributed systems design.
  • Demonstrated experience in DevOps, including CI/CD pipeline creation, configuration management, and system monitoring (Jenkins, Git, Prometheus, Grafana).
  • Experience with AI/ML infrastructure: provisioning GPU environments, managing ML training pipelines, deploying models in production with automation and MLOps tools.
  • Strong analytical and problem-solving skills, delivering production-ready infrastructure in cross-functional, data-centric teams.
  • Commitment to continuous learning and staying abreast of advances in cloud infrastructure, automation, and AI operations.

Benefits

  • Impact: Work with a dynamic team at the forefront of AI to make meaningful industry and societal impact.
  • Innovation: Participate in cutting-edge projects at the intersection of AI, data engineering, and machine learning.
  • Collaboration: Work with diverse talent, fostering creativity, learning, and professional growth.
  • Opportunity: Access ample opportunities for personal development, advancement, and leadership in a rapidly growing company.
  • Culture: Join a culture of curiosity, excellence, and collaboration where your ideas and contributions are valued.

About the Company

DeepLight is a pioneering AI company leading the way in artificial intelligence innovation. With a mission to harness data and machine learning for industry transformation, DeepLight is driven by a team of experts and a relentless pursuit of excellence in AI R&D.

Skills & tools

AWS, Terraform, Docker, Kubernetes, CI/CD, MLOps, DevOps, Prometheus, Grafana, Jenkins, Cloud Infrastructure, GPU, microservices

What the team is looking for

Use this list as a quick fit check before you apply.

  1. AWS cloud
  2. Terraform
  3. DevOps
  4. Docker/Kubernetes
  5. CI/CD pipelines
  6. AI/ML infrastructure
  7. Security compliance
  8. Documentation