DevOps Engineer – Reinforcement Learning Platforms
We are seeking an experienced DevOps Engineer to help build and scale a web-based platform for reinforcement learning (RL) training and RLOps. You will design, implement, and maintain the cloud infrastructure, CI/CD pipelines, and deployment systems that support large-scale RL workloads.
Responsibilities
• Design and manage scalable cloud infrastructure for high-performance RL training and distributed environments
• Build and optimise CI/CD pipelines for open-source and enterprise components
• Implement containerisation and orchestration using Docker and Kubernetes
• Develop Infrastructure as Code solutions (Terraform, CloudFormation, Pulumi)
• Implement monitoring, logging, and alerting for distributed ML systems
• Collaborate with ML teams on resource optimisation and cost efficiency
• Apply security best practices, manage access controls, and ensure compliance
• Automate operational tasks: backups, disaster recovery, maintenance
• Support GPU clusters and distributed compute resources for RL workloads
• Maintain availability and performance of production ML systems
Requirements
• Degree in Computer Science/Engineering or 3+ years of DevOps/infrastructure experience
• Strong background with AWS, GCP, or Azure, including ML/AI workloads
• Proficiency with Docker, Kubernetes, and ML-focused orchestration
• Experience with Terraform/CloudFormation/Pulumi and configuration management
• Solid understanding of CI/CD tools (GitHub Actions, GitLab CI, Jenkins)
• Knowledge of monitoring/observability tools (Prometheus, Grafana, OpenObserve)
• Experience with GPU infrastructure and distributed ML compute frameworks
• Familiarity with MLOps tools and model lifecycle management
• Strong scripting skills (Python, Bash)
• Understanding of cloud networking, security, and database fundamentals
• Experience with HPC environments or schedulers is a plus
• Strong problem-solving and communication skills
Compensation & Benefits
• Competitive salary and stock options
• 30 days’ holiday plus bank holidays
• Flexible and remote working options
• Enhanced parental leave
• £500 annual learning and development budget
• Pension scheme
• Regular socials and quarterly gatherings
• Bike-to-Work scheme