We’re looking for a Site Reliability Engineer to help us accomplish our mission to improve and extend lives by learning from the experience of every person with cancer. Are you ready to be the next changemaker in cancer care?
Flatiron Health is a healthtech company using data for good to power smarter care for every person with cancer, around the world. Flatiron partners with cancer centers in the US, Europe and Asia to transform patients’ real-life experiences into real-world evidence and create a more modern, connected oncology ecosystem. Our multidisciplinary teams include oncologists, data scientists, software engineers, epidemiologists, product experts and more. Flatiron Health is an independent affiliate of the Roche Group.
What You’ll Do
We’re seeking a Site Reliability Engineer with a data engineering focus to help architect, build, and maintain reliable, scalable pipelines that form the foundation of our standardized data platform. You’ll partner closely with platform and data engineering teams to ensure that systems are robust, observable, and built with operational excellence in mind.
As an SRE embedded in the data space, you will:
- Design and build reliable, scalable, and maintainable data pipelines using Databricks, Airflow, and GitLab CI/CD
- Implement SLOs, SLIs, and monitoring for data pipeline health, latency, throughput, and quality
- Establish best practices and standards for data infrastructure (e.g. versioning, testing, rollout)
- Develop automation and leverage best-in-class AI tooling to reduce toil associated with orchestration, error detection, retries, and alerting
- Enable secure and scalable data access patterns and controls for data stores like Snowflake and Amazon RDS
- Collaborate with platform and application teams to guide infrastructure decisions for data workflows
- Optimize cost, reliability, and performance of data compute and storage layers in cloud environments (e.g. AWS)
- Participate in on-call rotations for the data platform stack and contribute to shared incident response processes
Who You Are
You’re an engineer with 5+ years of experience working in DevOps, platform engineering, data engineering, or SRE roles. You have a passion for building reliable systems and the curiosity to work at the intersection of data and infrastructure.
- Strong experience with workflow orchestration tools (Airflow preferred)
- Familiarity with Databricks, Spark, or other distributed compute platforms
- Experience building and maintaining CI/CD pipelines (GitLab preferred); experience with data transformation projects (e.g. using dbt) is a plus
- Competency in writing infrastructure-as-code (e.g. Terraform, Ansible)
- Proficient in Python or another scripting language used in data automation
- Experience designing and monitoring data SLAs and building systems that are observable and testable
- Familiar with cloud environments like AWS, including services like S3, IAM, EC2, and EMR
- Proficient with containerized deployments (e.g. using Docker); Kubernetes is a plus
- You value high-quality documentation and operational runbooks
- You’re collaborative, have strong written and verbal communication skills, and care about building systems that work for both engineers and the business
If this sounds like you, you’ll fit right in at Flatiron.
Who We Are
Our people are at the center of everything we do. We strive to foster a culture where our teammates feel equipped and empowered to make meaningful contributions with confidence, compassion, and clarity. Visit the Life at Flatiron page to learn more.