Mark Sharpley

Head of Platforms


Background


About

Experience across HPC and Cloud Computing. Results-driven, with a focus on Agile and DevOps culture transformation. Primarily works with AWS and Kubernetes. Passionate about Open Source and Cloud Native Computing.

Work Experience

  • Head of Platforms, StatsBomb

    Oct, 2023 - Present

    Responsible for infrastructure and developer platforms across the business, including our application, AI and data science platforms

    • Leading multiple teams of platform engineers: the Infrastructure Platform, Data Platform and IT teams

    • IC and lead on our Kubernetes and AWS strategy

    • Modularisation of our platforms, enabling us to scale and grow

    • Focused on creating tooling and abstractions to reduce developer friction

    • Working with the CTO and wider engineering division to create a vision for the future of our platforms

    • Brought in tech talks to help with culture change and learning

  • Principal Platform Engineer, StatsBomb

    Jun, 2022 - Sep, 2023 (1 year 3 months)

    Principal Engineer on StatsBomb's Infrastructure Platform

    • Principal engineer on the Platform Team

    • Daily IC on our codebase, working with other engineers and our customers to deliver value

    • Reducing technical debt across the platform and business

    • Driving cost effectiveness across our cloud environments

  • Senior Tech Lead, AstraZeneca

    Oct, 2020 - Jun, 2022 (1 year 8 months)

    Working on AstraZeneca's Scientific Compute Platform. Designing and building out a global compute infrastructure spanning multiple private clouds, using Cloud Native methodologies and tooling

    • Driving our team transformation, creating patterns and processes that can scale to a team of 25+ engineers working in an agile way

    • Working daily with Openstack (Kolla and Kayobe), Terraform, K8S, Nomad, Consul, Packer and Github Actions

    • Enabling my team and wider AZ colleagues by helping to drive forward a vision of using a blend of immutable infrastructure and container orchestration to help meet the needs of our business

    • Migrated the Ansible-powered LXD control plane to Openstack and Nomad

    • Led the integration of NVIDIA DGX

    • Brought IAC and Test Driven Development into the team culture

    • Introduced GitOps and cloud native CI/CD to the team

  • Research Computing Specialist, University Of Cambridge

    Oct, 2018 - Oct, 2020 (2 years)

    Research Computing Services at the University of Cambridge

    • Lead engineer for the ISO27001 certified Secure Research Computing Platform (SRCP)

    • Delivered and supported the SRCP using IAC, TDD and End to End testing, CI/CD, Agile and DevOps methodologies

    • Helped to bring our main SLURM/Infiniband clusters into Openstack, and developed pipelines and management processes using Cloud Native tooling and Agile/SCRUM

    • Tooling used - Terraform, Ansible, Gitlab, Consul, Python

    • Worked daily with a four petaflop SLURM cluster, Infiniband, Lustre, Openstack, Ceph, Vault, Consul, Nomad, K8S, Docker, Singularity, Prometheus, Grafana, Gitlab, Jira, Confluence, Slack

  • Computing Officer, Stem Cell Institute

    Feb, 2017 - Oct, 2018 (1 year 8 months)

    Responsible for the IT estate for the Stem Cell Institute, including their bioinformatics infrastructure. Managed and developed their SLURM/Infiniband cluster. Line managed a team of technicians.

    • Migrated to Ansible for configuration management of the compute and storage

    • Implemented Prometheus/Grafana for increased observability of the SLURM cluster/storage

    • Installed Singularity and trained users on it to help with bioinformatics software requests

    • End user support and training on Linux/SLURM

    • Containerised legacy locally hosted websites and databases using LXC and Ansible

  • Network Manager, Monk's Walk School

    Sep, 2015 - Feb, 2017 (1 year 5 months)

    Managed the IT Support Team and IT estate for 1750 users.

    • Designed and replaced the whole network, mostly in-house, from fibre through to a new Windows Domain running on Hyper-V with CSV storage

    • Migrated Xen hypervisors and HP SAN to Hyper-V and ReFS

    • Migrated 1500 users from On Premise Exchange to Office365

    • Built a new Windows Domain and migrated estate onto it

    • Automated image build process for Windows using MDT, Chocolatey and PowerShell

    • Implemented ticketing and documentation processes

    • Automated user management and onboarding through PowerShell

Projects Experience

  • StatsBomb Platform Consolidation

    Jan, 2024 - Jan, 2024 (1 day)

    Consolidation of multiple cloud environments and platforms

    • Leading multiple teams of engineers to create a new platform that will consolidate multiple cloud environments and platforms into a modular, loosely coupled codebase

    • Roll out of SSO and RBAC improvements across the business

    • Improvements to our IAC and CI/CD pipelines, consolidation onto ArgoCD

    • Creation of Golang glue microservices, Crossplane providers and K8S operators to facilitate this

    • Observability improvements across the platform, using Tempo, Loki, promstack and Grafana

  • Evolution of the Scientific Compute Platform at AstraZeneca

    Oct, 2020 - Jun, 2022 (1 year 8 months)

    Creation of a global Scientific Compute Platform

    • We started with a global platform made up of multiple private clouds and SLURM clusters, with the aim of consolidating it into a single global platform with common tooling and patterns. I was a Senior Engineer on this project, working on its architecture and implementation. The work was hugely varied: I implemented a multi-region HashiCorp Vault cluster service, brought in TDD and IAC beyond Ansible, taught engineers how to build and test images, worked on the migration from Canonical Openstack and LXD to Kolla-based Openstack and Nomad/K8S, and rearchitected the software module pipeline.

  • ISO27001 Secure Research Platform

    Oct, 2018 - Oct, 2020 (2 years)

    Openstack based trusted research environments for the University of Cambridge

    • I was the lead engineer on the ISO27001 certified Secure Research Computing Platform, responsible for its architecture, implementation and support. The platform was built using IAC, TDD, CI/CD, Agile and DevOps methodologies, all of which were implemented within the team from scratch. It was built on Openstack, with SLURM and Infiniband clusters, and was used by researchers across the University of Cambridge, particularly during the Covid pandemic.

Skills

  • Tooling

    Terraform

    Terragrunt

    Crossplane

    ArgoCD

    OIDC/OAuth2

    Argo Workflows

    Cilium

    Istio

    Kubernetes

    SLURM

    Golang

    Python

    Vault

  • Clouds

    AWS

    Google Cloud

    Openstack

  • Soft Skills

    Collaboration

    Communication

    Curiosity

Education

  • Geography, Bachelors, The University of Hertfordshire

    Dec, 2004 - Feb, 2009

Interests

  • Endurance sports events

  • Current Affairs

  • Technology

  • Open Source