Alex Howard Whitaker

Senior Staff Cloud Engineer

London, UK
English

Background


About

Alex is a technically focused staff-level cloud engineer with over a decade of experience delivering organisation-level impact on AWS cloud usage within a large enterprise. He has led delivery of fully managed, mature, well-architected cloud solutions that abstract complexity and empower developers, while also owning trade-offs and meeting the broader business goals of cost management, compliance and security at a FTSE-250 technology company.

Work Experience

  • Senior Staff Cloud Engineer, Ocado Group

    Feb, 2020 - Present

    In 2020 I was promoted to a cross-team role at the staff engineer level. I report to the Engineering manager for three AWS cloud teams. My time is split between a 'home team' and strategic initiatives to impact the business's AWS cloud usage and support the growing number of developer platform teams. I have a remit to decide architectural principles through RFCs, identify potential points of risk and failure within the teams and challenge local decisions when they stray from our high-level goals and culture.

    • Reduced costs and support effort by migrating our on-premises robotic control system to the Cloud. Video about this: https://www.youtube.com/watch?v=hxgo_CdRF5k

    • Reduced latency, saved compute costs and raised the quality of data used for Ecommerce conversion metrics by implementing a bot control system using AWS Web Application Firewall. This required close collaboration with senior Ecommerce engineers to integrate with the webshop front-end. This feature blocks over 500k requests a day. In areas where latency and DDoS attacks are a major business risk, I introduced guard rails around rule changes so that they would not block legitimate users. Live production traffic was impossible to replicate in test environments, so I required rule changes to be analysed in 'count' mode before they were applied, accepting the trade-off of slower change releases.

    • Matured our security posture by helping launch a new team to implement security tooling and best practices on AWS for the whole organisation

    • Maintain our security posture as the single-threaded owner of the company's overall AWS Cloud security architecture, spanning >150 AWS accounts and 17 regions, with a seven-figure cloud spend and differing regulatory requirements for our 11 international partners.

    • Set our AWS identity strategy to implement change management policies, reduce the blast radius of our teams' permissions, and enforce separation of duties and production controls

    • Helped teams address security faults by building a security issue management system that aggregates vulnerabilities and security findings in AWS Security Hub and raises JIRA tickets with teams to resolve issues

    • Improved reliability and reduced cost by migrating a business stream (~20 teams) from an on-premises Kubernetes solution to AWS

    • Gathered learning from incidents by running 'post-incident retrospectives' that examine failure handling and track improvements

    • Align cloud teams' architectures by writing RFCs, taking part in design reviews and technical governance for my area

    • Help development teams innovate their tech stack by evaluating newly released Cloud products to see how they can fit into our existing standards for maturity, stability and maintainability.

    • Drive PCI DSS 4.0 compliance for cloud teams by collecting and standardising requirements and evidence of compliance for processing card payments as a merchant and service provider, and demonstrating these requirements to our third-party auditor. PCI DSS is a strong influence on our security architecture for the handling of sensitive data, systemic access control, least-privilege enforcement and maintaining non-repudiable audit trails. As such, many of these requirements must be turned into product capabilities on our roadmap.

  • Cloud Engineer, Ocado Group

    Sep, 2014 - Feb, 2020 (5 years 5 months)

    In 2014 I entered a secondment to the newly created Cloud services team, the first cloud team in the company, and was responsible for building all Cloud infrastructure such as VPCs, ingress proxy, identity, backups and self-service developer applications for creating databases and requesting temporary access to applications. I joined the team permanently after my secondment, and it has since split into four different teams as our Cloud portfolio has grown. This role was primarily focused on developing, deploying and maintaining specific pieces of Cloud infrastructure that we ran on behalf of developers. This matured into a self-service developer platform as our number of users grew.

    • Participated in the team from the very start of our cloud migration journey, including the launch of our first international partner, Bonpreu in Barcelona, running entirely on AWS.

    • Created a new Python-based AWS identity management system that stored permissions for SSO roles and granted them to groups of users synchronised with Azure SSO and Active Directory.

    • Built a CloudFormation-based CI/CD pipeline for managing infrastructure-as-code in >100 AWS accounts

    • Deployed an Nginx-based ingress proxy to securely provide access to applications on our platform

    • Architected and developed an authentication system in Lua to authenticate developers connecting their apps from outside the Cloud

    • Built Django applications to allow developers to self-serve provisioning of their cloud resources.

    • Managed Elasticsearch, Logstash and Kibana-based logging clusters for developers

    • Served as interim team leader during a period of expansion and turnover within the department.

  • Application Support Engineer, Ocado Group

    Jan, 2013 - Sep, 2014 (1 year 8 months)

    Joined Ocado as a 3rd-line support engineer, primarily for Ecommerce applications written in Java, supporting Tomcat web apps and Oracle databases. This role was highly operations-focused, with about 30% of time reserved for development and improvement.

    • Upgraded the previous manual, static configuration and deployment system, based on files and bash scripts, to an automated, declarative, Puppet-based configuration and deployment system.

    • Worked with development teams to understand their needs and build effective monitoring and housekeeping systems for live operation stability

    • Ensured the Operations team had up-to-date technical documentation for supporting and maintaining business systems.

    • Managed scheduled systems shut-downs for infrastructure upgrades.

    • Worked with infrastructure teams to provide developers with newly procured application servers for development and production.

    • Participated in the 24/7 on-call rota for supporting production systems.

Projects Experience

  • Hippolyte

    Mar, 2017 - Nov, 2017 (8 months)

    Open-source, at-scale, point-in-time backup solution for DynamoDB

    • Created an automated DynamoDB backup solution designed to handle frequent, recurring backups of large numbers of tables, scale read throughput, and batch together backup jobs over multiple EMR clusters.

    • Since no native DynamoDB backup solution existed, the solution was designed to de-risk migration from on-premises relational databases

    • Built on AWS Lambda using the Serverless Framework

Skills

  • AWS - Core Strengths

    EC2

    ECS

    Lambda

    S3

    IAM

    IAM Identity Center

    CloudFront

    AWS WAF

    Inspector

    AWS Config

    AWS Organizations

  • AWS - Familiarity

    DynamoDB

    RDS

    SNS

    SQS

  • Infrastructure-as-Code

    CloudFormation

    AWS CDK

    GitLab CI

    CDKTF

  • Development

    Python

    Django

    Nginx

    Git

    Docker

    Bash

    Linux

Education

  • Mathematics, Bachelor of Science, The University of Birmingham

    Sep, 2009 - Jun, 2012