DevOps Engineer, ZOO Digital
Mar 2023 - Present
At ZOO, I have been at the forefront of a new observability implementation built around industry-leading systems like Loki, Prometheus and CloudWatch. I help the business identify and resolve its performance issues, and I maintain the uptime and stability of its software stacks. This has involved heavy use of AWS resources, from EC2 to RDS and SQS, in a heavily automated GitOps environment, working closely with both developers and AWS to deliver solutions that meet development needs.
Extensive use of AWS services (including EC2, EKS, ECS, RDS and SQS) to provide a stable experience to customers.
Extensive use of TeamCity to build, image and deploy development teams' products for customers like Disney, Paramount and HBO.
Implemented the Grafana stack, including Prometheus (Mimir), Loki and Grafana Agent, to centralize and fill gaps in instrumentation from servers, containers and applications, ingesting, aggregating and alerting on gigabytes of log and metrics data per day.
Supported Django applications of varying sizes, from small add-on systems and internal tooling to large, high-traffic end-user applications like ZOOstudio and ZOOcore.