Thomas Hudak

Senior Director of Operations | Optum

Minneapolis, MN, US
English

Background


About

I’m a senior engineering leader operating at Fortune 4 scale, delivering zero-downtime, real-time data systems for mission-critical healthcare. I excel at orchestrating globally distributed teams, championing DevOps/SRE practices, and leveraging automation, containerization, and rigorous compliance frameworks. My track record includes engineering solutions for tens of millions of daily transactions, swiftly integrating new technologies into production. Above all, I focus on building collaborative, high-performing cultures that drive tangible results in the healthcare ecosystem.

Work Experience

  • Senior Director of Platform Operations, Optum

    Aug, 2024 - Present

    Driving innovation and operational excellence at Optum, leading the transformation of global platform operations with a focus on Critical Business Applications (CBAs), including our Epic EMR instances and Exadata platforms for our Rx business. Trusted subject matter expert in cloud architecture, DevOps, SRE, and platform modernization, delivering scalable, secure, and reliable solutions for mission-critical systems across a global enterprise. Consistently deliver results in a zero-margin-for-error scenario, ensuring continuity of vital healthcare claims processing for millions of Americans while working directly with government and executive leadership to provide recovery updates and ensure transparency.

    • Spearheaded the recovery of the largest claims processing system in the world, supporting over $1.4 trillion in annual transactions and processing 30-50% of all healthcare claims in the United States, after a catastrophic system-wide failure following a major cybersecurity incident.

    • Tier-0 Reliability & Real-Time Scale: Directed multi-region Kafka clusters handling 30–50% of U.S. healthcare claims, achieving 99.999% availability in mission-critical, 24/7 environments.

    • Global Team Leadership: Managed a team of 120+ engineers spanning 4 continents, implementing uniform DevOps and SRE standards to ensure consistent performance, compliance, and security across all geographies.

    • Infrastructure-as-Code & Automation: Spearheaded extensive IaC and CI/CD pipelines, driving a 60% reduction in provisioning time and enabling rapid service rollouts without compromising on security or reliability.

    • Regulatory & Security Excellence: Navigated HIPAA/HITRUST compliance at scale, establishing robust governance models and zero-trust network architectures, enabling continuous audits and frictionless re-certifications.

    • High-Impact Recoveries: Executed the swift recovery of the world’s largest healthcare claims processing system post-cybersecurity incident, providing daily updates to executive leadership and federal agencies.

    • Innovative Data Approaches: Pioneered real-time analytics solutions on top of Kafka streams (KSQL, event-driven microservices) to enhance operational visibility, reduce mean time to detect (MTTD), and improve patient care workflows.

    • Infrastructure Automation & Standardization: Pioneering Infrastructure as Code (IaC) practices and CI/CD pipelines to ensure predictable, repeatable, and scalable deployments across complex environments.

    • Resilient Network & Security Architectures: Designing zero-trust, fault-tolerant network frameworks with robust access controls, enabling secure operations across global platforms.

    • Operational Governance & Reliability Engineering: Establishing gold-standard practices for platform observability, proactive incident management, and long-term reliability in Tier-0 healthcare systems.

    • Cross-Functional Collaboration & Stakeholder Engagement: Acting as a bridge between engineering, operations, and executive leadership, fostering alignment, shared goals, and a culture of innovation.

  • Senior Director of SRE - OmniChannel, Optum

    Jun, 2022 - Nov, 2024 (2 years 5 months)

    Led the flagship OmniChannel platform for a Fortune 4 healthcare enterprise, overseeing one of the world's largest call center operations with zero tolerance for downtime. Focused on scaling real-time data flows, generative AI, and predictive analytics to deliver a seamless experience for tens of millions of users.

    • Oversaw a globally distributed SRE team responsible for >25k servers, ensuring continuous operation across peak load (50k+ live agents, 2B+ talk minutes annually).

    • Implemented advanced AI solutions (LLMs, predictive analytics) to reduce average call-handling time by 15%, contributing to higher member satisfaction and faster resolution rates.

    • Built automation frameworks with Ansible/Chef/HashiCorp tools, cutting deployment lead time by 70% and standardizing release workflows across hundreds of enterprise apps.

    • Established comprehensive performance testing strategies for real-time voice, chat, and virtual assistant solutions, reducing platform instability incidents by 40%.

    • Championed cross-functional collaboration with application, security, and business teams, ensuring regulatory compliance and industry-leading reliability metrics.

  • Senior Director of Site Reliability Engineering, Optum

    Sep, 2021 - Jun, 2022 (9 months)

    Elevated the enterprise SRE ecosystem by introducing modern infrastructure-as-code principles and an 'Everything as Code' philosophy. Focused on building robust operational frameworks for highly regulated, zero-downtime healthcare applications used by millions of providers and patients daily.

    • Integrated CI/CD pipelines across 80+ global engineering teams, increasing deployment frequency by 4x while reducing mean time to recovery (MTTR) by 35%.

    • Directed a large-scale rollout of observability tooling (Prometheus, Grafana, Splunk), enabling proactive anomaly detection and accelerating incident resolution.

    • Authored an internal SRE 'playbook' that standardized on-call protocols, postmortem best practices, and capacity planning across critical healthcare services.

    • Coached over 200 titled SRE engineers in DevOps/SRE methods, fostering a culture of continuous learning, automation, and operational excellence.

    • Delivered on-time reliability targets for Tier-0 business applications, maintaining sub-5-minute RTO/RPO under strict SLAs.

  • Senior Director of Infrastructure Engineering, UnitedHealth Group

    Jun, 2019 - Sep, 2021 (2 years 3 months)

    Spearheaded a transformational shift at OptumLabs, scaling massive data processing infrastructures for AI/ML workloads. Oversaw Kafka, MongoDB, and Kubernetes clusters supporting multi-petabyte data sets, delivering sub-second latency for mission-critical healthcare analytics.

    • Created an Infrastructure Factory Abstraction Layer, reducing environment provisioning time by 80% and streamlining multi-cloud adoption for dozens of research teams.

    • Managed 8PB SAN with Kafka/MongoDB pipelines, processing 50M+ records/day for real-time healthcare analytics with near-zero data loss tolerance.

    • Drove AI/ML data pipelines to integrate advanced matching algorithms for patient records, achieving 99.99% accuracy in medical record correlation.

    • Designed custom Kubernetes clusters for high-throughput AI workloads, enabling auto-scaling and cost optimizations across public and private cloud environments.

    • Implemented best-in-class CI/CD and automation tooling, delivering consistent environment builds and improving productivity of 300+ developers.

  • DevOps Director, Optum

    Jun, 2017 - Jun, 2019 (2 years)

    Pioneered the DevOps modernization of the Health Services ID (HSID) system, enabling frictionless access for over 80 million users. Oversaw microservices adoption and CI/CD transformations that drastically cut deployment cycles and boosted application reliability.

    • Orchestrated a ground-up rewrite of the HSID platform, eliminating legacy bottlenecks and scaling to support ~80M user authentications daily.

    • Introduced containerization (Docker/Kubernetes) and integrated robust CI/CD pipelines, achieving zero-downtime deployments and 5x faster delivery.

    • Reduced incidents related to access management by 50% through real-time monitoring, proactive alerting, and dev-owned runbooks.

    • Fostered a DevOps mindset across cross-functional teams—empowering developers to take ownership of end-to-end delivery.

    • Transitioned critical systems to microservices, aligning with future cloud-native strategies and improving overall system resilience.

  • Senior Platform Engineer, Best Buy

    Jul, 2015 - May, 2017 (1 year 10 months)

    Infrastructure Architect and Senior Platform Engineer overseeing the design, management, and automation of Unix platforms, VMware virtualization, Storage, Network, and middleware hosting platforms in Best Buy's Enterprise Technology group. Directly supported the global fleet of Best Buy Corporate systems for all mission-critical applications.

    • Implemented a global configuration management solution across Unix fleets.

    • Defined the first Kubernetes offering on Red Hat OpenShift for production containerized apps.

    • Delivered a talk at Red Hat Summit (May 2017) on these innovations.

  • Senior System Engineer, SPS Commerce

    Oct, 2014 - Jul, 2015 (9 months)

    Defined administration and management standards for data center and cloud operations, collaborating with development teams to implement DevOps practices. Led the migration of platforms from private data centers to AWS, establishing CI/CD pipelines and early automation frameworks to support application scalability.

    • Migrated on-prem environments to AWS for improved scalability.

    • Developed CI/CD pipelines for automated AWS resource provisioning.

    • Collaborated with scrum teams to deliver DevOps-based solutions.

  • Senior Platform Engineer, Accenture

    Jan, 2013 - Oct, 2014 (1 year 9 months)

    Sr. Platform Engineer supporting Accenture’s Unix/Linux and Oracle VM, Solaris LDOM, and VMware ESX environments. Primarily responsible for lifecycle management, security, and administration of hosting platforms serving a wide client portfolio.

    • Managed large-scale Unix/Linux environments for enterprise clients.

    • Ensured security and compliance across Oracle VM and Solaris LDOM infrastructures.

    • Optimized VMware ESX environments for high availability and performance.

  • Global Unix Team Lead, Senior Unix Engineer, Accenture

    Nov, 2011 - Jan, 2013 (1 year 2 months)

    Oversaw Best Buy’s global Unix environment, comprising approximately 1600 mixed physical and virtual servers (x86, SPARC, VMware). Managed HP and Oracle physical servers and VMware virtualization, including clustered solutions for core enterprise applications. Served as DevOps SME for mobile and .com teams, focusing on automation, virtualization, and cloud strategies.

    • Managed large-scale Unix systems (x86, SPARC, VMware) for enterprise applications.

    • Implemented DevOps best practices for the Mobile and .com teams.

    • Drove automation and virtualization initiatives, enhancing operational efficiency.

  • Operations Engineer, Code 42 Software, Inc.

    Sep, 2010 - Apr, 2011 (7 months)

    Code 42 Software's primary product is CrashPlan, an online backup platform supporting hundreds of thousands of users across five international data centers. As one of three Operations team members, responsibilities included data center logistics, capacity management, enterprise appliance/infrastructure builds, and systems integration.

    • Maintained petabytes of user data storage across multiple global data centers.

    • Coordinated capacity management and logistics for rapid growth and scaling.

    • Implemented enterprise appliance integrations to streamline operations.

  • Senior Technical Analyst, Ameriprise Financial Services, Inc.

    Aug, 2009 - Sep, 2010 (1 year 1 month)

    Part of the Hosting and Infrastructure Services group, supporting Unix platforms (P series, AIX, Linux) and distributed technologies. Provided expert guidance to project and application delivery teams, ensuring seamless integration across network, security, infrastructure, and storage touchpoints.

    • Facilitated smooth cross-team integration for critical business projects.

    • Ensured Unix/AIX/Linux environments met security and reliability standards.

    • Optimized infrastructure to reduce operational risks.

  • IT Manager, Lead Network/System Administrator, Yugma Inc.

    Jun, 2007 - Apr, 2009 (1 year 10 months)

    Oversaw all corporate IT systems, vendor engagements, and service providers. Served as the lead system and network administrator for internal, remote office, and production environments. Also developed global call center systems and customer support platforms for web conferencing, and executed a cost-saving data center migration project.

    • Managed day-to-day IT operations for corporate and remote sites.

    • Developed global call center infrastructure and customer support solutions.

    • Led a two-week data center migration project, reducing hosting costs by $20k/month.

  • Network Administrator, Wilson Learning

    Jun, 2006 - Feb, 2007 (8 months)

    Network administrator for a multi-national organization of business education professionals, responsible for maintaining and optimizing network systems across multiple locations.

    • Maintained network connectivity and security across international offices.

    • Implemented routine upgrades and patches for network infrastructure.

    • Provided technical support to remote and local users.

  • Senior Technology Advisor, Escape Key Computing

    Feb, 2006 - Feb, 2007 (1 year)

    Senior technology decision-maker for client implementations and internal services. Oversaw technology strategy and advised clients on best-fit solutions. Escape Key acquired 1337 Consulting in 2006.

    • Guided clients on technology stack choices, security, and scalability.

    • Coordinated with internal teams to align services with client needs.

    • Ensured smooth transition during acquisition of 1337 Consulting.

  • President/CEO, 1337 Consulting LLC

    Apr, 2003 - Apr, 2006 (3 years)

    Founded 1337 Consulting to provide affordable network and IT consulting services to small/medium businesses. Specialized in wireless networking, Active Directory, Exchange, Windows PC, and Linux server support. Also developed an event hosting service offering on-site networking, meeting hosting, logistics, and audience response systems.

    • Established a successful consultancy serving SMB clients in the Twin Cities.

    • Implemented wireless networking and open-source solutions for diverse needs.

    • Provided event hosting and on-site networking for medium/large corporate events.

  • Co Founder, TCOS

    Dec, 2001 - Dec, 2002 (1 year)

    Co-founded a hosting and service provider focused on P2P WISP and consultancy. Delivered wireless internet services and advisory services in a rapidly evolving connectivity market.

    • Pioneered early P2P WISP solutions for local communities.

    • Provided consultancy on network design and hosting infrastructures.

    • Built robust wireless coverage in underserved areas.

  • Systems Administrator, Sistina Software

    Dec, 2000 - Dec, 2001 (1 year)

    Systems administrator and storage engineer supporting Global File System (GFS) and Linux Logical Volume Manager (LVM) development. Sistina was an original open-source company of the late ’90s, later acquired by Red Hat in 2003. LVM is now widely used in many Linux distributions worldwide.

    • Supported development of GFS and LVM, key open-source technologies.

    • Managed storage hardware integrations for large-scale Linux environments.

    • Collaborated with engineers to optimize GFS/LVM for enterprise use.

  • Radio / Network Engineer, Baldeagle.com Internet Services

    Aug, 1997 - Dec, 2000 (3 years 4 months)

    Responsible for systems, network, and radio-based packet network design, deployment, and management. Played a pivotal role in establishing reliable radio networking solutions.

    • Deployed and managed radio-based packet networks across various regions.

    • Integrated network solutions for rural and hard-to-reach areas.

    • Maintained stable connectivity and performed regular system checks.

  • Desktop / Network Support, ITI Security Systems

    Dec, 1996 - Dec, 1997 (1 year)

    Handled desktop and network support for a security systems provider, ensuring reliability and uptime for mission-critical applications and user environments.

    • Troubleshot desktop and network issues for on-site and remote staff.

    • Maintained hardware/software inventories and performed routine maintenance.

    • Collaborated with security specialists to support system integrations.

Skills

  • Cloud & Infrastructure

    AWS

    Azure

    GCP

    Container Orchestration (Kubernetes, OpenShift)

    Serverless Platforms

  • Automation & DevOps

    Infrastructure as Code (Terraform, Ansible)

    CI/CD (Jenkins, GitLab)

    Observability (Splunk, Prometheus)

    GitOps & Version Control

  • Distributed Systems & Data

    Kafka

    High-Throughput Message Queues

    Relational & NoSQL Databases

    Stream Processing

    Event-Driven Architectures

  • Security & Compliance

    HIPAA/HITRUST

    Zero-Trust Frameworks

    Advanced IAM

    Regulatory Adherence

    Data Privacy

  • Leadership & Strategy

    Global Team Building

    Cost Optimization

    Regulatory Alignment

    Cross-Functional Collaboration

    Executive Stakeholder Engagement

  • Rapid Technology Adoption

    Mastering Unfamiliar Tools Quickly

    Driving Enterprise-Scale Implementations

    Adaptability

    Continuous Learning

Volunteer Work

  • Board Member, El Sistema Minnesota

    Sep, 2011 - Jan, 2016

    Acted as an unflinching advocate for the kids.

References

  • Luke Beard

    Tom Hudak is the genuine article. His technical scope is beyond any I've encountered in my career. Additionally, Tom is an excellent manager who listens with the intent to understand, which is unfortunately rare.

  • Nathan Dornquast

    Tom has great energy. He's a passionate enthusiast who really knows his stuff. Too many skills to list, but exceptional experience with Linux, networking as well as hardware. When you're in trouble and sitting in your foxhole, Tom's a good guy to have with you.

  • Robert Terhaar

    Working with Tom was mind-blowing. I have yet to work with another person who is as deeply knowledgeable and passionate as he is. Tom is a unique person who excels at being highly specialized and also very generalized, as he understands the microscopic technical details and how the details matter from the larger business perspective.

  • Jonah Cagley

    Tom is extremely knowledgeable regarding design and administration of networked systems and is adept at implementing solutions to mitigate risks and exposures. He is results-oriented and is able to work successfully with cross-functional team members with a variety of personalities and knowledge levels. Tom has an impressive ability to explain complex issues and determine clear, concise resolution(s). In addition, Tom is a tireless champion of new technology and embraces the use of any application designed to render better results and save an organization money. He was able to implement a multitude of new, cost-saving tools into the workflow and processes at Yugma with great results. Tom's technical expertise and ability to solve problems would be an incredible asset for any organization.

  • Mark Espena

    Tom and I have worked together in various roles for over 10 years. Tom balances strong interpersonal skills with a highly technical background - a balance sometimes hard to come by in this industry. He fosters client relationships as well as mentors other staff technicians. He is a formidable problem-solver and always willing to get his hands dirty to troubleshoot complex technical issues. While working for me, he was a trusted IT advisor to our clients and I would recommend Tom to anyone needing a personable and solid technician.

  • Brian Kantar, MS, PMP

    Tom Hudak has been a pleasure to work with each time I have had the opportunity to do so. His dedication and commitment to technology is proof that Tom has what it takes to get the job done.

  • Vas Bhandarkar

    Tom worked for me for over a year. During that time Tom went above and beyond the call of duty as was written in his job description. Besides managing IT operations for the company he managed customer relationships, helped with defining new products and also product managed our Audio conferencing product. He also created and managed a well-regarded customer-forum Wiki for the company, and engaged customers in meaningful dialogues on our company's products and product lines. His knowledge of IT systems is diverse and eclectic and he was a tremendous resource to the entire organization.