Tony Jacobs

Multi-Cloud Data Architect/Engineer


About

Experienced Multi-Cloud Data Specialist with a proven track record of designing and delivering scalable, high-performance data solutions across diverse cloud environments. Skilled in integrating data across platforms using complex SQL, Python, advanced matching routines, and performance tuning to optimize pipelines and analytics workflows. Proficient in Databricks and Medallion Architecture, with expertise in Snowflake and dbt for modernizing data platforms. Extensive experience with customer data integration and transformation, drawing on work at TruGreen and recent projects to deliver actionable insights. Combines Azure, AWS, and GCP cloud expertise with a passion for leveraging machine learning and AI to drive innovation and business value through advanced analytics and automation.

Work Experience

  • TruGreen - Microsoft Azure Data Engineer/Architect

    Sep, 2021 - Jul, 2024 · 2 years 10 months

    Co-led the design and implementation of a medallion data lakehouse architecture that leveraged low-cost storage to optimize data warehousing and analytics capabilities using Databricks and Delta tables (an illustrative sketch follows this role's highlights). This approach consolidated disparate data systems into a unified platform, streamlining data access and accelerating data-driven initiatives. Teams across the organization were empowered to access and analyze comprehensive, up-to-date datasets, facilitating faster insights and more informed decision-making.

    • Spearheaded the migration of a legacy IBM AS/400 system to Azure Synapse, reducing reporting processing time from 6 hours to under 1 hour, enabling real-time, company-wide distribution of critical metrics and boosting decision-making speed by 85%.

    • Designed and implemented scalable data pipelines using Databricks and the Medallion Architecture, transforming raw data into organized Bronze, Silver, and Gold layers, resulting in a 30% increase in data processing efficiency and a 40% reduction in storage costs.

    • Developed highly optimized Python scripts for ETL/ELT workflows, streamlining data cleansing, transformation, and loading processes, achieving an 80% reduction in processing time and improving data accuracy by 25%.

    • Implemented Azure Purview for data governance, improving compliance with GDPR and HIPAA regulations and enhancing audit readiness by 30%.

    • Integrated Azure Machine Learning with Synapse pipelines to enable predictive analytics for customer segmentation, achieving a 25% improvement in churn prediction accuracy.

    • Optimized costs by deploying automated scaling for Databricks clusters and Azure Synapse, reducing monthly cloud expenses by 20%.

    • Leveraged Azure Data Share to enable secure, cross-functional data collaboration, reducing data silos and enhancing enterprise-wide insights.

    • Designed and implemented targeted customer campaigns by leveraging Azure Synapse Analytics to analyze and segment large-scale customer datasets, resulting in a 15% increase in customer engagement.

    • Developed automated workflows using Azure Data Factory to ingest, transform, and load customer data into Azure SQL Database, enabling near real-time campaign personalization and execution.

    • Enhanced campaign effectiveness by implementing data matching and cleansing routines within Azure Databricks, ensuring the quality and reliability of customer datasets.
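
    A minimal sketch of the Bronze/Silver/Gold flow referenced above, using PySpark with Delta Lake. The paths, schema, and aggregation are hypothetical placeholders, not the actual TruGreen pipelines:

        # Illustrative medallion flow: raw files -> Bronze -> Silver -> Gold.
        # All paths and column names are hypothetical.
        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

        # Bronze: land raw source data as-is, stamped with ingestion metadata.
        bronze = (spark.read.json("/mnt/raw/customers/")
                  .withColumn("_ingested_at", F.current_timestamp()))
        bronze.write.format("delta").mode("append").save("/mnt/bronze/customers")

        # Silver: cleanse and deduplicate the Bronze data.
        silver = (spark.read.format("delta").load("/mnt/bronze/customers")
                  .filter(F.col("customer_id").isNotNull())
                  .dropDuplicates(["customer_id"]))
        silver.write.format("delta").mode("overwrite").save("/mnt/silver/customers")

        # Gold: business-level aggregate for analytics consumers.
        gold = silver.groupBy("region").agg(F.count("*").alias("customer_count"))
        gold.write.format("delta").mode("overwrite").save("/mnt/gold/customer_counts")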

  • Carestarter - AWS Data Engineer

    Dec, 2020 - Jul, 2021 · 7 months

    Generated weekly and monthly Web-App Adoption metrics for management and outside parties, provided ad-hoc metrics and reporting for the organization, and researched various AWS services to deepen understanding of the AWS ecosystem.

    • Designed and implemented a data migration architecture leveraging AWS S3 as the source system and Snowflake as the target, ensuring secure and efficient data transfer using Snowflake's Snowpipe for continuous ingestion (see the sketch after this list).

    • Utilized AWS Glue and PySpark to orchestrate ETL pipelines, transforming raw customer data from S3 into a format optimized for analytics within Snowflake's cloud-native architecture.

    • Configured Snowflake External Stages to directly access data stored in S3, reducing data movement complexity and enabling seamless integration for large-scale migration tasks.

    • Applied AWS IAM roles and Snowflake Access Control to ensure secure cross-platform data transfer, maintaining compliance with organizational data governance standards.

    • Used Snowflake Streams and Tasks to process incremental updates during migration, ensuring the real-time synchronization of customer datasets without service interruptions.

    • Designed a robust error-handling framework using AWS CloudWatch and Snowflake Query History, enabling real-time monitoring and resolution of data pipeline issues.

    • Optimized performance and storage by implementing Snowflake clustering keys and micro-partitioning, improving query performance on migrated customer data.

    • Conducted rigorous data validation using AWS Athena and Snowflake SQL, ensuring the accuracy and completeness of the migrated datasets.

    • Resolved support cases to increase Amazon SES email limits from 200/hour to 25,000/hour, resulting in a 12,400% improvement in email throughput and enabling the seamless delivery of high-volume communications for business-critical operations.

    • Generated comprehensive weekly and monthly Web-App Adoption metrics, providing actionable insights that drove a 20% improvement in platform adoption rates and supported strategic decision-making for both management and external stakeholders.

    • Led the development of multiple Proofs of Concept (POCs) and implemented scalable solutions across a diverse range of AWS services, improving operational efficiency and accelerating project delivery timelines by 30%.
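
    A minimal sketch of the S3-to-Snowflake pattern described above (external stage, Snowpipe, and a stream/task pair), issued through the snowflake-connector-python driver. All object names, credentials, and the merge logic are hypothetical:

        # Illustrative Snowflake setup: external stage over S3, Snowpipe for
        # continuous ingestion, and a stream/task pair for incremental merges.
        import snowflake.connector

        conn = snowflake.connector.connect(
            account="example_account", user="example_user", password="...",
            warehouse="LOAD_WH", database="ANALYTICS", schema="RAW")
        cur = conn.cursor()

        # External stage pointing at the S3 landing bucket.
        cur.execute("""
            CREATE STAGE IF NOT EXISTS customer_stage
              URL = 's3://example-bucket/customers/'
              STORAGE_INTEGRATION = s3_int
              FILE_FORMAT = (TYPE = PARQUET)
        """)

        # Snowpipe: continuous ingestion as files arrive in the stage.
        cur.execute("""
            CREATE PIPE IF NOT EXISTS customer_pipe AUTO_INGEST = TRUE AS
              COPY INTO raw_customers FROM @customer_stage
              MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
        """)

        # Stream + task: merge incremental changes downstream every 5 minutes.
        cur.execute("CREATE STREAM IF NOT EXISTS raw_customers_stream ON TABLE raw_customers")
        cur.execute("""
            CREATE TASK IF NOT EXISTS merge_customers
              WAREHOUSE = LOAD_WH SCHEDULE = '5 MINUTE'
              WHEN SYSTEM$STREAM_HAS_DATA('RAW_CUSTOMERS_STREAM')
            AS
              MERGE INTO curated_customers t
              USING raw_customers_stream s ON t.customer_id = s.customer_id
              WHEN MATCHED THEN UPDATE SET t.email = s.email
              WHEN NOT MATCHED THEN INSERT (customer_id, email)
                VALUES (s.customer_id, s.email)
        """)
        cur.execute("ALTER TASK merge_customers RESUME")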

  • Y&L Consulting - Azure & GCP Data Engineer

    Dec, 2019 - Jan, 2021 · 1 year 1 month

    Consulted as an Azure data architect, data engineer, and business analyst supporting multiple high-value healthcare and banking projects. Core Azure services were Azure Databricks and Azure Data Factory (ADF), with a bias toward serverless methodologies.

    • Designed and implemented Azure-based architectures as a technical architect and hands-on data engineer for multiple POC cloud engagements, migrating 100TB+ of legacy system data to the cloud and building scalable data lakes that improved data accessibility for 5+ data science, reporting, and analytics teams.

    • Integrated external data sources into Azure Data Lake using Azure Data Factory, facilitating seamless data exchange across Azure Blob Storage, AWS S3 buckets, and Parquet files, increasing data accessibility by 40% and reducing data transfer times by 30%.

    • Developed and optimized data pipelines for ingestion, profiling, cleansing, transformation, and scheduling using Azure Databricks, Python, and SQL, creating robust Dimension and Fact tables that improved query performance by 50% and reduced data processing times by 70%.

    • Implemented microservices and BI workflows using U-SQL in Azure Data Lake Analytics and Python, delivering advanced data processing and analytics capabilities that supported real-time insights for 10+ business units.

    • Built a real-time Operational Data Store using Azure Event Hubs and Stream Analytics, enabling direct streaming into Power BI for corporate reporting and achieving a 90% reduction in report generation time (see the sketch after this list).

    • Deployed and managed HDInsight clusters and utilized Hadoop ecosystem tools, including Kafka, Spark, Databricks, Sqoop, Pig, Hive, and CosmosDB, enabling real-time streaming analytics for 1M+ daily transactions and efficient batch processing of large-scale datasets.
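
    A minimal sketch of the Event Hubs ingress feeding the Operational Data Store described above, using the azure-eventhub Python SDK. The connection string, hub name, and payload are hypothetical:

        # Illustrative producer pushing operational events into Azure Event Hubs,
        # from which Stream Analytics routed data on to Power BI.
        import json
        from azure.eventhub import EventHubProducerClient, EventData

        producer = EventHubProducerClient.from_connection_string(
            conn_str="Endpoint=sb://example.servicebus.windows.net/;...",  # hypothetical
            eventhub_name="ods-events")

        event = {"order_id": 12345, "status": "shipped", "amount": 99.50}

        with producer:
            batch = producer.create_batch()
            batch.add(EventData(json.dumps(event)))
            producer.send_batch(batch)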

  • DrillingInfo - Big Data Oil & Gas Cloud Consultant

    Jun, 2018 - Dec, 2019 · 1 year 6 months

    Provided design and analysis advice, recommendations, prototypes, and database migration assistance for high-value clients in the energy sector. Applied big data architectures, ETL methodologies, and DevOps practices in support of oil & gas activities across the AWS, GCP, and Azure cloud platforms.

    • Advised on big data strategies and provided expert recommendations, prototypes, and database migration support for high-value customer data feeds, enabling the successful migration of 50TB+ of critical data to cloud platforms and improving data processing efficiency by 40%.

    • Demonstrated expertise in big data architectures, ETL/ELT methodologies, and DevOps practices, supporting 100% uptime for data pipelines and analytics workflows across AWS, Azure, and GCP environments, resulting in a 30% reduction in operational costs and 50% faster deployment cycles.

  • DrillingInfo - Big Data Specialist

    Apr, 2014 - May, 2018 · 4 years 1 month

    Selected to join the prestigious DI Consulting team and played a pivotal role as data architect/engineer for client solutions. The most memorable achievement was designing and building DrillingInfo's first Cloud DB offering (DIBI), which significantly boosted sales and subscriptions and generated substantial cloud interest within the first few months of release.

    • Sole architect and developer of DrillingInfo's DIBI Cloud database.

    • Executed the vision and requirements provided by management with minimal supervision.

    • Data discovery and requirements gathering.

    • Data modeling (conceptual, logical, physical).

    • Working with RDS, SQL Server, PostgreSQL, Oracle, and S3 for storage.

    • Heavy use and creation of on-premises SSIS packages to orchestrate ETL processes and simplify delivery to customer cloud databases (a simplified Python analogue of one delivery step follows this list).

    • Heavy use of FME (Feature Manipulation Engine) as a data integration platform supporting spatial data worldwide.
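
    SSIS and FME handled the actual orchestration in this role; purely as a rough Python analogue, the sketch below shows a single extract-and-deliver step that reads from a SQL Server source and pushes a CSV extract to S3. The DSN, table, and bucket names are hypothetical:

        # Illustrative delivery step: export a query result and push it to S3
        # for a customer cloud database load.
        import csv
        import boto3
        import pyodbc

        conn = pyodbc.connect("DSN=example_dsn")  # hypothetical on-prem source
        rows = conn.execute("SELECT well_id, api_number, county FROM dbo.wells").fetchall()

        with open("wells_extract.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["well_id", "api_number", "county"])
            writer.writerows(rows)

        boto3.client("s3").upload_file(
            "wells_extract.csv", "example-customer-bucket",
            "deliveries/wells_extract.csv")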

  • EZCorp Consultant - Data Architect

    Apr, 2013 - Apr, 2014 · 1 year

    SQL Server database architecture and design (OLTP/OLAP) engineer. Hands-on SQL Server database programming and development supporting multiple SSAS cubes, primarily within the Microsoft BI Stack (SSIS, SSAS, SSRS, MDS).

    • Gathered and analyzed business requirements and translated business needs into long-term business intelligence solutions leveraging the Microsoft BI Stack (SSIS, SSAS, and SSRS).

    • Master Data Services (MDS) and Data Quality Services (DQS) support.

    • Developed SSAS cubes (multidimensional and tabular) and associated reports with appropriate dimensions and measures, configured dimension table attributes with hierarchy relationships, deployed KPIs on SSRS, and added new measure groups and dimensions when necessary.

  • Amherst Holdings Consultant - Data Architect

    Dec, 2012 - Apr, 2013 · 4 months

    Designed an integrated data warehouse prototype to aggregate home loan data from CoreLogic, eliminating the need to purchase pre-aggregated data from third parties and significantly reducing costs.

    • Predictive Analytics for positional bond trading (training set, test set) to identify best tranches and MBS (Mortgage-Backed Securities).

    • Matching algorithm creation for binding multiple home-to-loan datasets to identify undervalued properties to acquire in line with corporate strategy (see the sketch after this list).

    • Data and Requirements gathering, analysis, & design deliverables.

    • Data Modeling (Conceptual, Logical, and Physical Models based on requirements).

    • ETL design (data profiling, source-to-target mapping, and ETL development).

    • Data Dictionary design, creation, and implementation.

    • Fact and Dimension table creation for consumption by Tableau, SSRS, & LogiAnalytics.
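
    A minimal sketch of the kind of fuzzy matching used to bind home and loan records, here on normalized address strings with Python's standard-library difflib. The normalization, threshold, and sample records are hypothetical simplifications of the production algorithm:

        # Illustrative home-to-loan match on normalized address similarity.
        from difflib import SequenceMatcher

        def normalize(addr: str) -> str:
            return " ".join(addr.lower().replace(".", "").split())

        def best_loan_match(home_addr, loans, threshold=0.8):
            """Return the best-matching loan, or None if below threshold."""
            scored = [(SequenceMatcher(None, normalize(home_addr),
                                       normalize(loan["address"])).ratio(), loan)
                      for loan in loans]
            score, loan = max(scored, key=lambda pair: pair[0])
            return loan if score >= threshold else None

        loans = [{"loan_id": "L-1", "address": "101 Main St."},
                 {"loan_id": "L-2", "address": "202 Oak Ave"}]
        print(best_loan_match("101 main street", loans))  # -> the L-1 record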

  • IBM - Information Management Specialist

    Jan, 2012 - Dec, 2012 · 11 months

    Performed MDM Implementations for high-value clients across the world. Configured Initiate MDM member data models, data standardization algorithms, bucketing, data derivation, weight generation, & bulk cross matching. Instance/Hub/Client Tools creation and deployment.

    • MDM Implementations - Clients: Adventist Healthcare | Advocate Healthcare | Choice Hotels

    • Instance/Hub/Client Tools configuration and deployment.

    • Customer requirements gathering, analysis, & review.

    • Configure Initiate MDM member data model.

    • Configure Algorithms (Standardize data, bucketing, & comparison functions).

    • Data derivation, weight generation, bulk cross matching, and deployments.

  • AMD - Business Intelligence Engineer

    Sep, 2011 - Dec, 2011 · 3 months

    Performed requirements gathering, data profiling, data modeling (conceptual, logical, physical), and SSIS development to support fact and dimensional supply chain data for cube development and consumption.

    • ASIA Sales/Inventory/ASP SSAS Cube Development.

    • Requirements document analysis resulting in a redefined scope and deliverables.

    • Identification of facts and dimensions with the associated source-to-target mapping document deliverable.

    • Conceptual, logical, and physical data model deliverables.

    • Cube specifications document deliverable.

    • ETL (SSIS) package and SSAS cube deliverables.

  • NCSoft - Qlikview BI Consultant

    Mar, 2011 - Sep, 2011 · 6 months

    Migration of Cognos Reports to QlikView Reports (6-month contract). I've always wanted to work for a gaming company and this was a bucket list opportunity.

    • Analysis Phase – Identified and discussed QlikView system functional and technical requirements, proposed possible solutions with pros and cons, provided a recommended solution, and specified the final agreed-upon detailed solution requirements.

    • Planning Phase – Established the QlikView project management elements of the project, including communications, roles and responsibilities, initial timeline, processes and procedures.

    • Design Phase – Conceptual, logical, and physical data model development. Developed a 3-tier mixed (QVDs and QlikMarts) blueprint (design document) for constructing the QlikView solution (QVD architecture, QVD reload/load methodology, Publisher, and Server implementation and performance), covering critical elements such as data sources and integration, application workflow, and infrastructure, based on requirements.

    • Checkpoint Phase – Assessed the design requirements against resources, timeline, and estimate to completion, making modifications when appropriate.

    • Build Phase – Developed and integrated QlikView solution components to meet design specifications via an iterative development process.

    • Test Phase – Tested the QlikView system across multiple levels (unit, system, functional, user acceptance, performance, security) and refined the solution to ensure acceptance.

    • Deploy Phase – Prepared the finished QlikView solution, including design automation, and developed the required documentation (user guide, admin guide, user training).

    • Review Phase – Project team discussions to gather lessons learned and assess project performance.

  • HomeAway - Business Intelligence Engineer

    Jan, 2009 - Mar, 2011 · 2 years 2 months

    Microsoft BI Stack (SSIS, SSAS, SSRS)

    • Heavy MS BI Stack (SSIS, SSAS, SSRS) development across the board.

    • Worked with business users to define and analyze problems.

    • Created conceptual, logical, and implemented physical data models.

    • Developed and managed reporting solutions, SSRS schedules, subscriptions, and security using Report Manager.

    • Tableau Reporting Solutions, Server Product & Key management, licensing task automation.

    • WMS integration with Tableau deployment.

  • Dell - Business Intelligence Consultant - (Contract w/ extensions)

    Jun, 2004 - Jan, 2009 · 4 years 7 months

    Business intelligence development projects and activities on the Microsoft BI Stack (SSIS, SSAS, SSRS).

    • Implementation of tables, views, triggers, functions, stored procedures, constraints, indexes, full-text searches, as well as the creation of user-defined types to support database activities.

    • SQL Server Performance Monitoring and Troubleshooting.

    • Gathering of performance and optimization data via SQL Server Profiler, Database Engine Tuning Advisor, and DMVs (see the sketch after this list).

    • Monitoring of SQL Server Agent job history to identify failures, outcome details, and job activities.
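
    A minimal sketch of the kind of DMV-based performance check referenced above, issued here through pyodbc; the DSN is hypothetical and the query is a standard top-CPU-consumers pattern:

        # Illustrative DMV check: top queries by total CPU time.
        import pyodbc

        conn = pyodbc.connect("DSN=example_sqlserver")  # hypothetical DSN
        cur = conn.cursor()
        cur.execute("""
            SELECT TOP 10
                   qs.total_worker_time / 1000 AS total_cpu_ms,
                   qs.execution_count,
                   SUBSTRING(st.text, 1, 100) AS query_text
            FROM sys.dm_exec_query_stats AS qs
            CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
            ORDER BY qs.total_worker_time DESC
        """)
        for cpu_ms, execs, text in cur.fetchall():
            print(f"{cpu_ms:>10} ms  {execs:>8} execs  {text}")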

Skills

  • Amazon Web Services (AWS) Big Data

    SageMaker

    S3

    EC2

    EMR

    Route53

    ElasticSearch

    Kinesis Firehose

    Kinesis Streams

    Kinesis Analytics

    Redshift

    Machine Learning

    DynamoDB

  • Google Cloud Platform (GCP) Big Data

    BigQuery

    BigTable

    DataFlow

    DataProc

    Pub/Sub

    Data Fusion

    Composer

    Catalog

    Data Studio

    DataPrep

    Google Sheets

    Data Transfer

    Cloud Storage

    DataLab

  • Microsoft Cloud Platform (Azure) Big Data

    RDBMS

    CosmosDB

    DMS

    Azure SQL Data Warehouse

    Databricks

    HDInsight

    Machine Learning

    Stream Analytics

    Data Lake Store

    Data Lake Analytics

    Data Catalog

    Data Factory

  • DevOps

    Maven

    Node.js

    Git

    GitHub

    Jira

    Jenkins

    Docker

    Kubernetes

    Puppet

    Ansible

    HTML

    JavaScript

    MongoDB

    Python

    Bash

Education

  • Information Systems, Bachelor's, The University of Texas at San Antonio

    Aug, 2001 - May, 2005

  • Certified AWS Data Engineer Associate, Certificate, Amazon Web Services

    Nov, 2024 - Nov, 2027

  • Certified Azure Data Engineer Associate, Certificate, Microsoft

    May, 2020 - May, 2022

  • Google Cloud Certified - Associate Cloud Engineer, Certificate, Google

    Nov, 2019 - Nov, 2021

  • Certified AWS Solutions Architect Associate, Certificate, Amazon Web Services

    Jun, 2017 - Jun, 2020

  • Data Engineering, Big Data, and Machine Learning on GCP Specialization, Certificate, Coursera

    - Present

  • Google Cloud Platform Big Data and Machine Learning Fundamentals, Certificate, Coursera

    - Present

  • Leveraging Unstructured Data with Cloud Dataproc on Google Cloud Platform, Certificate, Coursera

    - Present

  • Serverless Data Analysis with Google BigQuery and Cloud Dataflow, Certificate, Coursera

    - Present

  • Serverless Machine Learning with TensorFlow on Google Cloud Platform, Certificate, Coursera

    - Present

  • Building Resilient Streaming Systems on Google Cloud Platform, Certificate, Coursera

    - Present

  • IBM Initiate MDM - Certified Professional, Certificate, IBM

    - Present

  • Tableau Certified Professional, Certificate, Coursera

    - Present

  • QlikView Certified Designer & Developer, Certificate, IBM

    - Present