Lim Chen Chuen (Aaron)

Data Engineer

Professional Summary

Data Engineer with 3 years of production experience designing and maintaining scalable data pipelines and cloud infrastructure on GCP. Hands-on expertise spanning BigQuery data asset governance and constraint management, data orchestration (Dagster), data quality validation (Pandera), cloud infrastructure provisioning (Pulumi), and LLM observability infrastructure on Kubernetes. Experienced in infrastructure-as-code, CI/CD automation with GitHub Actions, and multi-region cloud deployments across sandbox, staging, and production environments.

Professional Experience

Data Engineer

Pulsifi
Mar 2023 - Apr 2026
  • Supported a production Change Data Capture (CDC) streaming pipeline (Apache Beam/Dataflow) by provisioning and maintaining its cloud infrastructure, including Dataflow flex template deployments on Artifact Registry; implemented and maintained BigQuery primary and foreign key constraint validation across tables within the same dataset, ensuring referential integrity that enabled query optimizer block pruning and delivered a significant reduction in bytes scanned and query costs.
  • Built an end-to-end Cloud Function from scratch integrating GCP Pub/Sub, AWS SQS, and BigQuery to expose talent analytics usage data; provisioned multi-region infrastructure (Cloud Functions, IAM bindings, Pub/Sub push subscriptions) using Pulumi across sandbox, staging, and production environments.
  • Deployed a self-hosted Langfuse LLM observability platform on GKE, managing full infrastructure provisioning: GKE Autopilot cluster, Helm chart deployment, Cloud SQL (PostgreSQL) backend, Secret Manager integration, and dynamic injection of environment secrets into Helm values for environment-aware deployments.
  • Orchestrated a Freshdesk ML pipeline end-to-end using Dagster, including asset dependency design, scheduling, SQL query refinements, and dynamic Pydantic-based environment configuration for accurate multi-environment pipeline runs.
  • Built a BigQuery data validation framework from scratch using Pandera, standardizing schema validation across datasets with multi-environment CI/CD support (sandbox, staging, production, EU region); migrated the project to a Polylith monorepo architecture and standardized GCP client usage.
  • Designed and provisioned BigQuery datasets, materialized views, scheduled queries, and day-partitioned tables across multi-region deployments (SG and EU), supporting BI and analytics workloads for customer success, talent acquisition, and finance; managed fine-grained IAM access at dataset and table level for internal teams and service accounts.
  • Maintained a PostgreSQL-to-BigQuery data reconciliation function with OIDC authentication, EU region support, and unit test coverage; developed and maintained an internal back-office application on Cloud Run with role-based file export restrictions, centralized authorization config, and access control unit tests.
  • Standardized CI/CD pipeline patterns across the data engineering team; led migration of multiple projects from Poetry to uv; adopted Pulumi as the primary IaC tool, replacing Terraform; implemented OIDC-based short-lived credential authentication; established semantic versioning and automated release workflows using python-semantic-release.

Software Engineering Intern

Symplicity Sdn. Bhd.
Jan 2021 - Mar 2021
  • Developed bancassurance client product features using C# and the ASP.NET framework.
  • Designed and built the company website and web pages using WordPress.

Education

Sunway University & Lancaster University

BSc (Hons) in Computer Science

Professional Skills

Languages:

Python, SQL

Cloud Platform (GCP):

BigQuery, Dataflow, Cloud Functions, Cloud Run, GKE, Pub/Sub, Cloud Storage, Secret Manager, IAM, Data Catalog, Dataplex, Artifact Registry

Data Orchestration:

Dagster

Data Quality & Validation:

Pandera, datacompy

Databases:

BigQuery, PostgreSQL

Infrastructure as Code:

Pulumi, Terraform

CI/CD & DevOps:

GitHub Actions, Docker, OIDC, python-semantic-release, ruff

AI / LLM Infrastructure:

Kubernetes (GKE), Helm, Langfuse

Package Management:

uv, Poetry

Other:

AWS SQS, Pydantic, Pendulum

Volunteer Experience

Volunteer

Kechara Soup Kitchen
Sep 2020

Volunteered every Saturday for 4 weeks, collecting and distributing meals to homeless individuals across the Kuala Lumpur area.

Awards and Honors

National Hammer Throw — 3rd Place (National)

Awarded by: Majlis Sukan Sekolah Malaysia (MSSM)

Date: 2014

National athlete at the 55th (2013) and 56th (2014) MSSM championships, achieving 3rd place nationally both years. Placed 1st in Kedah State in 2013 and 2014. SUKMA representative in 2014.

Malaysian National Chemistry Quiz — Distinction

Awarded by: Malaysian Institute of Chemistry

Date: 2014

Languages

English:

Fluent

Bahasa Malaysia:

Fluent

Mandarin Chinese:

Fluent