Lim Chen Chuen (Aaron)

Data Engineer

Kuala Lumpur, MY
+60124873976
English, Bahasa Malaysia, Mandarin Chinese

Background


About

Data Engineer with 3 years of production experience designing and maintaining scalable data pipelines and cloud infrastructure on GCP. Hands-on expertise spanning BigQuery data asset governance and constraint management, data orchestration (Dagster), data quality validation (Pandera), cloud infrastructure provisioning (Pulumi), and LLM observability infrastructure on Kubernetes. Experienced in infrastructure-as-code, CI/CD automation with GitHub Actions, and multi-region cloud deployments across sandbox, staging, and production environments.

Work Experience

  • Data Engineer, Pulsifi

    Mar, 2023 - Apr, 2026 (3 years 1 month)

    • Supported a production Change Data Capture (CDC) streaming pipeline (Apache Beam/Dataflow) by provisioning and maintaining its cloud infrastructure, including Dataflow Flex Template deployments on Artifact Registry. Implemented and maintained BigQuery primary and foreign key constraint validation across tables within the same dataset, ensuring referential integrity that enabled query optimizer block pruning and delivered a significant reduction in bytes scanned and query costs.

    • Built an end-to-end Cloud Function from scratch integrating GCP Pub/Sub, AWS SQS, and BigQuery to expose talent analytics usage data; provisioned multi-region infrastructure (Cloud Functions, IAM bindings, Pub/Sub push subscriptions) using Pulumi across sandbox, staging, and production environments.

    • Deployed a self-hosted Langfuse LLM observability platform on GKE, managing full infrastructure provisioning: GKE Autopilot cluster, Helm chart deployment, Cloud SQL (PostgreSQL) backend, and Secret Manager integration; dynamically injected environment secrets into Helm values for environment-aware deployments.

    • Orchestrated a Freshdesk ML pipeline end-to-end using Dagster, including asset dependency design, scheduling, SQL query refinements, and dynamic Pydantic-based environment configuration for accurate multi-environment pipeline runs.

    • Built a BigQuery data validation framework from scratch using Pandera, standardizing schema validation across datasets with multi-environment CI/CD support (sandbox, staging, production, EU region); migrated the project to a Polylith monorepo architecture and standardized GCP client usage.

    • Designed and provisioned BigQuery datasets, materialized views, scheduled queries, and day-partitioned tables across multi-region deployments (SG and EU), supporting BI and analytics workloads for customer success, talent acquisition, and finance; managed fine-grained IAM access at dataset and table level for internal teams and service accounts.

    • Maintained a PostgreSQL-to-BigQuery data reconciliation function with OIDC authentication, EU region support, and unit test coverage; developed and maintained an internal back-office application on Cloud Run with role-based file export restrictions, centralized authorization config, and access control unit tests.

    • Standardized CI/CD pipeline patterns across the data engineering team; led migration of multiple projects from Poetry to uv; adopted Pulumi as the primary IaC tool replacing Terraform; implemented OIDC-based short-lived credential authentication; established semantic versioning and automated release workflows using python-semantic-release.

  • Software Engineering Intern, Symplicity Sdn. Bhd.

    Jan, 2021 - Mar, 2021 (2 months)

    • Developed bancassurance client product features using C# and the ASP.NET framework.

    • Designed and built the company website and its webpages using WordPress.

Skills

  • Languages

    Python

    SQL

  • Cloud Platform (GCP)

    BigQuery

    Dataflow

    Cloud Functions

    Cloud Run

    GKE

    Pub/Sub

    Cloud Storage

    Secret Manager

    IAM

    Data Catalog

    Dataplex

    Artifact Registry

  • Data Orchestration

    Dagster

  • Data Quality & Validation

    Pandera

    datacompy

  • Databases

    BigQuery

    PostgreSQL

  • Infrastructure as Code

    Pulumi

    Terraform

  • CI/CD & DevOps

    GitHub Actions

    Docker

    OIDC

    python-semantic-release

    ruff

  • AI / LLM Infrastructure

    Kubernetes (GKE)

    Helm

    Langfuse

  • Package Management

    uv

    Poetry

  • Other

    AWS SQS

    Pydantic

    Pendulum

Education

  • Computer Science, BSc (Hons), Sunway University & Lancaster University

    - Jan, 2022

Awards

  • National Hammer Throw - 3rd Place (National), Majlis Sukan Sekolah Malaysia (MSSM)

    Awarded on: Jan 01, 2014

    National athlete at the 55th (2013) and 56th (2014) MSSM championships, achieving 3rd place nationally both years. Placed 1st in Kedah State in 2013 and 2014. SUKMA representative in 2014.

  • Malaysian National Chemistry Quiz - Distinction, Malaysian Institute of Chemistry

    Awarded on: Jan 01, 2014

Volunteer Work

  • Volunteer, Kechara Soup Kitchen

    Sep, 2020 - Sep, 2020

    Volunteered every Saturday for 4 weeks, collecting and distributing meals to homeless individuals across the Kuala Lumpur area.