Jack Stehn

AI / Data Engineer

San Francisco, California, US
(415) 787-7975


About

Curiosity-driven engineer with data science roots, inspired by my journey (from homelessness to Berkeley graduate) to use data thoughtfully to understand the world and drive impact. Excited by xAI's mission. I offer end-to-end experience building and maintaining scalable data pipelines for high-volume user events (Python, SQL, PostgreSQL, GCP/AWS) and deriving insights that shape product strategy. A strong foundation in ML, deep learning, and NLP, together with hands-on experience running local LLMs, gives me a basis for rapidly learning specific AI frameworks. I thrive on collaborating closely with product and research teams and solving challenging problems in fast-paced, high-ownership environments.

Work Experience

  • Senior Data Scientist, Caliber Public Schools

    Oct, 2024 - Present

    Led data strategy, designed and built foundational data infrastructure (GCP, BigQuery, PostgreSQL), and developed ML capabilities as the sole data expert.

    • Owned full data engineering lifecycle: designed, built, and maintained automated, scalable data pipelines and warehousing on GCP (BigQuery, GCS, PostgreSQL) using Python, Dagster, dbt, dlt.

    • Integrated diverse data sources (APIs, databases including PostgreSQL, files) to create unified datasets for analytics and ML.

    • Developed and deployed Deep Learning & NLP models for predictive analytics and insights.

    • Initiated and led weekly sessions upskilling data practitioners, fostering collaboration and data literacy.

    • Partnered with leadership to define key metrics and identify high-impact, data-driven opportunities.

  • Data Scientist, SetSail

    Sep, 2021 - Jan, 2023 (1 year 4 months)

    Applied software and data engineering skills to build & scale data infrastructure (AWS, PostgreSQL) and ML features in a high-growth B2B SaaS startup.

    • Led the design, implementation, and performance analysis of a high-performance data pipeline overhaul (Python, AWS, PostgreSQL) processing high-volume user event data (TBs scale), optimizing for latency/reliability.

    • Architected and implemented scalable data solutions, including data modeling (star schema) and performance optimization.

    • Developed and deployed production ML/NLP models analyzing user communication data.

    • Worked directly with enterprise users to gather feedback and iterate on ML-driven features.

    • Implemented robust SWE practices including comprehensive testing and CI/CD (GitHub Actions) within an Agile workflow.

    • Collaborated closely with Product and Engineering to define requirements and translate insights into features.

  • Data Science Research Intern, UC Berkeley School of Public Health

    Sep, 2020 - May, 2021 (8 months)

    Provided data science and programming support for academic research on social determinants of health.

    • Applied Python programming and Deep Learning (LSTMs, Transformers) for NLP tasks, including building an intent classification model for a chatbot.

    • Performed statistical analysis on sensitive datasets, ensuring ethical handling and data privacy.

    • Collaborated with researchers and community partners on project execution.

Skills

  • Programming & Core SWE

    Python (Proficient)

    SQL (Proficient)

    PostgreSQL

    Bash/Shell Scripting

    API Integration (Familiar with API Design)

    Production Code Quality

    Testing (Unit/Integration)

    CI/CD

    Git

    Agile Methodologies

  • Data Engineering & Cloud

    Data Pipelines & Orchestration (Dagster, dbt)

    Google Cloud Platform (GCP)

    BigQuery

    AWS Cloud Platform

    Docker

    Data Architecture

    Scalability & Performance Tuning

    System Performance Analysis

    Large-Scale Data Processing

  • AI / Machine Learning

    PyTorch

    TensorFlow

    scikit-learn

    pandas

    NumPy

    LLM Concepts / GenAI

    Deep Learning (Transformers, LSTMs)

    Natural Language Processing (NLP)

    Predictive Modeling

    User Behavior Modeling

    ML Model Deployment & Monitoring

    Ethical AI Practices

  • Collaboration & Ownership

    End-to-End Project Ownership

    Cross-functional Collaboration (Product, Research, Engineering)

    Stakeholder Communication

    Requirements Gathering

    Learning Agility / Adaptability

Education

  • Data Science (Domain Emphasis: Quantitative Social Science), Bachelor's Degree, University of California, Berkeley

    Jan, 2019 - Dec, 2021

Awards

  • Impact Fellow, Education Pioneers

    Awarded on: Jan 01, 2024

    Selected for a leadership development fellowship focused on using data and management skills to advance equity in the education sector.

  • 2020-2021 Outstanding Data Science Undergraduate Award, UC Berkeley

    Awarded on: May 21, 2021

    Recognized for excellence in Data Science undergraduate studies, research, and community contributions at UC Berkeley.

Volunteer Work

  • Event Producer, Bearrison Street Fair

    Sep, 2022 - Present

    Managed logistics, entertainment, and fundraising (secured $90k+) for a large-scale LGBTQ+ community event.

  • Data Team Lead, San Francisco Gay Men's Chorus

    Dec, 2023 - Present

    Provide data-driven insights for policy-making and organizational growth via survey analysis.

Interests

  • Technology & Tinkering

    Generative AI (Local Models / Ollama)

    Large Language Models (LLMs)

    Raspberry Pi Projects

  • Creative & Community

    Singing

    Dancing

    Community Engagement

    Local Politics

References

  • Danny Pan (Direct Manager @ SetSail)

    Jack's passion for data analytics and software development stands out... [They] worked on... enhancing modeling capabilities... and building out a data ETL process that transformed the data infrastructure to help SetSail scale... collaborative work with the engineering and product team continually earned praises... always proactive to jump in and help solve a problem.

  • Josh Mantovani, M.A. (Senior Colleague @ SetSail)

    Jack's expertise in data science, combined with their passion for software engineering, make them a valuable asset... keen ability to plan and lead complex cross functional projects and their software engineering skills are second to none... a great communicator...

  • Sarah Nam (Team Colleague @ SetSail)

    Jack was an integral part of the planning and designing of data pipeline overhaul at SetSail... able to adjust the design... maintaining conversations across the product and engineering teams... a fast learner and willing to dig into new technologies...