Jan Hendrik Metzen

Senior AI Researcher

I am a Senior AI Researcher at Aleph Alpha Research. As part of Aleph Alpha's Foundation Models team, I focus on LLM pretraining and optimization. In particular, we have developed a new tokenizer-free LLM architecture that allows for efficient pretraining, domain adaptation, and inference. Moreover, we are working on new optimization methods that allow for efficient training of large models.

Until 08/2024, I was a Senior Expert in Deep Learning at the Bosch Center for Artificial Intelligence (BCAI), where my primary research focused on making AI (specifically computer-vision-based perception) robust, reliable, and safe. This included identifying vulnerabilities of Transformer-based neural networks to adversarial patch/token attacks, as well as finding systematic errors of image classifiers on rare subgroups and systematic errors of object detectors. We also developed architectures that are certifiably robust against patch attacks, both for image classification and for semantic segmentation. Furthermore, we proposed methods for adversarially training neural networks to become robust against universal perturbations and universal adversarial patches. Moreover, we developed methods for test-time adaptation of neural networks to improve robustness to domain shifts and studied the role of shape-biased representations in robustness to common image corruptions.

A different strand of my research is automated machine learning (AutoML), specifically Neural Architecture Search. This research field is motivated by the vast design space of neural networks and the diversity of inference hardware. Manually tailoring a neural architecture for every type of hardware is cumbersome and does not scale - hardware-aware neural architecture search can vastly improve design efficiency and thus reduce the cost of AI development. See our survey on Neural Architecture Search and a more recent survey on Neural Architecture Search for Dense Prediction Tasks in Computer Vision. We have also developed AutoCLIP, a method for auto-tuning zero-shot classifiers of vision-language models that improves zero-shot performance across a broad range of domains.

I also love contributing to machine learning libraries, both open source and proprietary. I am a (recently mostly inactive) core contributor to scikit-learn, where I contributed tools for probability calibration of classifiers and for kernel ridge regression. Moreover, I wrote a complete redesign of scikit-learn's Gaussian process module. At BCAI, I was a core developer of frameworks for deep learning training pipelines, neural architecture search, and robustness evaluation.
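The Gaussian process module mentioned above is part of scikit-learn's public API. A minimal sketch of what using it looks like, on toy data (the data and kernel settings here are purely illustrative):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy 1-D regression problem: noise-free samples of a sine curve.
X = np.linspace(0.0, 5.0, 20).reshape(-1, 1)
y = np.sin(X).ravel()

# Fit a GP regressor with an RBF kernel; hyperparameters are
# optimized by maximizing the log-marginal likelihood during fit().
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), random_state=0)
gpr.fit(X, y)

# The GP provides both a predictive mean and an uncertainty estimate.
mean, std = gpr.predict(X, return_std=True)
```

At the training points of a noise-free problem like this, the predictive mean closely matches the targets and the predictive standard deviation is near zero.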

I am a member of ELLIS and regularly review for scientific conferences and journals such as ICLR, ICML, NeurIPS, and TMLR. I was a senior area chair of the AutoML 2022 conference and a co-organizer of the workshops NAS@ICLR 2020 and NAS@ICLR 2021. I have been recognized by ICLR as a Highlighted Reviewer in 2022 and an Outstanding Reviewer in 2021.

See my personal website for more details.

Work

Senior AI Researcher

– Present

Senior AI Researcher in Aleph Alpha's Foundation Models team, where I focus on LLM pretraining and optimization. In particular, we have developed a new tokenizer-free LLM architecture that allows for efficient pretraining, domain adaptation, and inference. Moreover, we are working on new optimization methods that allow for efficient training of large models.

  • Efficient LLM pretraining

  • Tokenizer-free LLM Architectures

Senior Expert 'Robust Scalable Perception'

Making deep learning perception safe via model auditing, robust training, and robust architectures.

  • Robustness of Deep Learning (Systematic Errors, Domain Shift, Adversarial Examples)

  • Neural Architecture Search

  • Synthetic Data

Volunteer

Highlighted Reviewer

ICLR 2022

Outstanding Reviewer

ICLR 2021

Regular Reviewer

ICLR, ICML, NeurIPS, TMLR
– Present

Member

ELLIS
– Present

Projects

SVGStud.io

– Present

As a hobby project, I have built SVGStud.io, a powerful AI SVG generator designed for creators, designers, and professionals looking to effortlessly generate, edit, and customize Scalable Vector Graphics (SVGs).

  • SVGStud.io has more than 1000 weekly active users as of 03/2025.

Awards

Scholarship (2004 - 2006)

Awarded by Studienstiftung des Deutschen Volkes (German National Academic Foundation)

Languages

German

Native speaker

English

Fluent

Spanish

Basic

French

Basic