Matt Fisher

Senior software engineer with 15+ years of experience, now focused on AI safety evaluation. Currently maintaining the Inspect Evals repository on behalf of the UK AI Security Institute. Background includes scaling a startup from 4 to 200 employees and 11 years as an Australian Army Engineering Officer.

Skills

AI Safety & Evaluation

Inspect AI framework
LLM evaluation design and review
Evaluation pipeline orchestration
Testing standards development
Vivaria

Web Development

Python
JavaScript
Django
React.js
FastAPI
SQL
HTML
CSS

ML & AI Tools

Agentic AI tools
LLM APIs
Jupyter
Pandas
NumPy
PyTorch
Hugging Face
Streamlit

Cloud & Infrastructure

AWS
Docker
CI/CD
PostgreSQL
Redis
Celery
Kubernetes

Monitoring & Testing

pytest
jest
TDD
New Relic
Sentry
CloudWatch
Papertrail

Engineering Practices

Code review
Technical leadership
Agile development
AI coding agents

Work Experience

Jan 2025 - Current

Senior Software Engineer (AI Evaluations)

Arcadia Impact

https://arcadiaimpact.org

Maintaining the [Inspect Evals repository](https://github.com/UKGovernmentBEIS/inspect_evals/) on behalf of the UK AI Security Institute.

Reviewing and quality-assuring contributed AI evaluations for correctness, reliability, and reproducibility
Establishing testing standards and procedures for evaluation contributions
Collaborating with international AI safety researchers and engineers
Working with frontier LLMs including GPT, Claude and Gemini
Building the Inspect Evals Scoring system using AWS Batch to orchestrate bulk runs of AI evaluations
Contributing to the Inspect Evals Dashboard showcasing LLM performance across diverse evaluations

Jul 2013 - Dec 2024

Senior Software Engineer

Edrolo

https://edrolo.com.au

First employee at ed-tech startup; scaled company from founding to ~200 employees with platform used by the majority of Australian high schools.

Architected and deployed Edrolo.com.au from inception using Django, React.js, and AWS
Built video captioning system using OpenAI's Whisper model, saving ~$100,000 per year
Grew engineering team from sole developer to 12 people
Mentored junior developers and established team workflows and standards
Developed internal tools for user enrolments, payments, and shipping

May 2002 - Jul 2013

Engineering Officer (RAEME)

The Australian Army

Served as an Engineering Officer in the Australian Army: 5 years full-time and 6 years in the active reserve.

Led soldiers in maintenance of weapons, vehicles, and equipment
Deployed to East Timor as Technical Regulation Officer
Developed prototype applications to improve logistics workflows
Published research on survivable military information systems (Best Paper, MilCIS 2010)
Held a Secret security clearance when required; currently hold a Baseline clearance

Projects

Inspect Evals

Jan 2025 - Current

https://github.com/UKGovernmentBEIS/inspect_evals

AI Evaluation
Python
Code Review

A repository of community-contributed LLM evaluations for Inspect AI. Created collaboratively by the UK AISI, Arcadia Impact, and the Vector Institute.

Inspect Evals Scoring

Jan 2025 - Apr 2025

https://github.com/ArcadiaImpact/inspect_evals_scoring

AI Evaluation
AWS Batch

A system for orchestrating bulk runs of AI evaluations using AWS Batch, to provide the data for the Inspect Evals Dashboard.

Inspect Evals Dashboard

Jan 2025 - Apr 2025

https://inspect-evals-dashboard.streamlit.app/

AI Evaluation
Dashboard

Showcases how well a diverse set of LLMs perform on the evaluations implemented in Inspect Evals.

Volunteer

Jan 2012 - Jan 2019

Spokesperson and Company Secretary

Stasis Systems Australia

Health advocacy

Education

2000 - 2004

Bachelor of Engineering (Mechatronic)

University of Adelaide

Grade: First Class Honours

2000 - 2003

Bachelor of Mathematical and Computer Sciences

University of Adelaide

Certificates

2023-11-01

XCS224N: Natural Language Processing with Deep Learning

Stanford Online

https://digitalcredential.stanford.edu/check/36D3D2EE52DB3FE03B6E2A58361D7508ED7B4FCB0CA55044115CCD84E2E82C89dU5wVGM3d0RuMzZDQW5FY01VVWdMSjVveEdJWUp0c2w1L2J0N0QwalZnc0RMbE1w

2023-09-30

AI Safety - Governance

BlueDot Impact

https://bluedot.org/certification?id=recxLRd8yUwIzBbDm

2023-10-15

AI Safety - Alignment

AI Safety Quest

2023-12-15

AI Safety - Alignment 201

AI Safety Australia & NZ

Publications

1 Nov 2010

Plan for the Worst: Steps Towards Survivable Networks in MilCIS 2010

Discusses making military logistic and administrative information systems more survivable and thus more usable in deployed environments. Awarded Best Paper at MilCIS 2010.