Applied AI · Agents & Evals
AI for engineering teams whose data is scarce, expensive, and regulated.
I spent fifteen years running materials labs and quality systems in medical devices — a world where one data point can cost hundreds of dollars and take ten weeks to get, a label can't be trusted until the measurement system is, and nothing ships until validation proves it works. Now I build AI systems held to that same standard: production RAG and agent systems with eval harnesses, acceptance criteria, and audit trails. There's a live system below and writeups on how it was built. I build for engineers and scientists because I've been both.
About

I see the gap before it's a problem
I started as a lab tech, running tensile tests and prepping metallographic samples, learning how materials behave before I ever managed anyone. Fifteen years later, I lead the materials and clinical engineering function for a $500M global operation. The technical foundation matters, but what stuck with me was everything around it: what it's like to be the person doing the careful, detailed work, what gets glossed over by the people relying on it, and how technical expertise turns into results the rest of the business can use. Those lessons shape how I lead now.
The pattern across all of it is the same: I see the gap, whether it's a missing lab, no quality system, no process, or no data, and I figure out how to stand it up, prove it works, and make it stick. Sometimes that means pointing the gap out to people who haven't noticed it yet. It's meant building an ISO 17025 quality system from scratch, turning around a plating line that was scrapping nearly nine parts in ten, standing up a BSL-2 microbiology program, and writing the EU MDR strategy for a $140M device portfolio. Lately the gap has been software: my teams were losing days to manual document searches and report writing, so I learned to build the AI tools that give that time back, with a bias toward systems that stay traceable and auditable instead of just demoing well.
The technical work is only half of what I do. The other half is building the team that carries it further than I could alone: engineers and scientists who trust each other, who stay, and who become the people their colleagues call when something breaks.
Selected Work
Materials Science Literature RAG
A production RAG and agent system over materials-science corpora. Hybrid retrieval, a 95-case eval harness, and an agent that verifies its own citations against the sources, built solo and deployed live for around $12–20/month. The live app is invite-only; the full case study is public.
Test Report Automation
Lab scientists spent about four hours on each engineering test report. This tool drafts one in about twenty minutes and flags errors before a scientist reviews and signs off, saving over 500 hours a year.
See the project →
Raw Materials Certificate Processing
Reads supplier certificates of compliance as they arrive, pulls the data with citations to the source PDF, and checks each value against the right standard, so problems surface before the shipment does.
See the project →
Start Here
If you want to understand how I build AI for regulated, physical-world work, start with these. The first maps Lean Six Sigma onto machine learning; the others cover where AI actually pays off, and how I debug a retrieval system with evals.
When One Data Point Costs $600 and Ten Weeks
Machine learning assumes data is cheap and abundant. In manufacturing, a single dataset can cost thousands of dollars and take months to collect, and the lessons that matter most can take years to come back from the field. It turns out Lean Six Sigma has been solving the scarce-data problem for a century, and modern ML is converging on the same answers. Here's the map between the two, and why the real win is closing the loop over the long run.
We've Been Having the Wrong Conversation About AI
While the tech industry debates 10x productivity and headcount reduction, manufacturing, test labs, and traditional engineering firms are sitting on enormous untapped potential. The real AI story isn't about replacing people. It's about giving them the tools to close decades of digital debt and focus on work that actually matters.
Building a Search Engine That Actually Works:
I vibe coded a RAG system for a corpus of engineering handbooks and watched the search function return garbage. Here's how I diagnosed the root causes, ran a 40-configuration parameter sweep, benchmarked four backend architectures, and tested a knowledge-graph hypothesis — improving NDCG@5 by nearly 400% along the way.
Areas of Practice
The AI systems I build for engineering and quality teams, grounded in the fifteen years of materials and quality work they're built to serve.