vLLM TPU Inference
Core Contributor | Google
Enabling high-performance, unified JAX and PyTorch LLM inference on TPUs to serve massive models with optimized memory and throughput.
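Serving a model through vLLM's offline Python API looks roughly like the sketch below. The model name is a placeholder, and the TPU backend is assumed to be selected by the installed vLLM build rather than by anything in the script itself.

```python
from vllm import LLM, SamplingParams

# Placeholder model; on a TPU host, the installed vLLM TPU build
# supplies the backend -- nothing TPU-specific is needed here.
llm = LLM(model="facebook/opt-125m")

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
outputs = llm.generate(["TPUs are well suited to LLM inference because"], params)

for out in outputs:
    print(out.outputs[0].text)
```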
PhD | AI Performance Engineer @ Google
Specializing in high-performance computing and scaling LLM inference on TPUs.
My name is Pate Motter. I hold a PhD in Computer Science with a core focus on High-Performance Computing (HPC).
Currently, I work as an AI Performance Engineer at Google, where my primary focus is optimizing LLM inference performance at scale on Tensor Processing Units (TPUs). My passion lies at the intersection of systems engineering and machine learning, pushing hardware to its limits for massive AI models.
An open-source LLM written in pure JAX, explicitly tailored and optimized to run at scale on Google Cloud TPUs.
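Not the project's actual code, but a minimal sketch of what "pure JAX" means in practice: a single causal self-attention sublayer written directly against jax.numpy, with the mask and a crude normalization spelled out. Shapes and the normalization choice here are illustrative.

```python
import jax
import jax.numpy as jnp

def attention(q, k, v):
    # Scaled dot-product attention for a single head.
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    # Causal mask: each position attends only to itself and earlier ones.
    mask = jnp.tril(jnp.ones(scores.shape, dtype=bool))
    scores = jnp.where(mask, scores, -jnp.inf)
    return jax.nn.softmax(scores, axis=-1) @ v

@jax.jit
def decoder_block(x, wq, wk, wv, wo):
    # Pre-norm self-attention sublayer with a residual connection.
    h = x / jnp.linalg.norm(x, axis=-1, keepdims=True)  # crude RMS-style norm
    out = attention(h @ wq, h @ wk, h @ wv) @ wo
    return x + out

key = jax.random.PRNGKey(0)
seq, d = 16, 64
x = jax.random.normal(key, (seq, d))
ws = [jax.random.normal(jax.random.fold_in(key, i), (d, d)) for i in range(4)]
print(decoder_block(x, *ws).shape)  # (16, 64)
```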
A technical series explaining complex Artificial Intelligence concepts in an approachable way, bridging the gap between theoretical machine learning and engineering practice.
A comprehensive guide outlining core principles for scaling machine learning models effectively using JAX.
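One of those core principles, expressing parallelism through sharding annotations rather than explicit communication, can be sketched with JAX's sharding API. The mesh layout, axis names, and array shapes below are illustrative, not taken from the guide.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# A 2D device mesh: one axis for data parallelism, one for model
# parallelism. reshape(1, -1) works for any device count, including CPU.
devices = np.array(jax.devices()).reshape(1, -1)
mesh = Mesh(devices, axis_names=("data", "model"))

# Shard activations along the data axis, weights along the model axis.
x = jax.device_put(jnp.ones((8, 512)), NamedSharding(mesh, P("data", None)))
w = jax.device_put(jnp.ones((512, 2048)), NamedSharding(mesh, P(None, "model")))

# jit propagates shardings; the matmul output ends up ("data", "model").
y = jax.jit(lambda a, b: a @ b)(x, w)
print(y.sharding)
```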
Documentation for Pallas, a kernel language embedded in JAX for writing explicitly-scheduled, high-performance custom operations on TPUs and GPUs.
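A minimal Pallas kernel, assuming a TPU runtime (pass interpret=True to pl.pallas_call to debug elsewhere): the kernel body reads and writes memory references explicitly, which is where the fine-grained control comes from.

```python
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_kernel(x_ref, y_ref, o_ref):
    # Kernel bodies read and write Refs explicitly; there is no implicit
    # scheduling -- you decide what each invocation touches.
    o_ref[...] = x_ref[...] + y_ref[...]

@jax.jit
def add(x, y):
    return pl.pallas_call(
        add_kernel,
        out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
        # interpret=True would run the kernel in pure JAX for debugging.
    )(x, y)

# (8, 128) matches the native TPU tile shape for float32.
x = jnp.ones((8, 128), jnp.float32)
print(add(x, x)[0, 0])  # 2.0
```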