
AI that makes AI fast
Wafer builds AI agents that work as autonomous performance engineers, optimizing GPU kernels for AI inference. Our customers are chip companies and cloud providers who need their AI models running at peak performance on any type of hardware. Our founding team includes engineers from Google (Spanner, Gemini), Two Sigma, AWS, and Argonne National Lab, with NeurIPS publications in ML.
Wafer builds AI agents that work as autonomous performance engineers, optimizing GPU kernels for AI inference. Our product is serverless and dedicated inference for the world’s fastest open-source LLMs, with performance delivered by Wafer’s autonomous performance engineers.
Wafer pivoted from building AI optimization tools for chip companies to offering inference services for LLMs, shifting from B2B tools to infrastructure services.