GGML - AI at the edge

ggml is a tensor library for machine learning that enables large models and high performance on commodity hardware. It is used by llama.cpp and whisper.cpp.


Short voice command detection on a Raspberry Pi 4 using whisper.cpp


Simultaneously running 4 instances of 13B LLaMA + Whisper Small on a single M1 Pro


Running 7B LLaMA at 40 tok/s on M2 Max


Here are some sample performance stats on Apple Silicon (June 2023):

The ggml way



Company

The company was founded by Georgi Gerganov to support the development of ggml. Nat Friedman and Daniel Gross provided the pre-seed funding.

We are seeking full-time developers who share our vision and want to help advance on-device inference. If you are interested and have already contributed to any of the related projects, please contact us at

Business inquiries

For any business-related topics, including support or enterprise deployment, please contact us at