GGML - AI at the edge

ggml is a tensor library for machine learning that enables large models and high performance on commodity hardware. It is used by llama.cpp and whisper.cpp.
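To give a feel for what that means in practice, here is a minimal sketch of using the ggml C API to build and evaluate a small compute graph on the CPU. The calls are taken from ggml.h, but the API evolves, so exact names and signatures may differ between versions.

```c
// Minimal sketch: multiply a 2x2 matrix by a 2-element vector with ggml.
// Function names follow the public C API in ggml.h; details may vary
// across ggml versions.
#include <stdio.h>
#include "ggml.h"

int main(void) {
    // all tensors and graph metadata live in a single pre-allocated buffer
    struct ggml_init_params params = {
        .mem_size   = 16 * 1024 * 1024, // 16 MB of scratch memory
        .mem_buffer = NULL,             // let ggml allocate it
        .no_alloc   = false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // declare the inputs: a 2x2 matrix and a 2-element vector
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 2, 2);
    struct ggml_tensor * b = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 2);

    // define the computation: y = a * b (matrix-vector product)
    struct ggml_tensor * y = ggml_mul_mat(ctx, a, b);

    // record the operations in a compute graph
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, y);

    // fill in the input data and run the graph on 4 CPU threads
    ggml_set_f32(a, 2.0f); // set every element of a to 2
    ggml_set_f32(b, 3.0f); // set every element of b to 3
    ggml_graph_compute_with_ctx(ctx, gf, 4);

    printf("y = [%f, %f]\n", ggml_get_f32_1d(y, 0), ggml_get_f32_1d(y, 1));

    ggml_free(ctx);
    return 0;
}
```

The same define-graph-then-compute pattern underlies the model implementations in llama.cpp and whisper.cpp, with quantized weight types in place of F32.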

Examples

Short voice command detection on a Raspberry Pi 4 using whisper.cpp

Simultaneously running 4 instances of 13B LLaMA + Whisper Small on a single M1 Pro

Running 7B LLaMA at 40 tok/s on M2 Max

Here are some sample performance stats on Apple Silicon (June 2023):

The ggml way

Projects

Contributing

Company

ggml.ai is a company founded by Georgi Gerganov to support the development of ggml. Nat Friedman and Daniel Gross provided the pre-seed funding.

We are currently seeking to hire full-time developers who share our vision and would like to help advance the idea of on-device inference. If you are interested and have already contributed to any of the related projects, please contact us at jobs@ggml.ai.

Business inquiries

For any business-related topics, including support or enterprise deployment, please contact us at sales@ggml.ai.