KoboldCPP: Single-file executable for running GGUF models. Focused on story generation but general-purpose.
llamafile (Mozilla): Distributable single-file executables that run LLMs. No ...
Production-grade KV-cache and weight quantization for llama.cpp, with cross-backend kernel support for Apple Silicon, NVIDIA CUDA, AMD ROCm, and Vulkan.