imonoonoko.BitLlama 0.16.0

Pure Rust LLM inference engine with 1.58-bit ternary support and Test-Time Training

BitLlama is a pure Rust LLM inference engine featuring 1.58-bit ternary quantization, Test-Time Training (TTT), a Soul learning system, an MCP server and client, and private RAG. It supports Llama, Gemma, Mistral, Qwen, and BitNet models and includes an OpenAI-compatible API server.

Command Line

Download links are listed for versions 0.16.0, 1.0.0, and 0.15.0.

