imonoonoko.BitLlama 0.16.0

Pure Rust LLM inference engine with 1.58-bit ternary support and Test-Time Training

BitLlama is a pure-Rust LLM inference engine featuring 1.58-bit ternary quantization, Test-Time Training (TTT), the Soul learning system, an MCP server/client, and private RAG. It supports Llama, Gemma, Mistral, Qwen, and BitNet models, and includes an OpenAI-compatible API server.
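For context, 1.58-bit ternary quantization (as popularized by the BitNet b1.58 paper) maps each weight to one of three values {-1, 0, +1} plus a per-tensor scale, which is where the "1.58-bit" figure comes from (log2(3) ≈ 1.58). The sketch below illustrates the general absmean scheme from that paper; BitLlama's actual kernels and storage format are not documented on this page, so treat this as an assumption-laden illustration, not the engine's implementation.

```python
def ternary_quantize(weights):
    """Absmean ternary quantization in the style of BitNet b1.58:
    divide by the mean absolute weight, then round and clamp each
    value to {-1, 0, +1}. Illustrative sketch only."""
    eps = 1e-8  # avoid division by zero for all-zero tensors
    gamma = sum(abs(w) for w in weights) / len(weights)  # mean absolute value
    scale = gamma + eps
    # RoundClip(w / scale, -1, 1): round to nearest integer, clamp to [-1, 1]
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

q, s = ternary_quantize([0.9, -0.05, -1.2, 0.3])
print(q)  # [1, 0, -1, 0]
```

Dequantization is simply `q[i] * scale`, which is why ternary inference can replace most multiplications with additions and sign flips.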

Command Line

Info

  • Last updated: February 18, 2026
  • Publisher: imonoonoko
  • License: MIT

Dependencies

No dependency information