Introduction
llamadart is a Dart and Flutter plugin for running llama.cpp models from
GGUF files on both native and web targets.
Who this is for
- App developers building local-first AI features in Dart/Flutter.
- Teams that need OpenAI-style HTTP compatibility from local models.
- Maintainers who need predictable native/web runtime integration.
Core primitives
- LlamaEngine: stateless generation API.
- ChatSession: stateful chat wrapper over LlamaEngine.
- LlamaBackend: platform backend abstraction used by the engine.
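A minimal sketch of how these primitives might fit together. The import path and the method names used here (loadModel, generate, send) are illustrative assumptions for this sketch, not confirmed llamadart API; only the class names LlamaEngine and ChatSession come from the list above.

```dart
// Assumed import path for the plugin.
import 'package:llamadart/llamadart.dart';

Future<void> main() async {
  // Stateless one-shot generation through the engine.
  // loadModel/generate are assumed method names for illustration.
  final engine = LlamaEngine();
  await engine.loadModel('model.gguf');
  final text = await engine.generate('Write a haiku about Dart.');
  print(text);

  // Stateful multi-turn chat layered over the same engine;
  // the session, not the engine, tracks conversation history.
  final chat = ChatSession(engine);
  print(await chat.send('Hello!'));
  print(await chat.send('Say that again in French.'));
}
```

The split mirrors the table above: keep LlamaEngine free of conversation state so it can be shared, and let each ChatSession own its own history.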
Read by workflow
- First setup: Installation
- First inference: Quickstart
- Multi-turn chat: First Chat Session
- Function calling: Tool Calling
- Template diagnostics: Chat Templates and Parsing
- Template internals: Template Engine Internals
- LoRA runtime workflows: LoRA Adapters
- Performance work: Performance Tuning
- Platform/backend planning: Platform & Backend Matrix
- Upgrade planning: Upgrade Checklist
- Maintainer operations: Maintainer Overview