Dart + Flutter local inference runtime

Build offline-ready AI features with llamadart

Documentation for product engineers and maintainers shipping local LLM features across Android, iOS, macOS, Linux, Windows, and web.

  • Single Dart API across native and browser targets
  • GGUF model lifecycle and streaming-first generation
  • OpenAI-compatible local server example included
Android · iOS · macOS · Linux · Windows · Web

Start here

Choose a path based on what you are shipping

Start in 10 minutes

Install, load a GGUF model, and stream your first response.

Open quickstart
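
A minimal sketch of that first run. The symbol names (Llama.load, generateStream, dispose) and the model path are placeholders, not necessarily llamadart's actual API; the quickstart has the real install steps and symbols.

```dart
import 'dart:io';

import 'package:llamadart/llamadart.dart';

Future<void> main() async {
  // Load a GGUF model from disk (placeholder path).
  final model = await Llama.load('models/llama-3.2-1b-q4_k_m.gguf');

  // Print tokens as the model streams them.
  await for (final token in model.generateStream('Write a haiku about Dart.')) {
    stdout.write(token);
  }

  // Release native resources when finished.
  await model.dispose();
}
```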

Ship chat and tools

Build tool calling, structured chat prompts, and streaming UX.

Read guides

Tune for production

Choose backends and tune context/runtime parameters.

Tune runtime

Run OpenAI-style server

Expose local models over HTTP for existing OpenAI clients.

See server example
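
Illustrative client call against the example server, assuming it listens locally and follows the standard OpenAI chat-completions shape; the host, port, and model name are made up for the sketch.

```dart
import 'dart:convert';

import 'package:http/http.dart' as http;

Future<void> main() async {
  // Placeholder endpoint; the server example documents the real address.
  final response = await http.post(
    Uri.parse('http://localhost:8080/v1/chat/completions'),
    headers: {'Content-Type': 'application/json'},
    body: jsonEncode({
      'model': 'local-gguf',
      'messages': [
        {'role': 'user', 'content': 'Summarize llamadart in one sentence.'},
      ],
    }),
  );

  // Standard OpenAI-style response shape: choices[0].message.content.
  final body = jsonDecode(response.body) as Map<String, dynamic>;
  print(body['choices'][0]['message']['content']);
}
```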

Core guides

Reference docs for real production workflows

Model lifecycle

Predictable loading/unloading flow and resource cleanup patterns.

Lifecycle guide
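
A sketch of the cleanup pattern the guide covers, using placeholder names (Llama.load, generate, dispose) rather than the documented API: dispose in a finally block so native memory is released even when generation throws.

```dart
import 'package:llamadart/llamadart.dart';

Future<void> runOnce(String prompt) async {
  // Placeholder symbols; the lifecycle guide documents the real API.
  final model = await Llama.load('models/example.gguf');
  try {
    final reply = await model.generate(prompt);
    print(reply);
  } finally {
    // Always release native resources, even if generation fails.
    await model.dispose();
  }
}
```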

Generation and streaming

Token streaming patterns for CLI apps, servers, and Flutter UIs.

Streaming guide
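
For Flutter UIs, the usual shape is a StreamBuilder over the accumulated response text; textStream here stands in for whatever stream the guide derives from llamadart's token output.

```dart
import 'package:flutter/material.dart';

// Renders a reply as it grows while tokens arrive.
class StreamingReply extends StatelessWidget {
  const StreamingReply({super.key, required this.textStream});

  // Stream of the response text accumulated so far (assumed, not llamadart's type).
  final Stream<String> textStream;

  @override
  Widget build(BuildContext context) {
    return StreamBuilder<String>(
      stream: textStream,
      builder: (context, snapshot) => Text(snapshot.data ?? ''),
    );
  }
}
```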

Multimodal

Image + text prompting with platform-specific constraints.

Multimodal guide

Platform matrix

Understand native/web support boundaries before shipping.

Support matrix

Performance tuning

Tune context length, threads, and generation settings safely.

Tuning guide

Troubleshooting

Fast fixes for model loading, runtime, and platform issues.

Debug issues

Maintainers

llamadart-specific maintenance and release operations

Maintainer overview

Repository ownership map and routine responsibilities.

Maintainer docs

Release checklist

Versioning, docs cut, and post-release verification sequence.

Release workflow