> **Warning**
>
> CapacitorLABS - This project is experimental. Support is not provided. Please open issues when needed.
Run large language models entirely on-device using Apple Intelligence (Foundation Models) on iOS and Gemini Nano on Android. No network requests, no API keys, no data leaving the device.
Note: On-device LLMs require physical hardware. Android emulators are not supported. iOS simulators are supported so long as the host device is capable of running Apple Intelligence and has it enabled.
```bash
npm install @capacitor/local-llm
npx cap sync
```

| Platform | Minimum OS | Notes |
|---|---|---|
| iOS | 15 | Image generation requires iOS 18.4+. Text LLM (Foundation Models / Apple Intelligence) requires iOS 26+. |
| Android | 10 (API 29) | Gemini Nano via ML Kit requires a device that supports on-device AI (e.g. Pixel 6+). |
No additional configuration is required. Foundation Models and Image Playground are system frameworks available automatically on supported devices with Apple Intelligence enabled.
Call `systemAvailability()` at runtime to check whether the model is ready before sending prompts.

On iOS 18, `systemAvailability()` returns `'unavailable'` for the text LLM. If `prompt()` or `warmup()` is called anyway, the promise will reject with an error. Image generation via `generateImage()` is fully functional on iOS 18.4+.
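In practice you can gate prompting behind that availability check. The sketch below is illustrative: the `LocalLLMLike` interface and `promptIfAvailable` helper are not part of the plugin, they just mirror the slice of its API described here so the pattern is self-contained (on a device you would pass the real `LocalLLM` object from `@capacitor/local-llm`):

```typescript
// Status strings as documented by this plugin.
type LLMAvailability = 'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding';

// Minimal stand-in for the plugin surface used here (illustrative only).
interface LocalLLMLike {
  systemAvailability(): Promise<{ status: LLMAvailability }>;
  prompt(options: { prompt: string }): Promise<{ text: string }>;
}

// Only send the prompt when the model is actually ready; on iOS 18 the
// status is always 'unavailable' for the text LLM, so this returns null.
async function promptIfAvailable(llm: LocalLLMLike, promptText: string): Promise<string | null> {
  const { status } = await llm.systemAvailability();
  if (status !== 'available') return null;
  const { text } = await llm.prompt({ prompt: promptText });
  return text;
}
```

This avoids surfacing a rejected promise to the user as an error when the device simply does not support the text LLM.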
Gemini Nano is distributed via Google Play Services and must be downloaded to the device before use. The model is not bundled with your app.
Call `systemAvailability()` to inspect the current state. If the status is `'downloadable'`, trigger the download with `download()` and poll `systemAvailability()` until the status becomes `'available'`.
```typescript
import { LocalLLM } from '@capacitor/local-llm';

const { status } = await LocalLLM.systemAvailability();
if (status === 'downloadable') {
  await LocalLLM.download();
  // Poll systemAvailability() until status === 'available'
}
```

- Text LLM requires iOS 26 and Apple Intelligence. On iOS 18, `systemAvailability()` returns `'unavailable'` for the text LLM, and `prompt()`/`warmup()` will reject.
- `download()` is not available on iOS. The model is managed by the OS; use `systemAvailability()` to check readiness.
- The context limit is 4096 tokens. This applies to the combined length of system instructions, conversation history, and the current prompt.
- `maximumOutputTokens` is clamped to 1–256 by the ML Kit API. Values outside this range will be coerced.
- Multi-turn session context is managed in memory by manually assembling conversation history into each prompt. It is not a native session API and does not persist across app restarts.
- `warmup()` ignores `sessionId` and `promptPrefix` on Android; it warms up the model globally.
- Not all Android 10+ devices support Gemini Nano. The device must have a compatible on-device AI chip (e.g. Pixel 6 and later).
- On-device models cannot be used while the app is in the background. Inference requests made while the app is backgrounded will fail.
- AICore enforces an inference quota per app. Making too many requests in a short period will result in a `BUSY` error response; consider exponential backoff when retrying. A `PER_APP_BATTERY_USE_QUOTA_EXCEEDED` error can be returned if an app exceeds a longer-duration quota (e.g. a daily limit).
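The `BUSY` quota behavior can be handled with a small retry wrapper. This is a sketch, not part of the plugin, and it assumes the quota error surfaces to JavaScript as an `Error` whose message contains `BUSY`; match the check to whatever your plugin actually rejects with:

```typescript
// Retry fn with exponential backoff when AICore reports a BUSY quota error.
// The message-substring check is an assumption about how the native error
// is surfaced; adjust it to your observed rejection shape.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      if (!message.includes('BUSY') || attempt >= maxRetries) throw err;
      // Wait 500 ms, 1 s, 2 s, ... before retrying
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}
```

Usage would look like `withBackoff(() => LocalLLM.prompt({ prompt }))`. A `PER_APP_BATTERY_USE_QUOTA_EXCEEDED` error should not be retried this way, since that quota resets on a much longer timescale.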
```typescript
import { LocalLLM } from '@capacitor/local-llm';

const { text } = await LocalLLM.prompt({
  prompt: 'Summarize the theory of relativity in one paragraph.',
});

console.log(text);
```

Use a `sessionId` to maintain context across multiple prompts.
```typescript
import { LocalLLM } from '@capacitor/local-llm';

const sessionId = 'my-chat-session';

await LocalLLM.prompt({
  sessionId,
  instructions: 'You are a helpful assistant.',
  prompt: 'What is the capital of France?',
});

const { text } = await LocalLLM.prompt({
  sessionId,
  prompt: 'What is the population of that city?',
});

// Clean up when done
await LocalLLM.endSession({ sessionId });
```

```typescript
import { LocalLLM } from '@capacitor/local-llm';

// Pre-initialize the model before the user starts typing
await LocalLLM.warmup({
  sessionId: 'my-session',
  promptPrefix: 'You are a customer support agent for Acme Corp.',
});
```

```typescript
import { LocalLLM } from '@capacitor/local-llm';

const { pngBase64Images } = await LocalLLM.generateImage({
  prompt: 'A serene mountain lake at sunrise, photorealistic',
  count: 2,
});

// Use directly in an <img> tag
const src = `data:image/png;base64,${pngBase64Images[0]}`;
```

- `systemAvailability()`
- `download()`
- `prompt(...)`
- `endSession(...)`
- `generateImage(...)`
- `warmup(...)`
- `addListener('systemAvailabilityChange', ...)`
- `removeAllListeners()`
- Interfaces
- Type Aliases
The main plugin interface for interacting with on-device LLMs.
`systemAvailability() => Promise<SystemAvailabilityResponse>`

Checks the availability status of the on-device LLM.

Use this method to determine if the LLM is ready to use, needs to be downloaded, or is unavailable on the device.
**Returns:** `Promise<SystemAvailabilityResponse>`

**Since:** 1.0.0
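For instance, a UI layer might map each status to a user-facing message. The wording below is illustrative, and the meanings attributed to `'notready'` and `'responding'` are assumptions based on the status names:

```typescript
// Status strings as documented by this plugin.
type LLMAvailability = 'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding';

// Illustrative mapping from availability status to a user-facing message.
function describeAvailability(status: LLMAvailability): string {
  switch (status) {
    case 'available':
      return 'Model ready';
    case 'downloadable':
      return 'Model needs to be downloaded';
    case 'notready':
      return 'Model is still initializing';
    case 'responding':
      return 'Model is busy with another request';
    case 'unavailable':
      return 'On-device LLM is not supported on this device';
  }
}
```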
`download() => Promise<void>`

Downloads the on-device LLM model.

This method initiates the download of the LLM model when it's not already present on the device. Only available on Android.
**Since:** 1.0.0
`prompt(options: PromptOptions) => Promise<PromptResponse>`

Sends a prompt to the on-device LLM and receives a response.

Use this method to interact with the LLM. You can optionally provide a `sessionId` to maintain conversation context across multiple prompts.
| Param | Type | Description |
|---|---|---|
| `options` | `PromptOptions` | The prompt options including the text prompt and optional configuration |
**Returns:** `Promise<PromptResponse>`

**Since:** 1.0.0
`endSession(options: EndSessionOptions) => Promise<void>`

Ends an active LLM session.

Use this method to clean up resources when you're done with a conversation session. This is important for managing memory and preventing resource leaks.
| Param | Type | Description |
|---|---|---|
| `options` | `EndSessionOptions` | The options containing the sessionId to end |
**Since:** 1.0.0
`generateImage(options: GenerateImageOptions) => Promise<GenerateImageResponse>`

Generates images from a text prompt using the on-device LLM.

Use this method to create images based on text descriptions. Optionally provide reference images to influence the generation. The generated images are returned as base64-encoded PNG strings in an array.
| Param | Type | Description |
|---|---|---|
| `options` | `GenerateImageOptions` | The image generation options including the prompt, optional reference images, and count |
**Returns:** `Promise<GenerateImageResponse>`

**Since:** 1.0.0
`warmup(options: WarmupOptions) => Promise<void>`

Warms up the on-device LLM for faster initial responses.

Use this method to pre-initialize the LLM with a prompt prefix, reducing latency for the first actual prompt. This is useful when you know in advance the type of prompts you'll be sending.
| Param | Type | Description |
|---|---|---|
| `options` | `WarmupOptions` | The warmup options including the prompt prefix |
**Since:** 1.0.0
`addListener(eventName: 'systemAvailabilityChange', listenerFunc: SystemAvailabilityChangeListener) => Promise<PluginListenerHandle>`

Registers a listener that is called whenever the on-device LLM availability status changes.

The listener is invoked with the new availability status each time it changes. Polling begins when the first listener is added and stops when all listeners are removed via `removeAllListeners()`.
| Param | Type | Description |
|---|---|---|
| `eventName` | `'systemAvailabilityChange'` | The event name to listen for |
| `listenerFunc` | `SystemAvailabilityChangeListener` | The callback invoked with the new availability status on each change |
**Returns:** `Promise<PluginListenerHandle>`

**Since:** 1.0.0
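A sketch of wiring the listener, e.g. to enable a chat UI once an Android model download completes. The `LocalLLMLike` interface below mirrors only the slice of the plugin used here so the wiring can be shown standalone; it is not part of the plugin API:

```typescript
// Status strings and listener signature as documented by this plugin.
type LLMAvailability = 'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding';
type Listener = (availability: LLMAvailability) => void;

interface HandleLike {
  remove(): Promise<void>;
}

// Minimal stand-in for the plugin surface used here (illustrative only).
interface LocalLLMLike {
  addListener(eventName: 'systemAvailabilityChange', listenerFunc: Listener): Promise<HandleLike>;
}

// Invoke onReady each time the status transitions to 'available'.
// Keep the returned handle and call handle.remove() when done.
async function watchUntilAvailable(llm: LocalLLMLike, onReady: () => void): Promise<HandleLike> {
  return llm.addListener('systemAvailabilityChange', (availability) => {
    if (availability === 'available') onReady();
  });
}
```

On a device you would pass the real `LocalLLM` object; remember that polling stops only once all listeners are removed.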
`removeAllListeners() => Promise<void>`

Removes all registered listeners for this plugin.

**SystemAvailabilityResponse**

Response containing the system availability status of the on-device LLM.
| Prop | Type | Description | Since |
|---|---|---|---|
| `status` | `LLMAvailability` | The current availability status of the LLM. | 1.0.0 |
**PromptResponse**

Response from the LLM after processing a prompt.
| Prop | Type | Description | Since |
|---|---|---|---|
| `text` | `string` | The text response generated by the LLM. | 1.0.0 |
**PromptOptions**

Options for sending a prompt to the LLM.
| Prop | Type | Description | Since |
|---|---|---|---|
| `sessionId` | `string` | Optional session identifier for maintaining conversation context. Provide the same sessionId across multiple prompts to maintain context. If not provided, each prompt is treated as independent. | 1.0.0 |
| `instructions` | `string` | System-level instructions to guide the LLM's behavior. Use this to set the role, tone, or constraints for the LLM's responses. | 1.0.0 |
| `options` | `LLMOptions` | Configuration options for controlling LLM inference behavior. | 1.0.0 |
| `prompt` | `string` | The text prompt to send to the LLM. | 1.0.0 |
**LLMOptions**

Configuration options for LLM inference behavior.
| Prop | Type | Description | Since |
|---|---|---|---|
| `temperature` | `number` | Controls randomness in the model's output. Higher values (e.g., 0.8) make output more random, while lower values (e.g., 0.2) make it more focused and deterministic. | 1.0.0 |
| `maximumOutputTokens` | `number` | The maximum number of tokens to generate in the response. On Android, this must be between 1 and 256. | 1.0.0 |
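Since Android coerces out-of-range values natively, you may prefer to clamp `maximumOutputTokens` in app code so the value you log and the value the model sees agree. A minimal sketch; the `normalizeOptions` helper is not part of the plugin:

```typescript
// Shape of the LLMOptions interface described above.
interface LLMOptions {
  temperature?: number;
  maximumOutputTokens?: number;
}

// Clamp maximumOutputTokens to the 1–256 range that the ML Kit API
// enforces on Android, leaving other options untouched.
function normalizeOptions(options: LLMOptions): LLMOptions {
  if (options.maximumOutputTokens === undefined) return options;
  const clamped = Math.min(256, Math.max(1, Math.round(options.maximumOutputTokens)));
  return { ...options, maximumOutputTokens: clamped };
}
```

The result can then be passed as the `options` field of a `prompt()` call.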
**EndSessionOptions**

Options for ending an active LLM session.
| Prop | Type | Description | Since |
|---|---|---|---|
| `sessionId` | `string` | The identifier of the session to end. This should match the sessionId used in previous prompt() calls. | 1.0.0 |
**GenerateImageResponse**

Response containing the generated image data.
| Prop | Type | Description | Since |
|---|---|---|---|
| `pngBase64Images` | `string[]` | Array of generated images as base64-encoded PNG strings. Each string contains raw base64 data (without data URI prefix). To use in an img tag, prefix with 'data:image/png;base64,'. | 1.0.0 |
**GenerateImageOptions**

Options for generating an image from a text prompt.
| Prop | Type | Description | Default | Since |
|---|---|---|---|---|
| `prompt` | `string` | The text prompt describing the image to generate. | | 1.0.0 |
| `promptImages` | `string[]` | Optional array of reference images to influence the generated output. Provide base64-encoded image strings (with or without data URI prefix) that will be used as visual context or inspiration for the image generation. This allows you to combine text and image concepts for more controlled output. | | 1.0.0 |
| `count` | `number` | The number of image variations to generate. Defaults to 1 if not specified. | `1` | 1.0.0 |
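Because `promptImages` accepts base64 strings with or without a data URI prefix, a small normalizer keeps inputs consistent when they come from mixed sources (e.g. canvas exports vs. stored raw base64). This helper is illustrative, not part of the plugin:

```typescript
// Strip a data URI prefix (e.g. "data:image/png;base64,") when present,
// returning raw base64 unchanged otherwise.
function toRawBase64(image: string): string {
  const match = /^data:image\/[a-z0-9.+-]+;base64,(.*)$/i.exec(image);
  return match ? match[1] : image;
}
```

Usage might look like `LocalLLM.generateImage({ prompt, promptImages: images.map(toRawBase64) })`.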
**WarmupOptions**

Options for warming up the on-device LLM.
| Prop | Type | Description | Since |
|---|---|---|---|
| `sessionId` | `string` | The session identifier for the warmup. This identifier will be associated with the warmed-up session, allowing you to use the same session for subsequent prompts. | 1.0.0 |
| `promptPrefix` | `string` | The prompt prefix to use for warming up the LLM. This text will be used to pre-initialize the model, reducing latency for subsequent prompts with similar prefixes. | 1.0.0 |
**PluginListenerHandle**

| Prop | Type |
|---|---|
| `remove` | `() => Promise<void>` |
**LLMAvailability**

Availability status of the on-device LLM.

`'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding'`
**SystemAvailabilityChangeListener**

Callback invoked when the on-device LLM availability status changes.

`(availability: LLMAvailability): void`