ionic-team/capacitor-local-llm

@capacitor/local-llm

Warning

CapacitorLABS - This project is experimental. Support is not provided. Please open issues when needed.

Run large language models entirely on-device using Apple Intelligence (Foundation Models) on iOS and Gemini Nano on Android. No network requests, no API keys, no data leaving the device.

Note: On Android, on-device LLMs require physical hardware; emulators are not supported. iOS simulators are supported as long as the host machine is capable of running Apple Intelligence and has it enabled.

Install

npm install @capacitor/local-llm
npx cap sync

Platform Requirements

| Platform | Minimum OS | Notes |
| --- | --- | --- |
| iOS | 15 | Image generation requires iOS 18.4+. Text LLM (Foundation Models / Apple Intelligence) requires iOS 26+. |
| Android | 10 (API 29) | Gemini Nano via ML Kit requires a device that supports on-device AI (e.g. Pixel 6+). |

iOS Setup

No additional configuration is required. Foundation Models and Image Playground are system frameworks available automatically on supported devices with Apple Intelligence enabled.

Call systemAvailability() at runtime to check whether the model is ready before sending prompts.

On iOS 18, systemAvailability() returns 'unavailable' for the text LLM. If prompt() or warmup() is called anyway, the promise rejects with an error. Image generation via generateImage() is fully functional on iOS 18.4+.

Android Setup

Gemini Nano is distributed via Google Play Services and must be downloaded to the device before use. The model is not bundled with your app.

Check availability and download

Call systemAvailability() to inspect the current state. If the status is 'downloadable', start the download with download(), then poll systemAvailability() until the status becomes 'available'.

import { LocalLLM } from '@capacitor/local-llm';

const { status } = await LocalLLM.systemAvailability();

if (status === 'downloadable') {
  await LocalLLM.download();
  // Poll systemAvailability() until status === 'available'
}
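The polling step left as a comment above could be filled in along the following lines. This is a minimal sketch, not part of the plugin: `ensureModelAvailable`, the `AvailabilitySource` interface, and the interval and timeout defaults are all hypothetical, and the helper takes the plugin object as a parameter so it can be exercised with a stand-in. It also assumes that download() resolves once the download has started rather than when it has finished, which is what the "download, then poll" instructions suggest.

```typescript
// Local view of the two plugin methods used here. In an app, pass the real
// `LocalLLM` object imported from '@capacitor/local-llm'.
type LLMAvailability = 'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding';

interface AvailabilitySource {
  systemAvailability(): Promise<{ status: LLMAvailability }>;
  download(): Promise<void>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: starts the download when needed, then polls until the
// model reports 'available'. Returns false if the model is 'unavailable' or
// the (assumed) timeout elapses first.
async function ensureModelAvailable(
  llm: AvailabilitySource,
  intervalMs = 2000, // assumed polling interval
  timeoutMs = 10 * 60 * 1000, // assumed upper bound on the wait
): Promise<boolean> {
  let { status } = await llm.systemAvailability();
  if (status === 'unavailable') return false;
  if (status === 'downloadable') await llm.download();

  const deadline = Date.now() + timeoutMs;
  while (status !== 'available') {
    if (status === 'unavailable' || Date.now() >= deadline) return false;
    await sleep(intervalMs);
    ({ status } = await llm.systemAvailability());
  }
  return true;
}
```

A fixed interval keeps the sketch short; the systemAvailabilityChange listener described in the API section may be a better fit when the app already reacts to events.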

Platform Limitations

iOS

  • Text LLM requires iOS 26 and Apple Intelligence. On iOS 18, systemAvailability() returns 'unavailable' for the text LLM and prompt() / warmup() will reject.
  • download() is not available on iOS. The model is managed by the OS; use systemAvailability() to check readiness.
  • Context limit is 4096 tokens. This applies to the combined length of system instructions, conversation history, and the current prompt.
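Because the plugin does not expose a token counter, an app that wants to stay under the 4096-token limit has to estimate. The sketch below uses a rough "about four characters per token" rule of thumb, which is an assumption and not the tokenizer Apple's model actually uses; `estimateTokens`, `fitsIosContext`, and the safety margin are hypothetical names and values chosen for illustration.

```typescript
const IOS_CONTEXT_LIMIT_TOKENS = 4096; // limit stated in the iOS notes above

// ASSUMPTION: ~4 characters per token. Real token counts depend on the
// model's tokenizer and on the language of the text.
function estimateTokens(text: string, charsPerToken = 4): number {
  return Math.ceil(text.length / charsPerToken);
}

// Returns true when the estimated size of instructions + history + prompt,
// plus a margin that absorbs estimation error, stays within the limit.
function fitsIosContext(parts: string[], safetyMarginTokens = 256): boolean {
  const used = parts.reduce((sum, part) => sum + estimateTokens(part), 0);
  return used + safetyMarginTokens <= IOS_CONTEXT_LIMIT_TOKENS;
}
```

When the check fails, an app might start a fresh session or summarize earlier turns; which strategy works best is outside what this README specifies.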

Android

  • maximumOutputTokens is clamped to 1–256 by the ML Kit API. Values outside this range will be coerced.
  • Multi-turn session context is managed in-memory by manually assembling conversation history into each prompt. It is not a native session API and does not persist across app restarts.
  • warmup() ignores sessionId and promptPrefix on Android — it warms up the model globally.
  • Not all Android 10+ devices support Gemini Nano. The device must have a compatible on-device AI chip (e.g. Pixel 6 and later).
  • On-device models cannot be used while the app is in the background. Inference requests made while the app is backgrounded will fail.
  • AICore enforces an inference quota per app. Making too many requests in a short period will result in a BUSY error response; consider exponential backoff when retrying. A PER_APP_BATTERY_USE_QUOTA_EXCEEDED error can be returned if an app exceeds a longer-duration quota (e.g. a daily limit).
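The exponential backoff suggested for BUSY errors might look like the sketch below. How the error surfaces in JavaScript is not documented here, so `isBusyError` assumes the rejection is an Error whose message contains the string "BUSY"; adjust it to whatever the plugin actually rejects with. `withBackoff` and its defaults are hypothetical.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// ASSUMPTION: a BUSY rejection is an Error whose message includes 'BUSY'.
// PER_APP_BATTERY_USE_QUOTA_EXCEEDED is deliberately not retried, since a
// longer-duration quota will not clear within a few seconds.
function isBusyError(err: unknown): boolean {
  return err instanceof Error && err.message.includes('BUSY');
}

// Hypothetical helper: retries `task` on BUSY errors, doubling the delay
// after each failed attempt (baseDelayMs, 2x, 4x, ...).
async function withBackoff<T>(
  task: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (!isBusyError(err) || attempt >= maxAttempts) throw err;
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

// Usage in an app (not executed here):
//   const { text } = await withBackoff(() => LocalLLM.prompt({ prompt: 'Hello' }));
```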

Usage

Basic prompt

import { LocalLLM } from '@capacitor/local-llm';

const { text } = await LocalLLM.prompt({
  prompt: 'Summarize the theory of relativity in one paragraph.',
});

console.log(text);

Multi-turn conversation

Use a sessionId to maintain context across multiple prompts.

import { LocalLLM } from '@capacitor/local-llm';

const sessionId = 'my-chat-session';

await LocalLLM.prompt({
  sessionId,
  instructions: 'You are a helpful assistant.',
  prompt: 'What is the capital of France?',
});

const { text } = await LocalLLM.prompt({
  sessionId,
  prompt: 'What is the population of that city?',
});

// Clean up when done
await LocalLLM.endSession({ sessionId });
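If a prompt rejects partway through a conversation, the endSession() call above is never reached. One way to guarantee cleanup is a try/finally wrapper, sketched here under the assumption that the plugin object is passed in explicitly; `withSession` and the `SessionApi` interface are hypothetical and only mirror the two methods used.

```typescript
// Local view of the plugin methods used here. In an app, pass the real
// `LocalLLM` object imported from '@capacitor/local-llm'.
interface SessionApi {
  prompt(options: { sessionId?: string; instructions?: string; prompt: string }): Promise<{ text: string }>;
  endSession(options: { sessionId: string }): Promise<void>;
}

// Hypothetical helper: runs a conversation and always ends the session,
// even when one of the prompts rejects.
async function withSession<T>(
  llm: SessionApi,
  sessionId: string,
  run: (ask: (prompt: string) => Promise<string>) => Promise<T>,
): Promise<T> {
  const ask = async (prompt: string) => (await llm.prompt({ sessionId, prompt })).text;
  try {
    return await run(ask);
  } finally {
    await llm.endSession({ sessionId });
  }
}

// Usage in an app (not executed here):
//   const population = await withSession(LocalLLM, 'my-chat-session', async (ask) => {
//     await ask('What is the capital of France?');
//     return ask('What is the population of that city?');
//   });
```

The `ask` callback only forwards the prompt text; a real wrapper would likely also accept instructions for the first turn.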

Reduce first-response latency with warmup

import { LocalLLM } from '@capacitor/local-llm';

// Pre-initialize the model before the user starts typing
await LocalLLM.warmup({
  sessionId: 'my-session',
  promptPrefix: 'You are a customer support agent for Acme Corp.',
});
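Since warmup() rejects when the text LLM is unavailable (for example on iOS 18), and warming up is only an optimization, an app may prefer a guarded call that never throws. The sketch below is one possible shape; `tryWarmup` and the `WarmupApi` interface are hypothetical, and the plugin object is passed in as a parameter.

```typescript
// Local view of the plugin methods used here. In an app, pass the real
// `LocalLLM` object imported from '@capacitor/local-llm'.
interface WarmupApi {
  systemAvailability(): Promise<{ status: string }>;
  warmup(options: { sessionId: string; promptPrefix: string }): Promise<void>;
}

// Hypothetical helper: warms up only when the model reports 'available' and
// reports success as a boolean instead of rejecting.
async function tryWarmup(llm: WarmupApi, sessionId: string, promptPrefix: string): Promise<boolean> {
  const { status } = await llm.systemAvailability();
  if (status !== 'available') return false;
  try {
    await llm.warmup({ sessionId, promptPrefix });
    return true;
  } catch {
    // Warmup only reduces latency; a failure here should not block the UI.
    return false;
  }
}
```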

Image generation (iOS only)

import { LocalLLM } from '@capacitor/local-llm';

const { pngBase64Images } = await LocalLLM.generateImage({
  prompt: 'A serene mountain lake at sunrise, photorealistic',
  count: 2,
});

// Use directly in an <img> tag
const src = `data:image/png;base64,${pngBase64Images[0]}`;
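When count is greater than 1, every returned image needs the same data URI prefix. A small helper (hypothetical, not part of the plugin) can convert the whole array at once and avoids indexing into an empty result:

```typescript
// Wraps each raw base64 PNG string in a data URI suitable for <img src>.
function toPngDataUris(pngBase64Images: string[]): string[] {
  return pngBase64Images.map((b64) => `data:image/png;base64,${b64}`);
}
```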

API

The main plugin interface for interacting with on-device LLMs.

systemAvailability()

systemAvailability() => Promise<SystemAvailabilityResponse>

Checks the availability status of the on-device LLM.

Use this method to determine if the LLM is ready to use, needs to be downloaded, or is unavailable on the device.

Returns: Promise<SystemAvailabilityResponse>

Since: 1.0.0


download()

download() => Promise<void>

Downloads the on-device LLM model.

This method initiates the download of the LLM model when it's not already present on the device. Only available on Android.

Since: 1.0.0


prompt(...)

prompt(options: PromptOptions) => Promise<PromptResponse>

Sends a prompt to the on-device LLM and receives a response.

Use this method to interact with the LLM. You can optionally provide a sessionId to maintain conversation context across multiple prompts.

| Param | Type | Description |
| --- | --- | --- |
| `options` | `PromptOptions` | The prompt options including the text prompt and optional configuration |

Returns: Promise<PromptResponse>

Since: 1.0.0


endSession(...)

endSession(options: EndSessionOptions) => Promise<void>

Ends an active LLM session.

Use this method to clean up resources when you're done with a conversation session. This is important for managing memory and preventing resource leaks.

| Param | Type | Description |
| --- | --- | --- |
| `options` | `EndSessionOptions` | The options containing the sessionId to end |

Since: 1.0.0


generateImage(...)

generateImage(options: GenerateImageOptions) => Promise<GenerateImageResponse>

Generates images from a text prompt using the on-device LLM.

Use this method to create images based on text descriptions. Optionally provide reference images to influence the generation. The generated images are returned as base64-encoded PNG strings in an array.

| Param | Type | Description |
| --- | --- | --- |
| `options` | `GenerateImageOptions` | The image generation options including the prompt, optional reference images, and count |

Returns: Promise<GenerateImageResponse>

Since: 1.0.0


warmup(...)

warmup(options: WarmupOptions) => Promise<void>

Warms up the on-device LLM for faster initial responses.

Use this method to pre-initialize the LLM with a prompt prefix, reducing latency for the first actual prompt. This is useful when you know in advance the type of prompts you'll be sending.

| Param | Type | Description |
| --- | --- | --- |
| `options` | `WarmupOptions` | The warmup options including the prompt prefix |

Since: 1.0.0


addListener('systemAvailabilityChange', ...)

addListener(eventName: 'systemAvailabilityChange', listenerFunc: SystemAvailabilityChangeListener) => Promise<PluginListenerHandle>

Registers a listener that is called whenever the on-device LLM availability status changes.

The listener is invoked with the new availability status each time it changes. Polling begins when the first listener is added and stops when all listeners are removed via removeAllListeners().

| Param | Type | Description |
| --- | --- | --- |
| `eventName` | `'systemAvailabilityChange'` | The event name to listen for |
| `listenerFunc` | `SystemAvailabilityChangeListener` | The callback invoked with the new availability status on each change |

Returns: Promise<PluginListenerHandle>

Since: 1.0.0
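As an alternative to polling by hand, the listener can be wrapped in a promise that resolves once the status becomes 'available'. This is a sketch under stated assumptions: `whenAvailable` and the local interfaces are hypothetical and only mirror the signatures documented above. Because the listener fires on changes only, call systemAvailability() first; if the model is already available, this promise would otherwise never resolve.

```typescript
type LLMAvailability = 'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding';

interface ListenerHandle {
  remove(): Promise<void>;
}

// Local view of the plugin method used here. In an app, pass the real
// `LocalLLM` object imported from '@capacitor/local-llm'.
interface AvailabilityEvents {
  addListener(
    eventName: 'systemAvailabilityChange',
    listenerFunc: (availability: LLMAvailability) => void,
  ): Promise<ListenerHandle>;
}

// Hypothetical helper: resolves on the first 'available' event and removes
// its own listener afterwards.
function whenAvailable(llm: AvailabilityEvents): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    let handle: ListenerHandle | undefined;
    let done = false;
    llm
      .addListener('systemAvailabilityChange', (availability) => {
        if (done || availability !== 'available') return;
        done = true;
        void handle?.remove();
        resolve();
      })
      .then((h) => {
        handle = h;
        if (done) void h.remove(); // the event arrived before the handle did
      }, reject);
  });
}
```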


removeAllListeners()

removeAllListeners() => Promise<void>

Interfaces

SystemAvailabilityResponse

Response containing the system availability status of the on-device LLM.

| Prop | Type | Description | Since |
| --- | --- | --- | --- |
| `status` | `LLMAvailability` | The current availability status of the LLM. | 1.0.0 |

PromptResponse

Response from the LLM after processing a prompt.

| Prop | Type | Description | Since |
| --- | --- | --- | --- |
| `text` | `string` | The text response generated by the LLM. | 1.0.0 |

PromptOptions

Options for sending a prompt to the LLM.

| Prop | Type | Description | Since |
| --- | --- | --- | --- |
| `sessionId` | `string` | Optional session identifier for maintaining conversation context. Provide the same sessionId across multiple prompts to maintain context. If not provided, each prompt is treated as independent. | 1.0.0 |
| `instructions` | `string` | System-level instructions to guide the LLM's behavior. Use this to set the role, tone, or constraints for the LLM's responses. | 1.0.0 |
| `options` | `LLMOptions` | Configuration options for controlling LLM inference behavior. | 1.0.0 |
| `prompt` | `string` | The text prompt to send to the LLM. | 1.0.0 |

LLMOptions

Configuration options for LLM inference behavior.

| Prop | Type | Description | Since |
| --- | --- | --- | --- |
| `temperature` | `number` | Controls randomness in the model's output. Higher values (e.g., 0.8) make output more random, while lower values (e.g., 0.2) make it more focused and deterministic. | 1.0.0 |
| `maximumOutputTokens` | `number` | The maximum number of tokens to generate in the response. On Android, this must be between 1 and 256. | 1.0.0 |
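Since out-of-range values are silently coerced on Android, an app may want to clamp maximumOutputTokens itself so the value it logs or displays matches what is actually used. The sketch below is hypothetical: `clampOutputTokensForAndroid` and `buildLLMOptions` are illustrative names, both props are assumed to be optional, and the platform is passed in as a plain string (in a Capacitor app it would typically come from `Capacitor.getPlatform()`).

```typescript
// Local copy of the documented shape; optionality of the props is an assumption.
interface LLMOptions {
  temperature?: number;
  maximumOutputTokens?: number;
}

const ANDROID_MIN_OUTPUT_TOKENS = 1;
const ANDROID_MAX_OUTPUT_TOKENS = 256;

// Rounds to an integer and clamps into the 1-256 range documented for Android.
function clampOutputTokensForAndroid(requested: number): number {
  return Math.min(ANDROID_MAX_OUTPUT_TOKENS, Math.max(ANDROID_MIN_OUTPUT_TOKENS, Math.round(requested)));
}

// Builds an options object, applying the Android clamp only on Android.
function buildLLMOptions(
  platform: 'ios' | 'android',
  temperature: number,
  maximumOutputTokens: number,
): LLMOptions {
  return {
    temperature,
    maximumOutputTokens:
      platform === 'android' ? clampOutputTokensForAndroid(maximumOutputTokens) : maximumOutputTokens,
  };
}
```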

EndSessionOptions

Options for ending an active LLM session.

| Prop | Type | Description | Since |
| --- | --- | --- | --- |
| `sessionId` | `string` | The identifier of the session to end. This should match the sessionId used in previous prompt() calls. | 1.0.0 |

GenerateImageResponse

Response containing the generated image data.

| Prop | Type | Description | Since |
| --- | --- | --- | --- |
| `pngBase64Images` | `string[]` | Array of generated images as base64-encoded PNG strings. Each string contains raw base64 data (without data URI prefix). To use in an img tag, prefix with 'data:image/png;base64,'. | 1.0.0 |

GenerateImageOptions

Options for generating an image from a text prompt.

| Prop | Type | Description | Default | Since |
| --- | --- | --- | --- | --- |
| `prompt` | `string` | The text prompt describing the image to generate. | | 1.0.0 |
| `promptImages` | `string[]` | Optional array of reference images to influence the generated output. Provide base64-encoded image strings (with or without data URI prefix) that will be used as visual context or inspiration for the image generation. This allows you to combine text and image concepts for more controlled output. | | 1.0.0 |
| `count` | `number` | The number of image variations to generate. Defaults to 1 if not specified. | 1 | 1.0.0 |

WarmupOptions

Options for warming up the on-device LLM.

| Prop | Type | Description | Since |
| --- | --- | --- | --- |
| `sessionId` | `string` | The session identifier for the warmup. This identifier will be associated with the warmed-up session, allowing you to use the same session for subsequent prompts. | 1.0.0 |
| `promptPrefix` | `string` | The prompt prefix to use for warming up the LLM. This text will be used to pre-initialize the model, reducing latency for subsequent prompts with similar prefixes. | 1.0.0 |

PluginListenerHandle

| Prop | Type |
| --- | --- |
| `remove` | `() => Promise<void>` |

Type Aliases

LLMAvailability

Availability status of the on-device LLM.

'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding'
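An exhaustive switch over this union lets the compiler flag any status the app forgets to handle. The mapping below is only a sketch: the `NextStep` names are hypothetical, and the meanings given to 'notready' and 'responding' in the comments are assumptions, since this README does not define them.

```typescript
type LLMAvailability = 'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding';

// Hypothetical app-level actions; these names are not part of the plugin.
type NextStep = 'prompt' | 'download' | 'wait' | 'hide-feature';

function nextStepFor(status: LLMAvailability): NextStep {
  switch (status) {
    case 'available':
      return 'prompt';
    case 'downloadable':
      return 'download';
    case 'notready': // ASSUMED: the model is still being prepared
    case 'responding': // ASSUMED: the model is busy with another request
      return 'wait';
    case 'unavailable':
      return 'hide-feature';
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new
      // status is added to the union without a matching case.
      const unreachable: never = status;
      return unreachable;
    }
  }
}
```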

SystemAvailabilityChangeListener

Callback invoked when the on-device LLM availability status changes.

(availability: LLMAvailability): void
