@capacitor/local-llm

Warning

CapacitorLABS - This project is experimental. Support is not provided. Please open issues when needed.

Run large language models entirely on-device using Apple Intelligence (Foundation Models) on iOS and Gemini Nano on Android. No network requests, no API keys, no data leaving the device.

Note: On Android, on-device LLMs require a physical device; emulators are not supported. iOS simulators are supported as long as the host Mac is capable of running Apple Intelligence and has it enabled.

Install

npm install @capacitor/local-llm
npx cap sync

Platform Requirements

Platform	Minimum OS	Notes
iOS	15	Image generation requires iOS 18.4+. Text LLM (Foundation Models / Apple Intelligence) requires iOS 26+.
Android	10 (API 29)	Gemini Nano via ML Kit requires a device that supports on-device AI (e.g. Pixel 6+).

iOS Setup

No additional configuration is required. Foundation Models and Image Playground are system frameworks available automatically on supported devices with Apple Intelligence enabled.

Call systemAvailability() at runtime to check whether the model is ready before sending prompts.

On iOS 18, systemAvailability() returns 'unavailable' for the text LLM. If prompt() or warmup() is called anyway, the promise rejects with an error. Image generation via generateImage() is fully functional on iOS 18.4+.
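
As a rough sketch of the runtime check described above, the helper below sends a prompt only when the status is 'available' and turns a rejection into a null result. The `TextLLM` interface and the `promptIfAvailable` name are illustrative and not part of the plugin; in an app you would pass the imported `LocalLLM` object, assuming it matches the signatures documented in the API section.

```typescript
// Minimal shape of the documented methods this sketch relies on.
// `TextLLM` is a hypothetical name, not a type exported by the plugin.
interface TextLLM {
  systemAvailability(): Promise<{ status: string }>;
  prompt(options: { prompt: string }): Promise<{ text: string }>;
}

// Returns the generated text, or null when the model is not ready
// or the prompt() promise rejects.
async function promptIfAvailable(llm: TextLLM, prompt: string): Promise<string | null> {
  const { status } = await llm.systemAvailability();
  if (status !== 'available') {
    return null;
  }
  try {
    const { text } = await llm.prompt({ prompt });
    return text;
  } catch (err) {
    console.warn('On-device prompt failed:', err);
    return null;
  }
}
```

Returning null keeps the calling code free of try/catch blocks; an app that needs to tell "not ready" apart from "failed" would want a richer result type.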

Android Setup

Gemini Nano is distributed via Google Play Services and must be downloaded to the device before use. The model is not bundled with your app.

Check availability and download

Call systemAvailability() to inspect the current state. If the status is 'downloadable', trigger the download with download() and poll systemAvailability() until the status becomes 'available'.

import { LocalLLM } from '@capacitor/local-llm';

const { status } = await LocalLLM.systemAvailability();

if (status === 'downloadable') {
  await LocalLLM.download();
  // Poll systemAvailability() until status === 'available'
}
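
The polling step left as a comment above could look something like the following. This is a minimal sketch: the `AvailabilitySource` interface, the `waitUntilAvailable` name, and the default interval and timeout values are assumptions chosen for illustration, not plugin API or recommended settings.

```typescript
// Hypothetical minimal shape; the plugin's LocalLLM object is assumed to satisfy it.
interface AvailabilitySource {
  systemAvailability(): Promise<{ status: string }>;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until the status is 'available'. Resolves to true on success and
// to false if the timeout elapses first.
async function waitUntilAvailable(
  source: AvailabilitySource,
  intervalMs = 2000, // arbitrary default, tune for your app
  timeoutMs = 10 * 60 * 1000, // arbitrary default, tune for your app
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const { status } = await source.systemAvailability();
    if (status === 'available') {
      return true;
    }
    await sleep(intervalMs);
  }
  return false;
}
```

After `await LocalLLM.download()`, calling `await waitUntilAvailable(LocalLLM)` would then report whether the model became ready within the timeout.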

Platform Limitations

iOS

Text LLM requires iOS 26 and Apple Intelligence. On iOS 18, systemAvailability() returns 'unavailable' for the text LLM and prompt() / warmup() will reject.
download() is not available on iOS. The model is managed by the OS; use systemAvailability() to check readiness.
Context limit is 4096 tokens. This applies to the combined length of system instructions, conversation history, and the current prompt.
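
If your app keeps its own transcript and passes it to the model, one way to stay under the 4096-token limit is to drop the oldest turns first. The sketch below is illustrative only: `trimHistoryToBudget` and `roughEstimate` are hypothetical names, and the default estimate of about four characters per token is a crude assumption, not the tokenizer Apple uses. Pass your own estimator if you have a better one.

```typescript
type TokenEstimator = (text: string) => number;

// Crude placeholder: assumes roughly 4 characters per token.
// This ratio is an assumption, not a property of the on-device model.
const roughEstimate: TokenEstimator = (text) => Math.ceil(text.length / 4);

// Keeps the most recent history entries that fit alongside the
// instructions and the current prompt within the given token limit.
function trimHistoryToBudget(
  instructions: string,
  history: string[],
  prompt: string,
  limit = 4096,
  estimate: TokenEstimator = roughEstimate,
): string[] {
  let used = estimate(instructions) + estimate(prompt);
  const kept: string[] = [];
  // Walk from newest to oldest so the latest turns survive.
  for (const turn of [...history].reverse()) {
    const cost = estimate(turn);
    if (used + cost > limit) {
      break;
    }
    kept.unshift(turn);
    used += cost;
  }
  return kept;
}
```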

Android

maximumOutputTokens is clamped to 1–256 by the ML Kit API. Values outside this range will be coerced.
Multi-turn session context is managed in-memory by manually assembling conversation history into each prompt. It is not a native session API and does not persist across app restarts.
warmup() ignores sessionId and promptPrefix on Android — it warms up the model globally.
Not all Android 10+ devices support Gemini Nano. The device must have a compatible on-device AI chip (e.g. Pixel 6 and later).
On-device models cannot be used while the app is in the background. Inference requests made while the app is backgrounded will fail.
AICore enforces an inference quota per app. Making too many requests in a short period results in a BUSY error response; consider exponential backoff when retrying. A PER_APP_BATTERY_USE_QUOTA_EXCEEDED error can be returned if an app exceeds a longer-duration quota (e.g. a daily limit).
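
The exponential backoff suggested for BUSY responses might be sketched as follows. How the plugin surfaces the BUSY error to JavaScript is not documented here, so `isBusyError` assumes the text "BUSY" appears in the rejection's `code` or `message`; verify that against real errors before relying on it. The helper names, delays, and attempt count are all illustrative.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Assumption: a BUSY rejection carries "BUSY" in its code or message.
function isBusyError(err: unknown): boolean {
  const e = err as { code?: unknown; message?: unknown } | null;
  return [e?.code, e?.message].some((v) => typeof v === 'string' && v.includes('BUSY'));
}

// Runs `task`, retrying with exponential backoff while it fails with a
// BUSY error. Any other error, or the final failed attempt, is rethrown.
async function retryOnBusy<T>(task: () => Promise<T>, maxAttempts = 4, baseMs = 500): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (!isBusyError(err) || attempt >= maxAttempts - 1) {
        throw err;
      }
      const delay = backoffDelayMs(attempt, baseMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Usage would be along the lines of `retryOnBusy(() => LocalLLM.prompt({ prompt }))`. Backoff does not help with the longer-duration PER_APP_BATTERY_USE_QUOTA_EXCEEDED quota, so that error is deliberately not retried.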

Usage

Basic prompt

import { LocalLLM } from '@capacitor/local-llm';

const { text } = await LocalLLM.prompt({
  prompt: 'Summarize the theory of relativity in one paragraph.',
});

console.log(text);

Multi-turn conversation

Use a sessionId to maintain context across multiple prompts.

import { LocalLLM } from '@capacitor/local-llm';

const sessionId = 'my-chat-session';

await LocalLLM.prompt({
  sessionId,
  instructions: 'You are a helpful assistant.',
  prompt: 'What is the capital of France?',
});

const { text } = await LocalLLM.prompt({
  sessionId,
  prompt: 'What is the population of that city?',
});

// Clean up when done
await LocalLLM.endSession({ sessionId });
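
To make sure endSession() runs even when a prompt rejects partway through a conversation, the cleanup can be placed in a finally block. The `withSession` wrapper and the `SessionLLM` interface below are hypothetical helpers written for this example, not plugin API; `LocalLLM` is assumed to satisfy the interface.

```typescript
// Hypothetical minimal shape of the documented session methods.
interface SessionLLM {
  prompt(options: { sessionId?: string; prompt: string }): Promise<{ text: string }>;
  endSession(options: { sessionId: string }): Promise<void>;
}

// Runs `run` with an `ask` function bound to the session, and always
// ends the session afterwards, whether `run` resolves or rejects.
async function withSession<T>(
  llm: SessionLLM,
  sessionId: string,
  run: (ask: (prompt: string) => Promise<string>) => Promise<T>,
): Promise<T> {
  const ask = async (prompt: string) => (await llm.prompt({ sessionId, prompt })).text;
  try {
    return await run(ask);
  } finally {
    await llm.endSession({ sessionId });
  }
}
```

A conversation then becomes `await withSession(LocalLLM, 'my-chat-session', async (ask) => { await ask('What is the capital of France?'); return ask('What is the population of that city?'); })`.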

Reduce first-response latency with warmup

import { LocalLLM } from '@capacitor/local-llm';

// Pre-initialize the model before the user starts typing
await LocalLLM.warmup({
  sessionId: 'my-session',
  promptPrefix: 'You are a customer support agent for Acme Corp.',
});

Image generation (iOS only)

import { LocalLLM } from '@capacitor/local-llm';

const { pngBase64Images } = await LocalLLM.generateImage({
  prompt: 'A serene mountain lake at sunrise, photorealistic',
  count: 2,
});

// Use directly in an <img> tag
const src = `data:image/png;base64,${pngBase64Images[0]}`;
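
If you need the raw bytes rather than a data URI, for example to write a file or upload the image, the base64 string can be decoded first. This is a small sketch using the standard `atob` function; `base64ToBytes` and `isPng` are illustrative helper names, and the signature check is only a sanity test against the standard 8-byte PNG header.

```typescript
// Decodes a raw base64 string (no data URI prefix) into bytes.
function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Returns true when the bytes begin with the PNG signature.
function isPng(bytes: Uint8Array): boolean {
  return PNG_SIGNATURE.every((byte, i) => bytes[i] === byte);
}
```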

API

systemAvailability()
download()
prompt(...)
endSession(...)
generateImage(...)
warmup(...)
addListener('systemAvailabilityChange', ...)
removeAllListeners()
Interfaces
Type Aliases

The main plugin interface for interacting with on-device LLMs.

systemAvailability()

systemAvailability() => Promise<SystemAvailabilityResponse>

Checks the availability status of the on-device LLM.

Use this method to determine if the LLM is ready to use, needs to be downloaded, or is unavailable on the device.

Returns: Promise<SystemAvailabilityResponse>

Since: 1.0.0

download()

download() => Promise<void>

Downloads the on-device LLM model.

This method initiates the download of the LLM model when it's not already present on the device. Only available on Android.

Since: 1.0.0

prompt(...)

prompt(options: PromptOptions) => Promise<PromptResponse>

Sends a prompt to the on-device LLM and receives a response.

Use this method to interact with the LLM. You can optionally provide a sessionId to maintain conversation context across multiple prompts.

Param	Type	Description
`options`	`PromptOptions`	- The prompt options including the text prompt and optional configuration

Returns: Promise<PromptResponse>

Since: 1.0.0

endSession(...)

endSession(options: EndSessionOptions) => Promise<void>

Ends an active LLM session.

Use this method to clean up resources when you're done with a conversation session. This is important for managing memory and preventing resource leaks.

Param	Type	Description
`options`	`EndSessionOptions`	- The options containing the sessionId to end

Since: 1.0.0

generateImage(...)

generateImage(options: GenerateImageOptions) => Promise<GenerateImageResponse>

Generates images from a text prompt on-device. Available on iOS only (iOS 18.4+).

Use this method to create images based on text descriptions. Optionally provide reference images to influence the generation. The generated images are returned as base64-encoded PNG strings in an array.

Param	Type	Description
`options`	`GenerateImageOptions`	- The image generation options including the prompt, optional reference images, and count

Returns: Promise<GenerateImageResponse>

Since: 1.0.0

warmup(...)

warmup(options: WarmupOptions) => Promise<void>

Warms up the on-device LLM for faster initial responses.

Use this method to pre-initialize the LLM with a prompt prefix, reducing latency for the first actual prompt. This is useful when you know in advance the type of prompts you'll be sending.

Param	Type	Description
`options`	`WarmupOptions`	- The warmup options including the prompt prefix

Since: 1.0.0

addListener('systemAvailabilityChange', ...)

addListener(eventName: 'systemAvailabilityChange', listenerFunc: SystemAvailabilityChangeListener) => Promise<PluginListenerHandle>

Registers a listener that is called whenever the on-device LLM availability status changes.

The listener is invoked with the new availability status each time it changes. Polling begins when the first listener is added and stops when all listeners are removed via removeAllListeners().

Param	Type	Description
`eventName`	`'systemAvailabilityChange'`	- The event name to listen for
`listenerFunc`	`SystemAvailabilityChangeListener`	- The callback invoked with the new availability status on each change

Returns: Promise<PluginListenerHandle>

Since: 1.0.0
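
As an illustration of the listener, the sketch below wraps it in a promise that resolves the first time the status changes to 'available' and then removes its own listener through the returned handle. `whenAvailable` and the two interfaces are hypothetical names that mirror the documented signature; `LocalLLM` is assumed to satisfy `AvailabilityEvents`.

```typescript
// Hypothetical minimal shapes mirroring the documented signature.
interface ListenerHandle {
  remove(): Promise<void>;
}
interface AvailabilityEvents {
  addListener(
    eventName: 'systemAvailabilityChange',
    listenerFunc: (availability: string) => void,
  ): Promise<ListenerHandle>;
}

// Resolves once the listener reports 'available', then removes the listener.
function whenAvailable(source: AvailabilityEvents): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    let handle: ListenerHandle | undefined;
    let done = false;
    source
      .addListener('systemAvailabilityChange', (availability) => {
        if (done || availability !== 'available') {
          return;
        }
        done = true;
        resolve();
        // The handle may not have arrived yet; the then() below covers that case.
        void handle?.remove();
      })
      .then((h) => {
        handle = h;
        if (done) {
          void h.remove();
        }
      })
      .catch(reject);
  });
}
```

Because the listener fires only when the status changes, check systemAvailability() first; if the model is already available, this promise would otherwise never resolve.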

removeAllListeners()

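removeAllListeners() => Promise<void>

Removes all listeners registered on this plugin. Availability polling stops once all listeners are removed.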

Interfaces

SystemAvailabilityResponse

Response containing the system availability status of the on-device LLM.

Prop	Type	Description	Since
`status`	`LLMAvailability`	The current availability status of the LLM.	1.0.0

PromptResponse

Response from the LLM after processing a prompt.

Prop	Type	Description	Since
`text`	`string`	The text response generated by the LLM.	1.0.0

PromptOptions

Options for sending a prompt to the LLM.

Prop	Type	Description	Since
`sessionId`	`string`	Optional session identifier for maintaining conversation context. Provide the same sessionId across multiple prompts to maintain context. If not provided, each prompt is treated as independent.	1.0.0
`instructions`	`string`	System-level instructions to guide the LLM's behavior. Use this to set the role, tone, or constraints for the LLM's responses.	1.0.0
`options`	`LLMOptions`	Configuration options for controlling LLM inference behavior.	1.0.0
`prompt`	`string`	The text prompt to send to the LLM.	1.0.0

LLMOptions

Configuration options for LLM inference behavior.

Prop	Type	Description	Since
`temperature`	`number`	Controls randomness in the model's output. Higher values (e.g., 0.8) make output more random, while lower values (e.g., 0.2) make it more focused and deterministic.	1.0.0
`maximumOutputTokens`	`number`	The maximum number of tokens to generate in the response. On Android, this must be between 1 and 256.	1.0.0
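
If you would rather apply the Android range explicitly than rely on coercion, a small helper can clamp the value before it is passed as maximumOutputTokens. `clampAndroidOutputTokens` is a hypothetical name, and rounding to a whole number is a choice made for this sketch.

```typescript
// Clamps a requested token count to the 1..256 range documented for Android.
function clampAndroidOutputTokens(requested: number): number {
  if (!Number.isFinite(requested)) {
    throw new RangeError('maximumOutputTokens must be a finite number');
  }
  return Math.min(256, Math.max(1, Math.round(requested)));
}
```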

EndSessionOptions

Options for ending an active LLM session.

Prop	Type	Description	Since
`sessionId`	`string`	The identifier of the session to end. This should match the sessionId used in previous prompt() calls.	1.0.0

GenerateImageResponse

Response containing the generated image data.

Prop	Type	Description	Since
`pngBase64Images`	`string[]`	Array of generated images as base64-encoded PNG strings. Each string contains raw base64 data (without data URI prefix). To use in an img tag, prefix with 'data:image/png;base64,'.	1.0.0

GenerateImageOptions

Options for generating an image from a text prompt.

Prop	Type	Description	Default	Since
`prompt`	`string`	The text prompt describing the image to generate.		1.0.0
`promptImages`	`string[]`	Optional array of reference images to influence the generated output. Provide base64-encoded image strings (with or without data URI prefix) that will be used as visual context or inspiration for the image generation. This allows you to combine text and image concepts for more controlled output.		1.0.0
`count`	`number`	The number of image variations to generate. Defaults to 1 if not specified.	`1`	1.0.0

WarmupOptions

Options for warming up the on-device LLM.

Prop	Type	Description	Since
`sessionId`	`string`	The session identifier for the warmup. This identifier will be associated with the warmed-up session, allowing you to use the same session for subsequent prompts.	1.0.0
`promptPrefix`	`string`	The prompt prefix to use for warming up the LLM. This text will be used to pre-initialize the model, reducing latency for subsequent prompts with similar prefixes.	1.0.0

PluginListenerHandle

Prop	Type
`remove`	`() => Promise<void>`

Type Aliases

LLMAvailability

Availability status of the on-device LLM.

'available' | 'unavailable' | 'notready' | 'downloadable' | 'responding'

SystemAvailabilityChangeListener

Callback invoked when the on-device LLM availability status changes.

(availability: LLMAvailability): void
