47 changes: 38 additions & 9 deletions server/services/stt/deepgram.mdx
@@ -190,15 +190,22 @@ The default `LiveOptions` are:

Parameters passed via the `params` constructor argument for `DeepgramFluxSTTService`.

| Parameter | Type | Default | Description |
| --------------------- | ------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `eager_eot_threshold` | `float` | `None` | EagerEndOfTurn threshold. Lower values trigger faster responses with more LLM calls; higher values are more conservative. `None` disables EagerEndOfTurn. |
| `eot_threshold` | `float` | `None` | End-of-turn confidence threshold (default 0.7). Lower = faster turn endings. |
| `eot_timeout_ms` | `int` | `None` | Time in ms after speech to finish a turn regardless of confidence (default 5000). |
| `keyterm` | `list` | `[]` | Key terms to boost recognition accuracy for specialized terminology. |
| `mip_opt_out` | `bool` | `None` | Opt out of Deepgram's Model Improvement Program. |
| `tag` | `list` | `[]` | Tags for request identification during usage reporting. |
| `min_confidence` | `float` | `None` | Minimum average confidence required to produce a `TranscriptionFrame`. |
| Parameter | Type | Default | Description | On-the-fly |
| --------------------- | ------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |
| `eager_eot_threshold` | `float` | `None` | EagerEndOfTurn threshold. Lower values trigger faster responses with more LLM calls; higher values are more conservative. `None` disables EagerEndOfTurn. | ✓ |
| `eot_threshold` | `float` | `None` | End-of-turn confidence threshold (default 0.7). Lower = faster turn endings. | ✓ |
| `eot_timeout_ms` | `int` | `None` | Time in ms after speech to finish a turn regardless of confidence (default 5000). | ✓ |
| `keyterm` | `list` | `[]` | Key terms to boost recognition accuracy for specialized terminology. | ✓ |
| `mip_opt_out` | `bool` | `None` | Opt out of Deepgram's Model Improvement Program. | |
| `tag` | `list` | `[]` | Tags for request identification during usage reporting. | |
| `min_confidence` | `float` | `None` | Minimum average confidence required to produce a `TranscriptionFrame`. | |

<Note>
Parameters marked with ✓ in the "On-the-fly" column can be updated mid-stream
using `STTUpdateSettingsFrame` without requiring a WebSocket reconnect. See
the [Updating Settings Mid-Stream](#updating-settings-mid-stream) example
below.
</Note>

### DeepgramSageMakerSTTService

@@ -292,6 +299,27 @@ stt = DeepgramFluxSTTService(
)
```

### Updating Settings Mid-Stream

The `keyterm`, `eot_threshold`, `eager_eot_threshold`, and `eot_timeout_ms` settings can be updated on the fly by queuing an `STTUpdateSettingsFrame`:

```python
from pipecat.frames.frames import STTUpdateSettingsFrame
from pipecat.services.deepgram.flux import DeepgramFluxSTTSettings

# During pipeline execution, update settings without reconnecting
await task.queue_frame(
    STTUpdateSettingsFrame(
        delta=DeepgramFluxSTTSettings(
            eot_threshold=0.8,
            keyterm=["Pipecat", "Deepgram"],
        )
    )
)
```

This sends a `Configure` message to Deepgram over the existing WebSocket connection, allowing you to adjust turn detection behavior and key terms without interrupting the conversation.
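
Only the four settings listed above are documented as updatable this way. As a minimal sketch, an application could partition a proposed update before queuing it. The helper `split_flux_update` and the constant `ON_THE_FLY_FIELDS` are hypothetical names introduced for illustration and are not part of Pipecat; the sketch works on plain dictionaries and makes no claim about how the service treats the remaining fields.

```python
# Hypothetical helper, not part of Pipecat: partitions a proposed settings
# update by whether this page documents the field as updatable mid-stream.
ON_THE_FLY_FIELDS = frozenset(
    {"keyterm", "eot_threshold", "eager_eot_threshold", "eot_timeout_ms"}
)


def split_flux_update(update: dict) -> tuple[dict, dict]:
    """Return (on_the_fly, other) sub-dicts of a proposed settings update."""
    on_the_fly = {k: v for k, v in update.items() if k in ON_THE_FLY_FIELDS}
    other = {k: v for k, v in update.items() if k not in ON_THE_FLY_FIELDS}
    return on_the_fly, other


on_the_fly, other = split_flux_update(
    {"eot_threshold": 0.8, "keyterm": ["Pipecat"], "tag": ["prod"]}
)
print(on_the_fly)  # {'eot_threshold': 0.8, 'keyterm': ['Pipecat']}
print(other)       # {'tag': ['prod']}
```

Fields that end up in `other` are simply the ones not marked as on-the-fly in the parameter table, so they are worth reviewing separately rather than assuming they will take effect over the existing connection.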

### SageMaker Service

```python
@@ -314,6 +342,7 @@ stt = DeepgramSageMakerSTTService(

- **Finalize on VAD stop**: When the pipeline's VAD detects the user has stopped speaking, `DeepgramSTTService` and `DeepgramSageMakerSTTService` send a [finalize](https://developers.deepgram.com/docs/finalize) request to Deepgram for faster final transcript delivery.
- **Flux turn management**: `DeepgramFluxSTTService` provides its own turn detection via `StartOfTurn`/`EndOfTurn` events and broadcasts `UserStartedSpeakingFrame`/`UserStoppedSpeakingFrame` directly. Use `ExternalUserTurnStrategies` to avoid conflicting VAD-based turn management.
- **Flux on-the-fly configuration**: `DeepgramFluxSTTService` supports updating `keyterm`, `eot_threshold`, `eager_eot_threshold`, and `eot_timeout_ms` mid-stream via `STTUpdateSettingsFrame`. These updates are sent as `Configure` messages over the existing WebSocket connection without requiring a reconnect.
- **EagerEndOfTurn**: In Flux, enabling `eager_eot_threshold` provides faster response times by predicting end-of-turn before it is confirmed. EagerEndOfTurn transcripts are pushed as `InterimTranscriptionFrame`s. If the user resumes speaking, a `TurnResumed` event is fired.
- **Deprecated vad_events**: The `vad_events` option in standard `DeepgramSTTService` is deprecated. Use Silero VAD instead.
- **SageMaker deployment**: The SageMaker service requires a Deepgram model deployed to an AWS SageMaker endpoint. See the [Deepgram SageMaker deployment guide](https://developers.deepgram.com/docs/deploy-amazon-sagemaker) for setup instructions.