@Panmax Panmax commented Jan 29, 2026

Summary by CodeRabbit

  • New Features

    • Added audio denoising controls for Google Speech-to-Text: enable/disable denoising and set signal-to-noise ratio threshold to reduce background noise during transcription.
  • Chores

    • Updated minimum Google Cloud Speech client version requirement to ensure compatibility with the new denoising features.

### Summary

- Add `denoise_audio` and `snr_threshold` parameters to Google STT plugin to support audio denoising and SNR filtering
- Upgrade `google-cloud-speech` dependency from `>= 2` to `>= 2.33` to enable `DenoiserConfig` support

### Description

This PR adds support for Google Cloud Speech-to-Text's denoiser and SNR filtering features, which help improve transcription accuracy in noisy environments.

**New parameters:**

- `denoise_audio` (bool): Enables audio denoising to reduce background noise such as music, rain, or street traffic. Note: cannot remove background human voices.
- `snr_threshold` (float): Controls the minimum loudness of speech required for transcription. This helps filter out non-speech audio or background noise. Recommended values:
  - `10.0 - 100.0` when `denoise_audio=True`
  - `0.5 - 5.0` when `denoise_audio=False`
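
The recommended ranges above can be captured in a small helper. This is only a sketch: `recommended_snr_range` and `is_recommended_snr` are hypothetical names that do not exist in the plugin, and the bounds are simply the values quoted in this description.

```python
from __future__ import annotations


def recommended_snr_range(denoise_audio: bool) -> tuple[float, float]:
    """Return the (low, high) snr_threshold range quoted in the PR description."""
    # Hypothetical helper: the bounds are copied from the description, not from the API.
    return (10.0, 100.0) if denoise_audio else (0.5, 5.0)


def is_recommended_snr(snr_threshold: float, denoise_audio: bool) -> bool:
    """Check whether snr_threshold lies inside the recommended range (inclusive)."""
    low, high = recommended_snr_range(denoise_audio)
    return low <= snr_threshold <= high
```

A caller could use such a check to log a warning before constructing `STT`, rather than rejecting out-of-range values outright, since the ranges are described as recommendations.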

**Usage example:**

```python
from livekit.plugins.google import STT

stt = STT(
    model="chirp_3",
    location="us",
    denoise_audio=True,
    snr_threshold=20.0,  # medium sensitivity
)
```

### Changes

- `livekit-plugins/livekit-plugins-google/pyproject.toml`: Updated `google-cloud-speech` version requirement
- `livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py`:
  - Added `denoise_audio` and `snr_threshold` fields to `STTOptions`
  - Added `build_denoiser_config()` method to `STTOptions`
  - Updated `STT.__init__()` with new parameters
  - Updated `_build_recognition_config()` to include denoiser config for V2 API
  - Updated `SpeechStream._build_streaming_config()` to include denoiser config
  - Updated `update_options()` methods in both `STT` and `SpeechStream` classes
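
To illustrate the shape of the `build_denoiser_config()` change, here is a minimal, self-contained sketch. It does not reproduce the PR's code: `STTOptionsSketch`, `DenoiserConfigSketch`, and the `_NotGiven` sentinel are stand-ins for the plugin's `STTOptions`, `cloud_speech_v2.DenoiserConfig`, and `NOT_GIVEN`, and the sketch leaves out any check that the V2 API is in use.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional, Union


class _NotGiven:
    """Stand-in for the framework's NOT_GIVEN sentinel (assumption)."""


NOT_GIVEN = _NotGiven()


@dataclass
class DenoiserConfigSketch:
    """Stand-in for cloud_speech_v2.DenoiserConfig; field names are assumptions."""

    denoise_audio: bool = False
    snr_threshold: float = 0.0


@dataclass
class STTOptionsSketch:
    denoise_audio: Union[bool, _NotGiven] = NOT_GIVEN
    snr_threshold: Union[float, _NotGiven] = NOT_GIVEN

    def build_denoiser_config(self) -> Optional[DenoiserConfigSketch]:
        """Return a config only when at least one denoiser option was given."""
        kwargs: dict[str, Any] = {}
        if not isinstance(self.denoise_audio, _NotGiven):
            kwargs["denoise_audio"] = self.denoise_audio
        if not isinstance(self.snr_threshold, _NotGiven):
            kwargs["snr_threshold"] = self.snr_threshold
        # No options given: return None so callers can skip the denoiser field.
        return DenoiserConfigSketch(**kwargs) if kwargs else None
```

Returning `None` when neither option is set keeps the default request unchanged for existing users, which matches the backward-compatibility goal of giving the new fields `NOT_GIVEN` defaults.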

### References

- [Google Cloud Speech-to-Text Chirp 3 Documentation](https://cloud.google.com/speech-to-text/docs/models/chirp-3)
coderabbitai bot commented Jan 29, 2026

Walkthrough

Added denoise_audio and snr_threshold options to Google STT interfaces and options; implemented STTOptions.build_denoiser_config(); updated V2 recognition and streaming config assembly to inject DenoiserConfig when present; bumped google-cloud-speech minimum to 2.33.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Google STT Denoising & Options**<br>`livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py` | Added `denoise_audio: NotGivenOr[bool]` and `snr_threshold: NotGivenOr[float]` to `STTOptions`; new `build_denoiser_config()` returning `cloud_speech_v2.DenoiserConfig` … |
| **Dependency Update**<br>`livekit-plugins/livekit-plugins-google/pyproject.toml` | Raised `google-cloud-speech` lower bound to `>= 2.33, < 3` to require denoiser-related types/fields. |

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant STT as STT
  participant SpeechStream as SpeechStream
  participant GoogleAPI as GoogleAPI

  Client->>STT: create/update (denoise_audio, snr_threshold)
  STT->>STT: update STTOptions with denoising fields
  STT->>SpeechStream: propagate updated options
  SpeechStream->>STT: request recognition config
  STT->>STT: build_denoiser_config() -> DenoiserConfig?
  STT->>GoogleAPI: StreamingRecognitionConfig (RecognitionConfig + optional DenoiserConfig)
  GoogleAPI-->>SpeechStream: streaming transcripts/events
  SpeechStream-->>Client: deliver transcripts

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • Hormold

Poem

🐰
Ears perk up for cleaner sound,
Hopping bytes now swirl around.
Denoise whispers, SNR in tow,
Transcripts clear where circuits go.
Hoppity-hop — the streams glow.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: adding denoiser support (including `denoise_audio` and `snr_threshold` parameters) to the Google STT plugin. It is concise, specific, and clearly reflects the primary objective of the changeset. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@Panmax changed the title from "## Pull Request: Add denoiser support for Google STT plugin" to "Add denoiser support for Google STT plugin" on Jan 29, 2026
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py`:
- Around line 323-343: The diagnostic warns about using a bare dict annotation
for recognition_config_kwargs; update its type to a parameterized Mapping/Dict
with concrete key/value types (e.g., Dict[str, Any] or Mapping[str, object]) so
mypy strict mode doesn't infer Any; locate the variable
recognition_config_kwargs in the function that constructs and returns
cloud_speech_v2.RecognitionConfig (the block that builds
recognition_config_kwargs and returns
RecognitionConfig(**recognition_config_kwargs)) and change the annotation there
(and similarly for the other dict usage around the 555-576 region) to a properly
parameterized type such as Dict[str, object] or Mapping[str, Any].
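
The suggested annotation fix might look like the sketch below. `RecognitionConfigSketch` and `build_recognition_config` are hypothetical stand-ins for `cloud_speech_v2.RecognitionConfig` and the plugin's builder, used only so the example runs without the Google client installed. The point is the parameterized `dict[str, Any]` annotation, which also works on Python 3.9.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RecognitionConfigSketch:
    """Stand-in for cloud_speech_v2.RecognitionConfig (fields are illustrative)."""

    model: str
    denoiser_config: Optional[dict[str, Any]] = None


def build_recognition_config(
    model: str, denoiser_config: Optional[dict[str, Any]] = None
) -> RecognitionConfigSketch:
    # Parameterized annotation instead of a bare `dict`, so strict mypy
    # does not flag a generic type with missing type arguments.
    recognition_config_kwargs: dict[str, Any] = {"model": model}
    # Only include the denoiser field when a config was actually built.
    if denoiser_config is not None:
        recognition_config_kwargs["denoiser_config"] = denoiser_config
    return RecognitionConfigSketch(**recognition_config_kwargs)
```

The same annotation would apply to the second dict mentioned for the 555-576 region.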
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46d7057 and fd9ff2c.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py (2)
livekit-agents/livekit/agents/utils/misc.py (1)
  • is_given (25-26)
livekit-agents/livekit/agents/stt/stt.py (1)
  • model (115-124)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: unit-tests
  • GitHub Check: type-check (3.13)
  • GitHub Check: type-check (3.9)
🔇 Additional comments (5)
livekit-plugins/livekit-plugins-google/livekit/plugins/google/stt.py (5)

81-84: Good addition of denoise/SNR fields with NOT_GIVEN defaults.
Keeps backward compatibility while exposing the new options.


119-134: DenoiserConfig builder cleanly scoped to V2 + optional inputs.
The guardrails look solid and avoid V1 misuse.


156-193: Constructor docs and config wiring for denoise/SNR look solid.
Clear parameter behavior and correct propagation into STTOptions.

Also applies to: 247-248


435-480: Option updates now propagate denoise/SNR to active streams.
Looks consistent with the rest of the option updates.


518-547: SpeechStream option updates correctly carry denoise/SNR and trigger reconnect.
The update path stays coherent with other config changes.
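
As a rough illustration of the "update options, then reconnect" pattern noted above, the sketch below uses a hypothetical `SpeechStreamSketch` class. It is not the plugin's `SpeechStream`: `None` stands in for the `NOT_GIVEN` sentinel, and a plain boolean stands in for whatever reconnect signal the real stream uses.

```python
from __future__ import annotations

from typing import Optional


class SpeechStreamSketch:
    """Hypothetical stream that flags a reconnect when denoiser options change."""

    def __init__(self) -> None:
        self.denoise_audio: Optional[bool] = None
        self.snr_threshold: Optional[float] = None
        self.reconnect_requested = False

    def update_options(
        self,
        *,
        denoise_audio: Optional[bool] = None,
        snr_threshold: Optional[float] = None,
    ) -> None:
        # In this sketch None means "leave unchanged".
        updated = False
        if denoise_audio is not None:
            self.denoise_audio = denoise_audio
            updated = True
        if snr_threshold is not None:
            self.snr_threshold = snr_threshold
            updated = True
        if updated:
            # The streaming loop would rebuild its config and reconnect.
            self.reconnect_requested = True
```

A reconnect is needed in this pattern because the streaming config is sent once at the start of a session, so changed options only take effect on a new stream.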
