ArchiveBox archives web content including audio/video files. Adding speech-to-text for archived audio content would make it searchable alongside text content. FunASR (17.8K+ stars, https://github.com/modelscope/FunASR) provides:
- SenseVoice: Ultra-fast multilingual ASR (50x faster than Whisper-large)
- Paraformer: Production-grade ASR with timestamps and punctuation
- OpenAI-compatible API: POST /v1/audio/transcriptions
Use case: When archiving pages with embedded audio (podcasts, interviews, meeting recordings), FunASR can transcribe the audio and add the transcript to ArchiveBox's searchable index. This makes audio content as discoverable as text content.
Since ArchiveBox is self-hosted and FunASR also runs locally, they integrate naturally without external API dependencies.
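To make the idea concrete, here is a rough sketch of how an archived audio file might be sent to a locally running FunASR server through the `POST /v1/audio/transcriptions` endpoint mentioned above. Only the endpoint path comes from this issue. The host and port, the `sensevoice` model name, the `text` field in the JSON response (borrowed from the OpenAI response shape), and both function names are assumptions for illustration, not confirmed FunASR or ArchiveBox APIs.

```python
import json
import os
import urllib.request
import uuid

# Assumed local address; only the path is taken from the feature list above.
FUNASR_URL = "http://localhost:8000/v1/audio/transcriptions"


def build_transcription_request(audio_bytes, filename, url=FUNASR_URL, model="sensevoice"):
    """Build a multipart/form-data POST in the OpenAI transcription style.

    The model name is a placeholder; a real server may expect another value.
    """
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    closing = f"--{boundary}--\r\n".encode()
    body = model_part + file_header + audio_bytes + b"\r\n" + closing
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def transcribe(audio_path, url=FUNASR_URL):
    """Send one archived audio file to the server and return the transcript text."""
    with open(audio_path, "rb") as f:
        req = build_transcription_request(f.read(), os.path.basename(audio_path), url=url)
    with urllib.request.urlopen(req) as resp:
        # Assumes an OpenAI-style JSON body such as {"text": "..."}.
        return json.load(resp).get("text", "")
```

The returned text could then be stored next to the other snapshot outputs so that the existing search backend picks it up. How that hook would look inside ArchiveBox is left open here.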
Would adding FunASR transcription for archived audio be useful?