Audio Transcriber

Speech-to-text with speaker labels

Convert spoken audio into accurate text with AI speech recognition. Supports 99+ languages, speaker diarization, word-level timestamps, and audio event tagging. Perfect for meetings, interviews, podcasts, and lectures.

1 skill, v0.02
Transcribe Audio

Transcribe an audio file from a URL to text using AI speech-to-text. Supports speaker diarization, word-level and character-level timestamps, audio event tagging, and automatic language detection for 99+ languages.

Returns: Full transcription text, word-level timing data with speaker labels, detected language, and confidence scores
Parameters
audio_url (string, required): Publicly accessible URL of the audio file to transcribe (MP3, WAV, M4A, FLAC, OGG, etc.)
language_code (string): ISO 639-1 language code (e.g. "en", "es", "fr"). If omitted, the language is auto-detected.
diarize (boolean): Enable speaker diarization to identify and label different speakers in the audio.
num_speakers (integer): Expected number of speakers (1-32). Improves diarization accuracy when the count is known.
timestamps_granularity (string): Granularity of timestamps in the response. "word" returns per-word timing; "character" returns per-character timing.
tag_audio_events (boolean): Tag non-speech audio events such as laughter, applause, music, and background noise.
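The optional parameters can be combined in one request. The sketch below is illustrative only, not an official client: it assumes the same endpoint and request envelope as the basic example on this page, and the helper name `build_payload` and the audio URL are made up for the example. It only builds and prints the JSON body; the commented `curl` line shows how the body would be sent.

```shell
# Hypothetical helper (not part of ToolRouter): builds a transcribe_audio
# request body with diarization, word timestamps, and event tagging enabled.
# Arguments: $1 = audio URL, $2 = expected number of speakers (1-32).
build_payload() {
  printf '{"tool":"audio-transcriber","skill":"transcribe_audio","input":{"audio_url":"%s","diarize":true,"num_speakers":%d,"timestamps_granularity":"word","tag_audio_events":true}}' \
    "$1" "$2"
}

# Example URL is a placeholder, not a real recording.
payload="$(build_payload "https://example.com/team-meeting.mp3" 3)"
echo "$payload"

# To send it (assumes TOOLROUTER_API_KEY is set in the environment):
# curl -H "Authorization: Bearer $TOOLROUTER_API_KEY" -d "$payload" \
#   https://api.toolrouter.com/v1/tools/call
```

Passing `num_speakers` only makes sense together with `diarize`; when the speaker count is unknown, the field can simply be left out of the body.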
Example
Basic transcription of an audio file
curl -H "Authorization: Bearer $TOOLROUTER_API_KEY" \
  -d '{
  "tool": "audio-transcriber",
  "skill": "transcribe_audio",
  "input": {
    "audio_url": "https://example.com/podcast-episode.mp3"
  }
}' \
  https://api.toolrouter.com/v1/tools/call
v0.02 (2026-03-22)
  • Added subtitle, expanded description, and agent instructions
v0.01 (2026-03-20)
  • Initial release

Quick Start

MCP (Claude Code)
claude mcp add --transport stdio \
  --env TOOLROUTER_API_KEY=YOUR_API_KEY \
  toolrouter -- npx -y toolrouter-mcp
REST API
curl -H "Authorization: Bearer $TOOLROUTER_API_KEY" \
  -d '{"tool":"audio-transcriber","skill":"transcribe_audio","input":{}}' \
  https://api.toolrouter.com/v1/tools/call

Use Cases

Transcribe Customer Support Calls

Convert customer support call recordings to searchable text for quality assurance, training, and compliance.

Tool: Audio Transcriber (4 agent guides)
Convert Voice Memos to Text

Turn voice memos and dictated notes into written text for easy searching, sharing, and organizing.

Tool: Audio Transcriber (4 agent guides)
Dub Marketing Videos

Translate and dub your marketing videos into multiple languages to reach international audiences.

Tool: Audio Dubber (4 agent guides)

Workflows

Audio Content Repurposing

Transform audio content into multilingual text and audio formats, informed by social media trends.

Tools: Audio Transcriber, Voice Generator, Translate, Social Media Content (4 steps, 4 tools)
Interview Post-Production

Streamline interview editing with speaker isolation, transcription, video editing, and custom graphics.

Tools: Audio Isolator, Audio Transcriber, Video Editor, Generate Image (4 steps, 4 tools)

Frequently Asked Questions

How many languages does it support?

It supports 99+ languages. Omit `language_code` to have the language auto-detected from the audio, or pass an ISO 639-1 code (e.g. "en") to set it explicitly.
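As a rough illustration of the two modes, the sketch below builds the `input` object with and without a language code. The helper name `build_input` and the example URLs are hypothetical; only the `audio_url` and `language_code` field names come from the parameter list on this page.

```shell
# Hypothetical helper (not part of ToolRouter): prints the "input" object.
# $1 = audio URL, $2 = optional ISO 639-1 code. When $2 is empty or missing,
# language_code is left out so the service auto-detects the language.
build_input() {
  if [ -n "${2:-}" ]; then
    printf '{"audio_url":"%s","language_code":"%s"}' "$1" "$2"
  else
    printf '{"audio_url":"%s"}' "$1"
  fi
}

# Auto-detect: no language_code field in the body.
auto_input="$(build_input "https://example.com/interview.mp3")"
echo "$auto_input"

# Explicit Spanish: language_code is included.
spanish_input="$(build_input "https://example.com/entrevista.mp3" "es")"
echo "$spanish_input"
```

Either object would go in the `input` field of the request body sent to the `transcribe_audio` skill.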

Can I get speaker labels and timestamps?

Yes. Set `diarize` to true for speaker labels, and set `timestamps_granularity` to "word" when you need subtitle-style timing.

Can it tag non-speech audio like laughter or applause?

Yes. Set `tag_audio_events` to true to capture sounds like laughter, applause, music, and background noise.

Should I clean noisy audio first?

If the recording is noisy, run it through `audio-isolator` first to improve transcription quality.