
Deepgram Launches Flux Conversational Speech Recognition Model

Deepgram, a voice artificial intelligence platform provider, has launched Flux, a conversational speech recognition model for real-time voice agents.

Unlike traditional automatic speech recognition (ASR), which was built for transcription use cases like captions or meeting notes, Flux is trained to understand the nuances of dialogue. It doesn't just capture what was said; it knows when a speaker has finished, when to respond, and how to keep the flow of conversation natural and engaging.

Flux embeds turn-taking directly into recognition. Its conversation-aware recognition handles timing inside the model itself, with context-aware turn detection and native barge-in handling for fluid exchanges. Flux also offers ultra-low latency, with 260-millisecond end-of-turn detection, plus distinct events that support eager response generation before a turn is complete. Turn-complete transcripts and structured conversational cues replace client-side logic. Other features include Nova-3-level accuracy, GPU-efficient concurrency with 100+ streams per GPU, and predictable costs.
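To illustrate what eager response generation could look like on the agent side, here is a minimal Python sketch. It is not Deepgram's SDK or API: the event names (`eager_end_of_turn`, `turn_resumed`, `end_of_turn`), the `TurnEvent` and `AgentState` types, and the `generate_reply` stand-in are all hypothetical, chosen only to show the pattern of drafting a reply early, discarding it if the speaker keeps talking, and committing it once the turn is confirmed complete.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TurnEvent:
    # Hypothetical event shape; these names are illustrative, not Deepgram's API.
    kind: str            # "eager_end_of_turn", "turn_resumed", or "end_of_turn"
    transcript: str = ""


@dataclass
class AgentState:
    # Speculative reply as (source transcript, reply text), not yet spoken.
    draft: Optional[Tuple[str, str]] = None
    # Replies that were actually committed to the user.
    spoken: List[str] = field(default_factory=list)


def generate_reply(transcript: str) -> str:
    # Stand-in for a language-model call (assumption for this sketch).
    return f"reply to: {transcript}"


def handle(state: AgentState, event: TurnEvent) -> None:
    if event.kind == "eager_end_of_turn":
        # Start preparing a reply before the turn is confirmed complete.
        state.draft = (event.transcript, generate_reply(event.transcript))
    elif event.kind == "turn_resumed":
        # The speaker kept talking, so the speculative reply is discarded.
        state.draft = None
    elif event.kind == "end_of_turn":
        # Reuse the draft only if it was built from the final transcript.
        if state.draft is not None and state.draft[0] == event.transcript:
            reply = state.draft[1]
        else:
            reply = generate_reply(event.transcript)
        state.spoken.append(reply)
        state.draft = None


def run(events: List[TurnEvent]) -> AgentState:
    # Feed a sequence of events through the handler and return the final state.
    state = AgentState()
    for event in events:
        handle(state, event)
    return state
```

In this sketch, a sequence of an early end-of-turn signal, a resumption, a second early signal, and a confirmed end of turn yields exactly one spoken reply, built from the final transcript. The appeal of the pattern is that reply generation overlaps with the tail of the user's turn, while a resumed turn costs only a discarded draft.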

"Flux redefines what speech recognition can do for real-time AI," said Scott Stephenson, CEO and co-founder of Deepgram, in a statement. "For decades, ASR was built to listen and record. Flux is different; it listens, understands, and guides conversations with human-like timing. It's the foundation voice agents have been waiting for and is our latest milestone toward solving the Audio Turing Test."
