LiveKit is a powerful platform for building real-time audio and video applications. It builds on top of WebRTC and abstracts away the complicated details of real-time communication, allowing developers to rapidly build and deploy applications for video conferencing, livestreaming, interactive virtual events, and more.
Today, we’re excited to announce that AssemblyAI’s Streaming Speech-to-Text API is now available as an integration for LiveKit. This integration lets you add real-time transcription to your LiveKit applications, enabling use cases such as live captioning and more.
The integration is part of LiveKit’s AI Agents framework. Once you instantiate an AssemblyAI agent with assemblyai.STT(), you can create a stream that allows your agent to send audio to and receive transcripts from AssemblyAI’s Streaming Speech-to-Text API in real time.
From there you can do things like log the transcriptions on the server, or display them in your frontend application:
async def transcribe_track(participant: rtc.RemoteParticipant, track: rtc.Track):
    audio_stream = rtc.AudioStream(track)
    stt_impl = assemblyai.STT()
    stt_stream = stt_impl.stream()
    stt_forwarder = transcription.STTSegmentsForwarder(
        room=ctx.room, participant=participant, track=track
    )
    # Run tasks for audio input and transcription output in parallel
    # You decide what to do upon receiving transcriptions
    await asyncio.gather(
        _handle_audio_input(audio_stream, stt_stream),
        _handle_transcription_output(stt_stream, stt_forwarder),
    )
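The snippet above runs two coroutines side by side: one feeds audio into the stream, the other consumes transcripts as they arrive. Here is a minimal sketch of that pattern using only the Python standard library, so you can run it without LiveKit or an AssemblyAI API key. Everything in it is a hypothetical stand-in: FakeSTTStream, its push_frame and end_input methods, and the upper-casing "transcription" are assumptions made for illustration, not the actual plugin API.

```python
import asyncio


class FakeSTTStream:
    """Hypothetical stand-in for a speech-to-text stream (not the real plugin)."""

    def __init__(self) -> None:
        self._events: asyncio.Queue = asyncio.Queue()

    def push_frame(self, frame: str) -> None:
        # A real stream would send audio to the API. Here we fake a
        # transcript by upper-casing the text that stands in for audio.
        self._events.put_nowait(frame.upper())

    def end_input(self) -> None:
        # A None sentinel tells the consumer that no more input is coming.
        self._events.put_nowait(None)

    def __aiter__(self) -> "FakeSTTStream":
        return self

    async def __anext__(self) -> str:
        event = await self._events.get()
        if event is None:
            raise StopAsyncIteration
        return event


async def handle_audio_input(frames: list[str], stt_stream: FakeSTTStream) -> None:
    # Producer: push each frame, yielding to the event loop between frames.
    for frame in frames:
        stt_stream.push_frame(frame)
        await asyncio.sleep(0)
    stt_stream.end_input()


async def handle_transcription_output(stt_stream: FakeSTTStream, sink: list[str]) -> None:
    # Consumer: collect transcripts as they arrive. An application could
    # log them or forward them to a frontend here instead.
    async for text in stt_stream:
        sink.append(text)


async def transcribe(frames: list[str]) -> list[str]:
    stt_stream = FakeSTTStream()
    transcripts: list[str] = []
    # Run the producer and the consumer concurrently, as in the snippet above.
    await asyncio.gather(
        handle_audio_input(frames, stt_stream),
        handle_transcription_output(stt_stream, transcripts),
    )
    return transcripts


if __name__ == "__main__":
    print(asyncio.run(transcribe(["hello", "world"])))
```

Because both coroutines share one stream object, the consumer can start handling transcripts before the producer has finished sending audio, which is what makes the transcription feel live.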
Check out our blog on How to add real-time Speech-to-Text to your LiveKit application to learn how you can start integrating AssemblyAI’s Streaming Speech-to-Text API with LiveKit today, or go directly to the Agent repository to check out the code.
You can browse all of our integrations on the Integrations page of our Docs.