Transcribe audio with Ruby using Universal-1

We recently announced our latest speech recognition model, Universal-1, which achieves state-of-the-art speech-to-text accuracy. Trained on millions of hours of audio data, Universal-1 demonstrates near-human accuracy, even with accented speech, background noise, and difficult phrases like flight numbers and email addresses.

Universal-1 is also an order of magnitude faster than our previous model, Conformer-2, and supports English, Spanish, French, and German, with more languages coming shortly.

Along with Universal-1, we've also introduced two new classes of models: Best and Nano. Best lets you take advantage of Universal-1 for applications where accuracy is paramount. Nano is our new cost-effective alternative with support for 99 different languages.

In this post, you'll learn how to transcribe an audio file in your Ruby applications using Universal-1 and Nano.

Set up the AssemblyAI Ruby SDK

The easiest way to transcribe audio is by using one of our official SDKs.

To install the AssemblyAI Ruby SDK, add the gem to your bundle and install the bundle:

bundle add assemblyai
bundle install

Create a new file named main.rb and configure an authenticated SDK client using the AssemblyAI API key from your account dashboard.

require 'assemblyai'

# Read the API key from the environment rather than hard-coding it.
client = AssemblyAI::Client.new(
  api_key: ENV['ASSEMBLYAI_API_KEY']
)

You'll find all the operations you need on the client instance.
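If the ASSEMBLYAI_API_KEY environment variable is not set, ENV['ASSEMBLYAI_API_KEY'] returns nil, and the problem may only surface later when a request is made. A minimal sketch of a fail-fast lookup follows; fetch_api_key is a hypothetical helper name, not part of the SDK.

```ruby
# Hypothetical helper (not part of the AssemblyAI SDK): read the API key
# from the environment and fail early with a clear message if it is missing.
def fetch_api_key(env = ENV)
  key = env.fetch('ASSEMBLYAI_API_KEY', '').strip
  raise KeyError, 'Set the ASSEMBLYAI_API_KEY environment variable' if key.empty?

  key
end

# Usage with the client configured above:
#   client = AssemblyAI::Client.new(api_key: fetch_api_key)
```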

Transcribe an audio file using Universal-1

All transcriptions use Best by default, so you'll always get the highest accuracy without any extra configuration.

Use the following code to transcribe an audio file from a URL using Best:

transcript = client.transcripts.transcribe(
  audio_url: "https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3"
)

raise transcript.error unless transcript.error.nil?

puts transcript.text
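The error check and the text lookup tend to travel together, so they can be wrapped once. This is a small sketch that assumes only the error and text attributes used above; transcript_text! is a hypothetical helper name, not an SDK method.

```ruby
# Hypothetical helper: return the transcript text, or raise if the
# transcript carries an error. Relies only on the `error` and `text`
# attributes used in the example above.
def transcript_text!(transcript)
  error = transcript.error
  raise "Transcription failed: #{error}" unless error.nil?

  transcript.text
end

# Usage:
#   puts transcript_text!(transcript)
```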

If you instead want to transcribe a local file, you can upload the file to AssemblyAI and pass the uploaded file URL to the transcribe method:

uploaded_file = client.files.upload(file: './audio.mp3')

transcript = client.transcripts.transcribe(audio_url: uploaded_file.upload_url)

raise transcript.error unless transcript.error.nil?

puts transcript.text
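The two cases above, a remote URL and a local file, differ only in whether an upload happens first. The sketch below handles both. resolve_audio_url is a hypothetical helper, and it assumes that anything not starting with http:// or https:// is a local file path.

```ruby
# Hypothetical helper: return a URL that can be passed as `audio_url`.
# Assumption: anything that is not an http(s) URL is a local file path.
def resolve_audio_url(client, source)
  return source if source.match?(%r{\Ahttps?://}i)

  # Local file: upload it first, as in the example above.
  client.files.upload(file: source).upload_url
end

# Usage:
#   transcript = client.transcripts.transcribe(
#     audio_url: resolve_audio_url(client, './audio.mp3')
#   )
```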

To run your application, configure your ASSEMBLYAI_API_KEY as an environment variable, and use the following command to execute the code:

ruby main.rb

Nano: a cost-effective alternative

Switching between Best and Nano is only a matter of setting the speech_model parameter. To use Nano, set speech_model to AssemblyAI::Transcripts::SpeechModel::NANO:

transcript = client.transcripts.transcribe(
  audio_url: "https://storage.googleapis.com/aai-web-samples/5_common_sports_injuries.mp3",
  speech_model: AssemblyAI::Transcripts::SpeechModel::NANO
)
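Because the model is just one keyword argument, the choice can be made at runtime. The sketch below builds the arguments for transcribe and leaves speech_model out when none is given, so the default (Best) applies. transcribe_params is a hypothetical helper name, not part of the SDK.

```ruby
# Hypothetical helper: build keyword arguments for `transcribe`.
# When no model is given, `speech_model` is omitted and the default applies.
def transcribe_params(audio_url, speech_model: nil)
  params = { audio_url: audio_url }
  params[:speech_model] = speech_model unless speech_model.nil?
  params
end

# Usage:
#   params = transcribe_params(url, speech_model: AssemblyAI::Transcripts::SpeechModel::NANO)
#   transcript = client.transcripts.transcribe(**params)
```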

Best, Nano and More with Audio Intelligence

We just transcribed audio with both the Best and Nano classes of models, with Best running on Universal-1.

AssemblyAI offers many more features beyond transcription, such as:

Entity detection to automatically identify and categorize key information.
Content moderation for detecting inappropriate content in audio files to ensure that your content is safe for all audiences.
PII redaction to minimize sensitive information about individuals by automatically identifying and removing it from your transcript.
LeMUR for applying Large Language Models (LLMs) to audio data in a single line of code.

You can also learn more about our approach to creating superhuman Speech AI models on our Research page.
