With the explosion of audio and video content available online, it is hard to ensure this content does not include swear words and other profanity. With profanity detection AI models, developers can automatically filter out offensive language at scale.
In this tutorial, you’ll learn how to use Node.js to filter profanity from audio files. By the end of this guide, you’ll be equipped to implement this functionality in just a few lines of code, enhancing both user experience and content compliance.
Here is the audio file you will be running profanity filtering on, along with the filtered output, where the asterisks represent harmful speech that has automatically been filtered:
Filtering profanity from audio and video files is easy as s*** with AssemblyAI.
Step 1: Set up your environment
First, install Node.js 18 or higher on your system.
Next, create a new project folder, change into it, and initialize a new Node.js project:
mkdir filter-profanity
cd filter-profanity
npm init -y
Open the package.json file and add "type": "module" to the list of properties:
{
  …
  "type": "module",
  …
}
Then, install the AssemblyAI JavaScript SDK, which lets you interact with the AssemblyAI API more easily:
npm install --save assemblyai
Next, get a free AssemblyAI API key here; or, if you already have one, you can copy your API key from your dashboard. Once you’ve copied your API key, configure it as the ASSEMBLYAI_API_KEY environment variable on your machine:
# Mac/Linux:
export ASSEMBLYAI_API_KEY=<YOUR_KEY>
# Windows (Command Prompt):
set ASSEMBLYAI_API_KEY=<YOUR_KEY>
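If the environment variable is missing, the failure usually only shows up later as an authentication error from the API. A minimal sketch of an early check is shown here; requireApiKey is a hypothetical helper written for this tutorial, not part of the AssemblyAI SDK:

```javascript
// Hypothetical helper (not part of the AssemblyAI SDK): fail early with a
// clear message when the API key environment variable is missing or blank.
function requireApiKey(env = process.env) {
  const key = env.ASSEMBLYAI_API_KEY;
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error(
      "ASSEMBLYAI_API_KEY is not set. Export it in your shell before running the script."
    );
  }
  // Trim stray whitespace that can sneak in when the key is pasted.
  return key.trim();
}

// Example usage in index.js:
// const client = new AssemblyAI({ apiKey: requireApiKey() });
```

Passing the environment object as a parameter keeps the helper easy to test without touching your real shell settings.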
Step 2: Transcribe and filter the audio file
Now that your environment is set up, you can submit an audio file for transcription with profanity filtering. For this tutorial, you’ll be using this example file. If you want to use your own file, you can use either a local file on your system or a remote file, as long as the remote file is available at a publicly accessible download URL. You can also use video files.
Create a file called index.js, and in the file, import the assemblyai package and create an AssemblyAI client.
import { AssemblyAI } from 'assemblyai';
// create AssemblyAI API client
const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });
Create a variable for the URL or the path to the audio file you want to filter profanity from:
// replace with local file path or your remote file
const audioFile = "https://storage.googleapis.com/aai-web-samples/profanity-filtering.mp3";
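If you expect to switch between the sample file and your own files, one option is to read the audio source from the command line instead of editing the script each time. The sketch below assumes a hypothetical helper named resolveAudioSource; it is an illustration, not something the SDK provides:

```javascript
// Sample file used in this tutorial.
const DEFAULT_AUDIO =
  "https://storage.googleapis.com/aai-web-samples/profanity-filtering.mp3";

// Hypothetical helper: use the first command-line argument as the audio
// source (a local path or a public URL), or fall back to the sample file.
function resolveAudioSource(argv = process.argv) {
  // argv[0] is the node binary and argv[1] is the script path,
  // so the first user-supplied argument is argv[2].
  const candidate = argv[2];
  if (typeof candidate === "string" && candidate.trim() !== "") {
    return candidate.trim();
  }
  return DEFAULT_AUDIO;
}

// Example usage in index.js:
// const audioFile = resolveAudioSource();
```

With this in place, running node index.js ./my-recording.mp3 would use your local file, while node index.js alone would use the sample.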
Transcribe the audio file with the filter_profanity option set to true:
// transcribe audio file with profanity filtering
const transcript = await client.transcripts.transcribe({
  audio: audioFile,
  filter_profanity: true
});
Step 3: Print the filtered text
You can print the profanity-filtered transcript text as follows:
// throw error if transcript status is error
if (transcript.status === "error") {
  throw new Error(transcript.error);
}
// print transcript text
console.log(transcript.text);
Save your file and execute it by running node index.js in the project directory.
You’ll see the profanity-filtered transcript printed to the terminal. If you used the default file from above, the output is:
Filtering profanity from audio and video files is easy as s*** with AssemblyAI.
The transcript contains a lot more information about the transcribed audio file, like word-level timestamps and more, which you can access through the object’s properties. Check out the AssemblyAI docs to learn more about Transcript objects and the other information you can get back from the AssemblyAI API.
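As one example of using the word-level data, the sketch below lists where masked words occur in the audio. It assumes each entry in transcript.words has text, start, and end fields with times in milliseconds, and that filtered words in that list contain asterisks just as they do in transcript.text; check the docs to confirm both before relying on it. The findMaskedWords helper and the sample data are illustrative, not real API output:

```javascript
// Hypothetical helper: collect the words that were masked by the profanity
// filter, together with their timestamps.
// Assumption: each word has `text`, `start`, and `end` (milliseconds),
// and masked words contain at least one asterisk.
function findMaskedWords(words = []) {
  return words
    .filter((word) => typeof word.text === "string" && word.text.includes("*"))
    .map((word) => ({ text: word.text, startMs: word.start, endMs: word.end }));
}

// Illustrative data shaped like a transcript's `words` array (made up for
// this example, not returned by the API).
const sampleWords = [
  { text: "easy", start: 2100, end: 2400 },
  { text: "as", start: 2400, end: 2550 },
  { text: "s***", start: 2550, end: 2900 },
];

console.log(findMaskedWords(sampleWords));

// Example usage with a real transcript:
// console.log(findMaskedWords(transcript.words ?? []));
```

Timestamps like these could be used to bleep or mute the matching segments of the original audio in a separate processing step.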