Redact Personally Identifiable Information (PII) from audio with Node.js

Personally identifiable information (PII) is any information that can be used to identify a specific individual. Laws such as HIPAA, GDPR, and CCPA regulate how PII is handled and who may access it.

Redacting PII from video and audio files is a common requirement for applications, for example, when handling a phone conversation between a doctor and a patient. Luckily, you can use AI to redact PII at scale.

In this tutorial, you’ll learn how to redact various PII categories, like medical conditions, email addresses, and credit card numbers, from audio and video files and their textual transcripts. Here you can see the final redacted transcript you’ll create, as well as the associated redacted audio file:

Good afternoon, MGK design. Hi. I’m looking to have plans drawn up for an addition in my house. Okay, let me have one of our architects return your call. May I have your name, please? My name is ####. ####. And your last name? My last name is #####. Would you spell that for me, please? # # # # # #. Okay, and your telephone number? Area code? ###-###-#### that’s ###-###-#### yes, ma’am. Is there a good time to reach you? That’s my cell, so he could catch me anytime on that. Okay, great. I’ll have him return your call as soon as possible. Great. Thank you very much. You’re welcome. Bye.

Architecture call redacted

1. Set up your development environment

First, install Node.js 18 or higher on your system.
Next, create a new project folder, change directories to it, and initialize a new Node.js project:

mkdir pii-redaction
cd pii-redaction
npm init -y

Open the package.json file and add "type": "module" to the list of properties:

{
  …
  "type": "module",
  …
}

This will tell Node.js to use the ES Module syntax for exporting and importing modules, and not to use the old CommonJS syntax.

Then, install the AssemblyAI JavaScript SDK, which makes it easier to interact with the AssemblyAI API:

npm install --save assemblyai

Next, you need an AssemblyAI API key, which you can find on your dashboard. If you don’t have an AssemblyAI account yet, you must first sign up. Once you’ve copied your API key, configure it as the ASSEMBLYAI_API_KEY environment variable on your machine:

# Mac/Linux:
export ASSEMBLYAI_API_KEY=<YOUR_KEY>

# Windows:
set ASSEMBLYAI_API_KEY=<YOUR_KEY>
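A missing or empty environment variable is easy to overlook until the first API call fails. As a minimal sketch, the check below fails fast with a clear message. The helper name requireApiKey is hypothetical and not part of the AssemblyAI SDK.

```javascript
// Hypothetical helper (not part of the AssemblyAI SDK): read the API key
// from an environment object and fail fast when it is missing or blank.
function requireApiKey(env = process.env) {
  const key = env.ASSEMBLYAI_API_KEY;
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error("ASSEMBLYAI_API_KEY is not set");
  }
  // Trim stray whitespace that can sneak in when copying the key.
  return key.trim();
}
```

You could then pass the result to the client, for example new AssemblyAI({ apiKey: requireApiKey() }).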

2. Transcribe your audio and video with PII redaction

Now that your development environment is ready, you can start transcribing your audio and video files. In this tutorial, you’ll use a short phone conversation between a man and an architecture firm. You’ll use the AssemblyAI SDK to transcribe an audio file that is publicly accessible via a URL. You can also specify video files or local files. Create a file called index.js and add the following code:

import { AssemblyAI } from "assemblyai";

// create AssemblyAI API client
const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });

// transcribe audio file with PII redaction enabled
const transcript = await client.transcripts.transcribe({
  audio: "https://storage.googleapis.com/aai-web-samples/architecture-call.mp3",
  redact_pii: true,
  redact_pii_policies: [
    "person_name",
    "phone_number",
  ],
  redact_pii_sub: "hash",
});

The code imports the AssemblyAI client class from the assemblyai module, instantiates the client with your API key, and finally transcribes the audio file with PII redaction configured. Here’s what the different options for client.transcripts.transcribe({…}) do:

audio: The audio or video to transcribe, given as a URL, local path, stream, or buffer.
redact_pii: Set to true to enable the PII redaction model.
redact_pii_policies: A list of the PII policies (categories) to redact.
redact_pii_sub: What to substitute for the PII in the transcript. This can be entity_name or hash.
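The options above can be assembled in one place so that a typo in the substitution mode is caught before any request is sent. This is a sketch only: buildRedactionParams and ALLOWED_SUBSTITUTIONS are hypothetical names introduced for illustration, and the two allowed values are simply the ones listed above.

```javascript
// Hypothetical helper: build the options object for
// client.transcripts.transcribe() using the four options described above.
const ALLOWED_SUBSTITUTIONS = ["hash", "entity_name"];

function buildRedactionParams(audio, policies, substitution = "hash") {
  if (!ALLOWED_SUBSTITUTIONS.includes(substitution)) {
    throw new Error(`redact_pii_sub must be one of: ${ALLOWED_SUBSTITUTIONS.join(", ")}`);
  }
  if (!Array.isArray(policies) || policies.length === 0) {
    throw new Error("redact_pii_policies must be a non-empty array");
  }
  return {
    audio,
    redact_pii: true,
    redact_pii_policies: [...policies], // copy so the caller's array is not shared
    redact_pii_sub: substitution,
  };
}

// Usage in index.js (assumes the client created earlier):
// const transcript = await client.transcripts.transcribe(
//   buildRedactionParams(audioUrl, ["person_name", "phone_number"], "entity_name")
// );
```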

If everything goes well, the transcript object will be populated with the redacted transcript text and many additional properties. However, you should check whether an error occurred and surface it instead of continuing.

Add the following lines of JavaScript:

// throw error if transcript status is error
if (transcript.status === "error") {
  throw new Error(transcript.error);
}

Now that you’re certain the transcript completed without an error, you can print out the redacted transcript:

console.log(transcript.text);
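Before relying on the printed transcript, you may want a quick local sanity check that no phone numbers slipped through. The sketch below is not an AssemblyAI feature. It is a hypothetical helper that only looks for the ###-###-#### digit pattern seen in this tutorial's sample call, so it will miss other phone number formats.

```javascript
// Hypothetical sanity check: return any substrings that still look like
// a US-style phone number (three digits, three digits, four digits).
function findUnredactedPhoneNumbers(text) {
  return text.match(/\b\d{3}-\d{3}-\d{4}\b/g) ?? [];
}

// Usage in index.js:
// const leaks = findUnredactedPhoneNumbers(transcript.text);
// if (leaks.length > 0) {
//   console.warn(`Found ${leaks.length} possible unredacted phone number(s)`);
// }
```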

3. Get the PII redacted audio

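You can also get the original audio with the PII redacted. Wherever PII is detected, the audio is replaced with a beep.

To do this, first add two options, redact_pii_audio and redact_pii_audio_quality, to the transcribe call:

import { AssemblyAI } from "assemblyai";

// create AssemblyAI API client
const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });

// transcribe audio file with PII redaction enabled for both text and audio
const transcript = await client.transcripts.transcribe({
  audio: "https://storage.googleapis.com/aai-web-samples/architecture-call.mp3",
  redact_pii: true,
  redact_pii_policies: ["person_name", "phone_number"],
  redact_pii_sub: "hash",
  redact_pii_audio: true,
  redact_pii_audio_quality: "mp3",
});

// throw error if transcript status is error
if (transcript.status === "error") {
  throw new Error(transcript.error);
}

console.log(transcript.text);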

When you set redact_pii_audio to true, AssemblyAI will create a redacted audio file for you. The redact_pii_audio_quality option controls the format of that audio file and can be mp3 or wav.

The redacted audio file will be temporarily accessible via a pre-signed URL for you to download.
Add the following import at the top of index.js, which you’ll need to save the file to disk:

import { writeFile } from "fs/promises";

Then add the following code to get the redacted audio file URL and download the file to disk:

const { redacted_audio_url } = await client.transcripts.redactions(transcript.id);

const redactedFileResponse = await fetch(redacted_audio_url);
await writeFile("./redacted-audio.mp3", redactedFileResponse.body);
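Because the pre-signed URL is only temporarily accessible, the download can fail, and fetch does not throw on HTTP error statuses. As a minimal sketch, the hypothetical helper below (ensureDownloadSucceeded is not part of the SDK) checks the response before anything is written to disk, so an error page is never saved as redacted-audio.mp3.

```javascript
// Hypothetical helper: fetch() resolves even for 4xx/5xx responses,
// so check response.ok before treating the body as audio data.
function ensureDownloadSucceeded(response) {
  if (!response.ok) {
    throw new Error(`Download failed with HTTP status ${response.status}`);
  }
  return response;
}

// Usage in index.js:
// const redactedFileResponse = ensureDownloadSucceeded(await fetch(redacted_audio_url));
// await writeFile("./redacted-audio.mp3", redactedFileResponse.body);
```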

4. Run the script

To run the script, go back to your shell and run:

node index.js

After a short while, you’ll see the redacted transcript printed to the console. It is the same text shown at the start of this tutorial.

As you can see, the PII covered by the specified policies was redacted in the transcript. You can see the unredacted transcript below for comparison:

Good afternoon, MGK design. Hi. I’m looking to have plans drawn up for an addition in my house. Okay, let me have one of our architects return your call. May I have your name, please? My name is John. John. And your last name? My last name is Lowry. Would you spell that for me, please? L o w e r y. Okay, and your telephone number? Area code? 610-265-1714 that’s 610-265-1714 yes, ma’am. Is there a good time to reach you? That’s my cell, so he could catch me anytime on that. Okay, great. I’ll have him return your call as soon as possible. Great. Thank you very much. You’re welcome. Bye.

You’ll also see a new file on disk, redacted-audio.mp3. Give it a listen to hear the PII bleeped out.

Note that a redacted audio file will be returned, even if you submitted a video file for transcription. In this case, you can use a tool like FFmpeg to replace the original audio in the file with the redacted version.
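One way to do that replacement is sketched below, assuming FFmpeg is installed and on your PATH. The helper buildFfmpegArgs and the file names in the usage comment are hypothetical. The arguments take the video stream from the first input and the audio stream from the second, and copy the video without re-encoding it.

```javascript
// Hypothetical helper: build FFmpeg arguments that keep the video stream
// of the original file and swap in the redacted audio track.
function buildFfmpegArgs(videoPath, redactedAudioPath, outputPath) {
  return [
    "-i", videoPath,         // input 0: original video
    "-i", redactedAudioPath, // input 1: redacted audio
    "-map", "0:v",           // take video from input 0
    "-map", "1:a",           // take audio from input 1
    "-c:v", "copy",          // copy the video stream as-is
    "-shortest",             // stop at the end of the shorter stream
    outputPath,
  ];
}

// Usage (file names are placeholders):
// import { spawn } from "child_process";
// spawn("ffmpeg", buildFfmpegArgs("call.mp4", "redacted-audio.mp3", "call-redacted.mp4"), {
//   stdio: "inherit",
// });
```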

Next steps

In this tutorial, you learned how to automatically redact PII from audio and video files using AssemblyAI and Node.js. You can check out our docs on PII redaction to learn more about it or browse some of our other AI models.
Alternatively, feel free to check out our blog or YouTube channel for educational content on AI and Machine Learning.
