
    Redact Personal Identifiable Information (PII) from audio with Node.js

    June 12, 2024

    Personally Identifiable Information (PII) is any information that can be used to identify a specific individual. How PII is handled, and who may access it, is regulated by laws such as HIPAA, GDPR, and CCPA.

    Applications commonly need to redact PII from video and audio files, for example from a recorded phone conversation between a doctor and a patient. Luckily, you can use AI to redact PII at scale.

    In this tutorial, you’ll learn how to redact various PII categories, like medical conditions, email addresses, and credit card numbers, from audio and video files and their textual transcripts. Here you can see the final redacted transcript you’ll create, as well as the associated redacted audio file:

    Good afternoon, MGK design. Hi. I’m looking to have plans drawn up for an addition in my house. Okay, let me have one of our architects return your call. May I have your name, please? My name is ####. ####. And your last name? My last name is #####. Would you spell that for me, please? # # # # # #. Okay, and your telephone number? Area code? ###-###-#### that’s ###-###-#### yes, ma’am. Is there a good time to reach you? That’s my cell, so he could catch me anytime on that. Okay, great. I’ll have him return your call as soon as possible. Great. Thank you very much. You’re welcome. Bye.

    [Audio: Architecture call redacted (0:49)]

    1. Set up your development environment

    First, install Node.js 18 or higher on your system.
    Next, create a new project folder, change directories to it, and initialize a new Node.js project:

    mkdir pii-redaction
    cd pii-redaction
    npm init -y

    Open the package.json file and add "type": "module" to the list of properties:

    {
      …
      "type": "module",
      …
    }

    This tells Node.js to use ES module syntax for importing and exporting modules instead of the older CommonJS syntax. ES modules also allow the top-level await used later in this tutorial.

    Then, install the AssemblyAI JavaScript SDK, which makes it easier to interact with the AssemblyAI API:

    npm install --save assemblyai

    Next, you need an AssemblyAI API key that you can find on your dashboard. If you don’t have an AssemblyAI account yet, you must first sign up. Once you’ve copied your API key, configure it as the ASSEMBLYAI_API_KEY environment variable on your machine:

    # Mac/Linux:
    export ASSEMBLYAI_API_KEY=<YOUR_KEY>

    # Windows (Command Prompt):
    set ASSEMBLYAI_API_KEY=<YOUR_KEY>
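
If the variable is missing, the problem only shows up later as a failed API request, so it can help to fail fast when the script starts. The following is a minimal sketch; `requireEnv` is a hypothetical helper name introduced here for illustration and is not part of the AssemblyAI SDK.

```javascript
// Hypothetical helper (not part of the AssemblyAI SDK): read a required
// environment variable and fail with a clear message if it is missing.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Environment variable ${name} is not set`);
  }
  return value.trim();
}

// Possible usage when creating the API client in index.js:
// const client = new AssemblyAI({ apiKey: requireEnv("ASSEMBLYAI_API_KEY") });
```

Passing the environment as a second argument keeps the helper easy to try out without touching the real process environment.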

    2. Transcribe your audio and video with PII redaction

    Now that your development environment is ready, you can start transcribing your audio and video files. In this tutorial, you’ll use a short phone conversation between a man and an architecture firm. You’ll use the AssemblyAI SDK to transcribe an audio file that is publicly accessible via a URL. You can also specify video files or local files. Create a file called index.js and add the following code:

    import { AssemblyAI } from 'assemblyai';

    // create AssemblyAI API client
    const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });

    // transcribe audio file with PII redaction enabled
    const transcript = await client.transcripts.transcribe({
      audio: "https://storage.googleapis.com/aai-web-samples/architecture-call.mp3",
      redact_pii: true,
      redact_pii_policies: [
        "person_name",
        "phone_number",
      ],
      redact_pii_sub: "hash",
    });

    The code imports the AssemblyAI client class from the assemblyai module, instantiates the client with your API key, and finally transcribes the audio file with PII redaction configured. Here’s what the different options for client.transcripts.transcribe({…}) do:

    audio: The audio or video to transcribe, given as a URL, local path, stream, or buffer.
    redact_pii: Set to true to enable the PII redaction model.
    redact_pii_policies: A list of PII policies to redact.
    redact_pii_sub: What to substitute for the PII in the transcript. This can be entity_name or hash.
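
The options above can be assembled and validated before the request is sent. This is only a sketch: `buildRedactionParams` and `SUBSTITUTIONS` are hypothetical names, and the validation covers just the two substitution modes and the two policies named in this tutorial.

```javascript
// Hypothetical helper: assemble the transcribe() parameters described above.
// Only the two substitution modes named in this tutorial are accepted.
const SUBSTITUTIONS = ["hash", "entity_name"];

function buildRedactionParams(audio, policies, substitution = "hash") {
  if (!Array.isArray(policies) || policies.length === 0) {
    throw new Error("At least one PII policy is required");
  }
  if (!SUBSTITUTIONS.includes(substitution)) {
    throw new Error(`redact_pii_sub must be one of: ${SUBSTITUTIONS.join(", ")}`);
  }
  return {
    audio,
    redact_pii: true,
    redact_pii_policies: [...policies],
    redact_pii_sub: substitution,
  };
}

const params = buildRedactionParams(
  "https://storage.googleapis.com/aai-web-samples/architecture-call.mp3",
  ["person_name", "phone_number"],
);
console.log(params);
```

The resulting object has the same shape as the literal passed to client.transcripts.transcribe({…}), so it could be passed in its place.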

    If everything goes well, the transcript object will be populated with the redacted transcript text and many additional properties. However, you should check whether the transcription failed and surface the error.

    Add the following lines of JavaScript:

    // throw error if transcript status is error
    if (transcript.status === "error") {
      throw new Error(transcript.error);
    }

    Now that you’re certain that the transcript is completed without an error, you can print out the redacted transcript:

    console.log(transcript.text);
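
Before trusting the output, a quick automated check can flag anything that still looks like a phone number. The sketch below is an illustration only: `findPhoneLikeStrings` is a hypothetical helper, the sample strings are made up, and the pattern covers only the 3-3-4 digit layout with hyphens, so an empty result does not prove that all PII is gone.

```javascript
// Hypothetical sanity check: find strings shaped like 123-456-7890 that
// survived redaction. This covers one pattern only and is not a guarantee.
function findPhoneLikeStrings(text) {
  return text.match(/\b\d{3}-\d{3}-\d{4}\b/g) ?? [];
}

// Made-up sample strings for illustration.
const redactedSample = "Area code? ###-###-#### that's ###-###-#### yes, ma'am.";
const unredactedSample = "Area code? 555-010-0199 that's 555-010-0199 yes, ma'am.";

console.log(findPhoneLikeStrings(redactedSample).length);   // 0
console.log(findPhoneLikeStrings(unredactedSample).length); // 2
```

In index.js, the same function could be called with transcript.text.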

    3. Get the PII redacted audio

    You can also get the original audio with the PII redacted. Whenever PII is detected, the audio will be replaced with a beep.

    To do this, first, add two additional options to configure PII audio redaction:

    import { AssemblyAI } from 'assemblyai';

    // create AssemblyAI API client
    const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });

    // transcribe audio file with PII redaction enabled
    const transcript = await client.transcripts.transcribe({
      audio: "https://storage.googleapis.com/aai-web-samples/architecture-call.mp3",
      redact_pii: true,
      redact_pii_policies: [
        "person_name",
        "phone_number",
      ],
      redact_pii_sub: "hash",
      redact_pii_audio: true,
      redact_pii_audio_quality: "mp3"
    });

    // throw error if transcript status is error
    if (transcript.status === "error") {
      throw new Error(transcript.error);
    }

    console.log(transcript.text);

    When you set redact_pii_audio to true, AssemblyAI will create a redacted audio file for you. The redact_pii_audio_quality option lets you control the quality of the audio file and can be mp3 or wav.
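
Because the quality setting also determines the kind of file you get back, it may be worth deriving the local file name from it instead of hard-coding an extension. A minimal sketch, assuming only the two values named above; `redactedAudioFilename` is a hypothetical helper:

```javascript
// Hypothetical helper: derive the output file name from the requested
// redact_pii_audio_quality value ("mp3" or "wav", per the text above).
function redactedAudioFilename(quality, baseName = "redacted-audio") {
  if (quality !== "mp3" && quality !== "wav") {
    throw new Error(`Unsupported audio quality: ${quality}`);
  }
  return `./${baseName}.${quality}`;
}

console.log(redactedAudioFilename("mp3")); // ./redacted-audio.mp3
console.log(redactedAudioFilename("wav")); // ./redacted-audio.wav
```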

    The redacted audio file will be temporarily accessible via a pre-signed URL for you to download.
    Add the following import at the top of index.js; you’ll need it to save the file to disk.

    import { writeFile } from "fs/promises";

    Then add the following code to get the redacted audio file URL and download the file to disk.

    const { redacted_audio_url } = await client.transcripts.redactions(transcript.id);

    const redactedFileResponse = await fetch(redacted_audio_url);
    await writeFile("./redacted-audio.mp3", redactedFileResponse.body);
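
The download step above does not check the HTTP status, so an expired or invalid pre-signed URL could leave an error page saved as redacted-audio.mp3. One way to guard against that is sketched below. `fetchAudioBytes` is a hypothetical helper, and it buffers the whole file in memory, which is a trade-off compared with streaming the response body and is only reasonable for short recordings.

```javascript
// Hypothetical helper: download a file and return its bytes as a Buffer,
// throwing if the server answers with a non-2xx status. fetchImpl is
// injectable so the function can be exercised without network access.
async function fetchAudioBytes(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP status ${response.status}`);
  }
  return Buffer.from(await response.arrayBuffer());
}

// Possible usage with the names from the tutorial code:
// await writeFile("./redacted-audio.mp3", await fetchAudioBytes(redacted_audio_url));
```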

    4. Run the script

    To run the script, go back to your shell and run:

    node index.js

    After a short while, you’ll see the following output in the console.

    Good afternoon, MGK design. Hi. I’m looking to have plans drawn up for an addition in my house. Okay, let me have one of our architects return your call. May I have your name, please? My name is ####. ####. And your last name? My last name is #####. Would you spell that for me, please? # # # # # #. Okay, and your telephone number? Area code? ###-###-#### that’s ###-###-#### yes, ma’am. Is there a good time to reach you? That’s my cell, so he could catch me anytime on that. Okay, great. I’ll have him return your call as soon as possible. Great. Thank you very much. You’re welcome. Bye.

    As you can see, the specified PII policies were redacted in the transcript. You can see the unredacted transcript below for comparison:

    Good afternoon, MGK design. Hi. I’m looking to have plans drawn up for an addition in my house. Okay, let me have one of our architects return your call. May I have your name, please? My name is John. John. And your last name? My last name is Lowry. Would you spell that for me, please? L o w e r y. Okay, and your telephone number? Area code? 610-265-1714 that’s 610-265-1714 yes, ma’am. Is there a good time to reach you? That’s my cell, so he could catch me anytime on that. Okay, great. I’ll have him return your call as soon as possible. Great. Thank you very much. You’re welcome. Bye.

    You’ll also see a new file on disk, redacted-audio.mp3. Give it a listen to hear the PII bleeped out.

    Note that a redacted audio file will be returned, even if you submitted a video file for transcription. In this case, you can use a tool like FFmpeg to replace the original audio in the file with the redacted version.
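
As a rough illustration of the FFmpeg step, the sketch below builds the argument list for a command that keeps the original video stream and swaps in the redacted audio track. It assumes FFmpeg is installed and on your PATH. The file names and the `buildFfmpegArgs` helper are hypothetical, and the flags are one common way to replace an audio track, not the only one.

```javascript
// Hypothetical helper: build FFmpeg arguments that keep the video stream of
// the original file and swap in the redacted audio track.
function buildFfmpegArgs(videoPath, redactedAudioPath, outputPath) {
  return [
    "-i", videoPath,          // input 0: original video
    "-i", redactedAudioPath,  // input 1: redacted audio
    "-map", "0:v:0",          // take the first video stream from input 0
    "-map", "1:a:0",          // take the first audio stream from input 1
    "-c:v", "copy",           // copy the video stream without re-encoding
    "-shortest",              // stop when the shorter of the two streams ends
    outputPath,
  ];
}

// Hypothetical file names for illustration.
const args = buildFfmpegArgs("call.mp4", "redacted-audio.mp3", "call-redacted.mp4");
console.log(["ffmpeg", ...args].join(" "));
```

To run the command from Node.js, the array could be passed to spawn("ffmpeg", args, { stdio: "inherit" }) from the built-in child_process module.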

    Next steps

    In this tutorial, you learned how to automatically redact PII from audio and video files using AssemblyAI and Node.js. You can check out our docs on PII redaction to learn more about it or browse some of our other AI models.
    Alternatively, feel free to check out our blog or YouTube channel for educational content on AI and Machine Learning.
