
    Node.js Speech-to-Text with Punctuation, Casing, and Formatting

    May 29, 2024

    Automatically generated transcripts of audio and video files are far more useful and readable when the transcription result includes punctuation, casing, and formatting.

    Take this short segment for example. The text on top has no punctuation, casing, or formatting, and doesn’t filter out disfluencies. Meanwhile, the text at the bottom does have punctuation, casing, formatting, and no disfluencies.

    Notice the differences?

    The “ah” is a disfluency that was removed.
    The beginnings of sentences, the pronoun “I”, and proper nouns are capitalized.
    Each sentence ends with a punctuation mark.

    In this tutorial, you’ll explore how to add punctuation, casing, and formatting to your transcripts using the AssemblyAI JavaScript SDK.

    Step 1: Set up your environment

    First, install Node.js 18 or higher on your system.
    Next, create a new project folder, change into it, and initialize a new Node.js project:

    mkdir stt-formatting
    cd stt-formatting
    npm init -y

    Open the package.json file and add "type": "module" to the list of properties, so that Node.js treats index.js as an ES module and allows import statements and top-level await:

    {
      …
      "type": "module",
      …
    }

    Then, install the AssemblyAI JavaScript SDK, which lets you interact with the AssemblyAI API more easily:

    npm install --save assemblyai

    Next, sign up for a free AssemblyAI API key, or, if you already have one, copy it from your dashboard. Then set it as the ASSEMBLYAI_API_KEY environment variable on your machine:

    # Mac/Linux:
    export ASSEMBLYAI_API_KEY=<YOUR_KEY>

    # Windows (Command Prompt):
    set ASSEMBLYAI_API_KEY=<YOUR_KEY>

    Step 2: Transcribe and filter the audio file

    Now that your environment is set up, you can submit an audio file for transcription. This tutorial uses the example file whose URL appears in the code below. If you want to use your own file, you can use either a local file on your system or a remote file, as long as the remote file is available at a publicly accessible download URL. Video files work too.

    Create a file called index.js, and in the file, import the assemblyai package and create an AssemblyAI client.

    import { AssemblyAI } from 'assemblyai';

    // create AssemblyAI API client
    const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });
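If the environment variable was never set, process.env.ASSEMBLYAI_API_KEY is undefined and the problem would presumably only surface later, when a request is rejected. A minimal sketch of an early check follows. The helper name requireEnv is hypothetical and not part of the AssemblyAI SDK.

```javascript
// Hypothetical helper (not part of the AssemblyAI SDK): read a required
// environment variable, or fail immediately with a clear message.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(
      `Missing environment variable ${name}. Set it before running the script.`
    );
  }
  // Trim stray whitespace that can sneak in when the key is pasted.
  return value.trim();
}

// Usage in index.js, assuming the variable name from this tutorial:
// const client = new AssemblyAI({ apiKey: requireEnv("ASSEMBLYAI_API_KEY") });
```

Failing at startup keeps the error next to its cause instead of inside a later API call.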

    Create a variable for the URL or the path to the audio file you want to transcribe:

    // replace with a local file path or your own remote file URL
    const audioFile = "https://storage.googleapis.com/aai-docs-samples/espn.m4a";
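Because audioFile can hold either a remote URL or a local path, it can help to log which kind you passed in before uploading anything. The sketch below is one way to tell them apart using the built-in URL class. The function name isRemoteAudio is an assumption made for illustration, not an SDK function.

```javascript
// Hypothetical helper: returns true for http(s) URLs and false for anything
// else, which this sketch treats as a local file path.
function isRemoteAudio(source) {
  try {
    const url = new URL(source);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    // Not a parseable absolute URL (for example "./audio/espn.m4a").
    return false;
  }
}

// Usage:
// console.log(isRemoteAudio(audioFile) ? "Using remote file" : "Using local file");
```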

    Transcribe the audio file with the following options:

    punctuate: true, which adds punctuation.
    format_text: true, which adds casing and formatting.
    disfluencies: false, which removes disfluencies like “uhm”.

    // transcribe the audio file with punctuation and text formatting, and without disfluencies
    const transcript = await client.transcripts.transcribe({
      audio: audioFile,
      punctuate: true,
      format_text: true,
      disfluencies: false
    });

    To get the raw, unformatted transcript instead, reverse the boolean values: set punctuate and format_text to false, and disfluencies to true.
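To compare both variants side by side, the two option sets can be built from a single flag. This is a minimal sketch. The function buildTranscriptParams and its raw option are hypothetical names, while the three parameter names are the ones used in this tutorial.

```javascript
// Hypothetical helper: build the transcription parameters for either the
// formatted transcript (default) or the raw, unformatted one.
function buildTranscriptParams(audio, { raw = false } = {}) {
  return {
    audio,
    punctuate: !raw,    // punctuation on, unless raw output is requested
    format_text: !raw,  // casing and formatting on, unless raw
    disfluencies: raw,  // keep filler words such as "uhm" only in raw output
  };
}

// Usage with the client and audioFile from this step:
// const formatted = await client.transcripts.transcribe(buildTranscriptParams(audioFile));
// const raw = await client.transcripts.transcribe(buildTranscriptParams(audioFile, { raw: true }));
```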

    Step 3: Print the filtered text

    You can print the formatted transcript text as follows:

    // throw an error if the transcript status is "error"
    if (transcript.status === "error") {
      throw new Error(transcript.error);
    }

    // print transcript text
    console.log(transcript.text);

    Save your file and execute it by running node index.js in the project directory.

    What’s next

    You can configure many more options when creating a transcript, and the transcript object carries more information about the transcribed audio, such as word-level timestamps, which you can read from its properties. Check out the AssemblyAI docs to learn more about the transcript parameters, the transcript object, and the other information the AssemblyAI API returns. You can also retrieve the transcript segmented by paragraphs, which makes it easier to present to your users.
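As a rough illustration of presenting a paragraph-segmented transcript, the sketch below formats a list of paragraphs with a starting timestamp in front of each one. It assumes each paragraph object has a text string and a start time in milliseconds, and that the SDK exposes the paragraphs through client.transcripts.paragraphs. Check both assumptions against the AssemblyAI docs. The helper names toTimestamp and formatParagraphs are hypothetical.

```javascript
// Convert a millisecond offset into an mm:ss label.
function toTimestamp(ms) {
  const totalSeconds = Math.floor(ms / 1000);
  const minutes = String(Math.floor(totalSeconds / 60)).padStart(2, "0");
  const seconds = String(totalSeconds % 60).padStart(2, "0");
  return `${minutes}:${seconds}`;
}

// Hypothetical helper: render paragraphs as "[mm:ss] text", separated by
// blank lines. Assumes each item looks like { text: string, start: number }.
function formatParagraphs(paragraphs) {
  return paragraphs
    .map((p) => `[${toTimestamp(p.start)}] ${p.text}`)
    .join("\n\n");
}

// Usage, assuming the transcript from Step 2 and the SDK's paragraphs call:
// const { paragraphs } = await client.transcripts.paragraphs(transcript.id);
// console.log(formatParagraphs(paragraphs));
```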
