
    Transcribe and generate subtitles for YouTube videos with Node.js

    June 24, 2024

    This guide will teach you how to transcribe YouTube videos with Node.js and AssemblyAI. After creating the transcript, you’ll learn how to generate SRT subtitles, and lastly, you’ll use LeMUR to prompt the video using a Large Language Model (LLM).

    Step 1. Set up your development environment

    First, install Node.js 18 or higher on your system.
    Next, create a new project folder, change directories to it, and initialize a new Node.js project:

    mkdir transcribe-youtube-video
    cd transcribe-youtube-video
    npm init -y

    Open the package.json file and add "type": "module" to the list of properties:

    {
      …
      "type": "module",
      …
    }

    This tells Node.js to use ES Module syntax for importing and exporting modules instead of the older CommonJS syntax. ES modules also allow top-level await, which the script in this guide relies on.

    Then, install the necessary NPM modules:

    assemblyai is the AssemblyAI JavaScript SDK, which makes it easier to interact with the AssemblyAI API.
    youtube-dl-exec wraps the yt-dlp CLI tool, which lets you retrieve information about YouTube videos and download them.
    tsx lets you execute TypeScript code without additional setup.

    npm install --save assemblyai youtube-dl-exec tsx

    You must also install Python 3.7 or above on your system as python3, because it is required by youtube-dl-exec.

    Next, you need an AssemblyAI API key that you can find on your dashboard. If you don’t have an AssemblyAI account, first sign up for free. Once you’ve copied your API key, configure it as the ASSEMBLYAI_API_KEY environment variable on your machine:

    # Mac/Linux:
    export ASSEMBLYAI_API_KEY=<YOUR_KEY>

    # Windows:
    set ASSEMBLYAI_API_KEY=<YOUR_KEY>
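
A missing or misspelled environment variable is easy to overlook until the first API call fails. The following is a minimal sketch of a fail-fast check you could run at startup. The requireEnv helper is hypothetical (it is not part of the AssemblyAI SDK), and the second parameter exists only so the function can be exercised without touching the real environment.

```typescript
// Hypothetical helper (not part of the AssemblyAI SDK): read a required
// environment variable and fail fast with a clear message if it is missing.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  // Trim so that a value consisting only of whitespace counts as missing.
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Environment variable ${name} is not set`);
  }
  return value;
}

// Possible usage in index.ts, assuming the variable was set as shown above:
// const apiKey = requireEnv("ASSEMBLYAI_API_KEY");
```

With a check like this, the script stops immediately with a readable message instead of sending a request with an undefined key.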

    You can find the full source code of this application in this GitHub repository.

    Step 2. Retrieve the audio of a YouTube video

    To transcribe a video with AssemblyAI, you need either a public URL to the video file or to upload the file to AssemblyAI. Since only the audio track is needed to generate a transcript, you can instead use a public URL to the audio track, or upload just the audio.

    YouTube stores the audio and the video of a YouTube video in separate files, which you can retrieve in different formats and quality. The easiest way to retrieve the formats is using the yt-dlp CLI tool. The youtube-dl-exec module you installed wraps the yt-dlp CLI tool so you can retrieve this information from Node.js.

    Create a file called index.ts and add the following code:

    import { youtubeDl } from "youtube-dl-exec";

    const youtubeVideoUrl = "https://www.youtube.com/watch?v=wtolixa9XTg";

    console.log("Retrieving audio URL from YouTube video");
    const videoInfo = await youtubeDl(youtubeVideoUrl, {
      dumpSingleJson: true,
      preferFreeFormats: true,
      addHeader: ["referer:youtube.com", "user-agent:googlebot"],
    });

    const audioUrl = videoInfo.formats.reverse().find(
      (format) => format.resolution === "audio only" && format.ext === "m4a",
    )?.url;

    if (!audioUrl) {
      throw new Error("No audio only format found");
    }
    console.log("Audio URL retrieved successfully");
    console.log("Audio URL:", audioUrl);

    This script retrieves all the information about the YouTube video and stores it in the videoInfo variable.
    The formats property lists all the available video formats, ordered from worst to best quality. The script reverses the formats array so the best quality comes first, then looks for the first “audio only” format with m4a extension, and takes that format’s url property.
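
The selection logic described above can be tried in isolation with made-up data. The sketch below is illustrative only: AudioFormatLike is a hypothetical, pared-down shape of a format entry (real yt-dlp entries carry many more fields), pickAudioUrl is a hypothetical helper, and the example.com URLs are placeholders. Unlike the script, it copies the array before reversing so the original order is left untouched.

```typescript
// Minimal shape of the entries in videoInfo.formats that this sketch relies on.
// Assumption: real yt-dlp format objects carry many more fields.
interface AudioFormatLike {
  resolution?: string;
  ext?: string;
  url?: string;
}

// Hypothetical helper: return the URL of the best "audio only" m4a format,
// assuming the list is ordered from worst to best quality.
function pickAudioUrl(formats: readonly AudioFormatLike[]): string | undefined {
  // Copy before reversing so the caller's array keeps its original order.
  return [...formats]
    .reverse()
    .find((f) => f.resolution === "audio only" && f.ext === "m4a")?.url;
}

// Placeholder data, ordered from worst to best quality.
const sampleFormats: AudioFormatLike[] = [
  { resolution: "audio only", ext: "m4a", url: "https://example.com/low.m4a" },
  { resolution: "audio only", ext: "webm", url: "https://example.com/mid.webm" },
  { resolution: "audio only", ext: "m4a", url: "https://example.com/high.m4a" },
  { resolution: "1280x720", ext: "mp4", url: "https://example.com/video.mp4" },
];

console.log(pickAudioUrl(sampleFormats)); // https://example.com/high.m4a
```

Array.prototype.reverse() reverses in place, so calling it directly on videoInfo.formats changes the order for any later code that reads the same array. Copying first avoids that side effect.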

    Now that you have the audio URL of the YouTube video, you can transcribe the audio using AssemblyAI.

    At the top of index.ts, import the AssemblyAI class from the assemblyai module:

    import { AssemblyAI } from "assemblyai";

    Then append the following code at the end of the index.ts file:

    console.log("Transcribing audio");
    const aaiClient = new AssemblyAI({
      apiKey: process.env.ASSEMBLYAI_API_KEY!,
    });

    const transcript = await aaiClient.transcripts.transcribe({
      // can also accept videos and local files
      audio: audioUrl,
    });

    The code sends the audio to AssemblyAI for transcription. If the transcription is successful, the transcript object will be populated with the transcript text and many additional properties. However, you should verify whether an error occurred and log the error.

    Add the following code to check if an error occurred:

    if (transcript.status === "error") {
      throw new Error("Transcription failed: " + transcript.error);
    }

    console.log("Transcription complete");
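
The status check can also be wrapped in a small function that either returns the text or throws. This is a sketch under assumptions: TranscriptLike is a local stand-in that lists only the fields used here (the SDK's own transcript type has many more), and getTranscriptText is a hypothetical helper, not an SDK function.

```typescript
// Local stand-in for the fields of the transcript object used here.
// Assumption: the SDK's own transcript type has many more properties.
interface TranscriptLike {
  status: "queued" | "processing" | "completed" | "error";
  text?: string | null;
  error?: string | null;
}

// Hypothetical helper: return the transcript text or throw a descriptive error.
function getTranscriptText(transcript: TranscriptLike): string {
  if (transcript.status === "error") {
    throw new Error(
      "Transcription failed: " + (transcript.error ?? "unknown error"),
    );
  }
  if (transcript.status !== "completed" || typeof transcript.text !== "string") {
    throw new Error(`Transcript has no text yet (status: ${transcript.status})`);
  }
  return transcript.text;
}

// Possible usage with the transcript returned by the SDK:
// const text = getTranscriptText(transcript);
```

Because the function returns a plain string, code that uses the text afterwards does not need a non-null assertion.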

    Step 3. Save the transcript and subtitles

    Now that you have a transcript, you can save the transcript text to a file. Add the following import which you’ll need to save files to disk.

    import { writeFile } from "fs/promises";

    Then add the following code to save the transcript to disk.

    console.log("Saving transcript to file");
    await writeFile("./transcript.txt", transcript.text!);
    console.log("Transcript saved to file transcript.txt");

    You can also generate SRT subtitles from the transcript and save them to disk like this:

    console.log("Retrieving transcript as SRT subtitles");
    const subtitles = await aaiClient.transcripts.subtitles(transcript.id, "srt");
    await writeFile("./subtitles.srt", subtitles);
    console.log("Subtitles saved to file subtitles.srt");

    WebVTT Subtitle Format

    WebVTT (Web Video Text Tracks) is another popular, widely supported subtitle format. To generate WebVTT, replace "srt" with "vtt" and save the file with the .vtt extension.
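
The API returns finished subtitle text, so you do not need to format cues yourself. The sketch below is only meant to show one visible difference between the two formats when you inspect the generated files: SRT timestamps separate milliseconds with a comma, while WebVTT uses a period (a WebVTT file also starts with a WEBVTT header line). The formatTimestamp function and SubtitleFormat type are hypothetical names used for illustration.

```typescript
type SubtitleFormat = "srt" | "vtt";

// Format a millisecond offset as a subtitle timestamp (HH:MM:SS plus millis).
// SRT separates milliseconds with a comma, WebVTT with a period.
function formatTimestamp(ms: number, format: SubtitleFormat): string {
  const pad = (n: number, width: number) => String(n).padStart(width, "0");
  const hours = Math.floor(ms / 3_600_000);
  const minutes = Math.floor((ms % 3_600_000) / 60_000);
  const seconds = Math.floor((ms % 60_000) / 1000);
  const millis = ms % 1000;
  const separator = format === "srt" ? "," : ".";
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}${separator}${pad(millis, 3)}`;
}

console.log(formatTimestamp(3_723_004, "srt")); // 01:02:03,004
console.log(formatTimestamp(3_723_004, "vtt")); // 01:02:03.004
```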

    Step 4. Run the script

    To run the script, go back to your shell and run:

    npx tsx index.ts

    After a little while, the transcript.txt and subtitles.srt files will appear in your project folder. Longer videos take longer to process.

    Bonus: Prompt a YouTube video using LeMUR

    AssemblyAI makes it very easy to build generative AI features using our LLM framework called LeMUR.
    You can write a prompt to tell the LLM what to do with a given transcript and the LLM will generate a response.
    For example, you can write a prompt that tells LeMUR to summarize the video using bullet points.

    console.log("Prompting LeMUR to summarize the video");

    const prompt = "Summarize this video using bullet points";
    const lemurResponse = await aaiClient.lemur.task({
      transcript_ids: [transcript.id],
      prompt,
      final_model: "default"
    });
    console.log(prompt + ": " + lemurResponse.response);

    You can find the various supported models listed in the LeMUR documentation.

    If you add this code and run the script again, you’ll get a generated summary that looks like this:

    Here is a bullet point summary of the key points from the video:

    – Lay the math foundation with Khan Academy courses on basics like linear algebra, calculus, statistics etc. Come back later to fill gaps.

    – Learn Python – do a beginner and intermediate level course to get a solid base. Python skills are essential.

    – Learn key machine learning Python libraries like NumPy, Pandas, Matplotlib. Follow a crash course for each.

    – Do Andrew Ng’s machine learning specialization course on Coursera. Recently updated to include Python and libraries like NumPy, Scikit-learn, TensorFlow.

    – Implement some algorithms from scratch in Python to better understand concepts. An updated ML from scratch course will be released.

    – Do Kaggle’s intro and intermediate ML courses to learn more data preparation with Pandas.

    – Practice on Kaggle with competitions and datasets. Helps build portfolio and CV. Focus on learning over winning.

    – Specialize as per industry requirements in CV, NLP etc. Look at job descriptions. Consider learning MLOps.

    – Start a blog to write tutorials and share your projects. Helps cement knowledge and build CV.

    – Useful books referenced: Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow, Machine Learning Yearning by Andrew Ng.

    Next steps

    In this tutorial, you learned how to retrieve the audio file from a YouTube video, how to transcribe the audio file and generate subtitles, and finally, how to summarize the YouTube video using LeMUR.

    Check out our Audio Intelligence models and LeMUR to add even more capabilities to your audio and video applications.

    Alternatively, check out our blog or YouTube channel for educational content on AI and Machine Learning, or join us on Twitter or Discord to stay in the loop when we release new content.
