
    How to Build a Conversational AI Chatbot with Stream Chat and React

    June 17, 2025

    Modern chat applications are increasingly incorporating voice input capabilities because they offer a more engaging and versatile user experience. This also improves accessibility, allowing users with different needs to interact more comfortably with such applications.

In this tutorial, I’ll guide you through the process of creating a conversational AI application that integrates real-time chat functionality with voice recognition. By leveraging Stream Chat for robust messaging and the Web Speech API for speech-to-text conversion, you’ll build a multi-faceted chat application that supports both voice and text interaction.

    Table of Contents

    • Prerequisites

    • Sneak Peek

    • Core Technologies

    • Backend Implementation Guide

      • Project Setup
    • Frontend Implementation Guide

      • Project Setup

      • Understanding the App Component

      • Adding AI to the Channel

      • Configuring the ChannelHeader

      • Adding an AI State Indicator

      • Building the Speech to Text Functionality

    • Complete Process Flow

    • Conclusion

    Prerequisites

    Before we begin, ensure you have the following:

• A Stream account with an API key and secret (read how to get them here)

• Access to an LLM API (like OpenAI or Anthropic)

• Node.js and npm/yarn installed

• Basic knowledge of React and TypeScript

• A modern browser with Web Speech API support (like Chrome or Edge)

    Sneak Peek

    Let’s take a quick look at the app we’ll be building in this tutorial. This way, you get a feel for what it does before we jump into the details.

[Demo: a short clip of the finished voice-and-text chat app in action]

    If you’re now excited, let’s get straight into it!

    Core Technologies

    This application is powered by three main players: Stream Chat, the Web Speech API, and a Node.js + Express backend.

    Stream Chat is a platform that helps you easily build and integrate rich, real-time chat and messaging experiences into your applications. It offers a variety of SDKs (Software Development Kits) for different platforms (like Android, iOS, React) and pre-built UI components to streamline development. Its robustness and engaging chat functionality make it a great choice for this app – we don’t need to build anything from scratch.

Web Speech API is a browser standard that allows you to integrate voice input and output into your apps, enabling features like speech recognition (converting speech to text) and speech synthesis (converting text to speech). We’ll use the speech recognition feature in this project.

The Node.js + Express backend handles AI agent instantiation and relays the conversational responses generated by our LLM API.
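To make the backend’s role concrete, here’s a minimal sketch of the kind of Express server involved. This is an illustration only; the repository you’ll clone below contains the real implementation, but the endpoint names match the ones our frontend will call later:

    import express from "express";

    const app = express();
    app.use(express.json());

    app.post("/start-ai-agent", async (req, res) => {
      const { channel_id, platform } = req.body; // platform: "openai" or "anthropic"
      // The real backend connects an "ai-bot" user to Stream Chat here,
      // has it watch channel_id, and answers new messages via the LLM API.
      res.json({ ok: true, channel_id, platform });
    });

    app.post("/stop-ai-agent", async (req, res) => {
      const { channel_id } = req.body;
      // The real backend disconnects the agent and stops watching the channel.
      res.json({ ok: true, channel_id });
    });

    app.listen(3000, () => console.log("AI agent backend listening on port 3000"));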

    Backend Implementation Guide

    Let’s begin with our backend, the engine room – where user input is routed to the appropriate AI model, and a processed response is returned. Our backend supports multiple AI models, specifically OpenAI and Anthropic.

    Project Setup

1. Create a folder and call it ‘My-Chat-Application’.

2. Clone this GitHub repository.

3. After cloning, rename the cloned folder to ‘backend’.

4. Open the .env.example file and provide the necessary keys (you’ll need to provide either the OpenAI or Anthropic key – the OpenWeather key is optional).

5. Rename the .env.example file to .env.

    6. Install dependencies by running this command:

       npm install
      
    7. Run the project by entering this command:

       npm start
      

      Your backend should be running smoothly on localhost:3000.

    Frontend Implementation Guide

    This section explores two broad, interrelated components: the chat structure and speech recognition.

    Project Setup

We will create and set up our React project with the Stream Chat React SDK, using Vite with the TypeScript template. To do that, navigate to your My-Chat-Application folder, open your terminal, and enter these commands:

    npm create vite frontend -- --template react-ts
    cd frontend
    npm i stream-chat stream-chat-react
    

    With our frontend project set up, we can now run the app:

    npm run dev
    

    Understanding the App Component

    The main focus here is to initialize a chat client, connect a user, create a channel, and render the chat interface. We’ll go through all these processes step by step to help you understand them better:

    Define Constants

First, we need to provide some important credentials for user creation and chat client setup. You can find these credentials on your Stream dashboard.

    <span class="hljs-keyword">const</span> apiKey = <span class="hljs-string">"xxxxxxxxxxxxx"</span>;
    <span class="hljs-keyword">const</span> userId = <span class="hljs-string">"111111111"</span>;
    <span class="hljs-keyword">const</span> userName = <span class="hljs-string">"John Doe"</span>;
    <span class="hljs-keyword">const</span> userToken = <span class="hljs-string">"xxxxxxxxxx.xxxxxxxxxxxx.xx_xxxxxxx-xxxxx_xxxxxxxx"</span>; <span class="hljs-comment">//your stream secret key</span>
    

    Note: These are dummy credentials. Make sure to use your own credentials.
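Hard-coding a token works for a demo, but in production the user token must be generated server-side, because creating it requires your Stream API secret. With Stream’s Node SDK, that looks roughly like this:

    import { StreamChat } from "stream-chat";

    // Server-side only: the API secret must never be shipped to the browser
    const serverClient = StreamChat.getInstance("your-api-key", "your-api-secret");
    const token = serverClient.createToken("111111111"); // the same userId the frontend connects with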

    Create a User

    Next, we need to create a user object. We’ll create it using an ID, name and a generated avatar URL:

    <span class="hljs-keyword">const</span> user: User = {
      <span class="hljs-attr">id</span>: userId,
      <span class="hljs-attr">name</span>: userName,
      <span class="hljs-attr">image</span>: <span class="hljs-string">`https://getstream.io/random_png/?name=<span class="hljs-subst">${userName}</span>`</span>,
    };
    

    Setup a Client

We need to track the state of the active chat channel using the useState hook to ensure seamless real-time messaging in this Stream Chat application. The useCreateChatClient hook, provided by the Stream Chat React SDK, initializes the chat client with an API key, user token, and user data:

      <span class="hljs-keyword">const</span> [channel, setChannel] = useState<StreamChannel>();
      <span class="hljs-keyword">const</span> client = useCreateChatClient({
        apiKey,
        <span class="hljs-attr">tokenOrProvider</span>: userToken,
        <span class="hljs-attr">userData</span>: user,
      });
    

    Initialize Channel

    Now, we initialize a messaging channel to enable real-time communication in the Stream Chat application. When the chat client is ready, the useEffect hook triggers the creation of a messaging channel named my_channel, adding the user as a member. This channel is then stored in the channel state, ensuring that the app is primed for dynamic conversation rendering.

    useEffect(() => {
      if (!client) return;
      const channel = client.channel("messaging", "my_channel", {
        members: [userId],
      });

      setChannel(channel);
    }, [client]);
    

    Render Chat Interface

With all the integral parts of our chat application set up, we’ll return JSX that defines the chat interface’s structure and components:

     <span class="hljs-keyword">if</span> (!client) <span class="hljs-keyword">return</span> <span class="xml"><span class="hljs-tag"><<span class="hljs-name">div</span>></span>Setting up client & connection...<span class="hljs-tag"></<span class="hljs-name">div</span>></span></span>;
    
      <span class="hljs-keyword">return</span> (
        <span class="xml"><span class="hljs-tag"><<span class="hljs-name">Chat</span> <span class="hljs-attr">client</span>=<span class="hljs-string">{client}</span>></span>
          <span class="hljs-tag"><<span class="hljs-name">Channel</span> <span class="hljs-attr">channel</span>=<span class="hljs-string">{channel}</span>></span>
            <span class="hljs-tag"><<span class="hljs-name">Window</span>></span>
              <span class="hljs-tag"><<span class="hljs-name">MessageList</span> /></span>
              <span class="hljs-tag"><<span class="hljs-name">MessageInput</span> /></span>
            <span class="hljs-tag"></<span class="hljs-name">Window</span>></span>
            <span class="hljs-tag"><<span class="hljs-name">Thread</span> /></span>
          <span class="hljs-tag"></<span class="hljs-name">Channel</span>></span>
        <span class="hljs-tag"></<span class="hljs-name">Chat</span>></span></span>
      );
    

    In this JSX structure:

    • If the client is not ready, it displays a “Setting up client & connection…” message.

    • Once the client is ready, it renders the chat interface using:

      • <Chat>: Wraps the Stream Chat context with the initialized client.

      • <Channel>: Sets the active channel.

      • <Window>: Contains the main chat UI components:

        • <MessageList>: Displays the list of messages.

• <MessageInput>: Renders the input for sending messages (we’ll later swap in our custom CustomMessageInput).

      • <Thread>: Renders threaded replies.

    With this, we’ve set up our chat interface and channel, and we have a client ready. Here’s what our interface looks like so far:

[Screenshot: the basic Stream Chat interface]

    Adding AI to the Channel

Remember, this chat application is designed to interact with an AI, so we need to be able to both add and remove the AI from the channel. On the UI, we’ll add a button in the channel header that lets users add or remove the AI. But we still need to determine whether the AI is already in the channel to know which option to display.

For that, we’ll create a custom hook called useWatchers. It monitors the presence of the AI using a concept called watchers:

    <span class="hljs-keyword">import</span> { useCallback, useEffect, useState } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>;
    <span class="hljs-keyword">import</span> { Channel } <span class="hljs-keyword">from</span> <span class="hljs-string">'stream-chat'</span>;
    
    <span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> useWatchers = <span class="hljs-function">(<span class="hljs-params">{ channel }: { channel: Channel }</span>) =></span> {
      <span class="hljs-keyword">const</span> [watchers, setWatchers] = useState<string[]>([]);
      <span class="hljs-keyword">const</span> [error, setError] = useState<<span class="hljs-built_in">Error</span> | <span class="hljs-literal">null</span>>(<span class="hljs-literal">null</span>);
    
      <span class="hljs-keyword">const</span> queryWatchers = useCallback(<span class="hljs-keyword">async</span> () => {
        setError(<span class="hljs-literal">null</span>);
    
        <span class="hljs-keyword">try</span> {
          <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> channel.query({ <span class="hljs-attr">watchers</span>: { <span class="hljs-attr">limit</span>: <span class="hljs-number">5</span>, <span class="hljs-attr">offset</span>: <span class="hljs-number">0</span> } });
          setWatchers(result?.watchers?.map(<span class="hljs-function">(<span class="hljs-params">watcher</span>) =></span> watcher.id).filter((id): id is string => id !== <span class="hljs-literal">undefined</span>) || [])
          <span class="hljs-keyword">return</span>;
        } <span class="hljs-keyword">catch</span> (err) {
          setError(err <span class="hljs-keyword">as</span> <span class="hljs-built_in">Error</span>);
        }
      }, [channel]);
    
      useEffect(<span class="hljs-function">() =></span> {
        queryWatchers();
      }, [queryWatchers]);
    
      useEffect(<span class="hljs-function">() =></span> {
        <span class="hljs-keyword">const</span> watchingStartListener = channel.on(<span class="hljs-string">'user.watching.start'</span>, <span class="hljs-function">(<span class="hljs-params">event</span>) =></span> {
          <span class="hljs-keyword">const</span> userId = event?.user?.id;
          <span class="hljs-keyword">if</span> (userId && userId.startsWith(<span class="hljs-string">'ai-bot'</span>)) {
            setWatchers(<span class="hljs-function">(<span class="hljs-params">prevWatchers</span>) =></span> [
              userId,
              ...(prevWatchers || []).filter(<span class="hljs-function">(<span class="hljs-params">watcherId</span>) =></span> watcherId !== userId),
            ]);
          }
        });
    
        <span class="hljs-keyword">const</span> watchingStopListener = channel.on(<span class="hljs-string">'user.watching.stop'</span>, <span class="hljs-function">(<span class="hljs-params">event</span>) =></span> {
          <span class="hljs-keyword">const</span> userId = event?.user?.id;
          <span class="hljs-keyword">if</span> (userId && userId.startsWith(<span class="hljs-string">'ai-bot'</span>)) {
            setWatchers(<span class="hljs-function">(<span class="hljs-params">prevWatchers</span>) =></span>
              (prevWatchers || []).filter(<span class="hljs-function">(<span class="hljs-params">watcherId</span>) =></span> watcherId !== userId)
            );
          }
        });
    
        <span class="hljs-keyword">return</span> <span class="hljs-function">() =></span> {
          watchingStartListener.unsubscribe();
          watchingStopListener.unsubscribe();
        };
      }, [channel]);
    
      <span class="hljs-keyword">return</span> { watchers, error };
    };
    

    Configuring the ChannelHeader

    We can now build a new channel header component by utilizing the useChannelStateContext hook to access the channel and initialize the custom useWatchers hook. Using the watchers’ data, we define an aiInChannel variable to display relevant text. Based on this variable, we invoke either the start-ai-agent or stop-ai-agent endpoint on the Node.js backend.

    <span class="hljs-keyword">import</span> { useChannelStateContext } <span class="hljs-keyword">from</span> <span class="hljs-string">'stream-chat-react'</span>;
    <span class="hljs-keyword">import</span> { useWatchers } <span class="hljs-keyword">from</span> <span class="hljs-string">'./useWatchers'</span>;
    
    <span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">ChannelHeader</span>(<span class="hljs-params"></span>) </span>{
      <span class="hljs-keyword">const</span> { channel } = useChannelStateContext();
      <span class="hljs-keyword">const</span> { watchers } = useWatchers({ channel });
    
      <span class="hljs-keyword">const</span> aiInChannel =
        (watchers ?? []).filter(<span class="hljs-function">(<span class="hljs-params">watcher</span>) =></span> watcher.includes(<span class="hljs-string">'ai-bot'</span>)).length > <span class="hljs-number">0</span>;
      <span class="hljs-keyword">return</span> (
        <span class="xml"><span class="hljs-tag"><<span class="hljs-name">div</span> <span class="hljs-attr">className</span>=<span class="hljs-string">'my-channel-header'</span>></span>
          <span class="hljs-tag"><<span class="hljs-name">h2</span>></span>{(channel?.data as { name?: string })?.name ?? 'Voice-and-Text AI Chat'}<span class="hljs-tag"></<span class="hljs-name">h2</span>></span>
          <span class="hljs-tag"><<span class="hljs-name">button</span> <span class="hljs-attr">onClick</span>=<span class="hljs-string">{addOrRemoveAgent}</span>></span>
            {aiInChannel ? 'Remove AI' : 'Add AI'}
          <span class="hljs-tag"></<span class="hljs-name">button</span>></span>
        <span class="hljs-tag"></<span class="hljs-name">div</span>></span></span>
      );
    
      <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">addOrRemoveAgent</span>(<span class="hljs-params"></span>) </span>{
        <span class="hljs-keyword">if</span> (!channel) <span class="hljs-keyword">return</span>;
        <span class="hljs-keyword">const</span> endpoint = aiInChannel ? <span class="hljs-string">'stop-ai-agent'</span> : <span class="hljs-string">'start-ai-agent'</span>;
        <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">`http://127.0.0.1:3000/<span class="hljs-subst">${endpoint}</span>`</span>, {
          <span class="hljs-attr">method</span>: <span class="hljs-string">'POST'</span>,
          <span class="hljs-attr">headers</span>: { <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span> },
          <span class="hljs-attr">body</span>: <span class="hljs-built_in">JSON</span>.stringify({ <span class="hljs-attr">channel_id</span>: channel.id, <span class="hljs-attr">platform</span>: <span class="hljs-string">'openai'</span> }),
        });
      }
    }
    

    Adding an AI State Indicator

AIs take a bit of time to process information, so while the AI is processing, we add an indicator to reflect its status. We create a MyAIStateIndicator component that does this for us:

    <span class="hljs-keyword">import</span> { AIState } <span class="hljs-keyword">from</span> <span class="hljs-string">'stream-chat'</span>;
    <span class="hljs-keyword">import</span> { useAIState, useChannelStateContext } <span class="hljs-keyword">from</span> <span class="hljs-string">'stream-chat-react'</span>;
    
    <span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">MyAIStateIndicator</span>(<span class="hljs-params"></span>) </span>{
      <span class="hljs-keyword">const</span> { channel } = useChannelStateContext();
      <span class="hljs-keyword">const</span> { aiState } = useAIState(channel);
      <span class="hljs-keyword">const</span> text = textForState(aiState);
      <span class="hljs-keyword">return</span> text && <span class="xml"><span class="hljs-tag"><<span class="hljs-name">p</span> <span class="hljs-attr">className</span>=<span class="hljs-string">'my-ai-state-indicator'</span>></span>{text}<span class="hljs-tag"></<span class="hljs-name">p</span>></span></span>;
    
      <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">textForState</span>(<span class="hljs-params">aiState: AIState</span>): <span class="hljs-title">string</span> </span>{
        <span class="hljs-keyword">switch</span> (aiState) {
          <span class="hljs-keyword">case</span> <span class="hljs-string">'AI_STATE_ERROR'</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-string">'Something went wrong...'</span>;
          <span class="hljs-keyword">case</span> <span class="hljs-string">'AI_STATE_CHECKING_SOURCES'</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-string">'Checking external resources...'</span>;
          <span class="hljs-keyword">case</span> <span class="hljs-string">'AI_STATE_THINKING'</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-string">"I'm currently thinking..."</span>;
          <span class="hljs-keyword">case</span> <span class="hljs-string">'AI_STATE_GENERATING'</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-string">'Generating an answer for you...'</span>;
          <span class="hljs-keyword">default</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-string">''</span>;
        }
      }
    }
    

    Building the Speech to Text Functionality

    Up to this point, we have a functional chat application that sends messages and receives feedback from an AI. Now, we want to enable voice interaction, allowing users to speak to the AI instead of typing manually.

    To achieve this, we’ll set up speech-to-text functionality within a CustomMessageInput component. Let’s walk through the entire process, step by step, to understand how to achieve it.
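Before we dive in, here’s a rough skeleton of the component so you can see where each piece will live. This is a sketch, not the final code; each commented slot is filled in by one of the sections below:

    import { useMessageInputContext } from "stream-chat-react";

    export default function CustomMessageInput() {
      // 1. Initial states: isRecording, isRecognitionReady, and the refs (next section)
      const { handleSubmit, textareaRef } = useMessageInputContext(); // 2. Context integration
      // 3. A useEffect that detects the Web Speech API and wires up its event handlers
      // 4. Helpers: toggleRecording() and updateTextareaValue()

      return (
        <div className="custom-message-input">
          {/* recording banner, textarea, mic/stop button, send button */}
        </div>
      );
    }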

    Initial States Configuration

    When the CustomMessageInput component first mounts, it begins by establishing its foundational state structure:

      <span class="hljs-keyword">const</span> [isRecording, setIsRecording] = useState<boolean>(<span class="hljs-literal">false</span>);
      <span class="hljs-keyword">const</span> [isRecognitionReady, setIsRecognitionReady] = useState<boolean>(<span class="hljs-literal">false</span>);
      <span class="hljs-keyword">const</span> recognitionRef = useRef<any>(<span class="hljs-literal">null</span>);
      <span class="hljs-keyword">const</span> isManualStopRef = useRef<boolean>(<span class="hljs-literal">false</span>);
      <span class="hljs-keyword">const</span> currentTranscriptRef = useRef<string>(<span class="hljs-string">""</span>);
    

This initialization step is crucial because it establishes everything the component must track at once: whether recording is active, whether the speech API is ready, and refs that persist the manual-stop flag and the accumulated transcript across renders throughout the speech recognition lifecycle.

    Context Integration

    In Stream Chat, the MessageInputContext is established within the MessageInput component. It provides data to the Input UI component and its children. Since we want to use the values stored within the MessageInputContext to build our own custom input UI component, we’ll be calling the useMessageInputContext custom hook:

      <span class="hljs-comment">// Access the MessageInput context</span>
      <span class="hljs-keyword">const</span> { handleSubmit, textareaRef } = useMessageInputContext();
    

    This step ensures that the voice input feature integrates seamlessly with the existing chat infrastructure, sharing the same textarea reference and submission mechanisms that other input methods use.

    Web Speech API Detection and Initialization

    The Web Speech API is not supported by some browsers, which is why we need to check if the browser running this application is compatible. The component’s first major process involves detecting and initializing the Web Speech API:

     <span class="hljs-keyword">const</span> SpeechRecognition = (<span class="hljs-built_in">window</span> <span class="hljs-keyword">as</span> any).SpeechRecognition||(<span class="hljs-built_in">window</span> <span class="hljs-keyword">as</span> any).webkitSpeechRecognition;
    

    Once the API is detected, the component configures the speech recognition service with optimal settings.
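The article doesn’t spell out these settings, but for continuous dictation into a chat box, the typical Web Speech API configuration looks like this (the locale is an assumption; set it to match your users):

    const recognition = new SpeechRecognition();
    recognition.continuous = true;      // keep listening across pauses instead of stopping after one phrase
    recognition.interimResults = true;  // stream partial transcripts while the user is still speaking
    recognition.lang = "en-US";         // recognition locale (assumed); change as needed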

    Event Handler Configuration

    We’ll have two event handlers: the result processing handler and the lifecycle event handler.

    The result processing handler processes speech recognition output. It demonstrates a two-phase processing approach where interim results provide immediate feedback while final results are accumulated for accuracy.

    recognition.onresult = (event: any) => {
      let finalTranscript = "";
      let interimTranscript = "";

      // Process all results from the last processed index
      for (let i = event.resultIndex; i < event.results.length; i++) {
        const transcriptSegment = event.results[i][0].transcript;
        if (event.results[i].isFinal) {
          finalTranscript += transcriptSegment + " ";
        } else {
          interimTranscript += transcriptSegment;
        }
      }

      // Update the current transcript
      if (finalTranscript) {
        currentTranscriptRef.current += finalTranscript;
      }

      // Combine stored final transcript with current interim results
      const combinedTranscript = (currentTranscriptRef.current + interimTranscript).trim();

      // Update the textarea
      if (combinedTranscript) {
        updateTextareaValue(combinedTranscript);
      }
    };
    

    The lifecycle event handler ensures that the component responds appropriately to each phase of the speech recognition lifecycle events (onstart, onend and onerror):

    recognition.onstart = () => {
      console.log("Speech recognition started");
      setIsRecording(true);
      currentTranscriptRef.current = ""; // Reset transcript on start
    };

    recognition.onend = () => {
      console.log("Speech recognition ended");
      setIsRecording(false);

      // If it wasn't manually stopped and we're still supposed to be recording, restart
      if (!isManualStopRef.current && isRecording) {
        try {
          recognition.start();
        } catch (error) {
          console.error("Error restarting recognition:", error);
        }
      }

      isManualStopRef.current = false;
    };

    recognition.onerror = (event: any) => {
      console.error("Speech recognition error:", event.error);
      setIsRecording(false);
      isManualStopRef.current = false;

      switch (event.error) {
        case "no-speech":
          console.warn("No speech detected");
          // Don't show an alert for no-speech, just log it
          break;
        case "not-allowed":
          alert("Microphone access denied. Please allow microphone permissions.");
          break;
        case "network":
          alert("Network error occurred. Please check your connection.");
          break;
        case "aborted":
          console.log("Speech recognition aborted");
          break;
        default:
          console.error("Speech recognition error:", event.error);
      }
    };

      recognitionRef.current = recognition;
      setIsRecognitionReady(true);
    } else {
      // This `else` closes the `if (SpeechRecognition)` feature check from earlier
      console.warn("Web Speech API not supported in this browser.");
      setIsRecognitionReady(false);
    }
    

    Starting Voice Input

    When a user clicks the microphone button, the component initiates a multi-step process that involves requesting microphone permissions and providing clear error handling if users deny access.

     <span class="hljs-keyword">const</span> toggleRecording = <span class="hljs-keyword">async</span> (): <span class="hljs-built_in">Promise</span><<span class="hljs-keyword">void</span>> => {
        <span class="hljs-keyword">if</span> (!recognitionRef.current) {
          alert(<span class="hljs-string">"Speech recognition not available"</span>);
          <span class="hljs-keyword">return</span>;
        }
    
        <span class="hljs-keyword">if</span> (isRecording) {
          <span class="hljs-comment">// Stop recording</span>
          isManualStopRef.current = <span class="hljs-literal">true</span>;
          recognitionRef.current.stop();
        } <span class="hljs-keyword">else</span> {
          <span class="hljs-keyword">try</span> {
            <span class="hljs-comment">// Request microphone permission</span>
            <span class="hljs-keyword">await</span> navigator.mediaDevices.getUserMedia({ <span class="hljs-attr">audio</span>: <span class="hljs-literal">true</span> });
    
            <span class="hljs-comment">// Clear current text and reset transcript before starting</span>
            currentTranscriptRef.current = <span class="hljs-string">""</span>;
            updateTextareaValue(<span class="hljs-string">""</span>);
    
            <span class="hljs-comment">// Start recognition</span>
            recognitionRef.current.start();
          } <span class="hljs-keyword">catch</span> (error) {
            <span class="hljs-built_in">console</span>.error(<span class="hljs-string">"Microphone access error:"</span>, error);
            alert(
              <span class="hljs-string">"Unable to access microphone. Please check permissions and try again."</span>,
            );
          }
        }
      };
    

Resetting State and Starting Recognition

    Before beginning speech recognition, the component resets its internal state. This reset ensures that each new voice input session starts with a clean slate, preventing interference from previous sessions.

    currentTranscriptRef.current = "";
    updateTextareaValue("");
    recognitionRef.current.start();
    

    Real-Time Speech Processing

    Two things happen simultaneously during this process:

1. Continuous Result Processing: As the user speaks, the component continuously processes incoming speech data through a two-stage pipeline:

      • Each speech segment is classified as either interim (temporary) or final (confirmed).

      • Final results are accumulated in the persistent transcript reference.

      • Interim results are combined with accumulated finals for immediate display.

    2. Dynamic Textarea Updates: The component updates the textarea in real-time using a custom DOM manipulation approach:

       <span class="hljs-keyword">const</span> updateTextareaValue = <span class="hljs-function">(<span class="hljs-params">value: string</span>) =></span> {
         <span class="hljs-keyword">const</span> nativeInputValueSetter = <span class="hljs-built_in">Object</span>.getOwnPropertyDescriptor(
           <span class="hljs-built_in">window</span>.HTMLTextAreaElement.prototype,
           <span class="hljs-string">'value'</span>
         )?.set;
      
         <span class="hljs-keyword">if</span> (nativeInputValueSetter) {
           nativeInputValueSetter.call(textareaRef.current, value);
           <span class="hljs-keyword">const</span> inputEvent = <span class="hljs-keyword">new</span> Event(<span class="hljs-string">'input'</span>, { <span class="hljs-attr">bubbles</span>: <span class="hljs-literal">true</span> });
           textareaRef.current.dispatchEvent(inputEvent);
         }
       };
      

      This step involves bypassing React’s conventional controlled component behavior to provide immediate feedback, while still maintaining compatibility with React’s event system.

    User Interface Feedback

    To make voice interactions feel smoother for users, we’ll add some visual feedback features. These include:

    1. Toggling between mic and stop icons

      We show a microphone icon when idle and a stop icon when recording is active. This provides a clear indication of the recording state.

    <button
      onClick={toggleRecording} // wire the button to the toggleRecording handler defined earlier
      className={`voice-input-button ${isRecording ? 'recording' : 'idle'}`}
      title={isRecording ? "Stop recording" : "Start voice input"}
    >
      {isRecording ? (
        <Square size={20} className="voice-icon recording-icon" />
      ) : (
        <Mic size={20} className="voice-icon idle-icon" />
      )}
    </button>
      
    2. Recording notification banner

      A notification banner appears at the top of the screen to indicate that voice recording is in progress. This notification ensures users are aware when the microphone is active, addressing privacy and usability concerns.

    {isRecording && (
      <div className="recording-notification show">
        <span className="recording-icon">🎤</span>
        Recording... Click stop when finished
      </div>
    )}
      

    Message Integration and Submission

    The transcribed text integrates seamlessly with the existing chat system through the shared textarea reference and context-provided submission handler:

    <SendButton sendMessage={handleSubmit} />
    

    This integration means that voice-generated messages follow the same submission pathway as typed messages, maintaining consistency with the chat system’s behavior. After message submission, the component ensures proper cleanup of its internal state, preparing for the next voice input session.
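That cleanup isn’t shown explicitly in the article’s source, but a small wrapper around handleSubmit along these lines would cover it (a sketch; all the names come from the component we just built):

    const submitWithCleanup = async (event?: React.BaseSyntheticEvent) => {
      // Stop any in-progress recording before the message is sent
      if (isRecording && recognitionRef.current) {
        isManualStopRef.current = true;
        recognitionRef.current.stop();
      }
      await handleSubmit(event);
      currentTranscriptRef.current = ""; // start the next voice session with a clean slate
    };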

    Passing the CustomMessageInput component

    Having built our custom messaging input component, we’ll now pass it to the Input prop of the MessageInput component in our App.tsx:

    <MessageInput Input={CustomMessageInput} />
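
Putting it all together, the render block in App.tsx now looks roughly like this (assuming ChannelHeader, MyAIStateIndicator, and CustomMessageInput are imported from the files we created above):

    return (
      <Chat client={client}>
        <Channel channel={channel}>
          <Window>
            <ChannelHeader />
            <MessageList />
            <MyAIStateIndicator />
            <MessageInput Input={CustomMessageInput} />
          </Window>
          <Thread />
        </Channel>
      </Chat>
    );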
    

    Complete Process Flow

    Here’s how the application works:

    1. After the component mounts, you add the AI to the chat by clicking the Add AI button.

    2. Click the mic icon to start recording.

    3. Your browser will ask for permission to use the microphone.

    4. If you deny permission, recording won’t begin.

    5. If you allow permission, recording and transcription start simultaneously.

    6. Click the stop (square) icon to end the recording.

    7. Click the send button to submit your message.

    8. The AI processes your input and generates a response.

    Conclusion

    In this tutorial, you’ve learned how to build a powerful conversational chatbot using Stream Chat and React. The application supports both text and voice inputs.

    If you want to create your own engaging chat experiences, you can explore Stream Chat and Video features to take your projects to the next level.

    Get the full source code for this project here. If you enjoyed reading this article, connect with me on LinkedIn or follow me on X for more programming-related posts and articles.

    See you on the next one!

    Source: freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More 
