
    Tool‑Augmented RAG Chatbot: GPT‑4, pgVector & Next.js

    July 25, 2025

    This is Part 3 of a three-part series (links at the bottom).

    In Part Two, we moved from concept to execution by building the foundation of a Retrieval‑Augmented Generation (RAG) system: we set up a Postgres database with pgvector, defined a schema, wrote a script to chunk and embed text, and validated vector search with cosine similarity.

    In this final installment, we’ll build a Next.js chatbot interface that streams GPT‑4 responses powered by your indexed content, and we’ll use GPT‑4 function calling (“tool calling”) for type‑safe, server‑side operations. Along the way, we’ll integrate polished components from shadcn UI to level up the front‑end experience.


    Overview

    Prerequisite: You should already have the rag‑chatbot‑demo repo from Parts 1 and 2, with Dockerised PostgreSQL 17 + pgvector, the content_chunks schema, and embeddings ingested. The repo is linked in the References at the bottom of this post.
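
    If you want to confirm the Part 2 setup before continuing, a quick sanity check is to count the ingested rows. A minimal sketch, assuming the `content_chunks` table and the `DATABASE_URL` variable from Part 2 (the script name and the `tsx` runner are illustrative):

    // scripts/check-ingest.ts: sanity check that Part 2's chunks and embeddings exist.
    // Run with, for example: npx tsx scripts/check-ingest.ts
    import { Pool } from 'pg';

    const db = new Pool({ connectionString: process.env.DATABASE_URL });

    const { rows } = await db.query<{ count: string }>(
      'SELECT COUNT(*) AS count FROM content_chunks'
    );
    console.log(`content_chunks rows: ${rows[0].count}`);
    await db.end();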

    By the end of this guide you will:

    1. Install new dependencies — `zod` for schema validation, `@openai/agents` for tool definitions, and `shadcn/ui` for UI components.
    2. Define a vectorSearch tool using Zod to embed a user query, run a pgvector search, and return the top results.
    3. Extend the RAG API route so GPT‑4 can decide when to call `vectorSearch`, merging the tool’s output into its streamed answer.
    4. Build a streaming chat UI that swaps vanilla elements for shadcn UI inputs, buttons, and cards.
    5. Deploy to Vercel with a single click.

    If you already have the project folder from earlier parts, skip straight to Installing Dependencies.

    Tool Calling Explained

    Before diving into implementation, it’s helpful to understand what tool calling is and why it matters for a robust RAG-based chatbot.

    Tool calling lets your LLM not only generate free-form text but also invoke predefined functions, or “tools”, with strictly validated arguments. By exposing only a controlled set of server-side capabilities (for example, looking up the current time, querying an external API, or managing user sessions), you:

    1. Keep responses grounded in live data or protected operations, reducing hallucinations.
    2. Enforce type safety at runtime via Zod schemas, so GPT-4 can’t supply malformed parameters.
    3. Enable multi-step workflows, where the model reasons about which tool to call, what arguments to pass, and how to incorporate the tool’s output back into its natural-language answer.

    In our setup, we register each tool with a name, description, and a Zod schema that describes the allowed parameters. When GPT-4 decides to call a tool, the AI SDK intercepts that intent, validates the arguments against the Zod schema, runs the tool’s execute function on the server, and then feeds the result back into the model’s next generation step. This orchestration happens entirely within the streaming response, so the user sees a seamless, conversational experience even when live data or actions are involved.
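
    As a minimal sketch of that registration pattern (this example tool and its fields are illustrative, not part of the chatbot), a tool is just a description, a Zod schema, and a server‑side `execute` function:

    import { tool } from 'ai';
    import { z } from 'zod';

    // Illustrative example: a tool the model can call to fetch live data it cannot know.
    export const getCurrentTimeTool = tool({
      description: 'Return the current server time for a given IANA time zone',
      parameters: z.object({
        timeZone: z.string().describe('IANA time zone, e.g. "Europe/Paris"'),
      }),
      // Runs on the server only after the arguments pass Zod validation.
      execute: async ({ timeZone }) => ({
        now: new Date().toLocaleString('en-US', { timeZone }),
      }),
    });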

    Tool-Augmented RAG Flow

    1. User question is sent to the chat endpoint.
    2. GPT-4 analyzes the prompt and, if it requires external knowledge, emits a tool call to `vectorSearch` with a `query` argument.
    3. The `vectorSearch` tool embeds that query, performs a pgvector cosine search in `content_chunks`, and returns the most relevant snippets.
    4. GPT-4 receives those snippets, constructs a final prompt that includes the retrieved context, and generates a grounded answer.
    5. The response is streamed back to the client UI, giving users a real-time chat experience enriched by your custom knowledge base.
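
    The same loop can be exercised without streaming, which makes the intermediate steps easier to inspect. A rough sketch, assuming the `vectorSearchTool` defined later in this post and the AI SDK's `generateText`:

    import { generateText } from 'ai';
    import { openai } from '@ai-sdk/openai';
    import { vectorSearchTool } from '@/tools/vectorSearch';

    // Non-streaming version of the flow: the model emits a tool call, the SDK executes
    // it, and the tool result is fed back to the model for the final grounded answer.
    const { text, steps } = await generateText({
      model: openai('gpt-4.1'),
      prompt: 'How do I configure pgvector for cosine similarity?',
      tools: { vectorSearch: vectorSearchTool },
      maxSteps: 5,
    });

    console.log(steps.flatMap((s) => s.toolCalls)); // which tools were called, with what arguments
    console.log(text); // the final answer, grounded in the retrieved chunks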



    Installing Dependencies

    npm install ai @ai-sdk/openai @openai/agents zod shadcn-ui pg
    
    Package                     Purpose
    `@openai/agents`            Registers functions as callable tools
    `zod`                       Runtime schema validation
    `shadcn-ui`                 Tailwind‑friendly React components
    `ai` & `@ai-sdk/openai`     Manage LLM calls & streaming
    `pg`                        PostgreSQL client

    Initialise shadcn UI and select a few components:

    npx shadcn@latest init
    npx shadcn@latest add button input card scroll-area
    

    Defining the `vectorSearch` Tool

    Use the AI SDK’s `tool` helper to define a `vectorSearch` tool that embeds user queries, searches your Postgres vector store, and returns the top results:

    // tools/vectorSearch.ts
    import { embed, tool } from 'ai';
    import { openai } from '@ai-sdk/openai';
    import { z } from 'zod';
    import { Pool } from 'pg';
    
    const db = new Pool({ connectionString: process.env.DATABASE_URL });
    
    // Define the vector search tool
    export const vectorSearchTool = tool({
      description: 'Search for relevant information in the knowledge base',
      parameters: z.object({
        query: z.string().describe('The search query to find relevant information'),
      }),
      execute: async ({ query }) => {
        console.log('Searching for:', query);
    
        // Embed the search query
        const { embedding: qVec } = await embed({
          model: openai.embedding('text-embedding-3-small'),
          value: query,
        });
    
        const qVecString = `[${qVec.join(',')}]`;
    
        // Retrieve top-5 most similar chunks
        const { rows } = await db.query<{ content: string; source: string }>(
          `SELECT content, source
             FROM content_chunks
         ORDER BY embedding <=> $1
            LIMIT 5`,
          [qVecString]
        );
    
        const results = rows.map((r, i) => ({
          content: r.content,
          source: r.source,
          rank: i + 1,
        }));
    
        return { results };
      },
    });
    
    • We declare a Zod schema `{ query: string }` to validate incoming parameters.
    • The tool embeds the query text and runs a cosine‑distance search in pgvector (see the index sketch below).
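
    If Part 2 did not already add a vector index, the search above falls back to a sequential scan on large tables. A minimal sketch of a one‑off migration, assuming pgvector 0.5+ and the `content_chunks` schema from Part 2 (the index name is illustrative):

    import { Pool } from 'pg';

    const db = new Pool({ connectionString: process.env.DATABASE_URL });

    // HNSW index on the cosine-distance operator class, so `embedding <=> $1`
    // can use an approximate index scan instead of scanning every row.
    await db.query(`
      CREATE INDEX IF NOT EXISTS content_chunks_embedding_hnsw
        ON content_chunks
        USING hnsw (embedding vector_cosine_ops)
    `);
    await db.end();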

    Extending the RAG API Route with Function Calling

    Modify app/api/chat/route.ts to register the tool and let the model decide when to call it:

    import { streamText } from 'ai';
    import { openai } from '@ai-sdk/openai';
    import { NextRequest } from 'next/server';
    import { vectorSearchTool } from '@/tools/vectorSearch';
    
    export const POST = async (req: NextRequest) => {
      const { messages } = await req.json();
    
      const systemMsg = {
        role: 'system',
        content: `You are a helpful support assistant.
        When users ask questions, use the vector search tool to find relevant information from the knowledge base.
        Base your answers on the search results.
        Always provide a response after using the tool.
        If the user asks a question that is not related to the knowledge base, say that you are not sure about the answer.`,
      };
    
      try {
        // Stream GPT-4's response with tool calling
        const result = streamText({
          model: openai('gpt-4.1'),
          messages: [systemMsg, ...messages],
          tools: {
            vectorSearch: vectorSearchTool,
          },
          maxSteps: 5, // Allow multiple tool calls and responses
        });
    
        return result.toDataStreamResponse();
      } catch (error) {
        console.error('Error in chat API:', error);
        return new Response('Internal Server Error', { status: 500 });
      }
    };
    
    1. We import `vectorSearchTool` and register it under the `vectorSearch` key in the `tools` object.
    2. The SDK validates the tool’s arguments against its Zod schema before running `execute`.
    3. GPT-4 can now emit a tool call such as `{ "tool": "vectorSearch", "arguments": { "query": "…" } }` to trigger your function.
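
    To smoke‑test the route without the UI, you can POST a message array directly and dump the streamed response. A rough sketch, assuming the dev server runs on localhost:3000 (the exact data‑stream wire format is handled by `useChat` on the client):

    // Node 18+ script: send one user message to the route and print the raw data stream.
    const res = await fetch('http://localhost:3000/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [{ role: 'user', content: 'What does the knowledge base say about chunking?' }],
      }),
    });

    // The body is a web ReadableStream of Uint8Array chunks; decode and print as they arrive.
    const decoder = new TextDecoder();
    for await (const chunk of res.body as any) {
      process.stdout.write(decoder.decode(chunk, { stream: true }));
    }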

    Building the Streaming Chat UI with Shadcn UI

    Create app/chat/page.tsx, selectively importing Shadcn components and wiring up useChat:

    'use client';
    
    import { useChat } from '@ai-sdk/react';
    import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
    import { ScrollArea } from '@/components/ui/scroll-area';
    import { Input } from '@/components/ui/input';
    import { Button } from '@/components/ui/button';
    import { useState } from 'react';
    
    export default function Chat() {
      const { messages, input, handleInputChange, handleSubmit } = useChat({
        api: '/api/chat',
      });
    
      const customHandleSubmit = async (e: React.FormEvent) => {
        e.preventDefault();
        await handleSubmit(e); // Call the handleSubmit from useChat
      };
    
      const renderMessage = (message: any, index: number) => {
        const isUser = message.role === 'user';
        const hasToolInvocations = message.toolInvocations && message.toolInvocations.length > 0;
    
        return (
          <div className={`mb-4 ${isUser ? 'text-right' : 'text-left'}`}>
            <div className={`inline-block p-2 rounded-lg ${isUser ? 'bg-primary text-primary-foreground' : 'bg-muted'}`}>{message.content}</div>
    
            {/* Debug section for tool calls */}
            {!isUser && hasToolInvocations && <ToolCallDebugSection toolInvocations={message.toolInvocations} />}
          </div>
        );
      };
    
      return (
        <Card className="w-full max-w-2xl mx-auto">
          <CardHeader>
            <CardTitle>Chat with AI</CardTitle>
          </CardHeader>
          <CardContent>
            <ScrollArea className="h-[60vh] mb-4 p-4 border rounded">
              {messages.map((message, index) => (
                <div key={index}>{renderMessage(message, index)}</div>
              ))}
            </ScrollArea>
            <form onSubmit={customHandleSubmit} className="flex space-x-2">
              <Input type="text" value={input} onChange={handleInputChange} placeholder="Type your message here..." className="flex-1" />
              <Button type="submit">Send</Button>
            </form>
          </CardContent>
        </Card>
      );
    }
    
    function ToolCallDebugSection({ toolInvocations }: { toolInvocations: any[] }) {
      const [isExpanded, setIsExpanded] = useState(false);
    
      return (
        <div className="mt-2 text-left">
          <button onClick={() => setIsExpanded(!isExpanded)} className="text-xs text-gray-500 hover:text-gray-700 flex items-center gap-1">
            <span>{isExpanded ? '▼' : '▶'}</span>
            <span>Debug: Tool calls ({toolInvocations.length})</span>
          </button>
    
          {isExpanded && (
            <div className="mt-2 space-y-2 text-xs bg-gray-50 dark:bg-gray-900 p-2 rounded border">
              {toolInvocations.map((tool: any, index: number) => (
                <div key={index} className="bg-white dark:bg-gray-800 p-2 rounded border">
                  <div className="font-semibold text-blue-600 dark:text-blue-400 mb-1">🔧 {tool.toolName}</div>
                  <div className="text-gray-600 dark:text-gray-300 mb-2">
                    <strong>Query:</strong> {tool.args?.query}
                  </div>
                  {tool.result && (
                    <div>
                      <div className="font-semibold text-green-600 dark:text-green-400 mb-1">Results:</div>
                      <div className="space-y-1 max-h-32 overflow-y-auto">
                        {tool.result.results?.map((result: any, idx: number) => (
                          <div key={idx} className="bg-gray-100 dark:bg-gray-700 p-1 rounded">
                            <div className="text-gray-800 dark:text-gray-200 text-xs">{result.content}</div>
                            <div className="text-gray-500 text-xs mt-1">
                              Source: {result.source} | Rank: {result.rank}
                            </div>
                          </div>
                        ))}
                      </div>
                    </div>
                  )}
                </div>
              ))}
            </div>
          )}
        </div>
      );
    }
    
    • We import `Card`, `ScrollArea`, `Input`, and `Button` from shadcn UI to style the chat shell and controls.
    • `useChat()` auto-posts to our API route, handling tool calls under the hood.

    One‑Click Deploy to Vercel

    1. Push your repo to GitHub.
    2. In Vercel → Add Project, import the repo and set environment variables:
      • `DATABASE_URL`
      • `OPENAI_API_KEY`
    3. Click Deploy. Vercel auto‑detects Next.js and streams responses out‑of‑the‑box.
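
    Locally, the same two variables go in `.env.local`; the values below are placeholders, not real credentials:

    # .env.local (placeholders)
    DATABASE_URL=postgresql://user:password@localhost:5432/your_database
    OPENAI_API_KEY=sk-...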

    Your chatbot now features type‑safe tool calling, a vector‑powered knowledge base, and a refined shadcn UI front‑end—ready for users.


    References:

    Part 1: Vector Search Embeddings and RAG

    Part 2: Postgres RAG Stack: Embedding, Chunking & Vector Search

    Repo: https://github.com/aberhamm/rag-chatbot-demo
