
    Tool‑Augmented RAG Chatbot: GPT‑4, pgVector & Next.js

    July 25, 2025

    This is Part 3 of a three-part series (links at the bottom).

    In Part Two, we moved from concept to execution by building the foundation of a Retrieval‑Augmented Generation (RAG) system. We set up a Postgres database with pgvector, defined a schema, wrote a script to chunk and embed text, and validated vector search with cosine similarity.

    In this final installment, we’ll build a Next.js chatbot interface that streams GPT‑4 responses powered by your indexed content, and we’ll use GPT‑4 function calling (“tool calling”) for type‑safe, server‑side operations. Along the way, we’ll integrate polished components from shadcn UI to level up the front‑end experience.


    Overview

    Prerequisite: You should already have the rag‑chatbot‑demo repo from Parts 1 and 2, with Dockerised PostgreSQL 17 + pgvector, the `content_chunks` schema, and embeddings ingested. The repo is at https://github.com/aberhamm/rag-chatbot-demo.

    By the end of this guide you will:

    1. Install new dependencies — `zod` for schema validation, `@openai/agents` for tool definitions, and `shadcn/ui` for UI components.
    2. Define a vectorSearch tool using Zod to embed a user query, run a pgvector search, and return the top results.
    3. Extend the RAG API route so GPT‑4 can decide when to call `vectorSearch`, merging the tool’s output into its streamed answer.
    4. Build a streaming chat UI that swaps vanilla elements for shadcn UI inputs, buttons, and cards.
    5. Deploy to Vercel with a single click.

    If you already have the project folder from earlier parts, skip straight to Install Dependencies.

    Tool Calling Explained

    Before diving into implementation, it’s helpful to understand what tool calling is and why it matters for a robust RAG-based chatbot.

    Tool calling lets your LLM not only generate free-form text but also invoke predefined functions, or “tools”, with strictly validated arguments. By exposing only a controlled set of server-side capabilities (for example, looking up the current time, querying an external API, or managing user sessions), you:

    1. Keep responses grounded in live data or protected operations, reducing hallucinations.
    2. Enforce type safety at runtime via Zod schemas, so GPT-4 can’t supply malformed parameters.
    3. Enable multi-step workflows, where the model reasons about which tool to call, what arguments to pass, and how to incorporate the tool’s output back into its natural-language answer.

    In our setup, we register each tool with a name, description, and a Zod schema that describes the allowed parameters. When GPT-4 decides to call a tool, the AI SDK intercepts that intent, validates the arguments against the Zod schema, runs the tool’s execute function on the server, and then feeds the result back into the model’s next generation step. This orchestration happens entirely within the streaming response, so the user sees a seamless, conversational experience even when live data or actions are involved.
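    The validate-then-execute cycle described above can be sketched in a few lines. This is a dependency-free illustration, not the AI SDK's actual internals: in the real setup, `parse` is your Zod schema and the SDK does this wiring for you, and the `currentTime` tool here is purely hypothetical.

```typescript
// Sketch of what happens when the model emits a tool call: validate the
// arguments, run the server-side execute function, return the result.
type ToolCall = { toolName: string; args: unknown };

const tools = {
  currentTime: {
    // Stand-in for a Zod schema: return typed args or throw on bad input
    parse(args: unknown): { timeZone: string } {
      if (typeof args !== 'object' || args === null) throw new Error('args must be an object');
      const { timeZone } = args as { timeZone?: unknown };
      if (typeof timeZone !== 'string') throw new Error('timeZone must be a string');
      return { timeZone };
    },
    // Server-side effect: runs on the server, never in the client bundle
    async execute({ timeZone }: { timeZone: string }) {
      return { now: new Date().toLocaleString('en-US', { timeZone }) };
    },
  },
};

async function handleToolCall(call: ToolCall) {
  const tool = tools[call.toolName as keyof typeof tools];
  if (!tool) throw new Error(`Unknown tool: ${call.toolName}`);
  const args = tool.parse(call.args); // reject malformed parameters at runtime
  return tool.execute(args);          // result is fed back into the model's next step
}
```

    The SDK then appends the tool result to the conversation and lets the model generate its next turn with that data in context.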

    Tool-Augmented RAG Flow

    1. User question is sent to the chat endpoint.
    2. GPT-4 analyzes the prompt and, if it requires external knowledge, emits a call to the `vectorSearch` tool with a `query` argument.
    3. The `vectorSearch` tool embeds that query, performs a pgvector cosine search in `content_chunks`, and returns the most relevant snippets.
    4. GPT-4 receives those snippets, constructs a final prompt that includes the retrieved context, and generates a grounded answer.
    5. The response is streamed back to the client UI, giving users a real-time chat experience enriched by your custom knowledge base.

    Figure: Tool‑augmented RAG flow
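    In the chat transcript, step 2 surfaces as a tool invocation attached to the assistant message. Simplified, the shape the chat UI reads later (`toolInvocations`, `toolName`, `args`, `result`) looks roughly like this; the query text is illustrative:

```json
{
  "role": "assistant",
  "toolInvocations": [
    {
      "toolName": "vectorSearch",
      "args": { "query": "how do I configure pgvector?" },
      "result": { "results": [{ "content": "…", "source": "…", "rank": 1 }] }
    }
  ]
}
```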


    Installing Dependencies

    npm install ai @ai-sdk/openai @openai/agents zod shadcn-ui pg
    
    What each package does:
    • `@openai/agents`: registers functions as callable tools
    • `zod`: runtime schema validation
    • `shadcn-ui`: Tailwind‑friendly React components
    • `ai` & `@ai-sdk/openai`: manage LLM calls and streaming
    • `pg`: PostgreSQL client

    Initialise shadcn UI and select a few components:

    npx shadcn@latest init
    npx shadcn@latest add button input card scroll-area
    

    Defining the `vectorSearch` Tool

    Use the AI SDK’s `tool` helper (imported from `ai`) to create a vectorSearch tool that embeds user queries, searches your Postgres vector store, and returns results:

    // tools/vectorSearch.ts
    import { embed, tool } from 'ai';
    import { openai } from '@ai-sdk/openai';
    import { z } from 'zod';
    import { Pool } from 'pg';
    
    const db = new Pool({ connectionString: process.env.DATABASE_URL });
    
    // Define the vector search tool
    export const vectorSearchTool = tool({
      description: 'Search for relevant information in the knowledge base',
      parameters: z.object({
        query: z.string().describe('The search query to find relevant information'),
      }),
      execute: async ({ query }) => {
        console.log('Searching for:', query);
    
        // Embed the search query
        const { embedding: qVec } = await embed({
          model: openai.embedding('text-embedding-3-small'),
          value: query,
        });
    
        const qVecString = `[${qVec.join(',')}]`;
    
        // Retrieve top-5 most similar chunks
        const { rows } = await db.query<{ content: string; source: string }>(
          `SELECT content, source
             FROM content_chunks
         ORDER BY embedding <=> $1
            LIMIT 5`,
          [qVecString]
        );
    
        const results = rows.map((r, i) => ({
          content: r.content,
          source: r.source,
          rank: i + 1,
        }));
    
        return { results };
      },
    });
    
    • We declare a Zod schema `{ query: string }` to validate incoming parameters.
    • The tool embeds text and runs an indexed cosine search in pgvector.
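    For the search to actually use an index rather than a sequential scan, the `embedding` column needs an approximate-nearest-neighbour index matching the `<=>` (cosine distance) operator. If your Part 2 schema doesn’t already include one, this sketch adds it (assumes pgvector 0.5+ for HNSW support; the index name is arbitrary):

```sql
-- HNSW index on the embedding column, using the cosine operator class
-- so it matches the <=> ordering in the tool's query.
CREATE INDEX IF NOT EXISTS content_chunks_embedding_idx
    ON content_chunks
    USING hnsw (embedding vector_cosine_ops);
```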

    Extending the RAG API Route with Function Calling

    Modify app/api/chat/route.ts to register the tool and let the model decide when to call it:

    import { streamText } from 'ai';
    import { openai } from '@ai-sdk/openai';
    import { NextRequest } from 'next/server';
    import { vectorSearchTool } from '@/tools/vectorSearch';
    
    export const POST = async (req: NextRequest) => {
      const { messages } = await req.json();
    
      const systemMsg = {
        role: 'system',
        content: `You are a helpful support assistant.
        When users ask questions, use the vector search tool to find relevant information from the knowledge base.
        Base your answers on the search results.
        Always provide a response after using the tool.
        If the user asks a question that is not related to the knowledge base, say that you are not sure about the answer.`,
      };
    
      try {
        // Stream GPT-4's response with tool calling
        const result = streamText({
          model: openai('gpt-4.1'),
          messages: [systemMsg, ...messages],
          tools: {
            vectorSearch: vectorSearchTool,
          },
          maxSteps: 5, // Allow multiple tool calls and responses
        });
    
        return result.toDataStreamResponse();
      } catch (error) {
        console.error('Error in chat API:', error);
        return new Response('Internal Server Error', { status: 500 });
      }
    };
    
    1. We import `vectorSearchTool` and register it under the `vectorSearch` key in the `tools` object.
    2. The SDK validates every tool call’s arguments against the Zod schema at runtime.
    3. GPT-4 can now emit a tool call such as `{"tool": "vectorSearch", "arguments": {"query": "…"}}` to trigger your function.

    Building the Streaming Chat UI with shadcn UI

    Create app/chat/page.tsx, selectively importing shadcn UI components and wiring up `useChat`:

    'use client';
    
    import { useChat } from '@ai-sdk/react';
    import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
    import { ScrollArea } from '@/components/ui/scroll-area';
    import { Input } from '@/components/ui/input';
    import { Button } from '@/components/ui/button';
    import { useState } from 'react';
    
    export default function Chat() {
      const { messages, input, handleInputChange, handleSubmit } = useChat({
        api: '/api/chat',
      });
    
      const customHandleSubmit = async (e: React.FormEvent) => {
        e.preventDefault();
        await handleSubmit(e); // Call the handleSubmit from useChat
      };
    
      const renderMessage = (message: any, index: number) => {
        const isUser = message.role === 'user';
        const hasToolInvocations = message.toolInvocations && message.toolInvocations.length > 0;
    
        return (
          <div className={`mb-4 ${isUser ? 'text-right' : 'text-left'}`}>
            <div className={`inline-block p-2 rounded-lg ${isUser ? 'bg-primary text-primary-foreground' : 'bg-muted'}`}>{message.content}</div>
    
            {/* Debug section for tool calls */}
            {!isUser && hasToolInvocations && <ToolCallDebugSection toolInvocations={message.toolInvocations} />}
          </div>
        );
      };
    
      return (
        <Card className="w-full max-w-2xl mx-auto">
          <CardHeader>
            <CardTitle>Chat with AI</CardTitle>
          </CardHeader>
          <CardContent>
            <ScrollArea className="h-[60vh] mb-4 p-4 border rounded">
              {messages.map((message, index) => (
                <div key={index}>{renderMessage(message, index)}</div>
              ))}
            </ScrollArea>
            <form onSubmit={customHandleSubmit} className="flex space-x-2">
              <Input type="text" value={input} onChange={handleInputChange} placeholder="Type your message here..." className="flex-1" />
              <Button type="submit">Send</Button>
            </form>
          </CardContent>
        </Card>
      );
    }
    
    function ToolCallDebugSection({ toolInvocations }: { toolInvocations: any[] }) {
      const [isExpanded, setIsExpanded] = useState(false);
    
      return (
        <div className="mt-2 text-left">
          <button onClick={() => setIsExpanded(!isExpanded)} className="text-xs text-gray-500 hover:text-gray-700 flex items-center gap-1">
            <span>{isExpanded ? '▼' : '▶'}</span>
            <span>Debug: Tool calls ({toolInvocations.length})</span>
          </button>
    
          {isExpanded && (
            <div className="mt-2 space-y-2 text-xs bg-gray-50 dark:bg-gray-900 p-2 rounded border">
              {toolInvocations.map((tool: any, index: number) => (
                <div key={index} className="bg-white dark:bg-gray-800 p-2 rounded border">
                  <div className="font-semibold text-blue-600 dark:text-blue-400 mb-1">🔧 {tool.toolName}</div>
                  <div className="text-gray-600 dark:text-gray-300 mb-2">
                    <strong>Query:</strong> {tool.args?.query}
                  </div>
                  {tool.result && (
                    <div>
                      <div className="font-semibold text-green-600 dark:text-green-400 mb-1">Results:</div>
                      <div className="space-y-1 max-h-32 overflow-y-auto">
                        {tool.result.results?.map((result: any, idx: number) => (
                          <div key={idx} className="bg-gray-100 dark:bg-gray-700 p-1 rounded">
                            <div className="text-gray-800 dark:text-gray-200 text-xs">{result.content}</div>
                            <div className="text-gray-500 text-xs mt-1">
                              Source: {result.source} | Rank: {result.rank}
                            </div>
                          </div>
                        ))}
                      </div>
                    </div>
                  )}
                </div>
              ))}
            </div>
          )}
        </div>
      );
    }
    
    • We import `Button`, `Input`, `Card`, and `ScrollArea` from shadcn UI to style the chat controls.
    • `useChat()` auto-posts to our API route, handling tool calls under the hood.

    One‑Click Deploy to Vercel

    1. Push your repo to GitHub.
    2. In Vercel → Add Project, import the repo and set environment variables:
      • `DATABASE_URL`
      • `OPENAI_API_KEY`
    3. Click Deploy. Vercel auto‑detects Next.js and streams responses out‑of‑the‑box.
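    For local development, the same two variables can live in a `.env.local` file, which Next.js loads automatically in dev. The values below are placeholders; substitute your own connection string and API key, and never commit real credentials:

```
# .env.local (gitignored; loaded automatically by Next.js)
DATABASE_URL=postgres://postgres:postgres@localhost:5432/rag_chatbot_demo
OPENAI_API_KEY=sk-...
```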

    Your chatbot now features type‑safe tool calling, a vector‑powered knowledge base, and a refined shadcn UI front‑end—ready for users.


    References:

    Part 1: Vector Search Embeddings and RAG

    Part 2: Postgres RAG Stack: Embedding, Chunking & Vector Search

    Repo: https://github.com/aberhamm/rag-chatbot-demo

