This is Part 3 of a three-part series (links at the bottom).
In Part Two, we moved from concept to execution by building the foundation of a Retrieval‑Augmented Generation (RAG) system. We set up a Postgres database with pgvector, defined a schema, wrote a script to embed and chunk text, and validated vector search with cosine similarity.
In this final installment, we’ll build a Next.js chatbot interface that streams GPT‑4 responses powered by your indexed content and demonstrates how to use GPT‑4 function‑calling (“tool calling”) for type‑safe, server‑side operations. Along the way, we’ll integrate polished components from shadcn UI to level‑up the front‑end experience.
Overview
Prerequisite: you should already have the `rag-chatbot-demo` repo from Parts 1 & 2, with Dockerised PostgreSQL 17 + pgvector, the `content_chunks` schema, and embeddings ingested. The repo is linked in the references at the bottom.
By the end of this guide you will:
- Install new dependencies: `zod` for runtime schema validation, `@ai-sdk/react` for the chat hook, and shadcn/ui for polished components.
- Define a `vectorSearch` tool with the AI SDK's `tool` helper and a Zod schema, which embeds a user query, runs a pgvector search, and returns the top results.
- Extend the RAG API route so GPT‑4 can decide when to call `vectorSearch`, merging the tool’s output into its streamed answer.
- Build a streaming chat UI that swaps vanilla elements for shadcn UI inputs, buttons, and cards.
- Deploy to Vercel with a single click.
If you already have the project folder from earlier parts, skip straight to Install Dependencies.
Tool Calling Explained
Before diving into implementation, it’s helpful to understand what tool calling is and why it matters for a robust RAG-based chatbot.
Tool calling lets your LLM not only generate free-form text but also invoke predefined functions, or “tools”, with strictly validated arguments. By exposing only a controlled set of server-side capabilities (for example, looking up the current time, querying an external API, or managing user sessions), you:
- Keep responses grounded in live data or protected operations, reducing hallucinations.
- Enforce type safety at runtime via Zod schemas, so GPT-4 can’t supply malformed parameters.
- Enable multi-step workflows, where the model reasons about which tool to call, what arguments to pass, and how to incorporate the tool’s output back into its natural-language answer.
In our setup, we register each tool with a name, description, and a Zod schema that describes the allowed parameters. When GPT-4 decides to call a tool, the AI SDK intercepts that intent, validates the arguments against the Zod schema, runs the tool’s execute
function on the server, and then feeds the result back into the model’s next generation step. This orchestration happens entirely within the streaming response, so the user sees a seamless, conversational experience even when live data or actions are involved.
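Stripped of the SDK, that orchestration loop looks roughly like the sketch below. The names here (`ToolDef`, `runToolCall`) are illustrative only, not part of any library; the real validation is done by Zod and the dispatch by the AI SDK:

```typescript
// Conceptual sketch of the tool-calling loop the AI SDK runs for you.
type ToolDef = {
  description: string;
  validate: (args: unknown) => { query: string }; // stands in for a Zod schema
  execute: (args: { query: string }) => Promise<string>;
};

const tools: Record<string, ToolDef> = {
  vectorSearch: {
    description: 'Search the knowledge base',
    validate: (args) => {
      // Reject anything that doesn't match { query: string }
      if (typeof args !== 'object' || args === null || typeof (args as any).query !== 'string') {
        throw new Error('Invalid arguments for vectorSearch');
      }
      return args as { query: string };
    },
    execute: async ({ query }) => `results for: ${query}`,
  },
};

// When the model emits a tool call, the SDK validates the arguments,
// runs the tool server-side, and feeds the result into the next step.
async function runToolCall(name: string, rawArgs: unknown): Promise<string> {
  const tool = tools[name];
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  const args = tool.validate(rawArgs); // malformed parameters never reach execute
  return tool.execute(args);
}
```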
Tool-Augmented RAG Flow
- User question is sent to the chat endpoint.
- GPT-4 analyzes the prompt and, if it requires external knowledge, emits a call to the `vectorSearch` tool with a `query` argument.
- The `vectorSearch` tool embeds that query, performs a pgvector cosine search in `content_chunks`, and returns the most relevant snippets.
- GPT-4 receives those snippets, constructs a final prompt that includes the retrieved context, and generates a grounded answer.
- The response is streamed back to the client UI, giving users a real-time chat experience enriched by your custom knowledge base.
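The cosine search in step 3 is ordered by pgvector's `<=>` operator, which computes cosine distance (1 minus cosine similarity, so smaller means more similar). A minimal sketch of the quantity it orders by:

```typescript
// Cosine distance, the quantity pgvector's `<=>` operator orders by:
// distance = 1 - (a . b) / (|a| * |b|). Smaller means more similar.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```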
Installing Dependencies
```bash
npm install ai @ai-sdk/openai @ai-sdk/react zod pg
```

| Package | Purpose |
|---|---|
| `ai` & `@ai-sdk/openai` | LLM calls, streaming & the `tool` helper |
| `@ai-sdk/react` | `useChat` hook for the chat UI |
| `zod` | Runtime schema validation |
| `pg` | PostgreSQL client |

shadcn UI components are added via its CLI below, not as an npm dependency.
Initialise shadcn UI and select a few components:
```bash
npx shadcn@latest init
npx shadcn@latest add button input card scroll-area
```
Defining the `vectorSearch` Tool
Use the AI SDK's `tool` helper to create a `vectorSearch` tool that embeds user queries, searches your Postgres vector store, and returns results:
```ts
// tools/vectorSearch.ts
import { embed, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import { Pool } from 'pg';

const db = new Pool({ connectionString: process.env.DATABASE_URL });

// Define the vector search tool
export const vectorSearchTool = tool({
  description: 'Search for relevant information in the knowledge base',
  parameters: z.object({
    query: z.string().describe('The search query to find relevant information'),
  }),
  execute: async ({ query }) => {
    console.log('Searching for:', query);

    // Embed the search query
    const { embedding: qVec } = await embed({
      model: openai.embedding('text-embedding-3-small'),
      value: query,
    });
    const qVecString = `[${qVec.join(',')}]`;

    // Retrieve top-5 most similar chunks
    const { rows } = await db.query<{ content: string; source: string }>(
      `SELECT content, source
       FROM content_chunks
       ORDER BY embedding <=> $1
       LIMIT 5`,
      [qVecString]
    );

    const results = rows.map((r, i) => ({
      content: r.content,
      source: r.source,
      rank: i + 1,
    }));

    return { results };
  },
});
```
- We declare a Zod schema `{ query: string }` to validate incoming parameters.
- The tool embeds text and runs an indexed cosine search in pgvector.
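One optional refinement, not part of the tool above: also select the distance (`embedding <=> $1 AS distance` in the query) and drop weak matches before handing them to the model, so irrelevant chunks never pollute the context. The `0.5` threshold below is an assumption you would tune against your own data:

```typescript
// Post-filter retrieved rows by cosine distance before returning them.
// Assumes the SQL query also selected `embedding <=> $1 AS distance`.
type Match = { content: string; source: string; distance: number };

function filterMatches(rows: Match[], maxDistance = 0.5, limit = 5) {
  return rows
    .filter((r) => r.distance <= maxDistance) // keep only close matches
    .slice(0, limit)
    .map((r, i) => ({ content: r.content, source: r.source, rank: i + 1 }));
}
```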
Extending the RAG API Route with Function Calling
Modify `app/api/chat/route.ts` to register the tool and let the model decide when to call it:
```ts
// app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { NextRequest } from 'next/server';
import { vectorSearchTool } from '@/tools/vectorSearch';

export const POST = async (req: NextRequest) => {
  const { messages } = await req.json();

  const systemMsg = {
    role: 'system',
    content: `You are a helpful support assistant.
When users ask questions, use the vector search tool to find relevant information from the knowledge base.
Base your answers on the search results.
Always provide a response after using the tool.
If the user asks a question that is not related to the knowledge base, say that you are not sure about the answer.`,
  };

  try {
    // Stream GPT-4's response with tool calling
    const result = streamText({
      model: openai('gpt-4.1'),
      messages: [systemMsg, ...messages],
      tools: {
        vectorSearch: vectorSearchTool,
      },
      maxSteps: 5, // Allow multiple tool calls and responses
    });
    return result.toDataStreamResponse();
  } catch (error) {
    console.error('Error in chat API:', error);
    return new Response('Internal Server Error', { status: 500 });
  }
};
```
- We import `vectorSearchTool` and register it under the `vectorSearch` key in the `tools` object.
- The SDK validates tool arguments against the Zod schema at runtime, so malformed parameters never reach `execute`.
- GPT-4 can now emit a tool-call payload such as `{"tool": "vectorSearch", "arguments": {"query": "…"}}` to trigger your function.
Building the Streaming Chat UI with shadcn UI
Create `app/chat/page.tsx`, selectively importing shadcn components and wiring up `useChat`:
```tsx
// app/chat/page.tsx
'use client';

import { useChat } from '@ai-sdk/react';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { ScrollArea } from '@/components/ui/scroll-area';
import { Input } from '@/components/ui/input';
import { Button } from '@/components/ui/button';
import { useState } from 'react';

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: '/api/chat',
  });

  const customHandleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    await handleSubmit(e); // Call the handleSubmit from useChat
  };

  const renderMessage = (message: any, index: number) => {
    const isUser = message.role === 'user';
    const hasToolInvocations = message.toolInvocations && message.toolInvocations.length > 0;
    return (
      <div className={`mb-4 ${isUser ? 'text-right' : 'text-left'}`}>
        <div className={`inline-block p-2 rounded-lg ${isUser ? 'bg-primary text-primary-foreground' : 'bg-muted'}`}>{message.content}</div>
        {/* Debug section for tool calls */}
        {!isUser && hasToolInvocations && <ToolCallDebugSection toolInvocations={message.toolInvocations} />}
      </div>
    );
  };

  return (
    <Card className="w-full max-w-2xl mx-auto">
      <CardHeader>
        <CardTitle>Chat with AI</CardTitle>
      </CardHeader>
      <CardContent>
        <ScrollArea className="h-[60vh] mb-4 p-4 border rounded">
          {messages.map((message, index) => (
            <div key={index}>{renderMessage(message, index)}</div>
          ))}
        </ScrollArea>
        <form onSubmit={customHandleSubmit} className="flex space-x-2">
          <Input type="text" value={input} onChange={handleInputChange} placeholder="Type your message here..." className="flex-1" />
          <Button type="submit">Send</Button>
        </form>
      </CardContent>
    </Card>
  );
}

function ToolCallDebugSection({ toolInvocations }: { toolInvocations: any[] }) {
  const [isExpanded, setIsExpanded] = useState(false);
  return (
    <div className="mt-2 text-left">
      <button onClick={() => setIsExpanded(!isExpanded)} className="text-xs text-gray-500 hover:text-gray-700 flex items-center gap-1">
        <span>{isExpanded ? '▼' : '▶'}</span>
        <span>Debug: Tool calls ({toolInvocations.length})</span>
      </button>
      {isExpanded && (
        <div className="mt-2 space-y-2 text-xs bg-gray-50 dark:bg-gray-900 p-2 rounded border">
          {toolInvocations.map((tool: any, index: number) => (
            <div key={index} className="bg-white dark:bg-gray-800 p-2 rounded border">
              <div className="font-semibold text-blue-600 dark:text-blue-400 mb-1">🔧 {tool.toolName}</div>
              <div className="text-gray-600 dark:text-gray-300 mb-2">
                <strong>Query:</strong> {tool.args?.query}
              </div>
              {tool.result && (
                <div>
                  <div className="font-semibold text-green-600 dark:text-green-400 mb-1">Results:</div>
                  <div className="space-y-1 max-h-32 overflow-y-auto">
                    {tool.result.results?.map((result: any, idx: number) => (
                      <div key={idx} className="bg-gray-100 dark:bg-gray-700 p-1 rounded">
                        <div className="text-gray-800 dark:text-gray-200 text-xs">{result.content}</div>
                        <div className="text-gray-500 text-xs mt-1">
                          Source: {result.source} | Rank: {result.rank}
                        </div>
                      </div>
                    ))}
                  </div>
                </div>
              )}
            </div>
          ))}
        </div>
      )}
    </div>
  );
}
```
- We import `Card`, `ScrollArea`, `Input`, and `Button` from shadcn UI to style the chat surface.
- `useChat()` auto-posts to our API route, handling tool calls under the hood.
One‑Click Deploy to Vercel
- Push your repo to GitHub.
- In Vercel → Add Project, import the repo and set environment variables:
- `DATABASE_URL`
- `OPENAI_API_KEY`
- Click Deploy. Vercel auto‑detects Next.js and streams responses out‑of‑the‑box.
Your chatbot now features type‑safe tool calling, a vector‑powered knowledge base, and a refined shadcn UI front‑end, ready for users.
References:
- Part 1: Vector Search Embeddings and RAG
- Part 2: Postgres RAG Stack: Embedding, Chunking & Vector Search
- Repo: https://github.com/aberhamm/rag-chatbot-demo