A Coding Implementation to Build an Advanced Web Intelligence Agent with Tavily and Gemini AI

In this tutorial, we build an interactive web intelligence agent powered by Tavily and Google’s Gemini AI. We configure the agent to extract content from web pages, run AI-driven analysis over that content, and present the results in the terminal. The tool combines interactive prompts, error handling with fallbacks, and a terminal interface styled with rich.


import os
import json
import asyncio
from typing import List, Dict, Any
from dataclasses import dataclass
from rich.console import Console
from rich.progress import track
from rich.panel import Panel
from rich.markdown import Markdown

We import the standard-library modules for environment access, JSON parsing, asynchronous execution, type annotations, and dataclasses, along with the rich library, which provides styled console output, progress bars, panels, and Markdown rendering in the terminal.


from langchain_tavily import TavilyExtract
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent

We import the core LangChain components: TavilyExtract retrieves web page content, init_chat_model sets up the Gemini chat model, and create_react_agent builds a ReAct-style agent that decides when to call the extraction tool while answering a query. Together, these form the core of the web intelligence workflow.


@dataclass
class WebIntelligence:
    """Web Intelligence Configuration"""
    tavily_key: str = os.getenv("TAVILY_API_KEY", "")
    google_key: str = os.getenv("GOOGLE_API_KEY", "")
    extract_depth: str = "advanced"
    max_urls: int = 10


The WebIntelligence dataclass is a configuration container. It holds the API keys for Tavily and Google Gemini, read from the TAVILY_API_KEY and GOOGLE_API_KEY environment variables, and sets extraction parameters such as extract_depth and the maximum number of URLs (max_urls). Keeping these settings in one place makes the agent easier to configure and customize.
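
One detail worth knowing: a default written as os.getenv(...) in a dataclass body is evaluated once, when the class is defined, so keys exported later in the same session are not picked up. A minimal sketch of an alternative, assuming we want the lookup to happen each time a config object is created, uses field(default_factory=...). The class name LazyWebIntelligence is hypothetical and not part of the original tutorial.

```python
import os
from dataclasses import dataclass, field


@dataclass
class LazyWebIntelligence:
    """Hypothetical variant of WebIntelligence that reads keys at instantiation time."""
    # default_factory runs on every LazyWebIntelligence() call, not at class definition
    tavily_key: str = field(default_factory=lambda: os.getenv("TAVILY_API_KEY", ""))
    google_key: str = field(default_factory=lambda: os.getenv("GOOGLE_API_KEY", ""))
    extract_depth: str = "advanced"
    max_urls: int = 10


# Usage: a key set after the class is defined is still visible to new instances
os.environ["TAVILY_API_KEY"] = "demo-key"  # placeholder value for illustration only
print(LazyWebIntelligence().tavily_key)
```

With this variant, the interactive prompt in _setup_environment would only be needed when the variable is genuinely absent at the time the config is created.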

class SmartWebAgent:
    """Intelligent Web Content Extraction & Analysis Agent"""
   
    def __init__(self, config: WebIntelligence):
        self.config = config
        self.console = Console()
        self._setup_environment()
        self._initialize_tools()
   
    def _setup_environment(self):
        """Setup API keys with interactive prompts"""
        if not self.config.tavily_key:
            self.config.tavily_key = input(" Enter Tavily API Key: ")
            os.environ["TAVILY_API_KEY"] = self.config.tavily_key
           
        if not self.config.google_key:
            self.config.google_key = input(" Enter Google Gemini API Key: ")
            os.environ["GOOGLE_API_KEY"] = self.config.google_key
   
    def _initialize_tools(self):
        """Initialize AI tools and agents"""
        self.console.print("  Initializing AI Tools...", style="bold blue")
       
        try:
            self.extractor = TavilyExtract(
                extract_depth=self.config.extract_depth,
                include_images=False,  
                include_raw_content=False,
                max_results=3
            )
           
            self.llm = init_chat_model(
                "gemini-2.0-flash",
                model_provider="google_genai",
                temperature=0.3,
                max_tokens=1024
            )
           
            test_response = self.llm.invoke("Say 'AI tools initialized successfully!'")
            self.console.print(f" LLM Test: {test_response.content}", style="green")
           
            self.agent = create_react_agent(self.llm, [self.extractor])
           
            self.console.print(" AI Agent Ready!", style="bold green")
           
        except Exception as e:
            self.console.print(f" Initialization Error: {e}", style="bold red")
            self.console.print(" Check your API keys and internet connection", style="yellow")
            raise
   
    def extract_content(self, urls: List[str]) -> Dict[str, Any]:
        """Extract and structure content from URLs"""
        results = {}
       
        for url in track(urls, description=" Extracting content..."):
            try:
                response = self.extractor.invoke({"urls": [url]})
                # The tool may return a plain dict or an object with a .content attribute; handle both
                raw = getattr(response, "content", response)
                content = json.loads(raw) if isinstance(raw, str) else raw
                results[url] = {
                    "status": "success",
                    "data": content,
                    "summary": content.get("summary", "No summary available")[:200] + "..."
                }
            except Exception as e:
                results[url] = {"status": "error", "error": str(e)}
       
        return results
   
    def analyze_with_ai(self, query: str, urls: List[str] = None) -> str:
        """Intelligent analysis using AI agent"""
        try:
            if urls:
                message = f"Use the tavily_extract tool to analyze these URLs and answer: {query}\nURLs: {urls}"
            else:
                message = query
               
            self.console.print(f" AI Analysis: {query}", style="bold magenta")
           
            messages = [{"role": "user", "content": message}]
           
            all_content = []
            with self.console.status(" AI thinking..."):
                try:
                    for step in self.agent.stream({"messages": messages}, stream_mode="values"):
                        if "messages" in step and step["messages"]:
                            for msg in step["messages"]:
                                if hasattr(msg, 'content') and msg.content and msg.content not in all_content:
                                    all_content.append(str(msg.content))
                except Exception as stream_error:
                    self.console.print(f" Stream error: {stream_error}", style="yellow")
           
            if not all_content:
                self.console.print(" Trying direct AI invocation...", style="yellow")
                try:
                    response = self.llm.invoke(message)
                    return str(response.content) if hasattr(response, 'content') else str(response)
                except Exception as direct_error:
                    self.console.print(f" Direct error: {direct_error}", style="yellow")
                   
                    if urls:
                        self.console.print(" Extracting content first...", style="blue")
                        extracted = self.extract_content(urls)
                        content_summary = "\n".join([
                            f"URL: {url}\nContent: {result.get('summary', 'No content')}\n"
                            for url, result in extracted.items() if result.get('status') == 'success'
                        ])

                        fallback_query = f"Based on this content, {query}:\n\n{content_summary}"
                        response = self.llm.invoke(fallback_query)
                        return str(response.content) if hasattr(response, 'content') else str(response)
           
            return "\n".join(all_content) if all_content else " Unable to generate response. Please check your API keys and try again."

        except Exception as e:
            return f" Analysis failed: {str(e)}\n\nTip: Make sure your API keys are valid and you have internet connectivity."
   
    def display_results(self, results: Dict[str, Any]):
        """Beautiful result display"""
        for url, result in results.items():
            if result["status"] == "success":
                panel = Panel(
                    f" [bold blue]{url}[/bold blue]\n\n{result['summary']}",
                    title=" Extracted Content",
                    border_style="green"
                )
            else:
                panel = Panel(
                    f" [bold red]{url}[/bold red]\n\n Error: {result['error']}",
                    title=" Extraction Failed",
                    border_style="red"
                )
            self.console.print(panel)


The SmartWebAgent class encapsulates the extraction and analysis logic, using the Tavily and Google Gemini APIs. It prompts for any API keys missing from the environment, initializes the extractor, the chat model, and the agent, and extracts content from the provided URLs. For analysis, it streams the agent's output; if the stream returns nothing, it falls back to a direct model call and, if that fails, to a prompt built from pre-extracted summaries. Results are rendered in rich panels to keep the terminal output readable.
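
The extraction step depends on the shape of the tool's response, which the tutorial does not show. The following is a minimal sketch of a defensive normalizer, assuming the response may arrive as a dict, a JSON string, or an object with a content attribute. It also assumes, without confirmation from the source, that page text may live under results[i]["raw_content"] when no "summary" key is present. The helper name summarize_extraction is hypothetical.

```python
import json
from typing import Any


def summarize_extraction(response: Any, limit: int = 200) -> str:
    """Return a short preview of an extraction response (hypothetical helper)."""
    # Unwrap message-like objects; plain dicts and strings pass through unchanged
    raw = getattr(response, "content", response)
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except json.JSONDecodeError:
            return raw[:limit]  # not JSON: treat the string itself as the text
    if not isinstance(raw, dict):
        return "No summary available"
    # Prefer an explicit summary; otherwise fall back to the first result's text.
    # The "results"/"raw_content" layout is an assumption about the response shape.
    text = raw.get("summary")
    if not text:
        results = raw.get("results") or []
        first = results[0] if results and isinstance(results[0], dict) else {}
        text = first.get("raw_content")
    if not text:
        return "No summary available"
    text = str(text)
    # Append an ellipsis only when the text was actually truncated
    return text[:limit] + ("..." if len(text) > limit else "")
```

A helper like this could stand in for the inline summary expression in extract_content, so that an unexpected response shape yields a placeholder instead of an exception.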


def run_async_safely(coro):
    """Run async function safely in any environment"""
    try:
        loop = asyncio.get_running_loop()
        import nest_asyncio
        nest_asyncio.apply()
        return asyncio.run(coro)
    except RuntimeError:
        return asyncio.run(coro)
    except ImportError:
        print("  Running in sync mode. Install nest_asyncio for better performance.")
        return None


The run_async_safely function is a helper for running a coroutine from both standard scripts and interactive notebooks. If an event loop is already running, as in a notebook, it applies nest_asyncio so that asyncio.run can be called from inside that loop; if no loop is running, it calls asyncio.run directly. If nest_asyncio is not installed, it prints a notice and returns None without running the coroutine. Note that the main function shown next does not call this helper.


def main():
    """Interactive Web Intelligence Demo"""
    console = Console()
    console.print(Panel(" Web Intelligence Agent", style="bold cyan", subtitle="Powered by Tavily & Gemini"))
   
    config = WebIntelligence()
    agent = SmartWebAgent(config)
   
    demo_urls = [
        "https://en.wikipedia.org/wiki/Artificial_intelligence",
        "https://en.wikipedia.org/wiki/Machine_learning",
        "https://en.wikipedia.org/wiki/Quantum_computing"
    ]
   
    while True:
        console.print("\n" + "="*60)
        console.print(" Choose an option:", style="bold yellow")
        console.print("1. Extract content from URLs")
        console.print("2. AI-powered analysis")
        console.print("3. Demo with sample URLs")
        console.print("4. Exit")
       
        choice = input("\nEnter choice (1-4): ").strip()
       
        if choice == "1":
            urls_input = input("Enter URLs (comma-separated): ")
            # Drop empty entries so blank input does not produce an empty-string URL
            urls = [url.strip() for url in urls_input.split(",") if url.strip()]
            results = agent.extract_content(urls)
            agent.display_results(results)
           
        elif choice == "2":
            query = input("Enter your analysis query: ")
            urls_input = input("Enter URLs to analyze (optional, comma-separated): ")
            urls = [url.strip() for url in urls_input.split(",") if url.strip()] if urls_input.strip() else None
           
            try:
                response = agent.analyze_with_ai(query, urls)
                console.print(Panel(Markdown(response), title=" AI Analysis", border_style="blue"))
            except Exception as e:
                console.print(f" Analysis failed: {e}", style="bold red")
           
        elif choice == "3":
            console.print(" Running demo with AI & Quantum Computing URLs...")
            results = agent.extract_content(demo_urls)
            agent.display_results(results)
           
            response = agent.analyze_with_ai(
                "Compare AI, ML, and Quantum Computing. What are the key relationships?",
                demo_urls
            )
            console.print(Panel(Markdown(response), title=" Comparative Analysis", border_style="magenta"))
           
        elif choice == "4":
            console.print(" Goodbye!", style="bold green")
            break
        else:
            console.print(" Invalid choice!", style="bold red")


if __name__ == "__main__":
    main()


The main function provides an interactive command-line demo of the agent. Its menu lets users extract content from custom URLs, run an AI-driven analysis with an optional list of URLs, or run a predefined demo that compares Wikipedia articles on AI, machine learning, and quantum computing. Output is formatted with rich panels and Markdown rendering.
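
The menu options split user input on commas before passing it to the agent. A small helper could centralize that parsing and also apply the max_urls limit from the configuration, which the code shown here defines but does not otherwise use. The sketch below is one possible shape for it; parse_urls is a hypothetical name, and the rule of accepting only absolute http(s) URLs is an assumption rather than something the tutorial specifies.

```python
from typing import List
from urllib.parse import urlparse


def parse_urls(raw: str, max_urls: int = 10) -> List[str]:
    """Split comma-separated input into a cleaned list of http(s) URLs (hypothetical helper)."""
    seen = set()
    urls: List[str] = []
    for part in raw.split(","):
        if len(urls) >= max_urls:
            break  # honor the configured cap
        candidate = part.strip()
        if not candidate or candidate in seen:
            continue  # skip blanks and duplicates
        parsed = urlparse(candidate)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue  # skip anything that is not an absolute http(s) URL
        seen.add(candidate)
        urls.append(candidate)
    return urls
```

In main, a call such as parse_urls(urls_input, config.max_urls) could replace the inline list comprehensions for options 1 and 2.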

In conclusion, we’ve built a web intelligence agent that combines Tavily content extraction with analysis by Google’s Gemini AI. With content extraction, agent-driven queries, and formatted terminal output, it can speed up research tasks over web content. From this foundation, we can extend the agent, customize it for specific use cases, and integrate it into our own projects.


The post A Coding Implementation to Build an Advanced Web Intelligence Agent with Tavily and Gemini AI appeared first on MarkTechPost.
