    A Coding Implementation to Build an Advanced Web Intelligence Agent with Tavily and Gemini AI

    June 4, 2025

    In this tutorial, we build an interactive web intelligence agent powered by Tavily and Google’s Gemini AI. We configure the agent to extract structured content from web pages, analyze that content with an LLM-driven agent, and present the results in the terminal. Interactive prompts, error handling with fallbacks, and Rich-formatted output make the tool a convenient environment for exploring web content extraction and AI-based content analysis.

    import os
    import json
    import asyncio
    from typing import List, Dict, Any
    from dataclasses import dataclass
    from rich.console import Console
    from rich.progress import track
    from rich.panel import Panel
    from rich.markdown import Markdown

    We import the standard-library modules for environment access, JSON parsing, asynchronous execution, and type annotations, along with the Rich library, which provides the console, progress bar, panel, and Markdown rendering used for terminal output.

    from langchain_tavily import TavilyExtract
    from langchain.chat_models import init_chat_model
    from langgraph.prebuilt import create_react_agent

    We import the core LangChain and LangGraph components: TavilyExtract retrieves web page content, init_chat_model sets up the Gemini chat model, and create_react_agent builds a ReAct-style agent that decides when to call the extraction tool while answering a query. Together, these form the engine of the web intelligence workflow.

    @dataclass
    class WebIntelligence:
        """Web Intelligence Configuration"""
        tavily_key: str = os.getenv("TAVILY_API_KEY", "")
        google_key: str = os.getenv("GOOGLE_API_KEY", "")
        extract_depth: str = "advanced"
        max_urls: int = 10

    The WebIntelligence dataclass is a configuration container: it holds the API keys for Tavily and Google Gemini and sets extraction parameters such as extract_depth and the maximum number of URLs (max_urls). Note that the os.getenv defaults are evaluated once, when the class is defined, so the keys must already be in the environment at that point; otherwise the agent prompts for them.
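
Because the os.getenv defaults are read at class-definition time, a key exported later in the same session is not picked up by new instances. A minimal sketch of one way around this, assuming you prefer to read the environment each time a config object is created. LazyWebIntelligence is a hypothetical name used for illustration and is not part of the original notebook.

```python
import os
from dataclasses import dataclass, field


@dataclass
class LazyWebIntelligence:
    """Hypothetical variant of WebIntelligence that reads keys at instantiation."""
    # default_factory runs each time an instance is created,
    # not once when the class body is evaluated.
    tavily_key: str = field(default_factory=lambda: os.getenv("TAVILY_API_KEY", ""))
    google_key: str = field(default_factory=lambda: os.getenv("GOOGLE_API_KEY", ""))
    extract_depth: str = "advanced"
    max_urls: int = 10


os.environ["TAVILY_API_KEY"] = "demo-key"  # placeholder value for illustration only
config = LazyWebIntelligence()
print(config.tavily_key)
```

The field names and defaults mirror the original dataclass, so this variant could be passed wherever a WebIntelligence instance is expected.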

    class SmartWebAgent:
        """Intelligent Web Content Extraction & Analysis Agent"""
       
        def __init__(self, config: WebIntelligence):
            self.config = config
            self.console = Console()
            self._setup_environment()
            self._initialize_tools()
       
        def _setup_environment(self):
            """Setup API keys with interactive prompts"""
            if not self.config.tavily_key:
                self.config.tavily_key = input("🔑 Enter Tavily API Key: ")
                os.environ["TAVILY_API_KEY"] = self.config.tavily_key
               
            if not self.config.google_key:
                self.config.google_key = input("🔑 Enter Google Gemini API Key: ")
                os.environ["GOOGLE_API_KEY"] = self.config.google_key
       
        def _initialize_tools(self):
            """Initialize AI tools and agents"""
            self.console.print("🛠  Initializing AI Tools...", style="bold blue")
           
            try:
                self.extractor = TavilyExtract(
                    extract_depth=self.config.extract_depth,
                    include_images=False,  
                    include_raw_content=False,
                    max_results=3
                )
               
                self.llm = init_chat_model(
                    "gemini-2.0-flash",
                    model_provider="google_genai",
                    temperature=0.3,
                    max_tokens=1024
                )
               
                test_response = self.llm.invoke("Say 'AI tools initialized successfully!'")
                self.console.print(f"✅ LLM Test: {test_response.content}", style="green")
               
                self.agent = create_react_agent(self.llm, [self.extractor])
               
                self.console.print("✅ AI Agent Ready!", style="bold green")
               
            except Exception as e:
                self.console.print(f"❌ Initialization Error: {e}", style="bold red")
                self.console.print("💡 Check your API keys and internet connection", style="yellow")
                raise
       
        def extract_content(self, urls: List[str]) -> Dict[str, Any]:
            """Extract and structure content from URLs"""
            results = {}
           
            for url in track(urls, description="🌐 Extracting content..."):
                try:
                    response = self.extractor.invoke({"urls": [url]})
                    # The tool may return a plain dict/str or an object exposing .content
                    payload = getattr(response, "content", response)
                    content = json.loads(payload) if isinstance(payload, str) else payload
                    results[url] = {
                        "status": "success",
                        "data": content,
                        "summary": content.get("summary", "No summary available")[:200] + "..."
                    }
                except Exception as e:
                    results[url] = {"status": "error", "error": str(e)}
           
            return results
       
        def analyze_with_ai(self, query: str, urls: List[str] = None) -> str:
            """Intelligent analysis using AI agent"""
            try:
                if urls:
                    message = f"Use the tavily_extract tool to analyze these URLs and answer: {query}\nURLs: {urls}"
                else:
                    message = query
                   
                self.console.print(f"🤖 AI Analysis: {query}", style="bold magenta")
               
                messages = [{"role": "user", "content": message}]
               
                all_content = []
                with self.console.status("🔄 AI thinking..."):
                    try:
                        for step in self.agent.stream({"messages": messages}, stream_mode="values"):
                            if "messages" in step and step["messages"]:
                                for msg in step["messages"]:
                                    if hasattr(msg, 'content') and msg.content and msg.content not in all_content:
                                        all_content.append(str(msg.content))
                    except Exception as stream_error:
                        self.console.print(f"⚠ Stream error: {stream_error}", style="yellow")
               
                if not all_content:
                    self.console.print("🔄 Trying direct AI invocation...", style="yellow")
                    try:
                        response = self.llm.invoke(message)
                        return str(response.content) if hasattr(response, 'content') else str(response)
                    except Exception as direct_error:
                        self.console.print(f"⚠ Direct error: {direct_error}", style="yellow")
                       
                        if urls:
                            self.console.print("🔄 Extracting content first...", style="blue")
                            extracted = self.extract_content(urls)
                            content_summary = "\n".join([
                                f"URL: {url}\nContent: {result.get('summary', 'No content')}\n"
                                for url, result in extracted.items() if result.get('status') == 'success'
                            ])
                           
                            fallback_query = f"Based on this content, {query}:\n\n{content_summary}"
                            response = self.llm.invoke(fallback_query)
                            return str(response.content) if hasattr(response, 'content') else str(response)
               
                return "\n".join(all_content) if all_content else "❌ Unable to generate response. Please check your API keys and try again."
               
            except Exception as e:
                return f"❌ Analysis failed: {str(e)}\n\nTip: Make sure your API keys are valid and you have internet connectivity."
       
        def display_results(self, results: Dict[str, Any]):
            """Beautiful result display"""
            for url, result in results.items():
                if result["status"] == "success":
                    panel = Panel(
                        f"🔗 [bold blue]{url}[/bold blue]\n\n{result['summary']}",
                        title="✅ Extracted Content",
                        border_style="green"
                    )
                else:
                    panel = Panel(
                        f"🔗 [bold red]{url}[/bold red]\n\n❌ Error: {result['error']}",
                        title="❌ Extraction Failed",
                        border_style="red"
                    )
                self.console.print(panel)

    The SmartWebAgent class encapsulates the extraction and analysis logic, using Tavily and Google’s Gemini through LangChain. It prompts for any missing API keys and exports them as environment variables, initializes the extractor, chat model, and ReAct agent, extracts content from the provided URLs, and uses the agent to answer analysis queries, falling back to a direct model call and then to pre-extracted summaries if streaming yields nothing. Results are rendered with Rich panels for readability. Note that input() echoes what is typed, so the keys are visible on screen as they are entered.
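
If echoing keys to the screen is a concern, the standard-library getpass function can replace input(). The following is a minimal sketch under that assumption. resolve_api_key is a hypothetical helper, not part of the original notebook, and the prompt function is injectable only so the helper can be exercised without a terminal.

```python
import os
from getpass import getpass
from typing import Callable


def resolve_api_key(env_var: str, prompt: Callable[[str], str] = getpass) -> str:
    """Return the key from the environment, prompting without echo if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed characters; input() would echo them.
        key = prompt(f"Enter {env_var}: ").strip()
        if not key:
            raise ValueError(f"{env_var} is required")
        # Export the key so libraries that read the environment can find it.
        os.environ[env_var] = key
    return key
```

In the agent, a call such as resolve_api_key("TAVILY_API_KEY") could stand in for the input() prompt inside _setup_environment.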

    def run_async_safely(coro):
        """Run a coroutine to completion from a script or a notebook"""
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No event loop is running (plain script): asyncio.run works directly
            return asyncio.run(coro)
        try:
            # A loop is already running (e.g. Jupyter): patch asyncio to allow nested runs
            import nest_asyncio
            nest_asyncio.apply()
            return asyncio.run(coro)
        except ImportError:
            coro.close()  # avoid a "coroutine was never awaited" warning
            print("⚠  nest_asyncio is not installed; the coroutine was not run.")
            return None

    The run_async_safely function is meant to run a coroutine from both standard scripts and interactive notebooks. When no event loop is running, it calls asyncio.run directly; when a loop is already running, as in Jupyter, it applies nest_asyncio so that asyncio.run can be nested. If nest_asyncio is not installed, it prints a warning and returns None without running the coroutine. The rest of this tutorial's code is synchronous and does not call this helper.

    def main():
        """Interactive Web Intelligence Demo"""
        console = Console()
        console.print(Panel("🚀 Web Intelligence Agent", style="bold cyan", subtitle="Powered by Tavily & Gemini"))
       
        config = WebIntelligence()
        agent = SmartWebAgent(config)
       
        demo_urls = [
            "https://en.wikipedia.org/wiki/Artificial_intelligence",
            "https://en.wikipedia.org/wiki/Machine_learning",
            "https://en.wikipedia.org/wiki/Quantum_computing"
        ]
       
        while True:
            console.print("\n" + "="*60)
            console.print("🎯 Choose an option:", style="bold yellow")
            console.print("1. Extract content from URLs")
            console.print("2. AI-powered analysis")
            console.print("3. Demo with sample URLs")
            console.print("4. Exit")
           
            choice = input("\nEnter choice (1-4): ").strip()
           
            if choice == "1":
                urls_input = input("Enter URLs (comma-separated): ")
                urls = [url.strip() for url in urls_input.split(",") if url.strip()]  # skip empty entries
                results = agent.extract_content(urls)
                agent.display_results(results)
               
            elif choice == "2":
                query = input("Enter your analysis query: ")
                urls_input = input("Enter URLs to analyze (optional, comma-separated): ")
                urls = [url.strip() for url in urls_input.split(",") if url.strip()] if urls_input.strip() else None
               
                try:
                    response = agent.analyze_with_ai(query, urls)
                    console.print(Panel(Markdown(response), title="🤖 AI Analysis", border_style="blue"))
                except Exception as e:
                    console.print(f"❌ Analysis failed: {e}", style="bold red")
               
            elif choice == "3":
                console.print("🎬 Running demo with AI & Quantum Computing URLs...")
                results = agent.extract_content(demo_urls)
                agent.display_results(results)
               
                response = agent.analyze_with_ai(
                    "Compare AI, ML, and Quantum Computing. What are the key relationships?",
                    demo_urls
                )
                console.print(Panel(Markdown(response), title="🧠 Comparative Analysis", border_style="magenta"))
               
            elif choice == "4":
                console.print("👋 Goodbye!", style="bold green")
                break
            else:
                console.print("❌ Invalid choice!", style="bold red")
    
    
    if __name__ == "__main__":
        main()
    

    The main function provides an interactive command-line demo of the agent. Its menu lets users extract content from custom URLs, run an AI-driven analysis of a query (optionally against specific URLs), or run a predefined demo that compares AI, machine learning, and quantum computing using three Wikipedia pages. Rich formatting keeps the output readable.
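
The menu splits the comma-separated input on commas and passes the pieces straight to the agent, and the max_urls setting from the configuration is not enforced anywhere in the code shown. A minimal sketch of a stricter input step, assuming you want to drop blanks, duplicates, and non-HTTP entries and cap the list. parse_urls is a hypothetical helper, not part of the original notebook.

```python
from typing import List
from urllib.parse import urlparse


def parse_urls(raw: str, max_urls: int = 10) -> List[str]:
    """Split comma-separated input into unique http(s) URLs, capped at max_urls."""
    urls: List[str] = []
    for piece in raw.split(","):
        url = piece.strip()
        parsed = urlparse(url)
        # Keep only entries with an http(s) scheme and a host, skipping repeats.
        if parsed.scheme in ("http", "https") and parsed.netloc and url not in urls:
            urls.append(url)
    return urls[:max_urls]


print(parse_urls("https://en.wikipedia.org/wiki/Machine_learning, , not-a-url"))
```

In main, the result of parse_urls(urls_input, config.max_urls) could be passed to extract_content in place of the raw split.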

    In conclusion, we have built a Tavily-based web intelligence agent that extracts web content and analyzes it with Google’s Gemini AI. It combines structured extraction, agent-driven queries with fallbacks, and formatted terminal output to streamline research tasks. From here, the agent can be extended and customized for specific use cases.


    The post A Coding Implementation to Build an Advanced Web Intelligence Agent with Tavily and Gemini AI appeared first on MarkTechPost.