Building an Advanced PaperQA2 Research Agent with Google Gemini for Scientific Literature Analysis

In this tutorial, we build a PaperQA2 research agent powered by Google’s Gemini models for scientific literature analysis. We set up the environment in Google Colab (or a Jupyter notebook), configure the Gemini API, and connect it to PaperQA2 so that it can index and query multiple research papers. By the end, we have an agent that can answer complex questions, run multi-question analyses, and compare findings across papers, with each answer backed by evidence from the source documents.


!pip install "paper-qa>=5" google-generativeai requests pypdf2 -q


import os
import asyncio
import tempfile
import requests
from pathlib import Path
from paperqa import Settings, ask, agent_query
from paperqa.settings import AgentSettings
import google.generativeai as genai


GEMINI_API_KEY = "Use Your Own API Key Here"
os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY


genai.configure(api_key=GEMINI_API_KEY)
print("✅ Gemini API key configured successfully!")

We begin by installing the required libraries, including PaperQA2 and Google’s Generative AI SDK, and then import the modules the project needs. We store the Gemini API key in the GEMINI_API_KEY environment variable and pass it to genai.configure so the SDK is ready for use.
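Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch of an alternative, the helper below reads the key from the environment and falls back to a hidden prompt. The name load_api_key and its parameters are hypothetical, chosen for illustration; they are not part of PaperQA2 or the Gemini SDK.

```python
import os
from getpass import getpass


def load_api_key(env_var="GEMINI_API_KEY", prompt_fn=getpass):
    """Return an API key from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; not part of PaperQA2 or the Gemini SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed value so it does not appear in notebook output
        key = prompt_fn(f"Enter {env_var}: ").strip()
    if not key:
        raise ValueError(f"No API key provided for {env_var}")
    # Export the key so libraries that read the environment can find it
    os.environ[env_var] = key
    return key
```

The returned value could then be passed to genai.configure(api_key=...) in place of the literal string used above.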


def download_sample_papers():
   """Download sample AI/ML research papers for demonstration"""
   papers = {
       "attention_is_all_you_need.pdf": "https://arxiv.org/pdf/1706.03762.pdf",
       "bert_paper.pdf": "https://arxiv.org/pdf/1810.04805.pdf",
       "gpt3_paper.pdf": "https://arxiv.org/pdf/2005.14165.pdf"
   }
  
   papers_dir = Path("sample_papers")
   papers_dir.mkdir(exist_ok=True)
  
   print("📥 Downloading sample research papers...")
   for filename, url in papers.items():
       filepath = papers_dir / filename
       if not filepath.exists():
           try:
               response = requests.get(url, stream=True, timeout=30)
               response.raise_for_status()
               with open(filepath, 'wb') as f:
                   for chunk in response.iter_content(chunk_size=8192):
                       f.write(chunk)
               print(f"✅ Downloaded: {filename}")
           except Exception as e:
               print(f"❌ Failed to download {filename}: {e}")
       else:
           print(f"📄 Already exists: {filename}")
  
   return str(papers_dir)


papers_directory = download_sample_papers()


def create_gemini_settings(paper_dir: str, temperature: float = 0.1):
   """Create optimized settings for PaperQA2 with Gemini models"""
  
   return Settings(
       llm="gemini/gemini-1.5-flash",
       summary_llm="gemini/gemini-1.5-flash",
      
       agent=AgentSettings(
           agent_llm="gemini/gemini-1.5-flash",
           search_count=6, 
           timeout=300.0, 
       ),
      
       embedding="gemini/text-embedding-004",
      
       temperature=temperature,
       paper_directory=paper_dir,
      
       answer=dict(
           evidence_k=8,            
           answer_max_sources=4,      
           evidence_summary_length="about 80 words",
           answer_length="about 150 words, but can be longer",
           max_concurrent_requests=2,
       ),
      
       parsing=dict(
           chunk_size=4000,
           overlap=200,
       ),
      
       verbosity=1,
   )

We download three well-known AI/ML papers from arXiv (Attention Is All You Need, BERT, and GPT-3) into a dedicated folder. We then create PaperQA2 settings that use Gemini for the answer, summary, agent, and embedding roles, and set parameters such as search count, evidence retrieval (evidence_k), and parsing chunk size and overlap.
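To make the chunk_size and overlap parsing parameters concrete, here is a minimal character-based sketch of overlapping chunking. It is an illustration only: chunk_text is a hypothetical helper, and PaperQA2's own parser may split documents differently.

```python
def chunk_text(text, chunk_size=4000, overlap=200):
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share `overlap` characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    Illustrative sketch only; not PaperQA2's actual parsing code.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


# With the tutorial's values, a 10,000-character document yields three chunks
example_lengths = [len(c) for c in chunk_text("a" * 10000)]
```

Smaller chunks give more, shorter pieces of evidence to embed and summarize; larger chunks mean fewer calls but coarser retrieval.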


class PaperQAAgent:
   """Advanced AI Agent for scientific literature analysis using PaperQA2"""
  
   def __init__(self, papers_directory: str, temperature: float = 0.1):
       self.settings = create_gemini_settings(papers_directory, temperature)
       self.papers_dir = papers_directory
       print(f"🤖 PaperQA Agent initialized with papers from: {papers_directory}")
      
   async def ask_question(self, question: str, use_agent: bool = True):
       """Ask a question about the research papers"""
       print(f"\n❓ Question: {question}")
       print("🔍 Searching through research papers...")
      
       try:
           if use_agent:
               response = await agent_query(query=question, settings=self.settings)
           else:
               response = ask(question, settings=self.settings)
              
           return response
          
       except Exception as e:
           print(f"❌ Error processing question: {e}")
           return None
  
   def display_answer(self, response):
       """Display the answer with formatting"""
       if response is None:
           print("❌ No response received")
           return
          
       print("\n" + "="*60)
       print("📋 ANSWER:")
       print("="*60)

       answer_text = getattr(response, 'answer', str(response))
       print(f"\n{answer_text}")

       contexts = getattr(response, 'contexts', getattr(response, 'context', []))
       if contexts:
           print("\n" + "-"*40)
           print("📚 SOURCES USED:")
           print("-"*40)
           for i, context in enumerate(contexts[:3], 1):
               context_name = getattr(context, 'name', getattr(context, 'doc', f'Source {i}'))
               # str() guards against non-string text objects before slicing the preview
               context_text = str(getattr(context, 'text', getattr(context, 'content', str(context))))
               print(f"\n{i}. {context_name}")
               print(f"   Text preview: {context_text[:150]}...")
  
   async def multi_question_analysis(self, questions: list):
       """Analyze multiple questions in sequence"""
       results = {}
       for i, question in enumerate(questions, 1):
           print(f"\n🔄 Processing question {i}/{len(questions)}")
           response = await self.ask_question(question)
           # Key each response by its question so callers can iterate with .items()
           results[question] = response

           if response:
               print(f"✅ Completed: {question[:50]}...")
           else:
               print(f"❌ Failed: {question[:50]}...")
              
       return results
  
   async def comparative_analysis(self, topic: str):
       """Perform comparative analysis across papers"""
       questions = [
           f"What are the key innovations in {topic}?",
           f"What are the limitations of current {topic} approaches?",
           f"What future research directions are suggested for {topic}?",
       ]
      
       print(f"\n🔬 Starting comparative analysis on: {topic}")
       return await self.multi_question_analysis(questions)


async def basic_demo():
   """Demonstrate basic PaperQA functionality"""
   agent = PaperQAAgent(papers_directory)
  
   question = "What is the transformer architecture and why is it important?"
   response = await agent.ask_question(question)
   agent.display_answer(response)


print("🚀 Running basic demonstration...")
await basic_demo()


async def advanced_demo():
   """Demonstrate advanced multi-question analysis"""
   agent = PaperQAAgent(papers_directory, temperature=0.2)
  
   questions = [
       "How do attention mechanisms work in transformers?",
       "What are the computational challenges of large language models?",
       "How has pre-training evolved in natural language processing?"
   ]
  
   print("🧠 Running advanced multi-question analysis...")
   results = await agent.multi_question_analysis(questions)
  
   for question, response in results.items():
       print(f"\n{'='*80}")
       print(f"Q: {question}")
       print('='*80)
       if response:
           answer_text = getattr(response, 'answer', str(response))
           display_text = answer_text[:300] + "..." if len(answer_text) > 300 else answer_text
           print(display_text)
       else:
           print("❌ No answer available")


print("\n🚀 Running advanced demonstration...")
await advanced_demo()


async def research_comparison_demo():
   """Demonstrate comparative research analysis"""
   agent = PaperQAAgent(papers_directory)
  
   results = await agent.comparative_analysis("attention mechanisms in neural networks")
  
   print("\n" + "="*80)
   print("📊 COMPARATIVE ANALYSIS RESULTS")
   print("="*80)
  
   for question, response in results.items():
       print(f"\n🔍 {question}")
       print("-" * 50)
       if response:
           answer_text = getattr(response, 'answer', str(response))
           print(answer_text)
       else:
           print("❌ Analysis unavailable")
       print()


print("🚀 Running comparative research analysis...")
await research_comparison_demo()

We define a PaperQAAgent class that uses our Gemini-configured PaperQA2 settings to search papers, answer questions, and list the sources behind each answer through a display helper. We then run basic, multi-question, and comparative demos to query the literature end-to-end and summarize the findings.
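The multi-question helper processes questions one at a time. As a hedged sketch of a variant, the function below runs questions concurrently while capping the number in flight, which may help stay within free-tier rate limits. The name analyze_questions and the ask_fn parameter are hypothetical; ask_fn stands in for any coroutine that takes a question and returns a response, such as a method like ask_question.

```python
import asyncio


async def analyze_questions(questions, ask_fn, max_concurrent=2):
    """Ask several questions concurrently and return {question: response}.

    At most `max_concurrent` calls run at once. A question whose call
    raises is mapped to None instead of aborting the whole batch.
    Hypothetical helper for illustration only.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(question):
        async with semaphore:  # wait for a free slot before calling ask_fn
            try:
                return question, await ask_fn(question)
            except Exception:
                return question, None

    # gather returns results in input order, so the dict keeps question order
    pairs = await asyncio.gather(*(run_one(q) for q in questions))
    return dict(pairs)
```

In a notebook this could be called as results = await analyze_questions(questions, agent.ask_question); in a plain script, it would be wrapped in asyncio.run(...).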


def create_interactive_agent():
   """Create an interactive agent for custom queries"""
   agent = PaperQAAgent(papers_directory)
  
   async def query(question: str, show_sources: bool = True):
       """Interactive query function"""
       response = await agent.ask_question(question)
      
       if response:
           answer_text = getattr(response, 'answer', str(response))
           print(f"\n🤖 Answer:\n{answer_text}")
          
           if show_sources:
               contexts = getattr(response, 'contexts', getattr(response, 'context', []))
               if contexts:
                   print(f"\n📚 Based on {len(contexts)} sources:")
                   for i, ctx in enumerate(contexts[:3], 1):
                       ctx_name = getattr(ctx, 'name', getattr(ctx, 'doc', f'Source {i}'))
                       print(f"  {i}. {ctx_name}")
       else:
           print("❌ Sorry, I couldn't find an answer to that question.")
          
       return response
  
   return query


interactive_query = create_interactive_agent()


print("\n🎯 Interactive agent ready! You can now ask custom questions:")
print("Example: await interactive_query('How do transformers handle long sequences?')")


def print_usage_tips():
   """Print helpful usage tips"""
   tips = """
   🎯 USAGE TIPS FOR PAPERQA2 WITH GEMINI:

   1. 📝 Question Formulation:
      - Be specific about what you want to know
      - Ask about comparisons, mechanisms, or implications
      - Use domain-specific terminology

   2. 🔧 Model Configuration:
      - Gemini 1.5 Flash is available on the free tier (rate limits apply)
      - Adjust temperature (0.0-1.0) for creativity vs precision
      - Use smaller chunk_size for better processing

   3. 📚 Document Management:
      - Add PDFs to the papers directory
      - Use meaningful filenames
      - Mix different types of papers for better coverage

   4. ⚡ Performance Optimization:
      - Limit concurrent requests for free tier
      - Use smaller evidence_k values for faster responses
      - Cache results by saving the agent state

   5. 🧠 Advanced Usage:
      - Chain multiple questions for deeper analysis
      - Use comparative analysis for research reviews
      - Combine with other tools for complete workflows

   📖 Example Questions to Try:
   - "Compare the attention mechanisms in BERT vs GPT models"
   - "What are the computational bottlenecks in transformer training?"
   - "How has pre-training evolved from word2vec to modern LLMs?"
   - "What are the key innovations that made transformers successful?"
   """
   print(tips)


print_usage_tips()


def save_analysis_results(results: dict, filename: str = "paperqa_analysis.txt"):
   """Save analysis results to a file"""
   with open(filename, 'w', encoding='utf-8') as f:
       f.write("PaperQA2 Analysis Results\n")
       f.write("=" * 50 + "\n\n")

       for question, response in results.items():
           f.write(f"Question: {question}\n")
           f.write("-" * 30 + "\n")
           if response:
               answer_text = getattr(response, 'answer', str(response))
               f.write(f"Answer: {answer_text}\n")

               contexts = getattr(response, 'contexts', getattr(response, 'context', []))
               if contexts:
                   f.write(f"\nSources ({len(contexts)}):\n")
                   for i, ctx in enumerate(contexts, 1):
                       ctx_name = getattr(ctx, 'name', getattr(ctx, 'doc', f'Source {i}'))
                       f.write(f"  {i}. {ctx_name}\n")
           else:
               f.write("Answer: No response available\n")
           f.write("\n" + "="*50 + "\n\n")
  
   print(f"💾 Results saved to: {filename}")


print("✅ Tutorial complete! You now have a fully functional PaperQA2 AI Agent with Gemini.")

We create an interactive query helper that allows us to ask custom questions on demand and optionally view cited sources. We also print practical usage tips and add a saver that writes every Q&A with source names to a results file, wrapping up the tutorial with a ready-to-use workflow.
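A plain-text report is easy to read but awkward to reload. As a minimal sketch, assuming responses expose answer and contexts attributes in the way the tutorial's getattr calls expect, the hypothetical helper results_to_json below serializes a results dictionary to JSON. The SimpleNamespace objects are stand-ins for real responses, used only to make the example runnable.

```python
import json
from types import SimpleNamespace


def results_to_json(results, max_sources=3):
    """Serialize {question: response} into a JSON string.

    Hypothetical helper; it reads `answer` and `contexts` defensively via
    getattr, mirroring the tutorial's display code, and tolerates None.
    """
    records = []
    for question, response in results.items():
        if response is None:
            records.append({"question": question, "answer": None, "sources": []})
            continue
        answer = str(getattr(response, "answer", response))
        contexts = getattr(response, "contexts", None) or []
        sources = [
            str(getattr(ctx, "name", f"Source {i}"))
            for i, ctx in enumerate(contexts[:max_sources], 1)
        ]
        records.append({"question": question, "answer": answer, "sources": sources})
    return json.dumps(records, indent=2, ensure_ascii=False)


# Stand-in responses for demonstration; real responses come from the agent
demo_results = {
    "What is attention?": SimpleNamespace(
        answer="A weighting over inputs.",
        contexts=[SimpleNamespace(name="attention_is_all_you_need.pdf")],
    ),
    "Unanswered question": None,
}
demo_json = results_to_json(demo_results)
```

The resulting string can be written to a file and later reloaded with json.loads for further analysis.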

In conclusion, we successfully created a fully functional AI research assistant that leverages the speed and versatility of Gemini with the robust paper processing capabilities of PaperQA2. We can now interactively explore scientific papers, run targeted queries, and even perform in-depth comparative analyses with minimal effort. This setup enhances our ability to digest complex research and also streamlines the entire literature review process, enabling us to focus on insights rather than manual searching.


The post Building an Advanced PaperQA2 Research Agent with Google Gemini for Scientific Literature Analysis appeared first on MarkTechPost.
