A Coding Guide to Build an Intelligent Conversational AI Agent with Agent Memory Using Cognee and Free Hugging Face Models

In this tutorial, we build an AI agent with persistent memory using Cognee and Hugging Face models, relying entirely on free, open-source tools that run in Google Colab and other notebook environments. We configure Cognee for memory storage and retrieval, integrate a lightweight conversational model for generating responses, and combine them into an agent that learns from text, retrieves what it has stored, and answers in dialogue. Whether it’s processing documents across domains or holding a conversation with retrieved context, we walk through each step to create a capable agent without relying on paid APIs.


!pip install cognee transformers torch sentence-transformers accelerate


import asyncio
import os
import json
from typing import List, Dict, Any
from datetime import datetime
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch


import cognee

We begin by installing the required libraries: Cognee, Transformers, Torch, Sentence-Transformers, and Accelerate. We then import the modules that handle tokenization, model loading, asynchronous tasks, and memory integration. With this setup in place, we have everything needed to build and interact with our agent.
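Before moving on, it can help to confirm that the packages are actually importable in the current runtime. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is our own, and the module names are assumed to correspond to the pip packages listed above (note that `sentence-transformers` is imported as `sentence_transformers`).

```python
import importlib.util
from typing import Iterable, List


def missing_packages(modules: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Module names assumed to match the packages installed with pip above.
required = ["cognee", "transformers", "torch", "sentence_transformers", "accelerate"]
missing = missing_packages(required)
print("All packages found" if not missing else f"Missing: {missing}")
```

If anything is reported missing, rerunning the install cell (and restarting the runtime in Colab) is usually the first thing to try.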


async def setup_cognee():
   """Setup Cognee with proper configuration"""
   try:
       await cognee.config.set("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
       await cognee.config.set("EMBEDDING_PROVIDER", "sentence_transformers")
       print("✅ Cognee configured successfully")
       return True
   except Exception as e:
       print(f"⚠ Cognee config error: {e}")
       try:
           os.environ["EMBEDDING_MODEL"] = "sentence-transformers/all-MiniLM-L6-v2"
           os.environ["EMBEDDING_PROVIDER"] = "sentence_transformers"
           print("✅ Cognee configured via environment")
           return True
       except Exception as e2:
           print(f"⚠ Alternative config failed: {e2}")
           return False

We set up Cognee by configuring the embedding model and provider to use all-MiniLM-L6-v2, a lightweight and efficient sentence-transformer. If the primary method fails, we fall back to setting the same values as environment variables, so that Cognee can still pick up the embedding configuration.
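The try-then-fall-back pattern described here can be isolated from Cognee and tested on its own. The sketch below is illustrative only: `apply_settings` and `broken_setter` are hypothetical names, and the setter is passed in as a plain callable rather than being a real Cognee API. Whether Cognee reads these environment variables depends on the installed version, so treat this as the shape of the pattern rather than a guarantee.

```python
import os
from typing import Callable, Dict


def apply_settings(settings: Dict[str, str], setter: Callable[[str, str], None]) -> str:
    """Try a library-provided setter first; fall back to environment variables.

    Returns "api" or "env" so the caller knows which path was taken.
    """
    try:
        for key, value in settings.items():
            setter(key, value)
        return "api"
    except Exception as exc:
        print(f"Setter failed ({exc}); falling back to environment variables")
        os.environ.update(settings)
        return "env"


def broken_setter(key: str, value: str) -> None:
    # Hypothetical stand-in for a config call that is unavailable in this version.
    raise AttributeError("config.set is not available")


embedding_settings = {
    "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2",
    "EMBEDDING_PROVIDER": "sentence_transformers",
}
path_taken = apply_settings(embedding_settings, broken_setter)
print(path_taken, os.environ["EMBEDDING_MODEL"])
```

Returning which path was taken makes it easy to log or assert on the outcome instead of relying on printed messages alone.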


class HuggingFaceLLM:
   def __init__(self, model_name="microsoft/DialoGPT-medium"):
       print(f"🤖 Loading Hugging Face model: {model_name}")
       self.device = "cuda" if torch.cuda.is_available() else "cpu"
       print(f"📱 Using device: {self.device}")
      
       if "DialoGPT" in model_name:
           self.tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side='left')
           # Move the model to the detected device so a GPU is actually used when available.
           self.model = AutoModelForCausalLM.from_pretrained(model_name).to(self.device)
           if self.tokenizer.pad_token is None:
               self.tokenizer.pad_token = self.tokenizer.eos_token
       else:
           self.generator = pipeline(
               "text-generation",
               model="distilgpt2",
               device=0 if self.device == "cuda" else -1,
               max_length=150,
               do_sample=True,
               temperature=0.7
           )
           self.tokenizer = None
           self.model = None

       print("✅ Model loaded successfully!")

   def generate_response(self, prompt: str, max_length: int = 100) -> str:
       try:
           if self.model is not None:
               # Keep the input tensor on the same device as the model.
               inputs = self.tokenizer.encode(prompt + self.tokenizer.eos_token, return_tensors='pt').to(self.device)

               with torch.no_grad():
                   outputs = self.model.generate(
                       inputs,
                       max_length=inputs.shape[1] + max_length,
                       num_return_sequences=1,
                       temperature=0.7,
                       do_sample=True,
                       pad_token_id=self.tokenizer.eos_token_id
                   )

               # Decode only the newly generated tokens rather than slicing the decoded string by prompt length.
               response = self.tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True).strip()
               return response if response else "I understand."
          
           else:
               result = self.generator(prompt, max_length=max_length, truncation=True)
               return result[0]['generated_text'][len(prompt):].strip()
              
       except Exception as e:
           print(f"⚠ Generation error: {e}")
           return "I'm processing that information."


hf_llm = None

We define the HuggingFaceLLM class to handle text generation using lightweight Hugging Face models, such as DialoGPT or DistilGPT2. We detect whether a GPU is available and load the tokenizer and model accordingly: DialoGPT checkpoints are loaded directly, while any other model name falls back to a DistilGPT2 text-generation pipeline. This class gives our agent a way to turn a prompt, including retrieved context, into a reply.
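When experimenting with the agent logic, downloading a model on every run can be slow. One option, sketched below under our own assumptions, is a tiny offline stand-in that exposes the same `generate_response(prompt, max_length)` method. The class name `StubLLM` and its word-based truncation are hypothetical and are not part of Transformers or Cognee.

```python
class StubLLM:
    """Hypothetical offline stand-in with the same generate_response() method."""

    def __init__(self, canned_reply: str = "I understand."):
        self.canned_reply = canned_reply
        self.prompts = []  # record prompts so they can be inspected afterwards

    def generate_response(self, prompt: str, max_length: int = 100) -> str:
        self.prompts.append(prompt)
        # Approximate a length limit by keeping at most max_length words.
        words = self.canned_reply.split()
        return " ".join(words[:max_length])


stub = StubLLM("Python is a readable, high-level language.")
print(stub.generate_response("Question: What is Python?\nAnswer:", max_length=4))
```

Because the agent only loads a model when the global `hf_llm` is `None`, assigning `hf_llm = StubLLM()` before initializing the agent should let the rest of the pipeline run without a model download.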


class AdvancedAIAgent:
   """
   Advanced AI Agent with persistent memory, learning capabilities,
   and multi-domain knowledge processing using Cognee
   """
  
   def __init__(self, agent_name: str = "CogneeAgent"):
       self.name = agent_name
       self.memory_initialized = False
       self.knowledge_domains = []
       self.conversation_history = []
       self.manual_memory = [] 
      
   async def initialize_memory(self):
       """Initialize the agent's memory system and HF model"""
       global hf_llm
       if hf_llm is None:
           hf_llm = HuggingFaceLLM("microsoft/DialoGPT-medium")
      
       setup_success = await setup_cognee()
      
       try:
           await cognee.prune() 
           print(f"✅ {self.name} memory system initialized")
           self.memory_initialized = True
       except Exception as e:
           print(f"⚠ Memory initialization warning: {e}")
           self.memory_initialized = True
  
   async def learn_from_text(self, text: str, domain: str = "general"):
       """Add knowledge to the agent's memory with domain tagging"""
       if not self.memory_initialized:
           await self.initialize_memory()
      
       enhanced_text = f"[DOMAIN: {domain}] [TIMESTAMP: {datetime.now().isoformat()}]\n{text}"
      
       try:
           await cognee.add(enhanced_text)
           await cognee.cognify() 
           if domain not in self.knowledge_domains:
               self.knowledge_domains.append(domain)
           print(f"📚 Learned new knowledge in domain: {domain}")
           return True
       except Exception as e:
           print(f"❌ Learning error: {e}")
           try:
               await cognee.add(text)
               await cognee.cognify()
               if domain not in self.knowledge_domains:
                   self.knowledge_domains.append(domain)
               print(f"📚 Learned (simplified): {domain}")
               return True
           except Exception as e2:
               print(f"❌ Simplified learning failed: {e2}")
               if not hasattr(self, 'manual_memory'):
                   self.manual_memory = []
               self.manual_memory.append({"text": text, "domain": domain})
               if domain not in self.knowledge_domains:
                   self.knowledge_domains.append(domain)
               print(f"📚 Stored in manual memory: {domain}")
               return True
  
   async def learn_from_documents(self, documents: List[Dict[str, str]]):
       """Batch learning from multiple documents"""
       print(f"📖 Processing {len(documents)} documents...")
      
       for i, doc in enumerate(documents):
           text = doc.get("content", "")
           domain = doc.get("domain", "general")
           title = doc.get("title", f"Document_{i+1}")
          
           enhanced_content = f"Title: {title}\n{text}"
           await self.learn_from_text(enhanced_content, domain)
          
           if i % 3 == 0:
               print(f"  Processed {i+1}/{len(documents)} documents")
  
   async def query_knowledge(self, question: str, domain_filter: str = None) -> List[str]:
       """Query the agent's knowledge base with optional domain filtering"""
       try:
           if domain_filter:
               enhanced_query = f"[DOMAIN: {domain_filter}] {question}"
           else:
               enhanced_query = question
              
           search_results = await cognee.search("SIMILARITY", enhanced_query)
          
           results = []
           for result in search_results:
               if hasattr(result, 'text'):
                   results.append(result.text)
               elif hasattr(result, 'content'):
                   results.append(result.content)
               elif hasattr(result, 'value'):
                   results.append(str(result.value))
               elif isinstance(result, dict):
                   content = result.get('text') or result.get('content') or result.get('data') or result.get('value')
                   if content:
                       results.append(str(content))
                   else:
                       results.append(str(result))
               elif isinstance(result, str):
                   results.append(result)
               else:
                   result_str = str(result)
                   if len(result_str) > 10: 
                       results.append(result_str)
          
           if not results and hasattr(self, 'manual_memory'):
               for item in self.manual_memory:
                   if domain_filter and item['domain'] != domain_filter:
                       continue
                   if any(word.lower() in item['text'].lower() for word in question.split()):
                       results.append(item['text'])
          
           return results[:5] 
          
       except Exception as e:
           print(f"🔍 Search error: {e}")
           results = []
           if hasattr(self, 'manual_memory'):
               for item in self.manual_memory:
                   if domain_filter and item['domain'] != domain_filter:
                       continue
                   if any(word.lower() in item['text'].lower() for word in question.split()):
                       results.append(item['text'])
           return results[:5]
  
   async def reasoning_chain(self, question: str) -> Dict[str, Any]:
       """Advanced reasoning using retrieved knowledge"""
       print(f"🤔 Processing question: {question}")
      
       relevant_info = await self.query_knowledge(question)
      
       analysis = {
           "question": question,
           "relevant_knowledge": relevant_info,
           "domains_searched": self.knowledge_domains,
           "confidence": min(len(relevant_info) / 3.0, 1.0), 
           "timestamp": datetime.now().isoformat()
       }
      
       if relevant_info and len(relevant_info) > 0:
           reasoning = self._synthesize_answer(question, relevant_info)
           analysis["reasoning"] = reasoning
           analysis["answer"] = self._extract_key_points(relevant_info)
       else:
           analysis["reasoning"] = "No relevant knowledge found in memory"
           analysis["answer"] = "I don't have information about this topic in my current knowledge base."
      
       return analysis




   def _synthesize_answer(self, question: str, knowledge_pieces: List[str]) -> str:
       """AI-powered answer synthesis using Hugging Face model"""
       global hf_llm
      
       if not knowledge_pieces:
           return "No relevant information found in my knowledge base."
      
       context = " ".join(knowledge_pieces[:2]) 
       context = context[:300] 
      
       prompt = f"Based on this information: {context}\n\nQuestion: {question}\nAnswer:"
      
       try:
           if hf_llm:
               synthesized = hf_llm.generate_response(prompt, max_length=80)
               return synthesized if synthesized else f"Based on my knowledge: {context[:100]}..."
           else:
               return f"From my analysis: {context[:150]}..."
       except Exception as e:
           print(f"⚠ Synthesis error: {e}")
           return f"Based on my knowledge: {context[:100]}..."
  
   def _extract_key_points(self, knowledge_pieces: List[str]) -> List[str]:
       """Extract key points from retrieved knowledge"""
       key_points = []
       for piece in knowledge_pieces:
           clean_piece = piece.replace("[DOMAIN:", "").replace("[TIMESTAMP:", "")
           sentences = clean_piece.split('.')
           if len(sentences) > 0 and len(sentences[0].strip()) > 10:
               key_points.append(sentences[0].strip() + ".")
      
       return key_points[:3] 


   async def conversational_agent(self, user_input: str) -> str:
       """Main conversational interface with HF model integration"""
       global hf_llm
       self.conversation_history.append({"role": "user", "content": user_input})
      
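       # Treat the input as a teaching command only when it starts with a trigger word,
       # so questions such as "What is machine learning?" are not misrouted to this branch.
       if user_input.lower().startswith(("learn", "remember", "add", "teach")):
           # Strip the command prefix case-insensitively before storing the fact.
           content_to_learn = user_input.strip()
           for prefix in ("learn this:", "remember:"):
               if content_to_learn.lower().startswith(prefix):
                   content_to_learn = content_to_learn[len(prefix):].strip()
                   break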
           await self.learn_from_text(content_to_learn, "conversation")
           response = "I've stored that information in my memory! What else would you like to teach me?"
          
       elif user_input.lower().startswith(("what", "how", "why", "when", "where", "who", "tell me")):
           analysis = await self.reasoning_chain(user_input)
          
           if analysis["relevant_knowledge"] and hf_llm:
               context = " ".join(analysis["relevant_knowledge"][:2])[:200]
               prompt = f"Question: {user_input}\nKnowledge: {context}\nFriendly response:"
               ai_response = hf_llm.generate_response(prompt, max_length=60)
               response = ai_response if ai_response else "Here's what I found in my knowledge base."
           else:
               response = "I don't have specific information about that topic in my current knowledge base."
              
       else:
           relevant_context = await self.query_knowledge(user_input)
          
           if hf_llm:
               context_info = ""
               if relevant_context:
                   context_info = f" I know that: {relevant_context[0][:100]}..."
              
               conversation_prompt = f"User says: {user_input}{context_info}\nI respond:"
               response = hf_llm.generate_response(conversation_prompt, max_length=50)
              
               if not response or len(response.strip()) < 3:
                   response = "That's interesting! I'm learning from our conversation."
           else:
               response = "I'm listening and learning from our conversation."
      
       self.conversation_history.append({"role": "assistant", "content": response})
       return response

We now define the core of our system, the AdvancedAIAgent class, which brings together Cognee’s memory, domain-tagged learning, knowledge retrieval, and Hugging Face-powered answer synthesis. The agent can learn from individual texts and from batches of documents, retrieve relevant knowledge, and respond to queries with synthesized answers. If the Cognee calls fail, it falls back to a simple in-process list (manual_memory) searched by keyword matching. Whether it’s remembering facts, answering questions, or engaging in open conversation, the agent routes each input to the matching branch and records the exchange in its conversation history.
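The conversational interface dispatches each message to one of three branches: teaching, question answering, or open chat. A minimal standalone sketch of that dispatch follows, assuming prefix matching on trigger phrases. The function name `route_intent` and the exact prefix lists are hypothetical choices for illustration, not part of the agent class.

```python
from typing import Tuple

# Illustrative trigger phrases; adjust to the commands you want to support.
TEACH_PREFIXES = ("learn this:", "remember:", "remember that", "teach:")
QUESTION_PREFIXES = ("what", "how", "why", "when", "where", "who", "tell me")


def route_intent(user_input: str) -> Tuple[str, str]:
    """Classify a message as 'teach', 'question', or 'chat'.

    Returns the intent plus a payload: the fact to store for 'teach',
    otherwise the stripped input.
    """
    text = user_input.strip()
    lowered = text.lower()
    for prefix in TEACH_PREFIXES:
        if lowered.startswith(prefix):
            return "teach", text[len(prefix):].strip()
    if lowered.startswith(QUESTION_PREFIXES):
        return "question", text
    return "chat", text


for message in ["Learn this: Machine learning is a subset of AI",
                "What is machine learning?",
                "Nice weather today"]:
    print(route_intent(message))
```

Matching on prefixes rather than on substrings anywhere in the message keeps a question that merely contains a word like "learning" from being stored as a new fact.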


async def main():
   print("🚀 Advanced AI Agent with Cognee Tutorial")
   print("=" * 50)
  
   agent = AdvancedAIAgent("TutorialAgent")
   await agent.initialize_memory()
  
   print("\n📚 DEMO 1: Multi-domain Learning")
   sample_documents = [
       {
           "title": "Python Basics",
           "content": "Python is a high-level programming language known for its simplicity and readability.",
           "domain": "programming"
       },
       {
           "title": "Climate Science",
           "content": "Climate change",
           "domain": "science"
       },
       {
           "title": "AI Ethics",
           "content": "AI ethics involves ensuring artificial intelligence systems are developed and deployed responsibly, considering fairness, transparency, accountability, and potential societal impacts.",
           "domain": "technology"
       },
       {
           "title": "Sustainable Energy",
           "content": "Renewable energy sources are crucial for reducing carbon emissions",
           "domain": "environment"
       }
   ]
  
   await agent.learn_from_documents(sample_documents)
  
   print("\n🔍 DEMO 2: Knowledge Retrieval & Reasoning")
   test_questions = [
       "What do you know about Python programming?",
       "How does climate change relate to energy?",
       "What are the ethical considerations in AI?"
   ]
  
   for question in test_questions:
       print(f"\n❓ Question: {question}")
       analysis = await agent.reasoning_chain(question)
       print(f"💡 Answer: {analysis.get('answer', 'No answer generated')}")
       print(f"🎯 Confidence: {analysis.get('confidence', 0):.2f}")
  
   print("\n💬 DEMO 3: Conversational Agent")
   conversation_inputs = [
       "Learn this: Machine learning is a subset of AI",
       "What is machine learning?",
       "How does it relate to Python?",
       "Remember that neural networks are inspired by biological neurons"
   ]
  
   for user_input in conversation_inputs:
       print(f"\n🗣 User: {user_input}")
       response = await agent.conversational_agent(user_input)
       print(f"🤖 Agent: {response}")
  
   print("\n📊 DEMO 4: Agent Knowledge Summary")
   print(f"Knowledge domains: {agent.knowledge_domains}")
   print(f"Conversation history: {len(agent.conversation_history)} messages")
  
   print("\n🎯 Domain-specific search:")
   programming_results = await agent.query_knowledge("programming concepts", "programming")
   print(f"Programming knowledge: {len(programming_results)} results found")


if __name__ == "__main__":
   print("Starting Advanced AI Agent Tutorial with Hugging Face Models...")
   print("🤗 Using free models from Hugging Face Hub")
   print("📱 GPU acceleration available!" if torch.cuda.is_available() else "💻 Running on CPU")
  
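   try:
       # A running event loop means we are inside a notebook (Colab, Jupyter).
       asyncio.get_running_loop()
   except RuntimeError:
       # No loop is running (plain script), so asyncio.run can start one.
       asyncio.run(main())
   else:
       # Allow asyncio.run to be called from within the notebook's running loop.
       import nest_asyncio
       nest_asyncio.apply()
       asyncio.run(main())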
  
   print("\n✅ Tutorial completed! You've learned:")
   print("• How to set up Cognee with Hugging Face models")
   print("• AI-powered response generation")
   print("• Multi-domain knowledge management")
   print("• Advanced reasoning and retrieval")
   print("• Conversational agent with memory")
   print("• Free GPU-accelerated inference")

We conclude the tutorial by running a comprehensive demonstration of our AI agent in action. We first teach it from multi-domain documents, then test its ability to retrieve knowledge and reason intelligently. Next, we engage it in a natural conversation, watching it learn and recall information taught by users. Finally, we view a summary of its memory, showcasing how it organizes and filters knowledge by domain, all with real-time inference using free Hugging Face models.
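The keyword fallback and the confidence figure printed in the demo are simple enough to reproduce in isolation. The sketch below mirrors the agent's logic as standalone functions; the names `keyword_search` and `confidence`, and the two sample memory entries, are our own illustrative choices.

```python
from typing import Dict, List, Optional


def keyword_search(memory: List[Dict[str, str]], question: str,
                   domain_filter: Optional[str] = None, limit: int = 5) -> List[str]:
    """Return stored texts that contain at least one word from the question."""
    words = [w.lower() for w in question.split()]
    hits = []
    for item in memory:
        # Skip entries from other domains when a filter is given.
        if domain_filter and item["domain"] != domain_filter:
            continue
        if any(w in item["text"].lower() for w in words):
            hits.append(item["text"])
    return hits[:limit]


def confidence(num_results: int) -> float:
    """Heuristic from reasoning_chain: three or more results count as full confidence."""
    return min(num_results / 3.0, 1.0)


memory = [
    {"text": "Python is a high-level programming language.", "domain": "programming"},
    {"text": "Renewable energy sources reduce carbon emissions.", "domain": "environment"},
]
hits = keyword_search(memory, "python language", domain_filter="programming")
print(hits, f"{confidence(len(hits)):.2f}")
```

Note that this is substring matching, so short question words such as "is" or "a" will match almost any stored text. The confidence value therefore reflects how many items were returned, not how relevant they are.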

In conclusion, we’ve built a working AI agent that can learn from structured documents, recall and reason over stored knowledge, and converse using Hugging Face models. Along the way, we configured Cognee as the memory layer, demonstrated domain-specific queries, and simulated a short conversation with the agent.


The post A Coding Guide to Build an Intelligent Conversational AI Agent with Agent Memory Using Cognee and Free Hugging Face Models appeared first on MarkTechPost.
