    Building a Retrieval-Augmented Generation (RAG) System with DeepSeek R1: A Step-by-Step Guide

    January 27, 2025

    The release of DeepSeek R1 has created a buzz in the AI community. The open-source model delivers strong results across many benchmarks and, in many cases, performs on par with state-of-the-art proprietary models. That success has drawn wide attention and curiosity about the model. In this article, we implement a Retrieval-Augmented Generation (RAG) system using DeepSeek R1, covering everything from setting up your environment to running queries, with explanations and code snippets along the way.

    RAG, by now a widely used technique, combines the strengths of retrieval-based and generation-based approaches. It first retrieves relevant information from a knowledge base, then passes that information to a language model so that the generated response to a user query is accurate and grounded in context.
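
    The retrieve-then-generate flow can be sketched without any model at all. The toy example below is only an illustration, not the method used later in this tutorial: it ranks documents by shared words instead of embeddings, and the helper names (`score`, `retrieve`, `build_prompt`) and sample sentences are made up for this sketch. It stops at building the prompt that a language model would receive.

```python
def score(query, document):
    """Count how many distinct query words also appear in the document."""
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    return len(query_words & document_words)


def retrieve(query, knowledge_base, k=1):
    """Return the k documents that share the most words with the query."""
    ranked = sorted(knowledge_base, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Place the retrieved text in front of the question."""
    context = "\n".join(context_docs)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"


# Hypothetical two-document knowledge base, for illustration only
knowledge_base = [
    "Ollama runs language models on a local machine.",
    "FAISS is a library for vector similarity search.",
]

query = "What is FAISS used for?"
top_docs = retrieve(query, knowledge_base)
prompt = build_prompt(query, top_docs)
print(prompt)
```

    The rest of the tutorial follows the same shape, replacing word overlap with embedding similarity and handing the prompt to DeepSeek R1.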

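    The prerequisites for running the code in this tutorial are as follows:

    • Python installed (version 3.9 or higher is a safe choice; Python 3.7 has reached end of life and recent releases of the libraries used below no longer support it).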
    • Ollama installed: This framework allows running models like DeepSeek R1 locally.

    Now, let’s walk through the implementation step by step:

    Step 1: Install Ollama

    First, install Ollama by following the instructions on the Ollama website. Once installed, verify the installation by running:

    # bash
    ollama --version

    Step 2: Run DeepSeek R1 Model

    To start the DeepSeek R1 model, open your terminal and execute:

    # bash
    ollama run deepseek-r1:1.5b

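    This command downloads the 1.5-billion-parameter version of DeepSeek R1 if it is not already present, then starts an interactive session with it. It is the smallest R1 variant, so it runs on modest hardware; larger variants trade more memory and compute for better answers.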
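
    Before moving on, it can help to confirm from Python that the model is available to the local Ollama server. The sketch below assumes Ollama is listening on its default address (`http://localhost:11434`) and uses its `/api/tags` endpoint, which lists locally installed models. The helper names are made up for this example, and the sample payload is hand-written to show the parsing step rather than captured from a real server.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address (assumed unchanged)


def model_names(tags_payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def is_model_available(model, base_url=OLLAMA_URL):
    """Ask the local Ollama server whether the given model has been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        payload = json.load(response)
    return model in model_names(payload)


# Offline illustration with a hand-written payload (not real server output)
sample_payload = {"models": [{"name": "deepseek-r1:1.5b"}]}
print(model_names(sample_payload))
```

    With the server running, `is_model_available("deepseek-r1:1.5b")` should return `True` once the model from this step has been downloaded.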

    Step 3: Prepare Your Knowledge Base

    A retrieval system requires a knowledge base from which it can pull information. This can be a collection of documents, articles, or any text data relevant to your domain.

    3.1 Load Your Documents

    You can load documents from various sources, such as text files, databases, or web scraping. Here’s an example of loading text files:

    # python
    import os
    
    def load_documents(directory):
        documents = []
        for filename in os.listdir(directory):
            if filename.endswith('.txt'):
                # Read as UTF-8 explicitly so the result does not depend on the platform default
                with open(os.path.join(directory, filename), 'r', encoding='utf-8') as file:
                    documents.append(file.read())
        return documents
    
    documents = load_documents('path/to/your/documents')
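
    When an answer needs to be traced back to its source, it is useful to keep each file name next to its text. The variant below is a sketch rather than part of the original pipeline: `load_named_documents` is a hypothetical helper that returns `(filename, text)` pairs in a stable order and skips empty files. The demo writes a few throwaway files to a temporary directory so it can run anywhere.

```python
import tempfile
from pathlib import Path


def load_named_documents(directory):
    """Return (filename, text) pairs for every non-empty .txt file, sorted by name."""
    named = []
    for path in sorted(Path(directory).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip files that contain only whitespace
            named.append((path.name, text))
    return named


# Demo on throwaway files so the example is self-contained
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "b.txt").write_text("Second note.", encoding="utf-8")
    Path(tmp, "a.txt").write_text("First note.\n", encoding="utf-8")
    Path(tmp, "empty.txt").write_text("", encoding="utf-8")
    Path(tmp, "skip.md").write_text("Not a text file.", encoding="utf-8")
    named_documents = load_named_documents(tmp)

print(named_documents)
```

    The plain list of texts expected by the later steps can be recovered with `[text for _, text in named_documents]`.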

    Step 4: Create a Vector Store for Retrieval

    To enable efficient retrieval of relevant documents, you can use a vector store like FAISS (Facebook AI Similarity Search). This involves generating embeddings for your documents.
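
    To make the idea concrete, here is what a flat L2 index does conceptually, written in plain Python. FAISS's `IndexFlatL2` performs the same exhaustive comparison, far faster, and reports squared Euclidean distances. The function names and the tiny two-dimensional vectors below are invented for illustration; real embeddings have hundreds of dimensions.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(vectors, query, k):
    """Compare the query with every stored vector and return the k closest.

    Returns (distances, indices), mirroring the order of results from a
    vector index search.
    """
    scored = sorted((squared_l2(vector, query), i) for i, vector in enumerate(vectors))
    top = scored[:k]
    return [distance for distance, _ in top], [i for _, i in top]


# Toy "document embeddings" (illustrative values only)
vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]

distances, indices = flat_l2_search(vectors, [2.0, 2.0], k=2)
print(distances, indices)
```

    A smaller distance means a closer match, so the first index returned points at the document most similar to the query.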

    4.1 Install Required Libraries

    Install the libraries needed for embeddings and FAISS. The `HuggingFaceEmbeddings` class used below ships with the `langchain-huggingface` package, which relies on `sentence-transformers` to load the embedding model:

    # bash
    pip install faiss-cpu numpy langchain-huggingface sentence-transformers

    4.2 Generate Embeddings and Set Up FAISS

    Here’s how to generate embeddings and set up the FAISS vector store:

    # python
    from langchain_huggingface import HuggingFaceEmbeddings
    import faiss
    import numpy as np
    
    # Initialize the embeddings model (loads a default sentence-transformers model)
    embeddings_model = HuggingFaceEmbeddings()
    
    # Generate one embedding per document
    document_embeddings = embeddings_model.embed_documents(documents)
    document_embeddings = np.array(document_embeddings).astype('float32')
    
    # Create FAISS index
    index = faiss.IndexFlatL2(document_embeddings.shape[1])  # L2 distance metric
    index.add(document_embeddings)  # Add document embeddings to the index

    Step 5: Set Up the Retriever

    Next, create a retriever that embeds a user query and fetches the most relevant documents from the index.

    # python
    class SimpleRetriever:
        def __init__(self, index, embeddings_model, documents):
            self.index = index
            self.embeddings_model = embeddings_model
            self.documents = documents
        
        def retrieve(self, query, k=3):
            # Embed the query with the same model used for the documents
            query_embedding = self.embeddings_model.embed_query(query)
            query_vector = np.array([query_embedding]).astype('float32')
            distances, indices = self.index.search(query_vector, k)
            # FAISS pads results with -1 when the index holds fewer than k vectors
            return [self.documents[i] for i in indices[0] if i != -1]
    
    retriever = SimpleRetriever(index, embeddings_model, documents)

    Step 6: Configure DeepSeek R1 for RAG

    Next, set up a prompt template that instructs DeepSeek R1 to answer only from the retrieved context. The code below talks to the local model through the `ollama` Python package (`pip install ollama`).

    # python
    import ollama
    from string import Template
    
    # Name of the model started in Step 2
    MODEL_NAME = "deepseek-r1:1.5b"
    
    # Craft the prompt template using string.Template for better readability
    prompt_template = Template("""
    Use ONLY the context below.
    If unsure, say "I don't know".
    Keep answers under 4 sentences.
    
    Context: $context
    Question: $question
    Answer:
    """)

    Step 7: Implement Query Handling Functionality

    Now, you can create a function that combines retrieval and generation to answer user queries:

    # python
    def answer_query(question):
        # Retrieve relevant context from the knowledge base
        context = retriever.retrieve(question)
        
        # Combine retrieved contexts into a single string (if multiple)
        combined_context = "\n".join(context)
        
        # Fill the template, then send the prompt to DeepSeek R1 via the local Ollama server
        prompt = prompt_template.substitute(context=combined_context, question=question)
        response = ollama.generate(model=MODEL_NAME, prompt=prompt)
        
        # The generated text is stored under the "response" key
        return response["response"].strip()
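
    To see exactly what the model receives, the prompt can be built and printed without calling the model. The sketch below reuses the template from Step 6 with two made-up context passages. It also shows that `Template.substitute` raises a `KeyError` when a placeholder is left unfilled, which catches a forgotten argument early.

```python
from string import Template

prompt_template = Template("""
Use ONLY the context below.
If unsure, say "I don't know".
Keep answers under 4 sentences.

Context: $context
Question: $question
Answer:
""")

# Hypothetical retrieved passages, for illustration only
retrieved = ["DeepSeek R1 is an open-source model.", "Ollama runs models locally."]

prompt = prompt_template.substitute(
    context="\n".join(retrieved),
    question="What is DeepSeek R1?",
)
print(prompt)

# substitute() refuses to leave a placeholder unfilled
try:
    prompt_template.substitute(context="only context")
    missing_name = None
except KeyError as missing:
    missing_name = missing.args[0]
print("Missing placeholder:", missing_name)
```

    Joining the passages with a newline keeps each retrieved document on its own line inside the `Context:` section.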

    Step 8: Running Your RAG System

    You can now test your RAG system by calling the `answer_query` function with any question about your knowledge base.

    # python
    if __name__ == "__main__":
        user_question = "What are the key features of DeepSeek R1?"
        answer = answer_query(user_question)
        print("Answer:", answer)
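
    One practical detail: DeepSeek R1 is a reasoning model, and its output typically begins with the model's reasoning wrapped in `<think>...</think>` tags before the final answer. If only the answer should be shown, the reasoning can be removed afterwards. The helper below is a hypothetical post-processing step, not part of the original code, and the sample string is hand-written rather than real model output.

```python
import re

# Matches a reasoning block, including any newlines inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning(text):
    """Remove <think>...</think> blocks and surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


# Hand-written sample in the shape of a reasoning-model reply
raw = "<think>\nThe context mentions FAISS.\n</think>\n\nFAISS is a similarity search library."
print(strip_reasoning(raw))
```

    Applying `strip_reasoning` to the string returned by `answer_query` leaves text without such tags unchanged.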

    Access the Colab Notebook with the Complete code

    In conclusion, by following these steps you can implement a Retrieval-Augmented Generation (RAG) system with DeepSeek R1. The setup retrieves the documents most relevant to a question and has the model generate an answer grounded in them. From here, you can adapt the knowledge base, embedding model, and prompt to explore DeepSeek R1 for your own use case.

    Sources

    • https://arxiv.org/html/2501.12948v1 
    • https://ollama.com/
    • https://github.com/facebookresearch/faiss 
    • https://arxiv.org/pdf/2005.11401 
    • https://huggingface.co/blog/getting-started-with-embeddings

    The post Building a Retrieval-Augmented Generation (RAG) System with DeepSeek R1: A Step-by-Step Guide appeared first on MarkTechPost.
