
    EPFL Researchers Introduce MEMOIR: A Scalable Framework for Lifelong Model Editing in LLMs

    June 17, 2025

    The Challenge of Updating LLM Knowledge

LLMs achieve strong performance across diverse tasks through extensive pre-training on vast datasets. However, they frequently generate outdated or inaccurate information and can reflect biases during deployment, so their knowledge needs to be updated continuously. Traditional fine-tuning is expensive and susceptible to catastrophic forgetting, which has motivated lifelong model editing: updating model knowledge efficiently and locally. A successful edit must be reliable (the corrected fact is produced consistently), generalizable (it carries over to rephrased queries), and localized (unrelated behavior is left untouched). Non-parametric methods achieve precise, localized edits but generalize poorly, while parametric methods generalize better but suffer from catastrophic forgetting.

    Limitations of Prior Model Editing Techniques

Earlier work has explored sparse neural activations in continual learning, with methods like PackNet and Supermasks-in-Superposition allocating disjoint parameter subsets per task. Gradient-based approaches such as GPM and SPARCL improve efficiency through orthogonal updates but are limited to continual-learning settings. Parametric approaches such as ROME, MEMIT, and WISE modify weights through locate-then-edit strategies or auxiliary modules, but forget earlier edits over long edit sequences. Non-parametric methods like GRACE and LOKA store knowledge externally to preserve the original weights, enabling precise local edits; however, they rely on exact input matches, which limits their ability to generalize.

    Introducing MEMOIR: A Structured Approach to Model Editing

Researchers from EPFL, Lausanne, Switzerland, have proposed MEMOIR (Model Editing with Minimal Overwrite and Informed Retention), which strikes a strong balance between reliability, generalization, and locality for large-scale edits. It introduces a memory module, a fully-connected layer within a single transformer block, where all edits occur. MEMOIR mitigates catastrophic forgetting by allocating distinct parameter subsets to each edit and retrieving them during inference so that only the knowledge relevant to a given prompt is activated. During editing, the method applies structured sparsification with sample-dependent masks, activating only a prompt-specific subset of parameters. This distributes new knowledge across the parameter space, reducing overwriting and minimizing catastrophic forgetting.
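As an illustration only, the editing step might look roughly like the PyTorch sketch below. The module name, layer sizes, top-k mask rule, and optimization loop are assumptions made for clarity, not the authors' implementation; the point is simply that each edit writes only into the weight columns selected by a sample-dependent sparse mask.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; a real transformer layer would be far larger.
d_model, k = 64, 8                       # k = number of active features per edit

# Memory module: one fully-connected layer that starts empty; edits write into it.
memory = nn.Linear(d_model, d_model, bias=False)
nn.init.zeros_(memory.weight)

def sample_mask(h: torch.Tensor, k: int) -> torch.Tensor:
    """Sample-dependent sparse mask: keep only the k strongest features of the prompt."""
    mask = torch.zeros_like(h)
    mask[h.abs().topk(k).indices] = 1.0
    return mask

def apply_edit(h: torch.Tensor, target_residual: torch.Tensor,
               lr: float = 0.1, steps: int = 50) -> torch.Tensor:
    """Fit the memory so memory(mask * h) approximates the desired residual,
    touching only the weight columns selected by the mask."""
    mask = sample_mask(h, k)
    opt = torch.optim.SGD(memory.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = (memory(h * mask) - target_residual).pow(2).mean()
        loss.backward()
        memory.weight.grad.mul_(mask)    # explicit safeguard: zero-input columns already receive zero gradient
        opt.step()
    return mask                          # stored alongside the edit for retrieval at inference

# Example edit: a prompt representation and the residual encoding the new fact.
h_prompt = torch.randn(d_model)
target   = torch.randn(d_model)
edit_mask = apply_edit(h_prompt, target)
```

Because different prompts activate largely different feature subsets, successive edits land on largely disjoint columns of the memory weight, which is how the overwrite between edits stays small in this toy picture.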

    Evaluation and Experimental Results

MEMOIR operates through a residual memory framework during inference, where the edited output integrates the original layer output with the residual memory output. It is evaluated against baselines such as GRACE for external knowledge storage, DEFER for inference-time routing, causal tracing methods like ROME, MEMIT, and ALPHAEDIT, and memory-based methods like WISE. Direct fine-tuning serves as an additional baseline. Experiments are conducted on four autoregressive language models: LLaMA-3-8B-Instruct, Mistral-7B, LLaMA-2-7B, and GPT-J-6B, providing a comprehensive evaluation across models and scales to show the effectiveness and generalizability of MEMOIR.
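Continuing the hypothetical sketch above (again an assumption, not the paper's code), inference could combine the original layer output with the memory's residual output, gating the memory by comparing the prompt's sparse activation pattern against the masks stored for earlier edits:

```python
from typing import List, Optional

def retrieve_mask(h: torch.Tensor, stored_masks: List[torch.Tensor],
                  threshold: float = 0.5) -> Optional[torch.Tensor]:
    """Compare the prompt's sparse activation pattern with stored edit masks;
    return the best-matching mask if the overlap is large enough, else None."""
    query = sample_mask(h, k)
    best_mask, best_overlap = None, 0.0
    for m in stored_masks:
        overlap = (query * m).sum().item() / k   # fraction of shared active features
        if overlap > best_overlap:
            best_mask, best_overlap = m, overlap
    return best_mask if best_overlap >= threshold else None

def edited_forward(h: torch.Tensor, original_layer: nn.Module,
                   stored_masks: List[torch.Tensor]) -> torch.Tensor:
    """Residual integration: original output plus memory output for matched edits;
    prompts that match no stored edit fall back to the unmodified layer."""
    out = original_layer(h)
    mask = retrieve_mask(h, stored_masks)
    if mask is not None:
        out = out + memory(h * mask)
    return out

# Usage with the toy edit from the previous sketch.
original_layer = nn.Linear(d_model, d_model)     # stands in for the pre-trained layer
y = edited_forward(h_prompt, original_layer, [edit_mask])
```

The overlap threshold is an invented stand-in for however the method decides that a prompt is related to a stored edit; the design intent it illustrates is that rephrased queries with similar activation patterns reuse the edit, while unrelated prompts leave the original output untouched.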

On the ZsRE question-answering dataset, MEMOIR achieves an average score of 0.95 on LLaMA-3 with 1,000 edits, outperforming all prior methods by a margin of 0.16. Similar results hold on Mistral, where the method again achieves the highest average score, underscoring its robustness across different LLMs. MEMOIR also maintains balanced performance as the number of edits grows in hallucination correction on the SelfCheckGPT dataset: under the most challenging setting of 600 edits, it sustains saturated locality scores while achieving perplexity 57% and 77% lower than WISE, the second-best method, on LLaMA-3 and Mistral, respectively.

    Conclusion and Future Directions

In conclusion, MEMOIR is a scalable framework for lifelong model editing that balances reliability, generalization, and locality through structured sparsification. The method retrieves relevant updates by comparing sparse activation patterns, allowing edits to generalize to rephrased queries while preserving model behavior on unrelated prompts. Some limitations remain: only a single linear layer is modified, which may restrict handling of long-horizon edits or knowledge that requires broader changes to the model. Future directions include extending the approach to multiple layers, hierarchical editing strategies, and applying it to multi-modal or encoder-decoder models beyond the current decoder-only transformer focus.


Check out the Paper. All credit for this research goes to the researchers of this project.

    The post EPFL Researchers Introduce MEMOIR: A Scalable Framework for Lifelong Model Editing in LLMs appeared first on MarkTechPost.

