
    EPFL Researchers Introduce MEMOIR: A Scalable Framework for Lifelong Model Editing in LLMs

    June 17, 2025

    The Challenge of Updating LLM Knowledge

    LLMs have shown outstanding performance across a wide range of tasks thanks to extensive pre-training on vast datasets. However, these models frequently generate outdated or inaccurate information and can reflect biases during deployment, so their knowledge needs to be updated continuously. Traditional fine-tuning is expensive and susceptible to catastrophic forgetting, which has motivated lifelong model editing: updating model knowledge efficiently and locally. For an edit to produce correct predictions, it must be reliable (the edited prompt yields the new answer), generalizable (paraphrases of the edited prompt yield it as well), and localized (behavior on unrelated prompts is preserved). Non-parametric methods achieve precise, localized edits but generalize poorly, while parametric methods offer better generalization but suffer from catastrophic forgetting.
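    These three criteria are typically scored per edit. The following is a minimal sketch of how such scoring might look, assuming a Hugging Face-style generate/tokenizer interface and hypothetical edit dictionaries; none of the names come from the paper.

    # Minimal sketch of scoring one edit on the three criteria.
    # The model/tokenizer interface and the `edit` fields are illustrative.
    def evaluate_edit(model, tokenizer, edit, unrelated_prompts):
        """edit = {"prompt": str, "target": str, "paraphrases": [str, ...]};
        unrelated_prompts = [(prompt, pre_edit_answer), ...]."""

        def answer(prompt):
            inputs = tokenizer(prompt, return_tensors="pt")
            output = model.generate(**inputs, max_new_tokens=8)
            return tokenizer.decode(output[0], skip_special_tokens=True)

        # Reliability: the edited prompt itself now yields the new target.
        reliability = float(edit["target"] in answer(edit["prompt"]))

        # Generalization: paraphrases of the edited prompt also yield the target.
        generalization = sum(
            edit["target"] in answer(p) for p in edit["paraphrases"]
        ) / len(edit["paraphrases"])

        # Locality: unrelated prompts keep their pre-edit answers.
        locality = sum(
            pre in answer(p) for p, pre in unrelated_prompts
        ) / len(unrelated_prompts)

        return {"reliability": reliability,
                "generalization": generalization,
                "locality": locality}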

    Limitations of Prior Model Editing Techniques

    Earlier work has explored sparse neural activations in continual learning, with methods like PackNet and Supermasks-in-Superposition allocating disjoint parameter subsets per task. Gradient-based approaches such as GPM and SPARCL improve efficiency through orthogonal updates but are limited to continual-learning settings. Parametric approaches such as ROME, MEMIT, and WISE modify weights through locate-then-edit strategies or auxiliary modules, but they forget earlier edits over long edit sequences. Non-parametric methods like GRACE and LOKA store knowledge externally to preserve the original weights, enabling precise local edits; however, they rely on exact input matches, which limits their ability to generalize.

    Introducing MEMOIR: A Structured Approach to Model Editing

    Researchers from EPFL, Lausanne, Switzerland, have proposed MEMOIR (Model Editing with Minimal Overwrite and Informed Retention), which balances reliability, generalization, and locality for large-scale edits. It introduces a memory module, a fully-connected layer within a single transformer block, where all edits occur. MEMOIR counters catastrophic forgetting by allocating distinct parameter subsets to each edit and retrieving them during inference so that only the knowledge relevant to a given prompt is activated. During editing, the method applies structured sparsification with sample-dependent masks, updating only a prompt-specific subset of parameters. New knowledge is thereby distributed across the parameter space, reducing overwriting and minimizing catastrophic forgetting. A minimal sketch of this idea follows.
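    The sketch below illustrates the memory-module idea in PyTorch, using an illustrative top-k magnitude mask as the sample-dependent sparsification; the paper's exact masking scheme, layer placement, and hyper-parameters may differ.

    import torch
    import torch.nn as nn

    class ResidualMemory(nn.Module):
        """Sketch of a MEMOIR-style memory module: one fully-connected layer in
        a single transformer block, updated through prompt-dependent sparse masks."""

        def __init__(self, d_model: int, top_k: int = 32):
            super().__init__()
            self.memory = nn.Linear(d_model, d_model, bias=False)
            nn.init.zeros_(self.memory.weight)  # starts as a no-op residual
            self.top_k = top_k

        def sample_mask(self, h: torch.Tensor) -> torch.Tensor:
            # Sample-dependent mask: keep the top-k hidden dimensions of the
            # prompt representation by magnitude, zero out the rest.
            idx = h.abs().topk(self.top_k, dim=-1).indices
            mask = torch.zeros_like(h)
            mask.scatter_(-1, idx, 1.0)
            return mask

        def forward(self, h: torch.Tensor) -> torch.Tensor:
            # Only the masked subset of features reaches the memory layer, so
            # each edit's gradient updates a distinct slice of the weight matrix
            # and new knowledge is spread across the parameter space.
            return self.memory(h * self.sample_mask(h))

    In this sketch, an edit would train only the memory layer on the new fact while the backbone stays frozen, and the mask recorded for that edit would be stored so it can be matched against incoming prompts at inference time.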

    Evaluation and Experimental Results

    MEMOIR operates through a residual memory framework during inference, where the edited output combines the original layer output with the residual memory output. It is evaluated against baselines including GRACE (external knowledge storage), DEFER (inference-time routing), causal-tracing-based methods such as ROME, MEMIT, and AlphaEdit, and memory-based methods such as WISE; direct fine-tuning serves as an additional baseline. Experiments are conducted on four autoregressive language models, LLaMA-3-8B-Instruct, Mistral-7B, LLaMA-2-7B, and GPT-J-6B, providing a comprehensive evaluation across models and scales and demonstrating the effectiveness and generalizability of MEMOIR.
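    Building on the ResidualMemory sketch above, the residual inference path could look like the following; the overlap measure and threshold are illustrative assumptions, not values from the paper.

    import torch

    def edited_layer_output(h, original_layer, memory, stored_masks, threshold=0.5):
        """Sketch of MEMOIR-style inference: the residual memory output is added
        only when the prompt's sparse activation pattern overlaps a mask stored
        at edit time; otherwise the original layer output is returned untouched."""
        base = original_layer(h)            # unedited output of the host layer
        query_mask = memory.sample_mask(h)  # prompt-specific sparse pattern

        # Compare the query pattern against masks recorded for previous edits
        # (a simple Jaccard-style overlap stands in for the paper's retrieval).
        best_overlap = 0.0
        for edit_mask in stored_masks:
            intersection = (query_mask * edit_mask).sum()
            union = ((query_mask + edit_mask) > 0).float().sum()
            best_overlap = max(best_overlap, (intersection / union).item())

        if best_overlap < threshold:
            return base                     # unrelated prompt: locality preserved
        return base + memory(h)             # matching prompt: residual knowledge added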

    On the ZsRE question-answering dataset, MEMOIR achieves an average metric of 0.95 on LLaMA-3 with 1,000 edits, outperforming all prior methods by a margin of 0.16. Similar results hold on Mistral, where the method again achieves the highest average score, underscoring its robustness across LLMs. MEMOIR also maintains balanced performance as the number of edits grows in hallucination correction on the SelfCheckGPT dataset: it sustains saturated locality scores in the most challenging scenario of 600 edits while achieving perplexity 57% and 77% lower than WISE, the second-best performing method, on LLaMA-3 and Mistral, respectively.

    Conclusion and Future Directions

    In conclusion, MEMOIR is a scalable framework for lifelong model editing that balances reliability, generalization, and locality through structured sparsification. The method retrieves relevant updates by comparing sparse activation patterns, allowing edits to generalize to rephrased queries while preserving model behavior on unrelated prompts. Some limitations remain: only a single linear layer is modified, which may restrict handling of long-horizon edits or knowledge that requires broader changes to the model. Future directions include extending the approach to multiple layers, hierarchical editing strategies, and application to multi-modal or encoder-decoder models beyond the current decoder-only transformer focus.


    Check out the Paper. All credit for this research goes to the researchers of this project.

    Source: MarkTechPost
