Large language models (LLMs) demonstrate emergent capabilities as parameters, compute, and data scale, hinting at artificial general intelligence. Yet deployed LLMs still exhibit errors such as hallucinations, bias, and factual inaccuracies, and the constant evolution of world knowledge quickly makes their pretraining stale. Correcting these errors promptly during deployment is crucial, because retraining or finetuning for every update is prohibitively costly and unsustainable for lifelong knowledge growth.
An LLM's long-term memory, its parameters, can be updated through (re)pretraining, finetuning, or model editing, while working memory, exemplified by retrieval-style adaptors such as GRACE, supplies knowledge at inference time. Debates persist over the efficacy of finetuning versus retrieval, and current knowledge-injection methods face challenges such as computational overhead and overfitting. Model editing techniques, including constrained finetuning and meta-learning approaches, aim to modify LLMs efficiently. Recent work extends editing to the lifelong setting, but these approaches typically require extensive domain-specific training, which is problematic because upcoming edits cannot be predicted and the relevant data may not be accessible in advance.
Addressing these issues, researchers from Zhejiang University and Alibaba Group propose WISE, a dual parametric memory scheme comprising a main memory for pretrained knowledge and a side memory for edited knowledge. Only the side memory is updated during editing, and a router determines which memory to consult for each query. For continual editing, WISE employs a knowledge-sharding mechanism that segregates edits into distinct parameter subspaces to prevent conflicts before merging them into a shared side memory; a schematic sketch of the routing idea follows.
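To make the dual-memory idea concrete, here is a minimal PyTorch-style sketch of routing between a frozen main FFN and an editable side copy. The class name, scoring rule, and threshold value are illustrative assumptions for exposition, not the paper's exact formulation.

```python
import copy
import torch
import torch.nn as nn

class DualMemoryFFN(nn.Module):
    """Schematic sketch of a WISE-style dual memory: the pretrained FFN acts as
    main (long-term) memory, a trainable copy acts as side (working) memory,
    and a routing score decides which one answers a query. The scoring rule and
    threshold here are illustrative placeholders."""

    def __init__(self, main_ffn: nn.Module, threshold: float = 0.5):
        super().__init__()
        self.main_ffn = main_ffn                 # frozen pretrained weights
        self.side_ffn = copy.deepcopy(main_ffn)  # initialized as a copy; receives edits
        for p in self.main_ffn.parameters():
            p.requires_grad = False
        self.threshold = threshold               # routing threshold (hypothetical value)

    def routing_score(self, hidden: torch.Tensor) -> torch.Tensor:
        # Illustrative score: how much the edited side memory's output diverges
        # from the main memory's output for this query representation.
        with torch.no_grad():
            diff = self.side_ffn(hidden) - self.main_ffn(hidden)
        return diff.norm(dim=-1).mean()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Route to the side memory only when the query looks like edited knowledge.
        if self.routing_score(hidden) > self.threshold:
            return self.side_ffn(hidden)
        return self.main_ffn(hidden)

# Usage sketch: wrap a toy FFN and route a batch of query representations.
ffn = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 16))
dual = DualMemoryFFN(ffn, threshold=0.5)
out = dual(torch.randn(2, 16))
```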
WISE comprises two main components: side memory design, and knowledge sharding and merging. The side memory is initialized as a copy of a chosen FFN layer of the LLM and stores the edits, while a routing mechanism selects between main and side memory during inference. Knowledge sharding divides a stream of edits into random parameter subspaces so they can be learned without interfering with one another, and knowledge-merging techniques then combine these subspaces into a unified side memory (see the sketch below). In addition, WISE introduces WISE-Retrieve, which maintains multiple side memories and retrieves among them based on activation scores, extending the approach to longer lifelong-editing scenarios.
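The following is a simplified sketch of sharding and merging: each edit group is restricted to a random binary mask over the side-memory weights, and the masked updates are later folded back into one shared memory. The mask density and the averaging-based merge rule are assumptions made for illustration; the paper's actual merging scheme is more careful.

```python
import torch

def make_random_shard_masks(shape, n_shards, density=0.2, seed=0):
    """Knowledge-sharding sketch: each edit group gets a random binary mask
    selecting the subspace of side-memory weights it may update.
    The density value is an illustrative assumption."""
    gen = torch.Generator().manual_seed(seed)
    return [(torch.rand(shape, generator=gen) < density).float() for _ in range(n_shards)]

def merge_sharded_updates(base_weight, shard_updates, shard_masks):
    """Merge per-shard weight updates back into one shared side memory.
    Overlapping positions are simply averaged here; treat this as a
    simplified stand-in for the paper's merging technique."""
    merged_delta = torch.zeros_like(base_weight)
    overlap_count = torch.zeros_like(base_weight)
    for delta, mask in zip(shard_updates, shard_masks):
        merged_delta += delta * mask
        overlap_count += mask
    merged_delta = merged_delta / overlap_count.clamp(min=1.0)
    return base_weight + merged_delta

# Usage sketch: two edit shards updating random subspaces of one FFN weight matrix.
W = torch.zeros(8, 8)
masks = make_random_shard_masks(W.shape, n_shards=2)
updates = [torch.randn_like(W) * 0.01 for _ in masks]
W_merged = merge_sharded_updates(W, updates, masks)
```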
WISE demonstrates superior performance over existing methods in both QA and hallucination settings, with the advantage growing on long editing sequences, where it remains stable and handles sequential edits effectively. Methods like MEND and ROME are competitive early on but falter as edit sequences lengthen, since directly editing long-term memory causes sharp declines in locality and degrades generalization. GRACE excels in locality but sacrifices generalization under continual editing. WISE instead balances reliability, generalization, and locality, outperforming baselines across tasks, and in out-of-distribution evaluation it shows strong generalization that surpasses the other methods.
This research identifies the difficulty of achieving reliability, generalization, and locality simultaneously in current lifelong model editing approaches, attributing it to the gap between working and long-term memory. To bridge this gap, WISE combines a side memory with model-merging techniques. Results indicate that WISE achieves high scores on all three metrics at once across various datasets and LLMs.
Check out the Paper. All credit for this research goes to the researchers of this project.