LMEraser: A Novel Machine Unlearning Method for Large Models Ensuring Privacy and Efficiency

Large models such as BERT, GPT-3, and T5 can have billions of parameters and are trained on extensive data, enabling them to discern intricate patterns and achieve high accuracy. However, their widespread use raises privacy concerns about the unauthorized exposure of sensitive user information. Machine unlearning has emerged as a solution: it removes the influence of specific data from a trained model without complete retraining. Yet existing unlearning methods, designed for smaller models, struggle with larger ones. It is hard to pinpoint the influence of individual data, the computational demands are high, and overall performance must be maintained as data is removed.

IEEE researchers have developed LMEraser, an efficient unlearning method for large models that addresses privacy concerns in machine learning. LMEraser takes a divide-and-conquer approach, partitioning the dataset into public and private segments. It uses adaptive prompt tuning to isolate data influence, reducing computational costs while maintaining model performance. By freezing the backbone parameters after pre-training and employing an adaptive prompt tuning mechanism, LMEraser achieves precise unlearning with minimal impact on accuracy. Experimental results show a significant reduction in unlearning costs, making LMEraser a pioneering solution for large model privacy protection.

Prompt tuning is a technique to adapt pre-trained models for new tasks by adding small learnable vectors, or "prompts," to input data, avoiding full model retraining. It's computationally efficient, allowing a single model to handle multiple tasks. Vision Transformers (ViT) are commonly used in visual prompt tuning, with methods like VPT and VP integrating prompts into image embeddings. Machine unlearning removes specific data from trained models without complete retraining, which is crucial for privacy. Exact methods completely remove data influence, but they're resource-intensive. Approximate methods aim to reduce influence efficiently, using techniques like influence functions, though they face scalability challenges.
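To make the idea of prompt tuning concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not the VPT, VP, or LMEraser implementation: the "backbone" is a fixed linear map over mean-pooled tokens, and the names `FROZEN_W`, `frozen_backbone`, and `tune_prompt` are hypothetical. The point it illustrates is that only the prompt vector is updated while the backbone weights stay untouched.

```python
# Toy illustration of prompt tuning: the backbone is frozen, only the prompt is learned.
# All names are hypothetical stand-ins, not taken from VPT, VP, or LMEraser.

FROZEN_W = [0.5, -1.0, 2.0]  # stands in for pre-trained backbone weights; never updated


def frozen_backbone(tokens):
    """Mean-pool the token vectors, then apply a fixed linear map to get one score."""
    dim = len(FROZEN_W)
    pooled = [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]
    return sum(w * x for w, x in zip(FROZEN_W, pooled))


def tune_prompt(inputs, targets, steps=200, lr=0.5):
    """Learn one prompt vector, prepended to every input, by gradient descent
    on the squared error between the backbone score and the target."""
    dim = len(FROZEN_W)
    prompt = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for tokens, y in zip(inputs, targets):
            seq = [prompt] + tokens  # the prompt is prepended to the input tokens
            err = frozen_backbone(seq) - y
            # d(score)/d(prompt[i]) = FROZEN_W[i] / len(seq) because of mean pooling
            for i in range(dim):
                grad[i] += 2 * err * FROZEN_W[i] / len(seq)
        # Only the prompt changes; FROZEN_W is never written to.
        prompt = [p - lr * g / len(inputs) for p, g in zip(prompt, grad)]
    return prompt
```

For example, `tune_prompt([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]], [1.0])` returns a prompt that steers the frozen backbone toward the target score for that input. In a real ViT the prompt would be a set of learnable tokens trained by backpropagation, but the division of labor is the same.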

LMEraser employs a multi-step method to efficiently handle data in large models, addressing privacy and unlearning challenges. Initially, it partitions the dataset into public and private segments to ensure sensitive data isolation. The model backbone is pre-trained solely on public data to avoid privacy risks and stabilize the model. Private data are then adaptively clustered based on diversity, allowing for tailored prompt tuning. This adaptive approach ensures efficient unlearning by re-optimizing prompts and classifier heads only for affected clusters when data removal is required. Thus, LMEraser achieves precise unlearning without full model retraining, maintaining performance and privacy.
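The cluster-scoped unlearning step described above can be sketched as follows. This is a simplified illustration, not the authors' code: the class `ClusteredPromptStore` is hypothetical, samples are reduced to a single number, nearest-centroid assignment stands in for the paper's diversity-based adaptive clustering, and a per-cluster mean stands in for a tuned prompt and classifier head. What it does show is that removing a sample triggers re-optimization only for the cluster that contained it.

```python
# Simplified sketch of cluster-scoped unlearning. Hypothetical names throughout;
# the clustering rule and the "prompt" (a mean) are stand-ins, not LMEraser's method.

class ClusteredPromptStore:
    def __init__(self, centroids):
        self.centroids = list(centroids)
        # cluster id -> {sample id: feature}
        self.clusters = {i: {} for i in range(len(self.centroids))}
        self.prompts = {}       # cluster id -> fitted "prompt" (here: mean feature)
        self.retrain_log = []   # records which clusters were re-optimized

    def _nearest(self, feature):
        """Assumed clustering rule: assign to the closest centroid."""
        return min(range(len(self.centroids)),
                   key=lambda i: abs(feature - self.centroids[i]))

    def add(self, sample_id, feature):
        cid = self._nearest(feature)
        self.clusters[cid][sample_id] = feature
        return cid

    def fit_cluster(self, cid):
        """Stand-in for re-tuning one cluster's prompt and classifier head."""
        feats = list(self.clusters[cid].values())
        self.prompts[cid] = sum(feats) / len(feats) if feats else None
        self.retrain_log.append(cid)

    def fit_all(self):
        for cid in self.clusters:
            self.fit_cluster(cid)

    def unlearn(self, sample_id):
        """Delete one private sample and refit only the cluster that held it."""
        for cid, members in self.clusters.items():
            if sample_id in members:
                del members[sample_id]
                self.fit_cluster(cid)
                return cid
        raise KeyError(sample_id)
```

Because each cluster's prompt is refit from scratch on the remaining members, the result after `unlearn` matches what would have been obtained had the sample never been added, which is the property exact unlearning asks for. The frozen backbone, trained only on public data, needs no update.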

The evaluation of LMEraser focuses on model utility and unlearning efficiency. Model utility is assessed by image classification accuracy, to check that it is not compromised by unlearning. Unlearning efficiency is measured by time and computational cost. Using ImageNet-22K as the public dataset and smaller datasets like CIFAR-10, CIFAR-100, GTSRB, and SVHN as private datasets, LMEraser is compared with baselines such as retraining from scratch and SISA. Tests are conducted on Nvidia Tesla V100-FHHL GPUs using PyTorch v2.1.2 and CUDA 12.1. Results demonstrate LMEraser's better performance and efficiency in handling unlearning requests.
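The two evaluation axes, utility and efficiency, can be measured with very small helpers. The sketch below is a generic illustration using only the Python standard library; the function names `accuracy` and `timed` are hypothetical and this is not the paper's evaluation harness.

```python
import time


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels (model utility)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds) as a cost measure."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

In a comparison of unlearning methods, one would wrap each method's handling of the same removal request in `timed` and report `accuracy` on a held-out test set before and after the request.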

In conclusion, LMEraser represents a breakthrough in exact unlearning techniques tailored for large-scale models. By leveraging prompt tuning, it adeptly sequesters the impact of private data, ensuring robust privacy protection. Its adaptive approach to prompt tuning strikes a delicate balance between efficient unlearning and safeguarding model performance. Extensive experiments affirm LMEraser's efficacy in achieving precise unlearning while upholding accuracy standards, underscoring its versatility across diverse datasets and expansive model architectures.

Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.

The post LMEraser: A Novel Machine Unlearning Method for Large Models Ensuring Privacy and Efficiency appeared first on MarkTechPost.
