
    Aligning Large Language Models with Diverse User Preferences Using Multifaceted System Messages: The JANUS Approach

    June 3, 2024

    Current methods for aligning LLMs typically target the general public’s preferences, assuming this is ideal. This overlooks the diverse and nuanced nature of individual preferences, which are hard to serve at scale because each person would require extensive data collection and model training. Techniques like RLHF and instruction fine-tuning align LLMs with broad human values such as helpfulness and harmlessness, yet they fail to address conflicting individual preferences, leading to annotation disagreements and undesirable model traits like verbosity.

    Researchers from KAIST AI and Carnegie Mellon University have developed a new paradigm in which users specify their values in system messages to better align LLMs with individual preferences. Traditional LLMs, trained with a uniform message like “You are a helpful assistant,” struggle to adapt to diverse system messages. To address this, the team created the MULTIFACETED COLLECTION, a dataset with 192k unique system messages and 65k instructions. Training a 7B LLM named JANUS on this dataset, they tested it against various benchmarks, achieving high performance and demonstrating that training on diverse system messages enhances alignment with both individual and general public preferences. Their work is available on GitHub.
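As a rough illustration of the paradigm (not the authors’ exact prompt format), a user’s preferences can be stated directly in the system message at inference time. The facet wording below is a hypothetical placeholder, not text from the MULTIFACETED COLLECTION:

```python
# Sketch: encoding individual preferences in a system message, the core idea
# behind JANUS-style alignment. The preference text is illustrative only.

def build_chat(preferences: list[str], instruction: str) -> list[dict]:
    """Prepend a multifaceted system message describing the user's values."""
    system_message = (
        "You are a helpful assistant. When responding, honor these "
        "user-specific preferences:\n"
        + "\n".join(f"- {p}" for p in preferences)
    )
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": instruction},
    ]

chat = build_chat(
    preferences=[
        "Prefer a code-centric style with minimal prose",
        "Explain the ethical implications of any code you produce",
    ],
    instruction="Write a script that checks a website's robots.txt rules.",
)
# `chat` can be passed to any chat-template-aware inference API, e.g.
# tokenizer.apply_chat_template(chat, tokenize=False) with Hugging Face
# transformers.
```

Because the preferences live in the prompt rather than the weights, a single model can serve many users without per-user fine-tuning.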

    Aligning LLMs to diverse human preferences is crucial since individuals hold different values for the same task. Most research uses the RLHF pipeline, creating customized reward functions to better reflect diverse perspectives and reduce annotation disagreements. Some studies focus on learning multiple preference distributions or training separate models per user. These methods often require impractical retraining; the proposed approach instead trains an LLM to adapt to explicitly stated preferences at test time. System messages, used to provide context and guide LLM behavior, have been shown to improve performance when diversified, but previous research has limited their scope. This work scales system messages to better align with user preferences.

    Existing alignment datasets generally reflect broad preferences like helpfulness and harmlessness. The goal is to create a dataset capturing more specific preferences, such as “code-centric style” or “ensuring code ethics” for coding solutions. Preferences are detailed textual descriptions of desirable qualities in responses. Two requirements for a model to reflect diverse human preferences are multifacetedness and explicitness. A hierarchical preference augmentation strategy ensures a variety of preference facets. Multifaceted preferences are included in model inputs via system messages. Data construction involves selecting 65k instructions, generating 192k system messages, and crafting gold-standard responses using GPT-4 Turbo. Models are trained using several methods, including instruction tuning and preference optimization.
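The hierarchical preference augmentation described above can be sketched as sampling one concrete preference per high-level dimension and composing the picks into a system message. The dimension names and preference texts here are invented for illustration; the actual taxonomy in the MULTIFACETED COLLECTION is richer and differently worded:

```python
import random

# Hypothetical preference hierarchy: each top-level dimension holds several
# concrete preference descriptions (assumed structure, not the real dataset's).
FACETS = {
    "style": [
        "Use a concise, code-centric presentation",
        "Write in an encouraging, tutorial-like tone",
    ],
    "informativeness": [
        "Cover edge cases and failure modes",
        "Stick to the minimum needed to answer",
    ],
    "harmlessness": [
        "Flag any security or ethics concerns in the code",
    ],
}

def sample_system_message(rng: random.Random) -> str:
    """Compose a multifaceted system message: one preference per dimension."""
    picks = [rng.choice(options) for options in FACETS.values()]
    return "Respond according to these preferences:\n" + "\n".join(
        f"- {p}" for p in picks
    )

rng = random.Random(0)  # seeded so the sampled message is reproducible
print(sample_system_message(rng))
```

Sampling per dimension is what makes the generated messages multifaceted: every message combines independent stylistic, informational, and safety preferences rather than varying along a single axis.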

    Benchmarks for evaluating the JANUS model include multifacetedness, helpfulness, and harmlessness. The MULTIFACETED BENCH enhances five existing benchmarks to assess context-specific nuances. Helpfulness is evaluated using Alpaca Eval 2.0, MT-Bench, and Arena Hard Auto v0.1, while harmlessness is assessed with RealToxicityPrompts. Baselines include various pre-trained, instruction-tuned, and preference-optimized models. Evaluations involve human and LLM assessments, showing that JANUS excels in generating personalized responses, maintaining helpfulness, and ensuring low toxicity. These results demonstrate JANUS’s ability to adapt to diverse preferences and maintain alignment with general helpful values without compromising safety.

    In conclusion, several ablation studies reveal JANUS’s robust performance, both with and without system messages. JANUS’s multifaceted capabilities allow it to generate quality responses regardless of context. Incorporating multifaceted system messages during training enhances performance in both multifacetedness and helpfulness. Training without system messages, however, poses challenges in capturing human preferences effectively. JANUS can also serve as a personalized reward model, improving performance on MULTIFACETED BENCH through best-of-n sampling. The method aligns LLMs with diverse user preferences using a unique system message protocol and the MULTIFACETED COLLECTION dataset, ensuring high performance and adaptability without continual retraining.
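Using JANUS as a personalized reward model with best-of-n sampling, as described above, follows a generic pattern: draw n candidate responses, score each one against the user’s stated preferences, and keep the best. The `generate` and `score` callables below are stand-ins for actual model calls, not the authors’ implementation:

```python
from typing import Callable

def best_of_n(
    generate: Callable[[], str],    # stand-in for sampling from the policy LLM
    score: Callable[[str], float],  # stand-in for scoring a response against
                                    # the user's multifaceted system message
    n: int = 4,
) -> str:
    """Return the highest-scoring of n sampled candidates."""
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)

# Toy demonstration with dummy stand-ins: score by response length.
responses = iter(["short", "a medium answer", "the longest candidate reply", "mid"])
best = best_of_n(generate=lambda: next(responses), score=len, n=4)
print(best)  # → "the longest candidate reply"
```

The appeal of this setup is that improving output quality for a given user requires only more sampling at inference time, not any retraining.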

    Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.

    The post Aligning Large Language Models with Diverse User Preferences Using Multifaceted System Messages: The JANUS Approach appeared first on MarkTechPost.
