
    Predicting and Interpreting In-Context Learning Curves Through Bayesian Scaling Laws

    November 4, 2024

Large Language Models (LLMs) have demonstrated remarkable in-context learning (ICL) capabilities: they can learn tasks from demonstrations without requiring additional training. A critical challenge in this field is understanding and predicting the relationship between the number of demonstrations provided and the model's performance improvement, known as the ICL curve. Despite its significant practical implications, this relationship remains poorly understood. Accurately predicting ICL curves matters for determining optimal demonstration quantities, anticipating potential alignment failures in many-shot scenarios, and assessing the fine-tuning required to suppress undesired behaviors. The ability to model these learning curves effectively would improve decision-making around deployment strategies and help mitigate risks associated with LLM implementations.

Various research approaches have attempted to decode the underlying mechanisms of in-context learning in Large Language Models, and divergent theories have emerged. Some studies suggest LMs trained on synthetic data behave like Bayesian learners, others propose they follow gradient descent, and still others indicate the learning algorithm varies with task complexity, model scale, and training progress. Power laws have emerged as the predominant framework for modeling LM behavior, including ICL curves, across different settings. However, existing research has notable limitations: no previous work has directly modeled the ICL curve from assumptions about the underlying learning algorithm. Post-training modifications have also proven largely ineffective, with studies revealing that such changes are often superficial and easily circumvented. This is particularly concerning since ICL can reinstate behaviors that were supposedly suppressed through fine-tuning.

Researchers propose a framework that introduces Bayesian scaling laws to model and predict in-context learning curves across different language model scenarios. The study evaluates these laws using both synthetic data experiments with GPT-2 models and real-world testing on standard benchmarks. The approach extends beyond simple curve fitting, providing interpretable parameters that capture the prior task distribution, ICL efficiency, and example probabilities across different tasks. The research methodology encompasses two main experimental phases: first comparing the Bayesian laws' performance against existing power law models in curve prediction, and second, analyzing how post-training modifications affect ICL behavior in both favored and disfavored tasks. The study culminates in comprehensive testing across large-scale models ranging from 1B to 405B parameters, including evaluation of capabilities, safety benchmarks, and a robust many-shot jailbreaking dataset.

    The architecture of the Bayesian scaling laws for ICL is built upon fundamental assumptions about how language models process and learn from in-context examples. The framework begins by treating ICL as a Bayesian learning process, applying Bayes’ theorem iteratively to model how each new in-context example updates the task prior. A key innovation in the architecture is the introduction of parameter reduction techniques to prevent overfitting. This includes two distinct approaches to parameter tying, sampling-wise and scoring-wise, which help maintain model efficiency while scaling linearly with the number of distributions. The architecture incorporates an ICL efficiency coefficient ‘K’ that accounts for the token-by-token processing nature of LLMs and variations in example informativeness, effectively modulating the strength of Bayesian updates based on example length and complexity.
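The iterative Bayesian update described above can be sketched as follows. This is a minimal illustration, not the paper's exact parameterization: the two-task setup, the likelihood values, and the way the efficiency coefficient K tempers each update are all assumptions introduced for the example.

```python
import numpy as np

def bayesian_icl_update(prior, likelihoods, K=1.0):
    """One Bayesian update of the task posterior from a single
    in-context example. likelihoods[t] is p(example | task t);
    K tempers the update strength (K = 1 recovers exact Bayes)."""
    unnorm = prior * likelihoods ** K
    return unnorm / unnorm.sum()

def icl_curve(prior, likelihoods, n_shots, K=1.0):
    """Probability assigned to the next true-task example after
    each number of in-context demonstrations (the ICL curve)."""
    post = np.asarray(prior, dtype=float)
    curve = []
    for _ in range(n_shots):
        # expected probability of the next example under the task mixture
        curve.append(float(post @ likelihoods))
        post = bayesian_icl_update(post, likelihoods, K)
    return curve

# two hypothetical tasks: the true task assigns 0.8 to its own
# examples, a distractor task assigns 0.2
likelihoods = np.array([0.8, 0.2])
prior = np.array([0.3, 0.7])  # model initially favors the wrong task
curve = icl_curve(prior, likelihoods, n_shots=8)
```

Each demonstration multiplies the posterior odds of the true task by the likelihood ratio, so the curve rises monotonically toward the true task's example probability, which is the qualitative shape the scaling laws are built to fit.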

    The experimental results demonstrate superior performance of the Bayesian scaling laws compared to existing approaches. In interpolation tests, the original Bayesian scaling law achieved significantly lower Normalized Root Mean Square Error (NRMSE) across model scales and trajectory lengths, only matched by a strong logistic baseline. The scoring-wise Bayesian law particularly excelled in extrapolation tasks, showing the best performance when predicting the remaining 90% of ICL curves using only the first 10% of data points. Beyond numerical superiority, the Bayesian laws offer interpretable parameters that provide meaningful insights into model behavior. The results reveal that prior distributions align with uniform pretraining distributions, and ICL efficiency correlates positively with both model depth and example length, indicating that larger models achieve faster in-context learning, especially with more informative examples.
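The evaluation protocol above (NRMSE as the metric, fitting on an early prefix of the curve and extrapolating to the rest) can be sketched on synthetic data. The generic power-law baseline, the curve shape, the noise level, and the 10% cut are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
from scipy.optimize import curve_fit

def nrmse(y_true, y_pred):
    """Root mean square error, normalized by the mean of y_true."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / np.mean(y_true)

def power_law(n, a, b, c):
    """Generic power-law ICL baseline: accuracy = a * n^(-b) + c."""
    return a * n ** (-b) + c

# synthetic ICL curve: accuracy vs. number of shots, with small noise
n = np.arange(1, 101)
rng = np.random.default_rng(0)
y = 0.9 - 0.5 * n ** (-0.7) + rng.normal(0, 0.01, n.size)

# fit on the first 10% of the curve, extrapolate the remaining 90%
cut = len(n) // 10
params, _ = curve_fit(power_law, n[:cut], y[:cut],
                      p0=(-0.5, 0.7, 0.9), maxfev=10000)
pred = power_law(n[cut:], *params)
score = nrmse(y[cut:], pred)
```

A law that fits well on the 10% prefix and keeps NRMSE low on the held-out 90% is exactly what the extrapolation comparison in the paper rewards.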

Comparing Llama 3.1 8B Base and Instruct versions revealed crucial insights about the effectiveness of instruction-tuning. Results show that while instruction-tuning successfully reduces the prior probability of unsafe behaviors across various evaluation metrics (including HarmBench and persona evaluations), it fails to prevent many-shot jailbreaking effectively. The Bayesian scaling law demonstrates that posterior probabilities eventually saturate regardless of the reduced prior probabilities achieved through instruction-tuning. This suggests that instruction-tuning primarily modifies task priors rather than fundamentally altering the model's underlying task knowledge, possibly due to the relatively limited computational resources allocated to instruction-tuning compared to pretraining.
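The prior-versus-posterior distinction can be illustrated with a toy Bayesian calculation: under iterated Bayes updates, the posterior odds of a task grow geometrically with the number of in-context examples, so lowering the prior only delays saturation by a few shots. The prior probabilities and likelihood ratio below are invented for illustration, not measured values.

```python
import numpy as np

def shots_to_saturate(prior_unsafe, lik_ratio, threshold=0.99):
    """Number of in-context examples until the posterior on the
    'unsafe' task exceeds threshold, given a per-example
    likelihood ratio favoring that task.
    Posterior odds grow geometrically: odds_n = prior_odds * ratio^n."""
    odds0 = prior_unsafe / (1 - prior_unsafe)
    target_odds = threshold / (1 - threshold)
    n = np.log(target_odds / odds0) / np.log(lik_ratio)
    return int(np.ceil(max(n, 0)))

base = shots_to_saturate(prior_unsafe=0.10, lik_ratio=4.0)    # base model
tuned = shots_to_saturate(prior_unsafe=0.001, lik_ratio=4.0)  # instruction-tuned
```

Even a hundredfold reduction in the prior only adds a handful of shots before the posterior saturates, which mirrors the paper's finding that instruction-tuning shifts priors without preventing many-shot jailbreaking.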

    The research successfully bridges two fundamental questions about in-context learning by developing and validating Bayesian scaling laws. These laws demonstrate remarkable effectiveness in modeling ICL behavior across both small-scale LMs trained on synthetic data and large-scale models trained on natural language. The key contribution lies in the interpretability of the Bayesian formulation, which provides clear insights into priors, learning efficiency, and task-conditional probabilities. This framework has proven valuable for understanding scale-dependent ICL capabilities, analyzing the impact of fine-tuning on knowledge retention, and comparing base models with their instruction-tuned counterparts. The success of this approach suggests that continued investigation of scaling laws could yield further crucial insights into the nature and behavior of in-context learning, paving the way for more effective and controllable language models.


Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.


    The post Predicting and Interpreting In-Context Learning Curves Through Bayesian Scaling Laws appeared first on MarkTechPost.
