    Researchers from Stanford and Duolingo Demonstrate Effective Strategies for Generating at a Desired Proficiency Level Using Proprietary Models such as GPT4 and Open-Source Techniques

    June 15, 2024

    Controlling the language proficiency levels in texts generated by large language models (LLMs) is a significant challenge in AI research. Ensuring that generated content is appropriate for various proficiency levels is crucial for applications in language learning, education, and other contexts where users may not be fully proficient in the target language. Without effective proficiency control, the usability and effectiveness of LLM-generated content are significantly hindered, especially for non-native speakers, children, and language learners.

    Current methods for this challenge include few-shot prompting, supervised fine-tuning, and reinforcement learning (RL). Few-shot prompting provides the model with a few examples to guide its output, while supervised fine-tuning adjusts the model on a labeled dataset. RL, specifically Proximal Policy Optimization (PPO), further refines the model’s outputs using a reward signal. Each method has limitations: few-shot prompting with open-source models often incurs high computational cost and suboptimal performance; supervised fine-tuning requires extensive labeled data, which may not be readily available; and RL techniques can be unstable and computationally intensive, making them less practical at scale.
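    To make the first of these baselines concrete, here is a minimal sketch of few-shot prompting for proficiency-controlled generation. The example stories, their CEFR labels, and the prompt wording are illustrative assumptions, not the paper’s actual prompts.

```python
# Hypothetical few-shot examples pairing a CEFR level with a story
# written at that level. Real prompts would use curated examples.
EXAMPLES = [
    ("A2", "Tom has a dog. The dog is small. They play in the park."),
    ("C1", "The negotiations collapsed amid mutual recriminations, "
           "leaving both delegations to reassess their positions."),
]

def build_prompt(target_level: str, topic: str) -> str:
    """Assemble a few-shot prompt asking for a story at a CEFR level."""
    parts = []
    for level, story in EXAMPLES:
        parts.append(f"Write a short story at CEFR level {level}:\n{story}")
    # The final, unanswered instruction carries the target level.
    parts.append(f"Write a short story at CEFR level {target_level} about {topic}:")
    return "\n\n".join(parts)
```

    The prompt simply concatenates labeled demonstrations before the target instruction; the quality of level control then depends entirely on the underlying model, which is why the open-source baselines underperform here.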

    A team of researchers from Stanford and Duolingo proposes the CEFR-Aligned Language Model (CALM), which combines fine-tuning and PPO to align output proficiency with the Common European Framework of Reference for Languages (CEFR). The approach addresses the limitations of existing methods by closing the performance gap between proprietary models such as GPT-4 and open-source alternatives: CALM is designed to generate high-quality, proficiency-controlled content at a fraction of the cost of a proprietary model, making proficiency-controlled text generation more accessible and cost-effective.
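    The PPO stage implies a reward that peaks when a generation lands on the target CEFR level. A minimal sketch, assuming CEFR levels map onto a numeric 1–6 scale (A1=1 … C2=6) and that an automated scorer returns a real-valued level estimate; the mapping and penalty shape are assumptions, not taken from the paper.

```python
# Assumed numeric mapping of CEFR levels onto a 1-6 scale.
CEFR_SCALE = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}

def ppo_reward(scored_level: float, target: str) -> float:
    """Negative absolute gap between the scored level and the target.

    A perfect match yields 0; larger deviations yield more negative
    rewards, so PPO is pushed toward the requested proficiency level.
    """
    return -abs(scored_level - CEFR_SCALE[target])
```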

    The method fine-tunes open-source models such as Llama-2-7B and Mistral-7B on a dataset generated with effective GPT-4 prompting strategies. The dataset, called TinyTolkien, consists of short stories at varying CEFR levels. Further training with PPO aligns model outputs with the desired proficiency levels, and a sampling strategy boosts performance by selecting the best output from multiple generations. Two technical pieces are central to the approach: linguistic features for automated CEFR scoring, and RL training that minimizes ControlError, a metric measuring how far the generated text deviates from the target proficiency level.
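    The best-of-n sampling step can be sketched as follows: generate several candidates, score each with the automated CEFR scorer, and keep the one with the smallest ControlError. The numeric CEFR mapping and the stand-in scorer in the test are illustrative assumptions, not the paper’s linguistic-feature model.

```python
# Assumed numeric mapping of CEFR levels onto a 1-6 scale.
CEFR_LEVELS = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}

def pick_best(candidates, scorer, target_level):
    """Return the candidate minimizing |score - target| (its ControlError).

    `scorer` is any callable mapping a text to a real-valued CEFR
    estimate; in the paper this is an automated, feature-based scorer.
    """
    target = CEFR_LEVELS[target_level]
    return min(candidates, key=lambda text: abs(scorer(text) - target))
```

    Because the selection only needs relative scores, the same routine works for top-3 sampling or any other n: generate n candidates and call `pick_best` once.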

    The results show that CALM achieves a ControlError comparable to GPT-4 at a significantly lower cost. Evaluation covered ControlError, QualityScore, and computational cost, and the findings were validated through both automatic scoring and a small-scale human study, which reported high ratings for quality and proficiency alignment. The paper’s key table compares prompting strategies and models, highlighting CALM’s strong performance on both ControlError and quality metrics; for instance, CALM with top-3 sampling achieved a ControlError of 0.15, outperforming the other models and strategies.

    In conclusion, the researchers addressed the critical challenge of controlling the proficiency level of LLM-generated content. They proposed a novel approach combining finetuning and PPO, validated through rigorous evaluation, which significantly advances the field by providing an efficient, cost-effective solution for generating proficiency-controlled text. This work has the potential to enhance applications in education and language learning, making advanced AI tools more accessible to a broader audience.

    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post Researchers from Stanford and Duolingo Demonstrate Effective Strategies for Generating at a Desired Proficiency Level Using Proprietary Models such as GPT4 and Open-Source Techniques appeared first on MarkTechPost.
