
    Apple Researchers Introduce Instruction-Following Pruning (IFPruning): A Dynamic AI Approach to Efficient and Scalable LLM Optimization

    January 14, 2025

    Large language models (LLMs) have become crucial tools for natural language processing, computational mathematics, and programming. Such models often demand large-scale computational resources for both training and inference, so researchers have devised a range of techniques to optimize them.

    A central challenge in LLM optimization is that traditional pruning methods are static: they remove parameters according to a prespecified, fixed mask. A mask chosen in advance cannot adapt when an application calls for a particular skill, such as coding or solving mathematical problems. These methods therefore lack flexibility, and performance is usually not maintained across tasks even as computational resources are saved.

    Historically, techniques such as static structured pruning and mixture-of-experts (MoE) architectures have been used to counter the computational inefficiencies of LLMs. Structured pruning removes whole components, such as channels or attention heads, from specific layers. Although these methods are hardware-friendly, they require full retraining to avoid a loss of model accuracy. MoE models, in turn, activate only parts of the model during inference, but incur significant overhead from frequently reloading parameters.
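
    To make the contrast concrete, here is a minimal sketch of static structured pruning on an FFN layer, assuming PyTorch-style linear layers; the helper name and shapes are illustrative, not from the paper. The limitation is visible in the signature: keep_idx is chosen once and applied to every input, regardless of task.

    ```python
    import torch
    import torch.nn as nn

    def prune_ffn_static(ffn_up: nn.Linear, ffn_down: nn.Linear,
                         keep_idx: torch.Tensor):
        """Physically drop the FFN hidden channels not listed in keep_idx."""
        up = nn.Linear(ffn_up.in_features, len(keep_idx),
                       bias=ffn_up.bias is not None)
        down = nn.Linear(len(keep_idx), ffn_down.out_features,
                         bias=ffn_down.bias is not None)
        # Rows of the up-projection and columns of the down-projection
        # correspond to hidden channels; slicing both keeps the layers dense.
        up.weight.data = ffn_up.weight.data[keep_idx].clone()
        down.weight.data = ffn_down.weight.data[:, keep_idx].clone()
        if ffn_up.bias is not None:
            up.bias.data = ffn_up.bias.data[keep_idx].clone()
        if ffn_down.bias is not None:
            down.bias.data = ffn_down.bias.data.clone()
        return up, down
    ```

    Because the pruned layers are ordinary dense matrices, inference is fast on standard hardware, but recovering accuracy after such a one-shot cut typically requires full retraining.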

    Apple AI and UC Santa Barbara researchers have introduced a new technique called Instruction-Following Pruning (IFPruning), which dynamically adapts LLMs to the needs of a particular task. IFPruning uses a sparsity predictor that generates input-dependent pruning masks, selecting only the most relevant parameters for a given task. Unlike traditional methods, this dynamic approach focuses on feed-forward neural network (FFN) layers, allowing the model to adapt to diverse tasks while efficiently reducing computational demands, as sketched below.
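
    The following is a minimal, hypothetical sketch of that idea: a small predictor reads a pooled instruction embedding and emits a per-channel mask over an FFN layer's hidden units. The module names, the pooled embedding, and the top-k selection are illustrative assumptions, not the authors' implementation.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DynamicallyPrunedFFN(nn.Module):
        """FFN whose hidden channels are gated by an input-dependent mask."""

        def __init__(self, d_model: int, d_hidden: int):
            super().__init__()
            self.up = nn.Linear(d_model, d_hidden)
            self.down = nn.Linear(d_hidden, d_model)
            # Sparsity predictor: maps a pooled instruction embedding
            # to per-channel keep scores for this layer.
            self.predictor = nn.Linear(d_model, d_hidden)

        def forward(self, x: torch.Tensor, instr_emb: torch.Tensor,
                    keep_ratio: float = 1 / 3) -> torch.Tensor:
            # x: (batch, seq, d_model); instr_emb: (batch, d_model)
            scores = self.predictor(instr_emb)                # (batch, d_hidden)
            k = max(1, int(keep_ratio * scores.size(-1)))
            top = scores.topk(k, dim=-1).indices
            mask = torch.zeros_like(scores).scatter_(-1, top, 1.0)
            h = F.relu(self.up(x)) * mask.unsqueeze(1)        # zero pruned channels
            return self.down(h)
    ```

    Because all parameters stay resident and only the mask changes per input, switching from a coding prompt to a math prompt needs no weight reloading. (During training, a hard top-k like this would block gradients to the predictor; a soft relaxation such as a straight-through estimator would be a natural workaround, though the paper's exact mechanism is not detailed here.)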

    The researchers propose a two-stage training process for IFPruning. In the first stage, the dense model continues pre-training on large-scale data while the sparsity predictor and the LLM are optimized jointly, producing a strong starting point for subsequent fine-tuning. In the second stage, training proceeds on supervised fine-tuning datasets with highly varied task prompts and multiple examples per task. Masking remains dynamic throughout: the sparsity predictor generates masks online, pruning unnecessary weights without degrading performance. This eliminates the need for parameter reloading, a limitation observed in prior dynamic methods.
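
    A condensed, hypothetical view of that two-stage recipe, reusing the DynamicallyPrunedFFN sketch above; the loader fields, step counts, and the HuggingFace-style model interface (a .loss attribute when labels are passed) are placeholders rather than details from the paper.

    ```python
    import torch

    def run_stage(model, loader, optimizer, num_steps: int):
        for _, batch in zip(range(num_steps), loader):
            # The sparsity predictor generates masks online from each batch's
            # instruction embedding, so no weights are reloaded between tasks.
            out = model(batch["input_ids"], instr_emb=batch["instr_emb"],
                        labels=batch["labels"])
            out.loss.backward()
            optimizer.step()
            optimizer.zero_grad()

    # optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    # Stage 1: continued pre-training on large-scale data, updating the
    # LLM and the sparsity predictor jointly.
    # run_stage(model, pretrain_loader, optimizer, num_steps=100_000)
    # Stage 2: supervised fine-tuning on highly varied task prompts.
    # run_stage(model, sft_loader, optimizer, num_steps=10_000)
    ```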

    The performance of IFPruning was evaluated across multiple benchmarks. For instance, pruning a 9B-parameter model down to 3B improved coding-task accuracy by 8% over a dense 3B model, closely rivaling the unpruned 9B model. On mathematical datasets such as GSM8K and MATH, the dynamic pruning approach yielded a 5% increase in accuracy. On instruction-following evaluations, it showed consistent gains of around 4-6 percentage points on both IFEval and AlpacaEval. Even on multi-task benchmarks such as MMLU, IFPruning delivered robust results, demonstrating versatility across domains.

    These results also underscore the approach's scalability: models of 6B, 9B, and 12B parameters were tested, and all showed substantial performance improvements after pruning. Scaling the dense source model from 6B to 12B improved both efficiency and task-specific accuracy under the same conditions. Thanks to its dynamic sparsity mechanism, IFPruning also outperformed traditional structured pruning methods such as Pruning + Distill.

    The introduction of IFPruning marks a significant advancement in optimizing LLMs, providing a method that dynamically balances efficiency and performance. The approach addresses the limitations of static pruning and MoE architectures, setting a new standard for resource-efficient language models. With its ability to adapt to varied inputs without sacrificing accuracy, IFPruning presents a promising solution for deploying LLMs on resource-constrained devices.

    This research also points toward further developments in model pruning, including the optimization of other components such as attention heads and hidden layers. Although the method addresses many of the computational challenges, further work on server-side applications and multi-task pruning could broaden its applicability. As a dynamic and efficient framework, IFPruning opens up possibilities for more adaptive and accessible large-scale language models.


    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post Apple Researchers Introduce Instruction-Following Pruning (IFPruning): A Dynamic AI Approach to Efficient and Scalable LLM Optimization appeared first on MarkTechPost.
