
Cutting Costs, Not Performance: Structured Feedforward Networks (FFNs) in Transformer-Based LLMs

    July 1, 2024

    Optimizing the efficiency of Feedforward Neural Networks (FFNs) within Transformer architectures is a significant challenge in AI. Large language models (LLMs) are highly resource-intensive, requiring substantial computational power and energy, which restricts their applicability and raises environmental concerns. Efficiently addressing this challenge is crucial for promoting sustainable AI practices and making advanced AI technologies more accessible by reducing operational costs.

    Current methods to enhance FFN efficiency typically involve low-rank approximations and structured matrices. Approaches such as LowRank and BlockDense decompositions have been proposed to reduce parameters and FLOPs. However, these methods often face limitations in practical scenarios. For instance, low-rank approximations can suffer from poor optimization dynamics due to increased symmetries leading to saddle points, and structured matrices can result in suboptimal training dynamics and reduced efficiency in online decoding due to poor parallelism on GPUs. These limitations make the existing methods less suitable for real-time applications and large-scale deployments.
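To make the parameter savings concrete, the sketch below counts parameters for a dense FFN projection versus the two baseline structures mentioned above. The dimensions (`d_model=512`, `d_ff=2048`, rank 64, 4 blocks) are illustrative choices, not values from the paper.

```python
import numpy as np

d_model, d_ff, rank, n_blocks = 512, 2048, 64, 4

# Dense FFN projection: a full d_model x d_ff weight matrix.
dense_params = d_model * d_ff

# LowRank: W is approximated as U @ V with U (d_model x r) and V (r x d_ff).
lowrank_params = d_model * rank + rank * d_ff

# Block-diagonal: n_blocks independent (d_model/b x d_ff/b) sub-matrices.
block_params = n_blocks * (d_model // n_blocks) * (d_ff // n_blocks)

# The low-rank product keeps the same input/output shapes as the dense layer.
rng = np.random.default_rng(0)
x = rng.standard_normal(d_model)
U = rng.standard_normal((d_model, rank))
V = rng.standard_normal((rank, d_ff))
y = x @ U @ V  # output shape (d_ff,), matching the dense projection
```

With these settings the low-rank layer uses roughly 16% of the dense parameter count, and the block-diagonal layer uses 1/`n_blocks` of it, which is where the FLOP reductions come from.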

    A team of researchers from Google DeepMind and EPFL propose a hybrid structure combining low-rank and block-diagonal matrices with a technique termed ‘self-guided training.’ This new method aims to mitigate the optimization issues by introducing a dense matrix during the initial training phase, which is gradually phased out, allowing the structured matrices to take over. This approach ensures better training stability and faster convergence. The hybrid method not only addresses computational efficiency but also ensures that optimization dynamics are smooth, reducing the occurrence of loss spikes and instability and thus representing a significant advancement over existing methods.
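The core idea of self-guided training can be sketched as a forward pass that blends an auxiliary dense matrix with the structured factors, with the dense contribution annealed to zero. The linear decay schedule and the low-rank choice of structured matrix here are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 256, 32
W_dense = rng.standard_normal((d, d)) * 0.02  # auxiliary dense matrix, used early on
U = rng.standard_normal((d, r)) * 0.02        # structured (low-rank) factors
V = rng.standard_normal((r, d)) * 0.02

def guide_weight(step, guided_steps):
    # Mixing coefficient: 1 at the start of training, decayed to 0 when
    # guidance ends. A linear schedule is assumed for illustration.
    return max(0.0, 1.0 - step / guided_steps)

def forward(x, step, guided_steps=1000):
    a = guide_weight(step, guided_steps)
    # Early steps are dominated by the well-conditioned dense path; later
    # steps hand over entirely to the structured parameterization.
    return a * (x @ W_dense) + (1.0 - a) * (x @ U @ V)
```

After the guided phase the dense matrix no longer contributes and can be discarded, so inference pays only the structured layer's cost.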

    The research employs structured linear parameterization, where the FFN layers are approximated using combinations of low-rank and block-diagonal matrices. The key innovation is the ‘self-guided training’ method, where the dense matrix aids in the early training stages, progressively transitioning to efficient structured forms. The training utilizes the RefinedWeb dataset, which includes 600B tokens, and employs advanced GPU optimizations like mixed precision training, Flash Attention, and rotary embeddings. Hyperparameters such as learning rates and dropout rates are meticulously tuned to ensure optimal performance. The proposed models are tested at scales ranging from 110M to 1.3B parameters, demonstrating scalability and robustness.
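A minimal sketch of the hybrid parameterization, assuming the low-rank and block-diagonal paths are summed (the precise composition used in the paper may differ): the layer computes two cheap products whose sum behaves like one dense matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, b = 512, 32, 4
bs = d // b  # block size

U = rng.standard_normal((d, r))
V = rng.standard_normal((r, d))
blocks = [rng.standard_normal((bs, bs)) for _ in range(b)]

def hybrid_matvec(x):
    # Low-rank path: two skinny matmuls instead of one d x d product.
    low = x @ U @ V
    # Block-diagonal path: each slice of x multiplies only its own block,
    # which maps well onto parallel hardware.
    blk = np.concatenate([x[i * bs:(i + 1) * bs] @ blocks[i] for i in range(b)])
    return low + blk
```

The hybrid costs `d*r + r*d + d*bs` multiply-adds per token instead of `d*d`, while the block-diagonal term restores some of the expressiveness a pure low-rank map loses.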

    The method delivers substantial gains in both training and inference efficiency. The structured FFN models achieved a 1.35× training speed-up and a 2.5× faster FFN at inference, with only a slight increase in perplexity. The self-guided training technique reduced perplexity by 0.4 on a 1.3B-parameter model while holding training FLOPs constant. Across scales, the approach showed lower perplexity and higher throughput than traditional dense FFNs, validating its efficacy.

    In conclusion, this research presents a significant contribution to optimizing large language models by introducing a hybrid structured FFN approach combined with self-guided training. This innovation addresses critical limitations of existing methods, resulting in improved training efficiency and model performance. The findings suggest that this advancement could propel AI research forward by making large-scale models more computationally efficient and accessible, thereby promoting sustainable and democratized AI development.

    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post Cutting Costs, Not Performance: Structured FeedForward Networks FFNs in Transformer-Based LLMs appeared first on MarkTechPost.
