Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      IBM launches new integration to help unify AI security and governance

      June 18, 2025

      Meet Accessible UX Research, A Brand-New Smashing Book

      June 18, 2025

      Modernizing your approach to governance, risk and compliance

      June 18, 2025

      ScyllaDB X Cloud’s autoscaling capabilities meet the needs of unpredictable workloads in real time

      June 17, 2025

      RAIDOU Remastered: The Mystery of the Soulless Army Review (PC) – A well-done action-RPG remaster that makes me hopeful for more revivals of classic Atlus titles

      June 18, 2025

      With Windows 10 circling the drain, Windows 11 sees a long-overdue surge

      June 18, 2025

      This PC app boosts FPS in any game on any GPU for only $7 — and it just got a major update

      June 18, 2025

      Sam Altman claims Meta is trying to poach OpenAI staffers with $100 million bonuses, but “none of our best people have decided to take them up on that”

      June 18, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      Why Inclusive Design Solutions Are important for Accessibility

      June 18, 2025
      Recent

      Why Inclusive Design Solutions Are important for Accessibility

      June 18, 2025

      Microsoft Copilot for Power Platform

      June 18, 2025

      Integrate Coveo Atomic CLI-Based Hosted Search Page into Adobe Experience Manager (AEM)

      June 18, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      RAIDOU Remastered: The Mystery of the Soulless Army Review (PC) – A well-done action-RPG remaster that makes me hopeful for more revivals of classic Atlus titles

      June 18, 2025
      Recent

      RAIDOU Remastered: The Mystery of the Soulless Army Review (PC) – A well-done action-RPG remaster that makes me hopeful for more revivals of classic Atlus titles

      June 18, 2025

      With Windows 10 circling the drain, Windows 11 sees a long-overdue surge

      June 18, 2025

      This PC app boosts FPS in any game on any GPU for only $7 — and it just got a major update

      June 18, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Machine Learning»Why Small Language Models (SLMs) Are Poised to Redefine Agentic AI: Efficiency, Cost, and Practical Deployment

    Why Small Language Models (SLMs) Are Poised to Redefine Agentic AI: Efficiency, Cost, and Practical Deployment

    June 18, 2025

    The Shift in Agentic AI System Needs

    LLMs are widely admired for their human-like capabilities and conversational skills. However, with the rapid growth of agentic AI systems, LLMs are increasingly being utilized for repetitive, specialized tasks. This shift is gaining momentum—over half of major IT companies now use AI agents, with significant funding and projected market growth. These agents rely on LLMs for decision-making, planning, and task execution, typically through centralized cloud APIs. Massive investments in LLM infrastructure reflect confidence that this model will remain foundational to AI’s future. 

    SLMs: Efficiency, Suitability, and the Case Against Over-Reliance on LLMs

    Researchers from NVIDIA and Georgia Tech argue that small language models (SLMs) are not only powerful enough for many agent tasks but also more efficient and cost-effective than large models. They believe SLMs are better suited for the repetitive and simple nature of most agentic operations. While large models remain essential for more general, conversational needs, they propose using a mix of models depending on task complexity. They challenge the current reliance on LLMs in agentic systems and offer a framework for transitioning from LLMs to SLMs. They invite open discussion to encourage more resource-conscious AI deployment. 

    Why SLMs are Sufficient for Agentic Operations

    The researchers argue that SLMs are not only capable of handling most tasks within AI agents but are also more practical and cost-effective than LLMs. They define SLMs as models that can run efficiently on consumer devices, highlighting their strengths—lower latency, reduced energy consumption, and easier customization. Since many agent tasks are repetitive and focused, SLMs are often sufficient and even preferable. The paper suggests a shift toward modular agentic systems using SLMs by default and LLMs only when necessary, promoting a more sustainable, flexible, and inclusive approach to building intelligent systems. 

    Arguments for LLM Dominance

    Some argue that LLMs will always outperform small models (SLMs) in general language tasks due to superior scaling and semantic abilities. Others claim centralized LLM inference is more cost-efficient due to economies of scale. There is also a belief that LLMs dominate simply because they had an early start, drawing the majority of the industry’s attention. However, the study counters that SLMs are highly adaptable, cheaper to run, and can handle well-defined subtasks in agent systems effectively. Still, the broader adoption of SLMs faces hurdles, including existing infrastructure investments, evaluation bias toward LLM benchmarks, and lower public awareness. 

    Framework for Transitioning from LLMs to SLMs

    To smoothly shift from LLMs to smaller, specialized ones (SLMs) in agent-based systems, the process starts by securely collecting usage data while ensuring privacy. Next, the data is cleaned and filtered to remove sensitive details. Using clustering, common tasks are grouped to identify where SLMs can take over. Based on task needs, suitable SLMs are chosen and fine-tuned with tailored datasets, often utilizing efficient techniques such as LoRA. In some cases, LLM outputs guide SLM training. This isn’t a one-time process—models should be regularly updated and refined to stay aligned with evolving user interactions and tasks. 

    Conclusion: Toward Sustainable and Resource-Efficient Agentic AI

    In conclusion, the researchers believe that shifting from large to SLMs could significantly improve the efficiency and sustainability of agentic AI systems, especially for tasks that are repetitive and narrowly focused. They argue that SLMs are often powerful enough, more cost-effective, and better suited for such roles compared to general-purpose LLMs. In cases requiring broader conversational abilities, using a mix of models is recommended. To encourage progress and open dialogue, they invite feedback and contributions to their stance, committing to share responses publicly. The goal is to inspire more thoughtful and resource-efficient use of AI technologies in the future. 


    Check out the Paper. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don’t forget to join our 100k+ ML SubReddit and Subscribe to our Newsletter.

    The post Why Small Language Models (SLMs) Are Poised to Redefine Agentic AI: Efficiency, Cost, and Practical Deployment appeared first on MarkTechPost.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleHow to Build an Advanced BrightData Web Scraper with Google Gemini for AI-Powered Data Extraction
    Next Article Meeting summarization and action item extraction with Amazon Nova

    Related Posts

    Machine Learning

    How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

    June 18, 2025
    Machine Learning

    Accelerate threat modeling with generative AI

    June 18, 2025
    Leave A Reply Cancel Reply

    For security, use of Google's reCAPTCHA service is required which is subject to the Google Privacy Policy and Terms of Use.

    Continue Reading

    Multimodal Queries Require Multimodal RAG: Researchers from KAIST and DeepAuto.ai Propose UniversalRAG—A New Framework That Dynamically Routes Across Modalities and Granularities for Accurate and Efficient Retrieval-Augmented Generation

    Machine Learning

    CVE-2025-27524 – Hitachi JP1/IT Desktop Management 2 – Smart Device Manager Weak Encryption Vulnerability

    Common Vulnerabilities and Exposures (CVEs)

    This new Android 16 feature might give a big performance boost to older phones – here’s how

    News & Updates

    You’ll hate Microsoft’s new Outlook less — after facing an annoying Outlook Classic bug cranking CPU usage to 50% in Windows 11 when typing

    News & Updates

    Highlights

    Critical RCE Flaws in MICI NetFax Server Unpatched, Vendor Refuses Fix

    June 1, 2025

    Critical RCE Flaws in MICI NetFax Server Unpatched, Vendor Refuses Fix

    Image: Rapid7
    Security researchers at Rapid7 have uncovered a troubling trio of vulnerabilities in MICI Network Co., Ltd.’s NetFax server (versions
    Read more

    Published Date:
    Jun 02, 2025 (3 hours, 1 minute ago)

    Vulnerabilities has been mentioned in this article.

    CVE-2025-48047

    CVE-2025-48046

    CVE-2025-48045

    CVE-2024-8456

    CVE-2024-38094

    CVE-2022-26923

    Zas is a simple static site generator

    June 17, 2025

    Using Ollama to Run LLMs Locally [FREE]

    April 16, 2025

    CVE-2025-46220 – Apache HTTP Server Unvalidated User Input

    April 23, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.