    THUDM Releases GLM 4: A 32B Parameter Model Competing Head-to-Head with GPT-4o and DeepSeek-V3

    April 14, 2025

    In the rapidly evolving landscape of large language models (LLMs), researchers and organizations face significant challenges: enhancing reasoning ability, providing robust multilingual support, and efficiently handling complex, open-ended tasks. Smaller models are often more accessible and cost-effective, but they typically fall short of their larger counterparts in performance. Hence, there is growing emphasis on mid-sized models that balance computational efficiency with strong reasoning and instruction-following capability.

    The recent release of GLM 4 from Tsinghua University's THUDM, particularly the GLM-Z1-32B-0414 variant, addresses these challenges directly. Trained on a substantial dataset of 15 trillion tokens, GLM 4 offers reliable multilingual capabilities and incorporates a reasoning strategy referred to as "thinking mode." The release positions GLM 4 alongside models such as DeepSeek Distill, QwQ, and o1-mini, and it is distributed under the permissive MIT license. Notably, despite a moderate parameter count of 32 billion, GLM 4 reports performance comparable to much larger models such as GPT-4o and DeepSeek-V3 (the latter at 671 billion parameters), particularly on reasoning-centric benchmarks.
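
    For readers who want to try the model, the sketch below loads and queries it through the standard Hugging Face transformers API. This is a minimal sketch, assuming the checkpoint is published under the repository ID THUDM/GLM-Z1-32B-0414 and ships a chat template; consult the model card for exact requirements (some earlier GLM releases needed trust_remote_code=True).

    ```python
    # Minimal sketch: load GLM-Z1-32B-0414 via Hugging Face transformers.
    # Assumes the checkpoint is published as "THUDM/GLM-Z1-32B-0414" (check
    # the model card). A 32B model needs roughly 64 GB of weights in bf16,
    # so device_map="auto" shards it across available GPUs.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_ID = "THUDM/GLM-Z1-32B-0414"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

    messages = [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Greedy decoding; strip the prompt tokens before printing the reply.
    output = model.generate(input_ids, max_new_tokens=1024)
    print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
    ```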

    On a technical level, GLM-Z1-32B-0414 leverages extensive high-quality training data, including synthetically generated reasoning tasks, to strengthen its analytical capabilities. The model integrates techniques such as rejection sampling and reinforcement learning (RL) to improve performance on agent-based tasks, coding, function calling, and search-driven question answering. Its "Deep Reasoning Model" variant refines this further by combining cold-start initialization with extended RL training targeted specifically at complex mathematical, logical, and coding tasks, while pairwise ranking feedback during training enhances the model's general reasoning effectiveness.
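
    To make the rejection-sampling step concrete, here is a toy sketch of the idea: sample several candidate solutions per prompt, keep only those a verifier accepts, and collect the survivors as fine-tuning data. The generate_candidates and verify functions are hypothetical stand-ins, not part of any GLM tooling.

    ```python
    import random

    def generate_candidates(prompt: str, n: int = 8) -> list[str]:
        # Hypothetical generator: in practice, sample n completions from the
        # current model checkpoint at a nonzero temperature.
        return [f"... therefore the answer is {random.randint(1, 4)}" for _ in range(n)]

    def verify(candidate: str, gold_answer: str) -> bool:
        # Toy verifier: accept a candidate only if its final answer matches the
        # reference. Real pipelines use exact-match checkers, unit tests, or
        # other task-specific validators.
        return candidate.endswith(gold_answer)

    def rejection_sample(prompts: dict[str, str]) -> list[tuple[str, str]]:
        # Keep only verified completions; the surviving (prompt, solution)
        # pairs become supervised fine-tuning data.
        dataset = []
        for prompt, gold in prompts.items():
            accepted = [c for c in generate_candidates(prompt) if verify(c, gold)]
            dataset.extend((prompt, c) for c in accepted)
        return dataset

    print(rejection_sample({"What is 2 + 2?": "4"}))
    ```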

    An advanced variant, GLM-Z1-Rumination-32B-0414, introduces a novel approach termed “rumination,” enabling prolonged reflective reasoning for tackling open-ended, complex queries like comparative AI-driven urban analysis. This variant integrates advanced search tools with multi-objective reinforcement learning, significantly enhancing its utility in research-intensive tasks and complex retrieval-based scenarios. Complementing these larger models, the GLM-Z1-9B-0414 version, with its 9 billion parameters, provides strong mathematical and general reasoning capabilities, demonstrating the practicality of smaller-scale models.
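
    As a rough conceptual illustration only (the actual GLM-Z1-Rumination interface is not described here), a rumination-style loop can be pictured as a model alternating between issuing searches and reflecting on the results before committing to an answer. The llm and search callables below are hypothetical placeholders.

    ```python
    def ruminate(llm, search, question: str, max_rounds: int = 8) -> str:
        # Conceptual sketch: alternate between search and reflection until
        # the model commits to an answer or the round budget runs out.
        scratchpad = f"Question: {question}"
        for _ in range(max_rounds):
            step = llm(scratchpad + "\nNext action (SEARCH: <query> or ANSWER: <text>):")
            if step.startswith("SEARCH:"):
                query = step.removeprefix("SEARCH:").strip()
                # Fold retrieved evidence back into the reasoning context.
                scratchpad += f"\nSearched {query!r} -> {search(query)}"
            elif step.startswith("ANSWER:"):
                return step.removeprefix("ANSWER:").strip()
        # Budget exhausted: force a best-effort final answer.
        return llm(scratchpad + "\nGive your best final answer:")
    ```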

    Benchmark evaluations highlight the strengths of the GLM 4 series. GLM-4-32B-0414 shows robust results against GPT-4o, DeepSeek-V3, and Qwen2.5-Max across multiple benchmarks. On the IFEval instruction-following benchmark, it scores 87.6. On task-automation benchmarks such as TAU-Bench, it posts strong scores in the retail (68.7) and airline (51.2) scenarios. On SimpleQA, which evaluates search-augmented question answering, the model scores 88.1. It also closely matches GPT-4o on function calling as measured by the BFCL-v3 benchmark, with an overall score of 69.6. In practical code-repair scenarios tested through SWE-bench with the Moatless framework, GLM 4 achieves a success rate of 33.8%, underscoring its practical value.

    In summary, GLM 4 is an effective family of language models that bridges the performance gap between smaller, more accessible models and their traditionally superior larger-scale counterparts. The GLM-Z1 series, especially the 32B variant, exemplifies this balance by providing powerful reasoning capabilities while remaining computationally affordable. Combined with its permissive MIT license, GLM 4 is well positioned for research and enterprise applications that require high-performance AI without the computational overhead traditionally associated with larger models.


    Check out the GLM-Z1-32B-0414 model and the other GLM 4 models. All credit for this research goes to the researchers of this project.

    The post THUDM Releases GLM 4: A 32B Parameter Model Competing Head-to-Head with GPT-4o and DeepSeek-V3 appeared first on MarkTechPost.
