
    OpenAI Just Released the Hottest Open-Weight LLMs: gpt-oss-120B (Runs on a High-End Laptop) and gpt-oss-20B (Runs on a Phone)

    August 6, 2025

OpenAI has just sent seismic waves through the AI world: for the first time since GPT-2 hit the scene in 2019, the company is releasing not one, but TWO open-weight language models. Meet gpt-oss-120b and gpt-oss-20b—models that anyone can download, inspect, fine-tune, and run on their own hardware. This launch doesn’t just shift the AI landscape; it opens a new era of transparency, customization, and raw computational power for researchers, developers, and enthusiasts everywhere.

    Why Is This Release a Big Deal?

    OpenAI has long cultivated a reputation for both jaw-dropping model capabilities and a fortress-like approach to proprietary tech. That changed on August 5, 2025. These new models are distributed under the permissive Apache 2.0 license, making them open for commercial and experimental use. The difference? Instead of hiding behind cloud APIs, anyone can now put OpenAI-grade models under their microscope—or put them directly to work on problems at the edge, in enterprise, or even on consumer devices.

    Meet the Models: Technical Marvels with Real-World Muscle

    gpt-oss-120B

    • Size: 117 billion parameters (with 5.1 billion active parameters per token, thanks to Mixture-of-Experts tech)
    • Performance: Punches at the level of OpenAI’s o4-mini (or better) in real-world benchmarks.
    • Hardware: Runs on a single high-end GPU—think Nvidia H100, or 80GB-class cards. No server farm required.
    • Reasoning: Features chain-of-thought and agentic capabilities—ideal for research automation, technical writing, code generation, and more.
    • Customization: Supports configurable “reasoning effort” (low, medium, high), so you can dial up power when needed or save resources when you don’t.
    • Context: Handles up to a massive 128,000 tokens—enough text to read entire books at a time.
    • Fine-Tuning: Built for easy customization and local/private inference—no rate limits, full data privacy, and total deployment control.
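The configurable reasoning effort mentioned above is selected through the system prompt rather than a dedicated API parameter. As a minimal sketch (the exact `Reasoning: <level>` convention follows OpenAI's published examples; treat it as an assumption for your particular serving stack), a chat request might be assembled like this:

```python
def build_messages(question: str, effort: str = "medium") -> list[dict]:
    """Build a chat request that selects a reasoning-effort level.

    gpt-oss models read the effort level from the system prompt;
    the "Reasoning: low/medium/high" wording here mirrors OpenAI's
    examples and may vary by serving framework.
    """
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unknown effort level: {effort}")
    return [
        {"role": "system", "content": f"Reasoning: {effort}"},
        {"role": "user", "content": question},
    ]

# Dial effort up for a hard problem, down for a cheap lookup.
messages = build_messages("Prove that sqrt(2) is irrational.", effort="high")
```

Sending `messages` to any OpenAI-compatible local endpoint then trades latency for deeper chain-of-thought, with no code changes beyond the system line.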

    gpt-oss-20B

    • Size: 21 billion parameters (with 3.6 billion active parameters per token, also Mixture-of-Experts).
    • Performance: Sits squarely between o3-mini and o4-mini in reasoning tasks—on par with the best “small” models available.
    • Hardware: Runs on consumer-grade laptops—with just 16GB RAM or equivalent, it’s the most powerful open-weight reasoning model you can fit on a phone or local PC.
    • Mobile Ready: Specifically optimized to deliver low-latency, private on-device AI for smartphones (including Qualcomm Snapdragon support), edge devices, and any scenario needing local inference minus the cloud.
    • Agentic Powers: Like its big sibling, 20B can use APIs, generate structured outputs, and execute Python code on demand.
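To make the agentic side concrete, here is a sketch of a tool-calling request for a local OpenAI-compatible server. The model tag `gpt-oss:20b` matches Ollama's published naming at launch, and the `get_weather` tool is purely illustrative:

```python
def build_tool_request(prompt: str) -> dict:
    """Assemble a chat request that offers the model one callable tool.

    The payload follows the OpenAI chat-completions tool schema; the
    model tag and the example tool are assumptions for illustration.
    """
    return {
        "model": "gpt-oss:20b",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

payload = build_tool_request("What's the weather in Berlin?")
```

POSTing this to a local `/v1/chat/completions` endpoint would let the 20B model decide whether to answer directly or emit a structured `get_weather` call—the same loop that powers its API use and code execution.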

    Technical Details: Mixture-of-Experts and MXFP4 Quantization

    Both models use a Mixture-of-Experts (MoE) architecture, only activating a handful of “expert” subnetworks per token. The result? Enormous parameter counts with modest memory usage and lightning-fast inference—perfect for today’s high-performance consumer and enterprise hardware.
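The routing idea can be sketched in a few lines of NumPy. This is a toy layer, not the models' actual architecture: a learned router scores every expert per token, but only the top-k experts actually run, which is exactly how a 117B-parameter model ends up activating only ~5.1B parameters per token:

```python
import numpy as np

def moe_forward(x, router_w, experts, k=2):
    """Toy Mixture-of-Experts layer: run only each token's top-k experts."""
    logits = x @ router_w                        # (tokens, n_experts) router scores
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k chosen experts
    gates = np.take_along_axis(logits, topk, axis=-1)
    gates = np.exp(gates) / np.exp(gates).sum(axis=-1, keepdims=True)  # softmax gate
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                  # only k expert matmuls per token
        for j, e in enumerate(topk[t]):
            out[t] += gates[t, j] * (x[t] @ experts[e])
    return out, topk

rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 3
x = rng.normal(size=(tokens, d))
router_w = rng.normal(size=(d, n_experts))
experts = rng.normal(size=(n_experts, d, d))    # each expert is one weight matrix
y, chosen = moe_forward(x, router_w, experts, k=2)
```

With k=2 of 4 experts, each token pays for half the expert compute while the model keeps the full parameter count available; the production models scale this same trick to many more, much larger experts.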

    Add to that native MXFP4 quantization, shrinking model memory footprints without sacrificing accuracy. The 120B model fits snugly onto a single advanced GPU; the 20B model can run comfortably on laptops, desktops, and even mobile hardware.
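The memory arithmetic behind that claim is easy to check. MXFP4 stores weights as 4-bit floats with a shared scale per small block; assuming one 8-bit scale per 32 elements (the usual microscaling layout), the effective cost is roughly 4.25 bits per parameter:

```python
def model_gb(params: float, bits_per_param: float) -> float:
    """Approximate weight-storage footprint in GiB."""
    return params * bits_per_param / 8 / 1024**3

PARAMS_120B = 117e9  # gpt-oss-120B total parameter count

fp16_gb = model_gb(PARAMS_120B, 16)     # ~218 GiB: far beyond any single GPU
mxfp4_gb = model_gb(PARAMS_120B, 4.25)  # ~58 GiB: fits an 80 GB-class card
```

So the same 117B parameters that would need several GPUs in fp16 squeeze under the 80 GB ceiling of a single H100 in MXFP4—KV cache and activations add overhead on top, but the weights themselves fit.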

    Real-World Impact: Tools for Enterprise, Developers, and Hobbyists

    • For Enterprises: On-premises deployment for data privacy and compliance. No more black-box cloud AI: financial, healthcare, and legal sectors can now own and secure every bit of their LLM workflow.
    • For Developers: Freedom to tinker, fine-tune, and extend. No API limits, no SaaS bills, just pure, customizable AI with full control over latency or cost.
    • For the Community: Models are already available on Hugging Face, Ollama, and more—go from download to deployment in minutes.

    How Does GPT-OSS Stack Up?

    Here’s the kicker: gpt-oss-120B is the first freely available open-weight model that matches the performance of top-tier commercial models like o4-mini. The 20B variant not only bridges the performance gap for on-device AI but will likely accelerate innovation and push boundaries on what’s possible with local LLMs.

    The Future Is Open (Again)

    OpenAI’s GPT-OSS isn’t just a release; it’s a clarion call. By making state-of-the-art reasoning, tool use, and agentic capabilities available for anyone to inspect and deploy, OpenAI throws open the door to an entire community of makers, researchers, and enterprises—not just to use, but to build on, iterate, and evolve.


Check out the gpt-oss-120B and gpt-oss-20B model releases and OpenAI’s technical blog post for full details.

    The post OpenAI Just Released the Hottest Open-Weight LLMs: gpt-oss-120B (Runs on a High-End Laptop) and gpt-oss-20B (Runs on a Phone) appeared first on MarkTechPost.
