Microsoft launches Phi-3 Mini, a tiny but powerful LM

Microsoft launched Phi-3 Mini, a tiny language model that is part of the company’s strategy to develop lightweight, function-specific AI models.

Language models have progressed through ever-larger parameter counts, training datasets, and context windows. Scaling up delivered more powerful capabilities, but at a cost.

The traditional approach to training an LLM is to have it consume massive amounts of data, which requires huge computing resources. Training an LLM like GPT-4, for example, is estimated to have taken around 3 months and to have cost over $21m.

GPT-4 is a great solution for tasks that require complex reasoning, but it is overkill for simpler tasks like content creation or a sales chatbot. It’s like using a Swiss Army knife when all you need is a simple letter opener.

At only 3.8B parameters, Phi-3 Mini is tiny. Still, Microsoft says it is an ideal lightweight, low-cost solution for tasks like summarizing a document, extracting insights from reports, and writing product descriptions or social media posts.

The MMLU benchmark figures show Phi-3 Mini and the yet-to-be-released larger Phi-3 models beating models like Mistral 7B and Gemma 7B.

Phi-3 models’ performance on the Massive Multitask Language Understanding (MMLU) benchmark compared to other models of similar size. Source: Microsoft

Microsoft says Phi-3-small (7B parameters) and Phi-3-medium (14B parameters) will be available in the Azure AI Model Catalog “shortly”.

Larger models like GPT-4 are still the gold standard and we can probably expect that GPT-5 will be even bigger.

SLMs like Phi-3 Mini offer some important benefits that larger models don’t. SLMs are cheaper to fine-tune, require less compute, and could run on-device even in situations where no internet access is available.

Deploying an SLM at the edge results in less latency and maximum privacy because there’s no need to send data back and forth to the cloud.

Here’s Sebastien Bubeck, VP of GenAI research at Microsoft AI, with a demo of Phi-3 Mini. It’s super fast and impressive for such a small model.

phi-3 is here, and it’s … good :-).

I made a quick short demo to give you a feel of what phi-3-mini (3.8B) can do. Stay tuned for the open weights release and more announcements tomorrow morning!

(And ofc this wouldn’t be complete without the usual table of benchmarks!) pic.twitter.com/AWA7Km59rp

- Sebastien Bubeck (@SebastienBubeck) April 23, 2024

Curated synthetic data

Phi-3 Mini is a result of discarding the idea that huge amounts of data are the only way to train a model.

Sebastien Bubeck, Microsoft vice president of generative AI research, asked, “Instead of training on just raw web data, why don’t you look for data which is of extremely high quality?”

Microsoft Research machine learning expert Ronen Eldan was reading bedtime stories to his daughter when he wondered if a language model could learn using only words a 4-year-old could understand.

This led to an experiment in which the researchers started with a vocabulary of 3,000 words. Using only this limited vocabulary, they prompted an LLM to create millions of short children’s stories, which were compiled into a dataset called TinyStories.
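As a rough illustration of the idea, the sketch below builds story prompts from a small fixed vocabulary. It is not the researchers' actual pipeline: the word list, the prompt wording, the number of words per prompt, and the function names are all assumptions made for the example, and the call to an LLM is left out entirely.

```python
import random

# Hypothetical stand-in for the limited vocabulary; the real word list
# is not given in the article.
VOCABULARY = [
    "dog", "ball", "happy", "run", "tree",
    "sad", "jump", "cake", "little", "sing",
]


def build_story_prompt(vocabulary, rng, words_per_prompt=3):
    """Pick a few words at random and wrap them in a story-writing instruction."""
    required = rng.sample(vocabulary, words_per_prompt)
    return (
        "Write a short story using only words a 4-year-old would understand. "
        "The story must include these words: " + ", ".join(required) + "."
    )


def build_prompts(vocabulary, count, seed=0):
    """Build `count` prompts; a fixed seed makes the batch reproducible."""
    rng = random.Random(seed)
    return [build_story_prompt(vocabulary, rng) for _ in range(count)]


if __name__ == "__main__":
    for prompt in build_prompts(VOCABULARY, 3):
        print(prompt)
```

Varying the required words from prompt to prompt is one simple way to get diverse stories out of a deliberately narrow vocabulary; each prompt would then be sent to an LLM and the returned story added to the dataset.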

The researchers then used TinyStories to train an extremely small 10M-parameter model, which was subsequently able to generate “fluent narratives with perfect grammar.”

They continued to iterate and scale this synthetic data generation approach to create more advanced, but carefully curated and filtered synthetic datasets that were eventually used to train Phi-3 Mini.

The result is a tiny model that will be more affordable to run while offering performance comparable to GPT-3.5.

Smaller but more capable models will see companies move away from simply defaulting to large LLMs like GPT-4. We could also soon see solutions where an LLM handles the heavy lifting but delegates simpler tasks to lightweight models.
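A minimal sketch of what such delegation could look like is shown below. It assumes a simple rule-based router: the task labels, the character limit, and the "small" and "large" tier names are hypothetical and chosen only for illustration, and no real model API is called.

```python
from dataclasses import dataclass

# Hypothetical labels for the kinds of simple tasks mentioned earlier
# (summaries, extracting insights, product descriptions, social posts).
SIMPLE_TASKS = {"summarize", "extract", "product_description", "social_post"}


@dataclass
class Request:
    task: str
    text: str


def choose_model(request, max_small_chars=8000):
    """Return which model tier should handle the request.

    Simple tasks on reasonably short inputs go to the small model;
    everything else falls back to the large model.
    """
    if request.task in SIMPLE_TASKS and len(request.text) <= max_small_chars:
        return "small"
    return "large"


if __name__ == "__main__":
    print(choose_model(Request("summarize", "Quarterly report text")))
    print(choose_model(Request("complex_reasoning", "Prove this claim")))
```

In practice the routing decision might itself be made by a model rather than a fixed rule, but the fallback structure, where anything uncertain goes to the larger model, would likely stay the same.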

The post Microsoft launches Phi-3 Mini, a tiny but powerful LM appeared first on DailyAI.
