Tool

Added 29 Jun 2026 Last updated 29 Jun 2026 Read time 5 min

Microsoft Phi

Microsoft Phi is a family of small, open-weight language models built to stay capable at sizes that run on-device and cut inference cost.

aismall language modelsopen weightsmicrosofton-device

Connected Foundation Models LLM - Large Language Model Inference - Running AI Models in Production Mixture of Experts (MoE)Azure OpenAI - Enterprise GPT on Microsoft Cloud

Learn this your way

Read Guided course

Three small glowing spheres converging, representing a family of small, efficient language models. — Phi is a family of small models tuned so that quality does not have to scale with size.

Microsoft Phi is a family of small language models (SLMs) released as open weights under the MIT license. The models solve a specific problem: most capable large language models are big, slow, and expensive to run, which puts them out of reach for phones, laptops, and cost-sensitive workloads. Phi trades raw scale for carefully curated training data, aiming to keep quality high while the parameter count stays small enough to run on modest hardware.

A small language model is a foundation model with far fewer parameters than a frontier system. Parameters are the learned weights a model uses to generate output. Fewer parameters mean smaller memory footprint, faster inference , and lower cost per request. Microsoft’s bet with Phi is that data quality, not sheer size, drives much of a model’s usefulness. Phi models are trained on heavily filtered and synthetic “textbook-quality” data rather than the whole web.

Where Phi sits

Phi occupies the small end of the model-size spectrum. You reach for it when a frontier model is more than the task needs, or when the deployment target cannot host one.

Frontier models

GPT class Claude Highest capability, hosted, higher cost and latency

Mid-size open models

Llama Mistral Strong general models, still need server-class GPUs

Small language models

Phi-4 Phi-4-mini Phi-4-multimodal Runs on-device or on cheap GPUs, low latency, open weights

Deployment target

Laptop Phone Edge device Small cloud instance

The Phi family

Microsoft has shipped several generations. The current Phi-4 line covers a text model, a compact model, a multimodal model, and reasoning-tuned variants.

Phi-4 is a 14 billion parameter text model, first presented in December 2024. It is built on a decoder-only Transformer, was pretrained on roughly 10 trillion tokens of curated and synthetic data, and supports a 16k-token context length. Microsoft targeted mathematics and multi-step reasoning with this release.
Phi-4-mini is a 3.8 billion parameter model aimed at even lighter deployment.
Phi-4-multimodal is a 5.6 billion parameter model that handles speech, vision, and text in one model using a mixture-of-LoRAs design, with a 128k-token context length. Microsoft reports it ranked first on the Hugging Face OpenASR leaderboard with a 6.14% word error rate at the time of release.
Phi-4-reasoning (14B) and Phi-4-reasoning-plus (14B) are reasoning-tuned variants. Phi-4-reasoning-plus is further trained with reinforcement learning to spend more inference-time compute. Phi-4-reasoning-plus supports a 32k-token context by default.
Phi-4-mini-reasoning (3.8B) targets multi-step mathematical problem solving at small size.

Earlier generations remain available too. The Phi-3.5 line, released in August 2024, includes Phi-3.5-mini (3.82B), Phi-3.5-vision (4.15B), and Phi-3.5-MoE, a mixture-of-experts model with 41.9 billion total parameters that activates about 6.6 billion per token. All three support a 128k-token context.

How to access it

Phi models are open weights. You do not need a Microsoft account to download and run them.

Step 1 Pick a variant Match model size to hardware and task. Use mini for edge, Phi-4 for general text, multimodal for speech and vision.

→

Step 2 Get the weights Download from Hugging Face under the MIT license, or select the model in Azure AI Foundry.

→

Step 3 Run or host Run locally with common inference runtimes, or serve it as a managed endpoint through Azure.

The MIT license allows free use, modification, and distribution, including for commercial products. Phi-4 and the reasoning variants are published on Hugging Face and in the Azure AI Foundry catalog. If you already run other Microsoft-hosted models through Azure OpenAI Service , Foundry gives you Phi alongside them without changing clouds.

How it compares

Phi competes with other small and open model families. The comparison below is about positioning, not a benchmark ranking.

	Phi-4	Mistral small models	DeepSeek distills
Maker	Microsoft	Mistral AI	DeepSeek
Size focus	3.8B to 14B	Small to mid	Distilled small variants
License	MIT (open weights)	Open weights on many models	Open weights on many models
Strength	Reasoning at small size	General European multilingual	Distilled reasoning
Best for	On-device, cost-sensitive apps	Broad general use	Reasoning on a budget

For the mid-size and multilingual end, see Mistral AI . For distilled reasoning models released as open weights, see DeepSeek .

When not to use it

Small models trade capability for size. Phi is the wrong choice when:

You need frontier-level breadth. For the hardest open-ended reasoning, broad world knowledge, or long complex documents, a large model still leads. Phi-4’s base text context is 16k tokens, smaller than many hosted frontier models.
You need the widest tool and ecosystem support. Frontier hosted APIs ship mature tool-calling, function-calling, and safety tooling. Verify Phi’s support for your exact features before committing.
Accuracy on rare edge cases is safety-critical. A smaller parameter count means less capacity to memorise long-tail facts. Add retrieval or human review for high-stakes output.
You have no capacity to self-host and want a fully managed frontier experience. In that case a hosted API may be less operational work, even at higher cost per call.

Match the model to the job. Phi shines when latency, cost, or on-device privacy matter more than absolute peak capability.

Sources

Phi Open Models, Microsoft Azure : official product page for the Phi family.
Empowering innovation: the next generation of the Phi family, Microsoft Azure Blog : Phi-4-mini and Phi-4-multimodal announcement.
Microsoft launches Phi-4-reasoning-plus, VentureBeat : reasoning variant sizes and context length.
Microsoft AI released Phi-4 under the MIT license, MarkTechPost : open weights and MIT licensing.
Microsoft AI releases Phi-3.5 mini, MoE and Vision, MarkTechPost : Phi-3.5 family sizes and context.
microsoft/phi-4, Hugging Face : model card for the 14B text model.

Open source projects

Freelancer Templates Contracts, proposals, SOWs

Freelancer Automation Workflow recipes, AI playbooks

Work with Linda

Workshop Series €2,000/mo x 3

1:1 Consulting 60 min session