An aerial view of a dark circuit board with a red trace network, representing a widely used open model family.
Qwen sits deep in the open-model supply chain, powering derivatives and applications well beyond Alibaba's own products.

Qwen is the family of large language models developed by the Qwen team at Alibaba Cloud, first launched in April 2023 under the Chinese name Tongyi Qianwen. Many Qwen models ship as open weights under the permissive Apache 2.0 license, which means you can download, run, fine-tune, and redistribute them for commercial use without royalty. That combination of capability and open licensing has made Qwen one of the most downloaded and forked model families in the open-model ecosystem, with hundreds of thousands of derivative variations published on Hugging Face.

The problem Qwen solves is access. Frontier-quality foundation models are usually locked behind proprietary APIs. Qwen gives teams a route to run competitive models on their own infrastructure, keep data in their own environment, and avoid per-token API lock-in, while still offering a hosted API for teams that prefer it.

Where Qwen sits in the stack

Applications
Chat assistants Agents RAG pipelines Qwen powers Alibaba products and third-party apps
Access layer
Open weights Alibaba Cloud Model Studio API Local runtimes
Model family
Qwen3 dense Qwen3 MoE Qwen-VL Qwen-Coder Text, vision, audio, code and math variants
Infrastructure
Your own GPUs Cloud GPU rental Alibaba Cloud

The model family

Qwen has grown into a broad family rather than a single model. The main lines are:

  • Base text models: general-purpose language models, evolving through Qwen, Qwen2, Qwen2.5, and Qwen3.
  • Qwen3: announced in April 2025, released in both dense sizes (0.6B, 1.7B, 4B, 8B, 14B, 32B) and Mixture-of-Experts sizes (30B-A3B and 235B-A22B, where the second number is the active parameter count). Qwen3 supports both a thinking mode for step-by-step reasoning and a faster non-thinking mode.
  • Qwen-VL and Qwen2-VL: vision-language models that read images alongside text.
  • Qwen2.5-Coder: code-focused models for generation and completion.
  • Specialized variants: including math and audio-focused releases, plus multimodal releases that handle voice.

Alibaba also runs proprietary, API-only models such as Qwen3-Max that are not distributed as open weights. Licensing varies across the family: most open-weight releases use Apache 2.0, while some releases use the Qwen License or a research-only license. Check the license on each specific model card before you build on it.

How to access it

You can use Qwen in three ways, depending on how much control you need.

  1. Download the open weights. Open-weight Qwen models are published on Hugging Face and ModelScope. You pull the weights and run them on hardware you control. This keeps your prompts and data in your own environment.
  2. Run it locally or self-host. Qwen open-weight models run through common runtimes such as llama.cpp, Ollama, and LM Studio for local use, or through serving stacks like vLLM for production. You can also fine-tune the open weights on your own data.
  3. Call the hosted API. Alibaba Cloud Model Studio exposes Qwen models, including proprietary variants, over an API. This route removes the need to manage GPUs and gives you access to models that are not released as open weights.
Step 1 Pick a model Choose a size and variant. Small dense models for local use, MoE models for higher capability.
Step 2 Check the license Confirm Apache 2.0 or the applicable license on the model card before commercial use.
Step 3 Deploy or call the API Serve the weights on your own GPUs, or call Alibaba Cloud Model Studio.
Step 4 Adapt Fine-tune on your data or wire the model into an agent or RAG pipeline.

How it compares

Qwen competes most directly with other open-weight model families. The table compares it against three widely used alternatives.

Alibaba QwenMeta LlamaMistral AIDeepSeek
MakerAlibaba CloudMetaMistral AI (France)DeepSeek (China)
Open weightsYes, most releasesYesYes, several modelsYes
Common licenseApache 2.0 (varies)Llama community licenseApache 2.0 (varies)MIT and others (varies)
ArchitecturesDense and MoEDense and MoEDense and MoEDense and MoE
Best forMultilingual, wide size rangeLarge community, toolingEuropean hosting, efficiencyReasoning, cost efficiency

For a broader view of where these families sit, see the 2026 LLM landscape comparison , Mistral AI , and DeepSeek .

When not to use it

  • You need a fully managed frontier product with enterprise support baked in. A proprietary hosted API from a single vendor may fit your procurement and support needs better than self-hosting open weights.
  • You have strict data-residency or vendor-governance rules that exclude the provider. Some organizations restrict models developed by specific companies or countries. Confirm your policy before adopting Qwen through the hosted API.
  • You cannot run the model you want. The largest MoE models need serious GPU capacity. If you lack that hardware and do not want to pay for the hosted API, a smaller model or a different provider may suit you.
  • The license does not permit your use. Not every Qwen release is Apache 2.0. If a model ships under a research-only or restricted license, do not use it commercially.

Further reading

Sources