Page Pulse
Qwen models in brief: the family spans text, code, vision, and audio branches across multiple generations (Qwen through Qwen2.5 and beyond); each generation ships a wide parameter sweep from ~0.5B to 100B+; base and instruction-tuned variants ship at each size; official weights live on the Qwen Hugging Face organisation; version naming follows a consistent pattern once you know the conventions.
How the Qwen model family is organised
The Qwen family tree has four main branches — text, code, vision, and audio — each with multiple generations and a wide parameter sweep at each release.
The Qwen model family has grown substantially since the first public release. Understanding the full set requires a mental model of four orthogonal axes: the generation (the original Qwen through Qwen2.5 and later), the branch (text, code, vision, audio), the parameter size (roughly 0.5B through 100B+), and the training stage (base versus instruction-tuned). Any given model release sits at a specific combination of those four axes.
The generation axis is the most visible in the naming. Qwen2.5 is a more capable and more refined release than Qwen2, which improved on Qwen1.5, which improved on the original Qwen release. Within a generation, the team ships updates in parallel across branches — so Qwen2.5, Qwen2.5-Coder, and Qwen2.5-VL are separate model lines sharing a generation number but fine-tuned for different task types.
The parameter sweep within a generation is unusually wide. While many open-weight families focus their release on two or three sizes, Qwen typically ships something like 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B within the same generation, plus occasionally a larger flagship. That breadth means there is a Qwen variant for almost any hardware constraint, from a mobile device to a server cluster.
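As a rough illustration of what that sweep means in hardware terms, the sketch below estimates weight memory from parameter count alone. This is a rule of thumb only (it ignores activations and KV cache), and the size list simply mirrors the representative sweep named above:

```python
# Rough memory-footprint estimate for each Qwen size class.
# Rule of thumb: weights ~= parameter count x bytes per parameter.
# Ignores activations and KV cache, so treat the numbers as lower bounds.

SIZES_B = [0.5, 1.5, 3, 7, 14, 32, 72]  # parameter counts in billions

def weight_gib(params_b, bytes_per_param):
    """Approximate weight memory in GiB for a given precision."""
    return params_b * 1e9 * bytes_per_param / 2**30

for size in SIZES_B:
    fp16 = weight_gib(size, 2.0)   # bf16/fp16: 2 bytes per parameter
    int4 = weight_gib(size, 0.5)   # 4-bit quantised: ~0.5 bytes per parameter
    print(f"{size:>5}B  fp16 ~ {fp16:6.1f} GiB   int4 ~ {int4:5.1f} GiB")
```

By this estimate a 7B model needs roughly 13 GiB of weight memory at fp16 but only ~3 GiB at 4-bit, which is why the small end of the sweep fits on laptops and phones while 72B remains server territory.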
How to read Qwen version naming
The naming convention is systematic once decoded: family name + generation + optional branch + parameter count + base or instruct stage.
A full Qwen model name like "Qwen2.5-Coder-14B-Instruct" decomposes as follows: Qwen is the family name; 2.5 is the generation; Coder indicates the code-specialised branch; 14B is the parameter count; Instruct means the instruction-tuned variant. Official base models carry no stage suffix at all: a name like "Qwen2.5-72B" denotes the Qwen 2.5 generation general text model at 72B parameters in its base (pre-trained, not instruction-tuned) form. Note also that the earliest generations used a "-Chat" suffix where later generations use "-Instruct".
Some releases add further suffixes. "-GPTQ-Int4" indicates a GPTQ 4-bit quantised build, "-AWQ" an AWQ quantised build, and "-GGUF" a GGUF build for llama.cpp. The official Qwen organisation publishes some of these quantised variants itself; many others are community builds hosted under separate Hugging Face accounts. The unquantised official releases use the base naming without suffixes.
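The convention is regular enough to parse mechanically. The sketch below is a hypothetical parser (the regular expression and the `parse_qwen_name` helper are illustrative, not an official tool) covering the generation, branch, size, stage, and quantisation components described above:

```python
import re

# Hypothetical parser for the naming convention described above:
# family + generation + optional branch + size + stage + optional quant suffix.
NAME_RE = re.compile(
    r"^Qwen(?P<gen>[\d.]+)?"                       # generation, e.g. 2.5 (absent on gen 1)
    r"(?:-(?P<branch>Coder|VL|Audio|Math))?"       # optional specialised branch
    r"-(?P<size>\d+(?:\.\d+)?B)"                   # parameter count, e.g. 14B or 0.5B
    r"(?:-(?P<stage>Instruct|Chat))?"              # tuned stage; absent means base
    r"(?:-(?P<quant>GPTQ-Int[48]|AWQ|GGUF))?$"     # optional quantisation suffix
)

def parse_qwen_name(name):
    """Split a Qwen model name into its components; raise on unknown shapes."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised Qwen model name: {name}")
    parts = m.groupdict()
    parts["stage"] = parts["stage"] or "Base"  # no suffix means the base model
    return parts

print(parse_qwen_name("Qwen2.5-Coder-14B-Instruct"))
print(parse_qwen_name("Qwen2.5-72B"))
```

Names like "Qwen-7B-Chat" (first generation, no generation number) and quantised names like "Qwen2.5-7B-Instruct-GPTQ-Int4" also fit the same pattern.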
Generations and branches at a glance
A quick map of the Qwen branches (the four main branches plus the Math specialisation) and how they have evolved across generations, with notes on where each branch's releases currently live.
The text branch is the core of the Qwen model family. It covers the general-purpose base models and instruction-tuned chat models. Each generation has produced base and instruct variants at the full parameter sweep. The instruction-tuned models in this branch are the right starting point for general conversation, summarisation, translation, and reasoning tasks.
The code branch (Qwen-Coder) ships alongside the text branch with each generation. It continues pre-training on code-heavy data and adds fill-in-the-middle training. Recent Qwen-Coder releases have scored competitively with the best open-weight code models at equivalent parameter counts. The coding plan page on this site covers the Qwen-Coder branch in more depth.
The vision-language branch (Qwen-VL) adds image understanding to the text capability. Qwen-VL models accept image plus text inputs and return text. They are suited for chart analysis, document understanding, visual question answering, and image description. The VL branch has improved multimodal integration with each generation and is the primary Qwen entry point for multimodal applications.
The audio branch (Qwen-Audio) adds speech and audio understanding. It is the smallest and most recently added branch, with fewer parameter size options than the text and code branches. For teams building voice or audio-processing applications, the Qwen-Audio releases are the relevant starting point. For teams formally evaluating Qwen models against alternatives, NIST's AI Risk Management Framework offers useful evaluation-methodology context, and research on multimodal model assessment from UC Berkeley's BAIR is another relevant reference.
| Family branch | Release names (representative) | Parameter classes shipped |
|---|---|---|
| Text (base + instruct) | Qwen, Qwen1.5, Qwen2, Qwen2.5 — Base and Instruct variants | 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B, 110B (flagship varies by gen) |
| Code (Qwen-Coder) | CodeQwen1.5, Qwen2.5-Coder — Base and Instruct variants | 0.5B, 1.5B, 3B, 7B, 14B, 32B (sweep varies by generation) |
| Vision-language (Qwen-VL) | Qwen-VL, Qwen2-VL, Qwen2.5-VL — Instruct primary | 3B, 7B, 72B in recent generations; earlier gens had single-size releases |
| Audio (Qwen-Audio) | Qwen-Audio, Qwen2-Audio — Instruct primary | 7B class primary; fewer size options than text or code branch |
| Math (Qwen-Math) | Qwen2.5-Math — Base and Instruct variants | 1.5B, 7B, 72B; specialised for mathematical reasoning tasks |
Where Qwen model weights live
The primary distribution channel for Qwen models is the official Qwen organisation on Hugging Face. Each model has its own repository containing the weights in safetensors format, the tokeniser files, the model card, and the LICENSE file. Weights are free to download subject to the applicable license: Apache 2.0 on most recent text and code releases, with a custom Qwen license on certain sizes (for example, the 3B and 72B models in the Qwen2.5 generation) and on some older or specialised variants.
Community quantised builds — GGUF, AWQ, and GPTQ variants — are often found under separate community accounts on Hugging Face in addition to those the official Qwen organisation publishes. The Ollama model library includes a curated set of Qwen builds that can be pulled with a single command. For teams running Qwen via vLLM or llama-server, the recommended route is the standard Hugging Face download, then loading the model with the respective inference engine.
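A minimal download sketch, assuming the `huggingface_hub` package and the "Qwen/&lt;model name&gt;" repo-id layout used by the official organisation; the `qwen_repo_id` helper is hypothetical, introduced here only to show how repo ids follow from the naming convention:

```python
# Hypothetical helper: build the official Hugging Face repo id for a Qwen
# release ("Qwen/<name>") from its naming components, then download it.

def qwen_repo_id(gen, size, branch=None, instruct=True):
    """Assemble a repo id like 'Qwen/Qwen2.5-Coder-7B-Instruct'."""
    parts = [f"Qwen{gen}"]
    if branch:
        parts.append(branch)       # e.g. "Coder", "VL", "Math"
    parts.append(size)             # e.g. "7B"
    if instruct:
        parts.append("Instruct")   # base models carry no stage suffix
    return "Qwen/" + "-".join(parts)

repo = qwen_repo_id("2.5", "7B", branch="Coder")
print(repo)  # Qwen/Qwen2.5-Coder-7B-Instruct

# To actually fetch the weights, tokeniser, and model card
# (requires the huggingface_hub package and network access):
# from huggingface_hub import snapshot_download
# local_dir = snapshot_download(repo)
```

The downloaded directory can then be passed directly to vLLM or converted to GGUF for llama.cpp-based servers.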