Last Updated: 3/9/2026
Installation
vLLM supports the following hardware platforms:
- GPU
- CPU
  - Intel/AMD x86
  - ARM AArch64
  - Apple silicon
  - IBM Z (S390X)
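Which CPU build applies to a given machine can be determined from the reported architecture. A minimal sketch, assuming a mapping from platform.machine() values to the CPU targets listed above (the mapping and function name are illustrative, not part of vLLM):

```python
import platform

# Illustrative mapping from platform.machine() values to the
# supported CPU targets listed above; not part of vLLM itself.
CPU_TARGETS = {
    "x86_64": "Intel/AMD x86",
    "AMD64": "Intel/AMD x86",   # Windows reports x86-64 as AMD64
    "aarch64": "ARM AArch64",
    "arm64": "Apple silicon",   # macOS on M-series chips
    "s390x": "IBM Z (S390X)",
}

def cpu_target(machine=None):
    """Return the matching CPU target name, or 'unsupported'."""
    machine = machine or platform.machine()
    return CPU_TARGETS.get(machine, "unsupported")
```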
Hardware Plugins
vLLM supports third-party hardware plugins that live outside the main vllm repository. These follow the Hardware-Pluggable RFC.
A list of all supported hardware can be found on the vllm.ai website. If you want to add new hardware, please contact us on Slack or by email.
Installation Instructions
NVIDIA CUDA
For NVIDIA GPUs, install vLLM using pip:
# Using uv (recommended)
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install vllm --torch-backend=auto

The --torch-backend=auto flag automatically selects the appropriate PyTorch index based on your installed CUDA driver version. To pin a specific backend instead, pass it explicitly, e.g. --torch-backend=cu128.
Alternatively, use conda:
conda create -n myenv python=3.12 -y
conda activate myenv
pip install --upgrade uv
uv pip install vllm --torch-backend=auto

AMD ROCm
For AMD GPUs, install vLLM using uv:
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install vllm --extra-index-url https://wheels.vllm.ai/rocm/

Requirements:
- Python 3.12
- ROCm 7.0
- glibc >= 2.35
Note: Previously, Docker images were published using AMD’s docker release pipeline at rocm/vllm-dev. This is being deprecated in favor of vLLM’s docker release pipeline.
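The glibc >= 2.35 requirement above can be checked from Python. A minimal sketch (the helper names are invented for illustration; platform.libc_ver() reports the C library of the running interpreter):

```python
import platform

def glibc_at_least(version, minimum):
    """Compare glibc versions numerically, not lexicographically
    (e.g. 2.9 is older than 2.35)."""
    def parse(v):
        return tuple(int(part) for part in v.split("."))
    return parse(version) >= parse(minimum)

def meets_glibc(minimum="2.35"):
    """True if the running system uses glibc at or above `minimum`."""
    libc, version = platform.libc_ver()
    # Non-glibc platforms (musl, macOS, Windows) report no glibc version.
    return libc == "glibc" and bool(version) and glibc_at_least(version, minimum)
```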
Intel XPU
For Intel GPUs, follow the installation instructions in the XPU documentation.
Google TPU
To run vLLM on Google TPUs, install the vllm-tpu package:
uv pip install vllm-tpu

For more detailed instructions, refer to the vLLM on TPU documentation.
CPU Installation
For CPU-only installation:
uv pip install vllm-cpu

Supported CPU architectures:
- Intel/AMD x86-64
- ARM AArch64
- Apple Silicon (M1/M2/M3)
- IBM Z (S390X)
Docker Installation
vLLM provides official Docker images:
# NVIDIA GPU
docker run --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HUGGING_FACE_HUB_TOKEN=<secret>" \
-p 8000:8000 \
--ipc=host \
vllm/vllm-openai:latest \
--model mistralai/Mistral-7B-v0.1
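Once a container is running, it serves an OpenAI-compatible API on the published port 8000. A hedged sketch of building such a request with the standard library (the /v1/completions path and payload fields follow the OpenAI API convention and are not taken from this page; the helper name is invented):

```python
import json
import urllib.request

def build_completion_request(model, prompt, host="http://localhost:8000"):
    """Build (but do not send) a POST request for the
    OpenAI-compatible /v1/completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": 64}
    return urllib.request.Request(
        f"{host}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it against a running container:
# with urllib.request.urlopen(build_completion_request(
#         "mistralai/Mistral-7B-v0.1", "Hello")) as resp:
#     print(resp.read())
```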
# AMD GPU
docker run --device=/dev/kfd --device=/dev/dri \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HUGGING_FACE_HUB_TOKEN=<secret>" \
-p 8000:8000 \
--ipc=host \
vllm/vllm-openai:latest \
--model mistralai/Mistral-7B-v0.1

Building from Source
To build vLLM from source:
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -e .

For development, install the editable package with the dev extras:
pip install -e ".[dev]"

Troubleshooting
If you encounter issues during installation:
- Check CUDA/ROCm version compatibility
- Verify Python version (3.10-3.13)
- Check available disk space (models can be large)
- Review error logs for specific dependency issues
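The first two checks above can be scripted. A minimal sketch, assuming the 3.10-3.13 range from the bullet and an illustrative free-space threshold (both helper names are invented):

```python
import shutil
import sys

def python_ok(version_info=None, lo=(3, 10), hi=(3, 13)):
    """Check the interpreter falls in the supported 3.10-3.13 range."""
    vi = version_info or sys.version_info
    return lo <= (vi[0], vi[1]) <= hi

def disk_ok(path=".", min_free_gb=20.0):
    """Check free disk space; model weights can run to tens of GB,
    so the 20 GB threshold here is only a placeholder."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= min_free_gb
```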
For more help, open an issue on the vLLM GitHub repository.
Next Steps
After installation, proceed to the Quickstart Guide to start using vLLM.