Installing the CUDA version on Windows
Check the CUDA version on Windows
nvidia-smi
Pull the vLLM image. This version requires CUDA 12.8 or later.
docker pull vllm/vllm-openai:latest
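The version check can also be scripted. The sketch below is a minimal example, assuming the nvidia-smi banner contains a field of the form "CUDA Version: X.Y"; the function names are hypothetical helpers, not part of vLLM or any NVIDIA tool.

```python
import re
import subprocess

REQUIRED = (12, 8)  # minimum CUDA version stated above for this image


def parse_cuda_version(smi_output):
    """Return (major, minor) from the 'CUDA Version: X.Y' field, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_requirement(version, required=REQUIRED):
    """True when a parsed version is at least the required one."""
    return version is not None and version >= required


def read_nvidia_smi():
    """Run nvidia-smi and return its text output (needs the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a machine with the driver installed, `meets_requirement(parse_cuda_version(read_nvidia_smi()))` should tell you whether an upgrade is needed before starting the container.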
Download the model
git lfs install
cd e:\ai
mkdir vllm\models\qwen2
cd vllm\models
# Option 1: download with git
git clone https://www.modelscope.cn/qwen/qwen2-0.5b.git Qwen2-0.5B
# Option 2: download with the SDK (run the two lines after pip install in Python)
pip install modelscope
from modelscope import snapshot_download
model_dir = snapshot_download('qwen/qwen2-0.5b', local_dir=r'e:\ai\vllm\models\qwen2')
# Option 3: download with the command-line tool
conda create --name vllm python=3.10 -y
conda activate vllm
pip install modelscope
modelscope download --model qwen/qwen2-0.5b --local_dir e:\ai\vllm\models\qwen2
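One thing to watch when passing a Windows path to the SDK from Python: in an ordinary string literal, `\a` and `\v` are escape sequences (bell and vertical tab), so a path such as e:\ai\vllm is silently corrupted unless it is written as a raw string. The sketch below illustrates this; `has_control_chars` is a hypothetical helper, and `PureWindowsPath` is used only so the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

plain = "e:\ai\vllm"   # \a and \v are interpreted as control characters
raw = r"e:\ai\vllm"    # raw string: backslashes are kept literally


def has_control_chars(text):
    """True if the string contains any ASCII control character."""
    return any(ord(ch) < 32 for ch in text)


# Building the path from parts avoids typing backslashes at all.
model_dir = PureWindowsPath(raw) / "models" / "qwen2"

print(has_control_chars(plain))  # the plain literal is corrupted
print(has_control_chars(raw))    # the raw literal is intact
print(model_dir)
```

Forward slashes ('e:/ai/vllm/models/qwen2') are another common way to sidestep the problem on Windows.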
Download result
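To confirm the download is complete, it can help to list what landed in the model directory. The following is a minimal sketch, assuming the model ships a config.json and weights in .safetensors or .bin files; `summarize_model_dir` is a hypothetical helper, and the exact file names in the repository may differ.

```python
from pathlib import Path


def summarize_model_dir(model_dir):
    """Collect basic facts about the files under a downloaded model directory."""
    root = Path(model_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    return {
        # vLLM reads the model configuration from config.json
        "has_config": any(p.name == "config.json" for p in files),
        # weight files, by extension (assumed naming)
        "weight_files": sorted(
            p.name for p in files if p.suffix in {".safetensors", ".bin"}
        ),
        # total size on disk in bytes
        "total_bytes": sum(p.stat().st_size for p in files),
    }
```

For example, `summarize_model_dir(r"e:\ai\vllm\models\qwen2")` should report a config file and at least one weight file if the download succeeded.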
Run vLLM
services:
  vllm:
    container_name: vllm
    restart: "no"
    image: vllm/vllm-openai:latest
    runtime: nvidia
    ipc: host
    #environment:
    #  - HF_HUB_OFFLINE = 1
    #  - CUDA_VISIBLE_DEVICES = 0
    volumes:
      - E:\ai\vllm\models\Qwen2:/models
    command: ["--model", "/models/Qwen/qwen2-0___5b",
              "--served_model_name", "qen2",
              "--gpu_memory_utilization", "0.90",
              "--max_model_len", "1024",
              "--tensor-parallel-size", "1"]
    ports:
      - 8000:8000
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [ gpu ]
              count: all
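When vLLM runs, it reports the GPU (CUDA) version it requires. After running it, check the CUDA version.
If the installed CUDA version is too old, it can be upgraded.
CUDA download page: CUDA Toolkit Archive | NVIDIA Developer
To upgrade, choose the Custom option in the installer and select all components.
Start vLLM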
cd E:\project\vllm-main
docker-compose up -d
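Once the container is up, the server can be checked with a small client. The sketch below is a minimal example using only the standard library. It assumes the server is reachable at http://localhost:8000 (the port mapped in the compose file) and that the model is served under the name "qen2" (the value given to --served_model_name), and it calls the /v1/completions endpoint of vLLM's OpenAI-compatible API. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: host port from the compose file
MODEL_NAME = "qen2"                 # assumption: value of --served_model_name


def build_completion_request(prompt, max_tokens=32,
                             base_url=BASE_URL, model=MODEL_NAME):
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the prompt to the running server and return the generated text."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `print(complete("The capital of France is"))` should return a short continuation. Keep the prompt plus max_tokens within the 1024-token limit set by --max_model_len.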