Installation

Prerequisites: Linux · Python 3.9–3.13 · NVIDIA GPU (compute capability 7.0+) · CUDA 12.1+ · uv
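
The operating-system and Python requirements above can be checked before creating the environment. The following is a minimal sketch, not part of LMCache: the helper name `check_prerequisites` is hypothetical, and it deliberately does not probe the GPU or the CUDA toolkit, which need vendor tools.

```python
import platform
import sys

# Supported range taken from the prerequisites line: Python 3.9 through 3.13.
MIN_PYTHON = (3, 9)
MAX_PYTHON = (3, 13)


def check_prerequisites(python_version, system_name):
    """Return a list of human-readable problems (empty if none were found).

    Hypothetical helper for illustration; it is not an LMCache API.
    """
    problems = []
    major_minor = tuple(python_version[:2])
    if not (MIN_PYTHON <= major_minor <= MAX_PYTHON):
        problems.append(
            "Python %d.%d is outside the supported range 3.9-3.13" % major_minor
        )
    if system_name != "Linux":
        problems.append("Linux is required, found %s" % system_name)
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(sys.version_info, platform.system()):
        print("WARNING:", problem)
```

Run it with the interpreter you plan to pass to `uv venv --python`; no output means both checks passed.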

Install LMCache

Python (pip / uv)

Stable

CUDA 13.0

uv venv --python 3.12
source .venv/bin/activate
uv pip install lmcache

Important

You're all set! You can now start using LMCache. For hands-on guides and more usage examples, see the More Examples section.

Note

NIXL support (for example, for disaggregated prefill and P2P KV sharing) is an optional extra:

uv pip install "lmcache[nixl]"  # quoted so shells such as zsh do not expand the brackets

CUDA 12.9

CUDA 12.9 wheels are published to a dedicated GitHub Release, not to PyPI.

uv venv --python 3.12
source .venv/bin/activate
VERSION=0.4.3  # replace with target release
uv pip install lmcache==${VERSION} \
    --extra-index-url https://download.pytorch.org/whl/cu129 \
    --find-links https://github.com/LMCache/LMCache/releases/expanded_assets/v${VERSION}-cu129 \
    --index-strategy unsafe-best-match

Note

--extra-index-url https://download.pytorch.org/whl/cu129 ensures that the CUDA 12.9 build of PyTorch is resolved. Without it, pip may select a mismatched CUDA variant.

Nightly

Nightly wheels are built from the latest dev branch and published to GitHub Releases every day at 07:30 UTC. No version pin is needed: --pre automatically picks up the latest nightly build.

CUDA 13.0

uv venv --python 3.12
source .venv/bin/activate
uv pip install lmcache --pre \
    --extra-index-url https://download.pytorch.org/whl/cu130 \
    --find-links https://github.com/LMCache/LMCache/releases/expanded_assets/nightly \
    --index-strategy unsafe-best-match

CUDA 12.9

uv venv --python 3.12
source .venv/bin/activate
uv pip install lmcache --pre \
    --extra-index-url https://download.pytorch.org/whl/cu129 \
    --find-links https://github.com/LMCache/LMCache/releases/expanded_assets/nightly-cu129 \
    --index-strategy unsafe-best-match
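
The stable and nightly commands above differ only in the package spec, the PyTorch wheel index, and the GitHub Release that hosts the wheels. As an illustrative sketch (the function `uv_install_args` is hypothetical and not shipped with LMCache; the URLs and flags are copied from the commands above), the four combinations could be generated like this:

```python
PYTORCH_INDEX = "https://download.pytorch.org/whl/"
RELEASE_ASSETS = "https://github.com/LMCache/LMCache/releases/expanded_assets/"


def uv_install_args(channel, cuda, version=None):
    """Build the argument list for `uv pip install` as documented above.

    channel: "stable" or "nightly"; cuda: "cu130" or "cu129".
    version: required only for stable cu129, e.g. "0.4.3".
    Hypothetical helper for illustration; it is not an LMCache API.
    """
    if channel not in ("stable", "nightly") or cuda not in ("cu130", "cu129"):
        raise ValueError("unsupported combination: %s/%s" % (channel, cuda))

    if channel == "stable" and cuda == "cu130":
        # The stable CUDA 13.0 command installs straight from PyPI.
        return ["uv", "pip", "install", "lmcache"]

    if channel == "stable":
        if version is None:
            raise ValueError("stable cu129 needs an explicit version")
        spec = ["lmcache==" + version]
        release = "v%s-cu129" % version
    else:
        # Nightly builds are unpinned; --pre selects the newest one.
        spec = ["lmcache", "--pre"]
        release = "nightly" if cuda == "cu130" else "nightly-cu129"

    return ["uv", "pip", "install"] + spec + [
        "--extra-index-url", PYTORCH_INDEX + cuda,
        "--find-links", RELEASE_ASSETS + release,
        "--index-strategy", "unsafe-best-match",
    ]


if __name__ == "__main__":
    print(" ".join(uv_install_args("nightly", "cu129")))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting concerns.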

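Install from source

--no-build-isolation ensures that the kernels are compiled against the same torch that is already installed in your environment, which prevents undefined-symbol errors at runtime.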

CUDA 13.0

git clone https://github.com/LMCache/LMCache.git
cd LMCache

uv venv --python 3.12
source .venv/bin/activate

uv pip install -r requirements/build.txt
uv pip install vllm  # pulls in required torch version (cu13)
uv pip install -e . --no-build-isolation

CUDA 12.9

git clone https://github.com/LMCache/LMCache.git
cd LMCache

uv venv --python 3.12
source .venv/bin/activate

uv pip install -r requirements/build.txt
# Pin vLLM (and torch) to the cu12.9 wheel index so the local
# CUDA 12 toolchain matches what the extensions are built against.
uv pip install vllm \
    --extra-index-url https://download.pytorch.org/whl/cu129 \
    --index-strategy unsafe-best-match
# LMCACHE_CUDA_MAJOR=12 makes setup.py pick cupy-cuda12x
# for install_requires instead of the cu13 default.
LMCACHE_CUDA_MAJOR=12 \
    uv pip install -e . --no-build-isolation

ROCm

git clone https://github.com/LMCache/LMCache.git
cd LMCache

uv venv --python 3.12
source .venv/bin/activate

# Need to install these packages manually to avoid build isolation
uv pip install -r requirements/build.txt

# Install torch from the ROCm wheel index
uv pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm7.0

# Build LMCache. BUILD_WITH_HIP=1 makes setup.py pick cupy-rocm-7-0 automatically.
# PYTORCH_ROCM_ARCH selects the target GPU(s):
#   gfx942  -> MI300X / MI325X
#   gfx950  -> MI350X / MI355X
# Comma-separate to build a fat binary for multiple archs.
PYTORCH_ROCM_ARCH="gfx942,gfx950" \
TORCH_DONT_CHECK_COMPILER_ABI=1 \
CXX=hipcc \
BUILD_WITH_HIP=1 \
uv pip install -e . --no-build-isolation

Intel XPU

git clone https://github.com/LMCache/LMCache.git
cd LMCache

uv venv --python 3.12
source .venv/bin/activate

# Need to install these packages manually to avoid build isolation
uv pip install -r requirements/build.txt

# Build LMCache with SYCL backend.
BUILD_WITH_SYCL=1 uv pip install --no-build-isolation -e .

Docker

Stable

CUDA 13.0

docker pull lmcache/vllm-openai

CUDA 12.9

docker pull lmcache/vllm-openai:latest-cu129

Nightly

CUDA 13.0

docker pull lmcache/vllm-openai:latest-nightly

CUDA 12.9

docker pull lmcache/vllm-openai:latest-nightly-cu129

ROCm

docker pull rocm/vllm-dev:nightly_0624_rc2_0624_rc2_20250620

Intel XPU

docker pull intel/vllm:0.17.0-xpu

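See Docker Deployment for information on running the containers and on the ROCm images.

CLI only

A lightweight, CLI-only package for querying or benchmarking a remote LMCache server. No CUDA is required, and it works on any operating system.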

pip install lmcache-cli

Note

lmcache-cli and lmcache both provide the same lmcache CLI command. Do not install both in the same environment.
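
To check whether an environment already contains both packages, the installed distributions can be queried with the standard library. This is a minimal sketch under the assumption that the distribution names match the install commands above (`lmcache` and `lmcache-cli`); the helper `installed_distributions` is hypothetical.

```python
from importlib import metadata

# Distribution names as used in the install commands above (assumed).
CONFLICTING = ("lmcache", "lmcache-cli")


def installed_distributions(names):
    """Return the subset of `names` that is installed in this environment."""
    found = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        found.append(name)
    return found


if __name__ == "__main__":
    present = installed_distributions(CONFLICTING)
    if len(present) > 1:
        print("Both are installed; uninstall one of:", ", ".join(present))
    else:
        print("No conflict detected:", present)
```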

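Build the Docker image

Instead of pulling a prebuilt image, you can build the LMCache (vLLM-integrated) image yourself from the Dockerfile provided in the repository under docker/.

From the root of the LMCache repository: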

docker build --tag <IMAGE_NAME>:<TAG> --target image-build --file docker/Dockerfile .

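Replace <IMAGE_NAME> and <TAG> with the image name and tag you want. For a description of each build argument, see the example build files in docker/.

Verify the installation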

python -c "import lmcache.c_ops"
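
When the import fails, the one-liner above ends in a traceback. For scripted checks (for example in CI), a slightly longer variant that reports a single status line may be easier to work with. This is a sketch only: the helper `can_import` is hypothetical, and the only LMCache name it relies on is the `lmcache.c_ops` module from the command above.

```python
import importlib


def can_import(module_name):
    """Return (ok, detail); detail holds the error text when the import fails."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Covers a missing package as well as a compiled extension
        # that cannot be loaded.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    ok, detail = can_import("lmcache.c_ops")
    print("lmcache.c_ops: OK" if ok else "lmcache.c_ops: FAILED (%s)" % detail)
```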