Observability

Metrics by vLLM API
- Quick Start Guide
- Available Metrics
Internal API Server Metrics

© 2024, The LMCache Team Built with Sphinx 8.2.3