Observability

LMCache can be monitored through two complementary metric surfaces: metrics exported via the vLLM API and metrics served by LMCache's internal API server. A quick sanity-check script follows the outline below.

  • Metrics by vLLM API
    • Quick Start Guide
    • Available Metrics
  • Internal API Server Metrics
    • Overview
    • Quick Start Guide
    • Port Configuration
    • Advanced Usage
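
The sketch below is a minimal example under stated assumptions, not LMCache's authoritative interface: it assumes a vLLM server with LMCache enabled is listening on localhost:8000 (vLLM's default API port), that LMCache counters are published on vLLM's Prometheus-format /metrics endpoint, and that their names contain "lmcache". Consult the Available Metrics page for the real metric names.

    # Minimal sketch: list LMCache-related Prometheus metrics exposed
    # through a running vLLM server. The localhost:8000 address and the
    # "lmcache" name filter are assumptions; see "Available Metrics"
    # for the authoritative metric list.
    import urllib.request

    VLLM_METRICS_URL = "http://localhost:8000/metrics"

    with urllib.request.urlopen(VLLM_METRICS_URL) as resp:
        body = resp.read().decode("utf-8")

    # /metrics returns Prometheus text format: "# HELP" / "# TYPE"
    # comment lines followed by one sample per line.
    for line in body.splitlines():
        if "lmcache" in line.lower():
            print(line)

The Internal API Server Metrics pages cover the second surface: an endpoint served by LMCache itself on a separate, configurable port (see Port Configuration). Once that server is enabled and its port is known, the same kind of scrape applies against it.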