LMCache

Observability

  • Metrics by vLLM API
    • Quick Start Guide
    • Available Metrics
  • Internal API Server Metrics
    • Overview
    • Quick Start Guide
    • Port Configuration
    • Advanced Usage

© 2024, The LMCache Team Built with Sphinx 8.2.3