Extension Guide

This section describes how to extend LMCache by adding new CLI commands, plugins, or other components.

- Extending the CLI
- Extending the HTTP API


© 2024, The LMCache Team Built with Sphinx 8.2.3