lmcache server

The lmcache server command launches the standalone LMCache Multi-Process (MP) server, which exposes a ZMQ control plane and an HTTP frontend (status, healthcheck, cache-clear, and checksum APIs). The lmcache describe, lmcache ping kvcache, lmcache kvcache, and lmcache bench server commands all connect to this server.

Note

This command requires the full lmcache installation with CUDA extensions. It is not available in the lightweight lmcache-cli package.

lmcache server [options]

Quick start

lmcache server \
    --host 0.0.0.0 --port 5555 \
    --l1-size-gb 100 \
    --eviction-policy LRU
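
Scripts that start the server often need to wait until it is reachable. The following is a minimal sketch, assuming the ZMQ control plane listens on plain TCP at the --host / --port given above. The helper names port_open and wait_for_port are hypothetical and are not part of lmcache. A successful connection only shows that the socket is bound, not that the server is fully initialized.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of lmcache): probe a TCP port using
# bash's built-in /dev/tcp redirection, so no extra tools are needed.

# Return 0 if a TCP connection to host:port succeeds, non-zero otherwise.
port_open() {
  local host="$1" port="$2"
  # The subshell opens the socket on fd 3 and closes it again on exit.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Poll once per second until the port accepts connections or the
# attempts run out. The default of 30 attempts is an arbitrary choice.
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-30}"
  local tries=0
  while [ "$tries" -lt "$attempts" ]; do
    if port_open "$host" "$port"; then
      return 0
    fi
    tries=$((tries + 1))
    sleep 1
  done
  return 1
}
```

With the quick-start settings, a caller on the same machine might run wait_for_port 127.0.0.1 5555 before issuing any client commands, since a server bound to 0.0.0.0 is also reachable on the loopback address.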

Options

The server composes its arguments from several configuration modules: the multiprocess server, the storage manager (L1, L2 adapters, and eviction), the HTTP frontend, and the Prometheus / telemetry observability layer. The full, authoritative list is large and evolves with the runtime, so consult:

lmcache server --help

Commonly used flags include:

| Flag | Description |
| --- | --- |
| --host HOST | Bind address for the server. |
| --port PORT | ZMQ control-plane port. |
| --chunk-size N | KV cache chunk size in tokens. |
| --l1-size-gb GB | L1 (CPU/DRAM) cache capacity in GB. |
| --eviction-policy POLICY | L1 eviction policy (e.g. LRU). |
| --eviction-trigger-watermark RATIO | L1 fill ratio at which eviction begins. |
| --eviction-ratio RATIO | Fraction of L1 cleared per eviction cycle. |
| --max-workers N | Number of server worker processes. |
| --coordinator-url URL | Register with an MP coordinator at this base URL (e.g. http://coordinator:9300). Opt-in; enables fleet registration. See Multi-Server Coordination. |
| --coordinator-advertise-ip IP | IP the coordinator should reach this server at (defaults to the outbound IP). |
| --coordinator-heartbeat-interval SECONDS | Seconds between heartbeats (> 0, default 5). Keep well below the coordinator’s instance timeout. |
| --coordinator-l2-event-reporting | Enable reporting L2 store/lookup events to the coordinator for fleet-wide usage tracking and quota-based eviction. |
| --coordinator-l2-event-flush-interval SECONDS | Seconds between L2 event batch flushes (> 0, default 1). |
| --trace-level {storage} | Enable storage-level trace recording (see lmcache trace). |
| --trace-output PATH | Destination for recorded .lct trace files. |
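
To see how the two eviction ratios relate to --l1-size-gb, the short calculation below may help. It is a sketch under stated assumptions: the values 0.8 and 0.2 are illustrative rather than documented defaults, and --eviction-ratio is assumed to be a fraction of total L1 capacity, which should be confirmed against lmcache server --help.

```shell
#!/usr/bin/env bash
l1_size_gb=100     # matches --l1-size-gb 100 from the quick start
watermark=0.8      # illustrative --eviction-trigger-watermark value (assumption)
evict_ratio=0.2    # illustrative --eviction-ratio value (assumption)

# awk performs the floating-point math that shell arithmetic cannot.
trigger_gb=$(awk -v s="$l1_size_gb" -v w="$watermark" 'BEGIN { printf "%.1f", s * w }')
# Assumes the ratio applies to total L1 capacity, not to current usage.
freed_gb=$(awk -v s="$l1_size_gb" -v r="$evict_ratio" 'BEGIN { printf "%.1f", s * r }')

echo "eviction begins at about ${trigger_gb} GB of L1 in use"
echo "each eviction cycle clears about ${freed_gb} GB"
```

Under these assumptions, a lower watermark leaves more headroom for bursts of new chunks, while a larger eviction ratio makes cycles less frequent but discards more cached data each time.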

L2 adapters, observability, and Prometheus exporters are configured through their own flag groups; see lmcache server --help for the complete set.
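
As an illustration of the opt-in coordinator flags, the sketch below assembles a launch command that registers with a coordinator. It uses only flags listed on this page. The URL reuses the example from the flag description, the advertise IP 10.0.0.12 is a hypothetical address, and the script prints the command instead of running it so that it can be reviewed first.

```shell
#!/usr/bin/env bash
# Illustrative values only: adjust them for the actual deployment.
COORDINATOR_URL="http://coordinator:9300"   # example URL from the flag description
ADVERTISE_IP="10.0.0.12"                    # hypothetical address for this server
HEARTBEAT_SECONDS=5                         # the documented default, stated explicitly

# Collect the arguments as positional parameters so each stays one word.
set -- \
  --host 0.0.0.0 --port 5555 \
  --l1-size-gb 100 \
  --coordinator-url "$COORDINATOR_URL" \
  --coordinator-advertise-ip "$ADVERTISE_IP" \
  --coordinator-heartbeat-interval "$HEARTBEAT_SECONDS"

# Dry run: print the command for review.
# To launch for real, run: lmcache server "$@"
cmd_preview="lmcache server $*"
echo "$cmd_preview"
```

Setting --coordinator-advertise-ip explicitly is mainly useful when the outbound IP the server would otherwise pick is not the address the coordinator can reach, for example on hosts with several network interfaces.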