Observability

LMCache multiprocess mode provides three complementary observability modes: metrics (Prometheus counters via OpenTelemetry, abbreviated OTel below), logging (Python logging with optional OTel log forwarding), and tracing (OTel spans for per-request latency).

All three modes are powered by an internal EventBus that decouples producers (L1Manager, StorageManager, MPCacheServer) from subscribers.
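The decoupling described above can be pictured with a small publish/subscribe sketch. This is not LMCache's EventBus implementation: the class `SketchEventBus`, the `Event` shape, and the event name `l1.store` are all hypothetical, chosen only to show how one producer call can feed several independent subscribers.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, DefaultDict, Dict, List


@dataclass
class Event:
    # Hypothetical event shape; LMCache's real event types may differ.
    name: str
    payload: Dict[str, Any] = field(default_factory=dict)


Callback = Callable[[Event], None]


class SketchEventBus:
    """Toy synchronous bus: producers publish, subscribers react."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, name: str, callback: Callback) -> None:
        self._subscribers[name].append(callback)

    def publish(self, event: Event) -> None:
        # The producer does not know which subscribers are registered.
        for callback in self._subscribers.get(event.name, []):
            callback(event)


counters: Dict[str, int] = defaultdict(int)
log_lines: List[str] = []


def count_event(event: Event) -> None:
    # Metrics-style subscriber: increments a counter per event name.
    counters[event.name] += 1


def log_event(event: Event) -> None:
    # Logging-style subscriber: records a line per event.
    log_lines.append(f"{event.name} {event.payload}")


bus = SketchEventBus()
bus.subscribe("l1.store", count_event)
bus.subscribe("l1.store", log_event)

# A producer (standing in for a component such as L1Manager) only publishes.
bus.publish(Event("l1.store", {"chunks": 4}))
bus.publish(Event("l1.store", {"chunks": 2}))
```

Adding or removing a subscriber in this sketch requires no change to the producer, which is the property the EventBus design is meant to provide.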

Quick Start

By default, metrics and logging are enabled; tracing is disabled. No extra flags are needed:

lmcache server \
    --l1-size-gb 100 --eviction-policy LRU

To enable tracing, supply an OTLP endpoint:

lmcache server \
    --l1-size-gb 100 --eviction-policy LRU \
    --enable-tracing --otlp-endpoint http://localhost:4317
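The defaults above, together with the flags listed under Configuration, can be summarized as a small decision function. This is a sketch of the documented behavior, not LMCache's argument handling: `ObservabilityFlags` and `active_modes` are hypothetical names, and raising an error when `--enable-tracing` is given without `--otlp-endpoint` is an assumption about how the requirement might be enforced.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ObservabilityFlags:
    # Field names mirror the CLI flags; the class itself is hypothetical.
    disable_observability: bool = False
    disable_metrics: bool = False
    disable_logging: bool = False
    enable_tracing: bool = False
    otlp_endpoint: Optional[str] = None


def active_modes(flags: ObservabilityFlags) -> Set[str]:
    """Return which of metrics/logging/tracing would get subscribers."""
    if flags.disable_observability:
        # Master switch: nothing is registered.
        return set()
    modes: Set[str] = set()
    if not flags.disable_metrics:
        modes.add("metrics")
    if not flags.disable_logging:
        modes.add("logging")
    if flags.enable_tracing:
        if flags.otlp_endpoint is None:
            # Assumed enforcement of "Requires --otlp-endpoint".
            raise ValueError("--enable-tracing requires --otlp-endpoint")
        modes.add("tracing")
    return modes


default_modes = active_modes(ObservabilityFlags())
tracing_modes = active_modes(
    ObservabilityFlags(enable_tracing=True, otlp_endpoint="http://localhost:4317")
)
```

With no flags set, the sketch yields metrics and logging; adding the two tracing flags from the second command yields all three modes.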

Configuration

| Argument | Default | Description |
|---|---|---|
| --disable-observability | off | Master switch: disable the EventBus entirely (no metrics, logging, or tracing subscribers are registered). |
| --disable-metrics | off | Skip metrics subscribers (Prometheus endpoint is not started). |
| --disable-logging | off | Skip logging subscribers. |
| --enable-tracing | off | Register tracing subscribers. Requires --otlp-endpoint. |
| --event-bus-queue-size | 10000 | Maximum events in the EventBus queue before tail-drop. |
| --otlp-endpoint | (none) | OTLP gRPC endpoint (e.g. http://localhost:4317). Used for exporting metrics (push mode) and traces. |
| --prometheus-port | 9090 | Port for the Prometheus /metrics HTTP endpoint. |
| --metrics-sample-rate | 0.01 | Fraction of chunks/blocks to track for lifecycle histograms, in the range (0, 1.0]. Counters always count all events. Default is 1%. |
| --trace-level | (none) | Enable trace recording at the given level. Currently only storage is supported (records StorageManager public-API calls for offline replay). When unset, trace recording is off. See Trace Recording for details. |
| --trace-output | (none) | Path to write the trace file. If omitted while --trace-level is set, a timestamped file under $TMPDIR is minted (lmcache-trace-<pid>-<UTC>.lct) and its path is logged at INFO. |
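Two of the settings above describe behavior that is easier to see in code: tail-drop for --event-bus-queue-size and sampling for --metrics-sample-rate. The sketch below is illustrative only. `TailDropQueue` and `should_track` are hypothetical names, and the use of a random draw per item is an assumption; LMCache may select sampled chunks differently (for example, by hashing).

```python
import random
from collections import deque
from typing import Deque, Generic, TypeVar

T = TypeVar("T")


class TailDropQueue(Generic[T]):
    """Bounded FIFO that rejects new items once it is full (tail-drop)."""

    def __init__(self, max_size: int) -> None:
        self._items: Deque[T] = deque()
        self._max_size = max_size
        self.dropped = 0

    def offer(self, item: T) -> bool:
        if len(self._items) >= self._max_size:
            # The newest item is discarded; already-queued items are kept.
            self.dropped += 1
            return False
        self._items.append(item)
        return True

    def poll(self) -> T:
        return self._items.popleft()

    def __len__(self) -> int:
        return len(self._items)


def should_track(sample_rate: float, rng: random.Random) -> bool:
    """Decide whether one chunk/block feeds the lifecycle histograms."""
    if not 0.0 < sample_rate <= 1.0:
        raise ValueError("sample rate must be in (0, 1.0]")
    # rng.random() is in [0, 1), so a rate of 1.0 tracks everything.
    return rng.random() < sample_rate


# Tail-drop: with room for 3 events, the 4th and 5th offers are rejected.
queue: TailDropQueue[int] = TailDropQueue(max_size=3)
accepted = [queue.offer(i) for i in range(5)]

# Sampling: counters see every event, histograms see roughly 1% of them.
rng = random.Random(0)
total_events = 100_000
tracked = sum(should_track(0.01, rng) for _ in range(total_events))
```

In this sketch `accepted` is `[True, True, True, False, False]` and `queue.dropped` is 2, while `tracked` lands near 1% of `total_events`. A larger queue trades memory for fewer dropped events under bursts; a higher sample rate trades overhead for smoother histograms.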

Environment variables:

| Variable | Default | Description |
|---|---|---|
| LMCACHE_LOG_LEVEL | INFO | Controls the log level for all LMCache loggers. Valid values: DEBUG, INFO, WARNING, ERROR, CRITICAL. |
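As a rough illustration of how an environment variable like LMCACHE_LOG_LEVEL maps onto Python's standard logging levels, consider the sketch below. The helper `resolve_log_level` and the logger name `lmcache_sketch` are hypothetical, and both the case-insensitive matching and the error on unknown values are assumptions rather than documented LMCache behavior.

```python
import logging
import os
from typing import Mapping, Optional

VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def resolve_log_level(env: Optional[Mapping[str, str]] = None) -> int:
    """Translate LMCACHE_LOG_LEVEL into a logging module level constant."""
    source = os.environ if env is None else env
    # Fall back to the documented default, INFO, when the variable is unset.
    name = source.get("LMCACHE_LOG_LEVEL", "INFO").upper()
    if name not in VALID_LEVELS:
        # Assumed behavior; a real implementation might fall back instead.
        raise ValueError(f"invalid LMCACHE_LOG_LEVEL: {name!r}")
    return getattr(logging, name)


# Passing a mapping explicitly keeps the example independent of the real
# process environment.
logger = logging.getLogger("lmcache_sketch")  # hypothetical logger name
logger.setLevel(resolve_log_level({"LMCACHE_LOG_LEVEL": "DEBUG"}))
```

In practice the variable would be set in the shell that launches the server, before the process starts.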