Observability#
LMCache multiprocess mode provides three complementary observability features: metrics (Prometheus counters via OTel), logging (Python `logging` with optional OTel log forwarding), and tracing (OTel spans for per-request latency).
All three modes are powered by an internal EventBus that decouples producers (L1Manager, StorageManager, MPCacheEngine) from subscribers.
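To make the decoupling concrete, here is a minimal sketch of a pub/sub bus with a bounded queue and tail-drop, in the spirit of the EventBus described above. This `EventBus` class is a hypothetical illustration written for this page, not LMCache's actual implementation; the queue bound corresponds to the maximum-queue-size option in the configuration table below.

```python
import queue
from collections import defaultdict


class EventBus:
    """Illustrative pub/sub bus: bounded queue, tail-drop on overflow."""

    def __init__(self, max_queue_size: int = 1024):
        self._queue = queue.Queue(maxsize=max_queue_size)
        self._subscribers = defaultdict(list)  # event type -> list of callbacks

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, payload) -> bool:
        try:
            # Tail-drop: when the queue is full, the newest event is discarded
            # instead of blocking a producer (L1Manager, StorageManager, ...)
            # on a slow subscriber. Returns False when the event was dropped.
            self._queue.put_nowait((event_type, payload))
            return True
        except queue.Full:
            return False

    def drain(self):
        # A real implementation would run this on a background thread.
        while not self._queue.empty():
            event_type, payload = self._queue.get_nowait()
            for callback in self._subscribers[event_type]:
                callback(payload)
```

Because producers only enqueue, a stalled metrics or tracing subscriber degrades observability (dropped events) rather than cache throughput.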
Quick Start#
By default, metrics and logging are enabled; tracing is disabled. No extra flags are needed:
```bash
python3 -m lmcache.v1.multiprocess.server \
    --l1-size-gb 100 --eviction-policy LRU
```
To enable tracing, supply an OTLP endpoint:
```bash
python3 -m lmcache.v1.multiprocess.server \
    --l1-size-gb 100 --eviction-policy LRU \
    --enable-tracing --otlp-endpoint http://localhost:4317
```
Configuration#
| Argument | Default | Description |
|---|---|---|
|  | off | Master switch: disables the EventBus entirely (no metrics, logging, or tracing subscribers are registered). |
|  | off | Skip metrics subscribers (the Prometheus endpoint is not started). |
|  | off | Skip logging subscribers. |
| `--enable-tracing` | off | Register tracing subscribers. Requires `--otlp-endpoint`. |
|  |  | Maximum number of events in the EventBus queue before tail-drop. |
| `--otlp-endpoint` | (none) | OTLP gRPC endpoint (e.g. `http://localhost:4317`). |
|  | 9090 | Port for the Prometheus `/metrics` endpoint. |
Environment variables:
| Variable | Default | Description |
|---|---|---|
| `LMCACHE_LOG_LEVEL` |  | Controls the log level for all LMCache loggers (e.g. `DEBUG`). |
Metrics#
Metrics are collected via OpenTelemetry counters and exported through an in-process Prometheus `/metrics` HTTP endpoint (default port 9090). When `--otlp-endpoint` is set, metrics are also pushed to the OTel collector.

All metrics use the `lmcache_mp.` prefix (multiprocess). On Prometheus, dots are converted to underscores and counters get a `_total` suffix (e.g. `lmcache_mp_l1_read_keys_total`).
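The naming rule above can be expressed as a one-liner. This helper is a sketch of the documented conversion, not a function from LMCache itself:

```python
def prometheus_name(otel_counter_name: str) -> str:
    """Map a dotted OTel counter name to its Prometheus form:
    dots become underscores and counters gain a `_total` suffix."""
    return otel_counter_name.replace(".", "_") + "_total"


# prometheus_name("lmcache_mp.l1_read_keys") -> "lmcache_mp_l1_read_keys_total"
```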
StorageManager Metrics#
| Metric | Type | Description |
|---|---|---|
|  | Counter | Number of read (prefetch) requests received by the StorageManager. |
|  | Counter | Number of keys successfully read from LMCache. |
|  | Counter | Number of keys that failed to read. |
|  | Counter | Number of write (reserve) requests. |
|  | Counter | Number of keys successfully reserved for write. |
|  | Counter | Number of keys that failed to reserve (OOM, write conflict). |
L1 Metrics#
| Metric | Type | Description |
|---|---|---|
| `lmcache_mp.l1_read_keys` | Counter | Number of keys read from L1. |
|  | Counter | Number of keys written to L1. |
|  | Counter | Number of keys evicted by the EvictionController. |
Prometheus Scrape Configuration#
Add the LMCache server as a Prometheus scrape target:
```yaml
scrape_configs:
  - job_name: "lmcache-mp"
    static_configs:
      - targets: ["<lmcache-host>:9090"]
```
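To sanity-check the scrape target before wiring up Prometheus, you can fetch the endpoint and parse the text exposition format by hand. The parser below is a simplified sketch (it ignores labeled samples and histogram lines), and the sample payload is illustrative, not actual LMCache output:

```python
def parse_counters(exposition_text: str) -> dict:
    """Parse `name value` sample lines from the Prometheus text format,
    skipping comment lines (# HELP / # TYPE) and blank lines."""
    counters = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        counters[name] = float(value)
    return counters


# Illustrative payload; in practice fetch http://<lmcache-host>:9090/metrics.
sample = """\
# HELP lmcache_mp_l1_read_keys_total Number of keys read from L1
# TYPE lmcache_mp_l1_read_keys_total counter
lmcache_mp_l1_read_keys_total 42.0
"""
```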
Logging#
Logging subscribers emit debug-level messages for store, retrieve, lookup, L1, and StorageManager events via Python's standard `logging` module.

When OpenTelemetry is installed, `init_logger` automatically attaches an OTel `LoggingHandler` so that log records are forwarded to any configured OTel `LoggerProvider`. The handler respects the `LMCACHE_LOG_LEVEL` environment variable.
```bash
LMCACHE_LOG_LEVEL=DEBUG lmcache server ...
```
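The behavior described above can be sketched as follows. This `init_logger` is a hypothetical reimplementation for illustration, not LMCache's actual helper, and the `INFO` fallback level is an assumption:

```python
import logging
import os


def init_logger(name: str) -> logging.Logger:
    """Build a logger whose level follows LMCACHE_LOG_LEVEL, attaching an
    OTel LoggingHandler only when opentelemetry is importable (sketch)."""
    level_name = os.environ.get("LMCACHE_LOG_LEVEL", "INFO").upper()
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, level_name, logging.INFO))
    try:
        # Forward records to any configured OTel LoggerProvider.
        from opentelemetry.sdk._logs import LoggingHandler
        logger.addHandler(LoggingHandler(level=logger.level))
    except ImportError:
        pass  # OTel not installed: plain Python logging only.
    return logger
```

With this structure, the same env var controls both the stdlib output and what is forwarded to the OTel collector.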
Key log messages:
| Level | Message |
|---|---|
| INFO |  |
| INFO |  |
| INFO |  |
| DEBUG |  |
| DEBUG |  |
Tracing#
Note
`--enable-tracing` requires `--otlp-endpoint` to be set. The server will refuse to start if tracing is enabled without an OTLP endpoint, since there is no local fallback for trace export.
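The fail-fast rule in the note can be sketched with `argparse`. This is an illustrative reconstruction of the documented behavior, not the server's actual argument-parsing code:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Only the two flags relevant to tracing are sketched here.
    parser = argparse.ArgumentParser()
    parser.add_argument("--enable-tracing", action="store_true")
    parser.add_argument("--otlp-endpoint", default=None)
    return parser


def validate(args) -> None:
    # Mirror of the documented rule: tracing has no local export fallback,
    # so refuse to start rather than silently dropping spans.
    if args.enable_tracing and not args.otlp_endpoint:
        raise SystemExit("--enable-tracing requires --otlp-endpoint")
```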
When tracing is enabled (`--enable-tracing --otlp-endpoint <URL>`), the tracing subscriber creates OTel spans from START/END event pairs:

- `mp.store`: from `MP_STORE_START` to `MP_STORE_END`
- `mp.retrieve`: from `MP_RETRIEVE_START` to `MP_RETRIEVE_END`
- `mp.lookup_prefetch`: from `MP_LOOKUP_PREFETCH_START` to `MP_LOOKUP_PREFETCH_END`
Each span carries event metadata as span attributes (e.g. `device`, `stored_count`, `found_count`).
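The START/END pairing itself is simple to sketch. The function below is an illustration of the pattern only; the real subscriber builds OTel spans and would key open spans on a request id as well as the event name:

```python
def pair_spans(events):
    """Pair *_START / *_END events by base name into (name, duration) tuples.

    `events` is an iterable of (event_name, timestamp) pairs.
    """
    open_spans = {}
    spans = []
    for name, timestamp in events:
        if name.endswith("_START"):
            open_spans[name[: -len("_START")]] = timestamp
        elif name.endswith("_END"):
            base = name[: -len("_END")]
            start = open_spans.pop(base, None)
            if start is not None:  # ignore unmatched END events
                spans.append((base, timestamp - start))
    return spans
```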
View traces in any OTel-compatible backend such as Jaeger or Grafana Tempo.
```bash
# Start Jaeger all-in-one (OTLP gRPC on 4317)
docker run -d --name jaeger \
    -p 16686:16686 -p 4317:4317 \
    jaegertracing/all-in-one:latest

# Start LMCache with tracing
python3 -m lmcache.v1.multiprocess.server \
    --l1-size-gb 100 --eviction-policy LRU \
    --enable-tracing --otlp-endpoint http://localhost:4317
```