Internal API Server Metrics
LMCache metrics can also be retrieved through the internal API server.
Overview
The internal API server exposes Prometheus-compatible metrics endpoints in your LMCache deployment.
Quick Start Guide
Step 1: Enable Internal API Server
Set LMCACHE_INTERNAL_API_SERVER_ENABLED=true when launching your vLLM instance to enable the internal API server:
LMCACHE_INTERNAL_API_SERVER_ENABLED=true \
vllm serve "$model" \
--kv-transfer-config '{"kv_connector":"LMCacheConnectorV1", "kv_role":"kv_both"}'
Step 2: Access Metrics Endpoint
Retrieve metrics from the worker’s endpoint:
curl http://$IP:7000/metrics
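The endpoint returns Prometheus text-format output, which mixes sample lines with "# HELP" and "# TYPE" comment lines. The following is a minimal sketch of two shell helpers for working with that output. The function names metrics_url and metrics_samples are hypothetical and not part of LMCache, and the default port of 7000 is simply taken from the curl command above.

```shell
# Sketch only: these helper names are hypothetical, not part of LMCache.

# Print the metrics URL for a host and port.
# The host defaults to localhost; the port defaults to 7000, the worker
# port used in the curl command above.
metrics_url() {
  printf 'http://%s:%s/metrics\n' "${1:-localhost}" "${2:-7000}"
}

# Read Prometheus text-format output on stdin and keep only sample lines,
# dropping "# HELP" / "# TYPE" comment lines and blank lines.
metrics_samples() {
  grep -v -e '^#' -e '^[[:space:]]*$'
}
```

With these defined, a call such as `curl -s "$(metrics_url "$IP")" | metrics_samples` would print only the metric samples, which is convenient for a quick check that the server is reporting values.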
Port Configuration
If you do not set them explicitly, the following environment variables take their default values:
Environment Variable | Default Value | Description |
---|---|---|
| | Host address for the internal API server to bind to. |
| | Starting port number, e.g.: |
Therefore, the metrics endpoint curl command above uses port 7000.
Advanced Usage
For further testing and configuration options, including detailed examples and best practices, refer to Testing the Server.