Extending the HTTP API
You can add new endpoints to the lmcache server HTTP frontend without
modifying any existing code. An endpoint is just a Python module placed in
lmcache/v1/multiprocess/http_apis/ that exposes a FastAPI APIRouter;
HTTPAPIRegistry auto-discovers and mounts it at startup – the same
zero-modification pattern used by the L2 adapters.
How discovery works
At startup, http_server.py hands the FastAPI app to HTTPAPIRegistry
(lmcache/v1/multiprocess/http_api_registry.py), which scans the
http_apis/ directory with pkgutil, imports every module whose name ends
with _api, and includes any module-level router. The built-in modules
follow this pattern:
| Module | Endpoint | Method | Description |
|---|---|---|---|
| | | GET | Basic liveness check |
| | | GET | Kubernetes probe endpoint |
| | | POST | Force-clear the L1 cache |
| | | GET | Internal status report |
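The discovery step described above can be sketched with the standard library alone. This is a minimal illustration, not the actual HTTPAPIRegistry code: the function names discover_routers and mount_all are hypothetical, and the app argument is assumed to be anything with an include_router method (as a FastAPI app has).

```python
import importlib
import pkgutil
from types import ModuleType
from typing import Any, Iterator, List, Tuple


def discover_routers(package: ModuleType) -> Iterator[Tuple[str, Any]]:
    """Yield (module_name, router) for each *_api module in `package`.

    Hypothetical helper: mirrors the scan-import-collect flow described
    in the text, not the real registry implementation.
    """
    for info in pkgutil.iter_modules(package.__path__):
        if not info.name.endswith("_api"):
            continue  # only modules named *_api are considered
        module = importlib.import_module(f"{package.__name__}.{info.name}")
        router = getattr(module, "router", None)
        if router is not None:
            yield info.name, router


def mount_all(app: Any, package: ModuleType) -> List[str]:
    """Include every discovered router on `app`; return the module names."""
    mounted = []
    for name, router in discover_routers(package):
        app.include_router(router)  # same call a FastAPI app exposes
        mounted.append(name)
    return mounted
```

Under these assumptions, a module named helpers.py is skipped even if it defines a router, and a module named foo_api.py without a module-level router is imported but not mounted.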
Adding an endpoint
Create a file in lmcache/v1/multiprocess/http_apis/ whose name ends with
_api.py and expose a router:
# lmcache/v1/multiprocess/http_apis/metrics_api.py
# SPDX-License-Identifier: Apache-2.0
from fastapi import APIRouter, Request
from fastapi.responses import JSONResponse

router = APIRouter()


@router.get("/metrics")
async def metrics(request: Request):
    """Return cache hit/miss metrics."""
    engine = getattr(request.app.state, "engine", None)
    if engine is None:
        return JSONResponse(
            status_code=503,
            content={"error": "engine not initialized"},
        )
    # Placeholder values for illustration; a real handler would read
    # its counters from the engine.
    return {"hits": 42, "misses": 7}
That’s it – HTTPAPIRegistry discovers and mounts it on the next server
startup; no other file needs to change.
Module contract
An API module must:
- live in lmcache/v1/multiprocess/http_apis/ with a filename ending in _api.py;
- expose a module-level router of type fastapi.APIRouter.

An API module should:
- guard against uninitialized state by checking request.app.state.engine and returning 503 when it is None;
- use lmcache.logging.init_logger(__name__) for logging;
- use async handlers and avoid blocking I/O.
An API module must not import or mutate the app object from
http_server.py.