Mooncake Store#
An L2 adapter backed by the native C++ Mooncake Store connector. Uses Mooncake for high-performance distributed KV cache storage with RDMA support.
When Mooncake is configured with "protocol": "rdma", LMCache must also
have a valid contiguous L1 memory region available. The distributed storage
manager passes this L1 memory descriptor to the adapter factory automatically
in MP mode. If the descriptor is missing or invalid, adapter creation fails
with ValueError instead of silently falling back to a non-RDMA path.
Prerequisites – Building with Mooncake support:
The Mooncake extension is not built by default. You must explicitly enable it:
BUILD_MOONCAKE=1 pip install -e . --verbose
The BUILD_MOONCAKE environment variable controls compilation:
BUILD_MOONCAKE=1: Enable the Mooncake C++ extension.BUILD_MOONCAKE=0: Force disable (highest priority), even ifMOONCAKE_INCLUDE_DIRis set.Not set: Falls back to checking
MOONCAKE_INCLUDE_DIRfor backward compatibility. IfMOONCAKE_INCLUDE_DIRis also unset, the extension is skipped.
If the Mooncake headers are not installed in the system include path
(e.g., /usr/local/include), you must point to them explicitly:
BUILD_MOONCAKE=1 \
MOONCAKE_INCLUDE_DIR=/path/to/mooncake/include \
MOONCAKE_LIB_DIR=/path/to/mooncake/lib \
pip install -e . --verbose
LMCache-specific fields:
num_workers: Number of C++ worker threads for the shared pool (default4, must be > 0).per_op_workers(dict[str, int], optional): A dict mapping lane keys to dedicated worker thread counts. Supported keys:"lookup"— threads forEXISTSoperations."retrieve"— threads forGET/ load operations."store"— threads forSET/ put operations."delete"— threads forDELETEoperations.
Operations whose lane key is not present in the dict use the shared
num_workerspool. There is no requirement to set all keys — you can configure only the lanes that need dedicated pools.
Mooncake fields:
All other keys in the JSON config (except type, num_workers,
per_op_workers, and eviction) are forwarded as-is to Mooncake’s
setup_internal(ConfigDict). Refer to the
Mooncake documentation
for available setup keys (e.g., local_hostname,
metadata_server, master_server_addr, protocol,
rdma_devices, global_segment_size).
Configuration example:
# Shared pool (default)
--l2-adapter '{
"type": "mooncake_store",
"num_workers": 4,
"local_hostname": "node01",
"metadata_server": "http://localhost:8080/metadata",
"master_server_addr": "localhost:50051",
"protocol": "tcp",
"local_buffer_size": "3221225472",
"global_segment_size": "3221225472"
}'
# Per-operation pools (GET-heavy workload)
--l2-adapter '{
"type": "mooncake_store",
"per_op_workers": {
"lookup": 2,
"retrieve": 16,
"store": 4
},
"local_hostname": "node01",
"metadata_server": "http://localhost:8080/metadata",
"master_server_addr": "localhost:50051",
"protocol": "tcp"
}'
For full Mooncake setup instructions (master service, metadata server, etc.), see Mooncake .
RDMA notes:
protocol: "rdma"requires a valid LMCache L1 memory descriptor.When using
protocol: "rdma", it is recommended to disable lazy L1 allocation with--no-l1-use-lazyso the L1 buffer is fully allocated before Mooncake registers it.protocol: "tcp"does not require L1 preregistration.If Mooncake RDMA initialization fails at adapter creation time, verify that LMCache L1 memory is enabled and that the descriptor has a non-zero pointer and size.