LMCache Plugin Framework#
The LMCache plugin system lets developers extend functionality by running custom scripts alongside LMCache processes. Plugins can currently be written in Python or Bash and are managed by the PluginLauncher class.
Key Use Cases#
Start metric reporters for centralized monitoring
Implement log reporters for log collection systems
Report process-level metrics to alerting systems
Implement health checks and service discovery
Run custom cache management operations
Configuration#
Plugins are configured through environment variables and configuration files:
Environment Variables:
- LMCACHE_PLUGIN_ROLE: Process role (e.g., SCHEDULER, WORKER)
- LMCACHE_PLUGIN_CONFIG: JSON string containing plugin configuration
- LMCACHE_PLUGIN_WORKER_ID: Current worker ID
- LMCACHE_PLUGIN_WORKER_COUNT: Total worker count in cluster
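As a rough illustration of how a plugin might read these variables defensively, the sketch below parses them into typed values. The helper name read_plugin_env and its fallback behaviour (None for missing or non-numeric values, an empty dict for missing or malformed JSON) are assumptions made for this example and are not part of LMCache.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional


def read_plugin_env(environ: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Hypothetical helper: read the LMCACHE_PLUGIN_* variables into typed values."""

    def to_int(name: str) -> Optional[int]:
        # Worker ID and count may be unset or empty; return None in that case.
        raw = environ.get(name, "").strip()
        return int(raw) if raw.isdigit() else None

    raw_config = environ.get("LMCACHE_PLUGIN_CONFIG", "")
    try:
        config = json.loads(raw_config) if raw_config else {}
    except json.JSONDecodeError:
        # Assumption for this sketch: fall back to an empty config on bad JSON.
        config = {}

    return {
        "role": environ.get("LMCACHE_PLUGIN_ROLE", "").upper() or None,
        "worker_id": to_int("LMCACHE_PLUGIN_WORKER_ID"),
        "worker_count": to_int("LMCACHE_PLUGIN_WORKER_COUNT"),
        "config": config,
    }
```

Passing a plain dict instead of os.environ makes the helper easy to exercise without touching the real process environment.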
Configuration File (lmcache.yaml):
plugin_locations: ["/path/to/plugins"]
extra_config:
  custom_setting: value
Plugin Naming Convention#
Plugin filenames determine execution targets:
Role-Specific Plugins:
- Format: <ROLE>[_<WORKER_ID>][_<DESCRIPTION>].<EXTENSION>
- Examples:
  - scheduler_foo_plugin.py: Runs only on SCHEDULER
  - worker_0_test.sh: Runs only on worker ID 0
  - all_plugin.sh: Runs on all workers
Notes:
- Role names are case-insensitive
- Worker ID must be numeric when specified
- To target a specific worker ID, the filename must have at least three parts separated by underscores (e.g., worker_<ID>_<DESCRIPTION>.ext). A file named worker_<DESCRIPTION>.ext will run on all workers.
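The naming rules above can be sketched as a small matching function. This is a hypothetical illustration and not the actual PluginLauncher logic: the function name should_run is invented here, and treating the all prefix as matching every role is an assumption based on the all_plugin.sh example.

```python
from pathlib import Path
from typing import Optional


def should_run(filename: str, role: str, worker_id: Optional[int]) -> bool:
    """Hypothetical check: does this plugin filename target the given process?"""
    parts = Path(filename).stem.split("_")

    # Role names are compared case-insensitively.
    target_role = parts[0].upper()
    if target_role != "ALL" and target_role != role.upper():
        return False

    # A worker ID is only recognised when the name has at least three
    # underscore-separated parts and the second part is numeric.
    if len(parts) >= 3 and parts[1].isdigit():
        return worker_id is not None and int(parts[1]) == worker_id

    # Otherwise the plugin applies to every process with a matching role.
    return True
```

For example, under these assumptions worker_0_test.sh matches only worker 0, while worker_test.sh matches every worker.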
Execution Model#
Interpreter Detection:
- Uses shebang line (e.g., #!/opt/venv/bin/python)
- Fallback interpreters: .py → python, .sh → bash
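One way to picture the interpreter detection described above is the following sketch. It is an assumption about how such a lookup could work, not LMCache's implementation; the names pick_interpreter and FALLBACKS are hypothetical.

```python
import shlex
from pathlib import Path
from typing import List, Optional

# Fallback interpreters by file extension, as listed above.
FALLBACKS = {".py": "python", ".sh": "bash"}


def pick_interpreter(path: str) -> Optional[List[str]]:
    """Hypothetical helper: return the interpreter argv for a plugin file."""
    plugin = Path(path)
    with plugin.open("r", encoding="utf-8", errors="replace") as fh:
        first_line = fh.readline().strip()

    # Prefer the shebang line when the file has one.
    if first_line.startswith("#!"):
        argv = shlex.split(first_line[2:])
        if argv:
            return argv

    # Otherwise fall back on the file extension; None means "unsupported".
    fallback = FALLBACKS.get(plugin.suffix.lower())
    return [fallback] if fallback else None
```

Returning a list rather than a single string keeps shebangs with arguments (such as /usr/bin/env python3) usable as the start of a subprocess command.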
Output Handling:
- Stdout/stderr captured continuously
- Logged with plugin name prefix
Process Management:
- Launched as subprocesses
- Terminated when parent process exits
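The output handling described above might look roughly like the sketch below, which launches a command as a subprocess and forwards each output line with a name prefix. The function name launch_with_prefix, the "[name] " prefix format, and the use of a background thread are assumptions for illustration; the real PluginLauncher may work differently.

```python
import subprocess
import sys
import threading
from typing import Callable, List, Tuple


def launch_with_prefix(
    cmd: List[str], name: str, emit: Callable[[str], None] = print
) -> Tuple[subprocess.Popen, threading.Thread]:
    """Hypothetical launcher: run cmd and emit each output line with a prefix."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,  # line-buffered reads on the parent side
    )

    def pump() -> None:
        assert proc.stdout is not None
        # Forward lines as they arrive until the child closes its output.
        for line in proc.stdout:
            emit(f"[{name}] {line.rstrip()}")

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return proc, thread


if __name__ == "__main__":
    child, reader = launch_with_prefix(
        [sys.executable, "-c", "print('hello from plugin')"], "demo"
    )
    child.wait()
    reader.join()
```

Note that a Python child writing to a pipe buffers its stdout by default, so a long-running plugin may need to flush its prints for lines to show up promptly.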
Example Plugins#
Python Plugin (scheduler_foo_plugin.py):
#!/opt/venv/bin/python
# SPDX-License-Identifier: Apache-2.0
"""Example plugin for LMCache system
This plugin runs continuously and exits when parent process terminates"""

# Standard
import json
import os
import signal
import sys
import time

# First Party
from lmcache.integration.vllm.utils import lmcache_get_config
from lmcache.v1.config import LMCacheEngineConfig


# Graceful exit handler
def handle_exit(signum, frame):
    print("Received termination signal, exiting...")
    sys.exit(0)


signal.signal(signal.SIGTERM, handle_exit)

# Read the context passed in by the launcher
role = os.getenv("LMCACHE_PLUGIN_ROLE")
worker_id = os.getenv("LMCACHE_PLUGIN_WORKER_ID")
worker_count = os.getenv("LMCACHE_PLUGIN_WORKER_COUNT")
config_str = os.getenv("LMCACHE_PLUGIN_CONFIG")
try:
    config = LMCacheEngineConfig.from_json(config_str)
except json.JSONDecodeError as e:
    # Fall back to the default LMCache config if the JSON is malformed
    print(f"Error parsing LMCACHE_PLUGIN_CONFIG: {e}")
    config = lmcache_get_config()

print(
    f"Python plugin running with role: {role}, worker_id: {worker_id}, "
    f"worker_count: {worker_count}"
)
print(f"Config: {config}")

# Main loop
loop_count = 0
while True:
    print(f"Scheduler plugin is running... (loop_count: {loop_count})")
    loop_count += 1
    time.sleep(10)
Bash Plugin (all_plugin.sh):
#!/bin/bash
# Example plugin for LMCache system
# This plugin runs continuously and exits when parent process terminates

# Handle termination signal
trap "echo 'Received termination signal, exiting...'; exit 0" SIGTERM

role="$LMCACHE_PLUGIN_ROLE"
worker_id="$LMCACHE_PLUGIN_WORKER_ID"
worker_count="$LMCACHE_PLUGIN_WORKER_COUNT"
config="$LMCACHE_PLUGIN_CONFIG"

echo "All plugin started for role: $role, worker ID: $worker_id, worker count: $worker_count"
echo "All plugin accept LMCache Config: $config"

loop_count=0
while true; do
    echo "All plugin is running for ${role} ${worker_id}...(loop_count: ${loop_count})"
    loop_count=$((loop_count + 1))
    sleep 10
done
Best Practices#
Keep plugins lightweight and efficient
Use descriptive naming conventions
Implement graceful error handling
Include a shebang line for portability
Validate configuration inputs
Add timeout mechanisms for long operations
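As one possible illustration of the timeout advice above, a plugin that shells out to another tool can bound the call with the timeout parameter of subprocess.run. The wrapper name run_with_timeout and the choice to return None on timeout are assumptions for this sketch.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], timeout_s: float) -> Optional[int]:
    """Hypothetical wrapper: return the exit code, or None if cmd timed out."""
    try:
        completed = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return None
    return completed.returncode


if __name__ == "__main__":
    # A quick command finishes normally; a slow one is cut off.
    print(run_with_timeout([sys.executable, "-c", "pass"], 30))
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5))
```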