LMCache Plugin Framework

The LMCache plugin system lets developers extend LMCache by running custom scripts alongside LMCache processes. Plugins can currently be written in Python or Bash and are managed by the PluginLauncher class.

Key Use Cases

  • Start metric reporters for centralized monitoring

  • Implement log reporters for log collection systems

  • Report process-level metrics to alerting systems

  • Implement health checks and service discovery

  • Run custom cache management operations

Configuration

Plugins are configured through environment variables and configuration files:

Environment Variables:

  • LMCACHE_PLUGIN_ROLE: Process role (e.g., SCHEDULER, WORKER)

  • LMCACHE_PLUGIN_CONFIG: JSON string containing plugin configuration

  • LMCACHE_PLUGIN_WORKER_ID: Current worker ID

  • LMCACHE_PLUGIN_WORKER_COUNT: Total worker count in the cluster

Configuration File (lmcache.yaml):

plugin_locations: ["/path/to/plugins"]
extra_config:
  custom_setting: value
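
A plugin will usually want to read these variables defensively, since any of them may be missing or malformed. The following is a minimal sketch of one way to do that. PluginEnv and read_plugin_env are hypothetical names, not part of LMCache, and the sketch assumes that LMCACHE_PLUGIN_CONFIG decodes to a JSON object.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass
class PluginEnv:
    """Hypothetical container for the values passed to a plugin."""

    role: Optional[str]
    worker_id: Optional[int]
    worker_count: Optional[int]
    config: dict = field(default_factory=dict)


def _to_int(value: Optional[str]) -> Optional[int]:
    """Return value as an int, or None if it is missing or not numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def read_plugin_env(env: Mapping[str, str] = os.environ) -> PluginEnv:
    """Collect the LMCACHE_PLUGIN_* variables without raising on bad input."""
    try:
        config = json.loads(env.get("LMCACHE_PLUGIN_CONFIG") or "{}")
    except json.JSONDecodeError:
        config = {}
    # Assumption: the configuration is a JSON object; anything else is dropped.
    if not isinstance(config, dict):
        config = {}
    return PluginEnv(
        role=env.get("LMCACHE_PLUGIN_ROLE"),
        worker_id=_to_int(env.get("LMCACHE_PLUGIN_WORKER_ID")),
        worker_count=_to_int(env.get("LMCACHE_PLUGIN_WORKER_COUNT")),
        config=config,
    )
```

Passing the environment as a mapping keeps the function easy to test: call read_plugin_env() inside a plugin, or read_plugin_env({...}) with a plain dictionary in a unit test.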

Plugin Naming Convention

Plugin filenames determine execution targets:

Role-Specific Plugins:

Format: <ROLE>[_<WORKER_ID>][_<DESCRIPTION>].<EXTENSION>

Examples:

  • scheduler_foo_plugin.py: Runs only on SCHEDULER

  • worker_0_test.sh: Runs only on worker ID 0

  • all_plugin.sh: Runs on all workers

Notes:

  • Role names are case-insensitive.

  • The worker ID must be numeric when specified.

  • To target a specific worker ID, the filename must have at least three parts separated by underscores (e.g., worker_<ID>_<DESCRIPTION>.ext). A file named worker_<DESCRIPTION>.ext will run on all workers.
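
The naming rules can be sketched as a small matcher. This illustrates the convention as described here and is not the PluginLauncher implementation. parse_plugin_name and should_run are hypothetical names, and treating the all prefix as matching every role is an assumption based on the all_plugin.sh example.

```python
from pathlib import Path
from typing import Optional, Tuple


def parse_plugin_name(filename: str) -> Tuple[str, Optional[int]]:
    """Split a plugin filename into (ROLE, worker_id or None)."""
    parts = Path(filename).stem.split("_")
    role = parts[0].upper()  # role names are case-insensitive
    worker_id = None
    # A worker ID only counts when the name has at least three parts
    # and the second part is numeric.
    if len(parts) >= 3 and parts[1].isdecimal():
        worker_id = int(parts[1])
    return role, worker_id


def should_run(filename: str, role: str, worker_id: Optional[int]) -> bool:
    """Decide whether a plugin file targets the given process."""
    target_role, target_id = parse_plugin_name(filename)
    # Assumption: the "all" prefix matches every role.
    if target_role != "ALL" and target_role != role.upper():
        return False
    return target_id is None or target_id == worker_id
```

For example, should_run("worker_0_test.sh", "WORKER", 1) is False, while should_run("worker_test.sh", "WORKER", 1) is True because the two-part name carries no worker ID.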

Execution Model

  1. Interpreter Detection:

    • Uses the shebang line (e.g., #!/opt/venv/bin/python)

    • Fallback interpreters by file extension:

      • .py: python

      • .sh: bash

  2. Output Handling:

    • Stdout/stderr captured continuously

    • Logged with plugin name prefix

  3. Process Management:

    • Launched as subprocesses

    • Terminated when parent process exits
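
The three steps above can be approximated in a few lines of standard-library Python. This is a rough sketch of the described behavior, not the actual PluginLauncher code. detect_interpreter and launch_plugin are hypothetical names, and using atexit to stop the child is only one possible way to tie its lifetime to the parent.

```python
import atexit
import subprocess
import threading
from pathlib import Path
from typing import List, Optional

# Fallback interpreters by extension, as listed above.
FALLBACK_INTERPRETERS = {".py": "python", ".sh": "bash"}


def detect_interpreter(path: str) -> Optional[List[str]]:
    """Return the command prefix for a plugin: shebang first, then fallback."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        first_line = f.readline().strip()
    if first_line.startswith("#!"):
        command = first_line[2:].split()
        if command:
            return command
    fallback = FALLBACK_INTERPRETERS.get(Path(path).suffix.lower())
    return [fallback] if fallback else None


def launch_plugin(path: str) -> subprocess.Popen:
    """Start a plugin as a subprocess and echo its output with a name prefix."""
    interpreter = detect_interpreter(path)
    if interpreter is None:
        raise ValueError(f"No interpreter found for {path}")
    proc = subprocess.Popen(
        interpreter + [path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
    )

    def pump() -> None:
        # Read line by line until the plugin closes its output.
        for line in proc.stdout:
            print(f"[{Path(path).name}] {line.rstrip()}")

    threading.Thread(target=pump, daemon=True).start()
    # Assumption: send SIGTERM to the plugin when this process exits normally.
    atexit.register(proc.terminate)
    return proc
```

Reading the output on a daemon thread keeps the parent responsive while the plugin runs, and the SIGTERM sent at exit is what the signal handlers in the example plugins are written to catch.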

Example Plugins

Python Plugin (scheduler_foo_plugin.py):

#!/opt/venv/bin/python
# SPDX-License-Identifier: Apache-2.0
"""Example plugin for LMCache system
This plugin runs continuously and exits when parent process terminates"""

# Standard
import json
import os
import signal
import sys
import time

# First Party
from lmcache.integration.vllm.utils import lmcache_get_config
from lmcache.v1.config import LMCacheEngineConfig


# Graceful exit handler
def handle_exit(signum, frame):
    print("Received termination signal, exiting...")
    sys.exit(0)


signal.signal(signal.SIGTERM, handle_exit)

# Values passed to the plugin through environment variables
role = os.getenv("LMCACHE_PLUGIN_ROLE")
worker_id = os.getenv("LMCACHE_PLUGIN_WORKER_ID")
worker_count = os.getenv("LMCACHE_PLUGIN_WORKER_COUNT")
config_str = os.getenv("LMCACHE_PLUGIN_CONFIG")
try:
    config = LMCacheEngineConfig.from_json(config_str)
except json.JSONDecodeError as e:
    # Fall back to the default configuration lookup if the JSON is invalid
    print(f"Error parsing LMCACHE_PLUGIN_CONFIG: {e}")
    config = lmcache_get_config()

print(
    f"Python plugin running with role: {role}, worker_id: {worker_id}, "
    f"worker_count: {worker_count}"
)
print(f"Config: {config}")

# Main loop
loop_count = 0
while True:
    # Flush so each line reaches the parent's pipe without buffering delay
    print(
        f"Scheduler plugin is running... (loop_count: {loop_count})",
        flush=True,
    )
    loop_count += 1
    time.sleep(10)

Bash Plugin (all_plugin.sh):

#!/bin/bash
# Example plugin for LMCache system
# This plugin runs continuously and exits when parent process terminates

# Handle termination signal
trap "echo 'Received termination signal, exiting...'; exit 0" SIGTERM

# Values passed to the plugin through environment variables
role="$LMCACHE_PLUGIN_ROLE"
worker_id="$LMCACHE_PLUGIN_WORKER_ID"
worker_count="$LMCACHE_PLUGIN_WORKER_COUNT"
config="$LMCACHE_PLUGIN_CONFIG"

echo "All plugin started for role: $role, worker ID: $worker_id, worker count: $worker_count"
echo "All plugin received LMCache config: $config"

# Main loop
loop_count=0
while true; do
    echo "All plugin is running for ${role} ${worker_id}... (loop_count: ${loop_count})"
    loop_count=$((loop_count + 1))
    sleep 10
done

Best Practices

  1. Keep plugins lightweight and efficient

  2. Use descriptive naming conventions

  3. Implement graceful error handling

  4. Include shebang for portability

  5. Validate configuration inputs

  6. Add timeout mechanisms for long operations
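
As an illustration of the last point, a plugin that shells out to another tool can bound the call with the timeout argument of subprocess.run. This is a minimal sketch: run_with_timeout is a hypothetical helper name, and returning None on failure is just one possible convention.

```python
import subprocess
import sys
from typing import List, Optional


def run_with_timeout(cmd: List[str], seconds: float) -> Optional[str]:
    """Run cmd and return its stdout, or None if it fails or times out."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds, check=True
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        print(f"Timed out after {seconds}s: {cmd}", file=sys.stderr)
        return None
    except (subprocess.CalledProcessError, OSError) as e:
        print(f"Command failed: {e}", file=sys.stderr)
        return None
    return result.stdout
```

Returning None instead of raising lets a long-running plugin loop log the failure and carry on to the next iteration.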