EIC
EIC (Elastic Instant Cache) is a distributed database designed for storing LLM KV cache. It supports RDMA and GDR (GPUDirect RDMA), and it provides distributed disaster tolerance and capacity expansion. You can understand the principles and architecture of EIC through these articles:
Deploy EIC
Visit the official console at https://console.volcengine.com/eic and deploy EIC KVCache on your compute cluster through the web UI. In addition, we provide a dedicated image on Volcano Engine that integrates various optimizations on top of the official image. You can use tests/v1/storage_backend/test_eic.py to verify connectivity to EIC.
Deploy Model With EIC
You can enable EIC KVCache offload through the official interface, for example:
export LMCACHE_CONFIG_FILE=/workspace/config/remote-eic.yaml
export LMCACHE_USE_EXPERIMENTAL=True
export VLLM_USE_V1=1
python3 -m vllm.entrypoints.openai.api_server \
... \
--kv-transfer-config '{"kv_connector":"LMCacheConnectorV1", "kv_role":"kv_both"}'
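The same launch can also be assembled from Python, which avoids shell quoting mistakes in the JSON passed to --kv-transfer-config. The following is a minimal sketch, not part of LMCache or vLLM: build_launch is a hypothetical helper, and the remaining server arguments (elided as "..." above) must be supplied by the caller through extra_args. It only builds the environment and command and does not start the server.

```python
import json
import os
import shlex


def build_launch(config_path, extra_args=()):
    """Return (env, cmd) mirroring the shell commands above.

    Hypothetical helper for illustration; extra_args stands in for the
    server arguments that the example above elides.
    """
    # Same three environment variables as the export lines above.
    env = dict(
        os.environ,
        LMCACHE_CONFIG_FILE=config_path,
        LMCACHE_USE_EXPERIMENTAL="True",
        VLLM_USE_V1="1",
    )
    # Serialize the connector settings with json.dumps instead of hand-quoting.
    kv_transfer_config = {"kv_connector": "LMCacheConnectorV1", "kv_role": "kv_both"}
    cmd = [
        "python3", "-m", "vllm.entrypoints.openai.api_server",
        *extra_args,
        "--kv-transfer-config", json.dumps(kv_transfer_config),
    ]
    return env, cmd


env, cmd = build_launch("/workspace/config/remote-eic.yaml")
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(cmd))
```

To actually start the server, the pair could be passed to subprocess.run(cmd, env=env) once the elided arguments are filled in.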
Example configuration file (remote-eic.yaml in the command above):
chunk_size: 256
remote_url: "eic://your-eic-endpoint"
eic_instance_id: "your-eic-instance-id"
eic_flag_file: "your-eic-config-path"
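Before launching, it can help to check that the configuration file contains the keys shown above. The sketch below is a hypothetical pre-flight check, not part of LMCache. It assumes all four keys from the example are required and that remote_url uses the eic:// scheme, as in the example; LMCache itself may apply defaults or accept other forms. To stay dependency-free it parses only flat "key: value" lines and is not a general YAML parser (inline comments, for instance, are not handled).

```python
# Keys taken from the example above; treating them all as required is an assumption.
REQUIRED_KEYS = ("chunk_size", "remote_url", "eic_instance_id", "eic_flag_file")


def parse_flat_yaml(text):
    """Parse flat 'key: value' lines into a dict of strings."""
    cfg = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        key, sep, value = line.partition(":")  # split at the first colon only
        if not sep:
            continue  # not a key/value line
        cfg[key.strip()] = value.strip().strip('"').strip("'")
    return cfg


def check_eic_config(cfg):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if not cfg.get(k)]
    url = cfg.get("remote_url", "")
    if url and not url.startswith("eic://"):
        problems.append("remote_url should start with eic://")
    chunk = cfg.get("chunk_size", "")
    if chunk and not chunk.isdigit():
        problems.append("chunk_size should be a positive integer")
    return problems


example = """chunk_size: 256
remote_url: "eic://your-eic-endpoint"
eic_instance_id: "your-eic-instance-id"
eic_flag_file: "your-eic-config-path"
"""
print(check_eic_config(parse_flat_yaml(example)))  # prints []
```

In practice the text would be read from the path in LMCACHE_CONFIG_FILE, and any reported problem would stop the launch early rather than surfacing as a runtime connection error.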
For more details, see https://www.volcengine.com/docs/85848/1749188.