lmcache.server.server_storage_backend package#
Submodules#
lmcache.server.server_storage_backend.abstract_backend module#
- class lmcache.server.server_storage_backend.abstract_backend.LMSBackendInterface
- abstract close()
Perform any cleanup. Child classes should override this method if necessary.
- abstract get(key: str) Tensor | None
Retrieve the KV cache chunk by the given key.
- Input:
key: the key of the token chunk, including prefix hash and format
- Output:
the KV cache of the token chunk, in the format of a big tensor, or None if the key is not found
- abstract list_keys() List[str]
Retrieve the keys of the KV cache chunks held by the backend. This method takes no arguments.
- Output:
a list of keys (str), one per stored token chunk
- abstract put(key: str, kv_chunk_bytes: bytearray, blocking=True) None
Store the KV cache of the tokens into the cache server.
- Parameters:
key – the key of the token chunk, in the format of str
kv_chunk_bytes – the KV cache (bytearray) of the token chunk, in the format of a big tensor
blocking – whether to block the call until the operation is completed
- Returns:
None
Note
The KV cache should NOT have the “batch” dimension.
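The contract above can be illustrated with a small stand-alone sketch. It does not import lmcache: `BackendSketch` and `DictBackend` are hypothetical names that only mirror the documented method signatures, and the sketch stores raw bytes instead of tensors to stay dependency-free.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class BackendSketch(ABC):
    """Hypothetical mirror of the documented interface (not lmcache code)."""

    @abstractmethod
    def put(self, key: str, kv_chunk_bytes: bytearray, blocking: bool = True) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[bytearray]: ...

    @abstractmethod
    def list_keys(self) -> List[str]: ...

    @abstractmethod
    def close(self) -> None: ...


class DictBackend(BackendSketch):
    """Assumed in-memory implementation backed by a plain dict."""

    def __init__(self) -> None:
        self._store: Dict[str, bytearray] = {}

    def put(self, key, kv_chunk_bytes, blocking=True):
        # A dict write completes immediately, so `blocking` has no effect here.
        self._store[key] = kv_chunk_bytes

    def get(self, key):
        # Return None on a miss, as the interface documents.
        return self._store.get(key)

    def list_keys(self):
        return list(self._store.keys())

    def close(self):
        # Cleanup for this sketch simply drops everything that was stored.
        self._store.clear()


backend = DictBackend()
backend.put("chunk-0", bytearray(b"\x00\x01"))
hit = backend.get("chunk-0")
miss = backend.get("chunk-1")
keys = backend.list_keys()
backend.close()
```

Because every method is abstract, a subclass that omits any of the four cannot be instantiated, which is one way to read "children classes should override" in the interface description.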
lmcache.server.server_storage_backend.local_backend module#
- class lmcache.server.server_storage_backend.local_backend.LMSLocalBackend
Bases:
LMSBackendInterface
Cache engine for storing the KV cache of the tokens in local CPU/GPU memory.
- contains(key: str) bool
Check if the cache engine contains the key.
- Input:
key: the key of the token chunk, including prefix hash and format
- Returns:
True if the cache engine contains the key, False otherwise
- get(key: str) bytearray | None
Retrieve the KV cache chunk by the given key.
- Input:
key: the key of the token chunk, including prefix hash and format
- Output:
the KV cache of the token chunk as a bytearray, or None if the key is not found
- list_keys() List[str]
Retrieve the keys of the KV cache chunks held by the cache engine. This method takes no arguments.
- Output:
a list of keys (str), one per stored token chunk
- put(key: str, kv_chunk_bytes: bytearray, blocking: bool = True) None
Store the KV cache of the tokens into the cache engine.
- Input:
key: the key of the token chunk, including prefix hash and format
kv_chunk_bytes: the KV cache of the token chunk, in the format of bytearray
- Returns:
None
Note
The KV cache should NOT have the “batch” dimension.
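A caller-side pattern that follows from the `get`/`put` contract is "return on hit, store on miss". The sketch below is an illustration only: `StubBackend` is a hypothetical stand-in exposing the documented `contains`, `get`, and `put` methods, and `get_or_store` and `produce_chunk` are names invented for this example.

```python
from typing import Callable, Dict, Optional


class StubBackend:
    """Hypothetical stand-in exposing the documented contains/get/put methods."""

    def __init__(self) -> None:
        self._store: Dict[str, bytearray] = {}

    def contains(self, key: str) -> bool:
        return key in self._store

    def get(self, key: str) -> Optional[bytearray]:
        return self._store.get(key)

    def put(self, key: str, kv_chunk_bytes: bytearray, blocking: bool = True) -> None:
        self._store[key] = kv_chunk_bytes


def get_or_store(backend, key: str, produce: Callable[[], bytearray]) -> bytearray:
    """Return the cached chunk for `key`, producing and storing it on a miss."""
    cached = backend.get(key)  # None signals a miss
    if cached is not None:
        return cached
    chunk = produce()
    backend.put(key, chunk)
    return chunk


calls = []


def produce_chunk() -> bytearray:
    calls.append(1)  # record how often the producer runs
    return bytearray(b"kv-bytes")


backend = StubBackend()
first = get_or_store(backend, "chunk-0", produce_chunk)
second = get_or_store(backend, "chunk-0", produce_chunk)
```

The helper relies on `get` returning None for a missing key instead of calling `contains` first, so each lookup touches the store once.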
- class lmcache.server.server_storage_backend.local_backend.LMSLocalDiskBackend(path: str)
Bases:
LMSBackendInterface
Cache engine for storing the KV cache of the tokens on the local disk.
- contains(key: str) bool
Check if the cache engine contains the key.
- Input:
key: the key of the token chunk, including prefix hash and format
- Returns:
True if the cache engine contains the key, False otherwise
- get(key: str) bytes | None
Retrieve the KV cache chunk by the given key.
- Input:
key: the key of the token chunk, including prefix hash and format
- Output:
the KV cache of the token chunk as bytes, or None if the key is not found
- list_keys() List[str]
Retrieve the keys of the KV cache chunks held by the cache engine. This method takes no arguments.
- Output:
a list of keys (str), one per stored token chunk
- put(key: str, kv_chunk_bytes: bytearray, blocking: bool = True) None
Store the KV cache of the tokens into the cache engine.
- Input:
key: the key of the token chunk, including prefix hash and format
kv_chunk_bytes: the KV cache of the token chunk, in the format of bytearray
- Returns:
None
Note
The KV cache should NOT have the “batch” dimension.
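To make the disk-backed variant concrete, here is a minimal sketch of a file-per-key store. It is not the lmcache implementation: `DiskBackendSketch` is a hypothetical class, and both the hashed file names and the in-memory key index are assumptions chosen for this example.

```python
import hashlib
import os
import tempfile
from typing import Dict, List, Optional


class DiskBackendSketch:
    """Hypothetical file-per-key store; not the lmcache implementation."""

    def __init__(self, path: str) -> None:
        self.path = path
        os.makedirs(path, exist_ok=True)
        # key -> file path, kept in memory so list_keys() needs no directory scan
        self._files: Dict[str, str] = {}

    def _file_for(self, key: str) -> str:
        # Hash the key so any key string yields a valid file name (assumption).
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.path, name + ".bin")

    def contains(self, key: str) -> bool:
        return key in self._files

    def put(self, key: str, kv_chunk_bytes: bytearray, blocking: bool = True) -> None:
        # The write below is synchronous, so `blocking` is ignored in this sketch.
        target = self._file_for(key)
        with open(target, "wb") as f:
            f.write(kv_chunk_bytes)
        self._files[key] = target

    def get(self, key: str) -> Optional[bytes]:
        target = self._files.get(key)
        if target is None:
            return None
        with open(target, "rb") as f:
            return f.read()  # bytes, matching the documented return type

    def list_keys(self) -> List[str]:
        return list(self._files)


with tempfile.TemporaryDirectory() as tmp:
    disk = DiskBackendSketch(tmp)
    disk.put("chunk-0", bytearray(b"\x01\x02\x03"))
    loaded = disk.get("chunk-0")
    missing = disk.get("chunk-1")
    stored_keys = disk.list_keys()
```

Reading a file back yields immutable `bytes`, which is consistent with the documented `bytes | None` return type of `get` on the disk backend, whereas `put` accepts a `bytearray`.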