LMCache Memory Interface
- class MemoryAllocatorInterface
- abstract allocate(shape: Size | Tuple[int, ...], dtype: dtype | None, fmt: MemoryFormat = MemoryFormat.UNDEFINED) → MemoryObj | None
Allocates the memory to hold a tensor of the given shape.
- Parameters:
shape (torch.Size) – The shape of the tensor to allocate.
dtype (torch.dtype) – The dtype of the tensor to allocate.
fmt (MemoryFormat) – The format of the memory to allocate.
- Returns:
A MemoryObj wrapping the allocated memory. Returns None if the allocation failed.
- Return type:
Optional[MemoryObj]
- abstract free(memory_obj: MemoryObj)
Frees the memory allocated for the given MemoryObj. Do not call this function directly; use ref_count_down to decrease the ref count instead.
- Parameters:
memory_obj (MemoryObj) – The MemoryObj to free.
- abstract get_ref_count(memory_obj: MemoryObj)
Gets the ref count for the given MemoryObj.
- Parameters:
memory_obj (MemoryObj) – The MemoryObj whose ref count is returned.
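The relationship between free, get_ref_count, and ref_count_down can be illustrated with a small, self-contained sketch. It does not use LMCache itself: ToyMemoryObj, ToyAllocator, ref_count_up, the initial ref count of 1, and the signature of ref_count_down are all assumptions made for illustration. The real classes may differ.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyMemoryObj:
    """Hypothetical stand-in for MemoryObj: a handle plus a size in bytes."""
    handle: int
    size: int


class ToyAllocator:
    """Hypothetical allocator that frees a block once its ref count hits zero."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0
        self._next_handle = 0
        self._ref_counts: Dict[int, int] = {}

    def allocate(self, size: int) -> Optional[ToyMemoryObj]:
        # Mirror the documented contract: return None if the allocation fails.
        if self.used + size > self.capacity:
            return None
        obj = ToyMemoryObj(self._next_handle, size)
        self._next_handle += 1
        self.used += size
        # Assumption: a fresh object starts with one reference (the caller's).
        self._ref_counts[obj.handle] = 1
        return obj

    def ref_count_up(self, obj: ToyMemoryObj) -> None:
        self._ref_counts[obj.handle] += 1

    def ref_count_down(self, obj: ToyMemoryObj) -> None:
        # Callers release references here instead of calling free() themselves.
        self._ref_counts[obj.handle] -= 1
        if self._ref_counts[obj.handle] == 0:
            self.free(obj)

    def get_ref_count(self, obj: ToyMemoryObj) -> int:
        # Objects that were already freed report a count of zero.
        return self._ref_counts.get(obj.handle, 0)

    def free(self, obj: ToyMemoryObj) -> None:
        # Only reached through ref_count_down in this sketch.
        del self._ref_counts[obj.handle]
        self.used -= obj.size
```

In this sketch the memory is returned to the pool only when the last holder calls ref_count_down, which is why calling free directly is discouraged: it would release memory that another holder may still be reading.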
- class BufferAllocator(device='cpu')
Bases: MemoryAllocatorInterface
Allocates memory in the pre-allocated pinned memory.
- allocate(shape: Size | Tuple[int, ...], dtype: dtype | None, fmt: MemoryFormat = MemoryFormat.BINARY_BUFFER) → BytesBufferMemoryObj
Allocates the memory to hold a tensor of the given shape.
- Parameters:
shape (torch.Size) – The shape of the tensor to allocate.
dtype (torch.dtype) – The dtype of the tensor to allocate.
fmt (MemoryFormat) – The format of the memory to allocate.
- Returns:
A MemoryObj wrapping the allocated memory. Returns None if the allocation failed.
- Return type:
Optional[MemoryObj]
- free(memory_obj: MemoryObj)
Frees the memory allocated for the given MemoryObj. Do not call this function directly; use ref_count_down to decrease the ref count instead.
- Parameters:
memory_obj (MemoryObj) – The MemoryObj to free.
- get_ref_count(memory_obj: MemoryObj)
Gets the ref count for the given MemoryObj.
- Parameters:
memory_obj (MemoryObj) – The MemoryObj whose ref count is returned.
- class GPUMemoryAllocator(size: int, device='cuda')
Bases: MemoryAllocatorInterface
Allocates memory from a pre-allocated buffer on the given device (cuda by default).
- allocate(shape: Size | Tuple[int, ...], dtype: dtype | None, fmt: MemoryFormat = MemoryFormat.KV_BLOB) → MemoryObj | None
Allocates the memory to hold a tensor of the given shape.
- Parameters:
shape (torch.Size) – The shape of the tensor to allocate.
dtype (torch.dtype) – The dtype of the tensor to allocate.
fmt (MemoryFormat) – The format of the memory to allocate.
- Returns:
A MemoryObj wrapping the allocated memory. Returns None if the allocation failed.
- Return type:
Optional[MemoryObj]
- free(memory_obj: MemoryObj)
Frees the memory allocated for the given MemoryObj. Do not call this function directly; use ref_count_down to decrease the ref count instead.
- Parameters:
memory_obj (MemoryObj) – The MemoryObj to free.
- get_ref_count(memory_obj: MemoryObj)
Gets the ref count for the given MemoryObj.
- Parameters:
memory_obj (MemoryObj) – The MemoryObj whose ref count is returned.
- class TensorMemoryAllocator(tensor: Tensor)
Bases: MemoryAllocatorInterface
Implements an “explicit list” memory allocator.
- ALIGN_BYTES = 512
- allocate(shape: Size | Tuple[int, ...], dtype: dtype | None, fmt: MemoryFormat = MemoryFormat.KV_BLOB) → TensorMemoryObj | None
Allocates the memory to hold a tensor of the given shape.
- Parameters:
shape (torch.Size) – The shape of the tensor to allocate.
dtype (torch.dtype) – The dtype of the tensor to allocate.
fmt (MemoryFormat) – The format of the memory to allocate.
- Returns:
A MemoryObj wrapping the allocated memory. Returns None if the allocation failed.
- Return type:
Optional[MemoryObj]
- free(memory_obj: MemoryObj)
Frees the memory allocated for the given MemoryObj. Do not call this function directly; use ref_count_down to decrease the ref count instead.
- Parameters:
memory_obj (MemoryObj) – The MemoryObj to free.
- get_ref_count(memory_obj: MemoryObj)
Gets the ref count for the given MemoryObj.
- Parameters:
memory_obj (MemoryObj) – The MemoryObj whose ref count is returned.
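To make the “explicit list” idea and the role of ALIGN_BYTES more concrete, here is a minimal sketch of a first-fit allocator that tracks free blocks in an explicit list and rounds every request up to a 512-byte boundary. It is a generic illustration of the technique, not the TensorMemoryAllocator implementation: FreeListSketch, align_up, the first-fit policy, and the coalescing strategy are all assumptions, and the sketch manages plain byte offsets rather than a tensor.

```python
from typing import List, Optional, Tuple

ALIGN_BYTES = 512  # same value as the documented TensorMemoryAllocator.ALIGN_BYTES


def align_up(nbytes: int, align: int = ALIGN_BYTES) -> int:
    """Round nbytes up to the next multiple of align."""
    return (nbytes + align - 1) // align * align


class FreeListSketch:
    """Hypothetical first-fit allocator over the byte range [0, capacity)."""

    def __init__(self, capacity: int) -> None:
        # Free blocks stored as (start, size), kept sorted by start offset.
        self.free_blocks: List[Tuple[int, int]] = [(0, capacity)]

    def allocate(self, nbytes: int) -> Optional[Tuple[int, int]]:
        size = align_up(nbytes)
        for i, (start, block_size) in enumerate(self.free_blocks):
            if block_size >= size:
                if block_size == size:
                    # Exact fit: the free block is consumed entirely.
                    self.free_blocks.pop(i)
                else:
                    # Split: hand out the front, keep the remainder free.
                    self.free_blocks[i] = (start + size, block_size - size)
                return (start, size)
        # No block is large enough, so the allocation fails.
        return None

    def free(self, block: Tuple[int, int]) -> None:
        # Reinsert the block, then merge neighbours that touch each other.
        self.free_blocks.append(block)
        self.free_blocks.sort()
        merged: List[Tuple[int, int]] = []
        for start, size in self.free_blocks:
            if merged and merged[-1][0] + merged[-1][1] == start:
                prev_start, prev_size = merged[-1]
                merged[-1] = (prev_start, prev_size + size)
            else:
                merged.append((start, size))
        self.free_blocks = merged
```

With this scheme, a request for a float16 tensor of shape (2, 100), which needs 400 bytes, would occupy one full 512-byte block. Merging adjacent free blocks on release limits fragmentation, at the cost of a sort on every free in this simplified version.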