LMCache Memory Interface

class MemoryAllocatorInterface

abstract allocate(shape: Size | Tuple[int, ...], dtype: dtype | None, fmt: MemoryFormat = MemoryFormat.UNDEFINED) → MemoryObj | None

Allocates the memory to hold a tensor of the given shape.

Parameters:

shape (torch.Size) – The shape of the tensor to allocate.
dtype (torch.dtype) – The dtype of the tensor to allocate.
fmt (MemoryFormat) – The format of the memory to allocate.

Returns:

A MemoryObj wrapping the allocated memory. Returns None if the allocation failed.

Return type:

Optional[MemoryObj]

abstract free(memory_obj: MemoryObj)

Frees the memory allocated for the given MemoryObj. Do not call this function directly; call ref_count_down to decrease the reference count instead.

Parameters:

memory_obj (MemoryObj) – The MemoryObj to free.

abstract get_ref_count(memory_obj: MemoryObj)

Gets the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj to query.

abstract ref_count_down(memory_obj: MemoryObj)

Decreases the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj whose reference count is decreased.

abstract ref_count_up(memory_obj: MemoryObj)

Increases the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj whose reference count is increased.
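The interface above pairs allocation with reference counting: callers release memory through ref_count_down rather than calling free themselves. The sketch below illustrates that lifecycle with a toy, pure-Python allocator. ToyAllocator and ToyMemoryObj are hypothetical names, not part of LMCache, and allocate takes an itemsize instead of a torch dtype so the example runs without PyTorch. That a new object starts with a count of 1, and that reaching 0 triggers free, are assumptions made for illustration; the reference above does not state either.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(eq=False)
class ToyMemoryObj:
    """Hypothetical stand-in for MemoryObj: a byte buffer plus its shape."""
    data: bytearray
    shape: Tuple[int, ...]


class ToyAllocator:
    """Hypothetical allocator that mirrors the interface's method names."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self._ref_counts: Dict[int, int] = {}

    def allocate(self, shape: Tuple[int, ...], itemsize: int = 2) -> Optional[ToyMemoryObj]:
        nbytes = itemsize
        for dim in shape:
            nbytes *= dim
        if self.used + nbytes > self.capacity:
            return None  # allocation failed; the caller must handle None
        self.used += nbytes
        obj = ToyMemoryObj(bytearray(nbytes), tuple(shape))
        self._ref_counts[id(obj)] = 1  # assumption: new objects start at 1
        return obj

    def get_ref_count(self, obj: ToyMemoryObj) -> int:
        return self._ref_counts[id(obj)]

    def ref_count_up(self, obj: ToyMemoryObj) -> None:
        self._ref_counts[id(obj)] += 1

    def ref_count_down(self, obj: ToyMemoryObj) -> None:
        self._ref_counts[id(obj)] -= 1
        if self._ref_counts[id(obj)] == 0:
            self.free(obj)  # assumption: the last release frees the memory

    def free(self, obj: ToyMemoryObj) -> None:
        # Not meant to be called by users; reached through ref_count_down.
        self.used -= len(obj.data)
        del self._ref_counts[id(obj)]


allocator = ToyAllocator(capacity_bytes=1024)
obj = allocator.allocate((4, 8))            # 4 * 8 * 2 = 64 bytes
if obj is not None:
    allocator.ref_count_up(obj)             # a second holder takes a reference
    allocator.ref_count_down(obj)           # first holder releases
    allocator.ref_count_down(obj)           # last holder releases; memory is freed
too_big = allocator.allocate((1024, 1024))  # exceeds capacity, so this is None
```

Routing every release through ref_count_down means the allocator decides when memory is actually reclaimed, which is why the reference discourages calling free directly.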

class BufferAllocator(device='cpu')

Bases: MemoryAllocatorInterface

Allocates memory in the pre-allocated pinned memory.

allocate(shape: Size | Tuple[int, ...], dtype: dtype | None, fmt: MemoryFormat = MemoryFormat.BINARY_BUFFER) → BytesBufferMemoryObj

Allocates the memory to hold a tensor of the given shape.

Parameters:

shape (torch.Size) – The shape of the tensor to allocate.
dtype (torch.dtype) – The dtype of the tensor to allocate.
fmt (MemoryFormat) – The format of the memory to allocate.

Returns:

A MemoryObj wrapping the allocated memory. Returns None if the allocation failed.

Return type:

Optional[MemoryObj]

free(memory_obj: MemoryObj)

Frees the memory allocated for the given MemoryObj. Do not call this function directly; call ref_count_down to decrease the reference count instead.

Parameters:

memory_obj (MemoryObj) – The MemoryObj to free.

get_ref_count(memory_obj: MemoryObj)

Gets the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj to query.

memcheck()

ref_count_down(memory_obj: MemoryObj)

Decreases the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj whose reference count is decreased.

ref_count_up(memory_obj: MemoryObj)

Increases the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj whose reference count is increased.

class GPUMemoryAllocator(size: int, device='cuda')

Bases: MemoryAllocatorInterface

Allocates memory in the pre-allocated GPU memory.

allocate(shape: Size | Tuple[int, ...], dtype: dtype | None, fmt: MemoryFormat = MemoryFormat.KV_BLOB) → MemoryObj | None

Allocates the memory to hold a tensor of the given shape.

Parameters:

shape (torch.Size) – The shape of the tensor to allocate.
dtype (torch.dtype) – The dtype of the tensor to allocate.
fmt (MemoryFormat) – The format of the memory to allocate.

Returns:

A MemoryObj wrapping the allocated memory. Returns None if the allocation failed.

Return type:

Optional[MemoryObj]

free(memory_obj: MemoryObj)

Frees the memory allocated for the given MemoryObj. Do not call this function directly; call ref_count_down to decrease the reference count instead.

Parameters:

memory_obj (MemoryObj) – The MemoryObj to free.

get_ref_count(memory_obj: MemoryObj)

Gets the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj to query.

memcheck()

ref_count_down(memory_obj: MemoryObj)

Decreases the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj whose reference count is decreased.

ref_count_up(memory_obj: MemoryObj)

Increases the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj whose reference count is increased.

class TensorMemoryAllocator(tensor: Tensor)

Bases: MemoryAllocatorInterface

Implements an “explicit list” memory allocator.

ALIGN_BYTES = 512
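The reference does not say how ALIGN_BYTES is applied. A common use for such a constant is rounding each requested size up to the next multiple of the alignment, and the sketch below assumes exactly that. The helper names raw_size_bytes and aligned_size_bytes are hypothetical, and the element size is passed as a plain integer instead of a torch dtype so the example has no dependencies.

```python
import math
from typing import Sequence

ALIGN_BYTES = 512  # value documented on TensorMemoryAllocator


def raw_size_bytes(shape: Sequence[int], itemsize: int) -> int:
    """Bytes needed for a dense tensor of the given shape and element size."""
    return math.prod(shape) * itemsize


def aligned_size_bytes(shape: Sequence[int], itemsize: int, align: int = ALIGN_BYTES) -> int:
    """Round the raw size up to the next multiple of `align` (assumed behavior)."""
    raw = raw_size_bytes(shape, itemsize)
    return (raw + align - 1) // align * align


small = aligned_size_bytes((2, 3, 5), itemsize=2)    # 60 raw bytes, padded to 512
exact = aligned_size_bytes((2, 256, 8), itemsize=2)  # 8192 raw bytes, already aligned
```

Under this assumption, small requests each occupy at least one full 512-byte unit, trading some padding for block offsets that stay aligned.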

allocate(shape: Size | Tuple[int, ...], dtype: dtype | None, fmt: MemoryFormat = MemoryFormat.KV_BLOB) → TensorMemoryObj | None

Allocates the memory to hold a tensor of the given shape.

Parameters:

shape (torch.Size) – The shape of the tensor to allocate.
dtype (torch.dtype) – The dtype of the tensor to allocate.
fmt (MemoryFormat) – The format of the memory to allocate.

Returns:

A MemoryObj wrapping the allocated memory. Returns None if the allocation failed.

Return type:

Optional[MemoryObj]

free(memory_obj: MemoryObj)

Frees the memory allocated for the given MemoryObj. Do not call this function directly; call ref_count_down to decrease the reference count instead.

Parameters:

memory_obj (MemoryObj) – The MemoryObj to free.

get_ref_count(memory_obj: MemoryObj)

Gets the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj to query.

memcheck()

For debugging purposes. Returns True if everything is fine, otherwise False.
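To make the “explicit list” idea and the role of a consistency check concrete, here is a minimal sketch of a first-fit allocator that tracks free blocks in an explicit, address-sorted list and merges neighbors on free. ExplicitListSketch is a hypothetical class, not LMCache's implementation. The source only says that memcheck returns True when everything is fine, so the specific invariants checked here (sizes add up, free blocks are sorted, non-overlapping and fully merged) are assumptions.

```python
from typing import Dict, List, Optional, Tuple


class ExplicitListSketch:
    """Hypothetical first-fit allocator over the byte range [0, total)."""

    def __init__(self, total: int) -> None:
        self.total = total
        # Free blocks as (start, size), kept sorted by start address.
        self.free_blocks: List[Tuple[int, int]] = [(0, total)]
        self.allocated: Dict[int, int] = {}  # start -> size

    def allocate(self, size: int) -> Optional[int]:
        if size <= 0:
            return None
        for i, (start, block_size) in enumerate(self.free_blocks):
            if block_size >= size:
                if block_size == size:
                    self.free_blocks.pop(i)
                else:
                    # Carve the request off the front of the free block.
                    self.free_blocks[i] = (start + size, block_size - size)
                self.allocated[start] = size
                return start
        return None  # no free block is large enough

    def free(self, start: int) -> None:
        size = self.allocated.pop(start)
        self.free_blocks.append((start, size))
        self.free_blocks.sort()
        # Coalesce free blocks that touch each other.
        merged: List[Tuple[int, int]] = []
        for s, sz in self.free_blocks:
            if merged and merged[-1][0] + merged[-1][1] == s:
                merged[-1] = (merged[-1][0], merged[-1][1] + sz)
            else:
                merged.append((s, sz))
        self.free_blocks = merged

    def memcheck(self) -> bool:
        """Return True if the bookkeeping is consistent (assumed invariants)."""
        free_total = sum(sz for _, sz in self.free_blocks)
        used_total = sum(self.allocated.values())
        if free_total + used_total != self.total:
            return False
        prev_end = -1
        for s, sz in self.free_blocks:
            # Each free block must start strictly after the previous one ends;
            # equality would mean two adjacent blocks were left unmerged.
            if sz <= 0 or s <= prev_end:
                return False
            prev_end = s + sz
        return True


pool = ExplicitListSketch(1024)
first = pool.allocate(512)
second = pool.allocate(256)
pool.free(first)
pool.free(second)
healthy = pool.memcheck()
```

Keeping the free list sorted by address makes coalescing a single linear pass, and the same ordering lets the check detect overlapping or unmerged blocks cheaply.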

ref_count_down(memory_obj: MemoryObj)

Decreases the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj whose reference count is decreased.

ref_count_up(memory_obj: MemoryObj)

Increases the reference count of the given MemoryObj.

Parameters:

memory_obj (MemoryObj) – The MemoryObj whose reference count is increased.
