5.2. 2023-09-08 Python buffer protocol
Notes on the Python buffer documentation page: https://docs.python.org/3/c-api/buffer.html.
5.2.1. Introduction
Many Python objects operate on large contiguous blocks of memory, or buffers. It is often necessary to pass such objects between libraries without copying the data, and this is what the buffer protocol is for: it lets an object expose the structure of its buffer and give access to a pointer to the buffer data.
The protocol can expose data either read-only or with write access.
For example, the buffer interface makes it possible to pass large arrays from Python to a C extension module, or to write objects to files.
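As a small illustration from the Python side, the sketch below uses the built-in memoryview, which obtains its data through the buffer protocol. The variable names are made up for this example. It shows a writable view that shares memory with a bytearray, a read-only view over bytes, and a view passed directly to a file-like object.

```python
import io

# A bytearray exports a writable buffer; memoryview wraps it without copying.
data = bytearray(b"hello world")
view = memoryview(data)
view[0] = ord("H")            # writes through to the original object
print(data)                   # the bytearray itself has changed

# bytes exports a read-only buffer, so writing through the view fails.
ro_view = memoryview(b"abc")
try:
    ro_view[0] = 0
    write_failed = False
except TypeError:
    write_failed = True
print(ro_view.readonly, write_failed)

# File-like objects accept any buffer exporter, not just bytes.
sink = io.BytesIO()
sink.write(view[:5])          # slicing a memoryview does not copy either
print(sink.getvalue())
```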
5.2.2. Buffer structure
A buffer is not a PyObject * but a plain data structure, Py_buffer, with many members, such as:
void *buf: a pointer to the start of the logical structure described by the buffer; this can be any location within the memory block owned by the exporting object
Py_ssize_t len: total size of the buffer in bytes (number of elements multiplied by itemsize)
int readonly: flag indicating whether the buffer is read-only (nonzero) or may be modified (zero)
Py_ssize_t itemsize: size of one element of the buffer in bytes
int ndim: number of dimensions when the buffer is an \(n\)-dimensional array; if it is 0, the buffer holds a single scalar
Py_ssize_t *shape: array of length ndim giving the size of each dimension of the \(n\)-dimensional array
Py_ssize_t *strides: array of length ndim in which each element gives the number of bytes to skip to reach the next element along the corresponding dimension
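The fields listed above can be inspected from Python, because memoryview exposes most of them as attributes (nbytes corresponds to len; the raw buf pointer is not exposed). The following is a minimal sketch, assuming an 8-byte native double; the names flat, view and fields are chosen only for this example.

```python
import array

# Six doubles stored in one contiguous block.
flat = array.array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# memoryview.cast needs a byte format on one side, so go through "B" first,
# then reinterpret the same memory as a 2 x 3 array of doubles.
view = memoryview(flat).cast("B").cast("d", shape=[2, 3])

# memoryview attributes that mirror the Py_buffer members.
fields = {
    "len": view.nbytes,        # total size in bytes
    "readonly": view.readonly,
    "itemsize": view.itemsize,
    "ndim": view.ndim,
    "shape": view.shape,
    "strides": view.strides,   # in bytes, not in elements
}
for name, value in fields.items():
    print(f"{name:9} {value}")

# Multi-dimensional indexing uses shape and strides under the hood.
last = view[1, 2]
```

Note that no data is copied at any point: flat and view refer to the same memory.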
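When working with buffers, it is possible to query whether a buffer is contiguous (for example with PyBuffer_IsContiguous) and which memory layout it uses: C style (row-major, the last index varies fastest) or Fortran style (column-major, the first index varies fastest).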
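From Python, the same questions can be asked through the contiguous, c_contiguous and f_contiguous attributes of memoryview. The sketch below is only an illustration with made-up names: a one-dimensional view, a 2 x 3 reshaping of it, and a slice that skips every other byte.

```python
base = memoryview(bytes(range(6)))       # 1-D, six single-byte items
grid = base.cast("B", shape=[2, 3])      # same memory seen as 2 x 3, C layout
every_other = base[::2]                  # items 0, 2, 4: stride of 2 bytes

for name, view in [("base", base), ("grid", grid), ("every_other", every_other)]:
    print(name, view.shape, view.strides,
          "C" if view.c_contiguous else "-",
          "F" if view.f_contiguous else "-",
          "contiguous" if view.contiguous else "not contiguous")

# A non-contiguous view can still be copied into a contiguous bytes object.
packed = every_other.tobytes()
```

A one-dimensional contiguous buffer counts as both C and Fortran contiguous, while the 2 x 3 view is contiguous in C order only.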
5.2.3. Complex arrays
5.2.3.1. NumPy arrays
NumPy arrays can be described completely with four properties: itemsize, ndim, shape, and strides.
If strides is NULL, the array is interpreted as a standard contiguous \(n\)-dimensional array with C memory layout.
If strides is not NULL (recall that strides are given in bytes), then, for example, item [i, j] of a two-dimensional array with elements of type double can be accessed via:
char *ptr_i_j = (char *) buf + i * strides[0] + j * strides[1];
double item = *(double *) ptr_i_j;
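The same offset arithmetic can be tried out in pure Python on a raw bytes object. The sketch below is an illustration, not part of the C API: item_at is a hypothetical helper, and the two buffers hold the same 2 x 3 array of doubles once in C order and once in Fortran order, so that only the strides differ.

```python
import struct

def item_at(buf, strides, i, j, fmt="d"):
    """Read item [i, j] by computing its byte offset from the strides."""
    offset = i * strides[0] + j * strides[1]
    return struct.unpack_from(fmt, buf, offset)[0]

rows, cols = 2, 3
itemsize = struct.calcsize("d")
values = [[0.0, 1.0, 2.0],
          [10.0, 11.0, 12.0]]

# C layout: rows are stored one after another.
c_buf = struct.pack("6d", *(values[i][j] for i in range(rows) for j in range(cols)))
c_strides = (cols * itemsize, itemsize)

# Fortran layout: columns are stored one after another.
f_buf = struct.pack("6d", *(values[i][j] for j in range(cols) for i in range(rows)))
f_strides = (itemsize, rows * itemsize)

# The same logical item is found in both layouts.
from_c = item_at(c_buf, c_strides, 1, 2)
from_f = item_at(f_buf, f_strides, 1, 2)
print(from_c, from_f)
```

The consumer code is identical for both layouts; only the strides supplied by the exporter change.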