
In the Python C API, I dislike APIs modifying immutable objects such as _PyBytes_Resize(). I designed a whole new PyBytesWriter API to replace this _PyBytes_Resize() function. As usual in Python, it took multiple iterations and one year to design the API and to reach an agreement.
Picture: The Secret World of Arrietty by Hayao Miyazaki.
Original private _PyBytesWriter API
In 2016 (Python 3.6), I designed a private _PyBytesWriter API to create bytes objects in an efficient way, especially by overallocating a buffer. See my article Fast _PyAccu, _PyUnicodeWriter and _PyBytesWriter APIs to produce strings in CPython about this API (and other similar APIs).
In July 2023 (Python 3.13), I moved the private _PyBytesWriter API to the internal C API. See the article Remove private C API functions.
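To make the overallocation idea above concrete, here is a minimal sketch in plain C. It is not the CPython implementation: the toy_writer type, its functions and the 25% growth factor are all hypothetical names and values chosen for illustration. It only shows why reserving extra room on each growth reduces the number of realloc() calls when many small chunks are appended.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy writer, NOT the CPython _PyBytesWriter. */
typedef struct {
    char *data;        /* heap buffer */
    size_t size;       /* bytes written so far */
    size_t allocated;  /* capacity of data */
    int reallocs;      /* number of realloc() calls, for illustration */
} toy_writer;

static void
toy_writer_init(toy_writer *w)
{
    w->data = NULL;
    w->size = 0;
    w->allocated = 0;
    w->reallocs = 0;
}

/* Append len bytes. If overallocate is non-zero, reserve 25% more than
   needed (an arbitrary factor for this sketch) so that later appends
   often fit without another realloc().
   Returns 0 on success, -1 on memory error.
   Integer overflow checks are omitted to keep the sketch short. */
static int
toy_writer_write(toy_writer *w, const void *bytes, size_t len,
                 int overallocate)
{
    if (len == 0) {
        return 0;
    }
    size_t needed = w->size + len;
    if (needed > w->allocated) {
        size_t new_alloc = needed;
        if (overallocate) {
            new_alloc += needed / 4;
        }
        char *new_data = realloc(w->data, new_alloc);
        if (new_data == NULL) {
            return -1;
        }
        w->data = new_data;
        w->allocated = new_alloc;
        w->reallocs++;
    }
    memcpy(w->data + w->size, bytes, len);
    w->size = needed;
    return 0;
}

static void
toy_writer_clear(toy_writer *w)
{
    free(w->data);
    toy_writer_init(w);
}
```

Appending one byte at a time without overallocation calls realloc() on every append, whereas the overallocating variant only reallocates when the reserved room is exhausted.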
First public API attempt
In June 2024, Marc-Andre Lemburg asked to make the private _PyBytesWriter API public.
In July, I wrote a first public API attempt: PR gh-121726. API:
PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size, char **str)
PyObject* PyBytesWriter_Finish(PyBytesWriter *writer, char *str)
void PyBytesWriter_Discard(PyBytesWriter *writer)
int PyBytesWriter_Prepare(PyBytesWriter *writer, char **str, Py_ssize_t size)
int PyBytesWriter_WriteBytes(PyBytesWriter *writer, char **str, const void *bytes, Py_ssize_t size)
Example creating the string "abc":
PyObject*
create_abc(void)
{
    char *str;
    PyBytesWriter *writer = PyBytesWriter_Create(3, &str);
    if (writer == NULL) {
        return NULL;
    }
    memcpy(str, "abc", 3);
    str += 3;
    return PyBytesWriter_Finish(writer, str);
}
The API also provided PyBytesWriter_Prepare(writer, &str, size) to preallocate the buffer.
The implementation was fully based on the private structure:
typedef struct {
    PyObject *buffer;
    Py_ssize_t allocated;
    Py_ssize_t min_size;
    int use_bytearray;
    int overallocate;
    int use_small_buffer;
    char small_buffer[512];
} _PyBytesWriter;
In August, I created a C API Working Group decision issue. Sadly, this API didn't convince the C API WG, which found the Prepare() API confusing and the str variable hard to use.
In October, I closed the decision issue:
It seems like this API is too low-level and too error-prone. I prefer to abandon promoting this API as a public API for now. We can revisit this API later if needed.
Second public API attempt
In February 2025, I tried a second public API: issue gh-129813 and PR gh-129814. API:
void* PyBytesWriter_Create(PyBytesWriter **writer, Py_ssize_t alloc)
void PyBytesWriter_Discard(PyBytesWriter *writer)
PyObject* PyBytesWriter_Finish(PyBytesWriter *writer, void *buf)
void* PyBytesWriter_Extend(PyBytesWriter *writer, void *buf, Py_ssize_t extend)
void* PyBytesWriter_WriteBytes(PyBytesWriter *writer, void *buf, const void *bytes, Py_ssize_t size)
void* PyBytesWriter_Format(PyBytesWriter *writer, void *buf, const char *format, ...)
The API now uses void* instead of char* for the buffer, and I added a PyBytesWriter_Format() function.
Example creating the string "abc":
PyObject*
create_abc(void)
{
    PyBytesWriter *writer;
    char *buf = PyBytesWriter_Create(&writer, 3);
    if (buf == NULL) {
        return NULL;
    }
    memcpy(buf, "abc", 3);
    buf += 3;
    return PyBytesWriter_Finish(writer, buf);
}
The API is similar to the first version, but PyBytesWriter_Create() now returns a void* (the buffer) instead of the PyBytesWriter*, which is returned through the first parameter.
The API also provided buf = PyBytesWriter_Extend(writer, buf, str_size) to preallocate the buffer.
The implementation now uses a new, dedicated and simpler structure (fewer members):
struct PyBytesWriter {
    char small_buffer[256];
    PyObject *obj;
    Py_ssize_t size;
    int use_bytearray;
};
This time, I followed Petr Viktorin's advice and created a discussion on Discourse. Again, other developers did not like the API and found it confusing.
In March, I gave up again, and closed my PR:
It seems like most developers are confused by the API which requires to pass writer and buf to most functions. I abandon this API.
Third public API: PEP 782
Following Antoine Pitrou's link, I had a look at the Arrow C++ BufferBuilder API. Antoine helped me to design a better API that uses a size and no longer needs the void *buf parameter.
At the end of March, I wrote PEP 782 – Add PyBytesWriter C API and created a new discussion on the PEP.
Example creating the string "abc":
PyObject *
create_abc(void)
{
    PyBytesWriter *writer = PyBytesWriter_Create(3);
    if (writer == NULL) {
        return NULL;
    }
    char *buf = PyBytesWriter_GetData(writer);
    memcpy(buf, "abc", 3);
    return PyBytesWriter_Finish(writer);
}
The API provides PyBytesWriter_Resize(writer, size) to preallocate the buffer. The size is now absolute (the new total size of the buffer), rather than relative (a number of bytes to add).
The mandatory void *buf parameter was replaced with the PyBytesWriter_GetData() function.
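The difference between an absolute and a relative size can be sketched with a toy buffer in plain C. This is not the PyBytesWriter implementation: toy_buf, toy_buf_resize() and toy_buf_grow() are hypothetical names used only to illustrate the two conventions, and why the data pointer has to be fetched again after a resize.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy buffer, only to illustrate absolute vs relative sizes. */
typedef struct {
    char *data;
    size_t size;
} toy_buf;

/* Absolute: set the buffer size to exactly new_size bytes.
   Returns 0 on success, -1 on memory error. */
static int
toy_buf_resize(toy_buf *b, size_t new_size)
{
    /* Never ask realloc() for 0 bytes: its behavior is not portable. */
    char *p = realloc(b->data, new_size ? new_size : 1);
    if (p == NULL) {
        return -1;   /* b->data is left untouched */
    }
    b->data = p;     /* the address may have changed: re-read b->data */
    b->size = new_size;
    return 0;
}

/* Relative: add extra bytes to the current size. */
static int
toy_buf_grow(toy_buf *b, size_t extra)
{
    return toy_buf_resize(b, b->size + extra);
}
```

With this sketch, a 5-byte buffer reaches 11 bytes either through toy_buf_resize(&b, 11) or through toy_buf_grow(&b, 6). In both cases, a pointer saved before the call may no longer be valid afterwards.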
In May, I submitted the PEP to the Steering Council. In September, the Steering Council approved PEP 782! (Yeah, it took them 4 months to make a decision.)
Final API
PyBytesWriter* PyBytesWriter_Create(Py_ssize_t size)
void PyBytesWriter_Discard(PyBytesWriter *writer)
PyObject* PyBytesWriter_Finish(PyBytesWriter *writer)
PyObject* PyBytesWriter_FinishWithSize(PyBytesWriter *writer, Py_ssize_t size)
PyObject* PyBytesWriter_FinishWithPointer(PyBytesWriter *writer, void *buf)
void* PyBytesWriter_GetData(PyBytesWriter *writer)
Py_ssize_t PyBytesWriter_GetSize(PyBytesWriter *writer)
int PyBytesWriter_WriteBytes(PyBytesWriter *writer, const void *bytes, Py_ssize_t size)
int PyBytesWriter_Format(PyBytesWriter *writer, const char *format, ...)
int PyBytesWriter_Resize(PyBytesWriter *writer, Py_ssize_t size)
int PyBytesWriter_Grow(PyBytesWriter *writer, Py_ssize_t size)
void* PyBytesWriter_GrowAndUpdatePointer(PyBytesWriter *writer, Py_ssize_t size, void *buf)
See the documentation.
Implementation
In September, I implemented the PyBytesWriter API in the main branch (future Python 3.15) with documentation and tests.
I also modified code using soft deprecated APIs, PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize(), to use the new PyBytesWriter API instead. When doing these conversions, I ran benchmarks to check that there is no significant impact on performance. Examples of benchmarks:
For example, I abandoned these two changes:
Later, other people joined the party and found other opportunities to use PyBytesWriter, with great optimizations: