This article tells the history of Python C API discussions over the last 4 years, and of the creation of C API projects: the pythoncapi website, the pythoncapi_compat.h header file and HPy (a new, clean C API). More and more people are aware of the issues caused by the C API and are working on solutions.
It took me a lot of iterations to find the right approach to evolve the C API without breaking too many third-party extension modules. My first ideas were based on having two APIs, with some kind of opt-in option. In the end, I decided to fix the default API directly, and helped maintainers of extension modules to update their projects for incompatible C API changes.
I wrote a pythoncapi_compat.h header file which adds C API functions of newer Python versions to old Python versions, as far back as Python 2.7. I also wrote an upgrade_pythoncapi.py script to add Python 3.10 support to an extension module without losing Python 2.7 support: the tool adds #include "pythoncapi_compat.h". For example, it replaces Py_TYPE(obj) = type with Py_SET_TYPE(obj, type).
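To illustrate the kind of rewrite involved, here is a minimal sketch that does not use Python.h. The names mock_type, mock_object, MOCK_TYPE and mock_set_type are hypothetical stand-ins for PyTypeObject, PyObject, Py_TYPE and Py_SET_TYPE; the sketch only shows why assigning through a getter macro ties callers to the structure layout, while an explicit setter does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyTypeObject and PyObject, so that the sketch
   compiles without Python.h. */
typedef struct mock_type { const char *name; } mock_type;
typedef struct mock_object { mock_type *ob_type; } mock_object;

/* Old style: the macro expands to the structure field itself, so it can be
   used as an assignment target: MOCK_TYPE(obj) = type; */
#define MOCK_TYPE(ob) ((ob)->ob_type)

/* New style: an explicit setter. Callers no longer need the getter to be an
   lvalue, so the getter is free to become a real function later. */
static inline void mock_set_type(mock_object *ob, mock_type *type)
{
    ob->ob_type = type;
}

/* Returns 1 if both styles leave the object in the same state. */
int demo_set_type(void)
{
    mock_type int_type = { "int" };
    mock_object a = { NULL };
    mock_object b = { NULL };

    MOCK_TYPE(&a) = &int_type;      /* before the rewrite */
    mock_set_type(&b, &int_type);   /* after the rewrite */

    return a.ob_type == b.ob_type;
}
```

Once no caller assigns through the getter, the getter can stop exposing the field.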
The photo: my cat attacking the Python C API.
Between 2016 and 2017, Larry Hastings worked on removing the GIL in a CPython fork called "The Gilectomy". He pushed the first commit in April 2016: Removed the GIL. Don't merge this! ("Few programs work now"). At EuroPython 2016, he gave the talk Larry Hastings - The Gilectomy, in which he explained that the current parallelism bottleneck is CPython reference counting, which doesn't scale with the number of threads.
It was just another hint telling me that "something" should be done to make the C API more abstract and to move away from implementation details like reference counting. PyPy has also had performance issues with the C API for many years.
In 2017, I talked with Eric Snow, who was working on subinterpreters. He had to modify public structures, especially the PyInterpreterState structure. He created the Include/internal/ subdirectory to hold a new "internal C API" which should not be exported. (Later, he moved the PyInterpreterState structure to the internal C API in Python 3.8.)
I started to discuss C API changes during the Python Language Summit (PyCon US 2017): "Python performance" slides (PDF):
- Split Include in sub-directories
- Move towards a stable ABI by default
See also the LWN article: Keeping Python competitive by Jake Edge.
July: first PEP draft
I proposed the first PEP draft to python-ideas: PEP: Hide implementation details in the C API.
The idea is to add an opt-in option to distutils to build an extension module with a new C API, remove implementation details from the new C API, and maybe later switch to the new C API by default.
I discussed my C API change ideas at the CPython core dev sprint (at Instagram, California). The ideas were liked by most (if not all) core developers who are fine with a minor performance slowdown (caused by replacing macros with function calls). I wrote A New C API for CPython blog post about these discussions.
I proposed Make the stable API-ABI usable on the python-dev list. The idea is to add PyTuple_GET_ITEM() (for example) to the limited C API but declared as a function call. Later, if enough extension modules are compatible with the extended limited C API, make it the default.
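The difference between the two flavours can be sketched without Python.h. Everything below is hypothetical (mock_tuple, MOCK_TUPLE_GET_ITEM, mock_tuple_get_item); it only illustrates that a macro compiles the field access into the extension module, whereas a function call keeps the structure layout on the runtime side.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuple layout standing in for PyTupleObject. */
typedef struct mock_tuple {
    size_t size;
    int items[3];
} mock_tuple;

/* Macro flavour: the field access is compiled into the extension module,
   so the structure layout becomes part of the ABI. */
#define MOCK_TUPLE_GET_ITEM(t, i) ((t)->items[(i)])

/* Function flavour: an extension only needs this prototype. */
int mock_tuple_get_item(const mock_tuple *t, size_t i);

/* In a real runtime this body would live in the interpreter, not in the
   extension, so the layout could change without recompiling callers. */
int mock_tuple_get_item(const mock_tuple *t, size_t i)
{
    return t->items[i];
}

/* Returns 1 if both flavours read the same item. */
int demo_tuple_get_item(void)
{
    mock_tuple t = { 3, { 10, 20, 30 } };
    return MOCK_TUPLE_GET_ITEM(&t, 1) == mock_tuple_get_item(&t, 1);
}
```

The price of the function flavour is one call per access, which is the kind of minor slowdown discussed at the sprint.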
In July, I created the pythoncapi website to collect issues with the current C API, list things to avoid in new functions (like borrowed references), and start to design a new, better C API.
In September, Antonio Cuni wrote the article Inside cpyext: Why emulating CPython C API is so Hard.
In February, I sent Update on CPython header files reorganization to the capi-sig list.
- Include/: limited C API
- Include/cpython/: CPython C API
- Include/internal/: CPython internal C API
In March, I modified the Python debug build to make its ABI compatible with the release build ABI: What’s New In Python 3.8: Debug build uses the same ABI as release build.
In May, I gave a lightning talk, Status of the stable API and ABI in Python 3.8, at the Language Summit (during PyCon US 2019):
- Convert macros to static inline functions
- Install the internal C API
- Debug build now ABI compatible with the release build ABI
- Getting rid of global variables
By the way, see my Split Include/ directory in Python 3.8 article: I converted many macros in Python 3.8.
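One commonly cited reason for converting a macro to a static inline function is that a macro may evaluate its argument more than once. The sketch below uses hypothetical names (MOCK_SQUARE_MACRO, mock_square_inline, next_value) and is a general C illustration, not code taken from CPython.

```c
#include <assert.h>

/* Hypothetical macro: the argument is expanded twice, so a side effect in
   the argument runs twice. */
#define MOCK_SQUARE_MACRO(x) ((x) * (x))

/* Static inline function: the argument is evaluated exactly once and has a
   well-defined type. */
static inline int mock_square_inline(int x)
{
    return x * x;
}

static int call_count = 0;

/* Helper with a visible side effect: it counts how often it is called. */
static int next_value(void)
{
    call_count++;
    return 3;
}

/* Number of times the argument is evaluated by the macro. */
int demo_macro_calls(void)
{
    call_count = 0;
    (void)MOCK_SQUARE_MACRO(next_value());
    return call_count;
}

/* Number of times the argument is evaluated by the inline function. */
int demo_inline_calls(void)
{
    call_count = 0;
    (void)mock_square_inline(next_value());
    return call_count;
}
```

Here demo_macro_calls() reports two evaluations and demo_inline_calls() reports one.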
In July, the HPy project was created during EuroPython in Basel. There was an informal meeting which included core developers of PyPy (Antonio, Armin and Ronan), CPython (Victor Stinner and Mark Shannon) and Cython (Stefan Behnel).
In December, Antonio, Armin and Ronan had a small internal sprint to kick-off the development of HPy: HPy kick-off sprint report
I proposed PEP: Modify the C API to hide implementation details on the python-dev list. The main idea is to provide a new optimized Python runtime which is backward incompatible on purpose, and continue to ship the regular runtime which is fully backward compatible.
I wrote PEP 620 -- Hide implementation details from the C API and proposed the PEP to python-dev. This PEP is my 3rd attempt to fix the C API: I rewrote it from scratch. Python now distributes a new pythoncapi_compat.h header and a process is defined to reduce the number of broken C extensions when introducing C API incompatible changes listed in this PEP.
I created the pythoncapi_compat project: header file providing new C API functions to old Python versions using static inline functions.
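The general shape of such a compatibility header can be sketched as follows. This is not the actual content of pythoncapi_compat.h: MOCK_VERSION_HEX, mock_object and mock_new_ref are hypothetical names, loosely modelled on a version check in the PY_VERSION_HEX style and on a helper like Py_NewRef().

```c
#include <assert.h>

/* Hypothetical stand-ins: a version number in the PY_VERSION_HEX style and
   a minimal reference-counted object. */
#define MOCK_VERSION_HEX 0x03060000   /* pretend we build against an old runtime */

typedef struct mock_object { long refcnt; } mock_object;

/* Only define the helper when the (pretend) runtime is too old to provide
   it and nothing else has defined it already. */
#if MOCK_VERSION_HEX < 0x030A0000 && !defined(mock_new_ref)
static inline mock_object *mock_new_ref_impl(mock_object *ob)
{
    ob->refcnt++;   /* take a new reference */
    return ob;      /* and hand the same object back */
}
#define mock_new_ref(ob) mock_new_ref_impl(ob)
#endif

/* Returns the reference count after taking one extra reference,
   or -1 if the helper returned a different object. */
long demo_new_ref(void)
{
    mock_object ob = { 1 };
    mock_object *same = mock_new_ref(&ob);
    return (same == &ob) ? ob.refcnt : -1;
}
```

Because the fallback is a static inline function in a header, an extension module can use the new name on every supported version without linking against anything extra.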
I wrote a new upgrade_pythoncapi.py script to add Python 3.10 support to an extension module without losing support for Python 2.7. I sent New script: add Python 3.10 support to your C extensions without losing Python 3.6 support to the capi-sig list.
The pythoncapi_compat project got its first users (bitarray, immutables, python-zstandard)! It proves that the project is useful and needed.
I collaborated with the HPy project to create a manifesto explaining how the C API prevents optimizing CPython and makes the CPython C API inefficient on PyPy. It is still a draft.