In April 2016, I experimented with a Python change to avoid creating a temporary tuple when calling functions. Builtin functions were between 20% and 50% faster!
Sadly, some benchmarks were randomly slower. It would take me four months to understand why!
Work on benchmarks
For four months, I worked on making benchmarks more stable. See my previous blog posts:
- My journey to stable benchmark, part 1 (system) (May 21, 2016)
- My journey to stable benchmark, part 2 (deadcode) (May 22, 2016)
- My journey to stable benchmark, part 3 (average) (May 23, 2016)
- Visualize the system noise using perf and CPU isolation (June 16, 2016)
- Intel CPUs: P-state, C-state, Turbo Boost, CPU frequency, etc. (July 15, 2016)
- Intel CPUs (part 2): Turbo Boost, temperature, frequency and Pstate C0 bug (September 23, 2016)
- Analysis of a Python performance issue (November 19, 2016)
See also my talk "How to run a stable benchmark", which I gave at FOSDEM 2017 (Brussels, Belgium): slides + video. In it, I listed all the issues I had to solve to get reliable benchmarks.
Ask for permission
In August 2016, I confirmed that my change didn't introduce any slowdown, so I asked for permission on the python-dev mailing list to start pushing changes: "New calling convention to avoid temporarily tuples when calling functions".
Guido van Rossum asked me for benchmark results:
But is there a performance improvement?
On micro-benchmarks, FASTCALL is much faster:
- getattr(1, "real") becomes 44% faster
- list(filter(lambda x: x, list(range(1000)))) becomes 31% faster
- namedtuple.attr (read the attribute) becomes 23% faster
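As a rough illustration, micro-benchmarks like these can be timed with the standard timeit module. This is only a sketch: the Point namedtuple, the bench() helper and the loop counts are assumptions made for the example, not the setup used to get the numbers above, and timeit alone does not remove the system noise discussed in the earlier posts.

```python
import timeit
from collections import namedtuple

# Hypothetical namedtuple used only to exercise attribute access.
Point = namedtuple("Point", ["x", "y"])
point = Point(1, 2)

# The three statements mirror the micro-benchmarks listed above.
STATEMENTS = [
    'getattr(1, "real")',
    'list(filter(lambda x: x, list(range(1000))))',
    'point.x',
]


def bench(stmt, number=1000, repeat=3):
    """Return the best observed time for one execution of stmt, in seconds."""
    timings = timeit.repeat(
        stmt,
        globals={"point": point},  # make 'point' visible to the statement
        number=number,
        repeat=repeat,
    )
    # Take the minimum of the runs, which is the one least disturbed by noise.
    return min(timings) / number


results = {stmt: bench(stmt) for stmt in STATEMENTS}
for stmt, seconds in results.items():
    print(f"{stmt}: {seconds * 1e9:.0f} ns per call")
```

Running the same script on two Python builds (before and after the change) and comparing the per-call timings gives a first approximation of the speedup.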
On the CPython benchmark suite, I also saw many faster benchmarks:
- pickle_list: 1.29x faster
- etree_generate: 1.22x faster
- pickle_dict: 1.19x faster
- etree_process: 1.16x faster
- mako_v2: 1.13x faster
- telco: 1.09x faster
Replies to my email
I got two very positive replies, so I understood that it was OK to go ahead.
I just wanted to say I'm excited about this and I'm glad someone is taking advantage of what Argument Clinic allows for and what I know Larry had initially hoped AC would make happen!
Exceptional results, congrats Victor. Will be happy to help with code review.
That's how FASTCALL began for real! I started pushing a long series of patches that added new private functions and then modified existing code to call these new functions.