False start

In April 2016, I experimented with a Python change to avoid creating a temporary tuple to call functions. Builtin functions were between 20% and 50% faster!

Sadly, some benchmarks were randomly slower. It took me four months to understand why!

Work on benchmarks

For four months, I worked on making benchmarks more stable. See my previous blog posts:

See also my talk How to run a stable benchmark that I gave at FOSDEM 2017 (Brussels, Belgium): slides + video. I listed all the issues that I had to solve to get reliable benchmarks.

Ask for permission

In August 2016, I confirmed that my change didn't introduce any slowdown. So I asked for permission on the python-dev mailing list to start pushing changes: New calling convention to avoid temporarily tuples when calling functions.

Guido van Rossum asked me for benchmark results:

But is there a performance improvement?

Benchmark results

On micro-benchmarks, FASTCALL is much faster:

  • getattr(1, "real") becomes 44% faster
  • list(filter(lambda x: x, list(range(1000)))) becomes 31% faster
  • namedtuple.attr (read the attribute) becomes 23% faster
  • ...
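To get a feel for this kind of micro-benchmark, here is a minimal sketch using the standard timeit module. It is not the tool or the setup used for the numbers above: the bench() helper, the Point namedtuple and the loop counts are assumptions chosen for illustration, and timings measured this way are noisy unless the machine is tuned for benchmarking.

```python
import timeit

def bench(stmt, setup="pass", number=100_000, repeat=5):
    """Hypothetical helper: best time per call in nanoseconds.

    Taking the minimum of several runs reduces (but does not remove)
    the noise coming from the system.
    """
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e9

# Statements taken from the list above; Point is an illustrative namedtuple.
results = {
    'getattr(1, "real")': bench('getattr(1, "real")'),
    "list(filter(lambda x: x, list(range(1000))))": bench(
        "list(filter(lambda x: x, list(range(1000))))", number=1_000
    ),
    "namedtuple.attr": bench(
        "p.attr",
        setup="from collections import namedtuple; "
              "Point = namedtuple('Point', 'attr'); p = Point(1)",
    ),
}

for stmt, ns in results.items():
    print(f"{stmt}: {ns:.1f} ns per call")
```

Running the same script on two Python builds (with and without the change) and comparing the per-call timings gives a rough idea of the speedup, as long as each run is repeated enough to rule out random variation.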

Full results:

On the CPython benchmark suite, I also saw many faster benchmarks:

  • pickle_list: 1.29x faster
  • etree_generate: 1.22x faster
  • pickle_dict: 1.19x faster
  • etree_process: 1.16x faster
  • mako_v2: 1.13x faster
  • telco: 1.09x faster
  • ...

Replies to my email

I got two very positive replies, so I understood that it was OK to go ahead.

Brett Cannon:

I just wanted to say I'm excited about this and I'm glad someone is taking advantage of what Argument Clinic allows for and what I know Larry had initially hoped AC would make happen!

Yury Selivanov:

Exceptional results, congrats Victor. Will be happy to help with code review.

Real start

That's how FASTCALL began for real! I started to push a long series of patches adding new private functions, and then modified code to call these new functions.