Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Maybe rewrite small functions like map in Python. #91

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
markshannon opened this issue Oct 5, 2021 · 3 comments
Closed

Maybe rewrite small functions like map in Python. #91

markshannon opened this issue Oct 5, 2021 · 3 comments

Comments

@markshannon
Copy link
Member

Consider map(func, seq), where func is a Python function.
This calls from Python into a builtin function, and then calls back into Python.

Once we have fast Python to Python calls, it might be beneficial to convert map and similar functions into Python.

E.g. consider the common case of a single iterable. map(f, i) is equivalent to (f(x) for x in i)

The above would allow us to avoid C calls, with the possibility of a speedup within the interpreter and the possibility of a larger speedup with a hypothetical JIT.

For maximum efficiency we might want to write map in Python.

def map1(func, seq):
     for x in seq:
          yield func(x)

compiles to:

              0 GEN_START                0

  2           2 LOAD_FAST                1 (seq)
              4 GET_ITER
        >>    6 FOR_ITER                 7 (to 22)
              8 STORE_FAST               2 (x)

  3          10 LOAD_FAST                0 (func)
             12 LOAD_FAST                2 (x)
             14 CALL_FUNCTION            1
             16 YIELD_VALUE
             18 POP_TOP
             20 JUMP_ABSOLUTE            3 (to 6)

  2     >>   22 LOAD_CONST               0 (None)
             24 RETURN_VALUE

Which has an loop of 8 instructions (7 allowing for superinstructions).

The question is:
Does the additional interpreter overhead of the dispatch cost more or less than the call to a C function and the call back into the interpreter?

My guess is that it with specialized CALL_FUNCTION and FOR_ITER, the Python version will be about a little faster than the C version if func is a Python function, and a bit slower if func is a builtin function.
But that is just guess.

@ericsnowcurrently
Copy link
Collaborator

The idea is definitely worth exploring, though maybe not with map() specifically. It returns a map instance rather than a generator. Also, I expect use of map() to diminish in favor of the recommended idiom: generator expressions. Perhaps all() or any() would be a better example for the idea?

@gvanrossum
Copy link
Collaborator

Though the docs for map() don't mention its type-ness -- it's listed under Builtin Functions and the text just says it returns an iterator. I'm sure it will break someone's code. But maybe specialization can help avoid that.

@markshannon
Copy link
Member Author

Using the code from https://docs.python.org/3/library/functions.html
all:

  2           0 LOAD_FAST                0 (iterable)
              2 GET_ITER
        >>    4 FOR_ITER                 7 (to 20)
              6 STORE_FAST               1 (element)

  3           8 LOAD_FAST                1 (element)
             10 POP_JUMP_IF_TRUE         9 (to 18)

  4          12 POP_TOP
             14 LOAD_CONST               1 (False)
             16 RETURN_VALUE

  3     >>   18 JUMP_ABSOLUTE            2 (to 4)

  5     >>   20 LOAD_CONST               2 (True)
             22 RETURN_VALUE

(Not sure why the POP_JUMP_IF_TRUE -> JUMP_ABSOLUTE isn't eliminated)
So it looks like all and any are better candidates than map.
If we manually optimise, then the inner loop is just two instructions:

              0 LOAD_FAST                0 (iterable)
              2 GET_ITER
        >>    4 FOR_ITER                 5 (to 14)
              6 POP_JUMP_IF_TRUE         2 (to 4)

              8 POP_TOP
             10 LOAD_CONST               1 (False)
             12 RETURN_VALUE

        >>   14 LOAD_CONST               2 (True)
             16 RETURN_VALUE

@faster-cpython faster-cpython locked and limited conversation to collaborators Dec 2, 2021
@faster-cpython faster-cpython unlocked this conversation Dec 2, 2021
@faster-cpython faster-cpython locked and limited conversation to collaborators Dec 2, 2021
@markshannon markshannon converted this issue into discussion #131 Dec 3, 2021
@gramster gramster moved this to Todo in Fancy CPython Board Jan 10, 2022
@gramster gramster moved this from Todo to Other in Fancy CPython Board Jan 10, 2022

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants