You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Prevent the compiler from merging computed-goto dispatches
When compiling the computed-goto interpreter, every opcode
implementation ends with an identical chunk of code, generated by the
`DISPATCH()` macro. In some cases, the compiler is able to notice
this, and replaces the code in one or more opcodes with a jump into
the tail portion of a different opcode.
However, we specifically **don't** want that to happen; the entire
premise of using computed gotos is to lift more information into the
instruction pointer in order to give the hardware branch-target-
predictor more information to work with! In my preliminary tests, this
tail-merging of opcode implementations explains most of the
performance improvement of the new tail-call interpreter (python#128718) --
compilers are much less willing to merge code across functions, and so
the tail-call interpreter preserves all (or at least more) of the
individual `DISPATCH` sites.
This change attempts to prevent the merging of `DISPATCH` calls, by
adding an (empty) `__asm__ volatile`, which acts as an opaque barrier
to the optimizer, preventing it from considering all of these
sequences as identical.
0 commit comments