-
Notifications
You must be signed in to change notification settings - Fork 49
Description
Currently our IR has a mix of relatively high-level concepts (e.g. unboxing) and low-level concepts (e.g. integer tag checks). This restricts the sort of optimizations that are easy to implement, and it makes it hard to implement a non-C back end. Instead, we should probably have multiple levels of abstraction for IR. Two levels should be sufficient.
The table below has some examples of potential high-level ops and corresponding low-level ops (informal).
High-level | Low-level |
---|---|
x == y (tagged ints) |
tag checks followed by low-level integer comparison or function call |
x[y] (list get item) |
perform length check and check for negative index; read item from array |
cast(str, x) |
read type of x and type flags; check type flag |
unbox(tuple[int, int], x) |
check that x is a tuple; check tuple length; unbox items |
isinstance(x, NoneType) |
read value of Py_None; pointer comparison against x |
Various operations would be shared between the high-level and low-level representations, such as calls to primitives and register assignments. If having separate high-level and low-level IR representations for some operation doesn't help us generate better code, we may not need it.
Motivation
I think that these are the main benefits that justify having two abstraction levels:
- Various optimizations are easier to implement with the high-level IR (see below for examples)
- Code generation is easier and cleaner from the low-level IR, especially for non-C back ends
There are also some side benefits:
- High-level IR is easier for humans to work with and to validate
- Most irbuild tests could produce high-level IR for less verbose output
- Optimizations and transformations may be faster using the high-level IR, since there will be fewer ops
Implementation
We can perhaps use the same Op
hierarchy for both levels, at least initially. Low-level IR could only contain a subset of the available ops.
We'd add some new high-level op types, such as these:
- Tagged integer arithmetic and comparisons
- Runtime type checks (
isinstance
)
Pass structure (simplified):
- Generate high-level IR (same as the current main IR build visitor).
- Add reference counting and exception handling.
- Perform high-level optimizations.
- Translate to low-level IR.
- Perform low-level optimizations.
- Code generation.
The translation to low-level IR could be a little different for C and non-C back ends. The C back end could benefit from a somewhat higher abstraction level, as it would allow us to generate less verbose code.
High-level optimizations
Here are some high-impact optimizations that should benefit from the high-level IR:
- Redundant cast removal
- Redundant incref and decref removal
- Common subexpression elimination