Skip to content

High-level and low-level IR (lowering pass) #854

@JukkaL

Description

@JukkaL

Currently our IR has a mix of relatively high-level concepts (e.g. unboxing) and low-level concepts (e.g. integer tag checks). This restricts the sort of optimizations that are easy to implement, and it makes it hard to implement a non-C back end. Instead, we should probably have multiple levels of abstraction for IR. Two levels should be sufficient.

The table below has some examples of potential high-level ops and corresponding low-level ops (informal).

High-level Low-level
x == y (tagged ints) tag checks followed by low-level integer comparison or function call
x[y] (list get item) perform length check and check for negative index; read item from array
cast(str, x) read type of x and type flags; check type flag
unbox(tuple[int, int], x) check that x is a tuple; check tuple length; unbox items
isinstance(x, NoneType) read value of Py_None; pointer comparison against x

Various operations would be shared between the high-level and low-level representations, such as calls to primitives and register assignments. If having separate high-level and low-level IR representations for some operation doesn't help us generate better code, we may not need it.

Motivation

I think that these are the main benefits that justify having two abstraction levels:

  • Various optimizations are easier to implement with the high-level IR (see below for examples)
  • Code generation is easier and cleaner from the low-level IR, especially for non-C back ends

There are also some side benefits:

  • High-level IR is easier for humans to work with and to validate
    • Most irbuild tests could produce high-level IR for less verbose output
  • Optimizations and transformations may be faster using the high-level IR, since there will be fewer ops

Implementation

We can perhaps use the same Op hierarchy for both levels, at least initially. Low-level IR could only contain a subset of the available ops.

We'd add some new high-level op types, such as these:

  • Tagged integer arithmetic and comparisons
  • Runtime type checks (isinstance)

Pass structure (simplified):

  1. Generate high-level IR (same as the current main IR build visitor).
  2. Add reference counting and exception handling.
  3. Perform high-level optimizations.
  4. Translate to low-level IR.
  5. Perform low-level optimizations.
  6. Code generation.

The translation to low-level IR could be a little different for C and non-C back ends. The C back end could benefit from a somewhat higher abstraction level, as it would allow us to generate less verbose code.

High-level optimizations

Here are some high-impact optimizations that should benefit from the high-level IR:

  • Redundant cast removal
  • Redundant incref and decref removal
  • Common subexpression elimination

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementImprovements to existing logic or features.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions