UnderstandingTheIRStructure.md 10.5 KB

Raw Blame History Permalink



Understanding the IR Structure

The MLIR Language Reference describes the
High Level Structure, this document
illustrates this structure through examples, and introduces at the same time the
C++ APIs involved in manipulating it.

We will implement a pass that traverses any
MLIR input and prints the entity inside the IR. A pass (or in general almost any
piece of IR) is always rooted with an operation. Most of the time the top-level
operation is a ModuleOp, the MLIR PassManager is actually limited to
operation on a top-level ModuleOp. As such a pass starts with an operation,
and so will our traversal:

  void runOnOperation() override {
    Operation *op = getOperation();
    resetIndent();
    printOperation(op);
  }


Traversing the IR Nesting

The IR is recursively nested, an Operation can have one or multiple nested
Regions, each of which is actually a list of Blocks, each of which itself
wraps a list of Operations. Our traversal will follow this structure with
three methods: printOperation(), printRegion(), and printBlock().

The first method inspects the properties of an operation, before iterating on
the nested regions and print them individually:

  void printOperation(Operation *op) {
    // Print the operation itself and some of its properties
    printIndent() << "visiting op: '" << op->getName() << "' with "
                  << op->getNumOperands() << " operands and "
                  << op->getNumResults() << " results\n";
    // Print the operation attributes
    if (!op->getAttrs().empty()) {
      printIndent() << op->getAttrs().size() << " attributes:\n";
      for (NamedAttribute attr : op->getAttrs())
        printIndent() << " - '" << attr.first << "' : '" << attr.second
                      << "'\n";
    }

    // Recurse into each of the regions attached to the operation.
    printIndent() << " " << op->getNumRegions() << " nested regions:\n";
    auto indent = pushIndent();
    for (Region &region : op->getRegions())
      printRegion(region);
  }


A Region does not hold anything other than a list of Blocks:

  void printRegion(Region &region) {
    // A region does not hold anything by itself other than a list of blocks.
    printIndent() << "Region with " << region.getBlocks().size()
                  << " blocks:\n";
    auto indent = pushIndent();
    for (Block &block : region.getBlocks())
      printBlock(block);
  }


Finally, a Block has a list of arguments, and holds a list of Operations:

  void printBlock(Block &block) {
    // Print the block intrinsics properties (basically: argument list)
    printIndent()
        << "Block with " << block.getNumArguments() << " arguments, "
        << block.getNumSuccessors()
        << " successors, and "
        // Note, this `.size()` is traversing a linked-list and is O(n).
        << block.getOperations().size() << " operations\n";

    // A block main role is to hold a list of Operations: let's recurse into
    // printing each operation.
    auto indent = pushIndent();
    for (Operation &op : block.getOperations())
      printOperation(&op);
  }


The code for the pass is available
here in the repo
and can be exercised with mlir-opt -test-print-nesting.


Example

The Pass introduced in the previous section can be applied on the following IR
with mlir-opt -test-print-nesting -allow-unregistered-dialect
llvm-project/mlir/test/IR/print-ir-nesting.mlir:

"module"() ( {
  %0:4 = "dialect.op1"() {"attribute name" = 42 : i32} : () -> (i1, i16, i32, i64)
  "dialect.op2"() ( {
    "dialect.innerop1"(%0#0, %0#1) : (i1, i16) -> ()
  },  {
    "dialect.innerop2"() : () -> ()
    "dialect.innerop3"(%0#0, %0#2, %0#3)[^bb1, ^bb2] : (i1, i32, i64) -> ()
  ^bb1(%1: i32):  // pred: ^bb0
    "dialect.innerop4"() : () -> ()
    "dialect.innerop5"() : () -> ()
  ^bb2(%2: i64):  // pred: ^bb0
    "dialect.innerop6"() : () -> ()
    "dialect.innerop7"() : () -> ()
  }) {"other attribute" = 42 : i64} : () -> ()
  "module_terminator"() : () -> ()
}) : () -> ()


And will yield the following output:

visiting op: 'module' with 0 operands and 0 results
 1 nested regions:
  Region with 1 blocks:
    Block with 0 arguments, 0 successors, and 3 operations
      visiting op: 'dialect.op1' with 0 operands and 4 results
      1 attributes:
       - 'attribute name' : '42 : i32'
       0 nested regions:
      visiting op: 'dialect.op2' with 0 operands and 0 results
       2 nested regions:
        Region with 1 blocks:
          Block with 0 arguments, 0 successors, and 1 operations
            visiting op: 'dialect.innerop1' with 2 operands and 0 results
             0 nested regions:
        Region with 3 blocks:
          Block with 0 arguments, 2 successors, and 2 operations
            visiting op: 'dialect.innerop2' with 0 operands and 0 results
             0 nested regions:
            visiting op: 'dialect.innerop3' with 3 operands and 0 results
             0 nested regions:
          Block with 1 arguments, 0 successors, and 2 operations
            visiting op: 'dialect.innerop4' with 0 operands and 0 results
             0 nested regions:
            visiting op: 'dialect.innerop5' with 0 operands and 0 results
             0 nested regions:
          Block with 1 arguments, 0 successors, and 2 operations
            visiting op: 'dialect.innerop6' with 0 operands and 0 results
             0 nested regions:
            visiting op: 'dialect.innerop7' with 0 operands and 0 results
             0 nested regions:
      visiting op: 'module_terminator' with 0 operands and 0 results
       0 nested regions:


Other IR Traversal Methods.

In many cases, unwrapping the recursive structure of the IR is cumbersome and
you may be interested in using other helpers.


Filtered iterator: getOps<OpTy>()


For example the Block class exposes a convenient templated method
getOps<OpTy>() that provided a filtered iterator. Here is an example:

  auto varOps = entryBlock.getOps<spirv::GlobalVariableOp>();
  for (spirv::GlobalVariableOp gvOp : varOps) {
     // process each GlobalVariable Operation in the block.
     ...
  }


Similarly, the Region class exposes the same getOps method that will iterate
on all the blocks in the region.


Walkers

The getOps<OpTy>() is useful to iterate on some Operations immediately listed
inside a single block (or a single region), however it is frequently interesting
to traverse the IR in a nested fashion. To this end MLIR exposes the walk()
helper on Operation, Block, and Region. This helper takes a single
argument: a callback method that will be invoked for every operation recursively
nested under the provided entity.

  // Recursively traverse all the regions and blocks nested inside the function
  // and apply the callback on every single operation in post-order.
  getFunction().walk([&](mlir::Operation *op) {
    // process Operation `op`.
  });


The provided callback can be specialized to filter on a particular type of
Operation, for example the following will apply the callback only on LinalgOp
operations nested inside the function:

  getFunction.walk([](LinalgOp linalgOp) {
    // process LinalgOp `linalgOp`.
  });


Finally, the callback can optionally stop the walk by returning a
WalkResult::interrupt() value. For example the following walk will find all
AllocOp nested inside the function and interrupt the traversal if one of them
does not satisfy a criteria:

  WalkResult result = getFunction().walk([&](AllocOp allocOp) {
    if (!isValid(allocOp))
      return WalkResult::interrupt();
    return WalkResult::advance();
  });
  if (result.wasInterrupted())
    // One alloc wasn't matching.
    ...


Traversing the def-use chains

Another relationship in the IR is the one that links a Value with its users.
As defined in the
language reference,
each Value is either a BlockArgument or the result of exactly one Operation
(an Operation can have multiple results, each of them is a separate Value).
The users of a Value are Operations, through their arguments: each
Operation argument references a single Value.

Here is a code sample that inspects the operands of an Operation and prints
some information about them:

  // Print information about the producer of each of the operands.
  for (Value operand : op->getOperands()) {
    if (Operation *producer = operand.getDefiningOp()) {
      llvm::outs() << "  - Operand produced by operation '"
                   << producer->getName() << "'\n";
    } else {
      // If there is no defining op, the Value is necessarily a Block
      // argument.
      auto blockArg = operand.cast<BlockArgument>();
      llvm::outs() << "  - Operand produced by Block argument, number "
                   << blockArg.getArgNumber() << "\n";
    }
  }


Similarly, the following code sample iterates through the result Values
produced by an Operation and for each result will iterate the users of these
results and print informations about them:

  // Print information about the user of each of the result.
  llvm::outs() << "Has " << op->getNumResults() << " results:\n";
  for (auto indexedResult : llvm::enumerate(op->getResults())) {
    Value result = indexedResult.value();
    llvm::outs() << "  - Result " << indexedResult.index();
    if (result.use_empty()) {
      llvm::outs() << " has no uses\n";
      continue;
    }
    if (result.hasOneUse()) {
      llvm::outs() << " has a single use: ";
    } else {
      llvm::outs() << " has "
                   << std::distance(result.getUses().begin(),
                                    result.getUses().end())
                   << " uses:\n";
    }
    for (Operation *userOp : result.getUsers()) {
      llvm::outs() << "    - " << userOp->getName() << "\n";
    }
  }


The illustrating code for this pass is available
here in the repo
and can be exercised with mlir-opt -test-print-defuse.

The chaining of Values and their uses can be viewed as following:


The uses of a Value (OpOperand or BlockOperand) are also chained in a
doubly linked-list, which is particularly useful when replacing all uses of a
Value with a new one ("RAUW"):