-
Notifications
You must be signed in to change notification settings - Fork 366
[LLHD] Add loop unrolling pass #8620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks awesome! I'll review details Monday but I have a high-level question. Is there specific reason to unroll loops after SCF lowering? I'm wondering if we can simply unroll constant loops at SCF dialect which is substantially simpler (and there are many helpers in MLIR upstream).
I remember @maerhart experimenting with that a little bit on his experimental branch. The problem is that Verilog has |
@uenoku We originally retained the I had experimented with the lift-control-flow pass upstream which converts any CF region to SCF, but all the operations end up in the condition region of an In general, I think a CF unrolling pass would be a good addition to MLIR upstream, but I agree that it is relatively complicated to implement. |
That's fair point. Unfortunate that MLIR SCF doesn't have generic control flow yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance we can use mlir::CFGLoopInfo for finding out loop candidates? I think some of the logic like finding a single exist and parent/child relation should work out-of-box.
./build/bin/circt-opt foo.mlir --mlir-print-op-generic | ./build/bin/mlir-opt --allow-unregistered-dialect --pass-pipeline='builtin.module(func.func(test-cfg-loop-info))'
Testing : "TwoNestedLoops"
Blocks : ^bb0, ^bb0, ^bb1, ^bb2, ^bb3, ^bb4, ^bb5
Loop at depth 1 containing: ^bb1<header><exiting>,^bb2,^bb4<latch>,^bb3
Loop at depth 2 containing: ^bb2<header><exiting>,^bb3<latch>
module {
func.func @TwoNestedLoops() -> i42 {
%0 = "hw.constant"() <{value = 0 : i42}> : () -> i42
%1 = "hw.constant"() <{value = 1 : i42}> : () -> i42
%2 = "hw.constant"() <{value = 2 : i42}> : () -> i42
%3 = "hw.constant"() <{value = 3 : i42}> : () -> i42
%4 = "hw.constant"() <{value = 42 : i42}> : () -> i42
cf.br ^bb1(%0, %0 : i42, i42)
^bb1(%5: i42, %6: i42): // 2 preds: ^bb0, ^bb4
%7 = "comb.icmp"(%5, %2) <{predicate = 2 : i64}> : (i42, i42) -> i1
cf.cond_br %7, ^bb2(%0, %6 : i42, i42), ^bb5
^bb2(%8: i42, %9: i42): // 2 preds: ^bb1, ^bb3
%10 = "comb.icmp"(%8, %3) <{predicate = 2 : i64}> : (i42, i42) -> i1
cf.cond_br %10, ^bb3, ^bb4
^bb3: // pred: ^bb2
%11 = "comb.add"(%9, %4) : (i42, i42) -> i42
%12 = "comb.add"(%8, %1) : (i42, i42) -> i42
cf.br ^bb2(%12, %11 : i42, i42)
^bb4: // pred: ^bb2
%13 = "comb.add"(%5, %1) : (i42, i42) -> i42
cf.br ^bb1(%13, %9 : i42, i42)
^bb5: // pred: ^bb1
return %6 : i42
}
}
Loop::match and Loop::unroll would be necessary to implement our own but other wise I think we can utilize LLVM LoopInfo.
@uenoku That's a fantastic idea. Thanks for the pointer -- I wasn't aware of |
3dae8ea
to
f1bebd5
Compare
I have reworked the pass to use the |
f1bebd5
to
33e5e16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Is it intentional that you removed CFGLoop?
Yikes no, that was an accident. Let me get that back 👍 |
Add an LLHD pass that unrolls loops with statically-known bounds. The pass computes the dominance info for `llhd.combinational` ops, finds loop back-edges by looking for blocks that branch back to blocks that dominate them, which yields initial *loop head* and *back-edge* pairs. Based on these the pass populates a data structure with loop metadata, including the blocks in the loop body, exit control flow edges, the exit condition, induction variable, and loop bounds. After gathering this information, the loop body is simply stamped out muliple times, the few relevant control flow edges are adjusted, and a rough cleanup of unused blocks and values is performed. The current implementation of the pass is *very* conservative. The loops are required to have very simple control flow, and the induction variable must go from 0 to a positive upper bound in increments of 1. We'll want to relax these constraints in the future and allow other increments/decrements, comparison predicates, and more. Thanks again to @maerhart for doing a lot of initial work for this on his experimental branch. Co-authored-by: Martin Erhart <martin.erhart@sifive.com>
33e5e16
to
e1888cd
Compare
Add an LLHD pass that unrolls loops with statically-known bounds.
The pass computes the dominance info for
llhd.combinational
ops, finds loop back-edges by looking for blocks that branch back to blocks that dominate them, which yields initial loop head and back-edge pairs. Based on these the pass populates a data structure with loop metadata, including the blocks in the loop body, exit control flow edges, the exit condition, induction variable, and loop bounds. After gathering this information, the loop body is simply stamped out muliple times, the few relevant control flow edges are adjusted, and a rough cleanup of unused blocks and values is performed.The current implementation of the pass is very conservative. The loops are required to have very simple control flow, and the induction variable must go from 0 to a positive upper bound in increments of 1. We'll want to relax these constraints in the future and allow other increments/decrements, comparison predicates, and more.
Thanks again to @maerhart for doing a lot of initial work for this on his experimental branch.