-
Notifications
You must be signed in to change notification settings - Fork 366
[Seq] Add FirMem op #5009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Seq] Add FirMem op #5009
Conversation
35e7486
to
eefe6f1
Compare
Based entirely on the description, this sounds very, very similar to HLMem. Is there a reason you can't extend and use that one? |
@teqdruid Yeah definitely, the two are similar in design. The idea behind |
I don't think we use HLMem for anything yet. (@mortbopet Am I correct?) If I'm correct, I think it's reasonable to start with one. Anything you're going to need for your memories, we're likely to need also. It seems silly to have two implementations with one being a superset of the other. Frankly, if y'all put in a bunch of work on this one we're more likely to use it than HLMem so we'd end up with one anyway. |
Basically I think there are additional data which need to be added to this Op(maybe add to spec firstly?) see chipsalliance/chisel#3131. |
Currently, HandshakeToHW is the only user of HLMem. It doesn't require any "special" semantics, and currently just relies on the behavioral lowering. My personal opinion on this is that we should as much as possible try to work towards sharing code as soon as possible, instead of relying on things coming together down the road. While there is an argument for "ironing things out in the FIRRTL flow and letting that influence a better, improved design in the core/generic/shared ops", I still think the ecosystem is better off sharing abstractions sooner, rather than later... not to mention the confusion of people coming into CIRCT seeing That is, at least from my point of view, the whole point of the |
It would likely be very disruptive to try to move Put differently, the only unification of |
fb7dcbe
to
072ebcc
Compare
Thanks for the great discussion in the CIRCT ODM today 👍 🥳 @teqdruid @mortbopet I have a stack of PRs built on this that deals with carving memory lowering out from the FIRRTL pipeline. It would be easier for me to unify the firmem and hlmem ops after those FIRRTL-side changes are in. Is it okay with you to land the PRs as they are first and then do the merging into |
@sequencer It would be really nice if we could make this work with chipsalliance/chisel#3131. Anything missing for that to work we could add to |
I've no objection... so long as you get around to merging ;) |
072ebcc
to
f996afa
Compare
I32Attr:$readLatency, | ||
I32Attr:$writeLatency, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be the attribute to port?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah agreed, you'd probably want to have them on the port. However, that's not how the firrtl.mem
op and Chisel works right now. This seq.firmem
tries to capture the current behavior; but I think we should definitely have a discussion and start adding new memory ops that are a better or alternative representation of memories! That'll inform what a generic Seq memory would look like -- if that even exists.
RUWAttr:$ruw, | ||
WUWAttr:$wuw, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think RUW
, WUW
is the hazard definition between each port, so it should be a 1 to 1 matrix for ports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Same as above: totally agreed, we should get some improved mem op discussion going beyond just extracting firrtl.mem
behavior.
OptionalAttr<FirMemInitAttr>:$init, | ||
OptionalAttr<StrAttr>:$prefix, | ||
OptionalAttr<AnyAttr>:$output_file | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for better PnR implementation experience, should we add optional attribute of hard macro name, and then expose them to repl-seq-mem flow?
Additionally, I think we should add extra IO for MBIST and DFT flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for simulation, we should add a config to optionally tie down the extra IO since there are two types of simulations for memory:
- the trivial behavior simulation, which the RTL designer only knows the memory size and ports.
- post-RTL simulation should be able to introduce DFT and MBIST in Chisel designs that we should replace these memory with the behavior memory IP from vendor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could imagine that we add dedicated seq.mem.mbist_port
and seq.mem.dft_port
ops on the seq.mem
op we are trying to distill out of seq.firmem
later on. But it's not totally clear yo to me how to handle these, since the MBIST/DFT stuff is usually very technology node dependent. Maybe some more generic seq.mem.opaque_port
with arbitrary operands/results?
I'm starting to think that we should start collecting lots of different memory ops in Seq, covering all the different use cases we have. And then either the commonality becomes very obvious, or the need for an op interface and multiple dedicated ops is the way forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DFT may be very technology node specific, but can it be treated as an opaque type/bus which is routable, even if it's not decomposable? Presumably there are generic DFT network building nodes too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that could work. That way the DFT specifics could be "implementation-defined", or defined by a dedicated DFT pass that runs later. And the memory could just allow this as some opaque port.
let parameters = (ins | ||
"uint32_t":$depth, | ||
"uint32_t":$width, | ||
OptionalParameter<"llvm::Optional<uint32_t>">:$maskWidth |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should mask
also be an attribute to port?
Inside Chisel, what I think we should do is get rid of the |
@sequencer You raise a lot of very valid points. Right now My feeling is that it'll be almost impossible to find a general memory op that unifies all concerns from HLS, pipeline scheduling, FIRRTL, and ASIC node-specific macros. So instead, why not create a lot of different ops that are very use-case specific, but then have a |
I have been wondering how to migrate to a more PnR friendly API, here is my thought:
|
Add a new `FirMemOp` to the Seq dialect to capture the semantics of FIRRTL memories at the HW level. The intent behind this change is the same as for `FirRegOp`, which captures the specifics of FIRRTL registers but within the type system and abstraction level of the HW dialect. Memories are represented as a declaration with the `seq.firmem` which returns a value representing the memory itself. This value is of type `!seq.firmem<12 x 42, mask 2>` which represents an opaque handle to the memory; it encodes the depth (`12`) and width (`42`) of the memory, plus an optional mask if the memory is to support masked writes. Memory ports are represented as separate operations: - `seq.firmem.read_port` - `seq.firmem.write_port` - `seq.firmem.read_write_port` These operations take the memory itself as their first operand. The remaining operands are the specific signals required for the corresponding type of port: clocks, enables, masks, mode select, etc. Read and read-write ports also return the read data as their result. Enables and masks are optional to allow for more concise IR for simple memory configurations. Representing ports as separate ops in the IR has the advantage that adding and removing ports is trivial; the standard CSE and canonicalization even allow for collapsing duplicate ports and removing permanently disabled ports. Separate ops also make building up memories in the IR a lot simpler, since the operands and results associated with a port are grouped together, instead of having to spread them across an overall operand and result list on the overall memory. Getting a memory's ports means walking the list of users of the declaration op, which is very cheap since the only users are the port operations. This commit merely adds the `FirMemOp` alongside some canonicalization and general tests, but the op isn't used by any part of CIRCT yet. Use in FIRRTL-to-HW lowering follows as a separate commit.
f996afa
to
f6d12b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I have no doubt there will be iteration on this, but let's get it in and start using it.
OptionalAttr<FirMemInitAttr>:$init, | ||
OptionalAttr<StrAttr>:$prefix, | ||
OptionalAttr<AnyAttr>:$output_file | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DFT may be very technology node specific, but can it be treated as an opaque type/bus which is routable, even if it's not decomposable? Presumably there are generic DFT network building nodes too.
std::max(1U, cast<FirMemType>($_self).getWidth())) | ||
}]>; | ||
|
||
def FirMemReadOp : SeqOp<"firmem.read_port", [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still unsure if make-a-port ops are better than the mem producing the ports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I've seen in the Arc dialect which uses this pattern, passes get a lot easier to write if you don't need to do this weird split of multiple ports forming logical groups of operands and results, and then co-iterating the operands and results. Gets especially weird if some ports are optional. With a separate op it's easier to split the port structure into more manageable pieces, and we get port DCE for free.
isConst(op.getClock()) || isConstZero(op.getMode())) { | ||
auto newOp = rewriter.replaceOpWithNewOp<FirMemReadOp>( | ||
op, op.getMemory(), op.getAddress(), op.getClock(), op.getEnable()); | ||
for (auto namedAttr : op->getAttrs()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting a crash here when running the canonicalization tests locally. Shouldn't you not be accessing op
after replacing it (not sure why it passes in CI)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened pr to address this here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! For some reason this didn't show up in my local run 😞 Need to check my setup. Thanks for catching and fixing this!
Add a new
FirMemOp
to the Seq dialect to capture the semantics of FIRRTL memories at the HW level. The intent behind this change is the same as forFirRegOp
, which captures the specifics of FIRRTL registers but within the type system and abstraction level of the HW dialect.Memories are represented as a declaration with the
seq.firmem
which returns a value representing the memory itself. This value is of type!seq.firmem<12 x 42, mask 2>
which represents an opaque handle to the memory; it encodes the depth (12
) and width (42
) of the memory, plus an optional mask if the memory is to support masked writes.Memory ports are represented as separate operations:
seq.firmem.read_port
seq.firmem.write_port
seq.firmem.read_write_port
These operations take the memory itself as their first operand. The remaining operands are the specific signals required for the corresponding type of port: clocks, enables, masks, mode select, etc. Read and read-write ports also return the read data as their result. Enables and masks are optional to allow for more concise IR for simple memory configurations.
Representing ports as separate ops in the IR has the advantage that adding and removing ports is trivial; the standard CSE and canonicalization even allow for collapsing duplicate ports and removing permanently disabled ports. Separate ops also make building up memories in the IR a lot simpler, since the operands and results associated with a port are grouped together, instead of having to spread them across an overall operand and result list on the overall memory. Getting a memory's ports means walking the list of users of the declaration op, which is very cheap since the only users are the port operations.
This commit merely adds the
FirMemOp
alongside some canonicalization and general tests, but the op isn't used by any part of CIRCT yet. Use in FIRRTL-to-HW lowering follows as a separate commit.After some discussion: the intent is for
seq.firmem
andseq.hlmem
to be unified into aseq.mem
once the memory handling has been factored out of the FIRRTL-to-HW lowering.