Optimize Semantic #32

@overlookmotel

Possible optimizations to Semantic. I'll add to this list as I learn more about how it works:

  • Optimize structs-of-arrays (Optimize Semantic's structs-of-arrays and make API simpler #11, Use IndexVec not FxHashMap for all fields of ScopeTree oxc#4269)
  • Store Semantic data in arena (Store Semantic data in arena #31)
  • Re-design ScopeFlags (Re-design ScopeFlags #16)
  • unresolved_references doesn't need to be stored in ScopeTree; only a single Vec for root_unresolved_references is needed. unresolved_references is used internally within Semantic while resolving references, but at the end it's an empty Vec for every scope except the root scope. Store it in SemanticBuilder instead and discard it at the end. See perf(semantic): keep a single map of unresolved references oxc#4107.
  • unresolved_references does not need to retain entries for every scope. Could be a stack where we reuse hash maps from previous scopes (see perf(semantic): keep a single map of unresolved references oxc#4107 (comment)).
  • unresolved_references could be a linked list / chunked linked list instead of a Vec. I don't think it's ever indexed into.
  • Reduce hashing when resolving references. If the current unresolved_references hash map already stores hashes, there is no need to hash each identifier again when finding its entry in the parent hash map. (The hash map may not store hashes; I think SwissTable-style hash maps don't. In that case we could store hashes in the entries.)
  • Store binding names as Atom<'a> not CompactStr. Conversion to CompactStr causes unnecessary allocations. We can just reference strings in source text (as Atom does).
  • The Reference::name field is unnecessary for bound references - it can be obtained from the SymbolId. It is needed for unbound references, but they could be stored elsewhere and referenced.
  • Add a scope for "global" which would contain unbound references (replacing root unresolved_references)? Then every reference has a SymbolId.
  • Initialize Semantic's Vecs with sufficient capacity so they don't need to grow (see Store Semantic data in arena #31).
  • Get rid of Reference.
    • Store SymbolId instead of ReferenceId in IdentifierReference.
    • Store a reference count for each symbol so you can check if a symbol is referenced or not.
    • Need some way in semantic to update SymbolId for IdentifierReferences long after exiting the node. Would need a pointer-based solution, or Cell<SymbolId>.
    • The only thing we lose is the AstNodeId in Reference. This is probably used in the linter, but I don't know what for, or how easy it would be to replace.
    • This would allow removing the resolved_references: IndexVec<SymbolId, Vec<ReferenceId>> field in SymbolTable, which is a major source of reallocation in semantic: a Vec<ReferenceId> is pushed to every time an IdentifierReference is found, and it has an inherently unpredictable growth pattern (we can't know in advance how big it needs to be).
  • Don't use Nodes within SemanticBuilder. We can get the type of the parent node, etc., by maintaining stacks in the visitor. This would enable a cut-down version of SemanticBuilder which doesn't build Nodes.
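
To make the "stack of reused hash maps" idea for unresolved_references more concrete, here is a rough sketch. It is not oxc's implementation: `UnresolvedStack`, its methods, and the simplified `ReferenceId` are hypothetical names for illustration, and it assumes the caller supplies the binding names for each scope on exit.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for oxc's `ReferenceId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ReferenceId(pub u32);

/// One map of unresolved references per scope *depth*, not per scope.
/// Maps of exited scopes stay allocated and are reused by later scopes.
pub struct UnresolvedStack<'a> {
    maps: Vec<HashMap<&'a str, Vec<ReferenceId>>>,
    /// Index of the current scope's map. 0 is the root scope.
    depth: usize,
}

impl<'a> UnresolvedStack<'a> {
    pub fn new() -> Self {
        Self { maps: vec![HashMap::new()], depth: 0 }
    }

    pub fn enter_scope(&mut self) {
        self.depth += 1;
        // Only allocate when we go deeper than ever before. Otherwise reuse
        // the map left by a previous scope (it was drained on exit, so it is
        // empty but keeps its capacity).
        if self.depth == self.maps.len() {
            self.maps.push(HashMap::new());
        }
    }

    /// Record a reference to `name` found in the current scope.
    pub fn reference(&mut self, name: &'a str, id: ReferenceId) {
        self.maps[self.depth].entry(name).or_default().push(id);
    }

    /// Exit the current scope. References to names in `bindings` are
    /// resolved and returned; all others move up to the parent scope's map.
    pub fn exit_scope(&mut self, bindings: &[&str]) -> Vec<ReferenceId> {
        assert!(self.depth > 0, "cannot exit the root scope");
        let (parents, rest) = self.maps.split_at_mut(self.depth);
        let parent = &mut parents[self.depth - 1];
        let current = &mut rest[0];
        let mut resolved = Vec::new();
        for (name, mut ids) in current.drain() {
            if bindings.contains(&name) {
                resolved.append(&mut ids);
            } else {
                parent.entry(name).or_default().append(&mut ids);
            }
        }
        self.depth -= 1;
        resolved
    }

    /// References still unresolved at the root scope.
    pub fn root(&self) -> &HashMap<&'a str, Vec<ReferenceId>> {
        &self.maps[0]
    }

    /// Number of maps allocated so far (equals max scope depth reached + 1).
    pub fn allocated_maps(&self) -> usize {
        self.maps.len()
    }
}
```

With this shape, entering and exiting two sibling scopes allocates only one extra map, and whatever is left in `root()` at the end plays the role of root_unresolved_references. Resolving root-scope bindings is left out of the sketch.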
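
The "get rid of Reference" idea could look roughly like the following. This is only a sketch under assumptions: the cut-down `IdentifierReference`, the `SymbolTable` layout, and the `declare` / `resolve` / `is_referenced` methods are hypothetical and do not match oxc's actual types. It uses the `Cell<SymbolId>` option mentioned above (wrapped in `Option` so an unresolved state can be represented) together with a per-symbol reference count.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for oxc's `SymbolId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SymbolId(pub usize);

/// Hypothetical cut-down AST node. It stores the symbol directly instead of
/// a `ReferenceId`. `Cell` lets semantic fill it in later through a shared
/// reference, long after the visitor has left the node.
pub struct IdentifierReference<'a> {
    pub name: &'a str,
    pub symbol_id: Cell<Option<SymbolId>>,
}

impl<'a> IdentifierReference<'a> {
    pub fn new(name: &'a str) -> Self {
        Self { name, symbol_id: Cell::new(None) }
    }
}

/// Per-symbol data as parallel `Vec`s indexed by `SymbolId`.
/// There is no `Vec<ReferenceId>` per symbol, only a counter.
#[derive(Default)]
pub struct SymbolTable<'a> {
    names: Vec<&'a str>,
    reference_counts: Vec<u32>,
}

impl<'a> SymbolTable<'a> {
    /// Declare a new symbol and return its ID.
    pub fn declare(&mut self, name: &'a str) -> SymbolId {
        self.names.push(name);
        self.reference_counts.push(0);
        SymbolId(self.names.len() - 1)
    }

    /// Bind `ident` to `symbol_id` and bump that symbol's reference count.
    pub fn resolve(&mut self, ident: &IdentifierReference<'a>, symbol_id: SymbolId) {
        ident.symbol_id.set(Some(symbol_id));
        self.reference_counts[symbol_id.0] += 1;
    }

    /// Whether any identifier has been resolved to this symbol.
    pub fn is_referenced(&self, symbol_id: SymbolId) -> bool {
        self.reference_counts[symbol_id.0] > 0
    }

    /// Name of the symbol, which bound references can use instead of
    /// carrying their own copy.
    pub fn name(&self, symbol_id: SymbolId) -> &'a str {
        self.names[symbol_id.0]
    }
}
```

Incrementing a `u32` never reallocates, which is the point of replacing the per-symbol `Vec<ReferenceId>`. The cost, as noted above, is that the list of individual references (and their AstNodeIds) is no longer available.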
