camc314
Contributor

@camc314 camc314 commented Feb 23, 2025

wanted to have a shot at implementing this; this is a rough draft.

Dunqing or overlookmotel feel free to take over or close out this PR. not sure how much time i'll have to finish it off

@Dunqing @overlookmotel this should be ready when you guys have time 🙏

closes #9168

@github-actions github-actions bot added A-transformer Area - Transformer / Transpiler C-enhancement Category - New feature or request labels Feb 23, 2025
Contributor Author

camc314 commented Feb 23, 2025


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@camc314 camc314 marked this pull request as ready for review February 23, 2025 21:55
@camc314 camc314 marked this pull request as draft February 23, 2025 21:55

codspeed-hq bot commented Feb 23, 2025

CodSpeed Performance Report

Merging #9310 will degrade performances by 5.33%

Comparing c/02-23-feat_transformer_transform_explicit_resource_management (a10ead8) with main (6aebdba)

Summary

❌ 2 regressions
✅ 37 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| transformer[checker.ts] | 24.2 ms | 25.6 ms | -5.33% |
| transformer[pdf.mjs] | 10.6 ms | 11.1 ms | -4.3% |

@camc314 camc314 force-pushed the c/02-23-feat_transformer_transform_explicit_resource_management branch 2 times, most recently from c4b4c55 to 0dd6726 Compare February 27, 2025 17:02
@camc314 camc314 force-pushed the c/02-23-feat_transformer_transform_explicit_resource_management branch 2 times, most recently from 6b624bd to 081e4d7 Compare February 27, 2025 17:43
@camc314 camc314 marked this pull request as ready for review February 27, 2025 17:43
@camc314 camc314 force-pushed the c/02-23-feat_transformer_transform_explicit_resource_management branch from 081e4d7 to 6716335 Compare February 28, 2025 09:30
@Dunqing
Member

Dunqing commented Feb 28, 2025

Thank you for working on this. I will review it next week. Before that, can you add some documentation at the top of the file, like other plugins do?

For example:

//! ES2020: Nullish Coalescing Operator
//!
//! This plugin transforms nullish coalescing operators (`??`) to a series of ternary expressions.
//!
//! > This plugin is included in `preset-env`, in ES2020
//!
//! ## Example
//!
//! Input:
//! ```js
//! var foo = object.foo ?? "default";
//! ```
//!
//! Output:
//! ```js
//! var _object$foo;
//! var foo =
//!   (_object$foo = object.foo) !== null && _object$foo !== void 0
//!     ? _object$foo
//!     : "default";
//! ```
//!
//! ## Implementation
//!
//! Implementation based on [@babel/plugin-transform-nullish-coalescing-operator](https://babeljs.io/docs/babel-plugin-transform-nullish-coalescing-operator).
//!
//! ## References:
//! * Babel plugin implementation: <https://github.com/babel/babel/tree/v7.26.2/packages/babel-plugin-transform-nullish-coalescing-operator>
//! * Nullish coalescing TC39 proposal: <https://github.com/tc39-transfer/proposal-nullish-coalescing>

@camc314 camc314 force-pushed the c/02-23-feat_transformer_transform_explicit_resource_management branch from 6716335 to 305c2ab Compare February 28, 2025 10:59
@Dunqing Dunqing self-assigned this Mar 3, 2025
@camc314 camc314 force-pushed the c/02-23-feat_transformer_transform_explicit_resource_management branch from 305c2ab to 69a7a41 Compare March 3, 2025 13:36
Member

@overlookmotel overlookmotel left a comment

Thanks very much for tackling this. And bravo for getting all the tests to pass, including all the semantic IDs (don't worry about that 1 weird one).

I've made comments below about some small things.

The bigger thing is the performance hit. My assumption is it's due to looping over statements. Usually we try to avoid that by setting some state in enter_statements, then let the traversal run through the children with smaller visitors which update that state as they go, and then you check the state in exit_statements and act on it if there's work to be done. This avoids visiting every statement twice.

However, I'm not familiar with this transform, and found it hard to follow what the code here is doing, so can't judge how amenable it'd be to that approach. So I suggest we merge this, and then work on the performance in later PRs.

What would help a lot is if you'd be able to add more comments, so the logic is easier to follow. The AstBuilder calls are so verbose (bad API, sorry!) that it helps a lot to have a comment before a bunch of ctx.ast.blah_blah calls saying what code they produce. e.g. add a comment before this:

// `var var_id = <expr>;`

inner_block.push(Statement::VariableDeclaration(ctx.ast.alloc(
    ctx.ast.variable_declaration(
        span,
        VariableDeclarationKind::Var,
        ctx.ast.vec_from_array([ctx.ast.variable_declarator(
            span,
            VariableDeclarationKind::Var,
            ctx.ast.binding_pattern(
                BindingPatternKind::BindingIdentifier(
                    ctx.ast.alloc(var_id.create_binding_identifier(ctx)),
                ),
                NONE,
                false,
            ),
            Some(expr),
            false,
        )]),
        false,
    ),
)));

Also methods should ideally have a high-level comment with a before vs after example (exactly like you did on enter_for_of_statement).

Once the logic of the transform is clearer, it'll be easier to see what perf optimization is possible.

Sorry for the deluge of feedback. I hope it's helpful, rather than annoying!

@@ -45,6 +45,7 @@ fn main() {
let ret = SemanticBuilder::new()
// Estimate transformer will triple scopes, symbols, references
.with_excess_capacity(2.0)
.with_scope_tree_child_ids(true)
Member

This change seems extraneous.

Contributor Author

i need to think more here. i'm using it in `let child_ids = scopes.get_child_ids(current_scope_id);`, but there might be a better way.

will look properly tomorrow

Member

I didn't explain myself well. This file is an example. Did you mean to change the example?

Contributor Author

yeah so if you run the example and it transforms a using declaration, then it'll panic because the child ids don't exist.

let me look and see if it's possible to do it another way

Contributor Author

what's this trying to achieve?

  1. we've got a vec of Statements
  2. each of these statements currently has a parent scope of x
  3. we are moving all of these statements into a child block
  4. so we need to update the scope of all of the statements

e.g.

function foo () {}

becomes

try {
    function foo() {}
} ...

The current approach:

let scope_id_children_to_move = scope_id.unwrap_or(parent_scope_id);
let scope_id = scope_id
    .unwrap_or_else(|| ctx.create_child_scope(parent_scope_id, ScopeFlags::empty()));
let block = ctx.ast.block_statement_with_scope_id(SPAN, stmts, scope_id);
let scopes = ctx.scopes_mut();
let child_ids = scopes.get_child_ids(scope_id_children_to_move);
let child_ids = child_ids.to_vec();
for id in child_ids.iter().filter(|id| *id != &scope_id) {
    scopes.change_parent_id(*id, Some(scope_id));
}

gets all child scopes, moves them to the new scope id (ignoring the scope we just created).
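
The re-parenting step described above can be sketched on a toy scope tree. This is a minimal illustration, not oxc's scope-tree implementation: `ScopeTree`, `create_scope`, and `wrap_children_in_block` are hypothetical names, and only `change_parent_id` mirrors a method name that appears in the snippet above.

```rust
// Toy scope tree (an assumption for illustration, not oxc's real type):
// each scope stores its parent id and a list of child ids.
#[derive(Default)]
struct ScopeTree {
    parents: Vec<Option<usize>>,
    children: Vec<Vec<usize>>,
}

impl ScopeTree {
    /// Create a scope, registering it as a child of `parent` if there is one.
    fn create_scope(&mut self, parent: Option<usize>) -> usize {
        let id = self.parents.len();
        self.parents.push(parent);
        self.children.push(Vec::new());
        if let Some(p) = parent {
            self.children[p].push(id);
        }
        id
    }

    /// Detach `scope` from its current parent and attach it to `new_parent`.
    fn change_parent_id(&mut self, scope: usize, new_parent: usize) {
        if let Some(old) = self.parents[scope] {
            self.children[old].retain(|&c| c != scope);
        }
        self.parents[scope] = Some(new_parent);
        self.children[new_parent].push(scope);
    }

    /// Create a block scope under `parent` and move every other child of
    /// `parent` into it, skipping the block scope that was just created.
    fn wrap_children_in_block(&mut self, parent: usize) -> usize {
        let block = self.create_scope(Some(parent));
        // Copy the ids first, because `change_parent_id` mutates the child lists.
        let to_move: Vec<usize> = self.children[parent]
            .iter()
            .copied()
            .filter(|&id| id != block)
            .collect();
        for id in to_move {
            self.change_parent_id(id, block);
        }
        block
    }
}
```

Copying the child ids before the loop plays the same role as the `to_vec()` call in the snippet above: the list being iterated is modified while scopes are moved.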

would it make sense to introduce a new api in oxc_traverse to achieve this?

e.g. something like insert_scope_above_statements, a similar (but new) version of this:

/// Insert a scope into scope tree below a statement.
///
/// Statement must be in current scope.
/// New scope is created as child of current scope.
/// All child scopes of the statement are reassigned to be children of the new scope.
///
/// `flags` provided are amended to inherit from parent scope's flags.
pub fn insert_scope_below_statement(&mut self, stmt: &Statement, flags: ScopeFlags) -> ScopeId {
    let mut collector = ChildScopeCollector::new();
    collector.visit_statement(stmt);
    self.insert_scope_below(&collector.scope_ids, flags)
}

^^ this is a bit of a brain dump.

Probably a better solution is to use change_parent_id on the scope we passed in, and replace it.
needs more thought.

Member

@overlookmotel overlookmotel Mar 11, 2025

> yeah so if you run the example and it transforms a using declaration, then it'll panic because the child ids don't exist.

Ah of course. Now I understand why you changed the example. That makes sense.

But... I may be wrong, but I don't think using child scopes will work in all cases. e.g.:

function outer( x = function inParams() {} ) {
    let f = function inBody() {};
    using u = whatever();
}

Functions only have 1 scope, covering both the function body and its params. So in this case, inParams's scope is a child of outer. inBody needs to have its parent scope changed, but inParams should not. So, if I'm right, we'll need another mechanism to determine which scopes to re-parent.
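
To make the concern concrete, here is a toy model of the child scopes of `outer` in the example above. It is only a sketch: the `Origin` tag is hypothetical information that, per the comment above, a plain list of child scopes does not carry, and `naive_to_reparent` / `body_only_to_reparent` are made-up names rather than oxc functions.

```rust
// Hypothetical tag recording where a child scope was created.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Origin {
    Params,
    Body,
}

// A child scope of `outer`, identified by the name of the function that owns it.
struct ChildScope {
    name: &'static str,
    origin: Origin,
}

/// Child scopes of `outer` from the example: one from the params, one from the body.
fn outer_children() -> Vec<ChildScope> {
    vec![
        ChildScope { name: "inParams", origin: Origin::Params },
        ChildScope { name: "inBody", origin: Origin::Body },
    ]
}

/// Naive selection: every child scope is moved into the new block scope.
fn naive_to_reparent(children: &[ChildScope]) -> Vec<&'static str> {
    children.iter().map(|c| c.name).collect()
}

/// Selection that only moves scopes created inside the function body.
fn body_only_to_reparent(children: &[ChildScope]) -> Vec<&'static str> {
    children
        .iter()
        .filter(|c| c.origin == Origin::Body)
        .map(|c| c.name)
        .collect()
}
```

With these inputs, the naive selection includes `inParams`, which is the scope that should stay where it is. How the real transform would obtain something like `Origin` is exactly the open question raised here.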

But let's leave as is for now, and I'll try to devise a failing test that demonstrates the problem in a follow-up PR.

Also, fixing #9666 is going to throw all scopes-related stuff into disarray, so probably best we tackle this after that's done.

@camc314 camc314 force-pushed the c/02-23-feat_transformer_transform_explicit_resource_management branch 4 times, most recently from fc2520a to 6bcf4b2 Compare March 5, 2025 22:50
@camc314
Contributor Author

camc314 commented Mar 5, 2025

Thank you for the detailed review! I appreciate it.

For the perf hit, i had this thought. Would it be something like:

  1. enter var decl.
  2. if using, add address of parent (e.g. block stmt) to a hash map
  3. when leaving a block, check if the address is in the hash map, if so, do some work. else skip.
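
The three steps above could look roughly like this. It is a minimal sketch on a toy AST, not oxc's `Traverse` API: `Stmt`, `Block`, the integer `id` (standing in for a node address), and `UsingTransform` are all made up for illustration, and a `HashSet` is used because only the presence of the key matters here.

```rust
use std::collections::HashSet;

// Toy AST (an assumption, not oxc's real node types).
enum Stmt {
    Using,
    Other,
    Block(Block),
}

struct Block {
    // Unique id standing in for the address of the block node.
    id: usize,
    body: Vec<Stmt>,
}

#[derive(Default)]
struct UsingTransform {
    // Blocks that directly contain at least one `using` declaration.
    blocks_with_using: HashSet<usize>,
    // Ids of blocks that were rewritten, recorded for demonstration only.
    rewritten: Vec<usize>,
}

impl UsingTransform {
    fn visit_block(&mut self, block: &Block) {
        for stmt in &block.body {
            match stmt {
                // Steps 1 and 2: on a `using` declaration, record the parent block.
                Stmt::Using => {
                    self.blocks_with_using.insert(block.id);
                }
                Stmt::Block(inner) => self.visit_block(inner),
                Stmt::Other => {}
            }
        }
        // Step 3: when leaving the block, do the work only if it was recorded.
        if self.blocks_with_using.remove(&block.id) {
            self.rewritten.push(block.id);
        }
    }
}
```

Each statement is visited once, and a block with no `using` declaration costs a single failed set lookup on exit.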

If that makes sense, perhaps we merge this once the other feedback has been resolved, and fix the perf hit in a follow-up?

I've actioned most of your comments just now. the more complex ones i need to think about more before actioning, but i'll try to do tomorrow/friday.

NOTE: still need to add some more code comments about ast builder changes

@camc314 camc314 force-pushed the c/02-23-feat_transformer_transform_explicit_resource_management branch from 6bcf4b2 to beed50c Compare March 5, 2025 22:56
@Boshen
Member

Boshen commented Mar 6, 2025

This is a great help for future maintenance: https://github.com/oxc-project/oxc/blob/main/crates/oxc_transformer/README.md#style-guide-for-implementing-transforms

@camc314 camc314 force-pushed the c/02-23-feat_transformer_transform_explicit_resource_management branch from beed50c to be9230c Compare March 6, 2025 12:39
@overlookmotel
Member

> For the perf hit, i had this thought. Would it be something like:
>
>   1. enter var decl.
>   2. if using, add address of parent (e.g. block stmt) to a hash map
>   3. when leaving a block, check if the address is in the hash map, if so, do some work. else skip.

Yes, pretty much exactly that. Except I'd suggest using a stack instead of a hashmap. You'd push to the stack when entering a block, and pop from it when exiting. In between, VariableDeclaration visitor would update the entry on top of the stack if it's a using statement.

Same principle, but a stack is usually cheaper than a hash map.

oxc_data_structures crate has some stack types which are optimized for this kind of thing. In this case, probably SparseStack<BoundIdentifier> would be a good choice (where the BoundIdentifier represents the binding for the _usingCtx var).

SparseStack<BoundIdentifier> behaves the same as a Vec<Option<BoundIdentifier>>, but is optimized for when most entries are None - which they probably are, as using is a bit of a niche language feature.
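
The stack variant might look like the sketch below. It uses a plain `Vec<Option<String>>` as a stand-in for `SparseStack<BoundIdentifier>`, a toy `Stmt` type instead of oxc's AST, and a hypothetical `_usingCtx{depth}` naming scheme, so treat it as an illustration of the push/update/pop flow rather than the real transform.

```rust
// Toy statement type (an assumption, not oxc's AST).
enum Stmt {
    Using,
    Other,
    Block(Vec<Stmt>),
}

#[derive(Default)]
struct UsingTransform {
    // One entry per open block; `Some(name)` once a `using` was seen in it.
    // Stand-in for `SparseStack<BoundIdentifier>`.
    stack: Vec<Option<String>>,
    // Context variable names of the blocks that needed rewriting.
    rewritten: Vec<String>,
}

impl UsingTransform {
    /// Entering a block: push an empty entry.
    fn enter_statements(&mut self) {
        self.stack.push(None);
    }

    /// A `using` declaration: fill in the entry on top of the stack, once.
    fn visit_using(&mut self) {
        let depth = self.stack.len();
        if let Some(top) = self.stack.last_mut() {
            top.get_or_insert_with(|| format!("_usingCtx{depth}"));
        }
    }

    /// Leaving a block: pop the entry and act only if it was filled in.
    fn exit_statements(&mut self) {
        if let Some(Some(ctx_var)) = self.stack.pop() {
            self.rewritten.push(ctx_var);
        }
    }

    /// Minimal traversal driving the three hooks above.
    fn visit_statements(&mut self, stmts: &[Stmt]) {
        self.enter_statements();
        for stmt in stmts {
            match stmt {
                Stmt::Using => self.visit_using(),
                Stmt::Block(inner) => self.visit_statements(inner),
                Stmt::Other => {}
            }
        }
        self.exit_statements();
    }
}
```

Blocks without a `using` declaration only pay for one push and one pop of a `None` entry.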

@overlookmotel
Member

Actually, you might be right, maybe a hash map is better, on the assumption that using statements are extremely rare. As long as the hashmap is empty (no using statements), doing the hashmap lookup in exit_statements etc would probably be even cheaper than a stack.

It'd be interesting to test that and see if there's any difference.

graphite-app bot pushed a commit that referenced this pull request Mar 11, 2025
Follow-on after #9310. Style nit. Re-order imports with external crates before `oxc_*` crates.

This is just my personal OCD preference! I find it easier to see where imports are coming from when ordered like this. Sadly there is no rustfmt rule to enforce this style.
@camc314
Contributor Author

camc314 commented Mar 11, 2025

yeah best to try it out, and see what happens

graphite-app bot pushed a commit that referenced this pull request Mar 11, 2025
Follow-on after #9310. Pure refactor. Add more comments, and move/amend some. Correct the link to the Babel plugin.
graphite-app bot pushed a commit that referenced this pull request Mar 11, 2025
Follow-on after #9310. Pure refactor. Use shorter `AstBuilder` calls where possible.

It's always preferable to use the `alloc_*` methods where possible, as it maximizes the chances the compiler can see it can construct nodes directly in the arena, rather than construct on the stack and then copy to the arena.
graphite-app bot pushed a commit that referenced this pull request Mar 11, 2025
Follow-on after #9310. Pure refactor. Replace `.into()` calls with more explicit `Expression::from` etc. This syntax is longer, but personally I find it clearer. When I see `.into()`, I always find myself wondering "but into *what*?".
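
A small sketch of the difference in readability, using toy `Identifier`, `Expression`, and `Argument` types defined here for illustration (they are not oxc's AST types). Both functions do the same conversions; only the spelling differs.

```rust
// Toy types (assumptions for illustration, not oxc's AST).
#[derive(Debug, PartialEq)]
struct Identifier(String);

#[derive(Debug, PartialEq)]
enum Expression {
    Identifier(Identifier),
}

#[derive(Debug, PartialEq)]
enum Argument {
    Expression(Expression),
}

impl From<Identifier> for Expression {
    fn from(id: Identifier) -> Self {
        Expression::Identifier(id)
    }
}

impl From<Expression> for Argument {
    fn from(expr: Expression) -> Self {
        Argument::Expression(expr)
    }
}

/// `.into()` style: the target types are only visible from the annotations.
fn implicit(id: Identifier) -> Argument {
    let expr: Expression = id.into();
    expr.into()
}

/// Explicit style: each conversion names the type it produces.
fn explicit(id: Identifier) -> Argument {
    Argument::from(Expression::from(id))
}
```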
graphite-app bot pushed a commit that referenced this pull request Mar 11, 2025
Follow-on after #9310.

Holding large types on the stack is generally best avoided where possible. Get the data into the arena as quickly as possible, and then you only need to pass around `Box`es (which are only 8 bytes).

In the case of `Class`, previously we were `unbox`-ing a `Class` (pull it out of the arena, and onto the stack) and then allocating it back into the arena again. `Class` is a large type - 160 bytes - and this extra work doesn't add any value. We can just leave the `Class` where it is in the arena, and pass around a `Box<Class>`.

This is something of a micro-optimization, but they all add up...
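
The idea can be sketched with the standard library's `Box` standing in for the arena `Box`. The `Class` below is a toy 160-byte struct, not the real AST node, and the function names are made up. Whether the optimizer removes the extra copy in the first function is not guaranteed either way, so the comments describe what the source code asks for rather than the generated machine code.

```rust
use std::mem::size_of;

// Toy stand-in for a large AST node (assumption: 20 * 8 = 160 bytes of payload).
struct Class {
    payload: [u64; 20],
}

enum Declaration {
    Class(Box<Class>),
}

/// Moves the large value out of its box, then allocates it again.
fn rewrap_by_value(class: Box<Class>) -> Declaration {
    let on_stack: Class = *class; // whole `Class` moved out of the heap
    Declaration::Class(Box::new(on_stack)) // and moved back into a new allocation
}

/// Passes the box along, so only the pointer moves.
fn rewrap_boxed(class: Box<Class>) -> Declaration {
    Declaration::Class(class)
}

/// Address of the boxed `Class`, used to show that the data stays in place.
fn class_addr(decl: &Declaration) -> *const Class {
    match decl {
        Declaration::Class(class) => &**class as *const Class,
    }
}

/// A `Box` is a single pointer, regardless of how large the pointee is.
fn box_is_pointer_sized() -> bool {
    size_of::<Box<Class>>() == size_of::<usize>()
}
```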
graphite-app bot pushed a commit that referenced this pull request Mar 11, 2025
Follow-on after #9310. Pure refactor. Just rename vars to be more descriptively named, rather than generic names like `node`.
graphite-app bot pushed a commit that referenced this pull request Mar 13, 2025
Add a test case for explicit resource management (`using` declarations) which demonstrates a problem with scopes.

See: #9310 (comment)
Development

Successfully merging this pull request may close these issues.

transformer: explicit resource management
4 participants