Skip to content

Conversation

gnprice
Copy link

@gnprice gnprice commented Jan 14, 2019

Q                       A
Fixed Issues?
Patch: Bug Fix? Yes!
Major: Breaking Change? nope
Minor: New Feature? nope
Tests Added + Pass? ✔️
Documentation PR Link
Any Dependency Changes? nope
License MIT

Labels work quite differently from identifier bindings in how they collide. Because we track them on scopes in a very similar way to identifier bindings, this means that a transform which innocently tries to insert a label using scope.generateUid, like this:

const label = path.scope.generateUidIdentifier("label");
path.replaceWith(t.labeledStatement(label, path.node));

is subject to a bug where it can generate invalid code.

Specifically, it's actually perfectly fine to have several labels as siblings:

function f() {
  loop: do { continue loop; } while (0);
  loop: while (1) { break loop; };
  loop: if (1) { break loop; /* :-P */ }
}

but on the other hand, it's not permitted to have one as an ancestor of another:

function f() {
  loop: do {
    loop: while (1) { // Error!
      break loop;     // After all -- what should this mean?
    }
    console.log("would this run?");
  } while (0);
}

This commit adds several test cases to demonstrate the issue -- an end-to-end test featuring a transform, and a couple of unit tests -- and then changes the algorithm a Scope uses for tracking labels to one that more closely reflects the language semantics, fixing the issue.

We were trying the names

  _temp  _temp  _temp2 _temp3 ...

Skip the first one -- so it's just

  _temp  _temp2 _temp3 ...

(Alternatively, the hyper-optimized way would be like this:

  _temp  _temp0 _temp1 _temp2 ...

That's what C++ compilers do in name-mangling, to make it a little
less common to have to spend the extra character for `_temp10`.
But of course that breaks almost every fragile, output-dependent test
case in the codebase... 182/8641 tests in 32/134 suites, by my count.)
Labels work quite differently from identifier bindings in how they
collide.  It's actually perfectly fine to have several of them as
siblings:

  function f() {
    loop: do { continue loop; } while (0);
    loop: while (1) { break loop; };
    loop: if (1) { break loop; /* :-P */ }
  }

On the other hand, it's *not* permitted to have one as an ancestor of
another:

  function f() {
    loop: do {
      loop: while (1) { // Error!
        break loop;     // After all -- what should this mean?
      }
      console.log("would this run?");
    } while (0);
  }

So we'll need to handle them differently.

This commit adds a couple of unit tests, and also a more end-to-end
test with a transform, that demonstrate the issue.
This fixes (and un-comments) the tests added in the previous commit.

We don't do anything to track labels on modifications, just like
the previous algorithm didn't.  Instead, like before, we rely on the
rather overkill solution that any name gotten from `generateUid`
goes into `getProgramParent().uids`...  and if a transform is
inserting labels that it *hasn't* gotten from `generateUid`,
it's probably in trouble already.
@babel-bot
Copy link
Collaborator

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/9788/

@gnprice gnprice mentioned this pull request Jan 14, 2019
2 tasks
@danez danez added PR: Bug Fix 🐛 A type of pull request used for our changelog categories pkg: traverse (scope) labels Mar 1, 2019
@@ -238,7 +271,7 @@ export default class Scope {
.replace(/[0-9]+$/g, "");

let uid;
let i = 0;
let i = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite get this change. Don't we now start with _temp1, although we should start with _temp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it is > 1, wonder if we should instead make _generateUid() do >=1. Although that might change all tests.

Guess we should do that separately.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it is > 1, wonder if we should instead make _generateUid() do >=1. Although that might change all tests.

Guess we should do that separately.

Yep -- agreed on all points :-)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Here's how I put it in the commit message, a bit more verbosely:

We were trying the names

  _temp  _temp  _temp2 _temp3 ...

Skip the first one -- so it's just

  _temp  _temp2 _temp3 ...

(Alternatively, the hyper-optimized way would be like this:

  _temp  _temp0 _temp1 _temp2 ...

That's what C++ compilers do in name-mangling, to make it a little
less common to have to spend the extra character for `_temp10`.
But of course that breaks almost every fragile, output-dependent test
case in the codebase... 182/8641 tests in 32/134 suites, by my count.)

)

@danez
Copy link
Member

danez commented Mar 1, 2019

The change looks okay to me, but I haven't fully understand what the problem was. Can you elaborate which invalid code we were generating?

@gnprice
Copy link
Author

gnprice commented Mar 2, 2019

Thanks for the review!

The change looks okay to me, but I haven't fully understand what the problem was. Can you elaborate which invalid code we were generating?

Sure. Here's the invalid-syntax example in the commit message on 4e7f31f, the commit that adds the tests:

  function f() {
    loop: do {
      loop: while (1) { // Error!
        break loop;     // After all -- what should this mean?
      }
      console.log("would this run?");
    } while (0);
  }

The rule on label collisions in JavaScript is: you can't reuse a name for one label that's an ancestor (or descendant) of another, unless separated by a function or class declaration. In the spec, this is expressed through the ContainsDuplicateLabels semantic rule -- I believe the total effect of all the various references to ContainsDuplicateLabels ends up being precisely that. (Though unfortunately if there's somewhere that this rule or others like it are stated more concisely, I haven't found it.)

One story you might tell as a rationale for this is that if the code inside were to then contain a break loop or continue loop statement, it'd be ambiguous what that statement should do. (And if it doesn't, the labels have no effect at all -- those are the only things that a label in JS can be used for.)


So, the new end-to-end test contains a plugin that innocently tries to introduce a label on a statement:

          const label = path.scope.generateUidIdentifier("label");
          path.replaceWith(t.labeledStatement(label, path.node));
          path.skip();

together with input code that just happens to already contain a colliding label _label: on an outer loop. Because the logic behind generateUidIdentifier wasn't seeing that, it would miss the collision and give us _label as the name to use for the new label, and the generated code would give a SyntaxError.

Likewise, the new unit test cases cover generating a name to use for a label, where the input code already contains a potentially-colliding label on either a descendant node or an ancestor node:

      const gen = (id, path) => path.scope.generateUid(id);
      expect(gen("foo", getPath("do { _foo: { } } while (0)"))).toBe("_foo2");
      expect(
	        gen("foo", getPathHere("_foo: do { HERE } while (0)")),
      ).toBe("_foo2");

Before this change:

  • generateUid('foo') at the root of do { _foo: { } } while (0) would produce _foo -- so a transform could innocently end up with _foo: do { _foo: { } } while (0);
  • generateUid('foo') at the point marked HERE in _foo: do { HERE } while (0) would produce _foo -- so a transform could innocently end up with _foo: do { _foo: { /* ... whatever ... */ } } while (0).

Does that help? Happy to put this longer explanation (or parts of it) in comments or commit messages, wherever you think it'd be helpful; or to explain more in another way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pkg: traverse (scope) PR: Bug Fix 🐛 A type of pull request used for our changelog categories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants