Skip to content

Conversation

kitbellew
Copy link
Collaborator

Add and use uber whitespace-token containers; these will contain contiguous sequences of horizontal whitespace, or sequences of newlines, ignoring trailing or embedded horizontal space.

kitbellew added 2 commits July 3, 2024 17:56
These will contain contiguous sequences of horizontal whitespace, or
sequences of newlines, ignoring trailing or embedded horizontal space.
@kitbellew kitbellew merged commit 86edfac into scalameta:main Jul 3, 2024
@kitbellew kitbellew deleted the 3786 branch July 3, 2024 16:20
@bjaglin
Copy link
Member

bjaglin commented Aug 4, 2024

This looks like a small breaking change for consumers of the tokenizer, since consecutive whitespaces are now collapsed into composite tokens.

Scalafix rules are impacted by that change since tokens are exposed at the document level and via each element of the tree without any indirection, so instead of trying to abstract that change away, I am considering signalling it via a breaking change bump.

Which got me wondering:

  • In the interest of minimizing impact on Scalafix rule authors, are there at this stage any other upcoming breaking changes on the horizon?
  • If not, do you think there would be a way to preserve the old behavior for a bit longer on the 4.9.x line? Probably not, as the changelog mentions "Support CRLF, multiple-whitespace sequences" so I assume it's a feature, not a side effect 😃

@kitbellew
Copy link
Collaborator Author

This looks like a small breaking change for consumers of the tokenizer, since consecutive whitespaces are now collapsed into composite tokens.

Oh well, I didn't realize scalafix was looking at tokens as well.

  • In the interest of minimizing impact on Scalafix rule authors, are there at this stage any other upcoming breaking changes on the horizon?

We usually try to avoid breaking changes... that is, if we know they are breaking. But no, no plans.

  • If not, do you think there would be a way to preserve the old behavior for a bit longer on the 4.9.x line? Probably not, as the changelog mentions "Support CRLF, multiple-whitespace sequences" so I assume it's a feature, not a side effect 😃

@tgodzik Sometime ago, we were debating whether to add some parameters to ScalametaParser, perhaps this might be one case. I think we ended up thinking Dialect is enough, don't remember why.

@tgodzik
Copy link
Collaborator

tgodzik commented Aug 5, 2024

I guess in this case a setting would make sense, though not sure how to sensibly add it. We can have a separate tokenized method maybe?

@bjaglin
Copy link
Member

bjaglin commented Aug 15, 2024

We can have a separate tokenized method maybe?

Any chance you could give me some pointers so that I look at that to unblock the bump on Scalafix side? I went through the tokenisation code paths, but I am not sure where to start.

@tgodzik
Copy link
Collaborator

tgodzik commented Aug 15, 2024

I guess we could just add another method to ScalametaTokenizer which will take options. We can then just use that explicitely from scalafix? If needed we can expose it later via the extension methods, but I think it should not be necessary right now.

@kitbellew
Copy link
Collaborator Author

@bjaglin let me do it in the next day or two, and release, if this is blocking you. I'm generally in the middle of fixing some bugs that i found by attempting to use scalafmt on the dotty codebase, but can complete the fixes in the next patch.

@kitbellew
Copy link
Collaborator Author

does scalafix use the tokenizer directly? or through the parser?

@bjaglin
Copy link
Member

bjaglin commented Aug 15, 2024

does scalafix use the tokenizer directly? or through the parser?

Only through the parser.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants