Skip to content

Conversation

pilleye
Copy link
Contributor

@pilleye pilleye commented Jun 2, 2024

Summary

Fixes #10841 by handling non-printable characters in the same way as GNU coreutils cat's --show-nonprinting flag. One downside of this approach is that unicode characters are no longer printed as their actual character, which could be problematic for non-Latin languages.

Test Plan

Tested with the same example in #10841, but would appreciate a pointer to where I can programmatically add some tests for various non-printable ASCII characters.

Screenshot 2024-06-01 at 7 49 00 PM

Example of Unicode Changes

Screenshot 2024-06-01 at 7 56 31 PM

Additional Considerations

As @carljm pointed out, this would make the code diverge from git diff.

Would appreciate any feedback on this quick, slightly hacky diff!

Copy link
Contributor

github-actions bot commented Jun 2, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for looking into this.

I'm a bit surprised that we don't see a single test failure. I think we may also need to make this change in

pub(super) struct MessageCodeFrame<'a> {

and

pub(super) struct Diff<'a> {

}

impl ShowNonprinting for &str {
fn show_nonprinting(&self) -> String {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we could return a Cow here to avoid allocating a string if it doesn't contain any non printable characters.

We may be able to do something similar to
https://github.com/mitsuhiko/insta/blob/569bbade9d9ff4bd43fe4138bdcbde00a6bf34c4/insta/src/output.rs#L325-L339

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

fn show_nonprinting(&self) -> String {
let mut output = String::with_capacity(self.len());

for byte in self.bytes() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand this correctly, that this will escape all non-ascii characters (in addition to some ascii characters)? I think that's undesirable. I would be very confused to see ä escaped in a Diff and wouldn't understand where this is coming from.

I recommend focusing on the non printable characters and would consider using unicode characters for them (similar to what insta does see link above) or biomejs (https://github.com/MichaReiser/biome/blob/96bbf507a006274518e8f5ae5af57c56b7771fb0/crates/biome_diagnostics/src/display/frame.rs#L394-L406)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was correct. Fixed in the recent version.

@pilleye pilleye force-pushed the pilleye/nonprinting branch from 6c99816 to 359080e Compare June 5, 2024 19:18
@pilleye
Copy link
Contributor Author

pilleye commented Jun 5, 2024

Fixed the unicode issues and Cow issues: Screenshot 2024-06-05 at 3 19 46 PM

Working on adding it to the other spots in the repo.

@pilleye pilleye force-pushed the pilleye/nonprinting branch from 359080e to 434dc95 Compare June 5, 2024 19:29
@pilleye pilleye force-pushed the pilleye/nonprinting branch from 434dc95 to cd1ae4a Compare June 5, 2024 19:33
@pilleye
Copy link
Contributor Author

pilleye commented Jun 5, 2024

Updated the snapshot tests, they look to be much more clear since the ^ is actually pointing to the correct location.

@pilleye pilleye requested a review from MichaReiser June 5, 2024 19:50
@zanieb
Copy link
Member

zanieb commented Jun 5, 2024

Thanks for contributing!

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is a huge improvement over what we used to have before.

Comment on lines 237 to 250
match change.tag() {
ChangeTag::Equal => write!(f, " {}", change.value())?,
ChangeTag::Delete => write!(f, "{}{}", "-".red(), change.value().red())?,
ChangeTag::Insert => write!(f, "{}{}", "+".green(), change.value().green())?,
ChangeTag::Equal => write!(f, " {}", change.value().show_nonprinting())?,
ChangeTag::Delete => write!(
f,
"{}{}",
"-".red(),
change.value().show_nonprinting().red()
)?,
ChangeTag::Insert => write!(
f,
"{}{}",
"+".green(),
change.value().show_nonprinting().green()
)?,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, I would introduce a variable with the cleaned-up code to reduce the diff size and reduce the risk that one path misses the call.

let change = change.value().show_nonprinting();

match change.tag {
	ChangeTag::Equal => write!(f, " {}", change)?,
	...
}

@MichaReiser MichaReiser enabled auto-merge (squash) June 8, 2024 06:19
@MichaReiser MichaReiser added the cli Related to the command-line interface label Jun 8, 2024
@MichaReiser MichaReiser merged commit ed94779 into astral-sh:main Jun 8, 2024
@pilleye pilleye deleted the pilleye/nonprinting branch June 9, 2024 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli Related to the command-line interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider handling of non-printable characters when showing code/diffs
3 participants