-
Notifications
You must be signed in to change notification settings - Fork 81
Description
Hi there👋, I would be interested in sending a PR to add a new diff package to go-internal based on the patience diff implementation Russ added in CL 384255 during 1.19 dev cycle.
Compared to the the current pkg/diff used by go-internal, it is faster and much more memory efficient, running in O(n log n) time and I think it is O(n + m) space.
I currently have a copy here:
https://github.com/thepudds/patience-diff
I think go-internals would be a better home for it, including I think it is generally useful for the community, it would bring go-internal down to zero external dependencies (which is part of my selfish reason for being interested in a PR), as well as help deal w/ current testscript diff performance issues such as:
https://github.com/rogpeppe/go-internal/blob/master/testscript/cmd.go#L144-L148
// pkg/diff is quadratic at the moment.
// If the product of the number of lines in the inputs is too large,
// don't call pkg.Diff at all as it might take tons of memory or time.
// We found one million to be reasonable for an average laptop.
const maxLineDiff = 1_000_000
A sample performance comparison (which I originally posted in pkg/diff#26 (comment)):
For this example [~130k lines], it runs in 0.07 sec and uses 38 MB RSS, vs. pkg/diff runs in 19 sec and 18 GB RSS (roughly ~250x slower and ~500x more RAM).
[...]
The patience-diff version is not a minimal diff:$ wc -l *.out 96935 patience-diff.out 56391 pkg-diff.out
...but at least in theory, it is often friendlier to humans.