Skip to content

diff: add new diff package based on Go 1.19 patience diff implementation #157

@thepudds

Description

@thepudds

Hi there👋, I would be interested in sending a PR to add a new diff package to go-internal based on the patience diff implementation Russ added in CL 384255 during 1.19 dev cycle.

Compared to the the current pkg/diff used by go-internal, it is faster and much more memory efficient, running in O(n log n) time and I think it is O(n + m) space.

I currently have a copy here:
https://github.com/thepudds/patience-diff

I think go-internals would be a better home for it, including I think it is generally useful for the community, it would bring go-internal down to zero external dependencies (which is part of my selfish reason for being interested in a PR), as well as help deal w/ current testscript diff performance issues such as:

https://github.com/rogpeppe/go-internal/blob/master/testscript/cmd.go#L144-L148

	// pkg/diff is quadratic at the moment.
	// If the product of the number of lines in the inputs is too large,
	// don't call pkg.Diff at all as it might take tons of memory or time.
	// We found one million to be reasonable for an average laptop.
	const maxLineDiff = 1_000_000

A sample performance comparison (which I originally posted in pkg/diff#26 (comment)):

For this example [~130k lines], it runs in 0.07 sec and uses 38 MB RSS, vs. pkg/diff runs in 19 sec and 18 GB RSS (roughly ~250x slower and ~500x more RAM).
[...]
The patience-diff version is not a minimal diff:

$ wc -l *.out
96935 patience-diff.out
56391 pkg-diff.out

...but at least in theory, it is often friendlier to humans.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions