Skip to content

Conversation

g-harel
Copy link
Contributor

@g-harel g-harel commented Feb 26, 2019

closes #5853

Add an --only-changed flag to be used in conjunction with --write to avoid re-checking files that were not changed since they were last written (with the same formatting configuration). These results are stored in the .prettiercache file by default, but this can be configured using the PRETTIER_CACHE_LOCATION environment variable.

prettier --write --only-changed ...

Here are some rough manual benchmarks using

time ./bin/prettier.js --write --only-changed "**/*.js"
prettier source ~300 file project
write 31s 8s
only-changed initial 31s 8s
only-changed 2nd run 8s 2s

  • I’ve added tests to confirm my change works.
  • (If changing the API or CLI) I’ve documented the changes I’ve made (in the docs/ directory)
  • (If not an internal change) I’ve added my changes to the CHANGELOG.unreleased.md file following the template.
  • I’ve read the contributing guidelines.

Try the playground for this PR

@g-harel
Copy link
Contributor Author

g-harel commented Mar 2, 2019

@lydell

I'm still missing some coverage % for the error handling in the changed cache implementation. I'd like to add I had to add some unit tests, but I can't find an appropriate place to put them!

@g-harel g-harel marked this pull request as ready for review March 5, 2019 21:15
@mbrookes
Copy link

mbrookes commented Mar 6, 2019

These results are stored in the .prettiercache file by default

@g-harel Going back to the discussion you linked to in the related issue... Even though it seems no consensus was reached, what do you think about using .cache/prettier? Prettier is ubiquitous enough that it could help set the standard, and if not, it's the same number of entries (one!) to add to .gitignore.

(Similarly I'd love do see tools adopt .config/toolname as the default configuration file location, rather than littering the root dir, but one step at a time!)

@g-harel
Copy link
Contributor Author

g-harel commented Mar 7, 2019

@mbrookes

Absolutely, I only used .prettiercache to match the pattern from eslint.

Do you know who would need to make the final call?

@lydell
Copy link
Member

lydell commented Mar 8, 2019

.cache/ might not be a good idea, because apparently it's used by Parcel:

https://github.com/github/gitignore/blob/37abd930a6893fd3a716c48934e704795bc2a45c/Node.gitignore#L61-L62

@mbrookes
Copy link

mbrookes commented Mar 8, 2019

apparently it's used by Parcel:

@lydell Great! The whole idea is that it's a single ubiquitous directory to be shared by any tool or utility that wishes to cache some data, rather than polluting the root directory with multiple .xyzcache files.

@g-harel
Copy link
Contributor Author

g-harel commented Mar 8, 2019

@mbrookes From what I remember using parcel, it doesn't use the folder in the way you outlined. It uses the entire folder as a cache and creates sub-directories with hex filenames to store the data.

It looks like the two would not interfere, but given it's parcel's cache file, it might be best to avoid it.

@mbrookes
Copy link

mbrookes commented Mar 9, 2019

Personally I think it’s a missed opportunity. It would be naive of the Parcel devs to assume that a directory called .cache would be exclusive to Parcel. However, if someone is using both Parcel and prettier and it does cause problems, both tools have an option change the cache directory until Parcel is fixed.

@azz
Copy link
Member

azz commented Mar 10, 2019

Just playing devil's advocate on this one, if this is a CLI-only change, wouldn't it be better to have an external CLI tool that conditionally operates on files?

e.g. say we had a CLI tool called smart-exec, that looked something like:

$ smart-exec "**/*.js" -- prettier --write

This would check a local cache before invoking the command after -- for each file.

You could even wire it up to .prettierignore

$ smart-exec "**/*.js" --ignore-path=.prettierignore -- prettier --write

There's nothing Prettier-specific about the implementation, so it should be perfectly re-usable.

Alternatively, if you wanted to do a bespoke prettier-cache tool, you could go zero-config ✨:

$ prettier-changed

This would behave exactly the same as the previous command, you could even build it on top of proposed smart-exec.

I worry about adding too much frills to prettier-core (it's why pretty-quick is a standalone tool). UNIX philosophy.

@g-harel
Copy link
Contributor Author

g-harel commented Mar 15, 2019

I think this feature is important enough on large projects to be included with the "core" CLI.

The current approach (in this PR) can be abstracted to a different tool, but it might make sense to eventually extend caching behavior (ex. --check).

I feel prettier-quick's dependency on external tools (git, svn) puts it in a different league because of the increased complexity and required environment configuration.

@g-harel
Copy link
Contributor Author

g-harel commented Apr 9, 2019

Any updates on this? What is the decision making process?

if (context.argv["only-changed"] && !context.argv["write"]) {
context.logger.error("Cannot use --only-changed without --write.");
process.exit(1);
}
Copy link
Member

@alexander-akait alexander-akait Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should enable cache by default and better add option --no-cache (better DX, this do many tools. example jest)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it might be best to avoid changing existing behavior silently, especially since this flag writes a file that needs to gitignored. I'll defer to the maintainers, y'all know best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @prettier/core

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can do this for v2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, probably better to do it for v2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what is problem implement --no-cache by default?

return crypto
.createHash("sha1")
.update(data)
.digest("base64");
Copy link
Member

@alexander-akait alexander-akait Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe md4 (better use xxHash, but xxHash doesn't built-in in node 😞 ), sha1 is very slow, also we don't need strong here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a bit of a discussion about this on the issue and ended up finding that hashing is not a bottleneck in this case. I'm open to changing it, but I'm not sure it's worth adding a dependency.

#5853 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okey, let's wait other feedback, for me md4 is best solution here, when you have really big codebase, it is increase perf around 5%

Copy link
Member

@alexander-akait alexander-akait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, some comments

@JelteF
Copy link

JelteF commented Apr 30, 2019

This looks like a really nice addition. Would this also work when using --list-different instead of --write? i.e. it would only check the files not checked by the last --write

@g-harel g-harel force-pushed the only-changed branch 2 times, most recently from b0e1335 to a5e9f3f Compare May 4, 2019 20:34
@g-harel
Copy link
Contributor Author

g-harel commented May 4, 2019

@evilebottnawi

Sorry about the delay, I was in exam season.

I've changed the cache directory and switched to write-file-atomic. Let me know what you think of the changes!

@chdsbd
Copy link

chdsbd commented May 25, 2019

This is great addition to prettier. Thank you for taking the time to build this feature.

I found myself wondering why python/black was so much faster than Prettier and it's because Prettier doesn't have this feature yet.

What needs to be done before this PR can get merged?

@g-harel
Copy link
Contributor Author

g-harel commented Jun 6, 2019

@chdsbd

There is a discussion left unresolved about potentially having this behavior be enabled by default, which would need to be introduced in the next major version bump.

@evilebottnawi @lydell

Anything I can do to move this along?

@mbrookes
Copy link

mbrookes commented Jun 6, 2019

Anything I can do to move this along?

Storm prettier HQ and merge it yourself? 😆

Copy link
Member

@lydell lydell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this the way it is! Good job!

expect(cacheContentsAfter).not.toBe(cacheContentsBefore);
});

describe("ChangedCache", () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have much of those unit tests in the repo, but this is fine

@duailibe
Copy link
Collaborator

duailibe commented Jun 7, 2019

@g-harel Thanks for pushing this through. This will go out on the next minor release, so I'll wait a bit until 1.18 gets more stable, so we don't have to push any other patches

Also don't worry about updating the branch, I'll take care of any conflicts before merge

@duailibe duailibe merged commit 6fae09b into prettier:master Jul 22, 2019
@@ -441,6 +443,18 @@ function formatFiles(context) {
context.logger.log("Checking formatting...");
}

let changedCache = null;
if (context.argv["only-changed"]) {
const cacheDir = findCacheDir({ name: "prettier", create: true });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @g-harel need use || os.tmp() https://github.com/avajs/find-cache-dir#findcachediroptions

It returns a string containing the absolute path to the cache directory, or undefined if package.json was never found.

In theory you can install prettier globally and doesn't have package.json

Copy link
Contributor Author

@g-harel g-harel Jul 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! I've openned a PR with a fix (#6327).

@lock lock bot added the locked-due-to-inactivity Please open a new issue and fill out the template instead of commenting. label Oct 21, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
locked-due-to-inactivity Please open a new issue and fill out the template instead of commenting.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add cache option to avoid processing unchanged files
9 participants