Skip to content

Conversation

ryanofsky
Copy link
Contributor

Improve ArgsManager::GetPathArg method added in recent PR #24265, so it is usable more places. This PR starts to use it for the -settings option. This can also be helpful for #24274 which is parsing more path options.

  • Add GetPathArg default argument so it is less awkward to use to parse options that have default values.
  • Fix GetPathArg negated argument handling. Return path{} not path{"0"} when path argument is negated.
  • Add unit tests for default and negated cases
  • Move GetPathArg method declaration next to GetArg declaration. The two methods are close substitutes for each, so this should help keep them consistent and make them more discoverable.

@ryanofsky
Copy link
Contributor Author

Updated 615da97 -> 5effc38 (pr/patharg.1 -> pr/patharg.2, compare) to fix typo and window build error https://cirrus-ci.com/task/6421739189043200?logs=ci#L3944

@kiminuo
Copy link
Contributor

kiminuo commented Feb 10, 2022

I wonder what you would think (conceptually) about taking it a step further (not necessarily in this PR) and remove LocalTestingSetup which complicates understanding of the test code as one needs to take into account more code: kiminuo@dceafb6 -> kiminuo@a01d80c

/**
* Return path argument or default value
*
* It is guaranteed that the returned path has no trailing slashes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless the default value has them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passes on Ubuntu:

ResetArgs("");
BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default/"), fs::path{"default/"});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

unless the default value has them

Thanks, rewrote this

@prusnak
Copy link
Contributor

prusnak commented Feb 10, 2022

I closed #24274 in favor of this PR.

@ryanofsky Maybe incorporate 7db1e1f into this PR?

@@ -336,6 +326,16 @@ class ArgsManager
*/
std::string GetArg(const std::string& strArg, const std::string& strDefault) const;

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to change the order of methods in the header file? This makes the review unnecessarily harder than it has to be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

Do we really need to change the order of methods in the header file? This makes the review unnecessarily harder than it has to be.

It is just moving a comment, and the comment is entirely rewritten at this point, but if you still think this is a problem, just let me know what would be a better way to fix this.

Comment on lines 254 to 269
ResetArgs("-dir=override");
BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"override"});
ResetArgs("");
BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"default"});
ResetArgs("-nodir");
BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{""});
Copy link
Contributor

@prusnak prusnak Feb 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add the same checks but without the second parameter (default value) given:

    ResetArgs("-dir=override");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{"override"});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"override"});
    ResetArgs("");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"default"});
    ResetArgs("-nodir");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

Let's add the same checks but without the second parameter (default value) given:

    ResetArgs("-dir=override");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{"override"});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"override"});
    ResetArgs("");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"default"});
    ResetArgs("-nodir");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{});

Those cases are already covered above, and I think this style of writing the test makes it less readable and useful as a clear description of GetPathArg behavior. GetPathArg is supposed to be a convenience function (it's a high-level wrapper around lower-level GetArg and GetSetting methods), so if you want to ensure that GetPathArg has convenient and legible behavior, it helps to be to easily see how one GetPathArg call responds to different command lines. Having to think about how different GetPathArg calls that would never appear together in code would respond to the same command line separates the test from reality and makes it harder to understand.

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

  • Fix GetPathArg negated argument handling. Return path{} not path{"0"} when path argument is negated.

👍

Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated 5effc38 -> 693cdca (pr/patharg.2 -> pr/patharg.3, compare) adding @prusnak's commit from #24274 and tweaking comments and tests following some suggestions.

re: #24306 (comment)

I wonder what you would think (conceptually) about taking it a step further (not necessarily in this PR) and remove LocalTestingSetup which complicates understanding of the test code as one needs to take into account more code: kiminuo@dceafb6

I think the LocalTestingSetup in get_args is actually a pretty good and appropriate use of a test fixture. But I understand that test fixtures are less widely used in our codebase than some other codebases, so people may want to get rid of them. Normally test fixtures are used to write small isolated tests, but for some reason we tend to write large multipart tests. In any case, I wouldn't object to getting rid of this test fixture if you also renane m_local_args to local_args to reflect the fact that it is no longer a member variable.

re: #24306 (comment)

@ryanofsky Maybe incorporate 7db1e1f into this PR?

Thanks! Added as third commit

Comment on lines 254 to 269
ResetArgs("-dir=override");
BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"override"});
ResetArgs("");
BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"default"});
ResetArgs("-nodir");
BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{""});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

Let's add the same checks but without the second parameter (default value) given:

    ResetArgs("-dir=override");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{"override"});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"override"});
    ResetArgs("");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{"default"});
    ResetArgs("-nodir");
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir"), fs::path{});
    BOOST_CHECK_EQUAL(m_local_args.GetPathArg("-dir", "default"), fs::path{});

Those cases are already covered above, and I think this style of writing the test makes it less readable and useful as a clear description of GetPathArg behavior. GetPathArg is supposed to be a convenience function (it's a high-level wrapper around lower-level GetArg and GetSetting methods), so if you want to ensure that GetPathArg has convenient and legible behavior, it helps to be to easily see how one GetPathArg call responds to different command lines. Having to think about how different GetPathArg calls that would never appear together in code would respond to the same command line separates the test from reality and makes it harder to understand.

@@ -336,6 +326,16 @@ class ArgsManager
*/
std::string GetArg(const std::string& strArg, const std::string& strDefault) const;

/**
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

Do we really need to change the order of methods in the header file? This makes the review unnecessarily harder than it has to be.

It is just moving a comment, and the comment is entirely rewritten at this point, but if you still think this is a problem, just let me know what would be a better way to fix this.

/**
* Return path argument or default value
*
* It is guaranteed that the returned path has no trailing slashes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

unless the default value has them

Thanks, rewrote this

@kiminuo
Copy link
Contributor

kiminuo commented Feb 11, 2022

I wonder what you would think (conceptually) about taking it a step further (not necessarily in this PR) and remove LocalTestingSetup which complicates understanding of the test code as one needs to take into account more code: kiminuo@dceafb6

I think the LocalTestingSetup in get_args is actually a pretty good and appropriate use of a test fixture. But I understand that test fixtures are less widely used in our codebase than some other codebases, so people may want to get rid of them. Normally test fixtures are used to write small isolated tests, but for some reason we tend to write large multipart tests.

For me it's simply a matter of "how much code I need to read to understand the test fully" (not about test fixtures in general). With LocalTestingSetup it's more code (not trivial imo) and I fear I might miss something. For this reason alone, I think it's worth to apply the suggestion because more people will be able to read the test with high confidence that test is correct and they understand all the prerequisites. Obviously, I'm not a great C++ programmer but then my argument is general: Adding the test fixture for these tests do not seem to add substantial value and without the test fixture we can clearly see what is actually needed to be done for the tests to pass (I find it very educational). If I fail to see some test fixture advantages, please let me know. Anyway, it's just a suggestion. Feel free to apply it or ignore it as you find it the best.

In any case, I wouldn't object to getting rid of this test fixture if you also renane m_local_args to local_args to reflect the fact that it is no longer a member variable.

Sure. 👍 Here: kiminuo@a01d80c

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 693cdca

693cdca commit message "Use GetFileArg where possible" looks outdated, no?

@@ -135,7 +135,7 @@ static const char* BITCOIN_PID_FILENAME = "bitcoind.pid";

static fs::path GetPidFile(const ArgsManager& args)
{
return AbsPathForConfigVal(fs::PathFromString(args.GetArg("-pid", BITCOIN_PID_FILENAME)));
return AbsPathForConfigVal(args.GetPathArg("-pid", BITCOIN_PID_FILENAME));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

693cdca

Suggested change
return AbsPathForConfigVal(args.GetPathArg("-pid", BITCOIN_PID_FILENAME));
return AbsPathForConfigVal(args.GetPathArg("-pid", fs::path{BITCOIN_PID_FILENAME}));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

Suggestion seems worse to me, since it is hardcoding assumptions about the type of BITCOIN_PID_FILENAME argument and the GetPathArg parameter in a place where they should not be relevant. If either of those type changes, or if more efficient overloads are added, this code could break or be deoptimized unnecessarily. This also adds unhelpful noise/verbosity to the code. I think the advantage of using statically typed langauges like C++ is that you declare types up front, and after that rely on the compiler to detect errors and optimize things appropriately, and don't need to add manual type casts or type checks in places like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is similar reasoning applicable to 5b946ed "util, refactor: Use GetPathArg to read "-settings" value":

fs::path settings = GetPathArg("-settings", fs::path{BITCOIN_SETTINGS_FILENAME});
?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is similar reasoning applicable to 5b946ed "util, refactor: Use GetPathArg to read "-settings" value":

Yes thanks for pointing that out. Since this PR was merged already, I wouldn't open a separate PR just to simplify that line, but there should be other chances to clean it up in the future

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed as a part of #24675.

Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviews.

Updated 693cdca -> d7005c6 (pr/patharg.3 -> pr/patharg.4, compare) just taking suggestions to fix a commit message and extend PR to cover -asmap

re: #24306 (review)

693cdca commit message "Use GetFileArg where possible" looks outdated, no?

Thanks, updated the commit title now

re: #24306 (comment)

Feel free to apply it or ignore it as you find it the best.

In any case, I wouldn't object to getting rid of this test fixture if you also renane m_local_args to local_args to reflect the fact that it is no longer a member variable.

Sure. +1 Here: kiminuo@a01d80c

Thanks, like I said I'm perfectly happy with this change. I do think overlap with this PR is pretty minor (and it greatly increases the number of lines changed) so I posted it separately as #24375

@@ -135,7 +135,7 @@ static const char* BITCOIN_PID_FILENAME = "bitcoind.pid";

static fs::path GetPidFile(const ArgsManager& args)
{
return AbsPathForConfigVal(fs::PathFromString(args.GetArg("-pid", BITCOIN_PID_FILENAME)));
return AbsPathForConfigVal(args.GetPathArg("-pid", BITCOIN_PID_FILENAME));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

Suggestion seems worse to me, since it is hardcoding assumptions about the type of BITCOIN_PID_FILENAME argument and the GetPathArg parameter in a place where they should not be relevant. If either of those type changes, or if more efficient overloads are added, this code could break or be deoptimized unnecessarily. This also adds unhelpful noise/verbosity to the code. I think the advantage of using statically typed langauges like C++ is that you declare types up front, and after that rely on the compiler to detect errors and optimize things appropriately, and don't need to add manual type casts or type checks in places like this.

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 19, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #24455 (refactor: Split ArgsManager out of util/system by Empact)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK d7005c6

@ryanofsky ryanofsky force-pushed the pr/patharg branch 3 times, most recently from d7005c6 to b1443f7 Compare February 25, 2022 16:41
Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated d7005c6 -> b1443f7 (pr/patharg.4 -> pr/patharg.5, compare), just tweaking comment as suggested

if (asmap_path.empty()) {
asmap_path = fs::PathFromString(DEFAULT_ASMAP_FILENAME);
}
fs::path asmap_path = args.GetPathArg("-asmap", DEFAULT_ASMAP_FILENAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned this may not be equivalent on WIN32: PathFromString is not called on the default value in GetPathArg, so DEFAULT_ASMAP_FILENAME, a char *, is interpreted / converted by GetPathArg into a fs::path. I see path(const char* c) : std::filesystem::path(c) {} in fs.h. PathFromString is different on WIN32: u8path(string).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned this may not be equivalent on WIN32: PathFromString is not called on the default value in GetPathArg, so DEFAULT_ASMAP_FILENAME, a char *, is interpreted / converted by GetPathArg into a fs::path. I see path(const char* c) : std::filesystem::path(c) {} in fs.h. PathFromString is different on WIN32: u8path(string).

That's correct. The behavior is inherently different on different platforms because windows path objects represent paths with wchar_t while POSIX paths are represented with char. On windows, the fs::path const char* constructor will call the std::filesystem::path constructor and do a locale conversion from char to wchar_t using the the current code page. Normally we want to avoid conversions based on the code page, and the reason we define an fs::path class instead of using std::filesystem::path is to prevent these conversions from happening accidentally. But in this case, DEFAULT_ASMAP_FILENAME is just ASCII text, so the conversion should be harmless and always result in the same path, regardless of code page. That's why the const char* constructor is allowed, and the reason for the code comment "Allow literal string arguments, which are safe as long as the literals are ASCII."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We simply assume that char pointers are ASCII, but this is not true. In fact the unit tests get this wrong by passing char pointer to a string containing non ASCII chars (emojis).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

We simply assume that char pointers are ASCII, but this is not true. In fact the unit tests get this wrong by passing char pointer to a string containing non ASCII chars (emojis).

I don't think I understand. I'm not aware of any places where unit tests or other code are doing something wrong. It is true that the fs::path interface does not guarantee there are no invalid conversions. It's just trying to add some safety and prevent unintended conversions. In any case, I'm happy to make changes if there are any suggestions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example in the unit test:

    fs::path tmpfile1 = tmpfolder / "fs_tests_₿_🏃";

I think the second argument is a char pointer, but it is not ascii

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

I think the second argument is a char pointer, but it is not ascii

You're right, that's a real bug. Created #24469 to fix and maybe implement a followup

@Empact
Copy link
Contributor

Empact commented Mar 1, 2022

Concept ACK

ryanofsky and others added 3 commits March 2, 2022 06:09
Let GetPathArg method be used more places for path arguments that have
default values, like "-settings" and BITCOIN_SETTINGS_FILENAME in the
next commit.

Also:

- Fix negated argument handling. Return path{} not path{"0"} when path
  argument is negated.

- Add new tests for default and negated cases

- Move GetPathArg() method declaration next to GetArg() declarations.
  The two methods are close substitutes for each other, so this should
  help keep them consistent and make them more discoverable.
Take advantage of GetPathArg to simplify code slightly.
Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
maflcko pushed a commit to bitcoin-core/gui that referenced this pull request Mar 2, 2022
…_tests test file.

5d7f225 Do not use `LocalTestingSetup` in getarg_tests test file. (Kiminuo)

Pull request description:

  Avoid using a test fixture in getarg_tests for better readability. Change was implemented by _kiminuo_ and posted bitcoin/bitcoin#24306 (comment)

ACKs for top commit:
  kiminuo:
    ACK 5d7f225

Tree-SHA512: 0fd98622010e6923e91c66447a1d0861bf344a65d86a313dff7d428c089b1740a25f699327f6ed4c163255f270bcbd4f7be962bb551862214f9b9e395d40df04
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Mar 2, 2022
…est file.

5d7f225 Do not use `LocalTestingSetup` in getarg_tests test file. (Kiminuo)

Pull request description:

  Avoid using a test fixture in getarg_tests for better readability. Change was implemented by _kiminuo_ and posted bitcoin#24306 (comment)

ACKs for top commit:
  kiminuo:
    ACK 5d7f225

Tree-SHA512: 0fd98622010e6923e91c66447a1d0861bf344a65d86a313dff7d428c089b1740a25f699327f6ed4c163255f270bcbd4f7be962bb551862214f9b9e395d40df04
Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased b1443f7 -> 60aa179 (pr/patharg.5 -> pr/patharg.6, compare) due to conflict with #24375

if (asmap_path.empty()) {
asmap_path = fs::PathFromString(DEFAULT_ASMAP_FILENAME);
}
fs::path asmap_path = args.GetPathArg("-asmap", DEFAULT_ASMAP_FILENAME);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24306 (comment)

We simply assume that char pointers are ASCII, but this is not true. In fact the unit tests get this wrong by passing char pointer to a string containing non ASCII chars (emojis).

I don't think I understand. I'm not aware of any places where unit tests or other code are doing something wrong. It is true that the fs::path interface does not guarantee there are no invalid conversions. It's just trying to add some safety and prevent unintended conversions. In any case, I'm happy to make changes if there are any suggestions.

ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Mar 3, 2022
Call fs::u8path to convert some UTF-8 string literals to paths, instead
of relying on implicit conversions. The implicit conversions incorrectly
decode const char* paths using the current windows codepage, instead of
treating them as UTF-8. This could cause test failures depending what
environment windows tests are run in.

Issue was reported by MarcoFalke <falke.marco@gmail.com> in
bitcoin#24306 (comment)
Copy link
Contributor

@w0xlt w0xlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested ACK 60aa179 on Ubuntu 21.10

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 60aa179

@maflcko maflcko merged commit 6687bb2 into bitcoin:master Mar 7, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Mar 7, 2022
maflcko pushed a commit to bitcoin-core/gui that referenced this pull request Mar 10, 2022
…ing paths

2f5fd3c test: Correctly decode UTF-8 literal string paths (Ryan Ofsky)

Pull request description:

  Call `fs::u8path()` to convert some UTF-8 string literals to paths, instead of relying on the implicit conversion. Fake Macro pointed out in bitcoin/bitcoin#24306 (comment) that `fs_tests` are incorrectly decoding some literal UTF-8 paths using the current windows codepage, instead of treating them as UTF-8. This could cause test failures depending what environment windows tests are run under.

  The `fs::path` class exists to avoid problems like this, but because it is lenient with `const char*` conversions, under assumption that they are ["safe as long as the literals are ASCII"](https://github.com/bitcoin/bitcoin/blob/727b0cb59259ac63c627b09b503faada1a89bfb8/src/fs.h#L39), bugs like this are still possible.

  If we think this is a concern, followup options to try to prevent this bug in the future are:

  0. Do nothing
  1. Improve the "safe as long as the literals are ASCII" comment. Make it clear that non-ASCII strings are invalid.
  2. Drop the implicit `const char*` conversion functions. This would be nice because it would simplifify the `fs::path` class a little, while making it safer. Drawback is that it would require some more verbosity from callers. For example, instead of `GetDataDirNet() / "mempool.dat"` they would have to write `GetDataDirNet() / fs::u8path("mempool.dat")`
  3. Keep the implicit `const char*` conversion functions, but make them call `fs::u8path()` internally. Change the "safe as long as the literals are *ASCII*" comment to "safe as long as the literals are *UTF-8*".

  I'd be happy with 0, 1, or 2. I'd be a little resistant to 3 even though it was would add more safety, because it would slightly increase complexity, and because I think it would encourage representing paths as strings, when I think there are so many footguns associated with paths as strings, that it's best to convert strings to paths at the earliest point possible, and convert paths to strings at the latest point possible.

ACKs for top commit:
  laanwj:
    Code review ACK 2f5fd3c
  w0xlt:
    crACK 2f5fd3c

Tree-SHA512: 9c56714744592094d873b79843b526d20f31ed05eff957d698368d66025764eae8bfd5305d5f7b6cc38803f0d85fa5552003e5c6cacf1e076ea6d313bcbc960c
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Mar 11, 2022
2f5fd3c test: Correctly decode UTF-8 literal string paths (Ryan Ofsky)

Pull request description:

  Call `fs::u8path()` to convert some UTF-8 string literals to paths, instead of relying on the implicit conversion. Fake Macro pointed out in bitcoin#24306 (comment) that `fs_tests` are incorrectly decoding some literal UTF-8 paths using the current windows codepage, instead of treating them as UTF-8. This could cause test failures depending what environment windows tests are run under.

  The `fs::path` class exists to avoid problems like this, but because it is lenient with `const char*` conversions, under assumption that they are ["safe as long as the literals are ASCII"](https://github.com/bitcoin/bitcoin/blob/727b0cb59259ac63c627b09b503faada1a89bfb8/src/fs.h#L39), bugs like this are still possible.

  If we think this is a concern, followup options to try to prevent this bug in the future are:

  0. Do nothing
  1. Improve the "safe as long as the literals are ASCII" comment. Make it clear that non-ASCII strings are invalid.
  2. Drop the implicit `const char*` conversion functions. This would be nice because it would simplifify the `fs::path` class a little, while making it safer. Drawback is that it would require some more verbosity from callers. For example, instead of `GetDataDirNet() / "mempool.dat"` they would have to write `GetDataDirNet() / fs::u8path("mempool.dat")`
  3. Keep the implicit `const char*` conversion functions, but make them call `fs::u8path()` internally. Change the "safe as long as the literals are *ASCII*" comment to "safe as long as the literals are *UTF-8*".

  I'd be happy with 0, 1, or 2. I'd be a little resistant to 3 even though it was would add more safety, because it would slightly increase complexity, and because I think it would encourage representing paths as strings, when I think there are so many footguns associated with paths as strings, that it's best to convert strings to paths at the earliest point possible, and convert paths to strings at the latest point possible.

ACKs for top commit:
  laanwj:
    Code review ACK 2f5fd3c
  w0xlt:
    crACK 2f5fd3c

Tree-SHA512: 9c56714744592094d873b79843b526d20f31ed05eff957d698368d66025764eae8bfd5305d5f7b6cc38803f0d85fa5552003e5c6cacf1e076ea6d313bcbc960c
hebasto added a commit to hebasto/bitcoin that referenced this pull request Mar 25, 2022
hebasto added a commit to hebasto/bitcoin that referenced this pull request Mar 25, 2022
hebasto added a commit to hebasto/bitcoin that referenced this pull request Mar 25, 2022
hebasto added a commit to hebasto/bitcoin that referenced this pull request Apr 21, 2022
janus pushed a commit to BitgesellOfficial/bitgesell that referenced this pull request Jul 24, 2022
Call fs::u8path to convert some UTF-8 string literals to paths, instead
of relying on implicit conversions. The implicit conversions incorrectly
decode const char* paths using the current windows codepage, instead of
treating them as UTF-8. This could cause test failures depending what
environment windows tests are run in.

Issue was reported by MarcoFalke <falke.marco@gmail.com> in
bitcoin/bitcoin#24306 (comment)
fanquake added a commit to bitcoin-core/gui that referenced this pull request Aug 4, 2022
…idely

b01f336 util, refactor: Drop explicit conversion to fs::path (Hennadii Stepanov)
138c668 util, refactor: Use GetPathArg to read "-rpccookiefile" value (Hennadii Stepanov)
1276090 util, refactor: Use GetPathArg to read "-conf" value (Hennadii Stepanov)

Pull request description:

  This PR is a continuation of bitcoin/bitcoin#24265 and bitcoin/bitcoin#24306.

  Now the following command-line arguments / configure options been read with the `GetPathArg` method:
  - `-conf`, also `includeconf` values been normalized
  - `-rpccookiefile`

ACKs for top commit:
  jarolrod:
    Code Review ACK b01f336
  ryanofsky:
    Code review ACK b01f336. Changes since last review: just dropping first commit (NormalizedPathFromString) as suggested

Tree-SHA512: 2d26d50b73542acdbcc63a32068977b2a49a017d31ca337471a0446f964eb0a6e3e4e3bb1ebe6771566a260f2cae3bc2ebe93b4b523183cea0d51768daab85c9
Rspigler pushed a commit to Rspigler/bitcoin that referenced this pull request Aug 21, 2022
@bitcoin bitcoin locked and limited conversation to collaborators Mar 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants