Skip to content

Conversation

ryanofsky
Copy link
Contributor

Add more fs::path operator/ and operator+ overloads to prevent unsafe string->path conversions on Windows that would cause strings to be decoded according to the current Windows locale & code page instead of the correct string encoding.

Update application code to deal with loss of implicit string->path conversions by calling fs::u8path or fs::PathFromString explicitly, or by just changing variable types from std::string to fs::path to avoid conversions altogether, or make them happen earlier.

In all cases, there's no change in behavior either (1) because strings only contained ASCII characters and would be decoded the same regardless of what encoding was used, or (2) because of the 1:1 mapping between paths and strings using the PathToString and PathFromString functions.

Motivation for this PR was just that I was experimenting with #24469 and noticed that operations like fs::path / std::string were allowed, and I thought it would be better not to allow them.

@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 3, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #18904 (Don't call lsn_reset in periodic flush by bvbfan)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@hebasto
Copy link
Member

hebasto commented Mar 5, 2022

Concept ACK.

@laanwj
Copy link
Member

laanwj commented Apr 7, 2022

Concept ACK

…operators

Add more fs::path operator/ and operator+ overloads to prevent unsafe
string->path conversions on Windows that would cause strings to be
decoded according to the current Windows locale & code page instead of
the correct string encoding.

Update application code to deal with loss of implicit string->path
conversions by calling fs::u8path or fs::PathFromString explicitly, or
by just changing variable types from std::string to fs::path to avoid
conversions altoghther, or make them happen earlier.

In all cases, there's no change in behavior either (1) because strings
only contained ASCII characters and would be decoded the same regardless
of what encoding was used, or (2) because of the 1:1 mapping between
paths and strings using the PathToString and PathFromString functions.

Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
@ryanofsky
Copy link
Contributor Author

Rebased 67ca71e -> cb9f6b1 (pr/patha.1 -> pr/patha.2, compare) due to conflict with #23732, and adding fix for windows build error https://cirrus-ci.com/task/4639862308470784

@hebasto
Copy link
Member

hebasto commented Apr 22, 2022

Maybe add

diff --git a/src/qt/guiutil.cpp b/src/qt/guiutil.cpp
index 1e67b1a0d..706cf2f7d 100644
--- a/src/qt/guiutil.cpp
+++ b/src/qt/guiutil.cpp
@@ -508,7 +508,7 @@ fs::path static StartupShortcutPath()
         return GetSpecialFolderPath(CSIDL_STARTUP) / "Bitcoin.lnk";
     if (chain == CBaseChainParams::TESTNET) // Remove this special case when CBaseChainParams::TESTNET = "testnet4"
         return GetSpecialFolderPath(CSIDL_STARTUP) / "Bitcoin (testnet).lnk";
-    return GetSpecialFolderPath(CSIDL_STARTUP) / strprintf("Bitcoin (%s).lnk", chain);
+    return GetSpecialFolderPath(CSIDL_STARTUP) / fs::u8path(strprintf("Bitcoin (%s).lnk", chain));
 }
 
 bool GetStartOnSystemStartup()
diff --git a/src/wallet/bdb.cpp b/src/wallet/bdb.cpp
index 82693eae8..dbc297f45 100644
--- a/src/wallet/bdb.cpp
+++ b/src/wallet/bdb.cpp
@@ -302,7 +302,7 @@ BerkeleyDatabase::~BerkeleyDatabase()
         assert(!m_db);
         size_t erased = env->m_databases.erase(m_filename);
         assert(erased == 1);
-        env->m_fileids.erase(m_filename);
+        env->m_fileids.erase(fs::PathToString(m_filename));
     }
 }
 

?

It makes CI greener -- https://cirrus-ci.com/build/6528392144093184 :)

src/fs.h Outdated
return p1;
}
static inline path operator/(path p1, std::filesystem::path p2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why std::filesystem::path for second parameter? Isn't it safer to limit the usage of std::filesystem::path only within the fs::path wrapper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24470 (comment)

Why std::filesystem::path for second parameter? Isn't it safer to limit the usage of std::filesystem::path only within the fs::path wrapper?

Good point, fs::path is safer and this doesn't seem to be necessary, now dropped

src/fs.h Outdated
Comment on lines 95 to 114
static inline path operator/(path p1, path p2)
{
p1 += std::move(p2);
p1 /= std::move(p2);
return p1;
}
static inline path operator/(path p1, std::filesystem::path p2)
{
p1 /= std::move(p2);
return p1;
}
static inline path operator/(path p1, const char* p2)
{
p1 /= p2;
return p1;
}
static inline path operator+(path p1, const char* p2)
{
p1 += p2;
return p1;
}
static inline path operator+(path p1, path::value_type p2)
{
p1 += p2;
return p1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering why there's a difference between the allowed right-hand operands for / and +

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24470 (comment)

Wondering why there's a difference between the allowed right-hand operands for / and +

The cases that are listed are just cases that seemed safe and are in used by existing code. (I think it'd be fine to add cases that are safe and aren't in use by existing code, too, but I didn't go looking for these.)

On the specific cases, combining two path objects with / is safe and useful. Combining two path objects with + is safe probably not useful. Appending a literal string is safe to a path is safe (assuming the literal string is ASCII) both with + and /. Adding a native character to a path is safe and semi-useful (used by some tests at least)

Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated cb9f6b1 -> f64aa9c (pr/patha.2 -> pr/patha.3, compare) adding hebasto's fixes for windows build errors https://cirrus-ci.com/build/6528392144093184

src/fs.h Outdated
return p1;
}
static inline path operator/(path p1, std::filesystem::path p2)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24470 (comment)

Why std::filesystem::path for second parameter? Isn't it safer to limit the usage of std::filesystem::path only within the fs::path wrapper?

Good point, fs::path is safer and this doesn't seem to be necessary, now dropped

src/fs.h Outdated
Comment on lines 95 to 114
static inline path operator/(path p1, path p2)
{
p1 += std::move(p2);
p1 /= std::move(p2);
return p1;
}
static inline path operator/(path p1, std::filesystem::path p2)
{
p1 /= std::move(p2);
return p1;
}
static inline path operator/(path p1, const char* p2)
{
p1 /= p2;
return p1;
}
static inline path operator+(path p1, const char* p2)
{
p1 += p2;
return p1;
}
static inline path operator+(path p1, path::value_type p2)
{
p1 += p2;
return p1;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: #24470 (comment)

Wondering why there's a difference between the allowed right-hand operands for / and +

The cases that are listed are just cases that seemed safe and are in used by existing code. (I think it'd be fine to add cases that are safe and aren't in use by existing code, too, but I didn't go looking for these.)

On the specific cases, combining two path objects with / is safe and useful. Combining two path objects with + is safe probably not useful. Appending a literal string is safe to a path is safe (assuming the literal string is ASCII) both with + and /. Adding a native character to a path is safe and semi-useful (used by some tests at least)

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK f64aa9c

@maflcko maflcko merged commit 12455ac into bitcoin:master May 3, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 4, 2022
@bitcoin bitcoin locked and limited conversation to collaborators May 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants