-
Notifications
You must be signed in to change notification settings - Fork 26.6k
config: introduce safe.bareRepository
and protected config
#1261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f00e91b
to
96038cd
Compare
/preview |
Preview email sent as pull.1261.git.git.1651858667271.gitgitgadget@gmail.com |
96038cd
to
3370258
Compare
/submit |
Submitted as pull.1261.git.git.1651861810633.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
On the Git mailing list, Taylor Blau wrote (reply to this):
|
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
On the Git mailing list, Taylor Blau wrote (reply to this):
|
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
On the Git mailing list, Glen Choo wrote (reply to this): Hi Taylor,
Taylor Blau <me@ttaylorr.com> writes:
> Hi Glen,
>
> On Fri, May 06, 2022 at 06:30:10PM +0000, Glen Choo via GitGitGadget wrote:
>> From: Glen Choo <chooglen@google.com>
>>
>> Add a config variable, `safe.barerepository`, that tells Git whether or
>> not to recognize bare repositories when it is trying to discover the
>> repository. This only affects repository discovery, thus it has no
>> effect if discovery was not done (e.g. `--git-dir` was passed).
>
> To summarize, this proposal attempts to work around the problem of
> embedding bare repositories in non-bare checkouts by providing a way to
> opt-out of bare repository discovery (which is to only discover things
> that are listed in the safe.bareRepository configuration).
>
> I agree that this would prevent the problem you're trying to solve, but
> I have significant concerns that this patch is going too far (at the
> risk of future damage to unrelated workflows) in order to accomplish
> that goal.
Thanks again for the careful read. As I understand it, your concern is
that making bare repository discovery configurable and then flipping the
default to e.g. never detecting bare repositories is too disruptive to
fix the embedded bare repository problem. And to avoid disrupting
non-embedded bare repositories, you would prefer to pursue a more
targeted fix.
If the problem statement were limited to embedded bare repositories,
then I agree that this is way more than overkill, and that a targeted
solution would be preferable.
More generally however, the problem of embedded bare repositories seems
to suggest that bare repository discovery doesn't serve all users well,
and in fact, may even be a net negative for a subset of users. I'd be
interested in hearing your thoughts from that perspective, e.g.
- Should bare repository discovery should be configurable?
- What is a good default for bare repository discovery? (regardless of
how feasible changing the default is)
This is a somewhat different direction from how the conversation started
(I hope it doesn't look like I'm shifting the goal posts), but I think
it's a good opportunity to step back and simplify something that we
wished we got right in the beginning.
And even if we don't flip the default, shipping the config value still
seems useful e.g. there's a good amount of interest in disabling bare
repository discovery at $DAYJOB (and I think we'll get a lot of
interesting results once we do).
>> safe.barerepository is presented to users as an allow-list of
>> directories that Git will recognize as a bare repository during the
>> repository discovery process (much like safe.directory), but this patch
>> only implements (and permits) boolean behavior (i.e. on, off and unset).
>> Hopefully, this gives us some room to discuss and experiment with
>> possible formats.
>>
>> Thanks to Taylor for suggesting the allow-list idea :)
>
> I did suggest an allow-list, but not this one ;-).
Ah, yes. Oops. Sorry if it looked like I was putting words in your
mouth.
What I really meant was that an allow-list (untethered from any specific
purpose) seems like a useful 'UI primitive', so thanks for bringing up
the option.
>> I think the core concept of letting users toggle bare repo discovery is
>> solid, but I'm sending this as RFC for the following reasons:
>>
>> * I don't love the name safe.barerepository, because it feels like Git
>> is saying that bare repos are unsafe and consequently, that bare repo
>> users are behaving unsafely. On the other hand, this is quite similar
>> to safe.directory in a few respects, so it might make sense for the
>> naming to reflect that.
>
> Yes, the concerns I outlined above are definitely echoing this
> sentiment. Another way to say it is that this feels like too big of a
> hammer (i.e., it is targeting _all_ bare repositories, not just embedded
> ones) for too small of a nail (embedded bare repositories). As you're
> probably sick of hearing me say by now, I would strongly prefer a more
> targeted solution (perhaps what I outlined, or perhaps something else,
> so long as it doesn't break non-embedded bare repositories if/ever we
> decided to change the default value of safe.bareRepository).
Ok, yeah I think safe.barerepository is a terrible way to achieve my
purported goal of 'making bare repository discovery
configurable/simpler/' - using the "safe." namespace makes it impossible
to see this as anything other than protection against dangerous, unknown
bare repositories. I'll drop the idea of safe-listing known bare
repositories for now, that seems unproductive.
'Optionally disable bare repository discovery' still sounds like it's on
the table though, but probably with a different kind of UX e.g.
"discovery.barerepository" with the options:
- always: always discover bare repos
- never: never discover bare repos
- cwd-only: only discover bare repos if they are the cwd
- dotgit-only: only discover bare repos if they are a descendant of
.git/
>> * The *-gcc CI jobs don't pass. I haven't discerned any kind of pattern
>> yet.
>
> Interesting. I wouldn't expect this to be the case (since the default is
> to allow everything right now).
This might be a false alarm - I saw similar failures on an unrelated
patch. I think my "master" is just out of date :( |
3370258
to
b1427c0
Compare
b1427c0
to
62070aa
Compare
/submit |
Submitted as pull.1261.v2.git.git.1652485058.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
> * die()-ing is necessary if we're trying to flip the default value of
> discovery.bare. We'd expect many bare repo users to be broken, and it's
> more helpful to fail loudly than to silently ignore the bare repo.
>
> But in the long term, long after we've flipped the default and users know
> that they need to opt into bare repo discovery, would it be a better UX
> to just silently ignore the bare repo?
Would a middle-ground of giving a warning() message help? Can it be
loud and annoying enough to knudge the users to adjust without
breaking the functionality?
The longer-term default should be "cwd is allowed, but we do not
bother going up from object/04 subdirectory of a bare repository",
not "bare repositories should not be usable at all without GIT_DIR".
> + Add a config variable, `discovery.bare`, that tells Git whether or not
> + it should work with the bare repository it has discovered i.e. Git will
> + die() if it discovers a bare repository, but it is not allowed by
Missing comma before "i.e."
> ++discovery.bare::
> ++ Specifies what kinds of directories Git can recognize as a bare
> ++ repository when looking for the repository (aka repository
> + discovery). This has no effect if repository discovery is not
> + performed e.g. the path to the repository is set via `--git-dir`
> + (see linkgit:git[1]).
> ++
> +This config setting is only respected when specified in a system or global
> +config, not when it is specified in a repository config or via the command
> ++line option `-c discovery.bare=<value>`.
;-)
> +++
> ++The currently supported values are `always` (Git always recognizes bare
> ++repositories) and `never` (Git never recognizes bare repositories).
> ++This defaults to `always`, but this default is likely to change.
> +++
> ++If your workflow does not rely on bare repositories, it is recommended that
> ++you set this value to `never`. This makes repository discovery easier to
> ++reason about and prevents certain types of security and non-security
> ++problems, such as:
Hopefully "git fetch" over ssh:// and file:/// would run the other
side with GIT_DIR explicitly set? As long as this recommendation
does not break these use cases, I think we are OK, but I do not yet
find these "problems, such as..." so convincing. |
On the Git mailing list, Junio C Hamano wrote (reply to this): "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
> t/t0034-discovery-bare.sh | 69 +++++++++++++++++++++++
This number is already in use by an in-flight topic, if I am not
mistaken. Please make it a habit to always check your topic works
well when merged to 'next' and to 'seen'.
Thanks.
|
Documentation/config/discovery.txt
Outdated
@@ -0,0 +1,24 @@ | |||
discovery.bare:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Glen Choo wrote (reply to this):
"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Glen Choo <chooglen@google.com>
>
> Add a config variable, `discovery.bare`, that tells Git whether or not
> it should work with the bare repository it has discovered i.e. Git will
> die() if it discovers a bare repository, but it is not allowed by
> `discovery.bare`. This only affects repository discovery, thus it has no
> effect if discovery was not done (e.g. `--git-dir` was passed).
>
> This is motivated by the fact that some workflows don't use bare
> repositories at all, and users may prefer to opt out of bare repository
> discovery altogether:
>
> - An easy assumption for a user to make is that Git commands run
> anywhere inside a repository's working tree will use the same
> repository. However, if the working tree contains a bare repository
> below the root-level (".git" is preferred at the root-level), any
> operations inside that bare repository use the bare repository
> instead.
>
> In the worst case, attackers can use this confusion to trick users
> into running arbitrary code (see [1] for a deeper discussion). But
> even in benign situations (e.g. a user renames ".git/" to ".git.old/"
> and commits it for archival purposes), disabling bare repository
> discovery can be a simpler mode of operation (e.g. because the user
> doesn't actually want to use ".git.old/") [2].
>
> - Git won't "accidentally" recognize a directory that wasn't meant to be
> a bare repository, but happens to resemble one. While such accidents
> are probably very rare in practice, this lets users reduce the chance
> to zero.
>
> This config is an enum of:
>
> - ["always"|(unset)]: always recognize bare repositories (like Git does
> today)
> - "never": never recognize bare repositories
>
> More values are expected to be added later, and the default is expected
> to change (i.e. to something other than "always").
>
> [1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
> [2]: I don't personally know anyone who does this as part of their
> normal workflow, but a cursory search on GitHub suggests that there is a
> not insubstantial number of people who munge ".git" in order to store
> its contents.
>
> https://github.com/search?l=&o=desc&p=1&q=ref+size%3A%3C1000+filename%3AHEAD&s=indexed&type=Code
> (aka search for the text "ref", size:<1000, filename:HEAD)
>
> Signed-off-by: Glen Choo <chooglen@google.com>
The intended commit message ends here...
> WIP setup.c: make discovery.bare die on failure
>
> Signed-off-by: Glen Choo <chooglen@google.com>
Ugh, dumb mistake (bad squash). Fortunately this was one of my more
professional-sounding WIP commit messages.
On the Git mailing list, Glen Choo wrote (reply to this): Junio C Hamano <gitster@pobox.com> writes:
> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> * die()-ing is necessary if we're trying to flip the default value of
>> discovery.bare. We'd expect many bare repo users to be broken, and it's
>> more helpful to fail loudly than to silently ignore the bare repo.
>>
>> But in the long term, long after we've flipped the default and users know
>> that they need to opt into bare repo discovery, would it be a better UX
>> to just silently ignore the bare repo?
>
> Would a middle-ground of giving a warning() message help? Can it be
> loud and annoying enough to knudge the users to adjust without
> breaking the functionality?
Personally, when my tool changes its behavior, I would strongly prefer
it to die than to "change behavior + warn". I'd feel more comfortable
knowing that the tool did nothing as opposed to doing the wrong thing
and only being informed after the fact. Also, I sometimes ignore
warnings ;)
When we _do_ transition away from die(), ignore + warning() sounds like
a good first step.
But if any of this flies in the face of the project's conventions, let
me know as such.
>> + Add a config variable, `discovery.bare`, that tells Git whether or not
>> + it should work with the bare repository it has discovered i.e. Git will
>> + die() if it discovers a bare repository, but it is not allowed by
>
> Missing comma before "i.e."
Thanks.
>> +++
>> ++The currently supported values are `always` (Git always recognizes bare
>> ++repositories) and `never` (Git never recognizes bare repositories).
>> ++This defaults to `always`, but this default is likely to change.
>> +++
>> ++If your workflow does not rely on bare repositories, it is recommended that
>> ++you set this value to `never`. This makes repository discovery easier to
>> ++reason about and prevents certain types of security and non-security
>> ++problems, such as:
>
> Hopefully "git fetch" over ssh:// and file:/// would run the other
> side with GIT_DIR explicitly set?
Ah, I'll check this and get back to you.
> I do not yet
> find these "problems, such as..." so convincing.
What would be a convincing rationale to you? I'll capture that here.
I'm assuming that you already have such an rationale in mind when you
say that the longer-term default is that "we respect bare repositories
only if they are the cwd.". I'm also assuming that this rationale is
something other than embedded bare repos, because "cwd-only" does not
protect against that.
Perhaps "never" sounds better to folks who don't ever expect bare
repositories and want to lock down the environment. Randall (cc-ed)
suggests one such use case in [1].
(To Randall: Oops, I actually meant to cc you earlier, since you were
the first to suggest a practical use case for never allowing bare repos.
It must've slipped my mind).
[1] https://lore.kernel.org/git/005d01d84ad0$782e8fc0$688baf40$@nexbridge.com. |
Documentation/config/discovery.txt
Outdated
@@ -0,0 +1,24 @@ | |||
discovery.bare:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
>
> Add a config variable, `discovery.bare`, that tells Git whether or not
> it should work with the bare repository it has discovered i.e. Git will
> die() if it discovers a bare repository, but it is not allowed by
> `discovery.bare`. This only affects repository discovery, thus it has no
> effect if discovery was not done (e.g. `--git-dir` was passed).
> This config is an enum of:
>
> - ["always"|(unset)]: always recognize bare repositories (like Git does
> today)
> - "never": never recognize bare repositories
>
> More values are expected to be added later, and the default is expected
> to change (i.e. to something other than "always").
I think it is fine to include the "never" option for users to opt-in to
this super-protected state, but I want to make it very clear that we
should never move to it as a new default. This phrasing of 'something
other than "always"' is key, but it might be good to point out that
"never" is very unlikely to be that default.
> WIP setup.c: make discovery.bare die on failure
>
> Signed-off-by: Glen Choo <chooglen@google.com>
Accidental concatenation of squashed commit?
> diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
> new file mode 100644
> index 00000000000..761cabe6e70
> --- /dev/null
> +++ b/Documentation/config/discovery.txt
> @@ -0,0 +1,24 @@
> +discovery.bare::
> + Specifies what kinds of directories Git can recognize as a bare
> + repository when looking for the repository (aka repository
> + discovery). This has no effect if repository discovery is not
> + performed e.g. the path to the repository is set via `--git-dir`
> + (see linkgit:git[1]).
Avoid "e.g." here.
This has no effect if the repository is specified directly via
the --git-dir command-line option or the GIT_DIR environment
variable.
> +This config setting is only respected when specified in a system or global
> +config, not when it is specified in a repository config or via the command
> +line option `-c discovery.bare=<value>`.
We are sprinkling config options that have these same restrictions throughout
the config documentation. It might be time to define a term like "protected
config" at the top of git-config.txt and then refer to that from these other
locations.
> +The currently supported values are `always` (Git always recognizes bare
> +repositories) and `never` (Git never recognizes bare repositories).
This sentence structure is likely to change in the future, and as it stands
will become complicated. A bulleted list will have easier edits in the future.
> +This defaults to `always`, but this default is likely to change.
For now, I would say "but this default may change in the future." instead.
> +If your workflow does not rely on bare repositories, it is recommended that
> +you set this value to `never`. This makes repository discovery easier to
> +reason about and prevents certain types of security and non-security
> +problems, such as:
> +
(You might need a "+" here.)
> +* `git clone`-ing a repository containing a malicious bare repository
> + inside it.
> +* Git recognizing a directory that isn't meant to be a bare repository,
> + but happens to look like one.
I think these last bits recommending the 'never' option are a bit
distracting. It doesn't make repository discovery "easier to reason
about" because we still discover the bare repo and die() instead of
skipping it and looking higher for a non-bare repository in the
parent directories. The case of an "accidentally-recognized bare
repo" is so unlikely it is probably not worth mention in these docs.
Instead, I think something like this might be better:
If you do not use bare repositories in your workflow, then it may
be beneficial to set `discovery.bare` to `never` in your global
config. This will protect you from attacks that involve cloning a
repository that contains a bare repository and running a Git
command within that directory.
> +static int check_bare_repo_allowed(void)
> +{
> + if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
> + read_very_early_config(discovery_bare_cb, NULL);
This will add the third place where we use read_very_early_config(),
adding to the existing calls in tr2_sysenv_load() and
ensure_valid_ownership(). If I understand it correctly, that means
that every Git execution in a bare repository will now parse the
system and global config three times.
This doesn't count the check for uploadpack.packobjectshook in
upload-pack.c that uses current_config_scope() to restrict its
value to the system and global config.
We are probably at the point where we need to instead create a
configset that stores this "protected config" and allow us to
lookup config keys directly from that configset instead of
iterating through these config files repeatedly.
> + /* We didn't find a value; use the default. */
> + if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN)
> + discovery_bare_config = DISCOVERY_BARE_ALWAYS;
This could also be done in advance of the config parsing
by setting discovery_bare_config = DISCOVERY_BARE_ALWAYS before
calling read_very_early_config(). Avoids an if and a comment
here, which might be nice.
> + }
> + switch (discovery_bare_config) {
> + case DISCOVERY_BARE_NEVER:
> + return 0;
> + case DISCOVERY_BARE_ALWAYS:
> + return 1;
> + default:
> + BUG("invalid discovery_bare_config %d", discovery_bare_config);
> + }
You return -1 in discovery_bare_cb when the key matches, but
the value is not understood. Should we check the return value
of read_very_early_config(), too?
> +static const char *discovery_bare_config_to_string(void)
> +{
> + switch (discovery_bare_config) {
> + case DISCOVERY_BARE_NEVER:
> + return "never";
> + case DISCOVERY_BARE_ALWAYS:
> + return "always";
> + default:
> + BUG("invalid discovery_bare_config %d", discovery_bare_config);
In general, I'm not sure these BUG() statements are helpful,
but they aren't hurting anything. I wonder if it would be
better to use DISCOVERY_BARE_UNKNOWN instead of default,
because then the compiler should notice that the switch needs
updating when a new enum mode is added.
> @@ -1142,7 +1195,8 @@ enum discovery_result {
> GIT_DIR_HIT_CEILING = -1,
> GIT_DIR_HIT_MOUNT_POINT = -2,
> GIT_DIR_INVALID_GITFILE = -3,
> - GIT_DIR_INVALID_OWNERSHIP = -4
> + GIT_DIR_INVALID_OWNERSHIP = -4,
> + GIT_DIR_DISALLOWED_BARE = -5
I think that you can add a comma at the end of this enum to avoid the
changed line the next time the enum needs to be expanded.
> };
>
> /*
> @@ -1239,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
> }
>
> if (is_git_directory(dir->buf)) {
> + if (!check_bare_repo_allowed())
> + return GIT_DIR_DISALLOWED_BARE;
Won't this fail if someone runs a Git command inside of a .git/
directory for a non-bare repository? I just want to be sure that
we hit this error instead:
fatal: this operation must be run in a work tree
I see that this error is tested in t0008-ignores.sh, but that's
with the default "always" value. It would be good to explicitly
check that this is the right error when using the "never" config.
Thanks,
-Stolee
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Taylor Blau wrote (reply to this):
On Mon, May 16, 2022 at 02:46:55PM -0400, Derrick Stolee wrote:
> On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> > From: Glen Choo <chooglen@google.com>
> >
> > Add a config variable, `discovery.bare`, that tells Git whether or not
> > it should work with the bare repository it has discovered i.e. Git will
> > die() if it discovers a bare repository, but it is not allowed by
> > `discovery.bare`. This only affects repository discovery, thus it has no
> > effect if discovery was not done (e.g. `--git-dir` was passed).
>
> > This config is an enum of:
> >
> > - ["always"|(unset)]: always recognize bare repositories (like Git does
> > today)
> > - "never": never recognize bare repositories
> >
> > More values are expected to be added later, and the default is expected
> > to change (i.e. to something other than "always").
>
> I think it is fine to include the "never" option for users to opt-in to
> this super-protected state, but I want to make it very clear that we
> should never move to it as a new default. This phrasing of 'something
> other than "always"' is key, but it might be good to point out that
> "never" is very unlikely to be that default.
I am confused, then.
What does a user who has some legitimate (non-embedded) bare
repositories do if they are skeptical of other bare repositories? I
suspect the best answer we would be able to provide with these patches
is "use `--git-dir`".
What happens to a user who has a combination of legitimate bare
repositories, embedded bare repositories that they trust, and other
embedded bare repositories that they don't?
As far as I can tell, our recommendation with these tools would be to:
- run `git config --global discovery.bare never`, and
- include `--git-dir=$(pwd)` in any git invocations in bare
repositories that they do trust
This gets at my concerns from [1] and [2] (mostly [2], in this case)
that we're trying to close the embedded bare repos problem with an
overly broad solution, at the expense of usability.
I can't shake the feeling that something like I described towards the
bottom of [2] would give you all of the security guarantees you're after
without compromising on usability for non-embedded bare repositories.
I'm happy to explore this direction more myself if you don't want to. I
would just much rather see us adopt an approach that doesn't break more
use-cases than it has to if such a thing can be avoided.
I cannot endorse these patches as-is.
Thanks,
Taylor
[1]: https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/
[2]: https://lore.kernel.org/git/YnmKwLoQCorBnMe2@nand.local/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Glen Choo wrote (reply to this):
Thanks for being thorough, I find it really helpful.
For brevity, I won't reply to comments that I think are obviously good,
so you can assume I'll incorproate anything that isn't commented on.
Derrick Stolee <derrickstolee@github.com> writes:
> On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
>> From: Glen Choo <chooglen@google.com>
>>
>> +This config setting is only respected when specified in a system or global
>> +config, not when it is specified in a repository config or via the command
>> +line option `-c discovery.bare=<value>`.
>
> We are sprinkling config options that have these same restrictions throughout
> the config documentation. It might be time to define a term like "protected
> config" at the top of git-config.txt and then refer to that from these other
> locations.
Agree, and I think defining the term will be useful in future on-list
discussions.
>> +static int check_bare_repo_allowed(void)
>> +{
>> + if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
>> + read_very_early_config(discovery_bare_cb, NULL);
>
> This will add the third place where we use read_very_early_config(),
> adding to the existing calls in tr2_sysenv_load() and
> ensure_valid_ownership(). If I understand it correctly, that means
> that every Git execution in a bare repository will now parse the
> system and global config three times.
>
> This doesn't count the check for uploadpack.packobjectshook in
> upload-pack.c that uses current_config_scope() to restrict its
> value to the system and global config.
>
> We are probably at the point where we need to instead create a
> configset that stores this "protected config" and allow us to
> lookup config keys directly from that configset instead of
> iterating through these config files repeatedly.
Looking at all of the read_very_early_config() calls,
- check_bare_repo_allowed() can use git_configset_get_string()
- ensure_valid_ownership() can use git_configset_get_value_multi()
- tr2_sysenv_load() reads every value with the "trace2." prefix. AFAICT
configsets only support exact key lookups and I don't see an easy way
teach configsets to support prefix lookups.
(I didn't look too closely at uploadpack.packobjectshook because I don't
know enough about config scopes to comment.)
So using a configset, we'll still need to read the config files at least
twice. That's better than thrice, but it doesn't cover the
tr2_sysenv_load() use case, and we'll run into this yet again if add
function that reads all config values with a given prefix.
An hacky alternative that covers all of these use cases would be to read
all protected config in a single pass, e.g.
static struct protected_config {
struct safe_directory_data safe_directory_data;
const char *discovery_bare;
struct string_list tr2_sysenv;
};
static int protected_config_cb()
{
/* Parse EVERYTHING that belongs in protected_config. */
}
but protected_config_cb() would have to parse too many unrelated things
for my liking.
So I'll use the configset for the cases where the key is known, and
perhaps we'll punt on tr2_sysenv_load().
>> + }
>> + switch (discovery_bare_config) {
>> + case DISCOVERY_BARE_NEVER:
>> + return 0;
>> + case DISCOVERY_BARE_ALWAYS:
>> + return 1;
>> + default:
>> + BUG("invalid discovery_bare_config %d", discovery_bare_config);
>> + }
>
> You return -1 in discovery_bare_cb when the key matches, but
> the value is not understood. Should we check the return value
> of read_very_early_config(), too?
This comment doesn't apply because unlike most other config reading
functions, read_very_early_config() and read_early_config() die when the
callback returns -1.
I'm not sure why this is the case though, and maybe you think there is
value in having a non-die()-ing variant, e.g.
read_very_early_config_gently()?
>> };
>>
>> /*
>> @@ -1239,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>> }
>>
>> if (is_git_directory(dir->buf)) {
>> + if (!check_bare_repo_allowed())
>> + return GIT_DIR_DISALLOWED_BARE;
>
> Won't this fail if someone runs a Git command inside of a .git/
> directory for a non-bare repository? I just want to be sure that
> we hit this error instead:
>
> fatal: this operation must be run in a work tree
>
> I see that this error is tested in t0008-ignores.sh, but that's
> with the default "always" value. It would be good to explicitly
> check that this is the right error when using the "never" config.
Yes, it will fail if run inside of a .git/ directory. "never" prevents
you from working from inside .git/ unless you set GIT_DIR.
IIRC, we don't show "fatal: this operation must be run in a work
tree" for every Git command, e.g. "git log" works just fine. It makes
sense to show this warning when the CWD supports 'some, but not all' Git
commands, but I don't think this is valuable if we forbid *all* Git
commands.
Instead of trying to make "never" accomodate this use case, perhaps what
we want is a "dotgit-only" option that allows a bare repository if it is
below a .git/ directory. Since we forbid .git in the index, this seems
somewhat safe, but I hadn't proposed this sooner because I don't know if
we need it yet, and I'm certain that there are less secure edge cases
that need to be thought through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Glen Choo wrote (reply to this):
Glen Choo <chooglen@google.com> writes:
>>> +static int check_bare_repo_allowed(void)
>>> +{
>>> + if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
>>> + read_very_early_config(discovery_bare_cb, NULL);
>>
>> This will add the third place where we use read_very_early_config(),
>> adding to the existing calls in tr2_sysenv_load() and
>> ensure_valid_ownership(). If I understand it correctly, that means
>> that every Git execution in a bare repository will now parse the
>> system and global config three times.
>>
>> This doesn't count the check for uploadpack.packobjectshook in
>> upload-pack.c that uses current_config_scope() to restrict its
>> value to the system and global config.
>>
>> We are probably at the point where we need to instead create a
>> configset that stores this "protected config" and allow us to
>> lookup config keys directly from that configset instead of
>> iterating through these config files repeatedly.
>
> Looking at all of the read_very_early_config() calls,
>
> - check_bare_repo_allowed() can use git_configset_get_string()
> - ensure_valid_ownership() can use git_configset_get_value_multi()
> - tr2_sysenv_load() reads every value with the "trace2." prefix. AFAICT
> configsets only support exact key lookups and I don't see an easy way
> teach configsets to support prefix lookups.
>
> (I didn't look too closely at uploadpack.packobjectshook because I don't
> know enough about config scopes to comment.)
>
> So using a configset, we'll still need to read the config files at least
> twice. That's better than thrice, but it doesn't cover the
> tr2_sysenv_load() use case, and we'll run into this yet again if add
> function that reads all config values with a given prefix.
>
> An hacky alternative that covers all of these use cases would be to read
> all protected config in a single pass, e.g.
>
> static struct protected_config {
> struct safe_directory_data safe_directory_data;
> const char *discovery_bare;
> struct string_list tr2_sysenv;
> };
>
> static int protected_config_cb()
> {
> /* Parse EVERYTHING that belongs in protected_config. */
> }
>
> but protected_config_cb() would have to parse too many unrelated things
> for my liking.
>
> So I'll use the configset for the cases where the key is known, and
> perhaps we'll punt on tr2_sysenv_load().
Since I'm trying to replace read_very_early_config() anyway, is this a
good time to teach git to respect "-c safe.directory"?
My understanding of [1] is that we only ignore "-c safe.directory"
because read_very_early_config() doesn't support it, but we would prefer
to support it if we could.
[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/
Documentation/config/discovery.txt
Outdated
@@ -0,0 +1,26 @@ | |||
discovery.bare:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
>
> Add a 'cwd' option to discovery.bareRepository, which allows a bare
> repository to be used if and only if the cwd is the root of a bare
> repository. This covers the common case where a user works with a bare
> repository by cd-ing into the repository's root.
I don't consider this case valuable. In addition to allowing
the most-common use case, it also allows the most-common route
that an attacker would use to try to get a user to run a Git
command in a malicious embedded bare repo. I think we are
better off without it.
Thanks,
-Stolee
On the Git mailing list, Derrick Stolee wrote (reply to this): On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> Thanks all for the comments on v1, I've expanded this series somewhat to
> address them,...
Please include a full cover letter with each version, so reviewers
can respond to the full series goals.
Your series here intends to start protecting against malicious
embedded bare repositories by allowing users to opt-in to a more
protected state. When the 'discovery.bare' option is set, then
Git may die() on a bare repository that is discovered based on
the current working directory (these protections are ignored if
the user specifies the directory directly through --git-dir or
$GIT_DIR).
The 'discovery.bare' option has these values at the end of your
series:
* 'always' (default) allows all bare repos, matching the current
behavior of Git.
* 'never' avoids operating in bare repositories altogether.
* 'cwd' operates in a bare repository only if the current directory
is exactly the root of the bare repository.
It is important that we keep 'always' as the default at first,
because we do not want to introduce a breaking change without
warning (at least for an issue like this that has been around
for a long time).
The 'never' option is a good one for very security-conscious
users who really want to avoid problems. I don't anticipate that
users who know about this option and set it themselves are the
type that would fall for the social engineering required to
attack using this vector, but I can imagine an IT department
installing the value in system config across a fleet of machines.
I find the 'cwd' option to not be valuable. It unblocks most
existing users, but also almost completely removes the protection
that the option was supposed to provide.
I find neither the 'never' or 'cwd' options an acceptable choice
for a future default.
I also think that this protection is too rigid: it restricts
_all_ bare repositories, not just embedded ones. There is no check
to see if the parent directory of the bare repository is inside a
non-bare repository.
This leads to what I think would be a valuable replacement for
the 'cwd' option:
* 'no-embedded' allows non-embedded bare repositories. An
_embedded bare repository_ is a bare repository whose parent
directory is contained in the worktree of a non-bare Git
repository. When in this mode, embedded bare repositories are
not allowed unless the parent non-bare Git repository has a
'safe.embedded' config value storing the path to the current
embedded bare repository.
That was certainly difficult to write, but here it is as
pseudo-code to hopefully remove some doubt as to how this might
work:
if repo is bare:
if value == "always":
return ALLOWED
if value == "never":
return FORBIDDEN;
path = get_parent_repo()
if !path:
return ALLOWED
if config_file_has_value("{path}/.git/config", "safe.embedded", repo):
return ALLOWED
return FORBIDDEN
With this kind of option, we can protect users from these
social engineering attacks while providing an opt-in protection
for scenarios where embedded bare repos are currently being used
(while also not breaking anyone using non-embedded bare repos).
I think Taylor was mentioning something like this in his previous
replies, perhaps even to the previous thread on this topic.
This 'no-embedded' option is something that I could see as a
potential new default, after it has proven itself in a released
version of Git.
There are performance drawbacks to checking the parent path for
a Git repo, which is why it is only done when in "no-embedded"
mode.
I mentioned some other concerns in your PATCH 1 about how we
are now adding the third use of read_very_early_config() and that
we should probably refactor that before adding the third option,
in order to avoid additional performance costs as well as it
being difficult to audit which config options are only checked
from these "protected" config files.
Thanks,
-Stolee |
On the Git mailing list, Junio C Hamano wrote (reply to this): Glen Choo <chooglen@google.com> writes:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> * die()-ing is necessary if we're trying to flip the default value of
>>> discovery.bare. We'd expect many bare repo users to be broken, and it's
>>> more helpful to fail loudly than to silently ignore the bare repo.
>>>
>>> But in the long term, long after we've flipped the default and users know
>>> that they need to opt into bare repo discovery, would it be a better UX
>>> to just silently ignore the bare repo?
>>
>> Would a middle-ground of giving a warning() message help? Can it be
>> loud and annoying enough to knudge the users to adjust without
>> breaking the functionality?
>
> Personally, when my tool changes its behavior, I would strongly prefer
> it to die than to "change behavior + warn". I'd feel more comfortable
> knowing that the tool did nothing as opposed to doing the wrong thing
> and only being informed after the fact. Also, I sometimes ignore
> warnings ;)
Heh, personally I would try very hard not to change the behaviour
without explicitly asked by the users with configuration or command
line option. Flipping the default has traditionally been done in
two or three phases.
(1) We start by giving a loud and annoying warning to those who
haven't configured and tell them the default *will* change, how
to keep the current behaviour forever, and how to live in the
future by adopting the future default early.
(2) After a while, we flip the default. Those who haven't
configured are given a notice that the default has changed, how
to keep the old behaviour forever, and how to explicitly choose
the same value as the default to squelch the notice.
(3) After yet another while, we stop giving the notice. If we
omitted (2), here is where we flip the default.
Strictly speaking, we can have (1) in one release and then could
directly jump to (3), but some distros may skip the releases that
has (1), and (2) is an attempt to help users of such distros.
>> Hopefully "git fetch" over ssh:// and file:/// would run the other
>> side with GIT_DIR explicitly set?
>
> Ah, I'll check this and get back to you.
>
>> I do not yet
>> find these "problems, such as..." so convincing.
>
> What would be a convincing rationale to you? I'll capture that here.
That is a wrong question. You are the one pushing for castrating
the bare repositories.
> I'm assuming that you already have such an rationale in mind when you
> say that the longer-term default is that "we respect bare repositories
> only if they are the cwd.". I'm also assuming that this rationale is
> something other than embedded bare repos, because "cwd-only" does not
> protect against that.
No, I do not have such a "different" rationale to justify the change
proposed in this patch. I was saying that the claim "embedded bare
repos are risky", backed by your two examples, did not sound all
that serious a problem. Presented with a more serious brekage
scenario, it may make the description more convincing.
Thanks. |
On the Git mailing list, Glen Choo wrote (reply to this): Junio C Hamano <gitster@pobox.com> writes:
> Glen Choo <chooglen@google.com> writes:
>
>> Junio C Hamano <gitster@pobox.com> writes:
>>
>>> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>>
>>>> * die()-ing is necessary if we're trying to flip the default value of
>>>> discovery.bare. We'd expect many bare repo users to be broken, and it's
>>>> more helpful to fail loudly than to silently ignore the bare repo.
>>>>
>>>> But in the long term, long after we've flipped the default and users know
>>>> that they need to opt into bare repo discovery, would it be a better UX
>>>> to just silently ignore the bare repo?
>>>
>>> Would a middle-ground of giving a warning() message help? Can it be
>>> loud and annoying enough to knudge the users to adjust without
>>> breaking the functionality?
>>
>> Personally, when my tool changes its behavior, I would strongly prefer
>> it to die than to "change behavior + warn". I'd feel more comfortable
>> knowing that the tool did nothing as opposed to doing the wrong thing
>> and only being informed after the fact. Also, I sometimes ignore
>> warnings ;)
>
> Heh, personally I would try very hard not to change the behaviour
> without explicitly asked by the users with configuration or command
> line option. Flipping the default has traditionally been done in
> two or three phases.
>
> (1) We start by giving a loud and annoying warning to those who
> haven't configured and tell them the default *will* change, how
> to keep the current behaviour forever, and how to live in the
> future by adopting the future default early.
>
> (2) After a while, we flip the default. Those who haven't
> configured are given a notice that the default has changed, how
> to keep the old behaviour forever, and how to explicitly choose
> the same value as the default to squelch the notice.
>
> (3) After yet another while, we stop giving the notice. If we
> omitted (2), here is where we flip the default.
>
> Strictly speaking, we can have (1) in one release and then could
> directly jump to (3), but some distros may skip the releases that
> has (1), and (2) is an attempt to help users of such distros.
Ah, that is very helpful. Thanks. It's pretty clear that I misunderstood
what you meant by "giving a warning() message" - the warning() is there
to prepare users in advance of the change; we don't actually want the
warning() in the long term.
For something as disruptive as discovering bare repos, having all of
(1), (2) and (3) sounds appropriate.
>>> Hopefully "git fetch" over ssh:// and file:/// would run the other
>>> side with GIT_DIR explicitly set?
>>
>> Ah, I'll check this and get back to you.
>>
>>> I do not yet
>>> find these "problems, such as..." so convincing.
>>
>> What would be a convincing rationale to you? I'll capture that here.
>
> That is a wrong question. You are the one pushing for castrating
> the bare repositories.
Let me clarify in case this wasn't received the way I intended. Earlier
in the thread, you mentioned:
The longer-term default should be "cwd is allowed, but we do not
bother going up from object/04 subdirectory of a bare repository",
[...]
which I took to mean "Junio thinks that, by default, Git should stop
walking up to find a bare repo, and thinks this is better because of
rationale X.", and not, "Junio does not think that the default needs to
change, but is just suggesting a better default than Glen's".
If it is the former, then there is obviously some thought process here
that is worth sharing.
If it the latter, then I'm in favor of taking Stolee's suggestion to
drop "cwd", since nobody else finds it useful enough. (I like the
'simplification' story, but not enough to push "cwd" through, especially
since it does quite little security-wise.)
>> I'm assuming that you already have such an rationale in mind when you
>> say that the longer-term default is that "we respect bare repositories
>> only if they are the cwd.". I'm also assuming that this rationale is
>> something other than embedded bare repos, because "cwd-only" does not
>> protect against that.
>
> No, I do not have such a "different" rationale to justify the change
> proposed in this patch. I was saying that the claim "embedded bare
> repos are risky", backed by your two examples, did not sound all
> that serious a problem. Presented with a more serious brekage
> scenario, it may make the description more convincing.
Fair. I'll mull over this. |
On the Git mailing list, Junio C Hamano wrote (reply to this): Glen Choo <chooglen@google.com> writes:
> which I took to mean "Junio thinks that, by default, Git should stop
> walking up to find a bare repo, and thinks this is better because of
> rationale X."
The X is "it would not break existing use case too badly, just to
address a 'security' story whose severity is not so clearly
expressed".
> If it the latter, then I'm in favor of taking Stolee's suggestion to
> drop "cwd", since nobody else finds it useful enough. (I like the
> 'simplification' story, but not enough to push "cwd" through, especially
> since it does quite little security-wise.)
As long as you'll be there to answer the angry mob that complain
loudly (and irritatingly enough, the only do so after a release is
made to flip the default), I do not care too much either way ;-).
Thanks.
|
discovery.bare
and protected configsafe.bareRepository
and protected config
`uploadpack.packObjectsHook` is the only 'protected configuration only' variable today, but we've noted that `safe.directory` and the upcoming `safe.bareRepository` should also be 'protected configuration only'. So, for consistency, we'd like to have a single implementation for protected configuration. The primary constraints are: 1. Reading from protected configuration should be fast. Nearly all "git" commands inside a bare repository will read both `safe.directory` and `safe.bareRepository`, so we cannot afford to be slow. 2. Protected configuration must be readable when the gitdir is not known. `safe.directory` and `safe.bareRepository` both affect repository discovery and the gitdir is not known at that point [1]. The chosen implementation in this commit is to read protected configuration and cache the values in a global configset. This is similar to the caching behavior we get with the_repository->config. Introduce git_protected_config(), which reads protected configuration and caches them in the global configset protected_config. Then, refactor `uploadpack.packObjectsHook` to use git_protected_config(). The protected configuration functions are named similarly to their non-protected counterparts, e.g. git_protected_config_check_init() vs git_config_check_init(). In light of constraint 1, this implementation can still be improved. git_protected_config() iterates through every variable in protected_config, which is wasteful, but it makes the conversion simple because it matches existing patterns. We will likely implement constant time lookup functions for protected configuration in a future series (such functions already exist for non-protected configuration, i.e. repo_config_get_*()). An alternative that avoids introducing another configset is to continue to read all config using git_config(), but only accept values that have the correct config scope [2]. This technically fulfills constraint 2, because git_config() simply ignores the local and worktree config when the gitdir is not known. However, this would read incomplete config into the_repository->config, which would need to be reset when the gitdir is known and git_config() needs to read the local and worktree config. Resetting the_repository->config might be reasonable while we only have these 'protected configuration only' variables, but it's not clear whether this extends well to future variables. [1] In this case, we do have a candidate gitdir though, so with a little refactoring, it might be possible to provide a gitdir. [2] This is how `uploadpack.packObjectsHook` was implemented prior to this commit. Signed-off-by: Glen Choo <chooglen@google.com>
Use git_protected_config() to read `safe.directory` instead of read_very_early_config(), making it 'protected configuration only'. As a result, `safe.directory` now respects "-c", so update the tests and docs accordingly. It used to ignore "-c" due to how it was implemented, not because of security or correctness concerns [1]. [1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/ Signed-off-by: Glen Choo <chooglen@google.com>
9bfe369
to
e6f654e
Compare
There is a known social engineering attack that takes advantage of the fact that a working tree can include an entire bare repository, including a config file. A user could run a Git command inside the bare repository thinking that the config file of the 'outer' repository would be used, but in reality, the bare repository's config file (which is attacker-controlled) is used, which may result in arbitrary code execution. See [1] for a fuller description and deeper discussion. A simple mitigation is to forbid bare repositories unless specified via `--git-dir` or `GIT_DIR`. In environments that don't use bare repositories, this would be minimally disruptive. Create a config variable, `safe.bareRepository`, that tells Git whether or not to die() when working with a bare repository. This config is an enum of: - "all": allow all bare repositories (this is the default) - "explicit": only allow bare repositories specified via --git-dir or GIT_DIR. If we want to protect users from such attacks by default, neither value will suffice - "all" provides no protection, but "explicit" is impractical for bare repository users. A more usable default would be to allow only non-embedded bare repositories ([2] contains one such proposal), but detecting if a repository is embedded is potentially non-trivial, so this work is not implemented in this series. [1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com [2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com Signed-off-by: Glen Choo <chooglen@google.com>
e6f654e
to
50069bb
Compare
So clearly me having no computer is an issue |
/submit |
Submitted as pull.1261.v8.git.git.1657834081.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
This patch series was integrated into seen via ff7938a. |
This patch series was integrated into seen via 7802e43. |
This patch series was integrated into next via 5206577. |
This patch series was integrated into seen via b247963. |
This patch series was integrated into seen via c5f32de. |
This patch series was integrated into seen via 5a25a74. |
There was a status update in the "Cooking" section about the branch Introduce a discovery.barerepository configuration variable that allows users to forbid discovery of bare repositories. Will merge to 'master'. source: <pull.1261.v8.git.git.1657834081.gitgitgadget@gmail.com> |
This patch series was integrated into seen via 6efbdba. |
This patch series was integrated into seen via ae5c21c. |
This patch series was integrated into seen via df367c2. |
This patch series was integrated into seen via 18bbc79. |
This patch series was integrated into master via 18bbc79. |
This patch series was integrated into next via 18bbc79. |
Closed via 18bbc79. |
@@ -81,6 +81,17 @@ static enum config_scope current_parsing_scope; | |||
static int pack_compression_seen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Thu, Jul 14 2022, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
>
> `uploadpack.packObjectsHook` is the only 'protected configuration only'
> variable today, but we've noted that `safe.directory` and the upcoming
> `safe.bareRepository` should also be 'protected configuration only'. So,
> for consistency, we'd like to have a single implementation for protected
> configuration.
>
> The primary constraints are:
>
> 1. Reading from protected configuration should be fast. Nearly all "git"
> commands inside a bare repository will read both `safe.directory` and
> `safe.bareRepository`, so we cannot afford to be slow.
>
> 2. Protected configuration must be readable when the gitdir is not
> known. `safe.directory` and `safe.bareRepository` both affect
> repository discovery and the gitdir is not known at that point [1].
>
> The chosen implementation in this commit is to read protected
> configuration and cache the values in a global configset. This is
> similar to the caching behavior we get with the_repository->config.
>
> Introduce git_protected_config(), which reads protected configuration
> and caches them in the global configset protected_config. Then, refactor
> `uploadpack.packObjectsHook` to use git_protected_config().
>
> The protected configuration functions are named similarly to their
> non-protected counterparts, e.g. git_protected_config_check_init() vs
> git_config_check_init().
>
> In light of constraint 1, this implementation can still be improved.
> git_protected_config() iterates through every variable in
> protected_config, which is wasteful, but it makes the conversion simple
> because it matches existing patterns. We will likely implement constant
> time lookup functions for protected configuration in a future series
> (such functions already exist for non-protected configuration, i.e.
> repo_config_get_*()).
>
> An alternative that avoids introducing another configset is to continue
> to read all config using git_config(), but only accept values that have
> the correct config scope [2]. This technically fulfills constraint 2,
> because git_config() simply ignores the local and worktree config when
> the gitdir is not known. However, this would read incomplete config into
> the_repository->config, which would need to be reset when the gitdir is
> known and git_config() needs to read the local and worktree config.
> Resetting the_repository->config might be reasonable while we only have
> these 'protected configuration only' variables, but it's not clear
> whether this extends well to future variables.
>
> [1] In this case, we do have a candidate gitdir though, so with a little
> refactoring, it might be possible to provide a gitdir.
> [2] This is how `uploadpack.packObjectsHook` was implemented prior to
> this commit.
>
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
> config.c | 43 ++++++++++++++++++++++++++++++++++++
> config.h | 16 ++++++++++++++
> t/t5544-pack-objects-hook.sh | 7 +++++-
> upload-pack.c | 27 +++++++++++++---------
> 4 files changed, 82 insertions(+), 11 deletions(-)
>
> diff --git a/config.c b/config.c
> index 9b0e9c93285..015bec360f5 100644
> --- a/config.c
> +++ b/config.c
> @@ -81,6 +81,17 @@ static enum config_scope current_parsing_scope;
> static int pack_compression_seen;
> static int zlib_compression_seen;
>
> +/*
> + * Config that comes from trusted scopes, namely:
> + * - CONFIG_SCOPE_SYSTEM (e.g. /etc/gitconfig)
> + * - CONFIG_SCOPE_GLOBAL (e.g. $HOME/.gitconfig, $XDG_CONFIG_HOME/git)
> + * - CONFIG_SCOPE_COMMAND (e.g. "-c" option, environment variables)
> + *
> + * This is declared here for code cleanliness, but unlike the other
> + * static variables, this does not hold config parser state.
> + */
> +static struct config_set protected_config;
> +
> static int config_file_fgetc(struct config_source *conf)
> {
> return getc_unlocked(conf->u.file);
> @@ -2378,6 +2389,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
> return git_config_from_file(config_set_callback, filename, cs);
> }
>
> +int git_configset_add_parameters(struct config_set *cs)
> +{
> + return git_config_from_parameters(config_set_callback, cs);
> +}
> +
> int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
> {
> const struct string_list *values = NULL;
> @@ -2619,6 +2635,33 @@ int repo_config_get_pathname(struct repository *repo,
> return ret;
> }
>
> +/* Read values into protected_config. */
> +static void read_protected_config(void)
> +{
> + char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
> +
> + git_configset_init(&protected_config);
> +
> + system_config = git_system_config();
> + git_global_config(&user_config, &xdg_config);
> +
> + git_configset_add_file(&protected_config, system_config);
> + git_configset_add_file(&protected_config, xdg_config);
> + git_configset_add_file(&protected_config, user_config);
> + git_configset_add_parameters(&protected_config);
> +
> + free(system_config);
> + free(xdg_config);
> + free(user_config);
> +}
> +
> +void git_protected_config(config_fn_t fn, void *data)
> +{
> + if (!protected_config.hash_initialized)
> + read_protected_config();
> + configset_iter(&protected_config, fn, data);
> +}
> +
> /* Functions used historically to read configuration from 'the_repository' */
> void git_config(config_fn_t fn, void *data)
> {
> diff --git a/config.h b/config.h
> index 7654f61c634..ca994d77147 100644
> --- a/config.h
> +++ b/config.h
> @@ -446,6 +446,15 @@ void git_configset_init(struct config_set *cs);
> */
> int git_configset_add_file(struct config_set *cs, const char *filename);
>
> +/**
> + * Parses command line options and environment variables, and adds the
> + * variable-value pairs to the `config_set`. Returns 0 on success, or -1
> + * if there is an error in parsing. The caller decides whether to free
> + * the incomplete configset or continue using it when the function
> + * returns -1.
> + */
> +int git_configset_add_parameters(struct config_set *cs);
> +
> /**
> * Finds and returns the value list, sorted in order of increasing priority
> * for the configuration variable `key` and config set `cs`. When the
> @@ -505,6 +514,13 @@ int repo_config_get_maybe_bool(struct repository *repo,
> int repo_config_get_pathname(struct repository *repo,
> const char *key, const char **dest);
>
> +/*
> + * Functions for reading protected config. By definition, protected
> + * config ignores repository config, so these do not take a `struct
> + * repository` parameter.
> + */
> +void git_protected_config(config_fn_t fn, void *data);
> +
> /**
> * Querying For Specific Variables
> * -------------------------------
> diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
> index dd5f44d986f..54f54f8d2eb 100755
> --- a/t/t5544-pack-objects-hook.sh
> +++ b/t/t5544-pack-objects-hook.sh
> @@ -56,7 +56,12 @@ test_expect_success 'hook does not run from repo config' '
> ! grep "hook running" stderr &&
> test_path_is_missing .git/hook.args &&
> test_path_is_missing .git/hook.stdin &&
> - test_path_is_missing .git/hook.stdout
> + test_path_is_missing .git/hook.stdout &&
> +
> + # check that global config is used instead
> + test_config_global uploadpack.packObjectsHook ./hook &&
> + git clone --no-local . dst2.git 2>stderr &&
> + grep "hook running" stderr
> '
>
> test_expect_success 'hook works with partial clone' '
> diff --git a/upload-pack.c b/upload-pack.c
> index 3a851b36066..09f48317b02 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
> data->advertise_sid = git_config_bool(var, value);
> }
>
> - if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
> - current_config_scope() != CONFIG_SCOPE_WORKTREE) {
> - if (!strcmp("uploadpack.packobjectshook", var))
> - return git_config_string(&data->pack_objects_hook, var, value);
> - }
> -
> if (parse_object_filter_config(var, value, data) < 0)
> return -1;
>
> return parse_hide_refs_config(var, value, "uploadpack");
> }
>
> +static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
> +{
> + struct upload_pack_data *data = cb_data;
> +
> + if (!strcmp("uploadpack.packobjectshook", var))
> + return git_config_string(&data->pack_objects_hook, var, value);
> + return 0;
> +}
> +
> +static void get_upload_pack_config(struct upload_pack_data *data)
> +{
> + git_config(upload_pack_config, data);
> + git_protected_config(upload_pack_protected_config, data);
> +}
> +
> void upload_pack(const int advertise_refs, const int stateless_rpc,
> const int timeout)
> {
> @@ -1340,8 +1349,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
> struct upload_pack_data data;
>
> upload_pack_data_init(&data);
> -
> - git_config(upload_pack_config, &data);
> + get_upload_pack_config(&data);
>
> data.stateless_rpc = stateless_rpc;
> data.timeout = timeout;
> @@ -1695,8 +1703,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
>
> upload_pack_data_init(&data);
> data.use_sideband = LARGE_PACKET_MAX;
> -
> - git_config(upload_pack_config, &data);
> + get_upload_pack_config(&data);
>
> while (state != FETCH_DONE) {
> switch (state) {
Noticed after it landed on master: This change fails with:
make SANITIZE=address test T=t0410*.sh
Running that manually shows that we fail like this:
$ cat trash\ directory.t0410-partial-clone/httpd/error.log | grep -o AH0.*
AH00163: Apache/2.4.54 (Debian) configured -- resuming normal operations
AH00094: Command line: '/usr/sbin/apache2 -d /home/avar/g/git/t/trash directory.t0410-partial-clone/httpd -f /home/avar/g/git/t/lib-httpd/apache.conf -c Listen 127.0.0.1:10410'
AH01215: AddressSanitizer:DEADLYSIGNAL: /home/avar/g/git/git-http-backend
AH01215: =================================================================: /home/avar/g/git/git-http-backend
AH01215: ==27820==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7af5dc0d66 bp 0x7fff11964450 sp 0x7fff11963be8 T0): /home/avar/g/git/git-http-backend
AH01215: ==27820==The signal is caused by a READ memory access.: /home/avar/g/git/git-http-backend
AH01215: ==27820==Hint: address points to the zero page.: /home/avar/g/git/git-http-backend
AH01215: #0 0x7f7af5dc0d66 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167: /home/avar/g/git/git-http-backend
AH01215: #1 0x7f7af5d512f2 in __interceptor_fopen64 ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6220: /home/avar/g/git/git-http-backend
AH01215: #2 0x562a65e37cc8 in git_fopen compat/fopen.c:22: /home/avar/g/git/git-http-backend
AH01215: #3 0x562a65df3879 in fopen_or_warn wrapper.c:431: /home/avar/g/git/git-http-backend
AH01215: #4 0x562a65a12476 in git_config_from_file_with_options config.c:1982: /home/avar/g/git/git-http-backend
AH01215: #5 0x562a65a124f4 in git_config_from_file config.c:1993: /home/avar/g/git/git-http-backend
AH01215: #6 0x562a65a15288 in git_configset_add_file config.c:2389: /home/avar/g/git/git-http-backend
AH01215: #7 0x562a65a16a37 in read_protected_config config.c:2649: /home/avar/g/git/git-http-backend
AH01215: #8 0x562a65a16b5c in git_protected_config config.c:2661: /home/avar/g/git/git-http-backend
AH01215: #9 0x562a65dd9f9a in get_upload_pack_config upload-pack.c:1342: /home/avar/g/git/git-http-backend
AH01215: #10 0x562a65ddc1cb in upload_pack_v2 upload-pack.c:1706: /home/avar/g/git/git-http-backend
AH01215: #11 0x562a65d2eb8a in process_request serve.c:308: /home/avar/g/git/git-http-backend
AH01215: #12 0x562a65d2ec18 in protocol_v2_serve_loop serve.c:323: /home/avar/g/git/git-http-backend
AH01215: #13 0x562a6593c5ae in cmd_upload_pack builtin/upload-pack.c:55: /home/avar/g/git/git-http-backend
AH01215: #14 0x562a656cf8ff in run_builtin git.c:466: /home/avar/g/git/git-http-backend
AH01215: #15 0x562a656d02ab in handle_builtin git.c:720: /home/avar/g/git/git-http-backend
AH01215: #16 0x562a656d09d5 in run_argv git.c:787: /home/avar/g/git/git-http-backend
AH01215: #17 0x562a656d174f in cmd_main git.c:920: /home/avar/g/git/git-http-backend
AH01215: #18 0x562a6594b0b9 in main common-main.c:56: /home/avar/g/git/git-http-backend
AH01215: #19 0x7f7af5a5681c in __libc_start_main ../csu/libc-start.c:332: /home/avar/g/git/git-http-backend
AH01215: #20 0x562a656cb209 in _start (git+0x1d1209): /home/avar/g/git/git-http-backend
AH01215: : /home/avar/g/git/git-http-backend
AH01215: AddressSanitizer can not provide additional info.: /home/avar/g/git/git-http-backend
AH01215: SUMMARY: AddressSanitizer: SEGV ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167 in __sanitizer::internal_strlen(char const*): /home/avar/g/git/git-http-backend
AH01215: ==27820==ABORTING: /home/avar/g/git/git-http-backend
AH01215: error: upload-pack died of signal 6: /home/avar/g/git/git-http-backend
(We really should have a SANITIZE=address in CI, but it takes a while...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Glen Choo wrote (reply to this):
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> On Thu, Jul 14 2022, Glen Choo via GitGitGadget wrote:
>
>> +/* Read values into protected_config. */
>> +static void read_protected_config(void)
>> +{
>> + char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
>> +
>> + git_configset_init(&protected_config);
>> +
>> + system_config = git_system_config();
>> + git_global_config(&user_config, &xdg_config);
>> +
>> + git_configset_add_file(&protected_config, system_config);
>> + git_configset_add_file(&protected_config, xdg_config);
>> + git_configset_add_file(&protected_config, user_config);
>> + git_configset_add_parameters(&protected_config);
>> +
>> + free(system_config);
>> + free(xdg_config);
>> + free(user_config);
>> +}
>
> Noticed after it landed on master: This change fails with:
>
> make SANITIZE=address test T=t0410*.sh
>
> Running that manually shows that we fail like this:
>
> $ cat trash\ directory.t0410-partial-clone/httpd/error.log | grep -o AH0.*
> AH00163: Apache/2.4.54 (Debian) configured -- resuming normal operations
> AH00094: Command line: '/usr/sbin/apache2 -d /home/avar/g/git/t/trash directory.t0410-partial-clone/httpd -f /home/avar/g/git/t/lib-httpd/apache.conf -c Listen 127.0.0.1:10410'
> AH01215: AddressSanitizer:DEADLYSIGNAL: /home/avar/g/git/git-http-backend
> AH01215: =================================================================: /home/avar/g/git/git-http-backend
> AH01215: ==27820==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7af5dc0d66 bp 0x7fff11964450 sp 0x7fff11963be8 T0): /home/avar/g/git/git-http-backend
> AH01215: ==27820==The signal is caused by a READ memory access.: /home/avar/g/git/git-http-backend
> AH01215: ==27820==Hint: address points to the zero page.: /home/avar/g/git/git-http-backend
> AH01215: #0 0x7f7af5dc0d66 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167: /home/avar/g/git/git-http-backend
> AH01215: #1 0x7f7af5d512f2 in __interceptor_fopen64 ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6220: /home/avar/g/git/git-http-backend
> AH01215: #2 0x562a65e37cc8 in git_fopen compat/fopen.c:22: /home/avar/g/git/git-http-backend
> AH01215: #3 0x562a65df3879 in fopen_or_warn wrapper.c:431: /home/avar/g/git/git-http-backend
> AH01215: #4 0x562a65a12476 in git_config_from_file_with_options config.c:1982: /home/avar/g/git/git-http-backend
> AH01215: #5 0x562a65a124f4 in git_config_from_file config.c:1993: /home/avar/g/git/git-http-backend
> AH01215: #6 0x562a65a15288 in git_configset_add_file config.c:2389: /home/avar/g/git/git-http-backend
> AH01215: #7 0x562a65a16a37 in read_protected_config config.c:2649: /home/avar/g/git/git-http-backend
> AH01215: #8 0x562a65a16b5c in git_protected_config config.c:2661: /home/avar/g/git/git-http-backend
> AH01215: #9 0x562a65dd9f9a in get_upload_pack_config upload-pack.c:1342: /home/avar/g/git/git-http-backend
> AH01215: #10 0x562a65ddc1cb in upload_pack_v2 upload-pack.c:1706: /home/avar/g/git/git-http-backend
> AH01215: #11 0x562a65d2eb8a in process_request serve.c:308: /home/avar/g/git/git-http-backend
> AH01215: #12 0x562a65d2ec18 in protocol_v2_serve_loop serve.c:323: /home/avar/g/git/git-http-backend
> AH01215: #13 0x562a6593c5ae in cmd_upload_pack builtin/upload-pack.c:55: /home/avar/g/git/git-http-backend
> AH01215: #14 0x562a656cf8ff in run_builtin git.c:466: /home/avar/g/git/git-http-backend
> AH01215: #15 0x562a656d02ab in handle_builtin git.c:720: /home/avar/g/git/git-http-backend
> AH01215: #16 0x562a656d09d5 in run_argv git.c:787: /home/avar/g/git/git-http-backend
> AH01215: #17 0x562a656d174f in cmd_main git.c:920: /home/avar/g/git/git-http-backend
> AH01215: #18 0x562a6594b0b9 in main common-main.c:56: /home/avar/g/git/git-http-backend
> AH01215: #19 0x7f7af5a5681c in __libc_start_main ../csu/libc-start.c:332: /home/avar/g/git/git-http-backend
> AH01215: #20 0x562a656cb209 in _start (git+0x1d1209): /home/avar/g/git/git-http-backend
> AH01215: : /home/avar/g/git/git-http-backend
> AH01215: AddressSanitizer can not provide additional info.: /home/avar/g/git/git-http-backend
> AH01215: SUMMARY: AddressSanitizer: SEGV ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167 in __sanitizer::internal_strlen(char const*): /home/avar/g/git/git-http-backend
> AH01215: ==27820==ABORTING: /home/avar/g/git/git-http-backend
> AH01215: error: upload-pack died of signal 6: /home/avar/g/git/git-http-backend
>
> (We really should have a SANITIZE=address in CI, but it takes a while...)
Thanks. I narrowed the failure down to the hunk above, specifically this
line:
git_configset_add_file(&protected_config, xdg_config);
Since xdg_config can be NULL, this results in the failing call
fopen_or_warn(NULL, "r").
This logic was lifted from do_git_config_sequence(), which checks that
each of the paths are not NULL. So a fix might be something like:
----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
diff --git a/config.c b/config.c
index 015bec360f..208a3dd7a7 100644
--- a/config.c
+++ b/config.c
@@ -2645,9 +2645,13 @@ static void read_protected_config(void)
system_config = git_system_config();
git_global_config(&user_config, &xdg_config);
- git_configset_add_file(&protected_config, system_config);
- git_configset_add_file(&protected_config, xdg_config);
- git_configset_add_file(&protected_config, user_config);
+
+ if (system_config)
+ git_configset_add_file(&protected_config, system_config);
+ if (xdg_config)
+ git_configset_add_file(&protected_config, xdg_config);
+ if (user_config)
+ git_configset_add_file(&protected_config, user_config);
git_configset_add_parameters(&protected_config);
free(system_config);
----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
I'm not sure if system_config can ever be NULL, but (xdg|user)_config is
NULL when $HOME is unset, and xdg_config is also unset if
$GIT_CONFIG_GLOBAL is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Mon, Jul 25 2022, Glen Choo wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Thu, Jul 14 2022, Glen Choo via GitGitGadget wrote:
>>
>>> +/* Read values into protected_config. */
>>> +static void read_protected_config(void)
>>> +{
>>> + char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
>>> +
>>> + git_configset_init(&protected_config);
>>> +
>>> + system_config = git_system_config();
>>> + git_global_config(&user_config, &xdg_config);
>>> +
>>> + git_configset_add_file(&protected_config, system_config);
>>> + git_configset_add_file(&protected_config, xdg_config);
>>> + git_configset_add_file(&protected_config, user_config);
>>> + git_configset_add_parameters(&protected_config);
>>> +
>>> + free(system_config);
>>> + free(xdg_config);
>>> + free(user_config);
>>> +}
>>
>> Noticed after it landed on master: This change fails with:
>>
>> make SANITIZE=address test T=t0410*.sh
>>
>> Running that manually shows that we fail like this:
>>
>> $ cat trash\ directory.t0410-partial-clone/httpd/error.log | grep -o AH0.*
>> AH00163: Apache/2.4.54 (Debian) configured -- resuming normal operations
>> AH00094: Command line: '/usr/sbin/apache2 -d /home/avar/g/git/t/trash directory.t0410-partial-clone/httpd -f /home/avar/g/git/t/lib-httpd/apache.conf -c Listen 127.0.0.1:10410'
>> AH01215: AddressSanitizer:DEADLYSIGNAL: /home/avar/g/git/git-http-backend
>> AH01215: =================================================================: /home/avar/g/git/git-http-backend
>> AH01215: ==27820==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7af5dc0d66 bp 0x7fff11964450 sp 0x7fff11963be8 T0): /home/avar/g/git/git-http-backend
>> AH01215: ==27820==The signal is caused by a READ memory access.: /home/avar/g/git/git-http-backend
>> AH01215: ==27820==Hint: address points to the zero page.: /home/avar/g/git/git-http-backend
>> AH01215: #0 0x7f7af5dc0d66 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167: /home/avar/g/git/git-http-backend
>> AH01215: #1 0x7f7af5d512f2 in __interceptor_fopen64 ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6220: /home/avar/g/git/git-http-backend
>> AH01215: #2 0x562a65e37cc8 in git_fopen compat/fopen.c:22: /home/avar/g/git/git-http-backend
>> AH01215: #3 0x562a65df3879 in fopen_or_warn wrapper.c:431: /home/avar/g/git/git-http-backend
>> AH01215: #4 0x562a65a12476 in git_config_from_file_with_options config.c:1982: /home/avar/g/git/git-http-backend
>> AH01215: #5 0x562a65a124f4 in git_config_from_file config.c:1993: /home/avar/g/git/git-http-backend
>> AH01215: #6 0x562a65a15288 in git_configset_add_file config.c:2389: /home/avar/g/git/git-http-backend
>> AH01215: #7 0x562a65a16a37 in read_protected_config config.c:2649: /home/avar/g/git/git-http-backend
>> AH01215: #8 0x562a65a16b5c in git_protected_config config.c:2661: /home/avar/g/git/git-http-backend
>> AH01215: #9 0x562a65dd9f9a in get_upload_pack_config upload-pack.c:1342: /home/avar/g/git/git-http-backend
>> AH01215: #10 0x562a65ddc1cb in upload_pack_v2 upload-pack.c:1706: /home/avar/g/git/git-http-backend
>> AH01215: #11 0x562a65d2eb8a in process_request serve.c:308: /home/avar/g/git/git-http-backend
>> AH01215: #12 0x562a65d2ec18 in protocol_v2_serve_loop serve.c:323: /home/avar/g/git/git-http-backend
>> AH01215: #13 0x562a6593c5ae in cmd_upload_pack builtin/upload-pack.c:55: /home/avar/g/git/git-http-backend
>> AH01215: #14 0x562a656cf8ff in run_builtin git.c:466: /home/avar/g/git/git-http-backend
>> AH01215: #15 0x562a656d02ab in handle_builtin git.c:720: /home/avar/g/git/git-http-backend
>> AH01215: #16 0x562a656d09d5 in run_argv git.c:787: /home/avar/g/git/git-http-backend
>> AH01215: #17 0x562a656d174f in cmd_main git.c:920: /home/avar/g/git/git-http-backend
>> AH01215: #18 0x562a6594b0b9 in main common-main.c:56: /home/avar/g/git/git-http-backend
>> AH01215: #19 0x7f7af5a5681c in __libc_start_main ../csu/libc-start.c:332: /home/avar/g/git/git-http-backend
>> AH01215: #20 0x562a656cb209 in _start (git+0x1d1209): /home/avar/g/git/git-http-backend
>> AH01215: : /home/avar/g/git/git-http-backend
>> AH01215: AddressSanitizer can not provide additional info.: /home/avar/g/git/git-http-backend
>> AH01215: SUMMARY: AddressSanitizer: SEGV
>> ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167
>> in __sanitizer::internal_strlen(char const*):
>> /home/avar/g/git/git-http-backend
>> AH01215: ==27820==ABORTING: /home/avar/g/git/git-http-backend
>> AH01215: error: upload-pack died of signal 6: /home/avar/g/git/git-http-backend
>>
>> (We really should have a SANITIZE=address in CI, but it takes a while...)
>
> Thanks. I narrowed the failure down to the hunk above, specifically this
> line:
>
> git_configset_add_file(&protected_config, xdg_config);
>
> Since xdg_config can be NULL, this results in the failing call
> fopen_or_warn(NULL, "r").
>
> This logic was lifted from do_git_config_sequence(), which checks that
> each of the paths are not NULL. So a fix might be something like:
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>
> diff --git a/config.c b/config.c
> index 015bec360f..208a3dd7a7 100644
> --- a/config.c
> +++ b/config.c
> @@ -2645,9 +2645,13 @@ static void read_protected_config(void)
> system_config = git_system_config();
> git_global_config(&user_config, &xdg_config);
>
> - git_configset_add_file(&protected_config, system_config);
> - git_configset_add_file(&protected_config, xdg_config);
> - git_configset_add_file(&protected_config, user_config);
> +
> + if (system_config)
> + git_configset_add_file(&protected_config, system_config);
> + if (xdg_config)
> + git_configset_add_file(&protected_config, xdg_config);
> + if (user_config)
> + git_configset_add_file(&protected_config, user_config);
> git_configset_add_parameters(&protected_config);
>
> free(system_config);
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>
> I'm not sure if system_config can ever be NULL, but (xdg|user)_config is
> NULL when $HOME is unset, and xdg_config is also unset if
> $GIT_CONFIG_GLOBAL is set.
Not having looked into it much at all: Doesn't this then introduce
another logic error where git_protected_config() is now buggy, i.e. it's
a "lazy load" method where we'll expect to read_protected_config()
first.
The assumption with that seems to have been that it's invariant within a
single process, is that still the case, or can e.g. HOME be set during
our runtime when we rely on these functions?
(I don't know)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Glen Choo wrote (reply to this):
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> On Mon, Jul 25 2022, Glen Choo wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>>> On Thu, Jul 14 2022, Glen Choo via GitGitGadget wrote:
>>>
>>>> +/* Read values into protected_config. */
>>>> +static void read_protected_config(void)
>>>> +{
>>>> + char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
>>>> +
>>>> + git_configset_init(&protected_config);
>>>> +
>>>> + system_config = git_system_config();
>>>> + git_global_config(&user_config, &xdg_config);
>>>> +
>>>> + git_configset_add_file(&protected_config, system_config);
>>>> + git_configset_add_file(&protected_config, xdg_config);
>>>> + git_configset_add_file(&protected_config, user_config);
>>>> + git_configset_add_parameters(&protected_config);
>>>> +
>>>> + free(system_config);
>>>> + free(xdg_config);
>>>> + free(user_config);
>>>> +}
>>>
>>> Noticed after it landed on master: This change fails with:
>>>
>>> make SANITIZE=address test T=t0410*.sh
>>>
>>> Running that manually shows that we fail like this:
>>>
>>> $ cat trash\ directory.t0410-partial-clone/httpd/error.log | grep -o AH0.*
>>> AH00163: Apache/2.4.54 (Debian) configured -- resuming normal operations
>>> AH00094: Command line: '/usr/sbin/apache2 -d /home/avar/g/git/t/trash directory.t0410-partial-clone/httpd -f /home/avar/g/git/t/lib-httpd/apache.conf -c Listen 127.0.0.1:10410'
>>> AH01215: AddressSanitizer:DEADLYSIGNAL: /home/avar/g/git/git-http-backend
>>> AH01215: =================================================================: /home/avar/g/git/git-http-backend
>>> AH01215: ==27820==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7af5dc0d66 bp 0x7fff11964450 sp 0x7fff11963be8 T0): /home/avar/g/git/git-http-backend
>>> AH01215: ==27820==The signal is caused by a READ memory access.: /home/avar/g/git/git-http-backend
>>> AH01215: ==27820==Hint: address points to the zero page.: /home/avar/g/git/git-http-backend
>>> AH01215: #0 0x7f7af5dc0d66 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167: /home/avar/g/git/git-http-backend
>>> AH01215: #1 0x7f7af5d512f2 in __interceptor_fopen64 ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6220: /home/avar/g/git/git-http-backend
>>> AH01215: #2 0x562a65e37cc8 in git_fopen compat/fopen.c:22: /home/avar/g/git/git-http-backend
>>> AH01215: #3 0x562a65df3879 in fopen_or_warn wrapper.c:431: /home/avar/g/git/git-http-backend
>>> AH01215: #4 0x562a65a12476 in git_config_from_file_with_options config.c:1982: /home/avar/g/git/git-http-backend
>>> AH01215: #5 0x562a65a124f4 in git_config_from_file config.c:1993: /home/avar/g/git/git-http-backend
>>> AH01215: #6 0x562a65a15288 in git_configset_add_file config.c:2389: /home/avar/g/git/git-http-backend
>>> AH01215: #7 0x562a65a16a37 in read_protected_config config.c:2649: /home/avar/g/git/git-http-backend
>>> AH01215: #8 0x562a65a16b5c in git_protected_config config.c:2661: /home/avar/g/git/git-http-backend
>>> AH01215: #9 0x562a65dd9f9a in get_upload_pack_config upload-pack.c:1342: /home/avar/g/git/git-http-backend
>>> AH01215: #10 0x562a65ddc1cb in upload_pack_v2 upload-pack.c:1706: /home/avar/g/git/git-http-backend
>>> AH01215: #11 0x562a65d2eb8a in process_request serve.c:308: /home/avar/g/git/git-http-backend
>>> AH01215: #12 0x562a65d2ec18 in protocol_v2_serve_loop serve.c:323: /home/avar/g/git/git-http-backend
>>> AH01215: #13 0x562a6593c5ae in cmd_upload_pack builtin/upload-pack.c:55: /home/avar/g/git/git-http-backend
>>> AH01215: #14 0x562a656cf8ff in run_builtin git.c:466: /home/avar/g/git/git-http-backend
>>> AH01215: #15 0x562a656d02ab in handle_builtin git.c:720: /home/avar/g/git/git-http-backend
>>> AH01215: #16 0x562a656d09d5 in run_argv git.c:787: /home/avar/g/git/git-http-backend
>>> AH01215: #17 0x562a656d174f in cmd_main git.c:920: /home/avar/g/git/git-http-backend
>>> AH01215: #18 0x562a6594b0b9 in main common-main.c:56: /home/avar/g/git/git-http-backend
>>> AH01215: #19 0x7f7af5a5681c in __libc_start_main ../csu/libc-start.c:332: /home/avar/g/git/git-http-backend
>>> AH01215: #20 0x562a656cb209 in _start (git+0x1d1209): /home/avar/g/git/git-http-backend
>>> AH01215: : /home/avar/g/git/git-http-backend
>>> AH01215: AddressSanitizer can not provide additional info.: /home/avar/g/git/git-http-backend
>>> AH01215: SUMMARY: AddressSanitizer: SEGV
>>> ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167
>>> in __sanitizer::internal_strlen(char const*):
>>> /home/avar/g/git/git-http-backend
>>> AH01215: ==27820==ABORTING: /home/avar/g/git/git-http-backend
>>> AH01215: error: upload-pack died of signal 6: /home/avar/g/git/git-http-backend
>>>
>>> (We really should have a SANITIZE=address in CI, but it takes a while...)
>>
>> Thanks. I narrowed the failure down to the hunk above, specifically this
>> line:
>>
>> git_configset_add_file(&protected_config, xdg_config);
>>
>> Since xdg_config can be NULL, this results in the failing call
>> fopen_or_warn(NULL, "r").
>>
>> This logic was lifted from do_git_config_sequence(), which checks that
>> each of the paths are not NULL. So a fix might be something like:
>>
>> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>>
>> diff --git a/config.c b/config.c
>> index 015bec360f..208a3dd7a7 100644
>> --- a/config.c
>> +++ b/config.c
>> @@ -2645,9 +2645,13 @@ static void read_protected_config(void)
>> system_config = git_system_config();
>> git_global_config(&user_config, &xdg_config);
>>
>> - git_configset_add_file(&protected_config, system_config);
>> - git_configset_add_file(&protected_config, xdg_config);
>> - git_configset_add_file(&protected_config, user_config);
>> +
>> + if (system_config)
>> + git_configset_add_file(&protected_config, system_config);
>> + if (xdg_config)
>> + git_configset_add_file(&protected_config, xdg_config);
>> + if (user_config)
>> + git_configset_add_file(&protected_config, user_config);
>> git_configset_add_parameters(&protected_config);
>>
>> free(system_config);
>>
>> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>>
>> I'm not sure if system_config can ever be NULL, but (xdg|user)_config is
>> NULL when $HOME is unset, and xdg_config is also unset if
>> $GIT_CONFIG_GLOBAL is set.
>
> Not having looked into it much at all: Doesn't this then introduce
> another logic error where git_protected_config() is now buggy, i.e. it's
> a "lazy load" method where we'll expect to read_protected_config()
> first.
>
> The assumption with that seems to have been that it's invariant within a
> single process, is that still the case, or can e.g. HOME be set during
> our runtime when we rely on these functions?
>
> (I don't know)
I don't think this introduces an error, or at least, not one that we
don't already have. This mimics do_git_config_sequence() (which also
assumes this invariant), which is used under the hood by
(git|repo)_read_config(),
In retrospect, it might have been a good idea to implement
read_protected_config() using do_git_config_sequence() /
config_with_options(); those functions are a bit bloated, but at least
we'd only have one implementation.
Thanks all! This version takes Junio's wording suggestions on the
previous round, renames the config variable and, because of conflicts
with 2.37.1, is newly rebased onto master.
The config variable is now named "safe.bareRepository"; I discarded this
name earlier in this series, but the intent behind the series has come
around to the point where "safe.bareRepository" makes sense again. And
thanks to Dscho's suggestion to explicitly create an option for "no
discovered bare repos" [1], the UI is self-documenting enough that we can
write terser docs.
As mentioned upthread, I would love to see "safe.*" namespace renamed (it's
an adjective, not a Git command/entity), but I'm doubling down on it anyway.
This config option does such a similar thing to "safe.directory" that they
really should be siblings, and I don't foresee a world where
"safe.directory" gets renamed.
I've triple-checked to make sure I've scrubbed the commits of the old name,
but I'd appreciate the extra eyes :)
= Description
There is a known social engineering attack that takes advantage of the fact
that a working tree can include an entire bare repository, including a
config file. A user could run a Git command inside the bare repository
thinking that the config file of the 'outer' repository would be used, but
in reality, the bare repository's config file (which is attacker-controlled)
is used, which may result in arbitrary code execution. See [2] for a fuller
description and deeper discussion.
This series implements a simple way of preventing such attacks: create a
config option, safe.bareRepository, that tells Git whether or not to die
when it finds a bare repository. safe.bareRepository has two values:
behavior
or GIT_DIR
and users/system administrators who never expect to work with bare
repositories can secure their environments using "explicit". We
still trust explicit bare repositories because we are confident that the
user is not confused about which repository is being used.
This series does not change the default behavior, but in the long-run, a
"no-embedded" option might be a safe and usable default [3]. "never" is too
restrictive and unlikely to be the default.
For security reasons, safe.bareRepository cannot be read from
repository-level config (because we would end up trusting the embedded bare
repository that we aren't supposed to trust to begin with). Since this would
introduce a 3rd variable that is only read from 'protected/trusted
configuration' (the others are safe.directory and
uploadpack.packObjectsHook) this series also defines and creates a shared
implementation for 'protected configuration'
= Patch organization
implementation.
= Series history
Changes in v8:
"always|never" -> "all|explicit"
Changes in v7:
functions coming in a later series.
since v4, but slipped under the radar until now.
Changes in v6:
message.
was passing for the wrong reason (because '' is an invalid value for
"discovery.bare"). I removed it because it wasn't doing anything useful
anyway - I was trying to make discovery.bare unset in the command line,
but the whole point of that test is to assert that we respect the CLI
arg.
Changes in v5:
"config" and "configuration". This required some unfortunate rewrapping.
configuration and focus on what Git does instead.
interact instead of saying "has no effect".
Changes in v4:
config
(previously it was a member of the_repository).
config
Changes in v3:
once
repo config using config scopes
Changes in v2:
= Future work
config. This will be done in a follow up series.
work on it any time soon, but I'd be more than happy to review if someone
sends patches.
doesn't die() and we don't tell users why their repository was rejected,
e.g. "git config" gives an opaque "fatal: not in a git directory". This
isn't a new problem though, since safe.directory has the same issue.
[1] https://lore.kernel.org/git/5ps2q552-1rr3-7161-4181-31556pp2ns12@tzk.qr
[2] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[3] This was first suggested in
https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com
Cc: Taylor Blau me@ttaylorr.com, Derrick Stolee derrickstolee@github.com, Junio C Hamano gitster@pobox.com, Emily Shaffer emilyshaffer@google.com, Jonathan Tan jonathantanmy@google.com, Ævar Arnfjörð Bjarmason avarab@gmail.com, Johannes Schindelin Johannes.Schindelin@gmx.de