Skip to content

nativeGitClient.LsFiles is racy, may drop relevant files #21754

@crenshaw-dev

Description

@crenshaw-dev

Describe the bug

If two parallel processes using two nativeGitClient instances with different root directories handle a LsFiles call at the same time with enableNewGitFileGlobbing=true, the method may return an incomplete list of files. This could be a major problem for ApplicationSets that rely on this function for the git files generator.

The underlying problem is that the doublestar library uses the current working directory to determine the root of a glob pattern, so we call os.Chdir. Calling that function in two parallel processes is not safe.

I think we should either contribute a mode to doublestar where we explicitly send the root directory, or we should consider using a different library. (But I'd really like to avoid changing libraries again.)

To Reproduce

Run this test:

func Test_LsFiles_RaceCondition(t *testing.T) {
	// Create two temporary directories and initialize them as git repositories
	tempDir1 := t.TempDir()
	tempDir2 := t.TempDir()

	client1, err := NewClient("file://"+tempDir1, NopCreds{}, true, false, "", "")
	require.NoError(t, err)
	client2, err := NewClient("file://"+tempDir2, NopCreds{}, true, false, "", "")
	require.NoError(t, err)

	err = client1.Init()
	require.NoError(t, err)
	err = client2.Init()
	require.NoError(t, err)

	// Add different files to each repository
	file1 := filepath.Join(client1.Root(), "file1.txt")
	err = os.WriteFile(file1, []byte("content1"), 0644)
	require.NoError(t, err)
	err = runCmd(client1.Root(), "git", "add", "file1.txt")
	require.NoError(t, err)
	err = runCmd(client1.Root(), "git", "commit", "-m", "Add file1")
	require.NoError(t, err)

	file2 := filepath.Join(client2.Root(), "file2.txt")
	err = os.WriteFile(file2, []byte("content2"), 0644)
	require.NoError(t, err)
	err = runCmd(client2.Root(), "git", "add", "file2.txt")
	require.NoError(t, err)
	err = runCmd(client2.Root(), "git", "commit", "-m", "Add file2")
	require.NoError(t, err)

	// Assert that LsFiles returns the correct files when called sequentially
	files1, err := client1.LsFiles("*", true)
	require.NoError(t, err)
	require.Contains(t, files1, "file1.txt")

	files2, err := client2.LsFiles("*", true)
	require.NoError(t, err)
	require.Contains(t, files2, "file2.txt")

	// Define a function to call LsFiles multiple times in parallel
	var wg sync.WaitGroup
	callLsFiles := func(client Client, expectedFile string) {
		defer wg.Done()
		for i := 0; i < 100; i++ {
			files, err := client.LsFiles("*", true)
			require.NoError(t, err)
			require.Contains(t, files, expectedFile)
		}
	}

	// Call LsFiles in parallel for both clients
	wg.Add(2)
	go callLsFiles(client1, "file1.txt")
	go callLsFiles(client2, "file2.txt")
	wg.Wait()
}

Expected behavior

LsFiles should always return the relevant files for a given directory, regardless of parallel processes.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions