feat(repo): add git repository metadata to reports #9252
Conversation
- Add RepoMetadata struct with repository URL, branch, tag, commit info
- Extract git metadata for repository artifacts in local filesystem scanner
- Include git metadata fields in scan report output
- Support both local and remote git repositories

- Combine gitCommitHash() and extractGitMetadata() into single extractGitInfo() function
- Store git metadata in Artifact struct during construction to avoid duplicate git operations
- Add comprehensive unit tests for git metadata extraction functionality
- Test scenarios: clean/dirty repos, upstream/origin remotes, tagged commits, non-git directories
…ommitHash field
- Replace redundant `commitHash string` field with `isClean bool` in Artifact struct
- Update extractGitInfo function to return (bool, RepoMetadata, error) instead of (string, RepoMetadata, error)
- Separate concerns: isClean for cache decisions, repoMetadata.Commit for actual hash value
- Update cache logic to use `a.isClean && a.repoMetadata.Commit != ""` pattern
- Update TestExtractGitInfo to use wantClean instead of wantHash
- Eliminate code duplication while maintaining all existing functionality
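
The cache condition quoted in this commit can be sketched in isolation. This is a minimal illustration, not the PR's code: the `Artifact` and `RepoMetadata` types below are trimmed-down stand-ins that keep only the two fields named in the commit message, and `useCommitAsCacheKey` is a hypothetical helper name. The idea, as far as the commit message describes it, is that the commit hash is only trusted for caching when the working tree is clean and a commit was actually resolved.

```go
package main

import "fmt"

// RepoMetadata is a hypothetical, trimmed-down stand-in for the struct the PR adds.
type RepoMetadata struct {
	Commit string // hash of the HEAD commit, empty when unknown
}

// Artifact keeps only the two fields mentioned in the commit message.
type Artifact struct {
	isClean      bool
	repoMetadata RepoMetadata
}

// useCommitAsCacheKey (hypothetical name) applies the quoted pattern:
// the commit hash is usable for cache decisions only when the working
// tree is clean and a commit hash is present.
func (a Artifact) useCommitAsCacheKey() bool {
	return a.isClean && a.repoMetadata.Commit != ""
}

func main() {
	meta := RepoMetadata{Commit: "378cf9606fe23bdb47639e29a4fb525ed7645e09"}
	clean := Artifact{isClean: true, repoMetadata: meta}
	dirty := Artifact{isClean: false, repoMetadata: meta}

	// A dirty tree still carries the commit hash as metadata,
	// but the hash is not used for the cache decision.
	fmt.Println(clean.useCommitAsCacheKey(), dirty.useCommitAsCacheKey())
}
```

Keeping `isClean` separate from the hash means a dirty repository can still report its commit in the metadata without that hash being mistaken for a description of the scanned contents.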
… omitzero tags
This change removes the type check that only populated RepoMetadata for TypeRepository artifacts. Now git metadata is populated for any directory that happens to be a git repository, regardless of scan type. Also updates JSON struct tags from omitempty to omitzero to better handle empty git metadata fields in reports.
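
The difference between the two tags is easiest to see on a struct-typed field. The sketch below assumes Go 1.24 or newer (where `encoding/json` recognizes `omitzero`) and uses made-up type and field names; the nesting is an illustration, not the layout of Trivy's report types. With `omitempty`, a zero-valued struct is still emitted, whereas `omitzero` drops it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RepoMetadata is a hypothetical stand-in; the field names are assumptions.
type RepoMetadata struct {
	RepoURL string   `json:",omitzero"`
	Branch  string   `json:",omitzero"`
	Tags    []string `json:",omitzero"`
	Commit  string   `json:",omitzero"`
}

// withOmitEmpty tags the nested struct with omitempty, which never omits structs.
type withOmitEmpty struct {
	ArtifactName string
	Metadata     RepoMetadata `json:",omitempty"`
}

// withOmitZero tags the nested struct with omitzero, which omits it when
// every field holds its zero value.
type withOmitZero struct {
	ArtifactName string
	Metadata     RepoMetadata `json:",omitzero"`
}

// render marshals v and returns the JSON text, or the error text on failure.
func render(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// A plain directory scan: no git information was found.
	fmt.Println(render(withOmitEmpty{ArtifactName: "dir"}))
	fmt.Println(render(withOmitZero{ArtifactName: "dir"}))

	// A git repository: the metadata is present in the output.
	fmt.Println(render(withOmitZero{
		ArtifactName: "repo",
		Metadata:     RepoMetadata{Branch: "main", Commit: "378cf96"},
	}))
}
```

On a toolchain older than Go 1.24 the `omitzero` option is ignored, so the empty metadata would still be printed.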
…ository infrastructure
- Replace programmatic git repository creation with existing test-repo
- Simplify test cases from 5 complex scenarios to 2 focused scenarios
- Use internal/gittest/testdata/test-repo as recommended
- Remove unnecessary imports (time, go-git packages)
- Fix compilation errors with boolean return values in extractGitInfo

- Fix gci formatting in RepoMetadata struct field alignment
- Update integration test golden file to include git metadata fields
The multiple_lockfiles test scans a local directory without git information, so it should not expect git metadata in the results. Added an override function to clear the metadata for this specific test case. This resolves the test failure where the golden file was updated with git metadata for a different test (TestClientServer/scan_remote_repository), causing conflicts when both tests share the same golden file.
Changed Tag field to Tags []string in both RepoMetadata and Metadata structs to properly handle cases where multiple tags point to the same commit.
- Updated artifact.RepoMetadata to use Tags []string
- Updated types.Metadata to use Tags []string
- Modified extractGitInfo to collect all tags pointing to HEAD
- Updated tests to handle the new Tags field
- Updated service.go to pass Tags array directly
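
Why a single `Tag` string is not enough can be shown with a small, dependency-free sketch. It does not use the git library the PR relies on: `tagsAtHead` is a hypothetical helper, and the `tagRefs` map (tag name to commit hash) stands in for iterating over the repository's tag references. Annotated tags, which point at a tag object rather than directly at a commit, are deliberately not modeled here.

```go
package main

import (
	"fmt"
	"sort"
)

// tagsAtHead returns the names of all tags whose target hash equals head,
// sorted for stable output. It returns nil when no tag matches.
// tagRefs is a hypothetical stand-in for the repository's tag references.
func tagsAtHead(head string, tagRefs map[string]string) []string {
	var tags []string
	for name, hash := range tagRefs {
		if hash == head {
			tags = append(tags, name)
		}
	}
	sort.Strings(tags)
	return tags
}

func main() {
	head := "378cf9606fe23bdb47639e29a4fb525ed7645e09"
	refs := map[string]string{
		"v1.0.0": head,
		"stable": head, // a second tag on the same commit
		"v0.9.0": "0000000000000000000000000000000000000000",
	}
	fmt.Println(tagsAtHead(head, refs)) // prints [stable v1.0.0]
}
```

Sorting is a choice made for this sketch so that repeated scans of the same commit produce the same order; the commit message does not say whether the PR orders the tags.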
Updated TestArtifact_Inspect test expectations to include the RepoMetadata that is now populated for git repositories. The tests were failing because they expected empty metadata, but our implementation now extracts git information for all repository scans. Updated test cases:
- remote_repo: expects metadata from cloned remote repository
- local_repo: expects metadata from local test repository
- dirty_repository: expects metadata even when repository has uncommitted changes
Removed TestExtractGitInfo from fs_test.go as it's redundant. The git repository functionality is properly tested in pkg/fanal/artifact/repo/git_test.go.
- Use direct assignment with multiple return values
- Consolidate artifact.Reference initialization
- Remove intermediate variable for metadata assignment

- Consolidate error handling for repo.Tags() into a single if statement
- Explicitly ignore the return value of tags.ForEach()
- Fix variable reference in debug log message

- Add brief explanation of automatic git metadata extraction
- List types of information included without implementation details
- Direct users to JSON output for detailed field information
- Keep documentation flexible for future changes
I wanted to check client/server mode, but it looks like we have a bug (from #8278):
➜ trivy -d repo --server 0.0.0.0:8080 --format json github.com/knqyf263/sou
2025-07-29T15:49:24+06:00 DEBUG Default config file "file_path=trivy.yaml" not found, using built in values
2025-07-29T15:49:24+06:00 DEBUG Cache dir dir="/Users/dmitriy/Library/Caches/trivy"
2025-07-29T15:49:24+06:00 DEBUG Cache dir dir="/Users/dmitriy/Library/Caches/trivy"
2025-07-29T15:49:24+06:00 DEBUG Parsed severities severities=[UNKNOWN LOW MEDIUM HIGH CRITICAL]
2025-07-29T15:49:24+06:00 DEBUG Ignore statuses statuses=[]
2025-07-29T15:49:24+06:00 WARN Trivy runs in client/server mode, but misconfiguration and license scanning will be done on the client side, see https://trivy.dev/v0.64/docs/references/modes/client-server
2025-07-29T15:49:24+06:00 DEBUG [pkg] Package types types=[library]
2025-07-29T15:49:24+06:00 DEBUG [pkg] Package relationships relationships=[unknown root workspace direct indirect]
2025-07-29T15:49:24+06:00 INFO [vuln] Vulnerability scanning is enabled
2025-07-29T15:49:24+06:00 INFO [secret] Secret scanning is enabled
2025-07-29T15:49:24+06:00 INFO [secret] If your scanning is slow, please try '--scanners vuln' to disable secret scanning
2025-07-29T15:49:24+06:00 INFO [secret] Please see also https://trivy.dev/v0.64/docs/scanner/secret#recommendation for faster secret detection
2025-07-29T15:49:24+06:00 DEBUG [notification] Running version check
2025-07-29T15:49:24+06:00 DEBUG [notification] Version check completed latest_version="0.64.1"
Enumerating objects: 39, done.
Counting objects: 100% (39/39), done.
Compressing objects: 100% (34/34), done.
Total 39 (delta 6), reused 24 (delta 3), pack-reused 0 (from 0)
2025-07-29T15:49:31+06:00 DEBUG [secret] No secret config detected config_path="trivy-secret.yaml"
2025-07-29T15:49:31+06:00 DEBUG [repo] Analyzing... root="/var/folders/8m/p1341n2941jbyc5gm7357x5c0000gn/T/trivy-remote-repo3088390424" original="github.com/knqyf263/sou"
2025-07-29T15:49:31+06:00 DEBUG [repo] Using the latest commit hash for calculating cache key commit_hash="378cf9606fe23bdb47639e29a4fb525ed7645e09"
2025-07-29T15:49:31+06:00 FATAL Fatal error
- run error:
github.com/aquasecurity/trivy/pkg/commands/artifact.Run
github.com/aquasecurity/trivy/pkg/commands/artifact/run.go:411
- repo scan error:
github.com/aquasecurity/trivy/pkg/commands/artifact.run
github.com/aquasecurity/trivy/pkg/commands/artifact/run.go:449
- scan error:
github.com/aquasecurity/trivy/pkg/commands/artifact.(*runner).scanArtifact
github.com/aquasecurity/trivy/pkg/commands/artifact/run.go:288
- scan failed:
github.com/aquasecurity/trivy/pkg/commands/artifact.(*runner).scan
github.com/aquasecurity/trivy/pkg/commands/artifact/run.go:678
- failed analysis:
github.com/aquasecurity/trivy/pkg/scan.Service.ScanArtifact
github.com/aquasecurity/trivy/pkg/scan/service.go:166
- unable to get missing blob:
github.com/aquasecurity/trivy/pkg/fanal/artifact/local.Artifact.Inspect
github.com/aquasecurity/trivy/pkg/fanal/artifact/local/fs.go:144
- unable to fetch missing layers:
github.com/aquasecurity/trivy/pkg/cache.RemoteCache.MissingBlobs
github.com/aquasecurity/trivy/pkg/cache/remote.go:81
- twirp error internal: could not build request: parse "0.0.0.0:8080/twirp/trivy.cache.v1.Cache/MissingBlobs": first path segment in URL cannot contain colon
Co-authored-by: DmitriyLewen <91113035+DmitriyLewen@users.noreply.github.com>
@DmitriyLewen Thanks for catching the bug! Do you think we should fix it in this PR?
Yes, no problem. We can fix it in another PR, since the bug is not related to these changes.
You may have missed
It's documented. We may want to show a more user-friendly message when the schema is missing.
I became kind of absent-minded...
No worries. Even maintainers can make mistakes with
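
The error in the log above comes from parsing an address without a URL scheme. A friendlier check could look roughly like the sketch below; `validateServerAddr` is a hypothetical helper written for illustration, not Trivy's implementation, and the accepted schemes (`http`, `https`) are an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateServerAddr (hypothetical) rejects server addresses that lack an
// http:// or https:// scheme and explains what is missing.
func validateServerAddr(addr string) error {
	u, err := url.Parse(addr)
	if err != nil {
		// "0.0.0.0:8080" fails here: without a scheme, the colon ends up
		// in the first path segment, which url.Parse rejects.
		return fmt.Errorf("invalid server address %q (is the http:// or https:// scheme missing?): %w", addr, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		// "localhost:8080" parses without error, but "localhost" is
		// treated as the scheme, so it is caught here.
		return fmt.Errorf("server address %q must start with http:// or https://", addr)
	}
	if u.Host == "" {
		return fmt.Errorf("server address %q has no host", addr)
	}
	return nil
}

func main() {
	for _, addr := range []string{"0.0.0.0:8080", "localhost:8080", "http://0.0.0.0:8080"} {
		if err := validateServerAddr(addr); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", addr)
	}
}
```

Running the check when the flag is read would surface the problem before any repository is cloned or analyzed, instead of at the first remote cache request.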
Co-authored-by: knqyf263 <knqyf263@users.noreply.github.com> Co-authored-by: DmitriyLewen <91113035+DmitriyLewen@users.noreply.github.com>
Description
This PR adds git repository metadata to Trivy scan reports for both local and remote git repository scans. The metadata gives users context about the scanned repository, including commit, authorship, and version information.
Key Features
Git Metadata Extraction
Comprehensive Metadata Collection
Universal Git Detection
… (`trivy repo`) … (`trivy fs`) when scanning git directories
Implementation Details
… `omitzero` JSON tags to ensure empty metadata doesn't clutter reports
Example Output
Remote Repository Scan
Related Issues
Checklist