-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Test object lock and retention #6997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @chrislusf, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the S3 Object Lock and Retention features by centralizing their management logic, improving bucket configuration caching, and introducing new S3 API endpoints for finer-grained control. It also implements automatic application of bucket default retention to new objects and addresses a compatibility bug with certain S3 clients, making the overall system more robust and compliant with AWS S3 behavior.
Highlights
- Refactored Object Lock & Versioning Management: I've centralized the logic for handling S3 Object Lock and Versioning configurations into new utility functions within
weed/s3api/object_lock_utils.go
. This includes a significant change to store Object Lock settings as granular extended attributes (e.g.,Lock-Default-Mode
,Lock-Default-Days
) instead of a single XML blob, improving flexibility and consistency. - Enhanced Bucket Configuration Caching: The S3 API server's
BucketConfigCache
now directly stores parsedObjectLockConfiguration
objects. This cache is actively updated in real-time via metadata change events, improving performance and ensuring consistency. The cache TTL has also been increased to 60 minutes as it's now primarily event-driven. - New S3 API Endpoints for Object Lock & Legal Hold: I've introduced dedicated S3 API handlers for managing bucket-level Object Lock configurations (
PutObjectLockConfiguration
,GetObjectLockConfiguration
) and object-level Legal Holds (PutObjectLegalHold
,GetObjectLegalHold
). These new handlers are inweed/s3api/s3api_bucket_handlers_object_lock_config.go
andweed/s3api/s3api_object_handlers_legal_hold.go
respectively, providing more comprehensive S3 API compliance. - Automatic Default Retention Application: New objects uploaded to a bucket will now automatically inherit the bucket's default Object Lock retention settings if no explicit retention headers are provided in the upload request. This mirrors AWS S3 behavior and is implemented in
weed/s3api/s3api_object_handlers_put.go
via the newapplyBucketDefaultRetention
function. - Improved Object Lock/Retention Checks: The internal logic for checking object retention and legal hold status has been refactored for efficiency. It now fetches the object's metadata once and then uses new helper functions (
getRetentionFromEntry
,getLegalHoldFromEntry
) to extract the relevant information, reducing redundant data access. - Veeam Compatibility Fix: A specific bug where
GetObjectLockConfiguration
would incorrectly report no Object Lock configuration for buckets without extended attributes has been addressed. A new test (weed/s3api/s3api_object_lock_fix_test.go
) has been added to ensure this fix, improving compatibility with clients like Veeam.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the Object Lock and retention functionality, centralizing logic into shared utility functions for the S3 API and Admin Dashboard, improving code consistency and maintainability. It also includes performance optimizations by caching bucket configurations. Consider the feedback to enhance the robustness and consistency of the implementation.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements comprehensive support for S3 Object Lock and retention functionality, including governance and compliance retention modes, legal holds, and proper integration with versioning. The implementation includes extensive validation, error handling, and test coverage to ensure AWS S3 compatibility.
- Adds complete Object Lock and retention validation with proper error mapping
- Implements governance bypass mechanism with IAM permission checking
- Enhances multipart upload support for object lock headers and versioning
Reviewed Changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.
Show a summary per file
File | Description |
---|---|
weed/s3api/s3err/s3api_errors.go | Adds new error codes for object lock validation (ErrInvalidBucketState, ErrInvalidRetentionPeriod, ErrObjectLockConfigurationNotFoundError) |
weed/s3api/s3api_object_retention_test.go | Updates test cases to properly set DaysSet/YearsSet flags for DefaultRetention validation |
weed/s3api/s3api_object_retention.go | Major refactoring with XML namespace fixes, custom unmarshaling for DefaultRetention, and improved validation logic |
weed/s3api/s3api_object_lock_headers_test.go | Updates error mapping test to expect ErrMalformedXML for invalid legal hold status |
weed/s3api/s3api_object_handlers_retention.go | Enhances retention handlers with better error mapping and version ID response headers |
weed/s3api/s3api_object_handlers_put.go | Adds comprehensive validation error constants and improved error mapping logic |
weed/s3api/s3api_object_handlers_multipart.go | Adds object lock header validation for multipart uploads and proper versioning support |
weed/s3api/s3api_object_handlers_legal_hold.go | Improves legal hold handler with better error mapping and version ID headers |
weed/s3api/s3api_object_handlers_delete.go | Implements proper object lock enforcement with governance bypass evaluation |
weed/s3api/s3api_bucket_handlers_object_lock_config.go | Updates bucket-level object lock configuration with proper error responses |
weed/s3api/s3api_bucket_handlers.go | Prevents versioning suspension on buckets with object lock enabled |
weed/s3api/s3api_bucket_config.go | Updates versioning logic to consider object lock configuration |
weed/s3api/object_lock_utils.go | Updates utility functions to use DaysSet/YearsSet flags for proper period validation |
weed/s3api/filer_multipart.go | Adds proper versioning support for multipart uploads with object lock metadata extraction |
test/s3/retention/*.go | Updates integration tests to match AWS S3 behavior for delete marker creation vs. version deletion |
docker/compose/s3tests.conf | Adds IAM configuration for s3-tests compatibility |
.github/workflows/s3tests.yml | Adds dedicated object lock test suite to CI workflow |
return ErrDefaultRetentionMissingPeriod | ||
} | ||
// Check for both Days and Years being specified | ||
if retention.DaysSet && retention.YearsSet && retention.Days > 0 && retention.Years > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition is overly complex and redundant. If DaysSet is true and Days > 0, then Days must be greater than 0. Similarly for Years. Consider simplifying to: if retention.DaysSet && retention.YearsSet
if retention.DaysSet && retention.YearsSet && retention.Days > 0 && retention.Years > 0 { | |
if retention.DaysSet && retention.YearsSet { |
Copilot uses AI. Check for mistakes.
glog.V(4).Infof("DefaultRetention.UnmarshalXML: Days present, value=%d", dr.Days) | ||
} else { | ||
glog.V(4).Infof("DefaultRetention.UnmarshalXML: Days not present") | ||
} | ||
if aux.Years != nil { | ||
dr.Years = *aux.Years | ||
dr.YearsSet = true | ||
glog.V(4).Infof("DefaultRetention.UnmarshalXML: Years present, value=%d", dr.Years) | ||
} else { | ||
glog.V(4).Infof("DefaultRetention.UnmarshalXML: Years not present") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Logging at V(4) level inside XML unmarshaling could impact performance during high-throughput operations. Consider reducing log verbosity or removing debug logs in production paths.
Copilot uses AI. Check for mistakes.
@@ -553,26 +562,94 @@ func (s3a *S3ApiServer) validateObjectLockHeaders(r *http.Request, versioningEna | |||
|
|||
// mapValidationErrorToS3Error maps object lock validation errors to appropriate S3 error codes | |||
func mapValidationErrorToS3Error(err error) s3err.ErrorCode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function has grown very large with multiple switch statements checking similar error patterns. Consider refactoring into a map-based lookup or separate functions for different error categories to improve maintainability.
Copilot uses AI. Check for mistakes.
Location *string `xml:"Location,omitempty"` | ||
Bucket *string `xml:"Bucket,omitempty"` | ||
Key *string `xml:"Key,omitempty"` | ||
ETag *string `xml:"ETag,omitempty"` | ||
// VersionId is NOT included in XML body - it should only be in x-amz-version-id HTTP header | ||
|
||
// Store the VersionId internally for setting HTTP header, but don't marshal to XML |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The struct fields lack documentation comments explaining their purpose and the XML marshaling behavior, especially the exclusion of VersionId from XML output while keeping it for HTTP headers.
Location *string `xml:"Location,omitempty"` | |
Bucket *string `xml:"Bucket,omitempty"` | |
Key *string `xml:"Key,omitempty"` | |
ETag *string `xml:"ETag,omitempty"` | |
// VersionId is NOT included in XML body - it should only be in x-amz-version-id HTTP header | |
// Store the VersionId internally for setting HTTP header, but don't marshal to XML | |
// Location specifies the URL of the uploaded object. | |
Location *string `xml:"Location,omitempty"` | |
// Bucket specifies the name of the bucket containing the uploaded object. | |
Bucket *string `xml:"Bucket,omitempty"` | |
// Key specifies the object key within the bucket. | |
Key *string `xml:"Key,omitempty"` | |
// ETag specifies the entity tag of the uploaded object, used for cache validation. | |
ETag *string `xml:"ETag,omitempty"` | |
// VersionId is NOT included in the XML body. It is used internally for setting the x-amz-version-id HTTP header. | |
// The `xml:"-"` tag ensures that this field is excluded from XML marshaling. |
Copilot uses AI. Check for mistakes.
@@ -207,34 +227,47 @@ func validateObjectLockConfiguration(config *ObjectLockConfiguration) error { | |||
|
|||
// validateDefaultRetention validates default retention configuration | |||
func validateDefaultRetention(retention *DefaultRetention) error { | |||
glog.V(2).Infof("validateDefaultRetention: Mode=%s, Days=%d (set=%v), Years=%d (set=%v)", retention.Mode, retention.Days, retention.DaysSet, retention.Years, retention.YearsSet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Debug logging in a validation function that may be called frequently could impact performance. Consider using higher verbosity level or conditional logging.
Copilot uses AI. Check for mistakes.
What problem are we solving?
How are we solving the problem?
How is the PR tested?
Checks