Skip to content

Conversation

chrislusf
Copy link
Collaborator

@chrislusf chrislusf commented Jul 30, 2025

  • UI changes, componentized configuration fields
  • Configuration saved in protobuf format
  • EC worker implementations

chrislusf added 30 commits July 24, 2025 00:37
Current Status
✅ Master: Healthy and running (port 9333)
✅ Filer: Healthy and running (port 8888)
✅ Volume Servers: All 6 servers running (ports 8080-8085)
🔄 Admin/Workers: Will start when dependencies are ready
but:
1. one volume should not be repeatedly worked on.
2. ec shards needs to be distributed and source data should be deleted.
* distribution of ec shards needs improvement
* need configuration to enable ec
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @chrislusf, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've implemented a series of improvements to the admin server and its interaction with workers, with a strong focus on Erasure Coding (EC) capabilities. This work includes a significant overhaul of the configuration management system, moving to a more robust protobuf-based, schema-driven approach. I've also added extensive Docker-based testing infrastructure to ensure the reliability of new EC worker implementations. Furthermore, the user interface has been enhanced to provide more detailed insights into EC volumes and shards, as well as comprehensive collection-level information.

Highlights

  • Modular Configuration System: The core configuration system has been refactored to use Protocol Buffers and a new schema-driven approach. This makes configurations more robust, type-safe, and easier to validate and extend in the future. The UI now dynamically renders configuration forms based on these schemas.
  • Enhanced Erasure Coding (EC) Support: The pull request significantly advances Erasure Coding (EC) functionality. This includes new task detection for EC needs, a refined assignment process for EC workers, and detailed UI pages to monitor individual EC shards and their distribution across the cluster, as well as aggregated views of EC volumes.
  • Comprehensive Docker Testing Environment: A comprehensive Docker-based testing environment has been introduced, allowing for easy setup and testing of EC and vacuum worker operations. This environment includes all necessary SeaweedFS components (master, volume servers, filer, admin, workers) and tools for load generation and monitoring.
  • Improved Maintenance Task Management: The maintenance manager now integrates with an 'active topology' and 'pending operations' tracker. This allows the system to make more intelligent decisions about task planning, considering the current cluster state and ongoing operations to avoid conflicts and optimize resource utilization.
  • Refined UI for Cluster Management: The UI has been updated with new pages for EC shards, EC volumes, and detailed collection views. These enhancements provide administrators with a much clearer and more granular understanding of their data distribution and the status of maintenance operations.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant set of features for admin server and worker management, particularly for Erasure Coding (EC) tasks. The changes include a new design document, extensive testing infrastructure via Docker and Makefiles, a schema-driven configuration system, and new UI pages for managing EC volumes and collections. The overall design is robust and the move to protobuf-based, schema-driven configuration is a major improvement. I've identified a few areas for improvement, including a potential bug in a test script, a correctness issue in a reflection helper, and some inconsistencies in documentation and naming that could cause confusion. Addressing these points will enhance the quality and maintainability of this new functionality.

Comment on lines 363 to 381
func isZeroValue(v reflect.Value) bool {
switch v.Kind() {
case reflect.Bool:
return !v.Bool()
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
return v.Int() == 0
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr:
return v.Uint() == 0
case reflect.Float32, reflect.Float64:
return v.Float() == 0
case reflect.String:
return v.String() == ""
case reflect.Slice, reflect.Map, reflect.Array:
return v.IsNil() || v.Len() == 0
case reflect.Interface, reflect.Ptr:
return v.IsNil()
}
return false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The custom isZeroValue implementation is not fully correct and reinvents functionality already provided by the Go standard library. For example, v.IsNil() will panic on non-pointer array types. The reflect.Value.IsZero() method, available since Go 1.13, correctly checks if a value is the zero value for its type across all kinds and is more robust and maintainable.

func isZeroValue(v reflect.Value) bool {
	return v.IsZero()
}

@@ -0,0 +1,18 @@
FROM alpine:latest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using the latest tag for a base image can lead to non-reproducible builds, as it may point to different versions of Alpine over time. It's a best practice to pin to a specific version (e.g., alpine:3.19) to ensure build consistency and avoid unexpected breakages.

FROM alpine:3.19

clean: ## Stop and remove all containers, networks, and volumes
@echo "🧹 Cleaning up SeaweedFS test environment..."
@docker-compose -f $(COMPOSE_FILE) down -v --remove-orphans
@docker system prune -f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The docker system prune -f command is very aggressive and can remove any stopped containers, networks, and dangling images on the user's system, not just those related to this project. This could lead to unexpected data loss. The docker-compose -f $(COMPOSE_FILE) down -v --remove-orphans command is sufficient for cleaning up the environment. Consider removing this line to make the clean target safer.

@echo "Creating large files with random data to trigger EC (targeting ~60MB total to exceed 50MB limit)..."
@for i in {1..12}; do \
echo "Creating 5MB random file $$i..."; \
docker run --rm --network admin_integration_seaweed_net -v /tmp:/tmp --entrypoint sh chrislusf/seaweedfs:local -c "dd if=/dev/urandom of=/tmp/largefile$$i.dat bs=1M count=5 2>/dev/null && weed upload -master=master:9333 /tmp/largefile$$i.dat && rm /tmp/largefile$$i.dat"; \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Redirecting stderr of the dd command to /dev/null suppresses all error messages, which can hide problems like disk space issues or permissions errors during test data generation. It's better to allow errors to be visible for easier debugging.

		docker run --rm --network admin_integration_seaweed_net -v /tmp:/tmp --entrypoint sh chrislusf/seaweedfs:local -c "dd if=/dev/urandom of=/tmp/largefile$$i.dat bs=1M count=5 && weed upload -master=master:9333 /tmp/largefile$$i.dat && rm /tmp/largefile$$i.dat"; \

Comment on lines +36 to +40
if docker-compose -f docker-compose-ec-test.yml config | grep -q "workingDir=/work"; then
echo -e "${GREEN}✅ Worker working directory option is properly configured${NC}"
else
echo -e "${RED}❌ Worker working directory option is missing${NC}"
exit 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This check for workingDir=/work will fail because the docker-compose-ec-test.yml file does not define a workingDir for the worker services, nor is the -workingDir flag passed to the worker command. To align with the test's expectation and the feature described in the README, the worker command in the compose file should be updated to include this setting.

}

// ShowClusterEcShards handles the cluster EC shards page (individual shards view)
func (h *ClusterHandlers) ShowClusterEcShards(c *gin.Context) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The function name ShowClusterEcShards is misleading as it fetches and renders a list of EC volumes, not individual shards. It calls GetClusterEcVolumes and uses the app.ClusterEcVolumes template. To improve code clarity and maintainability, consider renaming this function to ShowClusterEcVolumes to accurately reflect its purpose.

chrislusf and others added 6 commits July 30, 2025 11:44
@chrislusf chrislusf merged commit 891a2fb into master Jul 30, 2025
22 checks passed
@chrislusf chrislusf deleted the cleaner-base-before-adding-worker-logs-and-read-logs-from-admin branch July 30, 2025 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant