Refactor/data-code #1126
Moved backup-related modules from swanlab.data.backup to swanlab.log.backup and transfer modules from swanlab.data.transfers to swanlab.transfers. Updated all relevant imports and usages throughout the codebase. Refactored Transfer and ProtoV0Transfer to accept media_dir and file_dir as constructor arguments. Improved error log uploading in CloudPyCallback and unified backup handler usage.
Refactored BackupHandler to accept run parameters directly instead of accessing run_store internally, improving modularity and testability. Updated cloud, local, and offline callbackers to pass run parameters explicitly and to use run_store consistently. Added access control to get_run_store to restrict usage to swanlab.data module. Moved uploader imports to explicit usage and centralized error log uploading in ProtoV0Transfer.
Moved and refactored callback logic from swanlab/data/run/callback.py into swanlab/data/callbacker/callback.py, and updated cloud, local, and offline callback implementations to inherit from the new base class. Centralized utility functions in swanlab/data/callbacker/utils.py and updated all callbackers to use these shared utilities for printing and path formatting. Removed redundant code and improved maintainability by reducing duplication and clarifying callback registration and cleanup logic.
Removed legacy transfer and backup modules, consolidating data transfer logic into a new ProtoTransfer singleton in swanlab/data/transfer.py. Updated callbackers to use ProtoTransfer for logging, metric, and runtime info handling. Migrated DataStore to swanlab/data, removed async_io utility, and cleaned up related imports and usages. Adjusted tests and internal references to reflect new module structure and APIs.
Replaces the previous ProtoTransfer and ModelsParser logic with a new DataPorter class that supports both experiment trace and sync upload modes, centralizing backup file parsing, data publishing, and resource management. Updates all callbackers and sync logic to use DataPorter, removes ModelsParser from proto/v0.py, and adds platformdirs to requirements. This refactor improves maintainability and consistency for experiment data handling and synchronization.
Applied the @synced decorator to DataPorter methods to ensure thread safety. Improved backup handling in SwanLabRun by deleting the run directory when backup is disabled. Updated SwanLabInitializer to use a system runtime directory for logs when backup is off, ensuring proper log storage.
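A minimal sketch of what a `@synced`-style decorator could look like, assuming it serializes method calls on one instance with a re-entrant lock. The `Porter` class, the `_lock` attribute, and `run_threads` are hypothetical names for illustration; the actual `DataPorter` implementation may differ.

```python
import threading
from functools import wraps


def synced(method):
    """Hypothetical stand-in for @synced: run the method while holding the instance lock."""

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        # RLock lets one synced method call another on the same instance without deadlocking.
        with self._lock:
            return method(self, *args, **kwargs)

    return wrapper


class Porter:
    """Toy class standing in for DataPorter (assumption, not the real API)."""

    def __init__(self):
        self._lock = threading.RLock()
        self.count = 0

    @synced
    def publish(self, n):
        # The read-modify-write below is only safe because callers are serialized.
        for _ in range(n):
            self.count += 1


def run_threads(porter, workers=8, n=1000):
    """Call publish() from several threads and return the final counter."""
    threads = [threading.Thread(target=porter.publish, args=(n,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return porter.count
```

A re-entrant lock is used here so that nested calls between decorated methods stay safe; a plain `Lock` would deadlock in that case.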
Introduced an @inside decorator to restrict access to RunStore functions to the swanlab.data module or test runtime. Refactored test utilities to use a new UseMockRunState context manager for consistent client and store state management in tests. Updated imports and test code to align with these changes, improving test isolation and reliability.
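One way such an access guard could be written is to inspect the caller's module name at call time. The sketch below is an assumption about the approach, not the PR's code: `inside`, `get_run_store`, and `call_from` are illustrative, and the test-runtime exemption mentioned above is left out.

```python
import sys
from functools import wraps


def inside(allowed_prefix):
    """Hypothetical guard: only callers whose module is under allowed_prefix may call func."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Frame 0 is this wrapper; frame 1 is whoever called the guarded function.
            caller = sys._getframe(1).f_globals.get("__name__", "")
            if caller != allowed_prefix and not caller.startswith(allowed_prefix + "."):
                raise RuntimeError(f"{func.__name__} may only be called from {allowed_prefix}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


@inside("swanlab.data")
def get_run_store():
    # Placeholder return value; the real RunStore is a richer object.
    return {"run_id": None}


def call_from(module_name, func):
    """Helper for demonstration: invoke func as if from a module named module_name."""
    namespace = {"__name__": module_name, "func": func}
    exec("result = func()", namespace)
    return namespace["result"]
```

Here `call_from("swanlab.data.run", get_run_store)` succeeds, while the same call with `"user_script"` raises `RuntimeError`. Note that `sys._getframe` is a CPython implementation detail, so this kind of guard is a convention rather than a security boundary.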
Refactored LocalRunCallback to set logdir via environment variable and updated tests to use UseMockRunState for better isolation. Changed SwanLabRun to default operator to None. Cleared callbacks after validation in SwanLabInitializer. Improved and relocated error handling for invalid config parameters from test_config.py to test_sdk.py, adding more comprehensive tests for config input validation.
Introduced DisabledCallback for 'disabled' mode, ensuring proper handling when runs are not to be saved or uploaded. Refactored logdir initialization logic in SwanLabInitializer to unify and simplify directory setup for all modes, and moved run_dir cleanup logic from main.py to the cloud callback. Updated operator creation to use DisabledCallback in disabled mode.
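As a rough illustration of the mode-to-callback mapping described above, the sketch below uses a no-op callback for `disabled` mode. All class and function names other than `DisabledCallback` are hypothetical, and `RecordingCallback` merely stands in for the cloud, local, and offline implementations.

```python
class Callback:
    """Base class: every hook is a no-op unless a subclass overrides it."""

    name = "base"

    def on_init(self):
        return None

    def on_log(self, key, value):
        return None

    def on_stop(self):
        return None


class DisabledCallback(Callback):
    """Used in 'disabled' mode: nothing is saved or uploaded."""

    name = "disabled"


class RecordingCallback(Callback):
    """Stand-in for the cloud/local/offline callbacks; it only records what was logged."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def on_log(self, key, value):
        self.records.append((key, value))


def create_callback(mode):
    """Pick the callback for a run mode (illustrative dispatch, not the PR's code)."""
    if mode == "disabled":
        return DisabledCallback()
    if mode in ("cloud", "local", "offline"):
        return RecordingCallback(mode)
    raise ValueError(f"unknown mode: {mode!r}")
```

Giving disabled mode its own callback means the calling code can invoke the same hooks in every mode without branching on the mode at each call site.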
Pull Request Overview
This PR restructures SwanLab’s data flow to prepare for protobuf support, replaces the legacy backup handler with a unified DataPorter pipeline, and simplifies runtime state management via a global RunStore.
- Introduces RunStore and DataPorter to unify backup, read, and upload logic.
- Deprecates SwanLabSharedSettings in favor of RunStore globals and forces backup in the user cache directory.
- Refactors callbacks and SDK code to leverage the new architecture and updates tests to use UseMockRunState.
Reviewed Changes
Copilot reviewed 45 out of 46 changed files in this pull request and generated no comments.
File | Description |
---|---|
tutils/setup.py | Renamed test setup context to UseMockRunState, set up RunStore. |
test/unit/sync/v0/test_sync.py | Updated sync tests to use DataPorter and validate backup file. |
test/unit/proto/v0/test_models.py | Switched to swanlab.proto.v0 models and adjusted file writes. |
test/unit/data/test_sdk.py | Adjusted disabled-mode assertions and added platformdirs import. |
test/unit/data/run/test_main.py | Wrapped run tests in UseMockRunState context. |
test/unit/data/run/test_config.py | Removed redundant error-config-input tests. |
test/unit/data/run/metadata/test_runtime.py | Replaced UseSetupHttp with UseMockRunState for API key masking. |
test/unit/data/callbacker/test_local.py | Updated local callback tests to use UseMockRunState. |
test/unit/data/backup/test_datastore.py | Updated import path for DataStore. |
test/unit/core_python/uploader/test_model.py | Corrected import for FileModel. |
test/unit/core_python/test_client.py | Switched client tests to UseMockRunState and client. |
test/unit/conftest.py | Added resets for DataPorter and run_store. |
test/unit/_/test_setup.py | Validated UseMockRunState behavior in setup tests. |
swanlab/toolkit/__init__.py | Removed SwanLabSharedSettings from exports. |
swanlab/sync/__init__.py | Replaced backup handler logic with DataPorter. |
swanlab/proto/v1.py | Added v1 protocol stub. |
swanlab/proto/v0.py | Updated import paths and streamlined model logic. |
swanlab/log/type.py | Expanded TypedDict to allow arbitrary return in LogHandler. |
swanlab/log/backup/writer.py | Removed old backup writer. |
swanlab/log/backup/handler.py | Removed old backup handler. |
swanlab/log/backup/__init__.py | Removed old backup init. |
swanlab/data/utils.py | Consolidated callback imports and updated prompt logic. |
swanlab/data/sdk.py | Refactored login/init logic to use RunStore; missing imports need fixes. |
swanlab/data/run/main.py | Overhauled run initialization and cleanup to use RunStore. |
swanlab/data/run/helper.py | Adjusted operator hook signatures. |
swanlab/data/run/exp.py | Refactored experiment class to use RunStore. |
swanlab/data/run/__init__.py | Streamlined exports and removed old register helper. |
swanlab/data/store.py | Introduced RunStore model and access guards. |
swanlab/data/porter/datastore.py | Updated error message in header version check. |
swanlab/data/porter/__init__.py | Implemented DataPorter end-to-end backup/sync pipeline. |
swanlab/data/callbacker/utils.py | Added utility callbacks for printing lifecycle messages. |
swanlab/data/callbacker/offline.py | Refactored offline callback to use DataPorter and RunStore. |
swanlab/data/callbacker/local.py | Refactored local callback to use DataPorter and RunStore. |
swanlab/data/callbacker/disabled.py | Added disabled-mode callback. |
swanlab/data/callbacker/cloud.py | Refactored cloud callback to use DataPorter and new imports. |
swanlab/data/callbacker/callback.py | Consolidated callback base class to use DataPorter and RunStore. |
swanlab/data/callbacker/__init__.py | Updated exports to match new callback classes. |
swanlab/data/__init__.py | Cleaned up legacy imports. |
swanlab/core_python/client/__init__.py | Removed rich status, streamlined project mounting logic. |
swanlab/core_python/auth/providers/api_key.py | Introduced create_login_info factory and updated imports. |
swanlab/core_python/__init__.py | Removed blanket uploader imports. |
swanlab/cli/commands/sync/__init__.py | Fixed variable naming in sync CLI command. |
swanlab/__init__.py | Added module exports for media types. |
Comments suppressed due to low confidence (2)
swanlab/data/sdk.py:74
- The name auth is not imported in this module; calls to auth.code_login and auth.create_login_info will fail. Please add an import, e.g. from swanlab.core_python import auth.
login_info = auth.code_login(api_key, save) if api_key else auth.create_login_info(save)
swanlab/data/sdk.py:75
- The function create_client is not imported in this module; this call will raise a NameError. Please add from swanlab.core_python import create_client.
create_client(login_info)
Description
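This PR does the following:
- The SwanLabSharedSettings data type is replaced by an (internal) run_store global variable. The upside is convenience; the downside is that it needs more care to maintain. This should be the last time a global variable is added to the Python part of swanlab; no more will be added after this.
- Backup feature: the original Settings(backup=False) logic is retained; data is stored under the user cache directory (and deleted when the experiment ends).
This will be the last related PR before #1054 is implemented.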
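The Settings(backup=False) behavior described above can be sketched as a context manager that places the run directory under a cache root and removes it when the run ends. This is an assumption about the shape of the logic, not the PR's code: `run_directory` and its parameters are hypothetical, and in the real project `cache_root` would presumably come from platformdirs (which this PR adds to the requirements).

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def run_directory(backup, logdir, cache_root):
    """Yield a run directory; with backup off it lives under cache_root and is deleted on exit."""
    root = logdir if backup else cache_root
    os.makedirs(root, exist_ok=True)
    run_dir = tempfile.mkdtemp(prefix="run-", dir=root)
    try:
        yield run_dir
    finally:
        if not backup:
            # Temporary data only: remove it once the experiment has finished.
            shutil.rmtree(run_dir, ignore_errors=True)
```

With `backup=True` the directory is created under `logdir` and kept; with `backup=False` it exists only for the lifetime of the `with` block.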