SAKURA-CAT
Member

@SAKURA-CAT SAKURA-CAT commented Jun 25, 2025

Description

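This PR does the following:

  • Refactors swanlab's internal data flow and changes when some state is initialized. This is the final preparation for [REQUEST] resume feature #1054.
  • Wraps the whole pipeline of backup, read, and upload in one place so that protobuf can be adopted smoothly later. As a result, the tests in Integrate SwanLab for offline/online experiment tracking for Accelerate huggingface/accelerate#3605 may need to be updated.
  • Drops the SwanLabSharedSettings data type from swankit in favor of an internal run_store global variable. This is more convenient to use but takes more care to maintain. It should be the last global variable added to the Python part of swanlab; no more will be added after this.
  • Changes the backup logic. To prepare for swanlab-core #1063, backup is now always enabled. The existing Settings(backup=False) logic is kept: in that case the data is stored in the user cache directory and deleted when the experiment ends.
  • Fixes several bugs in sync and related features.

This will be the last related PR before #1054 is implemented.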
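
The backup=False behaviour described above (data still written, but to a throwaway location that is removed when the experiment ends) can be sketched roughly as follows. This is an illustration only, not the code in this PR: `run_directory` is a hypothetical helper, and the standard-library `tempfile` module stands in for the user cache directory.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def run_directory(logdir: Path, backup: bool):
    """Yield a directory for run files (hypothetical helper).

    backup=True:  files live under `logdir` and are kept.
    backup=False: files live in a temporary directory (a stand-in for the
                  user cache directory) that is removed on exit.
    """
    if backup:
        run_dir = logdir / "run"
        run_dir.mkdir(parents=True, exist_ok=True)
        yield run_dir
        return

    run_dir = Path(tempfile.mkdtemp(prefix="example-run-"))
    try:
        yield run_dir
    finally:
        # The data was only needed while the experiment was running.
        shutil.rmtree(run_dir, ignore_errors=True)
```

Either way the rest of the pipeline always has a backup file to read from, which is the point of forcing backup on; only the lifetime of the directory differs.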

Moved backup-related modules from swanlab.data.backup to swanlab.log.backup and transfer modules from swanlab.data.transfers to swanlab.transfers. Updated all relevant imports and usages throughout the codebase. Refactored Transfer and ProtoV0Transfer to accept media_dir and file_dir as constructor arguments. Improved error log uploading in CloudPyCallback and unified backup handler usage.
Refactored BackupHandler to accept run parameters directly instead of accessing run_store internally, improving modularity and testability. Updated cloud, local, and offline callbackers to pass run parameters explicitly and to use run_store consistently. Added access control to get_run_store to restrict usage to swanlab.data module. Moved uploader imports to explicit usage and centralized error log uploading in ProtoV0Transfer.
Moved and refactored callback logic from swanlab/data/run/callback.py into swanlab/data/callbacker/callback.py, and updated cloud, local, and offline callback implementations to inherit from the new base class. Centralized utility functions in swanlab/data/callbacker/utils.py and updated all callbackers to use these shared utilities for printing and path formatting. Removed redundant code and improved maintainability by reducing duplication and clarifying callback registration and cleanup logic.
Removed legacy transfer and backup modules, consolidating data transfer logic into a new ProtoTransfer singleton in swanlab/data/transfer.py. Updated callbackers to use ProtoTransfer for logging, metric, and runtime info handling. Migrated DataStore to swanlab/data, removed async_io utility, and cleaned up related imports and usages. Adjusted tests and internal references to reflect new module structure and APIs.
Replaces the previous ProtoTransfer and ModelsParser logic with a new DataPorter class that supports both experiment trace and sync upload modes, centralizing backup file parsing, data publishing, and resource management. Updates all callbackers and sync logic to use DataPorter, removes ModelsParser from proto/v0.py, and adds platformdirs to requirements. This refactor improves maintainability and consistency for experiment data handling and synchronization.
Applied the @synced decorator to DataPorter methods to ensure thread safety. Improved backup handling in SwanLabRun by deleting the run directory when backup is disabled. Updated SwanLabInitializer to use a system runtime directory for logs when backup is off, ensuring proper log storage.
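
For readers unfamiliar with the pattern, a method-level synchronization decorator might look roughly like this. It is a minimal sketch under assumptions, not the `@synced` implementation from this PR: the per-instance `_lock` attribute and the `CountingPorter` class are hypothetical.

```python
import threading
from functools import wraps


def synced(method):
    """Run `method` while holding the instance's re-entrant lock (sketch)."""

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        # Assumes the instance created `self._lock` in __init__.
        with self._lock:
            return method(self, *args, **kwargs)

    return wrapper


class CountingPorter:
    """Hypothetical stand-in for a class whose methods must not interleave."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self.published = 0

    @synced
    def publish(self, n: int) -> None:
        # Without the lock, concurrent read-modify-write could lose updates.
        for _ in range(n):
            self.published += 1
```

A re-entrant lock is used so that one synchronized method can call another on the same instance without deadlocking.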
Introduced an @inside decorator to restrict access to RunStore functions to the swanlab.data module or test runtime. Refactored test utilities to use a new UseMockRunState context manager for consistent client and store state management in tests. Updated imports and test code to align with these changes, improving test isolation and reliability.
Refactored LocalRunCallback to set logdir via environment variable and updated tests to use UseMockRunState for better isolation. Changed SwanLabRun to default operator to None. Cleared callbacks after validation in SwanLabInitializer. Improved and relocated error handling for invalid config parameters from test_config.py to test_sdk.py, adding more comprehensive tests for config input validation.
Introduced DisabledCallback for 'disabled' mode, ensuring proper handling when runs are not to be saved or uploaded. Refactored logdir initialization logic in SwanLabInitializer to unify and simplify directory setup for all modes, and moved run_dir cleanup logic from main.py to the cloud callback. Updated operator creation to use DisabledCallback in disabled mode.
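
The idea of a dedicated callback for 'disabled' mode is essentially the null-object pattern: the run code calls the same hooks in every mode, and the disabled variant simply does nothing. A rough sketch, with hypothetical hook names and an in-memory callback standing in for the cloud, local, and offline implementations:

```python
from typing import Any, List, Optional, Tuple


class Callback:
    """Base class whose lifecycle hooks do nothing by default (sketch)."""

    def on_init(self, project: str) -> None:
        pass

    def on_log(self, key: str, value: Any, step: int) -> None:
        pass

    def on_stop(self, error: Optional[str] = None) -> None:
        pass


class InMemoryCallback(Callback):
    """Stand-in for a callback that actually records data."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, Any, int]] = []

    def on_log(self, key: str, value: Any, step: int) -> None:
        self.records.append((key, value, step))


class DisabledCallback(Callback):
    """Inherits the no-op hooks: nothing is saved or uploaded."""


def create_callback(mode: str) -> Callback:
    """Pick a callback for the given mode (hypothetical factory)."""
    if mode == "disabled":
        return DisabledCallback()
    if mode in ("cloud", "local", "offline"):
        return InMemoryCallback()
    raise ValueError(f"unknown mode: {mode!r}")
```

With this shape the caller never needs an `if mode == "disabled"` branch around each hook call; the mode is decided once, when the callback is created.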
@SAKURA-CAT SAKURA-CAT marked this pull request as ready for review June 25, 2025 10:35
@SAKURA-CAT SAKURA-CAT self-assigned this Jun 25, 2025
@SAKURA-CAT SAKURA-CAT added the 🐛 bug and 💪 enhancement labels Jun 25, 2025
@SAKURA-CAT SAKURA-CAT requested review from Copilot and Zeyi-Lin June 26, 2025 03:23
@Copilot Copilot AI left a comment

Pull Request Overview

This PR restructures SwanLab’s data flow to prepare for protobuf support, replaces the legacy backup handler with a unified DataPorter pipeline, and simplifies runtime state management via a global RunStore.

  • Introduces RunStore and DataPorter to unify backup, read, and upload logic.
  • Deprecates SwanLabSharedSettings in favor of RunStore globals and forces backup in the user cache directory.
  • Refactors callbacks and SDK code to leverage the new architecture and updates tests to use UseMockRunState.

Reviewed Changes

Copilot reviewed 45 out of 46 changed files in this pull request and generated no comments.

File | Description
tutils/setup.py | Renamed test setup context to UseMockRunState, set up RunStore.
test/unit/sync/v0/test_sync.py | Updated sync tests to use DataPorter and validate backup file.
test/unit/proto/v0/test_models.py | Switched to swanlab.proto.v0 models and adjusted file writes.
test/unit/data/test_sdk.py | Adjusted disabled-mode assertions and added platformdirs import.
test/unit/data/run/test_main.py | Wrapped run tests in UseMockRunState context.
test/unit/data/run/test_config.py | Removed redundant error-config-input tests.
test/unit/data/run/metadata/test_runtime.py | Replaced UseSetupHttp with UseMockRunState for API key masking.
test/unit/data/callbacker/test_local.py | Updated local callback tests to use UseMockRunState.
test/unit/data/backup/test_datastore.py | Updated import path for DataStore.
test/unit/core_python/uploader/test_model.py | Corrected import for FileModel.
test/unit/core_python/test_client.py | Switched client tests to UseMockRunState and client.
test/unit/conftest.py | Added resets for DataPorter and run_store.
test/unit/_/test_setup.py | Validated UseMockRunState behavior in setup tests.
swanlab/toolkit/__init__.py | Removed SwanLabSharedSettings from exports.
swanlab/sync/__init__.py | Replaced backup handler logic with DataPorter.
swanlab/proto/v1.py | Added v1 protocol stub.
swanlab/proto/v0.py | Updated import paths and streamlined model logic.
swanlab/log/type.py | Expanded TypedDict to allow arbitrary return in LogHandler.
swanlab/log/backup/writer.py | Removed old backup writer.
swanlab/log/backup/handler.py | Removed old backup handler.
swanlab/log/backup/__init__.py | Removed old backup init.
swanlab/data/utils.py | Consolidated callback imports and updated prompt logic.
swanlab/data/sdk.py | Refactored login/init logic to use RunStore; missing imports need fixes.
swanlab/data/run/main.py | Overhauled run initialization and cleanup to use RunStore.
swanlab/data/run/helper.py | Adjusted operator hook signatures.
swanlab/data/run/exp.py | Refactored experiment class to use RunStore.
swanlab/data/run/__init__.py | Streamlined exports and removed old register helper.
swanlab/data/store.py | Introduced RunStore model and access guards.
swanlab/data/porter/datastore.py | Updated error message in header version check.
swanlab/data/porter/__init__.py | Implemented DataPorter end-to-end backup/sync pipeline.
swanlab/data/callbacker/utils.py | Added utility callbacks for printing lifecycle messages.
swanlab/data/callbacker/offline.py | Refactored offline callback to use DataPorter and RunStore.
swanlab/data/callbacker/local.py | Refactored local callback to use DataPorter and RunStore.
swanlab/data/callbacker/disabled.py | Added disabled-mode callback.
swanlab/data/callbacker/cloud.py | Refactored cloud callback to use DataPorter and new imports.
swanlab/data/callbacker/callback.py | Consolidated callback base class to use DataPorter and RunStore.
swanlab/data/callbacker/__init__.py | Updated exports to match new callback classes.
swanlab/data/__init__.py | Cleaned up legacy imports.
swanlab/core_python/client/__init__.py | Removed rich status, streamlined project mounting logic.
swanlab/core_python/auth/providers/api_key.py | Introduced create_login_info factory and updated imports.
swanlab/core_python/__init__.py | Removed blanket uploader imports.
swanlab/cli/commands/sync/__init__.py | Fixed variable naming in sync CLI command.
swanlab/__init__.py | Added module exports for media types.
Comments suppressed due to low confidence (2)

swanlab/data/sdk.py:74

  • The name auth is not imported in this module; calls to auth.code_login and auth.create_login_info will fail. Please add an import, e.g. from swanlab.core_python import auth.
    login_info = auth.code_login(api_key, save) if api_key else auth.create_login_info(save)

swanlab/data/sdk.py:75

  • The function create_client is not imported in this module; this call will raise a NameError. Please add from swanlab.core_python import create_client.
    create_client(login_info)

@SAKURA-CAT SAKURA-CAT changed the title from Feat/backup proto to Refactor/data-code Jun 26, 2025
@SAKURA-CAT SAKURA-CAT merged commit 66adcaf into main Jun 26, 2025
5 checks passed
@SAKURA-CAT SAKURA-CAT deleted the feat/backup-proto branch June 26, 2025 04:02

2 participants