This code needs to be refactored. Specifically, both the vLLM and SGLang branches require changes. The `__enter__` method should not be called directly:

```
def wake_up(self, *args, **kwargs):
    """Load model weights and build kv cache."""
    if not self.is_sleep:
        return
    self.sharding_manager.__enter__()  # pylint: disable=C2801
    self.is_sleep = False
```

cc @wuxibin89
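
One possible shape for the refactor, as a rough sketch rather than a proposed patch: let `contextlib.ExitStack` enter the sharding manager so the dunder method is never called by hand, and keep the stack around so the matching exit can run later. `DummyShardingManager`, `Rollout`, `_stack`, and the `sleep` method are hypothetical stand-ins for illustration; the real classes and the real sleep path may look different.

```python
from contextlib import ExitStack


class DummyShardingManager:
    """Hypothetical stand-in for the real sharding manager (assumption)."""

    def __init__(self):
        self.loaded = False

    def __enter__(self):
        # In the real code this would load weights and build the kv cache.
        self.loaded = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # In the real code this would offload weights and free the kv cache.
        self.loaded = False
        return False


class Rollout:
    """Hypothetical owner of the sharding manager (assumption)."""

    def __init__(self, sharding_manager):
        self.sharding_manager = sharding_manager
        self.is_sleep = True
        self._stack = None  # holds the entered context while awake

    def wake_up(self, *args, **kwargs):
        """Load model weights and build kv cache."""
        if not self.is_sleep:
            return
        stack = ExitStack()
        # enter_context() invokes __enter__ and registers __exit__ for later.
        stack.enter_context(self.sharding_manager)
        self._stack = stack
        self.is_sleep = False

    def sleep(self, *args, **kwargs):
        """Release what wake_up acquired (assumed counterpart)."""
        if self.is_sleep:
            return
        # close() runs the registered __exit__ with no exception info.
        self._stack.close()
        self._stack = None
        self.is_sleep = True
```

With this shape the pylint suppression for C2801 is no longer needed, and enter and exit stay paired even though they happen in separate method calls. An alternative would be to give the sharding manager explicit public methods and have `__enter__`/`__exit__` delegate to them.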