Crash when get_search_dirs called with sys.path containing non-ascii characters on Windows #16669

@zyf722

Description

Crash Report

On non-English Windows machines, if sys.path contains non-ASCII characters, the get_search_dirs function in modulefinder.py will raise a UnicodeDecodeError.

This appears to happen because the bytes returned by subprocess.check_output are encoded in the system's active code page, while get_search_dirs decodes them as UTF-8.
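
The mismatch can be illustrated without mypy or Windows. The following is a minimal sketch, assuming the active code page is cp936 (GBK); the report does not say which code page was in use, and the path below is hypothetical.

```python
# Hypothetical sys.path entry; the real entry from the report is not shown.
path_entry = "C:\\Users\\测试\\project"

# Assumption: the active code page is cp936 (GBK).
raw = path_entry.encode("gbk")


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or the exception class name if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return type(exc).__name__


# Decoding code-page bytes as UTF-8 fails; decoding with the matching codec works.
utf8_result = try_decode(raw, "utf-8")
gbk_result = try_decode(raw, "gbk")
print(utf8_result)
print(gbk_result == path_entry)
```

The UTF-8 attempt fails because the GBK bytes for the non-ASCII characters are not valid UTF-8 sequences, which is the same class of error shown in the traceback.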

Traceback

Traceback (most recent call last):
  File "mypy\dmypy_server.py", line 236, in serve
  File "mypy\dmypy_server.py", line 285, in run_command
  File "mypy\dmypy_server.py", line 353, in cmd_run
  File "mypy\dmypy_server.py", line 424, in check
  File "mypy\dmypy_server.py", line 463, in initialize_fine_grained
  File "mypy\build.py", line 189, in build
  File "mypy\build.py", line 222, in _build
  File "mypy\modulefinder.py", line 832, in compute_search_paths
  File "mypy\modulefinder.py", line 752, in get_search_dirs
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb1 in position 547: invalid start byte

To Reproduce

  1. Create a Python package under a path that includes non-ASCII characters.
  2. Perform an editable installation using either pip install -e or poetry install, so that the package path is added to sys.path.
  3. Execute mypy, causing the exception.
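
The steps above need a non-English Windows setup. As a rough, platform-independent simulation (not mypy's actual code), the sketch below runs a child interpreter that prints a list containing a hypothetical non-ASCII path. Setting the CPython environment variable PYTHONIOENCODING to gbk stands in for a legacy code page, and setting it to utf-8 shows one possible way to make the child's output match what the parent decodes. The names FAKE_ENTRY, CHILD, and child_output are illustrative only.

```python
import ast
import os
import subprocess
import sys

# Hypothetical non-ASCII search path standing in for an editable-install entry.
FAKE_ENTRY = "C:/proj/测试"

# Child program source; ascii() keeps the command line itself ASCII-only.
CHILD = "print([" + ascii(FAKE_ENTRY) + "])"


def child_output(io_encoding: str) -> bytes:
    """Run the child with its stdout encoding forced via PYTHONIOENCODING."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    return subprocess.check_output([sys.executable, "-c", CHILD], env=env)


# Simulated failure: the child writes GBK bytes, the parent assumes UTF-8.
try:
    child_output("gbk").decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True

# Simulated remedy: the child writes UTF-8, so the parent's decode succeeds.
fixed = ast.literal_eval(child_output("utf-8").decode("utf-8").strip())

print(failed)
print(fixed == [FAKE_ENTRY])
```

Whether forcing the child's output encoding or decoding with the locale's encoding is the better fix for mypy is left open here; the sketch only shows that the two sides must agree.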

Your Environment

  • Mypy version used: mypy 1.7.1
  • Python version used: CPython 3.8
  • Operating system and version: Windows 10
