[Bug] tools/split_dataset_list.py fails when splitting a dataset #2070

@GreatV

Description

Environment

  1. PaddleSeg version: PaddleSeg release/2.5
  2. PaddlePaddle version: PaddlePaddle 2.2
  3. Operating system: Linux
  4. Python version: Python 3.7

Command used

export dataset_root=/home/real_data
export images_dir_name=/home/real_data/images
export labels_dir_name=/home/real_data/annotations
python tools/split_dataset_list.py ${dataset_root} ${images_dir_name} ${labels_dir_name} \
    --split 0.7 0.2 0.1

Error message

Traceback (most recent call last):
  File "tools/split_dataset_list.py", line 151, in <module>
    generate_list(args)
  File "tools/split_dataset_list.py", line 82, in generate_list
    raise ValueError("划分比例之和必须为1")
ValueError: 划分比例之和必须为1

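Root cause

The check that raises this error (the message reads "the split ratios must sum to 1") compares a floating-point sum using !=:

if sum(args.split) != 1.0:
    raise ValueError("划分比例之和必须为1")

Floating-point values should not be compared directly for equality or inequality. Decimal fractions such as 0.7, 0.2 and 0.1 have no exact binary representation, so their sum carries a small rounding error: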

Python 3.7.5 (default, Dec  9 2021, 17:04:37) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 0.7 + 0.2 + 0.1
0.9999999999999999
>>> 
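To show how small the discrepancy is, here is a minimal sketch that repeats the addition from the session above and compares the gap against machine epsilon. The variable names are illustrative only and do not come from tools/split_dataset_list.py.

```python
import sys

# Same left-to-right addition as in the REPL session above.
total = 0.7 + 0.2 + 0.1
gap = 1.0 - total

print(total == 1.0)                  # the exact comparison fails
print(gap)                           # size of the rounding error
print(gap < sys.float_info.epsilon)  # the error is below machine epsilon
```

The error is far smaller than any meaningful difference between split ratios, so a tolerance-based comparison is enough to tell a rounding artifact from a wrong set of ratios.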

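Suggested fix

Add or use a tolerance-based check similar to the following:

def float_equal(a: float, b: float) -> bool:
    # Treat two floats as equal when they differ by at most allowed_error.
    allowed_error = 1e-8
    return abs(a - b) <= allowed_error


if not float_equal(sum(args.split), 1.0):
    raise ValueError("划分比例之和必须为1")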
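As an alternative to a hand-written helper, the standard library's math.isclose can express the same check. The sketch below is only an illustration: validate_split and its abs_tol default are hypothetical names chosen for this example, not part of PaddleSeg, and the tolerance of 1e-8 simply mirrors the value suggested above.

```python
import math
from typing import Sequence


def validate_split(split: Sequence[float], abs_tol: float = 1e-8) -> None:
    """Raise ValueError unless the ratios sum to 1 within abs_tol.

    Hypothetical helper for illustration; not part of PaddleSeg.
    """
    total = sum(split)
    # rel_tol=0.0 makes this a purely absolute comparison, matching
    # the abs(a - b) <= allowed_error check proposed in this issue.
    if not math.isclose(total, 1.0, rel_tol=0.0, abs_tol=abs_tol):
        raise ValueError(f"split ratios must sum to 1, got {total!r}")


validate_split([0.7, 0.2, 0.1])  # accepted despite the rounding error
```

With this check, ratios such as 0.7 0.2 0.1 pass, while a genuinely wrong set such as 0.7 0.2 0.2 is still rejected.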

Labels

bug (Something isn't working)
