HYPERPARAMETER EVOLUTION #392

@glenn-jocher

Training hyperparameters in this repo are defined in train.py, including augmentation settings:

yolov3/train.py

Lines 35 to 54 in df4f25e

# Training hyperparameters f
hyp = {'giou': 1.2,  # giou loss gain
       'xy': 4.062,  # xy loss gain
       'wh': 0.1845,  # wh loss gain
       'cls': 15.7,  # cls loss gain
       'cls_pw': 3.67,  # cls BCELoss positive_weight
       'obj': 20.0,  # obj loss gain
       'obj_pw': 1.36,  # obj BCELoss positive_weight
       'iou_t': 0.194,  # iou training threshold
       'lr0': 0.00128,  # initial learning rate
       'lrf': -4.,  # final LambdaLR learning rate = lr0 * (10 ** lrf)
       'momentum': 0.95,  # SGD momentum
       'weight_decay': 0.000201,  # optimizer weight decay
       'hsv_s': 0.8,  # image HSV-Saturation augmentation (fraction)
       'hsv_v': 0.388,  # image HSV-Value augmentation (fraction)
       'degrees': 1.2,  # image rotation (+/- deg)
       'translate': 0.119,  # image translation (+/- fraction)
       'scale': 0.0589,  # image scale (+/- gain)
       'shear': 0.401}  # image shear (+/- deg)
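
As a quick check of the 'lrf' comment above, the final learning rate implied by these values can be worked out directly. This is only the arithmetic from that comment; the shape of the schedule between lr0 and the final value is not shown here and is not assumed.

```python
# Values copied from the hyp dict above
lr0 = 0.00128  # initial learning rate
lrf = -4.0     # exponent for the final learning rate

# final LambdaLR learning rate = lr0 * (10 ** lrf), per the comment in hyp
final_lr = lr0 * (10 ** lrf)
print(f"final learning rate: {final_lr:.3e}")
```

With lrf = -4 the learning rate ends four orders of magnitude below lr0.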

We began with the darknet defaults and then evolved the values using our hyperparameter evolution code:

python3 train.py --data data/coco.data --weights '' --img-size 320 --epochs 1 --batch-size 64 --accumulate 1 --evolve

The process is simple: for each new generation, the prior generation with the highest fitness (out of all previous generations) is selected for mutation. All parameters are mutated simultaneously using normally distributed noise with a 1-sigma of about 20%:

yolov3/train.py

Lines 390 to 396 in df4f25e

# Mutate
init_seeds(seed=int(time.time()))
s = [.15, .15, .15, .15, .15, .15, .15, .15, .15, .00, .05, .20, .20, .20, .20, .20, .20, .20]  # sigmas
for i, k in enumerate(hyp.keys()):
    x = (np.random.randn(1) * s[i] + 1) ** 2.0  # plt.hist(x.ravel(), 300)
    hyp[k] *= float(x)  # vary by sigmas
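
To get a feel for what this mutation does, here is a minimal standalone sketch of the same formula. The `mutate` helper, the three-entry `parent` dict, and the use of `np.random.default_rng` with a fixed seed are assumptions made for illustration; the code above mutates `hyp` in place and seeds from the clock.

```python
import numpy as np

def mutate(hyp, sigmas, rng):
    """Return a mutated copy of hyp; each value is scaled by (N(0, 1) * s + 1) ** 2."""
    mutated = {}
    for (key, value), s in zip(hyp.items(), sigmas):
        factor = (rng.standard_normal() * s + 1.0) ** 2.0  # same formula as above
        mutated[key] = value * float(factor)
    return mutated

rng = np.random.default_rng(0)  # fixed seed so the demo is repeatable

# Three entries taken from hyp, paired with their sigmas from the list above
parent = {'lr0': 0.00128, 'lrf': -4.0, 'momentum': 0.95}
sigmas = [0.15, 0.00, 0.05]
child = mutate(parent, sigmas, rng)
print(child)

# Empirical spread of the multiplier for s = 0.20
samples = (rng.standard_normal(100_000) * 0.20 + 1.0) ** 2.0
print(samples.mean(), samples.std())
```

Two properties follow from the formula itself. Because the factor is squared it is never negative, so a parameter cannot change sign, and a sigma of 0 (as for 'lrf') leaves the value untouched. The squaring also widens the spread: for small s the standard deviation of the multiplier is roughly 2s.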

Fitness is defined as a weighted combination of mAP and F1 at the end of epoch 0, under the assumption that better epoch 0 results correlate with better final results, which may or may not be true.

yolov3/utils/utils.py

Lines 605 to 608 in bd92457

def fitness(x):
    # Returns fitness (for use with results.txt or evolve.txt)
    return 0.5 * x[:, 2] + 0.5 * x[:, 3]  # fitness = 0.5 * mAP + 0.5 * F1
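
The parent-selection step described earlier can be sketched with this fitness function. The `results` array below is made-up illustrative data, not real training output; only columns 2 (mAP) and 3 (F1) matter to `fitness`, and the meaning of the other columns is not assumed.

```python
import numpy as np

def fitness(x):
    # fitness = 0.5 * mAP + 0.5 * F1, as defined above
    return 0.5 * x[:, 2] + 0.5 * x[:, 3]

# Hypothetical results, one row per generation.
# Columns 0 and 1 are placeholders; column 2 is mAP, column 3 is F1.
results = np.array([
    [0.10, 0.20, 0.05, 0.08],
    [0.12, 0.25, 0.09, 0.11],
    [0.11, 0.22, 0.07, 0.14],
])

scores = fitness(results)
best = int(np.argmax(scores))  # generation with the highest fitness so far
print(f"best generation: {best}, fitness: {scores[best]:.3f}")
```

The row picked here would be the one whose hyperparameters are mutated for the next generation.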

An example snapshot of the results is shown here. Fitness is on the y-axis (higher is better).
from utils.utils import *; plot_evolution_results(hyp)
[Image: evolve]
