Training hyperparameters in this repo are defined in train.py, including augmentation settings:
Lines 35 to 54 in df4f25e
# Training hyperparameters
hyp = {'giou': 1.2,  # giou loss gain
       'xy': 4.062,  # xy loss gain
       'wh': 0.1845,  # wh loss gain
       'cls': 15.7,  # cls loss gain
       'cls_pw': 3.67,  # cls BCELoss positive_weight
       'obj': 20.0,  # obj loss gain
       'obj_pw': 1.36,  # obj BCELoss positive_weight
       'iou_t': 0.194,  # iou training threshold
       'lr0': 0.00128,  # initial learning rate
       'lrf': -4.,  # final LambdaLR learning rate = lr0 * (10 ** lrf)
       'momentum': 0.95,  # SGD momentum
       'weight_decay': 0.000201,  # optimizer weight decay
       'hsv_s': 0.8,  # image HSV-Saturation augmentation (fraction)
       'hsv_v': 0.388,  # image HSV-Value augmentation (fraction)
       'degrees': 1.2,  # image rotation (+/- deg)
       'translate': 0.119,  # image translation (+/- fraction)
       'scale': 0.0589,  # image scale (+/- gain)
       'shear': 0.401}  # image shear (+/- deg)
We began with the darknet defaults and then evolved the values using our hyperparameter evolution code:
python3 train.py --data data/coco.data --weights '' --img-size 320 --epochs 1 --batch-size 64 --accumulate 1 --evolve
The process is simple: for each new generation, the result with the highest fitness out of all previous generations is selected for mutation. All hyperparameters are mutated simultaneously, using multiplicative noise drawn from a normal distribution with roughly 20% 1-sigma:
Lines 390 to 396 in df4f25e
# Mutate
init_seeds(seed=int(time.time()))
s = [.15, .15, .15, .15, .15, .15, .15, .15, .15, .00, .05, .20, .20, .20, .20, .20, .20, .20]  # sigmas
for i, k in enumerate(hyp.keys()):
    x = (np.random.randn(1) * s[i] + 1) ** 2.0  # plt.hist(x.ravel(), 300)
    hyp[k] *= float(x)  # vary by sigmas
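For readers who want to experiment outside the repo, below is a minimal, self-contained sketch of this mutation step. The hyp dict is an abbreviated copy of the one above, a single ~20% sigma stands in for the per-key sigma list, and the mutate() helper is a name introduced here for illustration, not a function from the repo:

```python
import time
import numpy as np

# Abbreviated copy of the training hyperparameters above
hyp = {'giou': 1.2, 'xy': 4.062, 'wh': 0.1845, 'cls': 15.7,
       'lr0': 0.00128, 'momentum': 0.95, 'weight_decay': 0.000201}

def mutate(hyp, sigma=0.20):
    # Return a mutated copy of hyp: each value is scaled by a squared-Gaussian
    # factor centred near 1, so multipliers are non-negative and usually small changes.
    np.random.seed(int(time.time()))
    return {k: v * float((np.random.randn() * sigma + 1) ** 2.0) for k, v in hyp.items()}

# One hypothetical generation: mutate the best prior hyperparameters; the candidate
# would then be trained for 1 epoch and its fitness logged before the next generation.
candidate = mutate(hyp)
print(candidate)
```

Because the noise is multiplicative, hyperparameters that differ by orders of magnitude (e.g. lr0 vs. obj) are perturbed by comparable relative amounts.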
Fitness is defined as an equally weighted combination of mAP and F1 at the end of epoch 0, under the assumption that better epoch-0 results correlate with better final results, which may or may not be true.
Lines 605 to 608 in bd92457
def fitness(x):
    # Returns fitness (for use with results.txt or evolve.txt)
    return 0.5 * x[:, 2] + 0.5 * x[:, 3]  # fitness = 0.5 * mAP + 0.5 * F1
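To make the column convention concrete (the indices assume column 2 holds mAP and column 3 holds F1, matching the comment above), here is a small, hypothetical usage example with made-up metric values:

```python
import numpy as np

def fitness(x):
    # Returns fitness (for use with results.txt or evolve.txt)
    return 0.5 * x[:, 2] + 0.5 * x[:, 3]  # fitness = 0.5 * mAP + 0.5 * F1

# Two illustrative result rows laid out as [P, R, mAP, F1]
results = np.array([[0.45, 0.52, 0.30, 0.48],
                    [0.50, 0.55, 0.34, 0.52]])
print(fitness(results))  # -> [0.39 0.43]
```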
An example snapshot of the results is shown here. Fitness is on the y axis (higher is better). The plot is generated with:
from utils.utils import *; plot_evolution_results(hyp)
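If plot_evolution_results is not available, a rough matplotlib equivalent looks like the sketch below. The assumed evolve.txt layout (metrics in the leading columns, one trailing column per hyperparameter in the same order as hyp) is a guess for illustration, not the repo's documented format:

```python
import numpy as np
import matplotlib.pyplot as plt

hyp_keys = ['giou', 'xy', 'wh', 'cls', 'cls_pw', 'obj', 'obj_pw', 'iou_t',
            'lr0', 'lrf', 'momentum', 'weight_decay', 'hsv_s', 'hsv_v',
            'degrees', 'translate', 'scale', 'shear']

x = np.loadtxt('evolve.txt', ndmin=2)        # one row per generation (assumed layout)
fit = 0.5 * x[:, 2] + 0.5 * x[:, 3]          # same weighting as fitness()
for i, k in enumerate(hyp_keys):
    plt.subplot(3, 6, i + 1)
    plt.scatter(x[:, -len(hyp_keys) + i], fit, s=8)  # hyperparameter value vs fitness
    plt.title(k, fontsize=8)
plt.tight_layout()
plt.savefig('evolve.png', dpi=200)
```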