Description
Reproduction
entropy_threshold = torch.quantile(entropies.flatten().float(), self.token_entropy_percentile_threshold) may be affected by padding positions when the padding is long, because the quantile is computed over every position, padded or not. For example, for a batch of shape 3 x 266:
completion_mask = tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]], device='cuda:0', dtype=torch.int32)
Here 0 marks a padding position.
entropies = tensor([[1.2702e+00, 1.3205e+00, 6.2836e-01, 2.1991e+00, 2.2179e+00, 1.6478e+00,
6.3501e-01, 9.1053e-01, 4.8306e-02, 1.0783e+00, 7.0509e-02, 2.1077e-01,
1.6892e-01, 1.1447e+00, 2.6262e+00, 3.2486e+00, 2.0156e+00, 2.7739e+00,
1.2785e+00, 1.1311e+00, 1.0281e-01, 4.9198e-01, 1.8831e-02, 2.3387e-01,
7.2308e-01, 2.4297e+00, 4.5663e-03, 1.8507e+00, 1.0836e+00, 3.1988e+00,
3.2380e-01, 1.8277e+00, 1.8826e-01, 1.4109e+00, 8.0878e-03, 4.3774e-01,
1.2556e+00, 2.2839e+00, 2.3225e+00, 2.0874e+00, 9.0465e-02, 3.5494e-01,
1.1227e+00, 2.3471e+00, 2.0810e+00, 1.2473e+00, 1.4121e+00, 1.5958e+00,
4.4065e-01, 1.1179e+00, 2.4339e+00, 2.6808e+00, 1.5477e+00, 1.2992e+00,
1.4326e+00, 3.8487e-02, 9.4686e-01, 6.5933e-01, 2.0350e+00, 1.2149e+00,
2.1054e+00, 5.6843e-01, 1.8403e+00, 7.4225e-01, 1.5231e+00, 9.5386e-01,
2.2875e+00, 1.2475e+00, 1.0944e-04, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00],
[1.3470e-01, 2.4224e-03, 8.3849e-04, 1.0720e-04, 1.1099e-05, 1.9082e-04,
6.6665e-03, 4.5110e-03, 2.2445e-04, 7.5642e-05, 1.0496e-03, 3.5285e-05,
8.8276e-04, 5.9501e-04, 1.3655e-04, 5.4653e-05, 3.5949e-04, 3.4237e-04,
7.3795e-06, 6.3655e-06, 4.7269e-06, 1.7719e-02, 1.7946e-02, 1.8263e-03,
4.6334e-04, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00],
[3.4264e+00, 1.1146e+00, 3.7437e+00, 2.6915e+00, 2.6340e-02, 3.4455e-01,
1.4153e-01, 6.3256e-03, 8.5822e-01, 3.9944e+00, 3.9376e+00, 4.5879e+00,
3.5924e+00, 5.2182e+00, 1.5317e+00, 5.1628e+00, 1.6261e+00, 1.2072e+00,
5.1156e+00, 3.9282e+00, 1.0089e+00, 9.9207e-02, 1.7462e+00, 5.0760e+00,
2.2809e+00, 7.6361e-02, 4.2759e+00, 4.2620e+00, 1.1679e+00, 3.1164e+00,
7.1298e-01, 4.8609e+00, 2.0440e+00, 1.3066e+00, 5.3272e-02, 4.2136e-02,
5.4197e+00, 3.9152e+00, 3.6927e+00, 8.0015e-01, 5.1841e+00, 2.5229e+00,
4.5350e-01, 5.4434e+00, 2.5333e+00, 1.2548e+00, 1.2609e+00, 2.9604e+00,
3.8737e+00, 5.1873e+00, 3.7877e+00, 4.7049e-01, 1.7894e+00, 4.3747e+00,
3.1842e+00, 7.1943e-01, 2.6951e-01, 7.1841e-01, 4.7726e+00, 2.7275e+00,
5.0485e+00, 4.2020e+00, 4.1917e-01, 2.3996e+00, 6.2737e+00, 4.7447e+00,
5.5378e+00, 2.1280e+00, 3.0073e+00, 2.8357e-01, 3.4029e+00, 2.9456e+00,
4.9045e-01, 1.8688e+00, 1.0355e+00, 1.3309e+00, 3.6269e+00, 3.5338e+00,
1.1587e+00, 4.0486e+00, 3.8523e+00, 2.9897e+00, 3.5559e+00, 4.4778e+00,
1.6323e+00, 4.0981e+00, 1.6340e+00, 4.3732e-01, 1.1709e-04, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00, 4.9226e+00,
4.9226e+00, 4.9226e+00]], device='cuda:0', grad_fn=)
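Here 4.9226e+00 is the entropy value at the padding positions.
More than three quarters of the 798 positions in this batch are padding, so the computed threshold is the padding value itself: entropy_threshold = tensor(4.9226, device='cuda:0', grad_fn=)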
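One possible way to avoid this is to compute the quantile only over non-padding positions. The snippet below is a minimal sketch on a small made-up batch, not the library's implementation: the helper name `masked_entropy_threshold` and the toy values are assumptions for illustration, and only `torch.quantile` and boolean indexing are used.

```python
import torch


def masked_entropy_threshold(entropies, completion_mask, percentile):
    """Quantile of entropies over non-padding positions only (hypothetical helper).

    Assumes completion_mask has the same shape as entropies, with 1 for real
    tokens and 0 for padding, and that at least one position is unmasked
    (torch.quantile raises an error on an empty tensor).
    """
    valid = entropies[completion_mask.bool()].float()  # 1-D tensor of real tokens
    return torch.quantile(valid, percentile)


# Toy batch: 4.9226 stands in for the entropy reported at padding positions.
entropies = torch.tensor(
    [[0.1, 0.5, 0.9, 4.9226, 4.9226, 4.9226],
     [0.2, 4.9226, 4.9226, 4.9226, 4.9226, 4.9226]]
)
completion_mask = torch.tensor(
    [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 0, 0, 0]],
    dtype=torch.int32,
)

# Quantile over all positions: dominated by the padding value.
naive = torch.quantile(entropies.flatten().float(), 0.8)
# Quantile over real tokens only.
masked = masked_entropy_threshold(entropies, completion_mask, 0.8)

print(naive.item(), masked.item())
```

In this toy batch the unmasked quantile equals the padding entropy, while the masked version stays within the range of the real tokens' entropies.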
System Info
trl env
Checklist
- I have checked that my issue isn't already filed (see open issues)
- I have included my system information
- Any code provided is minimal, complete, and reproducible (more on MREs)
- Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
- Any traceback provided is complete