XGBoost 1.1.0 SNAPSHOT gpu_hist still numerically unstable #5632

@Zethson

Description

Dear everyone,

According to #5023, a model trained with gpu_hist should be reproducible with 1.1.0.

The version that I was using for these experiments is:
https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/xgboost-1.1.0_SNAPSHOT%2B86beb68ce8ffa73a90def6a9862f0e8a917b58c3-py2.py3-none-manylinux1_x86_64.whl
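To make sure this exact wheel is the one being picked up, a quick check like the following is enough:

# Sanity check that the nightly snapshot above is the build in use
import xgboost as xgb
print(xgb.__version__)  # should report the 1.1.0 snapshot version string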

The code that I am using is:

#!/home/user/miniconda/envs/xgboost-1.0.2-cuda-10.1/bin/python
import click
import xgboost as xgb
import numpy as np
from sklearn.datasets import fetch_covtype, load_boston
from sklearn.model_selection import train_test_split
import time
import random
import os


@click.command()
@click.option('--seed', type=int, default=0)
@click.option('--epochs', type=int, default=25)
@click.option('--no-cuda', type=bool, default=False)
@click.option('--dataset', type=str, default='covertype')
def train(seed, epochs, no_cuda, dataset):
    # Fetch dataset using sklearn
    if dataset == 'covertype':
        dataset = fetch_covtype()
    param = {
        'objective': 'multi:softmax',
        'num_class': 8,
        # 'single_precision_histogram': True
    }

    X = dataset.data
    y = dataset.target

    # Create 0.75/0.25 train/test split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, train_size=0.75, random_state=0)

    # Set random seeds
    random_seed(seed, param)
    param['subsample'] = 0.5
    param['colsample_bytree'] = 0.5
    param['colsample_bylevel'] = 0.5

    # Convert input data from numpy to XGBoost format
    dtrain = xgb.DMatrix(X_train, label=y_train)
    dtest = xgb.DMatrix(X_test, label=y_test)

    # Set CPU or GPU as training device
    if no_cuda:
        param['tree_method'] = 'hist'
    else:
        param['tree_method'] = 'gpu_hist'

    # Train on the chosen device
    results = {}
    gpu_runtime = time.time()
    xgb.train(param, dtrain, epochs, evals=[(dtest, 'test')], evals_result=results)
    if not no_cuda:
        print(f'GPU Run Time: {str(time.time() - gpu_runtime)} seconds')
    else:
        print(f'CPU Run Time: {str(time.time() - gpu_runtime)} seconds')

def random_seed(seed, param):
    os.environ['PYTHONHASHSEED'] = str(seed) # Python general
    np.random.seed(seed)
    random.seed(seed) # Python random
    param['seed'] = seed


if __name__ == '__main__':
    train()

When training on the covertype dataset for 1000 epochs (boosting rounds), I get the following results over five repeated runs:

  1. [999] test-merror:0.05985
    GPU Run Time: 470.45096254348755 seconds
  2. [999] test-merror:0.05978
    GPU Run Time: 493.8201196193695 seconds
  3. [999] test-merror:0.05894
    GPU Run Time: 484.21098017692566 seconds
  4. [999] test-merror:0.05962
    GPU Run Time: 477.36872577667236 seconds
  5. [999] test-merror:0.06025
    GPU Run Time: 487.09354853630066 seconds

Is this still to be expected, or is there a reason I do not get perfectly reproducible results?
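For completeness, a minimal way to compare two runs directly, independent of the CLI wrapper above, would be something like this (a sketch only; the run_once helper is just illustrative):

# Train twice with identical parameters and seed, then compare the
# recorded evaluation histories.
import numpy as np
import xgboost as xgb
from sklearn.datasets import fetch_covtype
from sklearn.model_selection import train_test_split

data = fetch_covtype()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

param = {
    'objective': 'multi:softmax',
    'num_class': 8,
    'tree_method': 'gpu_hist',
    'subsample': 0.5,
    'colsample_bytree': 0.5,
    'colsample_bylevel': 0.5,
    'seed': 0,
}

def run_once():
    results = {}
    xgb.train(param, dtrain, num_boost_round=25,
              evals=[(dtest, 'test')], evals_result=results)
    return results['test']['merror']

first, second = run_once(), run_once()
print('identical histories:', np.allclose(first, second))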

Thank you very much!
