See newer slides in my two-new-algos-sci-ml repo!
slides/ contains files for making presentation slides.
Title: Same versus Other Cross-Validation for comparing models trained on different groups of data
Abstract: Cross-validation is an essential algorithm in any machine learning analysis. Standard K-fold cross-validation is useful for comparing different algorithms on a single data set. We propose a new variant, Same versus Other cross-validation, which can be used to determine the extent to which accurate predictions are possible when training on a different data subset/group (person, image, geographic region, year, etc.). We discuss applications to several benchmark and real-world data sets, including predicting childhood autism, carbon emissions, and presence of objects in images.
See also https://github.com/tdhock/two-new-algos-sci-ml for other slides that include a subset of the figures.
And https://cloud.r-project.org/web/packages/mlr3resampling/vignettes/Newer_resamplers.html which explains how to use the software.
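For example, here is a minimal sketch of the intended usage, on simulated data where the learnable pattern flips between two subsets. This assumes the ResamplingSameOtherSizesCV class and score() helper described in that vignette; exact column names such as train.subsets and test.subset may differ between versions.

library(data.table)
library(mlr3)
set.seed(1)
N <- 200
sim.dt <- data.table(
  group = rep(c("A", "B"), each = N / 2),  # two subsets of the data
  x = runif(N, -1, 1))
sim.dt[, y := factor(fifelse(group == "A", x > 0, x < 0))]  # pattern flips between groups
task <- TaskClassif$new("sim", sim.dt, target = "y")
task$col_roles$subset <- "group"  # values of this column define same/other subsets
task$col_roles$feature <- "x"
so.cv <- mlr3resampling::ResamplingSameOtherSizesCV$new()
bench.result <- benchmark(benchmark_grid(task, lrn("classif.rpart"), so.cv))
score.dt <- mlr3resampling::score(bench.result)
score.dt[, .(mean.error = mean(classif.ce)), by = .(test.subset, train.subsets)]

With these simulated data, training on the same subset should give low error, whereas training on the other subset should give high error, because the pattern is flipped.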
Textbook chapter with an explanation of cross-validation: Introduction to Machine Learning and Neural Networks.
Implementation tutorials:
- When is it useful to train with combined groups? (in R, most recommended) shows how to use my mlr3resampling package.
Older tutorials that show how to implement same/other cross-validation without my mlr3resampling package:
Tutorials about how to run machine learning experiments in parallel:
- Cross-validation experiments on the cluster (in Python)
- The importance of hyper-parameter tuning explains how to use the mlr3batchmark and batchtools R packages to parallelize machine learning experiments.
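For example, here is a minimal sketch of how those two packages fit together, assuming a typical batchtools registry setup (the tutorial's exact arguments may differ):

library(mlr3)
library(batchtools)
bench.grid <- benchmark_grid(
  tsk("spam"),
  lrns(c("classif.rpart", "classif.featureless")),
  rsmp("cv", folds = 3))
reg <- makeExperimentRegistry("registry_dir", seed = 1)
mlr3batchmark::batchmark(bench.grid, reg = reg)  # one batchtools job per resampling iteration
submitJobs(reg = reg)   # runs via whatever cluster functions the registry is configured with (e.g. SLURM)
waitForJobs(reg = reg)
bench.result <- mlr3batchmark::reduceResultsBatchmark(reg = reg)  # back to a BenchmarkResult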
Introductions to cluster computing on NAU monsoon:
conv_images.R computes results, then conv_images_10fold_figure.R makes a figure (in the repo) which shows there is a slight improvement (all-same) for the convolutional neural network, when the two subsets are the two digit image data sets.
conv_images_figure.R makes a figure comparing prediction error rates of five learning algorithms for the MNIST_EMNIST_rot data set, which has two subsets, both of which are images of digits that look alike. The figure shows that training on the other subset is not sufficient to get the same level of prediction error, even for a convolutional network (1.8% same versus 8.9% other when predicting MNIST, for example). These data suggest that learning is more difficult than we may expect, even with a convolutional neural network, which is assumed to have good generalization/learning ability. Actually, this figure is slightly misleading, because a ReLU activation was forgotten between the last two linear layers; this was fixed in the more recent figures, conv_images_10fold_*.
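For reference, here is a hypothetical R torch sketch of where that activation belongs (the actual architecture in conv_images.R may differ):

library(torch)
net <- nn_sequential(
  nn_conv2d(in_channels = 1, out_channels = 20, kernel_size = 5),
  nn_relu(),
  nn_max_pool2d(kernel_size = 2),
  nn_flatten(),
  nn_linear(20 * 12 * 12, 100),
  nn_relu(),  # the activation that was forgotten between the last two linear layers
  nn_linear(100, 10))
net(torch_randn(32, 1, 28, 28))$shape  # batch of 32 MNIST-sized images -> 32 x 10 scores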
data_Classif_simulation.R makes a figure of the simulated data (in the repo).
data-meta.R was updated to handle more groups; it outputs:
                  data.name memory.kb   rows n.groups small_group small_N large_group large_N features classes min.rows
                     <char>     <int>  <int>    <int>      <char>   <int>      <char>   <int>    <int>   <int>    <int>
 1:                   vowel        92    990        2        test     462       train     528       10      11       42
 2:                waveform       145    800        2       train     300        test     500       21       3       94
 3: CanadaFires_downSampled       353   1491        4         306     287         395     450       46       2      138
 4:                aztrees3       587   5956        3          NE    1464           S    2929       21       2       55
 5:                aztrees4       587   5956        4          SW     497          SE    2432       21       2       55
 6:         CanadaFires_all      1122   4827        4         306     364         326    2538       46       2      140
 7:                    spam      2078   4601        2        test    1536       train    3065       57       2      595
 8:                 zipUSPS     18752   9298        2        test    2007       train    7291      256      10      147
 9:             NSCH_autism     66242  46010        2        2019   18202        2020   27808      364       2      546
10:                  EMNIST    429712  70000        2        test   10000       train   60000      784      10     1000
11:            FashionMNIST    429712  70000        2        test   10000       train   60000      784      10     1000
12:                  KMNIST    429712  70000        2        test   10000       train   60000      784      10     1000
13:                   MNIST    429712  70000        2        test   10000       train   60000      784      10      892
14:                  QMNIST    736548 120000        2        test   60000       train   60000      784      10     5421
15:                 CIFAR10   1441256  60000        2        test   10000       train   50000     3072      10     1000
16:                   STL10   2813121  13000        2       train    5000        test    8000    27648      10      500
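As a rough illustration, one row of the table above could be computed by a helper like the one below (hypothetical; it assumes min.rows means the smallest group-by-class count, and data-meta.R itself may differ):

library(data.table)
meta_row <- function(data.name, feature.dt, label.vec, group.vec){
  count.dt <- data.table(group = group.vec)[, .(N = .N), by = group][order(N)]
  data.table(
    data.name,
    memory.kb = as.integer(utils::object.size(feature.dt) / 1024),
    rows = nrow(feature.dt),
    n.groups = nrow(count.dt),
    small_group = count.dt[1, group], small_N = count.dt[1, N],
    large_group = count.dt[.N, group], large_N = count.dt[.N, N],
    features = ncol(feature.dt),
    classes = length(unique(label.vec)),
    min.rows = min(table(group.vec, label.vec)))
}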
data_Classif_canada_fires.R makes data_Classif/CanadaFires*csv and the following contingency table:
> canada.fires[, table(classe2, classe3)]
classe3
classe2 charred green other road scorched shadow water
bare 0 0 66 0 0 0 0
bog 0 0 41 0 0 0 0
charred 288 0 0 0 0 0 0
green 0 176 0 0 0 0 0
Lichen 0 0 47 0 0 0 0
lowgreen 0 124 0 0 0 0 0
Mortality 0 0 63 0 0 0 0
road 0 0 0 169 0 0 0
scorched 0 0 0 0 300 0 0
shadow(affected) 0 0 0 0 0 91 0
shadow(green) 0 0 0 0 0 81 0
water 0 0 0 0 0 0 217
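The classe2 to classe3 grouping implied by the table above can be written as a named mapping, sketched below (data_Classif_canada_fires.R may implement it differently):

classe3.map <- c(
  bare = "other", bog = "other", Lichen = "other", Mortality = "other",
  charred = "charred", green = "green", lowgreen = "green",
  road = "road", scorched = "scorched",
  "shadow(affected)" = "shadow", "shadow(green)" = "shadow",
  water = "water")
canada.fires[, classe3 := classe3.map[paste(classe2)]]  # paste() converts factor to character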
data_Classif_batchmark_registry.R reads the result of data_Classif_batchmark.R, writes data_Classif_batchmark_registry.csv, and creates the visualizations (in the repo).
data-meta.R creates data-meta.csv:
       data.name memory.kb test%   rows features classes min.rows.set.class
          <char>     <int> <int>  <int>    <int>   <int>              <int>
 1:        vowel        92    46    990       10      11                 42
 2:     waveform       145    62    800       21       3                 94
 3:         khan      2003    28     88     2308       4                  3
 4:         spam      2078    33   4601       57       2                595
 5:      zipUSPS     18752    21   9298      256      10                147
 6:     14cancer     22546    27    198    16063      14                  2
 7:       EMNIST    429712    14  70000      784      10               1000
 8: FashionMNIST    429712    14  70000      784      10               1000
 9:       KMNIST    429712    14  70000      784      10               1000
10:        MNIST    429712    14  70000      784      10                892
11:       QMNIST    736548    50 120000      784      10               5421
12:      CIFAR10   1441256    16  60000     3072      10               1000
13:        STL10   2813121    61  13000    27648      10                500
- Is the iid assumption verified in real data?
- train/test data sets
- mlbench? no explicit train/test column, see mlbench.R
- mlr3data https://mlr3data.mlr-org.com/ TODO
- caret https://topepo.github.io/caret/data-sets.html segmentationData has a Case column with values Train and Test (see the sketch after this list). TODO
- tidymodels https://modeldata.tidymodels.org/reference/index.html TODO
- ESL2 data processed in data_Classif_esl2.R
- list of image classification data sets: https://pytorch.org/vision/stable/datasets.html
- pages like https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST often have a split arg.
- https://github.com/pytorch/vision/tree/main/torchvision/datasets is source code.
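For example, the caret train/test indicator mentioned in the list above can be inspected as follows (assuming the caret package is installed):

data(segmentationData, package = "caret")
table(segmentationData$Case)  # counts of Test and Train rows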
Below we see the 11 torchvision data sets whose constructors have a train arg.
>>> import torch
>>> torch.__version__
'1.13.0+cpu'
>>> import torchvision.datasets
>>> torchvision.__version__
'0.14.0+cpu'
>>> for data_name in dir(torchvision.datasets):
...     data_class = getattr(torchvision.datasets, data_name)
...     ann_dict = getattr(data_class.__init__, "__annotations__", {})
...     if "train" in ann_dict:
...         print(data_name)
...
CIFAR10
CIFAR100
FashionMNIST
HMDB51
KMNIST
Kitti
MNIST
PhotoTour
QMNIST
UCF101
USPS
Newer versions of torchvision show the same data sets.
Why doesn't Caltech101/256 show up above? They have no split/train arg.
Why doesn't CELEBA show up? It does have a split arg, but no train arg, and the loop above only checks for train.
The split arg can be train/test/extra: https://pytorch.org/vision/stable/generated/torchvision.datasets.SVHN.html#torchvision.datasets.SVHN
Some data sets have both train and split args: https://pytorch.org/vision/stable/generated/torchvision.datasets.EMNIST.html#torchvision.datasets.EMNIST
LSUN has classes instead of split: https://pytorch.org/vision/stable/generated/torchvision.datasets.LSUN.html#torchvision.datasets.LSUN
Exceptions / not parsed correctly:
{'STL10': ({'unlabeled', 'test', 'train+unlabeled', 'train'}, " One of {'train', 'test', 'unlabeled', 'train+unlabeled'}.\n Accordingly, dataset is selected.\n")}
{'Cityscapes': (['fine', 'train', 'test', 'val', 'train', 'train_extra', 'val'], ' The image split to use, ``train``, ``test`` or ``val`` if mode="fine"\n otherwise ``train``, ``train_extra`` or ``val``\n')}
{'EMNIST': (['byclass', 'bymerge', 'balanced', 'letters', 'digits', 'mnist'], ' The dataset has 6 different splits: ``byclass``, ``bymerge``,\n ``balanced``, ``letters``, ``digits`` and ``mnist``. This argument specifies\n which one to use.\n')}
{'LFWPairs': (['train', 'test', '10fold', '10fold'], ' The image split to use. Can be one of ``train``, ``test``,\n ``10fold``. Defaults to ``10fold``.\n')}
{'MovingMNIST': (['train', 'test', 'None', 'split=None'], ' The dataset split, supports ``None`` (default), ``"train"`` and ``"test"``.\n If ``split=None``, the full data is returned.\n')}