train - Train deep learning networks for neuroimaging classification

This task enables the training of a convolutional neural network (CNN) classifier using different formats of inputs (whole 3D images, 3D patches or 2D slices), as defined in [Wen et al., 2020]. It mainly relies on the PyTorch deep learning library [Paszke et al., 2019].

Prerequisites

You need to execute the clinicadl tsvtool getlabels and clinicadl tsvtool {split|kfold} commands prior to running this task to obtain the correct TSV file organization. Moreover, a CAPS folder obtained by running the t1-linear pipeline of ClinicaDL is required.
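
A minimal sketch of these preparatory steps is shown below. The positional arguments are illustrative placeholders, as their exact names and order may vary between ClinicaDL versions; check clinicadl tsvtool getlabels --help and clinicadl tsvtool kfold --help.

# Hedged sketch — argument names are illustrative placeholders:
clinicadl tsvtool getlabels <merged_tsv> <missing_mods_directory> <results_directory>
clinicadl tsvtool kfold <formatted_data_directory> --n_splits 5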

Running the task

The training task can be run with the following command line:

clinicadl train <mode> <network_type> <caps_directory> \
                <preprocessing> <tsv_path> <output_directory> <architecture>
where the mandatory arguments are (see the example after this list):

  • mode (str) is the type of input used. Must be chosen among image, patch, roi and slice.
  • network_type (str) is the type of network used. The available options depend on the chosen mode; at most, it can be chosen among autoencoder, cnn and multicnn.
  • caps_directory (str) is the input folder containing the neuroimaging data in a CAPS hierarchy.
  • preprocessing (str) corresponds to the preprocessing pipeline whose outputs will be used for training. The current version only supports t1-linear; t1-extensive will be implemented in future versions of clinicadl.
  • tsv_path (str) is the input folder of a TSV file tree generated by clinicadl tsvtool {split|kfold}.
  • output_directory (str) is the folder where the results are stored.
  • architecture (str) is the name of the architecture used (e.g. Conv5_FC3). It must correspond to a class that inherits from nn.Module imported in tools/deep_learning/models/__init__.py.
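
For illustration, here is a possible invocation (a minimal sketch: the CAPS, TSV and output paths are placeholders, and Conv5_FC3 is the example architecture mentioned above):

# Hedged example — directory paths are placeholders:
clinicadl train image cnn /path/to/caps t1-linear \
                /path/to/tsv_files /path/to/results Conv5_FC3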

Options shared for all values of mode are organized in groups:

  • Computational resources
    • --use_cpu (bool) forces the use of the CPU. Default behaviour is to try to use a GPU and to raise an error if none is found.
    • --nproc (int) is the number of workers used by the DataLoader. Default value: 2.
    • --batch_size (int) is the size of the batch used in the DataLoader. Default value: 2.
    • --evaluation_steps (int) gives the number of iterations between two evaluations performed within an epoch. By default, an evaluation is only performed at the end of each epoch.
  • Data management
    • --diagnoses (list of str) is the list of the labels that will be used for training. These labels must be chosen from {AD,CN,MCI,sMCI,pMCI}. Default will use AD and CN labels.
    • --baseline (bool) is a flag to load only _baseline.tsv files instead of .tsv files comprising all the sessions. Default: False.
    • --unnormalize (bool) is a flag to disable min-max normalization that is performed by default. Default: False.
    • --data_augmentation (list of str) is the list of data augmentation transforms applied to the training data. Must be chosen in [None, Noise, Erasing, CropPad, Smoothing]. Default: no data augmentation is applied.
    • --sampler (str) is the sampler used on the training set. It must be chosen in [random, weighted]. weighted will give a stronger weight to underrepresented classes. Default: random.
  • Cross-validation arguments
    • --n_splits (int) is the number of splits k to load in the case of a k-fold cross-validation. Default will load a single split.
    • --split (list of int) is a subset of folds that will be used for training. By default all splits available are used.
  • Optimization parameters
    • --epochs (int) is the maximum number of epochs. Default: 20.
    • --learning_rate (float) is the learning rate used to perform weight update. Default: 1e-4.
    • --weight_decay (float) is the weight decay used by the Adam optimizer. Default: 1e-4.
    • --dropout (float) is the rate of dropout applied in dropout layers. Default: 0.
    • --patience (int) is the number of epochs for early stopping patience. Default: 0.
    • --tolerance (float) is the value used for early stopping tolerance. Default: 0.
    • --accumulation_steps (int) gives the number of iterations during which gradients are accumulated before performing the weight update. This virtually increases the batch size (see the example after this list). Default: 1.
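
A hedged illustration of how these options combine is given below. The paths are placeholders, and the arithmetic follows the definition above: with --batch_size 2 and --accumulation_steps 8, weights are updated every 8 iterations, giving an effective batch size of 2 x 8 = 16.

# Hedged example — directory paths are placeholders:
clinicadl train image cnn /path/to/caps t1-linear \
                /path/to/tsv_files /path/to/results Conv5_FC3 \
                --batch_size 2 --accumulation_steps 8 --sampler weighted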

Specific options

Other options are highly dependent on the input and the type of network used. Please refer to the corresponding sections for more information.

Tip

Typing clinicadl train {image|patch|roi|slice} --help will show you the networks that are available for training in this category.

Outputs

At the first level of the file system, outputs are identical regardless of the mode and network_type. Below is an example of the output file system for a network trained with a 5-fold cross-validation, where each fold defines a train and a validation set.

results
├── commandline.json
├── environment.txt
├── fold-0
├── fold-1
├── fold-2
├── fold-3
└── fold-4

where:

  • commandline.json is a file containing all the arguments necessary to reproduce the experiment,
  • environment.txt contains a description of the environment used for the experiment,
  • fold-<i> is a folder containing the result of the run on the i-th split of the 5-fold cross-validation.
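
Since commandline.json stores every argument of the run, it can be used to check or reproduce an experiment. A small sketch, assuming only standard tools and that the file is valid JSON:

# Pretty-print the arguments stored for a previous run:
python -m json.tool results/commandline.json

# Compare the environments of two experiments (other_results is a placeholder):
diff results/environment.txt other_results/environment.txt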

Validation procedure

A run of clinicadl train is necessarily associated with a TSV file system defining a series of data splits (k-fold cross-validation or single split). In the case of a single split, the results folder will only contain a folder named fold-0.

The structure of the fold-<i> folders partly depends on the type of network trained. They may contain the following folders:

  • models is the folder containing checkpoints saved at the end of each epoch, as well as the best model according to a specific metric on the validation set. The selection of the best model is only performed at the end of an epoch (a model cannot be selected based on evaluations performed within an epoch).
  • tensorboard_logs contains logs that can be visualized with TensorBoard.
  • cnn_classification (specific to cnn and multicnn) contains TSV files corresponding to the evaluation of the best models saved in models.
  • autoencoder_reconstruction (specific to autoencoder) contains reconstructions produced by the best model, selected according to the validation loss.
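
The logs in tensorboard_logs follow the standard TensorBoard format, so training curves can be monitored with the usual command; for example, for the first fold (the port is arbitrary):

# Launch TensorBoard on the logs of fold 0:
tensorboard --logdir results/fold-0/tensorboard_logs --port 6006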