train roi - Train deep learning networks using predefined regions of interest (ROI)

This option allows training a network using a set of regions of interest (ROI) defined by masks.

If no ROI is specified, the inputs will correspond to two patches of size 50x50x50 voxels manually centered on each hippocampus. This manual centering has only been done for the t1-linear pipeline, and is only available if --use_extracted_roi is not enabled.
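As an illustration of the default behavior, the sketch below extracts two 50x50x50 patches from a 3D volume with NumPy. The function name, the volume shape, and the center coordinates are hypothetical placeholders; the actual hippocampus centers used by clinicadl are not given here.

```python
import numpy as np


def extract_centered_patch(volume, center, size=50):
    """Return a cubic patch of `size` voxels per side centered on `center`."""
    starts = [c - size // 2 for c in center]
    # Refuse patches that would fall outside the volume.
    for start, dim in zip(starts, volume.shape):
        if start < 0 or start + size > dim:
            raise ValueError("patch exceeds the volume bounds")
    region = tuple(slice(start, start + size) for start in starts)
    return volume[region]


# Placeholder volume and centers (assumptions, not the values used by clinicadl).
volume = np.zeros((169, 208, 179), dtype=np.float32)
left_patch = extract_centered_patch(volume, center=(60, 100, 70))
right_patch = extract_centered_patch(volume, center=(110, 100, 70))
print(left_patch.shape, right_patch.shape)
```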

Coronal view of ROI patches

One architecture is implemented in clinicadl for the roi mode: Conv4_FC3, adapted to t1-linear pipeline outputs.

Adding a custom architecture

It is possible to add a custom architecture and train it with clinicadl. Detailed instructions can be found here.

Definition of masks

Regions of interest must correspond to masks stored in the CAPS directory caps_directory at <caps_directory>/masks/roi_based/tpl-<tpl>. Here tpl is the template used for registration by the chosen preprocessing pipeline.

The mask corresponding to the region roi must be named according to the following pattern: tpl-<tpl>_key1-<value_1>...keyN-<value_N>_roi-<roi>_mask.nii.gz.

The only keys included in the mask name are those describing an operation that changes the size of the image relative to the original template.
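The naming pattern above can be assembled programmatically. The helper below is a hypothetical sketch, not part of clinicadl, and the template, key, and region values in the example are placeholders chosen for illustration.

```python
from pathlib import Path


def roi_mask_path(caps_directory, template, roi, **size_keys):
    """Build the expected mask path for region `roi` (hypothetical helper).

    `size_keys` holds the key-value pairs describing operations that
    changed the image size, in the order they should appear in the name.
    """
    parts = [f"tpl-{template}"]
    parts += [f"{key}-{value}" for key, value in size_keys.items()]
    parts.append(f"roi-{roi}")
    filename = "_".join(parts) + "_mask.nii.gz"
    return Path(caps_directory) / "masks" / "roi_based" / f"tpl-{template}" / filename


# Placeholder values: "MyTemplate", "myRegion" and the "desc" key are assumptions.
mask_path = roi_mask_path("caps", "MyTemplate", "myRegion", desc="Crop")
print(mask_path.as_posix())
```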

train roi autoencoder - Train autoencoders using ROI

The objective of an autoencoder is to learn to reconstruct its input images while reducing their dimensionality.

The difference between the input image and the output image is measured by the mean squared error. In clinicadl, autoencoders are built from a CNN architecture.
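For reference, the mean squared error between an input and its reconstruction is the average of the squared voxel-wise differences. The NumPy sketch below shows the computation only; it is not the loss implementation used inside clinicadl.

```python
import numpy as np


def mean_squared_error(image, reconstruction):
    """Average of the squared voxel-wise differences between two arrays."""
    image = np.asarray(image, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if image.shape != reconstruction.shape:
        raise ValueError("input and reconstruction must have the same shape")
    return float(np.mean((image - reconstruction) ** 2))


# A perfect reconstruction gives an error of zero.
example = np.ones((4, 4, 4))
print(mean_squared_error(example, example))
```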

Running the task

Here is the command line to train an autoencoder on t1-linear outputs with the predefined architecture of ClinicaDL:

clinicadl train roi autoencoder <caps_directory> t1-linear <tsv_path> <output_directory> Conv4_FC3
where mandatory arguments are:

  • caps_directory (str) is the input folder containing the neuroimaging data in a CAPS hierarchy.
  • tsv_path (str) is the input folder of a TSV file tree generated by clinicadl tsvtool {split|kfold}.
  • output_directory (str) is the folder where the results are stored.

Common options

Options that are common to all train input and network types can be found in the introduction of clinicadl train.

The options specific to this task are the following:

  • --roi_list (list of str) lists the names of the regions to use. Each region must correspond to a mask defined in caps_directory. See the dedicated section for more information. By default, two patches centered on the hippocampi are extracted (available for t1-linear preprocessing only).
  • --uncropped_roi (bool) if this flag is given, the extracted region is not cropped. By default, the image is cropped to the smallest possible bounding box.
  • --use_extracted_roi (bool) if this flag is given, the outputs of clinicadl extract are used. Otherwise, the whole 3D MR volumes are loaded and patches are extracted on-the-fly. This flag cannot be used when --roi_list is left at its default.
  • --transfer_learning_path (str) is the path to a result folder (output of clinicadl train). The best model of this folder will be used to initialize the network as explained in the implementation details. If nothing is given the initialization will be random.
  • --visualization (bool) if this flag is given, inputs of the train and the validation sets and their corresponding reconstructions are written in autoencoder_reconstruction. Inputs are reconstructed based on the model that obtained the best validation loss.
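The default cropping described for --uncropped_roi can be pictured as follows. This is a minimal NumPy sketch under two assumptions: the mask is binary, and the image is multiplied by the mask before cropping. The function name is hypothetical and the code is not taken from clinicadl.

```python
import numpy as np


def crop_to_mask(image, mask):
    """Mask the image, then crop it to the smallest box containing the mask."""
    inside = mask > 0
    coords = np.argwhere(inside)
    if coords.size == 0:
        raise ValueError("the mask is empty")
    lower = coords.min(axis=0)
    upper = coords.max(axis=0) + 1  # exclusive upper bound
    box = tuple(slice(lo, hi) for lo, hi in zip(lower, upper))
    return (image * inside)[box]


image = np.ones((10, 10, 10))
mask = np.zeros((10, 10, 10))
mask[2:5, 3:7, 4:6] = 1
print(crop_to_mask(image, mask).shape)
```

Without cropping, the masked region would keep the full image size, with zeros outside the mask.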

Masks

For more information on the masks needed for ROI extraction please refer to the section on extract.

Outputs

The complete output file system is the following (the folder autoencoder_reconstruction is created only if the flag --visualization was given):

results
├── commandline.json
├── environment.txt
└── fold-0
    ├── autoencoder_reconstruction
    │   ├── train
    │   │   ├── input-0.nii.gz
    │   │   ├── ...
    │   │   ├── input-5.nii.gz
    │   │   ├── output-0.nii.gz
    │   │   ├── ...
    │   │   └── output-5.nii.gz
    │   └── validation
    │        ├── input-0.nii.gz
    │        ├── ...
    │        ├── input-5.nii.gz
    │        ├── output-0.nii.gz
    │        ├── ...
    │        └── output-5.nii.gz
    ├── models
    │    └── best_loss
    │        └── model_best.pth.tar
    └── tensorboard_logs
         ├── train
         │    └── events.out.tfevents.XXXX
         └── validation
              └── events.out.tfevents.XXXX

autoencoder_reconstruction contains the reconstructions of the two regions for the first three participants of the dataset, that is, six input/output pairs per set.

train roi cnn - Train classification CNN using ROI

The objective of this single CNN is to learn to predict the labels associated with images. The set of images used corresponds to the two hippocampi in MR volumes.

The output of the CNN is a vector whose size equals the number of classes in the dataset. This vector can be passed through the softmax function to produce a probability for each class. During training, the CNN is optimized according to the cross-entropy loss. The loss is zero for a subset of images if the CNN assigns a probability of 1 to the true class (ground truth) of each image in the subset.
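The relation between the output vector, the softmax function, and the cross-entropy loss can be checked numerically. The NumPy sketch below is for illustration only and is independent of the training code in clinicadl.

```python
import numpy as np


def softmax(logits):
    """Convert each row of raw network outputs into class probabilities."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exponentials = np.exp(shifted)
    return exponentials / exponentials.sum(axis=1, keepdims=True)


def cross_entropy(logits, labels):
    """Mean negative log-probability assigned to the true class."""
    probabilities = softmax(logits)
    true_class = probabilities[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class)))


logits = np.array([[2.0, 0.0], [0.5, 1.5]])  # two images, two classes
labels = np.array([0, 1])
print(softmax(logits).sum(axis=1))    # each row sums to 1
print(cross_entropy(logits, labels))  # positive while predictions are uncertain
```

As the probability of the true class approaches 1 for every image, the loss approaches zero.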

Running the task

Here is the command line to train a CNN on t1-linear outputs with the predefined architecture of ClinicaDL:

clinicadl train roi cnn <caps_directory> t1-linear <tsv_path> <output_directory> Conv4_FC3
where mandatory arguments are:

  • caps_directory (str) is the input folder containing the neuroimaging data in a CAPS hierarchy.
  • tsv_path (str) is the input folder of a TSV file tree generated by clinicadl tsvtool {split|kfold}.
  • output_directory (str) is the folder where the results are stored.

Common options

Options that are common to all train input and network types can be found in the introduction of clinicadl train.

The options specific to this task are the following:

  • --roi_list (list of str) lists the names of the targeted regions. Each region must correspond to a mask defined in caps_directory. See the dedicated section for more information. By default, two patches centered on the hippocampi are extracted (available for t1-linear preprocessing only).
  • --uncropped_roi (bool) if this flag is given, the extracted region is not cropped. By default, the image is cropped to the smallest possible bounding box.
  • --use_extracted_roi (bool) if this flag is given, the outputs of clinicadl extract are used. Otherwise, the whole 3D MR volumes are loaded and patches are extracted on-the-fly. This flag cannot be used when --roi_list is left at its default.
  • --transfer_learning_path (str) is the path to a result folder (output of clinicadl train). The best model of this folder will be used to initialize the network as explained in the implementation details. If nothing is given the initialization will be random.
  • --transfer_learning_selection (str) corresponds to the metric according to which the best model of transfer_learning_path will be loaded. This argument will only be taken into account if the source network is a CNN. Choices are best_loss and best_balanced_accuracy. Default: best_balanced_accuracy.
  • --selection_threshold (float) is the threshold applied to the balanced accuracies when computing the image-level performance. Regions are selected if their balanced accuracy is greater than the threshold. By default, no selection is performed.

Outputs

The complete output file system is the following:

results
├── commandline.json
├── environment.txt
└── fold-0
    ├── cnn_classification
    │   ├── best_balanced_accuracy
    │   │   ├── train_image_level_metrics.tsv
    │   │   ├── train_image_level_prediction.tsv
    │   │   ├── train_roi_level_metrics.tsv
    │   │   ├── train_roi_level_prediction.tsv
    │   │   ├── validation_image_level_metrics.tsv
    │   │   ├── validation_image_level_prediction.tsv
    │   │   ├── validation_roi_level_metrics.tsv
    │   │   └── validation_roi_level_prediction.tsv
    │   └── best_loss
    │       └── ...
    ├── models
    │   ├── best_balanced_accuracy
    │   │   └── model_best.pth.tar
    │   └── best_loss
    │       └── model_best.pth.tar
    └── tensorboard_logs
         ├── train
         │    └── events.out.tfevents.XXXX
         └── validation
              └── events.out.tfevents.XXXX

Level of performance

The performance metrics are obtained at two different levels: region-level and image-level. Region-level performance corresponds to an evaluation in which the two ROIs are treated as independent. They are not independent, however, so the more informative evaluation is at the image level, where the predictions of the two regions are combined.
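The exact rule used to combine region-level predictions is not described in this section. The sketch below shows one plausible scheme, stated here as an assumption: class probabilities are averaged across regions with weights equal to each region's balanced accuracy, after discarding regions whose balanced accuracy does not exceed --selection_threshold. All names are hypothetical.

```python
import numpy as np


def image_level_prediction(region_probs, region_bal_acc, selection_threshold=None):
    """Combine per-region class probabilities into one image-level label.

    Assumed scheme: weighted average of probabilities, with weights equal to
    each region's balanced accuracy; regions at or below the threshold are dropped.
    """
    probs = np.asarray(region_probs, dtype=np.float64)   # (n_regions, n_classes)
    weights = np.asarray(region_bal_acc, dtype=np.float64)
    if selection_threshold is not None:
        keep = weights > selection_threshold
        if not keep.any():
            raise ValueError("no region passes the selection threshold")
        probs, weights = probs[keep], weights[keep]
    averaged = (weights[:, None] * probs).sum(axis=0) / weights.sum()
    return int(np.argmax(averaged))


region_probs = [[0.9, 0.1], [0.3, 0.7]]  # two regions, two classes
region_bal_acc = [0.6, 0.8]
print(image_level_prediction(region_probs, region_bal_acc))
print(image_level_prediction(region_probs, region_bal_acc, selection_threshold=0.7))
```

In this example the two regions disagree, and applying the threshold removes the first region, which changes the image-level label.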

train roi multicnn - Train one classification CNN per region

Unlike the preceding network, in which all regions of interest were given as input to a single CNN, this option trains one CNN per region. The predictions of the CNNs are then combined to determine the label at the image level.

The output of each CNN is a vector whose size equals the number of classes in the dataset. This vector can be passed through the softmax function to produce a probability for each class. During training, each CNN is optimized according to the cross-entropy loss. The loss is zero for a subset of images if the CNN assigns a probability of 1 to the true class (ground truth) of each image in the subset.

Running the task

Here is the command line to train a multi-CNN on t1-linear outputs with the predefined architecture of ClinicaDL:

clinicadl train roi multicnn <caps_directory> t1-linear <tsv_path> <output_directory> Conv4_FC3
where mandatory arguments are:

  • caps_directory (str) is the input folder containing the neuroimaging data in a CAPS hierarchy.
  • tsv_path (str) is the input folder of a TSV file tree generated by clinicadl tsvtool {split|kfold}.
  • output_directory (str) is the folder where the results are stored.

Common options

Options that are common to all train input and network types can be found in the introduction of clinicadl train.

The options specific to this task are the following:

  • --roi_list (list of str) lists the names of the regions to use. Each region must correspond to a mask defined in caps_directory. See the dedicated section for more information. By default, two patches centered on the hippocampi are extracted (available for t1-linear preprocessing only).
  • --uncropped_roi (bool) if this flag is given, the extracted region is not cropped. By default, the image is cropped to the smallest possible bounding box.
  • --use_extracted_roi (bool) if this flag is given, the outputs of clinicadl extract are used. Otherwise, the whole 3D MR volumes are loaded and patches are extracted on-the-fly. This flag cannot be used when --roi_list is left at its default.
  • --transfer_learning_path (str) is the path to a result folder (output of clinicadl train). The best model of this folder will be used to initialize the network as explained in the implementation details. If nothing is given the initialization will be random.
  • --transfer_learning_selection (str) corresponds to the metric according to which the best model of transfer_learning_path will be loaded. This argument will only be taken into account if the source network is a CNN. Choices are best_loss and best_balanced_accuracy. Default: best_balanced_accuracy.
  • --selection_threshold (float) is the threshold applied to the balanced accuracies when computing the image-level performance. Regions are selected if their balanced accuracy is greater than the threshold. By default, no selection is performed.

Outputs

The complete output file system is the following:

results
├── commandline.json
├── environment.txt
└── fold-0
    ├── cnn_classification
    │   ├── best_balanced_accuracy
    │   │   ├── train_image_level_metrics.tsv
    │   │   ├── train_image_level_prediction.tsv
    │   │   ├── train_roi_level_metrics.tsv
    │   │   ├── train_roi_level_prediction.tsv
    │   │   ├── validation_image_level_metrics.tsv
    │   │   ├── validation_image_level_prediction.tsv
    │   │   ├── validation_roi_level_metrics.tsv
    │   │   └── validation_roi_level_prediction.tsv
    │   └── best_loss
    │       └── ...
    ├── models
    │   ├── cnn-0
    │   │   ├── best_balanced_accuracy
    │   │   │   └── model_best.pth.tar
    │   │   └── best_loss
    │   │       └── model_best.pth.tar
    │   ├── ...
    │   └── cnn-<N>
    │       ├── best_balanced_accuracy
    │       │   └── model_best.pth.tar
    │       └── best_loss
    │           └── model_best.pth.tar    
    └── tensorboard_logs
        ├── cnn-0
        │   ├── train
        │   │   └── events.out.tfevents.XXXX
        │   └── validation
        │       └── events.out.tfevents.XXXX
        ├── ...
        └── cnn-<N>
            ├── train
            │   └── events.out.tfevents.XXXX
            └── validation
                └── events.out.tfevents.XXXX

models and tensorboard_logs contain one output per CNN trained. The number of networks N is equal to the number of regions.
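The per-network layout shown above can be traversed with a few lines of Python. This is a hypothetical helper, not part of clinicadl, and it assumes that the networks are numbered from cnn-0 upwards, one per region.

```python
from pathlib import Path


def multicnn_model_paths(results_dir, n_regions, fold=0,
                         selection="best_balanced_accuracy"):
    """List the expected best-model file of each per-region CNN.

    Assumes folders named cnn-0, cnn-1, ... with one network per region.
    """
    models_dir = Path(results_dir) / f"fold-{fold}" / "models"
    return [
        models_dir / f"cnn-{index}" / selection / "model_best.pth.tar"
        for index in range(n_regions)
    ]


# With the default hippocampus patches there are two regions.
for path in multicnn_model_paths("results", n_regions=2):
    print(path.as_posix())
```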

Level of performance

The performance metrics are obtained at two different levels: region-level and image-level. Region-level performance corresponds to the concatenation of the performance metrics of all CNNs. The evaluation at the image-level is obtained by assembling the predictions of all the CNNs.