train roi - Train deep learning networks using predefined regions of interest (ROI)¶
This option allows training a network using a set of regions of interest (ROI) defined by masks.
If no ROI is specified, the inputs will correspond to two patches of size 50x50x50 voxels
manually centered on each hippocampus.
This manual centering has only been done for the t1-linear pipeline, and is only available if
--use_extracted_roi is not enabled.

One architecture is implemented in clinicadl for the roi mode:
Conv4_FC3, adapted to t1-linear pipeline outputs.
Adding a custom architecture
It is possible to add a custom architecture and train it with clinicadl.
Detailed instructions can be found here.
Definition of masks¶
Each region of interest must correspond to a mask stored in the CAPS directory caps_directory,
under <caps_directory>/masks/roi_based/tpl-<tpl>. Here tpl is the template used for registration
by the chosen preprocessing pipeline.
The mask corresponding to the region roi must be named according to the following pattern:
tpl-<tpl>_key1-<value_1>...keyN-<value_N>_roi-<roi>_mask.nii.gz.
Only the keys describing an operation that changes the size of the image relative to the original template are included in the mask name.
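The naming rule above can be sketched as a small path builder. This is only an illustration: the helper `roi_mask_path` is hypothetical (it is not part of clinicadl), the template, key and region names are placeholders, and joining consecutive keys with an underscore is an assumption, since the pattern elides that part with `...`.

```python
from pathlib import Path


def roi_mask_path(caps_directory, tpl, roi, size_keys=None):
    """Build the expected location of an ROI mask inside a CAPS directory.

    `size_keys` is an ordered mapping of the key/value pairs describing
    operations that change the image size relative to the template.
    Hypothetical helper, written for illustration only.
    """
    parts = [f"tpl-{tpl}"]
    for key, value in (size_keys or {}).items():
        # Assumption: consecutive key-value pairs are separated by "_".
        parts.append(f"{key}-{value}")
    parts.append(f"roi-{roi}")
    filename = "_".join(parts) + "_mask.nii.gz"
    return Path(caps_directory) / "masks" / "roi_based" / f"tpl-{tpl}" / filename


# Placeholder values: "MyTemplate", "key1" and "myRegion" are not real names.
example = roi_mask_path("caps", "MyTemplate", "myRegion", {"key1": "value1"})
print(example.as_posix())
```

Checking that the file returned by such a helper exists before launching a training run is a cheap way to catch a misnamed mask early.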
train roi autoencoder - Train autoencoders using ROI¶
The objective of an autoencoder is to learn to reconstruct its input images while performing dimensionality reduction.
The difference between the input and the output image is measured by the mean squared error. In clinicadl, autoencoders are designed based on a CNN architecture.
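As a minimal sketch of the reconstruction criterion, the mean squared error averages the squared voxel-wise differences between an input and its reconstruction. The function below works on flattened lists of intensities and is written for illustration only; it is not the clinicadl implementation, which operates on tensors.

```python
def mean_squared_error(image, reconstruction):
    """Average of the squared voxel-wise differences between two images.

    Both arguments are flat sequences of voxel intensities of equal length.
    """
    if len(image) != len(reconstruction):
        raise ValueError("input and reconstruction must have the same size")
    squared_errors = [(x - y) ** 2 for x, y in zip(image, reconstruction)]
    return sum(squared_errors) / len(squared_errors)


# Toy 4-voxel example: two voxels are off by 0.1, two are reconstructed exactly.
flat_input = [0.0, 0.5, 1.0, 0.25]
flat_output = [0.0, 0.4, 0.9, 0.25]
print(mean_squared_error(flat_input, flat_output))
```

A perfect reconstruction gives an error of zero, and larger deviations are penalized quadratically.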
Running the task¶
Here is the command line to train an autoencoder on t1-linear outputs with the predefined architecture of ClinicaDL:
clinicadl train roi autoencoder <caps_directory> t1-linear <tsv_path> <output_directory> Conv4_FC3
caps_directory (str) is the input folder containing the neuroimaging data in a CAPS hierarchy.
tsv_path (str) is the input folder of a TSV file tree generated by clinicadl tsvtool {split|kfold}.
output_directory (str) is the folder where the results are stored.
Common options
Options that are common to all train input and network types can be found in the introduction of
clinicadl train.
The options specific to this task are the following:
--roi_list (list of str) includes the names of the regions wanted. Each region must correspond to a mask defined in caps_directory. See the dedicated section for more information. Default will extract two patches centered on the hippocampi (available for t1-linear preprocessing only).
--uncropped_roi (bool) if given, the extracted region is not cropped. Default will crop the image with the smallest bounding box possible.
--use_extracted_roi (bool) if this flag is given, the outputs of clinicadl extract are used. Otherwise, the whole 3D MR volumes are loaded and patches are extracted on-the-fly. Cannot be used if --roi_list is set to default.
--transfer_learning_path (str) is the path to a result folder (output of clinicadl train). The best model of this folder will be used to initialize the network as explained in the implementation details. If nothing is given, the initialization will be random.
--visualization (bool) if this flag is given, inputs of the train and the validation sets and their corresponding reconstructions are written in autoencoder_reconstruction. Inputs are reconstructed based on the model that obtained the best validation loss.
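The constraint between --use_extracted_roi and --roi_list can be checked before launching a run. The sketch below assembles the argument list for the command shown above and refuses the invalid combination. The helper `build_roi_train_command` is hypothetical, and passing the region names as space-separated values after --roi_list is an assumption about the command-line syntax.

```python
def build_roi_train_command(task, caps_directory, tsv_path, output_directory,
                            roi_list=None, uncropped_roi=False,
                            use_extracted_roi=False):
    """Assemble a `clinicadl train roi` argument list (hypothetical helper)."""
    if task not in {"autoencoder", "cnn", "multicnn"}:
        raise ValueError(f"unknown roi task: {task}")
    if use_extracted_roi and not roi_list:
        # Stated in the option description: the flag cannot be used
        # when --roi_list is left to its default.
        raise ValueError("--use_extracted_roi requires an explicit --roi_list")

    command = ["clinicadl", "train", "roi", task, caps_directory,
               "t1-linear", tsv_path, output_directory, "Conv4_FC3"]
    if roi_list:
        # Assumption: region names are given as space-separated values.
        command += ["--roi_list", *roi_list]
    if uncropped_roi:
        command.append("--uncropped_roi")
    if use_extracted_roi:
        command.append("--use_extracted_roi")
    return command


# Placeholder paths and region names, for illustration only.
print(" ".join(build_roi_train_command(
    "autoencoder", "caps", "tsv", "results", roi_list=["regionA", "regionB"])))
```

The returned list can be handed to `subprocess.run` once the paths and region names are replaced with real ones.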
Masks
For more information on the masks needed for ROI extraction please refer to the section on
extract.
Outputs¶
The complete output file system is the following (the folder autoencoder_reconstruction is created only if the
flag --visualization was given):
results
├── commandline.json
├── environment.txt
└── fold-0
    ├── autoencoder_reconstruction
    │   ├── train
    │   │   ├── input-0.nii.gz
    │   │   ├── ...
    │   │   ├── input-5.nii.gz
    │   │   ├── output-0.nii.gz
    │   │   ├── ...
    │   │   └── output-5.nii.gz
    │   └── validation
    │       ├── input-0.nii.gz
    │       ├── ...
    │       ├── input-5.nii.gz
    │       ├── output-0.nii.gz
    │       ├── ...
    │       └── output-5.nii.gz
    ├── models
    │   └── best_loss
    │       └── model_best.pth.tar
    └── tensorboard_logs
        ├── train
        │   └── events.out.tfevents.XXXX
        └── validation
            └── events.out.tfevents.XXXX
autoencoder_reconstruction contains the reconstructions of the two regions of the first three participants of the dataset.
train roi cnn - Train classification CNN using ROI¶
The objective of this single CNN is to learn to predict the labels associated with images. The set of images used corresponds to the two hippocampi in MR volumes.
The output of the CNN is a vector whose size equals the number of classes in the dataset. The softmax function can be applied to this vector to produce a probability for each class. During training, the CNN is optimized with the cross-entropy loss. This loss is zero for a subset of images if the CNN assigns a probability of 1 to the true class (ground truth) of every image in the subset.
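The relation between the output vector, the softmax probabilities and the cross-entropy loss can be illustrated with plain Python. This is a sketch of the standard definitions for a single image, not the clinicadl code, and the output values used below are made up.

```python
import math


def softmax(outputs):
    """Turn a vector of raw network outputs into class probabilities."""
    shift = max(outputs)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(value - shift) for value in outputs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(outputs, true_class):
    """Negative log-probability of the true class, computed stably."""
    shift = max(outputs)
    log_sum = math.log(sum(math.exp(value - shift) for value in outputs))
    return log_sum - (outputs[true_class] - shift)


# Made-up outputs for a two-class problem.
uncertain = [0.0, 0.0]   # both classes equally likely
confident = [50.0, 0.0]  # probability of class 0 is numerically 1

print(softmax(uncertain))
print(cross_entropy(uncertain, true_class=0))
print(cross_entropy(confident, true_class=0))
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1, which is the behavior described above.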
Running the task¶
Here is the command line to train a CNN on t1-linear outputs with the predefined architecture of ClinicaDL:
clinicadl train roi cnn <caps_directory> t1-linear <tsv_path> <output_directory> Conv4_FC3
caps_directory (str) is the input folder containing the neuroimaging data in a CAPS hierarchy.
tsv_path (str) is the input folder of a TSV file tree generated by clinicadl tsvtool {split|kfold}.
output_directory (str) is the folder where the results are stored.
Common options
Options that are common to all train input and network types can be found in the introduction of
clinicadl train.
The options specific to this task are the following:
--roi_list (list of str) includes the names of the targeted regions. Each region corresponds to a mask defined in caps_directory. See the dedicated section for more information. Default behavior will extract two patches centered on the hippocampi (available for t1-linear preprocessing only).
--uncropped_roi (bool) if given, the extracted region is not cropped. Default will crop the image with the smallest bounding box possible.
--use_extracted_roi (bool) if this flag is given, the outputs of clinicadl extract are used. Otherwise, the whole 3D MR volumes are loaded and patches are extracted on-the-fly. Cannot be used if --roi_list is set to default.
--transfer_learning_path (str) is the path to a result folder (output of clinicadl train). The best model of this folder will be used to initialize the network as explained in the implementation details. If nothing is given, the initialization will be random.
--transfer_learning_selection (str) corresponds to the metric according to which the best model of transfer_learning_path will be loaded. This argument will only be taken into account if the source network is a CNN. Choices are best_loss and best_balanced_accuracy. Default: best_balanced_accuracy.
--selection_threshold (float) threshold on the balanced accuracies to compute the image-level performance. Regions are selected if their balanced accuracy is greater than the threshold. Default corresponds to no selection.
Outputs¶
The complete output file system is the following:
results
├── commandline.json
├── environment.txt
└── fold-0
    ├── cnn_classification
    │   ├── best_balanced_accuracy
    │   │   ├── train_image_level_metrics.tsv
    │   │   ├── train_image_level_prediction.tsv
    │   │   ├── train_roi_level_metrics.tsv
    │   │   ├── train_roi_level_prediction.tsv
    │   │   ├── validation_image_level_metrics.tsv
    │   │   ├── validation_image_level_prediction.tsv
    │   │   ├── validation_roi_level_metrics.tsv
    │   │   └── validation_roi_level_prediction.tsv
    │   └── best_loss
    │       └── ...
    ├── models
    │   ├── best_balanced_accuracy
    │   │   └── model_best.pth.tar
    │   └── best_loss
    │       └── model_best.pth.tar
    └── tensorboard_logs
        ├── train
        │   └── events.out.tfevents.XXXX
        └── validation
            └── events.out.tfevents.XXXX
Level of performance
The performance metrics are obtained at two different levels: region-level and image-level. Region-level performance corresponds to an evaluation in which the two ROIs are treated as independent samples. They are not independent, however, so the more informative evaluation is the one at the image level, in which the predictions of the two regions are assembled.
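To make the two levels concrete, the sketch below assembles region-level probabilities into one image-level prediction. The selection rule follows the description of --selection_threshold (a region is kept if its balanced accuracy is greater than the threshold). The averaging step is an assumption made for illustration: this page does not specify how the predictions are combined, and the function name and all values are hypothetical.

```python
def image_level_probabilities(region_probabilities, balanced_accuracies,
                              selection_threshold=None):
    """Combine per-region class probabilities into image-level probabilities.

    region_probabilities: one list of class probabilities per region.
    balanced_accuracies: one balanced accuracy per region, same order.
    Assumption: selected regions are combined by an unweighted average.
    """
    selected = [
        index for index, accuracy in enumerate(balanced_accuracies)
        if selection_threshold is None or accuracy > selection_threshold
    ]
    if not selected:
        raise ValueError("no region passes the selection threshold")
    n_classes = len(region_probabilities[0])
    return [
        sum(region_probabilities[index][c] for index in selected) / len(selected)
        for c in range(n_classes)
    ]


# Made-up predictions for the two regions of one image.
probabilities = [[0.9, 0.1], [0.5, 0.5]]
accuracies = [0.8, 0.6]

print(image_level_probabilities(probabilities, accuracies))
print(image_level_probabilities(probabilities, accuracies, selection_threshold=0.7))
```

With a threshold of 0.7 only the first region contributes, so the image-level prediction equals that region's prediction.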
train roi multicnn - Train one classification CNN per region¶
Unlike the preceding network, in which all regions of interest were used as input of a single CNN, this option trains one CNN per region. The predictions of the CNNs are then assembled to determine the label at the image level.
The output of each CNN is a vector whose size equals the number of classes in the dataset. The softmax function can be applied to this vector to produce a probability for each class. During training, each CNN is optimized with the cross-entropy loss. This loss is zero for a subset of images if the CNN assigns a probability of 1 to the true class (ground truth) of every image in the subset.
Running the task¶
Here is the command line to train a multi-CNN on t1-linear outputs with the predefined architecture of ClinicaDL:
clinicadl train roi multicnn <caps_directory> t1-linear <tsv_path> <output_directory> Conv4_FC3
caps_directory (str) is the input folder containing the neuroimaging data in a CAPS hierarchy.
tsv_path (str) is the input folder of a TSV file tree generated by clinicadl tsvtool {split|kfold}.
output_directory (str) is the folder where the results are stored.
Common options
Options that are common to all train input and network types can be found in the introduction of
clinicadl train.
The options specific to this task are the following:
--roi_list (list of str) includes the names of the regions wanted. Each region must correspond to a mask defined in caps_directory. See the dedicated section for more information. Default will extract two patches centered on the hippocampi (available for t1-linear preprocessing only).
--uncropped_roi (bool) if given, the extracted region is not cropped. Default will crop the image with the smallest bounding box possible.
--use_extracted_roi (bool) if this flag is given, the outputs of clinicadl extract are used. Otherwise, the whole 3D MR volumes are loaded and patches are extracted on-the-fly. Cannot be used if --roi_list is set to default.
--transfer_learning_path (str) is the path to a result folder (output of clinicadl train). The best model of this folder will be used to initialize the network as explained in the implementation details. If nothing is given, the initialization will be random.
--transfer_learning_selection (str) corresponds to the metric according to which the best model of transfer_learning_path will be loaded. This argument will only be taken into account if the source network is a CNN. Choices are best_loss and best_balanced_accuracy. Default: best_balanced_accuracy.
--selection_threshold (float) threshold on the balanced accuracies to compute the image-level performance. Regions are selected if their balanced accuracy is greater than the threshold. Default corresponds to no selection.
Outputs¶
The complete output file system is the following:
results
├── commandline.json
├── environment.txt
└── fold-0
    ├── cnn_classification
    │   ├── best_balanced_accuracy
    │   │   ├── train_image_level_metrics.tsv
    │   │   ├── train_image_level_prediction.tsv
    │   │   ├── train_roi_level_metrics.tsv
    │   │   ├── train_roi_level_prediction.tsv
    │   │   ├── validation_image_level_metrics.tsv
    │   │   ├── validation_image_level_prediction.tsv
    │   │   ├── validation_roi_level_metrics.tsv
    │   │   └── validation_roi_level_prediction.tsv
    │   └── best_loss
    │       └── ...
    ├── models
    │   ├── cnn-0
    │   │   ├── best_balanced_accuracy
    │   │   │   └── model_best.pth.tar
    │   │   └── best_loss
    │   │       └── model_best.pth.tar
    │   ├── ...
    │   └── cnn-<N>
    │       ├── best_balanced_accuracy
    │       │   └── model_best.pth.tar
    │       └── best_loss
    │           └── model_best.pth.tar
    └── tensorboard_logs
        ├── cnn-0
        │   ├── train
        │   │   └── events.out.tfevents.XXXX
        │   └── validation
        │       └── events.out.tfevents.XXXX
        ├── ...
        └── cnn-<N>
            ├── train
            │   └── events.out.tfevents.XXXX
            └── validation
                └── events.out.tfevents.XXXX
models and tensorboard_logs contain one output per CNN trained.
The number of networks N is equal to the number of regions.
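Since there is one sub-folder per CNN, the checkpoints can be collected by scanning the models folder instead of assuming how many networks were trained. The sketch below relies only on the folder and file names of the tree above. The helper `list_multicnn_checkpoints` is hypothetical, and the demonstration builds a throwaway directory rather than reading real results.

```python
import tempfile
from pathlib import Path


def list_multicnn_checkpoints(fold_directory):
    """Map each cnn-<i> folder to its two best-model checkpoint paths."""
    models = Path(fold_directory) / "models"

    def cnn_index(path):
        # "cnn-10" -> 10, so folders are ordered numerically, not as text.
        return int(path.name.split("-", 1)[1])

    cnn_dirs = sorted((p for p in models.glob("cnn-*") if p.is_dir()), key=cnn_index)
    return {
        cnn_dir.name: {
            selection: cnn_dir / selection / "model_best.pth.tar"
            for selection in ("best_balanced_accuracy", "best_loss")
        }
        for cnn_dir in cnn_dirs
    }


# Demonstration on an empty mock-up of the output tree (no real models).
with tempfile.TemporaryDirectory() as tmp:
    fold = Path(tmp) / "fold-0"
    for i in (0, 1):
        (fold / "models" / f"cnn-{i}" / "best_loss").mkdir(parents=True)
    for name, paths in list_multicnn_checkpoints(fold).items():
        print(name, paths["best_loss"].relative_to(fold).as_posix())
```

The same scan applies to tensorboard_logs, which follows the same cnn-<i> layout.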
Level of performance
The performance metrics are obtained at two different levels: region-level and image-level. Region-level performance corresponds to the concatenation of the performance metrics of all CNNs. The evaluation at the image-level is obtained by assembling the predictions of all the CNNs.