resume - Resume a prematurely stopped job
This functionality resumes a prematurely stopped job that was launched with clinicadl train or clinicadl random-search generate.
The files used by this function are the following:
- maps.json describes the training parameters used to create the model,
- checkpoint.pth.tar contains the last version of the weights of the network,
- optimizer.pth.tar contains the last version of the parameters of the optimizer,
- training.tsv contains the successive values of the metrics during training.
These files are organized in model_path using the MAPS format.
You should also ensure that the data at tsv_path and caps_dir in maps.json are still present and correspond to those used during training.
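Before resuming, it can help to check that these files and data locations are still in place. The following is a minimal sketch and not part of ClinicaDL: the helper name check_resume_inputs is hypothetical, it assumes that maps.json sits at the root of the MAPS folder and stores tsv_path and caps_dir as top-level keys, and it searches the folder recursively for the other files because their exact location inside the MAPS is not specified here.

```python
import json
from pathlib import Path

# File names listed above; their location inside the MAPS is found by search.
RESUME_FILES = ("checkpoint.pth.tar", "optimizer.pth.tar", "training.tsv")


def check_resume_inputs(maps_dir):
    """Return a list of problems found; an empty list means none were found."""
    maps_dir = Path(maps_dir)
    problems = []

    maps_json = maps_dir / "maps.json"  # assumption: stored at the MAPS root
    if not maps_json.is_file():
        return [f"missing {maps_json}"]

    parameters = json.loads(maps_json.read_text())

    # Assumption: tsv_path and caps_dir are top-level keys of maps.json.
    for key in ("tsv_path", "caps_dir"):
        value = parameters.get(key)
        if value is None:
            problems.append(f"{key} not found in maps.json")
        elif not Path(str(value)).exists():
            problems.append(f"{key} points to a missing location: {value}")

    # Look for the training files anywhere below the MAPS root.
    for name in RESUME_FILES:
        if not any(maps_dir.rglob(name)):
            problems.append(f"no {name} found under {maps_dir}")

    return problems
```

An empty result only means that the paths exist. It does not verify that the data still correspond to those used during training.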
Prerequisites
Please check in the maps.json file in the results folder which preprocessing needs to be performed. If it has not been performed, execute the preprocessing pipeline as well as clinicadl extract to obtain the tensor versions of the images.
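One way to look up the expected preprocessing is to print the matching entries of maps.json. The sketch below assumes only that the relevant keys contain the word "preprocessing". The exact key names are not given here and may differ between ClinicaDL versions. Both function names are hypothetical, and maps.json is assumed to sit at the root of the MAPS folder.

```python
import json
from pathlib import Path


def find_preprocessing_entries(parameters, prefix=""):
    """Collect (dotted key, value) pairs whose key mentions 'preprocessing'."""
    found = []
    for key, value in parameters.items():
        dotted = f"{prefix}{key}"
        if "preprocessing" in key.lower():
            found.append((dotted, value))
        elif isinstance(value, dict):
            # Descend into nested sections of the JSON file.
            found.extend(find_preprocessing_entries(value, prefix=f"{dotted}."))
    return found


def read_preprocessing(maps_dir):
    # Assumption: maps.json is stored at the root of the MAPS folder.
    parameters = json.loads((Path(maps_dir) / "maps.json").read_text())
    return find_preprocessing_entries(parameters)
```

Calling read_preprocessing on the MAPS folder returns the matching entries, which can then be compared with the preprocessing already run on the dataset.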
Running the task
This task can be run with the following command line:
clinicadl train resume INPUT_MAPS_DIRECTORY
where INPUT_MAPS_DIRECTORY (Path) is the path to the MAPS folder of the model.
The splits to resume can be specified with the option --split. By default, all splits allowed by the validation setting are resumed and trained.
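When jobs are relaunched from a script, the command can be assembled programmatically. This is a minimal sketch: build_resume_command is a hypothetical helper, path/to/maps is a placeholder, and passing --split once per split index is an assumption about the option syntax that should be checked against clinicadl train resume --help.

```python
import shlex


def build_resume_command(maps_dir, splits=()):
    """Build the argument list for `clinicadl train resume`."""
    command = ["clinicadl", "train", "resume", str(maps_dir)]
    # Assumption: --split is repeated once per split index.
    for split in splits:
        command += ["--split", str(split)]
    return command


if __name__ == "__main__":
    command = build_resume_command("path/to/maps", splits=[0, 2])
    # Prints: clinicadl train resume path/to/maps --split 0 --split 2
    print(shlex.join(command))
```

The returned list can be passed to subprocess.run(command, check=True) on a machine where ClinicaDL is installed. With an empty splits argument, the command falls back to the default behavior described above.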
Outputs
The outputs are formatted according to the MAPS format.
Note: The files checkpoint.pth.tar and optimizer.pth.tar are automatically removed as soon as the stopping criterion is reached and the performance of the models is evaluated on the training and validation datasets.