clinicadl.data.datasets.Dataset
- class clinicadl.data.datasets.Dataset
Abstract class for ClinicaDL datasets, which inherits from torch.utils.data.Dataset, to work with 3D neuroimaging data.
To work properly with ClinicaDL, all datasets must inherit from this class.
See also
BidsDataset: A Dataset to read data organized in a BIDS structure.
- property df: DataFrame
A DataFrame containing metadata on the images present in the dataset.
Each image must have its associated line in the DataFrame, which must contain at least the columns “participant_id” and “session_id”, with the ids (strings) of the participant and the session.
Example
participant_id  session_id  age   sex  diagnosis
sub-001         ses-M000    55.0  M    CN
sub-001         ses-M003    55.0  M    AD
sub-002         ses-M000    62.0  F    MCI
sub-002         ses-M003    62.0  F    AD
sub-003         ses-M000    67.0  F    CN
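The column requirement above can be illustrated with a minimal sketch. It uses only the standard library (no pandas, no ClinicaDL) and reads the example table from tab-separated text; `read_metadata` and `REQUIRED_COLUMNS` are hypothetical names introduced for illustration, not part of the ClinicaDL API.

```python
import csv
import io

# Metadata from the example table, written as tab-separated text.
METADATA_TSV = "\n".join(
    [
        "participant_id\tsession_id\tage\tsex\tdiagnosis",
        "sub-001\tses-M000\t55.0\tM\tCN",
        "sub-001\tses-M003\t55.0\tM\tAD",
        "sub-002\tses-M000\t62.0\tF\tMCI",
        "sub-002\tses-M003\t62.0\tF\tAD",
        "sub-003\tses-M000\t67.0\tF\tCN",
    ]
)

# The two id columns that every metadata table must contain.
REQUIRED_COLUMNS = ("participant_id", "session_id")


def read_metadata(tsv_text: str) -> list[dict[str, str]]:
    """Parse TSV text into rows, checking that the required id columns exist."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    # Every value is kept as a string, so the ids are strings as required.
    return list(reader)


rows = read_metadata(METADATA_TSV)
```

Here `rows` holds one dictionary per image; a table lacking either id column raises a `ValueError` instead of being accepted silently.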
- abstract eval() → None
Sets the dataset to evaluation mode.
For example, this can disable data augmentation in the transformation pipeline.
- abstract train() → None
Sets the dataset to training mode.
For example, this can enable data augmentation in the transformation pipeline.
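The following sketch illustrates one way the train()/eval() contract could be honoured. `ToyDataset` is a hypothetical stand-in written in plain Python: it does not inherit from the ClinicaDL class, and the random jitter is an assumed placeholder for a real augmentation pipeline.

```python
import random


class ToyDataset:
    """Stand-in showing a train()/eval() switch; not the ClinicaDL class."""

    def __init__(self, values: list[float], seed: int = 0) -> None:
        self._values = values
        self._rng = random.Random(seed)
        self._training = True  # assumed default mode for this sketch

    def train(self) -> None:
        """Switch to training mode: augmentation is applied."""
        self._training = True

    def eval(self) -> None:
        """Switch to evaluation mode: samples are returned unchanged."""
        self._training = False

    def __len__(self) -> int:
        return len(self._values)

    def __getitem__(self, idx: int) -> float:
        value = self._values[idx]
        if self._training:
            # Hypothetical augmentation: add a small random jitter.
            value += self._rng.uniform(-0.1, 0.1)
        return value


dataset = ToyDataset([1.0, 2.0])
dataset.eval()
clean = dataset[0]      # unchanged value
dataset.train()
augmented = dataset[0]  # value with jitter applied
```

Keeping the mode as a flag on the dataset means the same object can serve both training and validation loops, provided the caller switches modes explicitly.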
- get_participant_session_couples() → set[tuple[str, str]]
Retrieves all (participant, session) pairs in the dataset.
- Returns:
set[tuple[str, str]] – The set of (participant, session) pairs.
- subset(participants_sessions: Path | str | DataFrame | Iterable[tuple[str, str]]) → Self
Gets a subset of the dataset from a list of (participant, session) pairs.
- Parameters:
participants_sessions (Path | str | DataFrame | Iterable[tuple[str, str]]) –
Can be either:
- a sequence of (participant, session) pairs;
- a pandas.DataFrame (or a path to a TSV file containing the dataframe) with the list of (participant, session) pairs to extract. This list must be passed via two columns named "participant_id" and "session_id" (other columns won’t be considered).
- Returns:
Self – A subset of the original dataset, restricted to the (participant, session) pairs mentioned in participants_sessions.
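As a rough illustration of the two accepted input forms, the sketch below reduces both a sequence of pairs and a TSV table to the same set of pairs, then filters metadata rows with it. It uses only the standard library; `pairs_from_tsv` and `select_rows` are hypothetical helpers, not ClinicaDL functions. Requested pairs that are absent from the metadata are silently ignored here, which is an assumption of this sketch and may differ from what subset() does.

```python
import csv
import io
from collections.abc import Iterable

Pair = tuple[str, str]


def pairs_from_tsv(tsv_text: str) -> set[Pair]:
    """Read (participant, session) pairs from TSV text, ignoring other columns."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {(row["participant_id"], row["session_id"]) for row in reader}


def select_rows(
    rows: list[dict[str, str]], pairs: Iterable[Pair]
) -> list[dict[str, str]]:
    """Keep only the rows whose (participant, session) pair was requested."""
    wanted = set(pairs)
    return [
        row for row in rows
        if (row["participant_id"], row["session_id"]) in wanted
    ]


metadata = [
    {"participant_id": "sub-001", "session_id": "ses-M000", "diagnosis": "CN"},
    {"participant_id": "sub-001", "session_id": "ses-M003", "diagnosis": "AD"},
    {"participant_id": "sub-002", "session_id": "ses-M000", "diagnosis": "MCI"},
]

# Form 1: a plain sequence of pairs.
from_pairs = select_rows(
    metadata, [("sub-001", "ses-M003"), ("sub-002", "ses-M000")]
)

# Form 2: a TSV table; the extra "note" column is not considered.
selection_tsv = (
    "participant_id\tsession_id\tnote\n"
    "sub-001\tses-M003\tx\n"
    "sub-002\tses-M000\ty\n"
)
from_tsv = select_rows(metadata, pairs_from_tsv(selection_tsv))
```

Both calls select the same two rows, in the order they appear in the metadata.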
- abstract get_sample_info(idx: int, column: str) → Any
Retrieves information on a given sample from the metadata DataFrame. The value returned is that of the image the sample was extracted from.
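When one image yields several samples (patches, for instance), a sample index has to be mapped back to its image before the metadata can be read. The sketch below assumes a fixed number of samples per image, stored image by image; both are assumptions made for illustration, and `image_rows` and `SAMPLES_PER_IMAGE` are hypothetical names.

```python
from typing import Any

# Hypothetical metadata: one row per image.
image_rows = [
    {"participant_id": "sub-001", "session_id": "ses-M000", "age": 55.0},
    {"participant_id": "sub-002", "session_id": "ses-M000", "age": 62.0},
]

SAMPLES_PER_IMAGE = 3  # assumed: every image yields three samples


def get_sample_info(idx: int, column: str) -> Any:
    """Return the metadata value of the image that sample `idx` comes from."""
    total = len(image_rows) * SAMPLES_PER_IMAGE
    if not 0 <= idx < total:
        raise IndexError(f"sample index {idx} out of range [0, {total})")
    # Assumed layout: samples 0..2 belong to image 0, samples 3..5 to image 1.
    image_idx = idx // SAMPLES_PER_IMAGE
    return image_rows[image_idx][column]


first = get_sample_info(0, "participant_id")  # sample from the first image
later = get_sample_info(4, "age")             # sample from the second image
```

Under these assumptions the dataset holds 2 × 3 = 6 samples, and every sample of an image reports that image's metadata.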
- abstract __len__() → int
Computes the total number of samples in the dataset.
- Returns:
int – Total number of samples in the dataset, i.e. the number of images times the number of samples per image.