clinicadl.data.datasets.ConcatDataset
- class clinicadl.data.datasets.ConcatDataset(datasets: Iterable[Dataset])
For assembling multiple Dataset objects (e.g., images coming from different BIDS datasets). ConcatDataset concatenates the input datasets, so the length of the new dataset equals the sum of the lengths of the individual datasets.
- Parameters:
datasets (Iterable[Dataset]) – The Dataset objects to concatenate.
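The length and indexing behaviour described above can be sketched with cumulative lengths. This is a minimal illustration, assuming a lookup scheme similar to the one used by torch.utils.data.ConcatDataset; SimpleConcat is a hypothetical name, the toy lists stand in for real datasets, and this is not clinicadl's actual implementation.

```python
from bisect import bisect_right
from itertools import accumulate


class SimpleConcat:
    """Hypothetical stand-in that concatenates any sized, indexable sequences."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Running totals of the lengths, e.g. [4, 12] for lengths 4 and 8.
        self.cumulative_sizes = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        # Total length is the sum of the individual lengths.
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Locate the first dataset whose cumulative size exceeds idx.
        dataset_idx = bisect_right(self.cumulative_sizes, idx)
        offset = self.cumulative_sizes[dataset_idx - 1] if dataset_idx > 0 else 0
        return self.datasets[dataset_idx][idx - offset]


# Toy (participant, session) pairs standing in for two datasets.
first = [("sub-001", "ses-M000")] * 4
second = [("sub-A", "ses-M003")] * 8
full = SimpleConcat([first, second])
```

With this scheme, indices below the length of the first dataset resolve to it, and the next index resolves to the first element of the second dataset.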
Examples
bids_1
├── sub-001
│   ├── ses-M000
│   │   └── pet
│   │       └── sub-001_ses-M000_pet.nii.gz
│   ...
...
bids_2
├── sub-A
│   ├── ses-M003
│   │   └── pet
│   │       └── sub-A_ses-M003_pet.nii.gz
│   ...
...

from clinicadl.data.datasets import BidsDataset, ConcatDataset
from clinicadl.io.bids import BidsFileType

bids_1 = BidsDataset("bids_1", file_type=BidsFileType(data_type="pet", suffix="pet"))
bids_2 = BidsDataset("bids_2", file_type=BidsFileType(data_type="pet", suffix="pet"))
full_dataset = ConcatDataset([bids_1, bids_2])

>>> len(bids_1)
4
>>> len(bids_2)
8
>>> len(full_dataset)
12
>>> full_dataset[0].participant_id, full_dataset[0].session_id
('sub-001', 'ses-M000')
>>> full_dataset[4].participant_id, full_dataset[4].session_id
('sub-A', 'ses-M003')
- property df
The concatenation of the metadata DataFrames of the underlying datasets.
- subset(particpants_sessions: Path | str | DataFrame | Iterable[tuple[str, str]]) → Self
To get a subset of the dataset from a list of (participant, session) pairs.
- Parameters:
data (Union[DataFrameType, Sequence[tuple[str, str]]]) –
Can be either:
a sequence of (participant, session) pairs;
a pandas.DataFrame (or a path to a TSV file containing the dataframe) with the list of (participant, session) pairs to extract. This list must be passed via two columns named "participant_id" and "session_id" (other columns won’t be considered).
- Returns:
Self – A subset of the original dataset, restricted to the (participant, session) pairs mentioned in data.
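The selection logic can be sketched as follows. This is an illustration only: rows are plain dictionaries rather than a real metadata DataFrame, and select_pairs is a hypothetical helper, not part of clinicadl.

```python
def select_pairs(rows, pairs):
    """Keep only the rows whose (participant_id, session_id) is in `pairs`.

    Other keys in each row are carried along but never used for matching.
    """
    wanted = set(pairs)
    return [
        row
        for row in rows
        if (row["participant_id"], row["session_id"]) in wanted
    ]


# Toy metadata; the "age" column is made up for illustration.
rows = [
    {"participant_id": "sub-001", "session_id": "ses-M000", "age": 71},
    {"participant_id": "sub-001", "session_id": "ses-M006", "age": 71},
    {"participant_id": "sub-002", "session_id": "ses-M000", "age": 65},
]
subset_rows = select_pairs(
    rows, [("sub-001", "ses-M000"), ("sub-002", "ses-M000")]
)
```

Matching on the pair, rather than on the participant alone, is what keeps other sessions of the same participant out of the subset.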
- get_sample_info(idx: int, column: str) → Any
Retrieves information on a given sample in the metadata DataFrame. The information corresponds to the information on the image the sample was extracted from.
- __len__() → int
Computes the total number of samples in the dataset.
- Returns:
int – Total number of samples in the dataset, i.e. the number of images times the number of samples per image.
- get_participant_session_couples() → set[tuple[str, str]]
Retrieves all (participant, session) pairs in the dataset.
- Returns:
set[tuple[str, str]] – The set of (participant, session) pairs.
- sanity_check(spatial_checks: Iterable[str | SpatialCheck] | None = ('affine', 'shape', 'global_spacing')) → None
Performs a sanity check on the current dataset.
It will iterate over the whole dataset to check if images are loaded and transformed correctly, and potentially perform spatial checks on the loaded images.
- Parameters:
spatial_checks (Optional[Iterable[str | SpatialCheck]], default=("affine", "shape", "global_spacing")) –
Spatial checks to perform on the images:
"spacing": checks intra-sample voxel spacing consistency, i.e. that all the images and masks in the Sample output by the current dataset have the same voxel spacing.
"affine": checks intra-sample affine matrix consistency (so it includes "spacing").
"shape": checks intra-sample spatial shape consistency.
"global_spacing": checks inter-sample voxel spacing consistency, i.e. that all the Samples in the dataset have the same voxel spacing (so it includes "spacing").
"global_shape": checks inter-sample spatial shape consistency (so it includes "shape").
If None, no spatial check is performed.
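The difference between intra-sample and inter-sample checks can be illustrated with a small sketch. It assumes each sample is represented as a dictionary mapping an image or mask name to its voxel spacing; the helper names are hypothetical, and the real checks operate on loaded Sample objects rather than on such dictionaries.

```python
def spacing_is_consistent(sample):
    """Intra-sample check: every image and mask in one sample shares a spacing."""
    return len(set(sample.values())) <= 1


def global_spacing_is_consistent(samples):
    """Inter-sample check: all samples share one spacing.

    This implies the intra-sample check, since every element of every
    sample must have the same spacing.
    """
    spacings = {s for sample in samples for s in sample.values()}
    return len(spacings) <= 1


# Toy spacings (in mm) for two samples, each with an image and a mask.
sample_a = {"image": (1.0, 1.0, 1.0), "mask": (1.0, 1.0, 1.0)}
sample_b = {"image": (2.0, 2.0, 2.0), "mask": (2.0, 2.0, 2.0)}

# Each sample is internally consistent, but the two differ from each other.
each_ok = all(spacing_is_consistent(s) for s in (sample_a, sample_b))
globally_ok = global_spacing_is_consistent([sample_a, sample_b])
```

Here the intra-sample check passes for both samples while the inter-sample check fails, which is the situation "global_spacing" is meant to catch.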