The usage of the nnUNet framework

July 14, 2025 (1mo ago)

I want to record nnU-Net, a widely recognized convolutional neural network framework for medical image segmentation that I encountered during summer research on AI-assisted medical imaging.

Repo Url: https://github.com/MIC-DKFZ/nnUNet

DeepWiki: https://deepwiki.com/MIC-DKFZ/nnUNet

U-Net variants such as U-Mamba are built on top of this framework.

Encoder: usually uses downsampling operations (such as convolution + pooling) to extract high-level semantic features layer by layer. In 3D U-Net, 3D convolutions process volumetric images (such as CT or MRI data).

Decoder: uses upsampling operations (such as transposed convolution) to gradually restore feature resolution, while skip connections integrate shallow information from the encoder to preserve spatial details.

Bottleneck: the deepest part between the encoder and decoder; it usually applies further processing (convolution + activation function, etc.) to distill high-level semantic information.

Skip connections: concatenate shallow features from the encoder with the upsampled features at the corresponding decoder level. This helps maintain spatial information and alleviates the vanishing-gradient problem.
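The concatenation step can be sketched with toy NumPy arrays. All shapes here are illustrative assumptions, not taken from any real nnU-Net configuration, and nearest-neighbour repetition stands in for a learned transposed convolution:

```python
import numpy as np

# Toy feature maps in (channels, height, width) layout -- illustrative only.
encoder_feat = np.random.rand(64, 32, 32)   # shallow encoder output
decoder_feat = np.random.rand(64, 16, 16)   # deeper decoder feature

# "Upsample" by nearest-neighbour repetition (a stand-in for transposed conv)
upsampled = decoder_feat.repeat(2, axis=1).repeat(2, axis=2)  # -> (64, 32, 32)

# Skip connection: concatenate along the channel axis
merged = np.concatenate([encoder_feat, upsampled], axis=0)    # -> (128, 32, 32)
print(merged.shape)
```

The doubled channel count (128) is why U-Net decoders typically apply a convolution right after each concatenation to fuse the two feature sources.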

U-Mamba combines the advantages of convolutional layers and state space models (SSM), capable of capturing local features while aggregating long-range dependencies, outperforming existing CNN- and Transformer-based segmentation networks.

① Usage Background (according to the project README)

Image datasets exhibit extreme diversity

Traditionally, when facing a new problem, it is necessary to manually design and optimize customized solutions—a process that is error-prone, difficult to scale, and largely depends on the skill of the practitioner. Not only must one consider numerous design choices and data attributes, but these factors are also tightly interrelated, making a reliable manual workflow optimization nearly impossible.

nnU-Net, on the other hand, is a semantic segmentation method that can automatically adapt to the given dataset. It analyzes the provided training cases and automatically configures a matching U-Net-based segmentation pipeline.

You only need to convert your dataset to the nnU-Net format; the framework then configures the rest automatically, which makes it well suited as a segmentation baseline and for training domain-specific models.

Robustness here means that the method remains stable and reliable in the face of perturbations and uncertainty.

② How it works

nnU-Net creates several U-Net configurations for each dataset: a 2D U-Net (2d), a full-resolution 3D U-Net (3d_fullres), and, for large images, a low-resolution 3D U-Net (3d_lowres) optionally followed by a full-resolution cascade (3d_cascade_fullres).

③ Data Structure

TaskXXX_TaskName/
├── dataset.json  # contains category labels, channel names (modalities), and metadata
├── imagesTr/     # training images (filenames contain a case ID, e.g. patient001_0000.nii.gz)
├── labelsTr/     # training labels
└── imagesTs/     # test images (optional)

Note: TaskXXX_TaskName is the older v1 naming convention; nnU-Net v2 names dataset folders DatasetXXX_Name (e.g. Dataset001_BrainTumour).

Example imagesTs/ contents:

imagesTs/
├── la_001_0000.nii.gz
├── la_002_0000.nii.gz
├── la_006_0000.nii.gz
├── ...
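The layout above can be scaffolded with a short script. This is a minimal sketch using placeholder names (Task001_Example, patient001_0000.nii.gz) drawn from the tree above, written into a temporary directory:

```python
from pathlib import Path
import tempfile

# Create the nnU-Net raw-data layout for a hypothetical dataset.
root = Path(tempfile.mkdtemp()) / "Task001_Example"
for sub in ("imagesTr", "labelsTr", "imagesTs"):
    (root / sub).mkdir(parents=True)
(root / "dataset.json").touch()

# A training image: case identifier + 4-digit channel suffix
(root / "imagesTr" / "patient001_0000.nii.gz").touch()

print(sorted(p.name for p in root.iterdir()))
```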

dataset.json format

{
 "channel_names": {  # formerly modalities
   "0": "T2",
   "1": "ADC"
 },
 "labels": {  # THIS IS DIFFERENT NOW!
   "background": 0,
   "PZ": 1,
   "TZ": 2
 },
 "numTraining": 32,
 "file_ending": ".nii.gz",
 "overwrite_image_reader_writer": "SimpleITKIO"  # optional! If not provided, nnUNet will automatically determine the ReaderWriter
}
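Note that the inline `#` comments in the example above are annotations only; a real dataset.json must be plain JSON. A sketch of generating a valid file programmatically, with values mirroring the example:

```python
import json
import tempfile
from pathlib import Path

# Values copied from the annotated example; comments are omitted
# because JSON does not allow them.
dataset = {
    "channel_names": {"0": "T2", "1": "ADC"},
    "labels": {"background": 0, "PZ": 1, "TZ": 2},
    "numTraining": 32,
    "file_ending": ".nii.gz",
}

path = Path(tempfile.mkdtemp()) / "dataset.json"
path.write_text(json.dumps(dataset, indent=2))

# Round-trip check: the file parses back to the same dict
assert json.loads(path.read_text()) == dataset
```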

The data format used for inference must be consistent with that used for training. Each filename must begin with a unique case identifier, followed by a four-digit channel (modality) identifier, e.g. la_001_0000.nii.gz.
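The naming convention can be checked with a regular expression. The pattern below is my own sketch, not something nnU-Net ships; the case-identifier part is whatever unique ID your dataset uses:

```python
import re

# "<case identifier>_<4-digit channel>.nii.gz"
pattern = re.compile(r"^(?P<case>.+)_(?P<channel>\d{4})\.nii\.gz$")

m = pattern.match("la_001_0000.nii.gz")
print(m.group("case"), m.group("channel"))  # la_001 0000

# A name without the 4-digit channel suffix does not match
print(pattern.match("bad_name.nii.gz"))     # None
```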

nnU-Net locates its working folders through three environment variables:

nnUNet_raw: original raw datasets

nnUNet_preprocessed: preprocessed datasets

nnUNet_results: model training results
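A sketch of setting the three variables from Python; the `/data/...` paths are placeholders for your own storage locations:

```python
import os

# Placeholder paths -- point these at your own folders.
os.environ["nnUNet_raw"] = "/data/nnUNet_raw"
os.environ["nnUNet_preprocessed"] = "/data/nnUNet_preprocessed"
os.environ["nnUNet_results"] = "/data/nnUNet_results"

# Any nnU-Net command launched from this process inherits the variables
print(os.environ["nnUNet_raw"])
```

In practice these are usually exported once in your shell profile (e.g. `export nnUNet_raw=/data/nnUNet_raw` in .bashrc); setting them in Python only affects processes launched from that interpreter.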

④ Common commands

MSD stands for Medical Segmentation Decathlon, a public dataset in the field of medical image segmentation.

Dataset-Related Processing

nnUNetv2_convert_MSD_dataset -i RAW_FOLDER
nnUNetv2_convert_old_nnUNet_dataset OLD_FOLDER
nnUNetv2_plan_and_preprocess -d DATASET_ID --verify_dataset_integrity

Model Training

nnUNetv2_train DATASET_NAME_OR_ID UNET_CONFIGURATION FOLD --val --npz
nnUNetv2_find_best_configuration DATASET_NAME_OR_ID -c CONFIGURATIONS

5-fold cross-validation is a common technique for evaluating model generalization in machine learning. The dataset is divided into five non-overlapping subsets ("folds"), and the model is trained and evaluated in five rounds: in each round, one fold serves as the validation set while the remaining four are combined for training.
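The splitting idea can be illustrated with the standard library alone. This is a minimal sketch over made-up case names; nnU-Net performs its own case-level splitting internally:

```python
# Ten hypothetical cases, split into 5 disjoint folds by striding.
cases = [f"case_{i:03d}" for i in range(10)]

k = 5
folds = [cases[i::k] for i in range(k)]  # 5 non-overlapping subsets of 2

for fold_id in range(k):
    val = folds[fold_id]                               # held-out fold
    train = [c for i, f in enumerate(folds) if i != fold_id for c in f]
    # Every round uses all cases exactly once, split 8 train / 2 val
    assert len(train) == 8 and len(val) == 2
```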

| Parameter | Description |
| --- | --- |
| dataset_name_or_id | dataset name or ID to train on (e.g. Task001_BrainTumour or 1) |
| configuration | training configuration, e.g. 3d_fullres, 2d, 3d_lowres, 3d_cascade_fullres |
| fold | cross-validation fold, an integer from 0 to 4 (supports 5-fold cross-validation) |

Inference and Postprocessing

nnUNetv2_predict -i INPUT_FOLDER -o OUTPUT_FOLDER -d DATASET_NAME_OR_ID -c CONFIGURATION --save_probabilities
| Category | Purpose | Key Commands |
| --- | --- | --- |
| Dataset Setup | Process raw datasets | nnUNetv2_extract_fingerprint, nnUNetv2_convert_MSD_dataset |
| Experiment Planning | Configure training parameters | nnUNetv2_plan_experiment |
| Data Preprocessing | Prepare data for training | nnUNetv2_preprocess |
| Full Preparation | Execute all preparation steps | nnUNetv2_plan_and_preprocess |
| Model Training | Train segmentation models | nnUNetv2_train |
| Inference | Run prediction tasks | nnUNetv2_predict, nnUNetv2_predict_from_modelfolder |
| Model Ensemble | Combine predictions from multiple models | nnUNetv2_ensemble |
| Model Evaluation | Evaluate model performance | nnUNetv2_find_best_configuration, nnUNetv2_evaluate_folder |
| Model Management | Share and reuse models | nnUNetv2_export_model_to_zip, nnUNetv2_download_pretrained_model_by_url |
| Tools & Utilities | Visualization and helper tools | nnUNetv2_plot_overlay_pngs |