Colorectal cancer is the third leading cause of cancer-related deaths worldwide. The localised stage of colon/rectal cancer has a high 5-year survival rate (91%/90%). However, according to the SEER statistics reported by the American Cancer Society, the survival rate decreases dramatically at the regional stage (72%/74%) and the distant stage (13%/17%). In clinical practice, regular colorectal screening is vital for cancer prevention, aiming to find and remove precancerous growths (e.g., abnormal colon or rectum polyps) before they turn malignant. This procedure usually relies on the physician's experience, and a less experienced physician may fail to identify precancerous conditions, motivating the need for automatic polyp segmentation techniques.
Over the past decades, fully-supervised strategies have been extensively explored to segment colorectal polyps in a data-driven manner. However, they all suffer from data-related limitations: 1) insufficiently diverse data: it is hard to collect diverse positive samples (i.e., those with polyps), since they occur with low frequency during colonoscopy compared to negative ones; and 2) expensive labelling: only experienced physicians can provide ground truth for medical images, which makes data annotation costly. To alleviate these limitations, data-efficient learning is a potential solution: it trains models with less human supervision, using semi-supervised or weakly-supervised strategies. However, these strategies still require an adequate number of positive samples during training. Further, compared with the fully-supervised setting, semi-labelled or weakly-labelled data can cause more serious model biases.
Alternatively, unsupervised anomaly segmentation solutions hypothesise that a model trained exclusively on normal colonoscopy images can identify anomalous regions when analysing an abnormal sample. Previous methods built a one-class classifier using contrastive learning, where auxiliary pretext tasks (e.g., using synthesised or augmented images) were designed to differentiate normal and abnormal patterns. Their performance relies heavily on an elaborately designed training pipeline and risks over-fitting to those pseudo-abnormal patterns. Reconstruction-based methods can avoid such problems by training on medical images in a self-supervised manner, where the essential assumption is that an autoencoder trained to rebuild in-distribution (ID) samples cannot reconstruct out-of-distribution (OOD) samples, i.e., colorectal polyps, as effectively. However, recent research shows that naive autoencoders can still reconstruct OOD samples with relatively low error, indicating that this framework cannot be used directly. To address this issue, Tian et al. presented a memory-augmented self-attention encoder and a multi-level cross-attention decoder based on a masked autoencoder with a large masking ratio, aiming to obtain high reconstruction error for the anomalous regions. Instead of complicating the model architecture with a different training pipeline, this paper streamlines the training procedure by using only negative (healthy) samples and then performs data-adaptive inference to identify anomalous regions.
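The reconstruction-based assumption above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `anomaly_map` and `segment`, the squared-error score, the fixed threshold, and the toy pixel values are all assumptions chosen for illustration, and the "reconstruction" is hard-coded rather than produced by a trained autoencoder.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def anomaly_map(original: Image, reconstruction: Image) -> Image:
    """Per-pixel squared reconstruction error (assumed scoring rule)."""
    return [
        [(o - r) ** 2 for o, r in zip(orig_row, rec_row)]
        for orig_row, rec_row in zip(original, reconstruction)
    ]


def segment(scores: Image, threshold: float) -> List[List[int]]:
    """Mark a pixel as anomalous (1) when its score exceeds the threshold."""
    return [[1 if s > threshold else 0 for s in row] for row in scores]


# Toy input: the right-hand column stands in for a polyp region that a
# model trained only on healthy images fails to reproduce.
original = [[0.2, 0.2, 0.9],
            [0.2, 0.2, 0.9]]
reconstruction = [[0.2, 0.25, 0.3],
                  [0.2, 0.2, 0.3]]

scores = anomaly_map(original, reconstruction)
mask = segment(scores, threshold=0.1)  # hypothetical threshold
```

In this toy case only the poorly reconstructed column is flagged. The weakness noted above is visible in the same setup: if the autoencoder reconstructs the anomalous column well, the scores collapse and nothing is flagged.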
Researchers argue that, compared with healthy data, medically abnormal data can be treated as OOD, allowing colorectal polyp segmentation to be defined as a per-pixel OOD detection task. The underlying assumption is that anomalous regions have a different distribution from healthy samples. Following the principles of reconstruction-based detection, this paper directly adopts the well-designed training pipeline of the masked autoencoder (MAE) and uses the reconstruction error as the anomaly score. Inference then becomes a pixel-wise OOD detection task, which benefits from the strong distribution-modelling capabilities of MAEs. However, researchers find that colorectal polyps vary significantly in appearance, leading to different representations in the latent space. Using the MAE directly is therefore problematic: the colorectal polyp features are not compactly distributed, which degrades the network's ability to identify them. To address this, this paper proposes feature space standardisation to produce a compact but distinctive feature representation for colorectal polyp regions, leading to a simple and effective inference stage.
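The text above does not spell out how feature space standardisation is computed. As a hedged illustration only, the sketch below assumes the common per-dimension z-score (zero mean, unit variance across a set of feature vectors); the function name `standardise`, the `eps` stabiliser, and the toy feature values are hypothetical and may differ from the paper's actual procedure.

```python
from statistics import fmean, pstdev
from typing import List


def standardise(features: List[List[float]], eps: float = 1e-8) -> List[List[float]]:
    """Z-score each feature dimension across all vectors (assumed definition)."""
    dims = list(zip(*features))            # one tuple per feature dimension
    means = [fmean(d) for d in dims]
    stds = [pstdev(d) for d in dims]       # population standard deviation
    return [
        [(x - m) / (s + eps) for x, m, s in zip(vec, means, stds)]
        for vec in features
    ]


# Toy latent features: two dimensions on very different scales.
features = [[1.0, 10.0],
            [2.0, 30.0],
            [3.0, 50.0]]

z = standardise(features)
```

After standardisation both dimensions share the same scale, so no single dimension dominates a distance- or error-based anomaly score. This is one plausible reading of how standardisation could make feature representations more compact and comparable across images and datasets.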
The main contributions are: 1) redefining the polyp segmentation task as an out-of-distribution detection problem; 2) learning a distribution for healthy samples using a masked autoencoder, which has the advantage of requiring only easily obtainable healthy samples for training; and 3) demonstrating that feature space standardisation improves the network's ability to identify anomalous regions at inference time and to generalise across datasets. The proposed approach performs strongly among unsupervised anomaly (i.e., polyp) segmentation methods.
See the article:
Ji, GP., Zhang, J., Campbell, D. et al. Rethinking Polyp Segmentation From An Out-of-distribution Perspective. Mach. Intell. Res. 21, 631–639 (2024).