PCT: Pyramid convolutional transformer for parotid gland tumor segmentation in ultrasound images. (March 2023)
- Record Type:
- Journal Article
- Title:
- PCT: Pyramid convolutional transformer for parotid gland tumor segmentation in ultrasound images. (March 2023)
- Main Title:
- PCT: Pyramid convolutional transformer for parotid gland tumor segmentation in ultrasound images
- Authors:
- Zhang, Gang
Zheng, Chenhong
He, Jianfeng
Yi, Sanli
- Abstract:
- Highlights: We propose a Pyramid Convolutional Transformer (PCT) architecture based on the shrinking pyramid framework and the Fusion Attention Transformer CNN (FTC) block for parotid gland tumor segmentation. In this architecture, the shrinking pyramid framework can effectively capture parotid gland tumor image features with dense pixels by integrating multi-scale dependencies of the images. The FTC block is constructed to address the complex and variable contour characteristics of parotid gland tumors; it combines a Transformer with a CNN in a dual-branch structure to better extract global and local image features. Furthermore, to strengthen the connections within sequence information, we improve the multi-head attention mechanism (MHSA) into a multi-head fusion attention (MHFA) module and introduce it into the Transformer branch. Finally, we establish the PCT architecture based on FTC blocks and the pyramid framework for accurate segmentation of parotid gland tumor images.
Abstract: Preoperative segmentation of parotid gland tumor regions using deep learning is of great significance for treatment decisions. However, there are still two major limitations: to the best of our knowledge, no networks are designed specifically for parotid gland tumor segmentation tasks; and neither a convolutional neural network (CNN) nor a Transformer alone can extract both global and local features. To address these issues, we first propose a Pyramid Convolutional Transformer (PCT) architecture based on the shrinking pyramid framework and the Fusion Attention Transformer CNN (FTC) block for parotid gland tumor segmentation. In this architecture, the shrinking pyramid framework can effectively capture parotid gland tumor image features with dense pixels by integrating multi-scale dependencies of the images. The FTC block is constructed to address the complex and variable contour characteristics of parotid gland tumors; it combines a Transformer with a CNN in a dual-branch structure to better extract global and local image features. The experimental results suggest that the proposed PCT achieved an intersection-over-union (IoU) of 0.8434 and a Dice similarity coefficient (Dice) of 0.9151 on the parotid gland tumor segmentation (PGTSeg) dataset, and attained new state-of-the-art performance on multiple challenging benchmarks, with an IoU of 0.8521 on MoNuSeg and an IoU of 0.9080 on ISIC 2018. Meanwhile, common backbones equipped with the FTC block outperformed the baseline model. The code and models will be available at: https://github.com/Twoverz/PCT-Pyramid-Convolutional-Transformer.
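Note on the metrics: the abstract reports results as IoU and Dice scores. The sketch below shows the standard definitions of both for a single pair of binary masks. It is an illustration only, not the authors' evaluation code: the function name `iou_and_dice`, the nested-list mask format, and the empty-mask convention are assumptions, and this record does not say how the paper averages scores across images.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the PCT repository.
    """
    flat_pred = [bool(v) for row in pred for v in row]
    flat_target = [bool(v) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labelled as tumor in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_target))
    pred_area = sum(flat_pred)
    target_area = sum(flat_target)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


# Toy 2x3 masks: 3 predicted pixels, 3 reference pixels, 2 shared.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
iou, dice = iou_and_dice(pred, target)
```

For a single mask pair the two scores are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU; averaged scores over a dataset need not satisfy this identity exactly.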
- Is Part Of:
- Biomedical signal processing and control. Volume 81(2023)
- Journal:
- Biomedical signal processing and control
- Issue:
- Volume 81(2023)
- Issue Display:
- Volume 81, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 81
- Issue:
- 2023
- Issue Sort Value:
- 2023-0081-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-03
- Subjects:
- Transformer -- Attention mechanism -- Parotid gland tumor -- Medical image segmentation -- Dense pixel prediction -- Convolutional neural network
Signal processing -- Periodicals
Biomedical engineering -- Periodicals
Signal Processing, Computer-Assisted -- Periodicals
Image Processing, Computer-Assisted -- Periodicals
Biomedical Engineering -- Periodicals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/17468094
http://www.elsevier.com/journals
http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%2329675%232006%23999989998%23626449%23FLA%23&_cdi=29675&_pubType=J&_auth=y&_acct=C000045259&_version=1&_urlVersion=0&_userid=836873&md5=664b5cf9a57fc91971a17faf20c32ec1
- DOI:
- 10.1016/j.bspc.2022.104498
- Languages:
- English
- ISSNs:
- 1746-8094
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2087.880400
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25985.xml