Translating medical image to radiological report: Adaptive multilevel multi-attention approach. (June 2022)
- Record Type:
- Journal Article
- Title:
- Translating medical image to radiological report: Adaptive multilevel multi-attention approach. (June 2022)
- Main Title:
- Translating medical image to radiological report: Adaptive multilevel multi-attention approach
- Authors:
- Gajbhiye, Gaurav O.
Nandedkar, Abhijeet V.
Faye, Ibrahima
- Abstract:
- Highlights: A novel encoder-decoder-based "Adaptive Multilevel Multi-Attention Approach" is proposed to generate a medical report for any view of a chest X-ray image. The proposed model is trained to generate an admissible medical report by incorporating medical imaging and radiological information via adaptive attention mechanisms. A Residual Attention Module is introduced to intensify the characteristics of standard feature maps, which further aids the multi-label classification task. Two specific adaptive attention mechanisms are introduced, one for focusing on the proper visual region by extending the semantic information at each time step and another for attending to relevant semantic knowledge aligned by visual features. A statistics-based "Unique Index" parameter is proposed to measure the general ability of the model to generate unbiased and unique reports.
Abstract: Background and Objective: Medical imaging techniques are widely employed in disease diagnosis and treatment. A readily available medical report can be a useful tool in assisting an expert in investigating a patient's health. A radiologist can benefit from an automatic medical image to radiological report translation system while preparing a final report. Previous attempts at automatic medical report generation include image captioning algorithms that do not take domain-specific visual and textual content into account, which raises questions about the credibility of the generated reports. Methods: In this work, a novel Adaptive Multilevel Multi-Attention (AMLMA) approach is proposed by offering domain-specific visual-textual knowledge to generate a thorough and believable radiological report for any view of a human chest X-ray image. The proposed approach leverages the encoder-decoder framework incorporated with multiple adaptive attention mechanisms. The potential of a convolutional neural network (CNN) with a residual attention module (RAM) is demonstrated as a strong visual encoder for multi-label abnormality detection. The multilevel visual features (local and global) are extracted from the proposed visual encoder to retrieve regional-level and abstract-level radiology-based semantic information.
The Word2Vec and FastText word embeddings are trained on medical reports to acquire radiological knowledge and are further used as textual encoders, feeding a Bi-directional Long Short-Term Memory (Bi-LSTM) network that learns the co-relationships between medical terms in radiological reports. The AMLMA employs a weighted multilevel association of adaptive visual-semantic attention and visual-based linguistic attention mechanisms. This association of adaptive attention is exploited as a decoder and produces significant improvements in the report generation task. Results: The proposed approach is evaluated on the publicly available Indiana University chest X-ray (IU-CXR) dataset. The CNN with RAM shows a significant improvement in recall (0.4423), precision (0.1803) and F1-score (0.2551) for the prediction of multiple abnormalities in an X-ray image. The results of the language generation metrics for the proposed variants were acquired using the COCO-caption evaluation Application Program Interface (API). The AMLMA model with trained embeddings generates convincing radiology reports and outperforms state-of-the-art (SOTA) approaches, with high evaluation metric scores for Bleu-4 (0.172), Meteor (0.247), Rouge_L (0.376) and CIDEr (0.381). In addition, a new "Unique Index" (UI) statistic is introduced to highlight the model's ability to generate unique reports. Conclusion: The overall architecture aids in understanding various X-ray image views and in generating relevant normal and abnormal radiography statements. The proposed model emphasizes multi-level visual-textual knowledge with an adaptive attention mechanism to balance visual and linguistic information for the generation of an admissible radiology report.
- Is Part Of:
- Computer methods and programs in biomedicine. Volume 221(2022)
- Journal:
- Computer methods and programs in biomedicine
- Issue:
- Volume 221(2022)
- Issue Display:
- Volume 221, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 221
- Issue:
- 2022
- Issue Sort Value:
- 2022-0221-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Radiology report generation -- X-ray -- Encoder-Decoder -- Residual attention module -- Multilevel multi-attention mechanism -- Radiology-trained word embedding
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cmpb.2022.106853
- Languages:
- English
- ISSNs:
- 0169-2607
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22255.xml