Audiovisual Multimodal Cough Data Analysis for Tuberculosis Detection

Jyoti Yadav, Aparna S. Varde, Hao Liu, George Antoniou, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Early detection of tuberculosis (TB) remains a critical challenge. This research presents a novel approach that uses audio information from cough recordings to predict TB. We move beyond traditional image-based methods (such as sputum smear microscopy and chest X-rays) and explore the feasibility of using cough recordings to differentiate TB cases. Two main audio processing techniques, Mel-Spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs), are used to encode the audio recordings as features for deep learning models that perform TB classification. Our proposed methods use a large challenge dataset comprising clinical data from over 1,105 participants and over 502,252 cough recordings. Notably, a simple 1D convolutional neural network (CNN) trained on MFCC features achieves an accuracy of 91%, exceeding the World Health Organization's (WHO) requirements for TB screening tests. Our findings highlight the potential of MFCC features and 1D CNNs for accurate TB detection from cough sound data. This approach aligns with the Occam's Razor principle, favoring the simpler model (such as a 1D CNN) when simpler and more complex models both achieve good results. This research opens the door to further study in diverse populations and to translation into accessible TB screening solutions, especially in resource-limited settings where only cough recordings can be collected, which underlines its potential real-world impact.
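To make the feature-encoding step in the abstract concrete, the following is a minimal sketch of how MFCCs can be computed from an audio signal and then passed through a single 1D convolution over time. It is not the authors' pipeline: the sample rate, frame length, hop size, number of mel filters, number of coefficients, and the kernel weights are all illustrative assumptions, and the input is a synthetic decaying tone rather than a real cough recording. The code uses only the Python standard library so that it is self-contained; in practice an audio library and a deep learning framework would normally be used instead.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    mels = [i * mel_max / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * n_bins
        for k in range(left, center):      # rising edge
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak of 1.0 at center
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def mfcc_frame(frame, bank, n_mfcc):
    """Log mel energies of one frame followed by an unnormalized DCT-II."""
    spec = power_spectrum(frame)
    log_mel = [math.log(max(sum(w * p for w, p in zip(filt, spec)), 1e-10))
               for filt in bank]
    m = len(log_mel)
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / m)
                for j, e in enumerate(log_mel))
            for c in range(n_mfcc)]

def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_mfcc=13):
    """Return a list of frames, each a list of n_mfcc coefficients."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    return [mfcc_frame(signal[s:s + frame_len], bank, n_mfcc)
            for s in range(0, len(signal) - frame_len + 1, hop)]

def conv1d_relu(frames, kernel, bias=0.0):
    """Valid 1D convolution along the time axis (one output channel), then ReLU."""
    k = len(kernel)
    out = []
    for t in range(len(frames) - k + 1):
        acc = bias
        for dt in range(k):
            acc += sum(w * x for w, x in zip(kernel[dt], frames[t + dt]))
        out.append(max(acc, 0.0))
    return out

# Synthetic stand-in for a cough: a decaying two-tone burst (assumption, not real data).
SAMPLE_RATE = 8000
signal = [math.exp(-6.0 * i / 1024)
          * (math.sin(2 * math.pi * 300 * i / SAMPLE_RATE)
             + 0.5 * math.sin(2 * math.pi * 1200 * i / SAMPLE_RATE))
          for i in range(1024)]

features = mfcc(signal, SAMPLE_RATE)            # frames x coefficients
kernel = [[0.01] * 13 for _ in range(3)]        # arbitrary weights, kernel width 3
activations = conv1d_relu(features, kernel)     # one feature map over time
print(len(features), len(features[0]), len(activations))
```

The point of the sketch is the data layout: MFCC extraction turns a waveform into a short sequence of coefficient vectors, and a 1D CNN slides its kernels along the time axis of that sequence, which is a much smaller input than a full spectrogram image.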

Original language: English
Title of host publication: 15th International Conference on Information, Intelligence, Systems and Applications, IISA 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350368833
DOIs
State: Published - 2024
Event: 15th International Conference on Information, Intelligence, Systems and Applications, IISA 2024 - Chania, Greece
Duration: 17 Jul 2024 → 20 Jul 2024

Publication series

Name: 15th International Conference on Information, Intelligence, Systems and Applications, IISA 2024

Conference

Conference: 15th International Conference on Information, Intelligence, Systems and Applications, IISA 2024
Country/Territory: Greece
City: Chania
Period: 17/07/24 → 20/07/24

Keywords

  • AI in health
  • audiovisual data
  • CNN models
  • holistic methods
  • Mel-Spectrogram
  • MFCC
  • sustainable AI
  • TB
