Abstract Details

Title Machine-learning Assisted Swallowing Assessment: A Deep Learning-based Quality Improvement Tool to Screen for Post-stroke Dysphagia

Topic Cerebrovascular Disease and Interventional Neurology

Presentation(s) S6 - Stroke Pathophysiology and Prediction (4:06 PM-4:18 PM)

Poster/Presentation Number 004

Background

Post-stroke dysphagia is common and associated with significant morbidity and mortality. Existing tools to assess for dysphagia include the gold standard (barium swallow test) and screening tools administered by trained professionals. Each approach has drawbacks, including cost, human-resource requirements, and subjectivity. Patients often must wait for a swallowing assessment and are not permitted to eat or drink by mouth in the meantime, which negatively affects their quality of life and outcomes. In this study, we examine the application of convolutional neural networks (CNNs) to rapidly classify patient swallowing status using voice samples alone.

Objective To develop a proof-of-concept machine learning classifier based on voice analysis to screen for post-stroke dysphagia, thereby decreasing screening subjectivity and potentially improving access to screening by bedside providers.

Design/Methods Vocal samples from 68 post-stroke patients on the neurovascular ward at Sunnybrook Hospital (Toronto, Canada) were studied (mean age 68 ± 16 years), with 40 patients in the training cohort and 28 in the testing cohort. Samples consisted of vowel sounds and the speech components of the National Institutes of Health Stroke Scale (NIHSS). Patients were labeled according to dysphagia screening status (Toronto Bedside Swallowing Screening Test). Individual vocal samples were then segmented into 1,579 audio clips (0.5-sec clips, 50% overlap) and converted into 6,655 Mel-spectrograms (224x224-pixel images), which were used to train two convolutional neural networks (DenseNet and ConvNeXt) separately and in ensemble.
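
The clip segmentation described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline. Only the 0.5-second clip length and the 50% overlap come from the abstract. The function name `segment_clips`, the 16 kHz sampling rate in the usage lines, and the choice to drop a trailing partial clip are assumptions made for the example.

```python
def segment_clips(samples, sample_rate, clip_sec=0.5, overlap=0.5):
    """Split a 1-D sequence of audio samples into fixed-length, overlapping clips.

    Hypothetical helper: clip_sec=0.5 and overlap=0.5 mirror the abstract;
    dropping the trailing partial clip is an assumption.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    clip_len = int(round(clip_sec * sample_rate))   # samples per clip
    hop = int(round(clip_len * (1.0 - overlap)))    # step between clip starts
    if clip_len <= 0 or hop <= 0:
        raise ValueError("clip length and hop must be at least one sample")
    clips = []
    # Only full-length clips are kept; a shorter remainder at the end is dropped.
    for start in range(0, len(samples) - clip_len + 1, hop):
        clips.append(samples[start:start + clip_len])
    return clips


# Usage: a 2-second silent recording at an assumed 16 kHz sampling rate.
recording = [0.0] * (2 * 16000)
clips = segment_clips(recording, sample_rate=16000)
print(len(clips), len(clips[0]))  # prints "7 8000"
```

With 50% overlap, consecutive clips share half their samples, so a recording yields roughly twice as many clips as it would without overlap. Each clip would then be converted to a Mel-spectrogram image before being passed to the networks.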

Results Clip-level and patient-level swallowing-status predictions were obtained through an unweighted averaging ensemble method. The ensemble network achieved an F1-score of 0.81 and an area under the receiver operating characteristic curve of 0.912, with a sensitivity of 0.89 and a specificity of 0.79.
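
A minimal sketch of unweighted ensemble averaging and of the reported screening metrics follows. It is illustrative only. The abstract does not state how clip-level outputs were combined into patient-level predictions, so averaging clip probabilities per patient is an assumption, as are the function names and the toy inputs, which are not study data.

```python
from collections import defaultdict


def ensemble_clip_probs(probs_a, probs_b):
    """Unweighted average of two models' per-clip dysphagia probabilities."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same clips")
    return [(a + b) / 2.0 for a, b in zip(probs_a, probs_b)]


def patient_level_probs(clip_probs, patient_ids):
    """Mean clip probability per patient (assumed aggregation rule)."""
    grouped = defaultdict(list)
    for prob, pid in zip(clip_probs, patient_ids):
        grouped[pid].append(prob)
    return {pid: sum(vals) / len(vals) for pid, vals in grouped.items()}


def screening_metrics(y_true, y_pred):
    """Sensitivity, specificity, and F1 from binary labels (1 = failed screen)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan,
    }


# Toy example (made-up numbers): two models score four clips from two patients.
clip_probs = ensemble_clip_probs([1.0, 0.5, 0.25, 0.0], [0.5, 0.5, 0.25, 0.5])
per_patient = patient_level_probs(clip_probs, ["p1", "p1", "p2", "p2"])
print(per_patient)  # prints "{'p1': 0.625, 'p2': 0.25}"
```

A patient-level label could then be obtained by thresholding each averaged probability (for example at 0.5, also an assumption) and passed to `screening_metrics` together with the reference screening labels.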

Conclusions Our study demonstrates the feasibility and effectiveness of applying state-of-the-art CNNs to classify Mel-spectrogram images of vocalizations for the detection of post-stroke dysphagia. This study is relevant to healthcare professionals caring for stroke patients and may offer an avenue for developing rapid, non-invasive, and more objective dysphagia screening tools.

Authors/Disclosures
Arjun Balachandar, MD PRESENTER	Dr. Balachandar has nothing to disclose.
Rami Saab	No disclosure on file
Hamza Mahdi	No disclosure on file
Eptehal Nashnoush	No disclosure on file
Houman Khosravani, MD, PhD (Sunnybrook Health Sciences Centre, University of Toronto)	Dr. Khosravani has nothing to disclose.