(344f) Feature Engineering and Machine Learning for Computer-Assisted Screening of Children with Speech Disorders | AIChE

Authors 

Yousefi Zowj, F. - Presenter, Auburn University
Suthar, K., Auburn University
He, Q. P., Auburn University
Speights Atkins, M., Auburn University
Speech-language deficits are among the most prevalent childhood disabilities, affecting about 1 in 12 children between the ages of three and five [1]. Approximately 40% of children with speech and language disorders do not receive intervention because their impairment goes undetected [2, 3]. Yet early identification and treatment of communication disorders are essential for school readiness and significantly improve communication, literacy, and mental health outcomes for young children [1, 4, 5]. A further complication is that some children may be reluctant to participate in long testing sessions [6], and even when they do, transcription of large sets of audio recordings is time-consuming and demands a high level of expertise from therapists [7, 8]. These limitations have created a growing need for automated methods that quantify child speech patterns and support the diagnosis of impaired speech [9].

To achieve this goal, we use an acoustic landmark detection (ALD) tool, the SpeechMark® system [10], to extract landmarks and syllabic clusters from 39 children with typically developing (TD) speech and 12 children with speech disorders (SD) uttering 11 triage words. Instead of utilizing kernel-type or algorithmically generated features, which often lack physical meaning or intuition, we produce knowledge-guided features. The goal is to create robust, meaningful, and predictive features that optimize automated screening. Our feature engineering strategies focus on: 1) multiplex/multivariate feature sets in which the interrelations among features are considered; 2) integration of both landmark (LM) and syllabic cluster (SC) features, so that both articulatory energy-based and rule-based features are represented; and 3) significant expansion of our previous feature pool (which focused on counts or scaled counts) by also considering between-feature relations (e.g., ratios), patterns and clusters (e.g., bi-grams, tri-grams), and dynamics and transitions (e.g., durations, strength differences of consecutive LMs). Using these feature engineering strategies and robust machine learning (ML) approaches, we build classification models that distinguish the two groups using only one word (flowers) of the 11 triage words. As in many medical diagnosis problems, positive samples are limited in this study, which leads to a class imbalance problem. We use the Synthetic Minority Oversampling Technique (SMOTE) to address this imbalance. It is worth noting that, to avoid overfitting, SMOTE samples are generated from training samples only and used only during training. Test data are set aside and are neither used to generate SMOTE samples nor mixed with synthetic samples, ensuring the integrity of the evaluation.
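The train-only use of SMOTE described above can be sketched as follows. This is a minimal, self-contained illustration of the SMOTE idea (synthesizing minority samples by interpolating between a minority sample and one of its nearest minority-class neighbors), not the implementation used in the study; the function name and parameters are hypothetical, and in practice a library such as imbalanced-learn would be used.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, rng=None):
    """Synthesize n_new minority samples by interpolating between a
    randomly picked minority sample and one of its k nearest
    minority-class neighbours (the core SMOTE idea)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # pairwise Euclidean distances within the minority class only
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                  # exclude self-matches
    nbrs = np.argsort(d, axis=1)[:, :k]          # k nearest neighbours per sample
    new = []
    for _ in range(n_new):
        i = rng.integers(n)                      # random minority sample
        j = nbrs[i, rng.integers(min(k, n - 1))] # one of its neighbours
        lam = rng.random()                       # interpolation factor in [0, 1)
        new.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(new)

# Crucially, the train/test split happens BEFORE oversampling: synthetic
# points are built from the training minority samples only, added to the
# training set only, and never appear in the held-out test set.
```

Because each synthetic point is a convex combination of two real minority samples, the augmented training set stays within the minority class's feature range while balancing the class counts seen by the classifier.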
Using robust machine learning techniques and leveraging class priors/weights, we evaluate the proposed approach as a speech disorder screening tool. We compare four classification algorithms: linear discriminant analysis (LDA), support vector machines (SVM), extreme gradient boosting (XGBoost), and random forests (RF). In addition, we compare the performance of models built on raw data with that of models built on our knowledge-guided SpeechMark (SM)-based features, demonstrating the effectiveness of the proposed features. Finally, we apply feature selection to find a minimum set of features to be deployed in speech disorder screening. The high sensitivity and specificity achieved by the proposed framework, which integrates feature engineering and selection, SMOTE sampling, and robust machine learning, can provide a reliable screening tool for identifying children at risk for speech disorders and has the potential to be used in clinics.
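The classifier comparison with class priors/weights can be sketched in scikit-learn as below. The data here are synthetic stand-ins sized like the study's cohort (39 TD vs. 12 SD), not the actual SpeechMark features; XGBoost is omitted to keep the sketch dependent on scikit-learn alone (its `scale_pos_weight` parameter plays the analogous class-weighting role).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the 39 TD / 12 SD feature table.
X, y = make_classification(n_samples=51, n_features=10,
                           weights=[39 / 51], random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "LDA": LinearDiscriminantAnalysis(priors=[0.5, 0.5]),  # equal class priors
    "SVM": SVC(kernel="linear", class_weight="balanced"),
    "RF":  RandomForestClassifier(n_estimators=200,
                                  class_weight="balanced",
                                  random_state=0),
}

# Balanced accuracy (mean of per-class recalls) is a sensible metric
# when one class is roughly three times larger than the other.
scores = {name: cross_val_score(m, X, y, cv=cv,
                                scoring="balanced_accuracy").mean()
          for name, m in models.items()}
```

In a full pipeline, the SMOTE step would be fit inside each cross-validation fold's training split, and a feature-selection step (e.g., recursive feature elimination) could be chained before the classifier to search for a minimal deployable feature set.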

References:

[1] Prelock, P. A., Hutchins, T., & Glascoe, F. P. (2008). Speech-language impairment: How to identify the most common and least diagnosed disability of childhood. The Medscape Journal of Medicine, 10(6), 136.

[2] National Academies of Sciences, Engineering, and Medicine. (2016). Speech and language disorders in children: Implications for the Social Security Administration's Supplemental Security Income program.

[3] Nelson, H. D., Nygren, P., Walker, M., & Panoscha, R. (2006). Screening for speech and language delay in preschool children: systematic evidence reviews for the US Preventive Services Task Force. Pediatrics, 117(2), e298-e319.

[4] Irwin, C. E., Adams, S. H., Park, M. J., & Newacheck, P. W. (2009). Preventive care for adolescents: few get visits and fewer get services. Pediatrics, 123(4), e565-e572.

[5] American Academy of Pediatrics. (2006). Council on children with disabilities, section on developmental behavioral pediatrics, Bright Futures Steering Committee, Medical Home Initiatives for Children with Special Needs Project Advisory Committee. Identifying infants and young children with developmental disorders in the medical home: An algorithm for developmental surveillance and screening. Pediatrics, 118(1), 405-420.

[6] Tyler, A. A., & Tolbert, L. C. (2002). Speech-language assessment in the clinical setting.

[7] Stoel-Gammon, C. (2001). Transcribing the speech of young children. Topics in language disorders, 21(4), 12-21.

[8] Ball, M. J., & Rahilly, J. (2002). Transcribing disordered speech: The segmental and prosodic layers. Clinical linguistics & phonetics, 16(5), 329-344.

[9] Berisha, V., Utianski, R., & Liss, J. (2013, May). Towards a clinical tool for automatic intelligibility assessment. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 2825-2828). IEEE.

[10] Boyce, S., Fell, H., & MacAuslan, J. (2012). SpeechMark: Landmark detection tool for speech analysis. In Thirteenth Annual Conference of the International Speech Communication Association.