Speech Emotion Recognition for Autism Spectrum Disorder

Authors

  • Sailaja Maddela, PVPSIT

DOI:

https://doi.org/10.55434/CBI.2023.20105

Keywords:

Autism Spectrum Disorder (ASD), Mel frequency cepstral coefficients (MFCC), CNN, MLP, RAVDESS, TESS

Abstract

Our project demonstrates how an emotion can be inferred from an audio recording of a speaker. The primary goal of this project is to assist children with Autism Spectrum Disorder (ASD), who often cannot identify emotion from speech and may benefit from a system that can do so. We concentrate mostly on this model's fundamental problem of high variance (overfitting), which may be caused by the limited number of audio recordings used to train the model. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and the Toronto Emotional Speech Set (TESS) served as training datasets to help the model overcome the overfitting issue. Noise removal is performed as part of the pre-processing step using Python algorithms, and Mel-frequency cepstral coefficients (MFCC) are extracted to represent the audio features. While numerous approaches support human-computer interaction, in this work we apply a Convolutional Neural Network (CNN) and a Multilayer Perceptron (MLP) to recognize and categorize the output. Our final model classifies eight emotions (anger, calm, disgust, fearful, happy, neutral, sad, and surprised) with improved accuracy.
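
The sketch below illustrates the pipeline the abstract describes: extract MFCC features from a clip and feed them to a classifier. It is a minimal sketch, assuming the librosa library for audio loading and MFCC extraction and scikit-learn's MLPClassifier as the network; the paper's actual architecture, hyperparameters, and noise-removal step are not given here, and train_files/train_labels are hypothetical placeholders (RAVDESS and TESS both encode the emotion label in the file name, so labels can be parsed from there).

    import numpy as np
    import librosa
    from sklearn.neural_network import MLPClassifier

    # The eight emotion classes named in the abstract.
    EMOTIONS = ["anger", "calm", "disgust", "fearful",
                "happy", "neutral", "sad", "surprised"]

    def extract_mfcc(path, n_mfcc=40):
        """Load one audio clip and return its mean MFCC vector."""
        signal, sr = librosa.load(path, sr=None)    # keep native sample rate
        mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
        return np.mean(mfcc.T, axis=0)              # average over time frames

    # Hypothetical training loop over a list of labeled RAVDESS/TESS files:
    # X = np.array([extract_mfcc(f) for f in train_files])
    # y = np.array(train_labels)
    # clf = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=500)
    # clf.fit(X, y)
    # print(clf.predict([extract_mfcc("test_clip.wav")]))  # e.g. ["happy"]

Averaging the MFCC matrix over time yields one fixed-length vector per clip, which is what a plain MLP expects; a CNN variant would instead consume the full time-frequency MFCC matrix.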

Published

31-12-2023

How to Cite

Maddela, S. (2023). Speech Emotion Recognition for Autism Spectrum Disorder: 10.55434/CBI.2023.20105. Caribbean Journal of Sciences and Technology, 11(2), 24–31. https://doi.org/10.55434/CBI.2023.20105

Issue

Vol. 11 No. 2 (2023)

Section

Research Article