Assessing Noise Impact on Machine Learning Models for Voice Disorder Evaluation

Deep learning has become a powerful tool for classifying pathological voices, particularly in the GRBAS (Grade, Roughness, Breathiness, Asthenia, Strain) scale assessment. The GRBAS scale is a standardized method clinicians use to evaluate voice disorders based on auditory-perceptual judgment. Traditional methods for classifying pathological voices often rely on manual feature extraction and subjective analysis, which can be time-consuming and inconsistent. Deep learning techniques such as 1D convolutional neural networks (1D-CNNs) offer significant advantages by automatically learning relevant features from raw audio data, capturing complex patterns and nuances indicative of specific pathological conditions.

However, noise can significantly impact the accuracy of these models. Since they rely on extracting subtle features from voice signals, any background noise or distortion can obscure important characteristics, leading to misclassification. Noise from recording environments, equipment, or background sounds poses a critical challenge in developing reliable voice pathology detection systems. Preprocessing techniques like noise reduction and signal enhancement are often employed, but they are not always sufficient to eliminate the effects of noise on classification performance.

In this context, a paper recently published in the journal The Laryngoscope assesses the impact of background noise on machine learning models used to evaluate the GRBAS scale in voice disorder assessments.

In this study, the authors created a unique dataset from clinical patients’ voice samples recorded in a soundproof room. These samples were rated according to the GRBAS scale by otolaryngologists and an expert speech and language therapist. The ratings’ median values were adopted as the correct answers, and inter-rater agreement was evaluated using Krippendorff’s alpha.
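To make the labeling step concrete, here is a minimal sketch of how median labels and Krippendorff’s alpha could be computed from a table of ratings. It is not the authors’ code. The function names are hypothetical, `median_low` is an assumed tie-breaking choice for an even number of raters, and the squared-difference (interval) distance is an assumption, since the article does not say which variant of alpha was used.

```python
from statistics import median_low


def median_labels(ratings):
    """Collapse each sample's list of rater scores into one label.

    median_low keeps the label an integer grade when the number of
    raters is even (an assumed tie-breaking rule).
    """
    return [median_low(scores) for scores in ratings]


def krippendorff_alpha(ratings, delta=lambda a, b: (a - b) ** 2):
    """Krippendorff's alpha = 1 - D_o / D_e.

    `ratings` holds one list of scores per sample; samples with fewer
    than two scores are not pairable and are skipped. `delta` is the
    distance between two scores (interval metric by default, assumed).
    """
    units = [scores for scores in ratings if len(scores) >= 2]
    values = [v for scores in units for v in scores]
    n = len(values)

    # Observed disagreement: ordered pairs of scores within each sample.
    d_o = sum(
        sum(delta(a, b)
            for i, a in enumerate(scores)
            for j, b in enumerate(scores) if i != j) / (len(scores) - 1)
        for scores in units
    ) / n

    # Expected disagreement: ordered pairs across all pooled scores.
    d_e = sum(
        delta(a, b)
        for i, a in enumerate(values)
        for j, b in enumerate(values) if i != j
    ) / (n * (n - 1))

    if d_e == 0:
        return float("nan")  # no variation at all: alpha is undefined
    return 1.0 - d_o / d_e
```

With three raters per sample, `median_labels([[1, 2, 2], [0, 0, 1]])` gives one grade per sample, and `krippendorff_alpha` on the same table gives a single agreement score where 1.0 means perfect agreement.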

The machine learning model was a 5-layer 1D-CNN, constructed and evaluated using TensorFlow. The dataset was divided into 80% training, 10% validation, and 10% test data. The training process was conducted without noise data. Gaussian noise of various intensities was added to the test samples to assess noise resilience. The model’s performance was evaluated using accuracy, F1 score, and quadratic weighted Cohen’s kappa score under different noise conditions. The study highlights the significance of noise as a challenge in applying machine learning models to real-world scenarios like examination rooms.
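The noise test and the kappa metric described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the paper’s implementation: the article does not say how noise intensity was parameterized, so a target signal-to-noise ratio in decibels is assumed here, and both function names are hypothetical. The default of four classes reflects the 0 to 3 grades of the GRBAS scale.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise added.

    The noise variance is chosen so that signal power / noise power
    matches `snr_db` (an assumed way of expressing noise intensity).
    """
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def quadratic_weighted_kappa(y_true, y_pred, n_classes=4):
    """Cohen's kappa with quadratic weights for labels 0..n_classes-1."""
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]

    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Penalty grows with the squared distance between grades.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            num += weight * observed[i][j]
            den += weight * true_hist[i] * pred_hist[j] / n
    if den == 0:
        return float("nan")  # every label in one class: kappa undefined
    return 1.0 - num / den


# Illustrative use: a synthetic tone stands in for a voice recording.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noisy = add_gaussian_noise(clean, snr_db=10, rng=rng)
```

Evaluating a trained model on `noisy` versions of the test set at several SNR values, and computing `quadratic_weighted_kappa` at each, would give a degradation curve of the kind the study reports.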

On the dataset of voice samples, balanced for age and gender, the deep learning model performed well with noise-free data. As Gaussian noise intensity increased, performance metrics dropped significantly, with accuracy falling dramatically at the highest noise level. This degradation was observed across all GRBAS parameters, with certain scales showing the most significant declines.

The study found that background noise severely affects the model’s accuracy and performance metrics. The model’s effectiveness decreased as noise levels increased, highlighting its vulnerability to real-world conditions. Certain GRBAS components were more sensitive to noise. The study suggests incorporating noise-resilient techniques such as data augmentation and noise reduction to improve model robustness. Limitations include the small number of evaluators and using only one type of vocal sample, which may not fully capture the variability in voice disorders. Future work should address these issues to enhance the model’s generalizability and performance in noisy environments.
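As one possible reading of the data augmentation suggestion, the sketch below extends a training set with noisy copies of each sample while keeping the original labels. The study does not describe a specific augmentation recipe, so the function name, the SNR range, and the number of copies are all illustrative assumptions.

```python
import math
import random


def augment_with_noise(samples, labels, snr_range_db=(5.0, 30.0),
                       copies=1, seed=0):
    """Return the original samples plus noisy copies with the same labels.

    Each copy gets white Gaussian noise at an SNR drawn uniformly from
    `snr_range_db` (an assumed range, not taken from the paper).
    """
    rng = random.Random(seed)
    out_samples, out_labels = list(samples), list(labels)
    for signal, label in zip(samples, labels):
        power = sum(v * v for v in signal) / len(signal)
        for _ in range(copies):
            snr_db = rng.uniform(*snr_range_db)
            sigma = math.sqrt(power / 10.0 ** (snr_db / 10.0))
            out_samples.append([v + rng.gauss(0.0, sigma) for v in signal])
            out_labels.append(label)  # noise does not change the grade
    return out_samples, out_labels
```

Drawing a fresh SNR per copy exposes the model to a spread of noise levels during training instead of a single fixed one, which is the usual motivation for this kind of augmentation.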

To conclude, the model’s performance significantly declined with increased background noise, impacting the evaluation metrics. Future research should focus on developing noise-tolerant methods, such as data augmentation, to enhance the model’s resilience in real-world conditions. Improving the GRBAS scale’s reliability can make it a valuable tool for both physicians and patients. Automated evaluations can facilitate earlier disease detection, leading to more effective treatments and better support for rehabilitation.

All credit for this research goes to the researchers of this project.

The post Assessing Noise Impact on Machine Learning Models for Voice Disorder Evaluation appeared first on MarkTechPost.