AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition

August 20, 2024

*Work done during internship at Apple
Audio-visual speech contains synchronized audio and visual information that provides cross-modal supervision to learn representations for both automatic speech recognition (ASR) and visual speech recognition (VSR). We introduce continuous pseudo-labeling for audio-visual speech recognition (AV-CPL), a semi-supervised method to train an audio-visual speech recognition (AVSR) model on a combination of labeled and unlabeled videos with continuously regenerated pseudo-labels. Our models are trained for speech recognition from audio-visual inputs and can…
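The general idea of training on labeled and unlabeled data with continuously regenerated pseudo-labels can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the AV-CPL implementation: a one-dimensional nearest-centroid classifier stands in for the AVSR model, and the names `fit_centroids`, `predict`, and `continuous_pseudo_labeling` are hypothetical helpers introduced for illustration.

```python
from statistics import mean


def fit_centroids(samples):
    """Fit a toy nearest-centroid classifier: one mean value per label."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def continuous_pseudo_labeling(labeled, unlabeled, rounds=3):
    """Hypothetical loop: regenerate pseudo-labels with the current model,
    then retrain on labeled plus pseudo-labeled data, every round."""
    model = fit_centroids(labeled)  # seed model trained on labeled data only
    for _ in range(rounds):
        # Pseudo-labels are recomputed each round instead of being fixed once.
        pseudo = [(x, predict(model, x)) for x in unlabeled]
        model = fit_centroids(labeled + pseudo)
    return model


# Toy data (assumed for illustration): two labeled points, four unlabeled ones.
labeled = [(0.0, "a"), (10.0, "b")]
unlabeled = [1.0, 2.0, 8.0, 9.0]
model = continuous_pseudo_labeling(labeled, unlabeled)
```

In this toy setting the unlabeled points pull each centroid toward the data it attracts, which is the role the unlabeled videos play in a semi-supervised setup; the real method operates on audio-visual sequences and transcripts rather than scalar features.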
