*Work done during internship at Apple.
Audio-visual speech contains synchronized audio and visual information that provides cross-modal supervision to learn…
We investigate the benefit of combining blind audio recordings with 3D scene information for novel-view acoustic synthesis. Given audio recordings…
One of the most common applications of generative artificial intelligence (AI) and large language models (LLMs) in an enterprise environment…
Amazon SageMaker Ground Truth significantly reduces the cost and time required for labeling data by integrating human annotators with machine…
Computational social science (CSS) leverages advanced computational techniques to analyze and interpret vast amounts of social data. This field increasingly…
Quantum computing has shown great potential to transform specific algorithms and applications and is expected to work alongside traditional High-Performance…
Language models (LMs), while powerful in generating human-like text, often produce unstructured and inconsistent outputs. The lack of structure in…
Amazon SageMaker Data Wrangler provides a visual interface to streamline and accelerate data preparation for machine learning (ML), which is…
Data centers are poised to be among the world’s largest electricity consumers. If there is no meaningful change, they will…
Automated design in artificial intelligence (AI) is an emerging field focusing on developing systems capable of independently generating and optimizing…
Language models (LMs) have gained significant prominence in computational text analysis, offering enhanced accuracy and versatility. However, a critical challenge…
Cloud AI infrastructure is vital to modern technology, providing the backbone for various AI workloads and services. Ensuring the reliability…
High-fidelity waveform generation, particularly in text-to-speech (TTS) and audio generation applications, involves several critical challenges. Accurately generating natural-sounding audio remains…
Achieving high-fidelity waveform generation in audio synthesis is a significant challenge, particularly due to the slow inference times associated with…
Text-to-SQL conversion is a vital aspect of Natural Language Processing (NLP) that enables users to query databases using everyday language…
Dense Retrieval (DR) models are an advanced method in information retrieval (IR) that uses deep learning techniques to map passages…
The year 2023 witnessed a rapid rise in generative AI, which has led to the development of numerous AI applications…
Professionals and enthusiasts in the finance industry need dependable tools for accessing and analyzing large amounts of data…
Mental health profoundly impacts individuals’ quality of life, yet accessing mental health services can be challenging due to stigma, insufficient…
Building Information Modeling (BIM) is an all-encompassing method of representing built assets using geometric and semantic data. This data can…