Large Language Models (LLMs) are increasingly used in applications requiring long context lengths, but the key-value (KV) cache often becomes…
Machine Learning
The adoption of text-to-image diffusion models raises concerns over reliability, drawing scrutiny under the lens of various metrics like calibration,…
This post is co-written with Payal Singh from Cohere. The Cohere Embed 4 multimodal embeddings model is now generally available…
Evaluating the quality of AI responses across multiple languages presents significant challenges for organizations deploying generative AI solutions globally. How…
In this post, we demonstrate how to build an end-to-end solution for text classification using the Amazon Bedrock batch inference…
Financial fraud detection isn’t just important to banks; it’s essential. With global fraud losses surpassing $40 billion annually and sophisticated criminal…
The world’s population is expanding rapidly, and this growth requires innovative solutions to produce food, fiber,…
CMU researchers are presenting 127 papers at the Forty-Second International Conference on Machine Learning (ICML 2025), held from July 13th-19th…
Today, we’re excited to announce a significant improvement to the developer experience of Amazon Bedrock: API keys. API keys provide…
Driven by steady progress in deep generative modeling, simulation-based inference (SBI) has emerged as the workhorse for inferring the parameters…
This paper was presented at the Workshop on Reliable and Responsible Foundation Models at ICML 2025. Large Language Models (LLMs)…
Wearable devices record physiological and behavioral signals that can improve health predictions. While foundation models are increasingly used for such…
Large language models (LLMs) have demonstrated impressive performance on several tasks and are increasingly deployed in real-world applications. However, especially…
We design differentially private algorithms for the problem of prediction with expert advice under dynamic regret, also known as tracking…
Organizations deploying video monitoring systems face a critical challenge: processing continuous video streams while maintaining accurate situational awareness. Traditional monitoring…
Software as a service (SaaS) companies managing multiple tenants face a critical challenge: efficiently extracting meaningful insights from vast document…
Today, we are excited to announce that Qwen3, the latest generation of large language models (LLMs) in the Qwen family,…
This post is co-written with Shashank Saraogi, Nat Gale, and Durran Kelly from INRIX. The complexity of modern traffic management…
We design new differentially private algorithms for the problems of adversarial bandits and bandits with expert advice. For adversarial bandits,…
Large-scale models are routinely trained on a mixture of different data sources. Different data mixtures yield very different downstream performance.…