
    Carnegie Mellon University at ICML 2025

    July 8, 2025

    CMU researchers are presenting 127 papers at the Forty-Second International Conference on Machine Learning (ICML 2025), held July 13th-19th at the Vancouver Convention Center. The table of contents below gives a quick overview of the areas our researchers are working on.

    Table of Contents

    • Oral Papers
    • Spotlight Papers
    • Poster Papers
      • Accountability, Transparency, And Interpretability
      • Active Learning And Interactive Learning
      • Applications
      • Causality
      • Chemistry, Physics, And Earth Sciences
      • Computer Vision
      • Deep Learning
      • Discrete And Combinatorial Optimization
      • Domain Adaptation And Transfer Learning
      • Evaluation
      • Everything Else
      • Fairness
      • Foundation Models
      • Game Theory
      • General Machine Learning
      • Graph Neural Networks
      • Graphical Models
      • Health / Medicine
      • Language, Speech And Dialog
      • Large Language Models
      • Learning Theory
      • Multi-agent
      • Online Learning And Bandits
      • Online Learning, Active Learning And Bandits
      • Optimization
      • Privacy
      • Probabilistic Methods
      • Reinforcement Learning And Planning
      • Representation Learning
      • Research Priorities, Methodology, And Evaluation
      • Robotics
      • Safety
      • Security
      • Sequential Models, Time Series
      • Social Aspects
      • Structure Learning
      • Supervised Learning
      • Theory
      • Time Series

    Oral Papers

    Expected Variational Inequalities

    Authors: Brian Zhang, Ioannis Anagnostides, Emanuel Tewolde, Ratip Emin Berker, Gabriele Farina, Vincent Conitzer, Tuomas Sandholm

    This paper introduces expected variational inequalities (EVIs), a relaxed version of variational inequalities (VIs) where the goal is to find a distribution that satisfies the VI condition in expectation. While VIs are generally hard to solve, the authors show that EVIs can be solved efficiently, even under challenging, non-monotone conditions, by leveraging ideas from game theory. EVIs generalize the concept of correlated equilibria and unify various results across smooth games, constrained games, and settings with non-concave utilities, making them broadly applicable beyond traditional game-theoretic contexts.
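A minimal statement of the relaxation, using the standard VI sign convention (the paper's formalization may differ in details):

```latex
\text{(VI)}\quad \text{find } x^* \in \mathcal{X} \text{ such that }
\langle F(x^*),\, y - x^* \rangle \ge 0 \quad \forall y \in \mathcal{X}

\text{(EVI)}\quad \text{find a distribution } \mu \text{ over } \mathcal{X} \text{ such that }
\mathbb{E}_{x \sim \mu}\!\left[\langle F(x),\, y - x \rangle\right] \ge 0 \quad \forall y \in \mathcal{X}
```

Because the EVI constraint is linear in $\mu$, the feasible set of distributions is convex, which is one intuition for why EVIs admit efficient algorithms even when the underlying VI is non-monotone.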

    Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards

    Authors: Yangsibo Huang, Milad Nasr, Anastasios Angelopoulos, Nicholas Carlini, Wei-Lin Chiang, Christopher A. Choquette Choo, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Ken Ziyu Liu, Ion Stoica, Florian Tramer, Chiyuan Zhang

    This paper shows that voting-based benchmarks for evaluating LLMs (such as Chatbot Arena) can be vulnerable to adversarial manipulation if proper defenses aren’t in place. The authors show that an attacker can identify which model generated a response and then strategically vote to boost or demote specific models, altering the leaderboard with only around a thousand votes in a simulated environment. They collaborate with Chatbot Arena’s developers to propose and implement security measures such as reCAPTCHA and login requirements that significantly raise the cost of such attacks and enhance the platform’s robustness.
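A deliberately simplified model of the attack (ignoring Elo-style aggregation, vote pairing, and any detection mechanisms) shows how a modest number of injected votes can move a win-rate estimate:

```python
def winrate(honest_wins, honest_votes, attack_votes):
    """Estimated win rate of a target model on a voting leaderboard
    after an attacker injects votes that always favor the target.
    A toy model, not the paper's attack or Chatbot Arena's scoring."""
    return (honest_wins + attack_votes) / (honest_votes + attack_votes)

# Honest voters are split 50/50 over 1000 comparisons.
baseline = winrate(500, 1000, 0)     # 0.5
# Around a thousand adversarial votes, the scale of the paper's
# simulated attack, shift the estimate dramatically.
attacked = winrate(500, 1000, 1000)  # 0.75
```

Defenses like reCAPTCHA and login requirements work by raising the per-vote cost of producing those injected votes.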

    High-Dimensional Prediction for Sequential Decision Making

    Authors: Georgy Noarov, Ramya Ramalingam, Aaron Roth, Stephan Xie

    This paper presents a new algorithmic framework for making reliable, multi-dimensional forecasts in adversarial, nonstationary environments. Unlike existing online learning methods, this approach offers simultaneous performance guarantees for many agents, even when they face different objectives, act over large action spaces, or care about specific conditions (e.g. weather or route choice). The algorithm ensures low bias across many conditional events and enables each agent to achieve strong guarantees like diminishing regret. Applications include efficient solutions for online combinatorial optimization and multicalibration.

    LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models

    Authors: Parshin Shojaee, Ngoc Hieu Nguyen, Kazem Meidani, Amir Barati Farimani, Khoa Doan, Chandan Reddy

    This paper introduces LLM-SRBench, a new benchmark designed to rigorously evaluate the ability of LLMs to discover scientific equations (rather than merely recall them from training data). Existing tests often rely on well-known equations, making it hard to tell whether models are truly reasoning or just memorizing. LLM-SRBench addresses this by including 239 challenging problems across four scientific domains, split into two categories: one that disguises familiar physics equations (LSR-Transform) and another that features fully synthetic, reasoning-driven tasks (LSR-Synth). Evaluations show that even the best current models only achieve 31.5% accuracy, highlighting the difficulty of the task and establishing LLM-SRBench as a valuable tool for driving progress in LLM-based scientific discovery.

    On Differential Privacy for Adaptively Solving Search Problems via Sketching

    Authors: Shiyuan Feng, Ying Feng, George Li, Zhao Song, David Woodruff, Lichen Zhang

    This paper explores how to use differential privacy to protect against information leakage in adaptive search queries, a harder problem than traditional private estimation tasks. Unlike prior work that only returns numerical summaries (e.g., cost), the authors design algorithms that return actual solutions, like nearest neighbors or regression vectors, even when the inputs or queries change over time. They show how key problem parameters (like the number of approximate near neighbors or condition number of the data matrix) affect the performance of these private algorithms. This work has practical implications for AI systems that rely on private database searches or real-time regression, enabling them to provide useful results while safeguarding sensitive information from attackers.

    Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

    Authors: Vaishnavh Nagarajan, Chen Wu, Charles Ding, Aditi Raghunathan

    This paper proposes a set of simple, abstract tasks designed to probe the creative limits of today’s language models in a controlled and measurable way. These tasks mimic real-world open-ended challenges like generating analogies or designing puzzles, where success requires discovering new connections or constructing novel patterns. The authors show that standard next-token prediction tends to be short-sighted and overly reliant on memorization, while alternative approaches like teacherless training and diffusion models produce more diverse, original outputs. They also introduce a technique called seed-conditioning, which adds randomness at the input rather than the output and can improve coherence without sacrificing creativity.
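A toy sketch of the seed-conditioning idea, assuming a deterministic "model" whose only source of randomness is a seed string prepended to the input; all names and outputs here are illustrative:

```python
import hashlib

def generate(prompt, seed):
    """Stand-in for a model decoded greedily (deterministically):
    randomness enters only through a seed prepended to the input,
    rather than through temperature sampling at the output."""
    digest = hashlib.sha256(f"{seed}|{prompt}".encode()).hexdigest()
    analogies = ["river:delta", "tree:root", "storm:eye", "clock:gear"]
    return analogies[int(digest, 16) % len(analogies)]

# Different seeds yield diverse outputs; any one seed is reproducible.
outputs = {generate("complete the analogy", f"seed-{i}") for i in range(16)}
```

The point of the construction is that diversity comes from the input side while each individual generation stays coherent, since decoding itself is deterministic.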

    Training a Generally Curious Agent

    Authors: Fahim Tajwar, Yiding Jiang, Abitha Thankaraj, Sumaita Rahman, Zico Kolter, Jeff Schneider, Russ Salakhutdinov

    This paper introduces Paprika, a fine-tuning method that equips language models with general decision-making and exploration strategies, enabling them to adapt to new tasks through interaction alone (i.e. without further training). Paprika trains models on synthetic environments requiring different exploration behaviors, encouraging them to learn flexible strategies rather than memorizing solutions. To improve efficiency, it uses a curriculum learning-based approach that prioritizes tasks with high learning value, making the most of limited interaction data. Models trained with Paprika show strong transfer to completely new tasks, suggesting a promising direction for building AI agents that can learn to solve unfamiliar, sequential problems with minimal supervision.
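A sketch of curriculum prioritization in this spirit, using the Bernoulli variance of the success rate as a stand-in for "learning value"; this proxy is illustrative, not necessarily Paprika's actual criterion:

```python
def learning_value(success_rate):
    """Toy priority score: tasks the model sometimes solves and
    sometimes fails (success rate near 0.5) carry the most learning
    signal; mastered or impossible tasks carry almost none."""
    return success_rate * (1 - success_rate)  # Bernoulli variance

tasks = {"mastered": 0.95, "impossible": 0.02, "frontier": 0.50}
priorities = {name: learning_value(r) for name, r in tasks.items()}
best = max(priorities, key=priorities.get)  # "frontier"
```

Spending limited interaction budget on tasks like `"frontier"` is what lets the method make the most of scarce data.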

    Spotlight Papers

    GMAIL: Generative Modality Alignment for generated Image Learning

    Authors: Shentong Mo, Sukmin Yun

    Generative models can create realistic images that could help train machine learning models, but using them as if they were real images can lead to problems because of differences between the two. This paper introduces a method called GMAIL that treats real and generated images as separate types (or modalities) and aligns them in a shared latent space during training, rather than just mixing them at the pixel level. The approach fine-tunes models on generated data using a special loss to bridge the gap, then uses these aligned models to improve training on tasks like image captioning and retrieval. The results show that GMAIL improves performance on several vision-language tasks and scales well as more generated data is added.
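A generic alignment penalty of this flavor can be sketched as a cosine distance between paired embeddings in the shared latent space; this is an illustrative stand-in, not the loss from the paper:

```python
import math

def cosine_alignment_loss(real_emb, gen_emb):
    """1 - cosine similarity between a real-image embedding and a
    generated-image embedding: a generic modality-alignment penalty
    in the spirit of GMAIL (the paper's exact loss may differ)."""
    dot = sum(a * b for a, b in zip(real_emb, gen_emb))
    norm = (math.sqrt(sum(a * a for a in real_emb))
            * math.sqrt(sum(b * b for b in gen_emb)))
    return 1.0 - dot / norm

aligned = cosine_alignment_loss([1.0, 0.0], [1.0, 0.0])     # 0.0
misaligned = cosine_alignment_loss([1.0, 0.0], [0.0, 1.0])  # 1.0
```

Minimizing such a penalty pulls the two "modalities" together in latent space, rather than mixing real and synthetic pixels directly.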

    LOCATE 3D: Real-World Object Localization via Self-Supervised Learning in 3D

    Authors: Paul McVay, Sergio Arnaud, Ada Martin, Arjun Majumdar, Krishna Murthy Jatavallabhula, Phillip Thomas, Ruslan Partsey, Daniel Dugas, Abha Gejji, Alexander Sax, Vincent-Pierre Berges, Mikael Henaff, Ayush Jain, Ang Cao, Ishita Prasad, Mrinal Kalakrishnan, Michael Rabbat, Nicolas Ballas, Mahmoud Assran, Oleksandr Maksymets, Aravind Rajeswaran, Franziska Meier

    LOCATE 3D is a model that can find specific objects in 3D scenes based on natural language descriptions (like “the small coffee table between the sofa and the lamp”). It achieves state-of-the-art performance on standard benchmarks and works well in real-world settings, like on robots or AR devices, by using RGB-D sensor data. A key component is 3D-JEPA, a new self-supervised learning method that uses features from 2D vision models (like CLIP or DINO) to understand 3D point clouds through masked prediction tasks. The model is trained on a newly introduced large dataset (130K+ examples), helping it generalize better across different environments.

    Masked Autoencoders Are Effective Tokenizers for Diffusion Models

    Authors: Hao Chen, Yujin Han, Fangyi Chen, Xiang Li, Yidong Wang, Jindong Wang, Ze Wang, Zicheng Liu, Difan Zou, Bhiksha Raj

    This paper introduces MAETok, a masked autoencoder designed to create a high-quality, semantically meaningful latent space for diffusion models. The authors show that having a well-structured latent space, meaning fewer Gaussian modes and more discriminative features, leads to better image generation without needing complex variational autoencoders. MAETok outperforms existing methods on ImageNet using just 128 tokens, and it’s also much faster: 76× quicker to train and 31× faster during inference. The key takeaway is that the structure of the latent space, not variational constraints, is what truly matters for high-quality diffusion-based generation.
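As a minimal illustration of the masked-autoencoder pretext task (only the masking step is shown; the 75% ratio is the classic MAE default, not necessarily MAETok's setting):

```python
import random

def mask_tokens(tokens, mask_ratio=0.75, seed=0):
    """Masked-autoencoder pretext task: hide a large fraction of
    tokens and ask the model to reconstruct them. Only the masking
    step is sketched; MAETok's encoder/decoder are learned networks."""
    rng = random.Random(seed)
    n_mask = int(len(tokens) * mask_ratio)
    masked_idx = set(rng.sample(range(len(tokens)), n_mask))
    visible = [t for i, t in enumerate(tokens) if i not in masked_idx]
    return visible, sorted(masked_idx)

tokens = list(range(128))  # MAETok represents an image with 128 tokens
visible, hidden = mask_tokens(tokens)  # 32 visible, 96 to reconstruct
```

Reconstructing the hidden majority from the visible minority is what forces the latent space to encode discriminative structure rather than raw pixels.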

    Position: In-House Evaluation Is Not Enough. Towards Robust Third-Party Evaluation and Flaw Disclosure for General-Purpose AI

    Authors: Shayne Longpre, Kevin Klyman, Ruth Elisabeth Appel, Sayash Kapoor, Rishi Bommasani, Michelle Sahar, Sean McGregor, Avijit Ghosh, Borhane Blili-Hamelin, Nathan Butters, Alondra Nelson, Amit Elazari, Andrew Sellars, Casey Ellis, Dane Sherrets, Dawn Song, Harley Geiger, Ilona Cohen, Lauren McIlvenny, Madhulika Srikumar, Mark Jaycox, Markus Anderljung, Nadine Johnson, Nicholas Carlini, Nicolas Miailhe, Nik Marda, Peter Henderson, Rebecca Portnoff, Rebecca Weiss, Victoria Westerhoff, Yacine Jernite, Rumman Chowdhury, Percy Liang, Arvind Narayanan

    This paper highlights the lack of robust systems for identifying and reporting flaws in general-purpose AI (GPAI), especially compared to mature fields like software security. The authors propose three key solutions: (1) standardized reporting formats and engagement rules to streamline flaw reporting and triaging, (2) formal disclosure programs with legal protections for researchers (similar to bug bounties), and (3) better infrastructure for distributing flaw reports to relevant stakeholders. These steps aim to address growing risks like jailbreaks and cross-system vulnerabilities, ultimately improving the safety and accountability of GPAI systems.

    Scaling Test-Time Compute Without Verification or RL is Suboptimal

    Authors: Amrith Setlur, Nived Rajaraman, Sergey Levine, Aviral Kumar

    This paper explores how to best scale test-time compute for large language models (LLMs), comparing two strategies: (1) distilling search traces (verifier-free, or VF) and (2) using verifiers or rewards to guide learning (verifier-based, or VB). The authors show—both theoretically and through experiments—that VB methods significantly outperform VF ones when working with limited compute or data. They explain that this performance gap grows as models and tasks get more complex, especially when solution paths vary in style or quality. Ultimately, the paper argues that verification is essential for effectively scaling LLM performance, especially for reasoning tasks.
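The contrast can be caricatured as best-of-n sampling with and without a verifier; the candidates and scoring function below are toy stand-ins, not the paper's setup:

```python
def verifier_free(candidates):
    """VF-style: no checking of sampled solutions; here, simply
    commit to the first one drawn."""
    return candidates[0]

def verifier_based(candidates, verifier):
    """VB-style: spend the same test-time compute on sampling, but
    rank the candidates with a verifier (reward model) and keep
    the best-scoring one."""
    return max(candidates, key=verifier)

# Toy task: candidate answers with a hidden correctness score.
samples = [("a", 0.2), ("b", 0.9), ("c", 0.5)]
score = dict(samples)
answers = [s for s, _ in samples]
vf_pick = verifier_free(answers)              # "a"
vb_pick = verifier_based(answers, score.get)  # "b"
```

The paper's claim is that the gap between these two regimes widens as models and tasks grow more complex, which is why verification matters for scaling.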

    ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

    Authors: Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, Beidi Chen

    As long-context LLMs become more common, their growing memory demands during inference slow down performance, especially due to the expanding key-value (KV) cache. This paper introduces ShadowKV, a system that significantly improves throughput by compressing the key cache using low-rank representations and offloading the value cache without major latency costs. It reconstructs only the necessary KV pairs during decoding to maintain speed and accuracy. Experiments show ShadowKV supports much larger batch sizes (up to 6×) and improves throughput by over 3× on standard hardware, all while preserving model quality across several LLMs and benchmarks.
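To see why a low-rank key cache helps, here is a back-of-the-envelope memory sketch; the sequence length, head dimension, and rank are illustrative choices, not figures from the paper:

```python
def kv_key_cache_floats(seq_len, head_dim, rank=None):
    """Floats needed to store one attention head's key cache:
    dense (seq_len x head_dim) vs. a rank-r factorization
    (seq_len x r plus r x head_dim), the kind of low-rank
    compression ShadowKV applies to keys."""
    if rank is None:
        return seq_len * head_dim
    return seq_len * rank + rank * head_dim

dense = kv_key_cache_floats(128_000, 128)        # 16,384,000 floats
lowrank = kv_key_cache_floats(128_000, 128, 16)  # 2,050,048 floats
savings = dense / lowrank                        # roughly 8x smaller
```

Freeing that memory is what lets the system hold much larger batches on the GPU while the value cache is offloaded and KV pairs are reconstructed on demand.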

    Poster Papers

    Accountability, Transparency, And Interpretability

    A Versatile Influence Function for Data Attribution with Non-Decomposable Loss

    Authors: Junwei Deng, Weijing Tang, Jiaqi Ma

    Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding

    Authors: Mingyu Jin, Kai Mei, Wujiang Xu, Mingjie Sun, Ruixiang Tang, Mengnan Du, Zirui Liu, Yongfeng Zhang

    Validating Mechanistic Interpretations: An Axiomatic Approach

    Authors: Nils Palumbo, Ravi Mangal, Zifan Wang, Saranya Vijayakumar, Corina Pasareanu, Somesh Jha

    Active Learning And Interactive Learning

    Optimistic Algorithms for Adaptive Estimation of the Average Treatment Effect

    Authors: Ojash Neopane, Aaditya Ramdas, Aarti Singh

    Applications

    Agent Workflow Memory

    Authors: Zhiruo Wang, Jiayuan Mao, Daniel Fried, Graham Neubig

    Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

    Authors: Rogerio Bonatti, Dan Zhao, Francesco Bonacci, Dillon Dupont, Sara Abdali, Yinheng Li, Yadong Lu, Justin Wagle, Kazuhito Koishida, Arthur Bucker, Lawrence Jang, Zheng Hui

    Causality

    A Sample Efficient Conditional Independence Test in the Presence of Discretization

    Authors: Boyang Sun, Yu Yao, Xinshuai Dong, Zongfang Liu, Tongliang Liu, Yumou Qiu, Kun Zhang

    Extracting Rare Dependence Patterns via Adaptive Sample Reweighting

    Authors: Yiqing Li, Yewei Xia, Xiaofei Wang, Zhengming Chen, Liuhua Peng, Mingming Gong, Kun Zhang

    Isolated Causal Effects of Natural Language

    Authors: Victoria Lin, Louis-Philippe Morency, Eli Ben-Michael

    Latent Variable Causal Discovery under Selection Bias

    Authors: Haoyue Dai, Yiwen Qiu, Ignavier Ng, Xinshuai Dong, Peter Spirtes, Kun Zhang

    Permutation-based Rank Test in the Presence of Discretization and Application in Causal Discovery with Mixed Data

    Authors: Xinshuai Dong, Ignavier Ng, Boyang Sun, Haoyue Dai, Guangyuan Hao, Shunxing Fan, Peter Spirtes, Yumou Qiu, Kun Zhang

    Chemistry, Physics, And Earth Sciences

    Maximum Update Parametrization and Zero-Shot Hyperparameter Transfer for Fourier Neural Operators

    Authors: Shanda Li, Shinjae Yoo, Yiming Yang

    Multi-Timescale Dynamics Model Bayesian Optimization for Plasma Stabilization in Tokamaks

    Authors: Rohit Sonker, Alexandre Capone, Andrew Rothstein, Hiro Kaga, Egemen Kolemen, Jeff Schneider

    OmniArch: Building Foundation Model for Scientific Computing

    Authors: Tianyu Chen, Haoyi Zhou, Ying Li, Hao Wang, Chonghan Gao, Rongye Shi, Shanghang Zhang, Jianxin Li

    PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design

    Authors: Zhenqiao Song, Tianxiao Li, Lei Li, Martin Min

    Computer Vision

    David and Goliath: Small One-step Model Beats Large Diffusion with Score Post-training

    Authors: Weijian Luo, Colin Zhang, Debing Zhang, Zhengyang Geng

    From Thousands to Billions: 3D Visual Language Grounding via Render-Supervised Distillation from 2D VLMs

    Authors: Ang Cao, Sergio Arnaud, Oleksandr Maksymets, Jianing Yang, Ayush Jain, Ada Martin, Vincent-Pierre Berges, Paul McVay, Ruslan Partsey, Aravind Rajeswaran, Franziska Meier, Justin Johnson, Jeong Joon Park, Alexander Sax

    GenZSL: Generative Zero-Shot Learning Via Inductive Variational Autoencoder

    Authors: Shiming Chen, Dingjie Fu, Salman Khan, Fahad Khan

    Understanding Complexity in VideoQA via Visual Program Generation

    Authors: Cristobal Eyzaguirre, Igor Vasiljevic, Achal Dave, Jiajun Wu, Rareș Ambruș, Thomas Kollar, Juan Carlos Niebles, Pavel Tokmakov

    Unifying 2D and 3D Vision-Language Understanding

    Authors: Ayush Jain, Alexander Swerdlow, Yuzhou Wang, Sergio Arnaud, Ada Martin, Alexander Sax, Franziska Meier, Katerina Fragkiadaki

    Deep Learning

    A Theoretical Study of (Hyper) Self-Attention through the Lens of Interactions: Representation, Training, Generalization

    Authors: Muhammed Ustaomeroglu, Guannan Qu

    Towards characterizing the value of edge embeddings in Graph Neural Networks

    Authors: Dhruv Rohatgi, Tanya Marwah, Zachary Lipton, Jianfeng Lu, Ankur Moitra, Andrej Risteski

    Discrete And Combinatorial Optimization

    EquivaMap: Leveraging LLMs for Automatic Equivalence Checking of Optimization Formulations

    Authors: Haotian Zhai, Connor Lawless, Ellen Vitercik, Liu Leqi

    Faster Global Minimum Cut with Predictions

    Authors: Helia Niaparast, Benjamin Moseley, Karan Singh

    Regularized Langevin Dynamics for Combinatorial Optimization

    Authors: Shengyu Feng, Yiming Yang

    Domain Adaptation And Transfer Learning

    A General Representation-Based Approach to Multi-Source Domain Adaptation

    Authors: Ignavier Ng, Yan Li, Zijian Li, Yujia Zheng, Guangyi Chen, Kun Zhang

    Evaluation

    Copilot Arena: A Platform for Code LLM Evaluation in the Wild

    Authors: Wayne Chi, Valerie Chen, Anastasios Angelopoulos, Wei-Lin Chiang, Aditya Mittal, Naman Jain, Tianjun Zhang, Ion Stoica, Chris Donahue, Ameet Talwalkar

    RAGGED: Towards Informed Design of Scalable and Stable RAG Systems

    Authors: Jennifer Hsia, Afreen Shaikh, Zhiruo Wang, Graham Neubig

    RBench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation

    Authors: Meng-Hao Guo, Jiajun Xu, Yi Zhang, Jiaxi Song, Haoyang Peng, Yi-Xuan Deng, Xinzhi Dong, Kiyohiro Nakayama, Zhengyang Geng, Chen Wang, Bolin Ni, Guo-Wei Yang, Yongming Rao, Houwen Peng, Han Hu, Gordon Wetzstein, Shi-min Hu

    Everything Else

    On Fine-Grained Distinct Element Estimation

    Authors: Ilias Diakonikolas, Daniel Kane, Jasper Lee, Thanasis Pittas, David Woodruff, Samson Zhou

    Safety Certificate against Latent Variables with Partially Unidentifiable Dynamics

    Authors: Haoming Jing, Yorie Nakahira

    Understanding the Kronecker Matrix-Vector Complexity of Linear Algebra

    Authors: Raphael Meyer, William Swartworth, David Woodruff

    Fairness

    FDGen: A Fairness-Aware Graph Generation Model

    Authors: Zichong Wang, Wenbin Zhang

    Fairness on Principal Stratum: A New Perspective on Counterfactual Fairness

    Authors: Haoxuan Li, Zeyu Tang, Zhichao Jiang, Zhuangyan Fang, Yue Liu, zhi geng, Kun Zhang

    Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs

    Authors: Yinong O Wang, Nivedha Sivakumar, Falaah Arif Khan, Katherine Metcalf, Adam Golinski, Natalie Mackraz, Barry-John Theobald, Luca Zappella, Nicholas Apostoloff

    Kandinsky Conformal Prediction: Beyond Class- and Covariate-Conditional Coverage

    Authors: Konstantina Bairaktari, Jiayun Wu, Steven Wu

    Relative Error Fair Clustering in the Weak-Strong Oracle Model

    Authors: Vladimir Braverman, Prathamesh Dharangutte, Shaofeng Jiang, Hoai-An Nguyen, Chen Wang, Yubo Zhang, Samson Zhou

    Foundation Models

    Rethinking the Bias of Foundation Model under Long-tailed Distribution

    Authors: Jiahao Chen, Bin Qin, Jiangmeng Li, Hao Chen, Bing Su

    Game Theory

    Observation Interference in Partially Observable Assistance Games

    Authors: Scott Emmons, Caspar Oesterheld, Vincent Conitzer, Stuart Russell

    General Machine Learning

    On the Power of Learning-Augmented Search Trees

    Authors: Jingbang Chen, Xinyuan Cao, Alicia Stepin, Li Chen

    Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges

    Authors: Nayoung Lee, Jack Cai, Avi Schwarzschild, Kangwook Lee, Dimitris Papailiopoulos

    Graph Neural Networks

    CurvGAD: Leveraging Curvature for Enhanced Graph Anomaly Detection

    Authors: Karish Grover, Geoff Gordon, Christos Faloutsos

    Graph World Model

    Authors: Tao Feng, Yexin Wu, Guanyu Lin, Jiaxuan You

    Graphical Models

    A Generic Family of Graphical Models: Diversity, Efficiency, and Heterogeneity

    Authors: Yufei Huang, Changhu Wang, Junjie Tang, Weichi Wu, Ruibin Xi

    Health / Medicine

    Distributed Parallel Gradient Stacking (DPGS): Solving Whole Slide Image Stacking Challenge in Multi-Instance Learning

    Authors: Boyuan Wu, wang, Xianwei Lin, Jiachun Xu, Jikai Yu, Zhou Shicheng, Hongda Chen, Lianxin Hu

    SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics

    Authors: Qingtian Zhu, Yumin Zheng, Yuling Sang, Yifan Zhan, Ziyan Zhu, Jun Ding, Yinqiang Zheng

    Language, Speech And Dialog

    A Variational Framework for Improving Naturalness in Generative Spoken Language Models

    Authors: Li-Wei Chen, Takuya Higuchi, Zakaria Aldeneh, Ahmed Hussen Abdelaziz, Alexander Rudnicky

    OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models

    Authors: William Chen, Jinchuan Tian, Yifan Peng, Brian Yan, Chao-Han Yang, Shinji Watanabe

    Synthesizing Privacy-Preserving Text Data via Finetuning *without* Finetuning Billion-Scale LLMs

    Authors: Bowen Tan, Zheng Xu, Eric Xing, Zhiting Hu, Shanshan Wu

    Large Language Models

    Accelerating Unbiased LLM Evaluation via Synthetic Feedback

    Authors: Zhaoyi Zhou, Yuda Song, Andrea Zanette

    AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement

    Authors: Pranjal Aggarwal, Bryan Parno, Sean Welleck

    An Architecture Search Framework for Inference-Time Techniques

    Authors: Jon Saad-Falcon, Adrian Lafuente, Shlok Natarajan, Nahum Maru, Hristo Todorov, Etash Guha, Estefany Kelly Buchanan, Mayee Chen, Neel Guha, Christopher Re, Azalia Mirhoseini

    Data Mixing Optimization for Supervised Fine-Tuning of Large Language Models

    Authors: Yuan Li, Zhengzhong Liu, Eric Xing

    Demystifying Long Chain-of-Thought Reasoning

    Authors: Edward Yeo, Yuxuan Tong, Xinyao Niu, Graham Neubig, Xiang Yue

    GSM-∞: How Do Your LLMs Behave over Infinitely Increasing Reasoning Complexity and Context Length?

    Authors: Yang Zhou, Hongyi Liu, Zhuoming Chen, Yuandong Tian, Beidi Chen

    Idiosyncrasies in Large Language Models

    Authors: Mingjie Sun, Yida Yin, Zhiqiu (Oscar) Xu, Zico Kolter, Zhuang Liu

    Large Language Models are Demonstration Pre-Selectors for Themselves

    Authors: Jiarui Jin, Yuwei Wu, Haoxuan Li, Xiaoting He, Weinan Zhang, Yiming Yang, Yong Yu, Jun Wang, Mengyue Yang

    Let LLM Tell What to Prune and How Much to Prune

    Authors: Mingzhe Yang, Sihao Lin, Changlin Li, Xiaojun Chang

    Memorization Sinks: Isolating Memorization during LLM Training

    Authors: Gaurav Ghosal, Pratyush Maini, Aditi Raghunathan

    Optimizing Temperature for Language Models with Multi-Sample Inference

    Authors: Weihua Du, Yiming Yang, Sean Welleck

    Optimizing Test-Time Compute via Meta Reinforcement Finetuning

    Authors: Yuxiao Qu, Matthew Yang, Amrith Setlur, Lewis Tunstall, Edward Beeching, Russ Salakhutdinov, Aviral Kumar

    Overtrained Language Models Are Harder to Fine-Tune

    Authors: Jacob Mitchell Springer, Sachin Goyal, Kaiyue Wen, Tanishq Kumar, Xiang Yue, Sadhika Malladi, Graham Neubig, Aditi Raghunathan

    Reflection-Window Decoding: Text Generation with Selective Refinement

    Authors: Zeyu Tang, Zhenhao Chen, Xiangchen Song, Loka Li, Yunlong Deng, Yifan Shen, Guangyi Chen, Peter Spirtes, Kun Zhang

    Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation

    Authors: Jingyu Liu, Beidi Chen, Ce Zhang

    Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization

    Authors: Zishun Yu, Tengyu Xu, Di Jin, Karthik Abinav Sankararaman, Yun He, Wenxuan Zhou, Zhouhao Zeng, Eryk Helenowski, Chen Zhu, Sinong Wang, Hao Ma, Han Fang

    To Steer or Not to Steer? Mechanistic Error Reduction with Abstention for Language Models

    Authors: Anna Hedström, Salim I. Amoukou, Tom Bewley, Saumitra Mishra, Manuela Veloso

    Training Software Engineering Agents and Verifiers with SWE-Gym

    Authors: Jiayi Pan, Xingyao Wang, Graham Neubig, Navdeep Jaitly, Heng Ji, Alane Suhr, Yizhe Zhang

    Understanding the Skill Gap in Recurrent Models: The Role of the Gather-and-Aggregate Mechanism

    Authors: Aviv Bick, Eric Xing, Albert Gu

    Unlocking Post-hoc Dataset Inference with Synthetic Data

    Authors: Bihe Zhao, Pratyush Maini, Franziska Boenisch, Adam Dziedzic

    Unnatural Languages Are Not Bugs but Features for LLMs

    Authors: Keyu Duan, Yiran Zhao, Zhili Feng, Jinjie Ni, Tianyu Pang, Qian Liu, Tianle Cai, Longxu Dou, Kenji Kawaguchi, Anirudh Goyal, Zico Kolter, Michael Shieh

    What Do Learning Dynamics Reveal About Generalization in LLM Mathematical Reasoning?

    Authors: Katie Kang, Amrith Setlur, Dibya Ghosh, Jacob Steinhardt, Claire Tomlin, Sergey Levine, Aviral Kumar

    Learning Theory

    Projection Optimization: A General Framework for Multi-Objective and Multi-Group RLHF

    Authors: Nuoya Xiong, Aarti Singh

    Sample-Optimal Agnostic Boosting with Unlabeled Data

    Authors: Udaya Ghai, Karan Singh

    Multi-agent

    M³HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality

    Authors: Ziyan Wang, Zhicheng Zhang, Fei Fang, Yali Du

    Online Learning And Bandits

    Offline Learning for Combinatorial Multi-armed Bandits

    Authors: Xutong Liu, Xiangxiang Dai, Jinhang Zuo, Siwei Wang, Carlee Joe-Wong, John C. S. Lui, Wei Chen

    Online Learning, Active Learning And Bandits

    AutoAL: Automated Active Learning with Differentiable Query Strategy Search

    Authors: Yifeng Wang, Xueying Zhan, Siyu Huang

    Optimization

    FedECADO: A Dynamical System Model of Federated Learning

    Authors: Aayushya Agarwal, Gauri Joshi, Lawrence Pileggi

    Graph-Based Algorithms for Diverse Similarity Search

    Authors: Piyush Anand, Piotr Indyk, Ravishankar Krishnaswamy, Sepideh Mahabadi, Vikas Raykar, Kirankumar Shiragur, Haike Xu

    Maximum Coverage in Turnstile Streams with Applications to Fingerprinting Measures

    Authors: Alina Ene, Alessandro Epasto, Vahab Mirrokni, Hoai-An Nguyen, Huy Nguyen, David Woodruff, Peilin Zhong

    Robust Sparsification via Sensitivity

    Authors: Chansophea Wathanak In, Yi Li, David Woodruff, Xuan Wu

    Privacy

    EncryptedLLM: Privacy-Preserving Large Language Model Inference via GPU-Accelerated Fully Homomorphic Encryption

    Authors: Leo de Castro, Daniel Escudero, Adya Agrawal, Antigoni Polychroniadou, Manuela Veloso

    Leveraging Model Guidance to Extract Training Data from Personalized Diffusion Models

    Authors: Xiaoyu Wu, Jiaru Zhang, Steven Wu

    Private Federated Learning using Preference-Optimized Synthetic Data

    Authors: Charlie Hou, Mei-Yu Wang, Yige Zhu, Daniel Lazar, Giulia Fanti

    Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning

    Authors: Rongzhe Wei, Mufei Li, Mohsen Ghassemi, Eleonora Kreacic, Yifan Li, Xiang Yue, Bo Li, Vamsi Potluru, Pan Li, Eli Chien

    Probabilistic Methods

    Density Ratio Estimation with Conditional Probability Paths

    Authors: Hanlin Yu, Arto Klami, Aapo Hyvarinen, Anna Korba, Lemir Omar Chehab

    Improving the Statistical Efficiency of Cross-Conformal Prediction

    Authors: Matteo Gasparin, Aaditya Ramdas

    Reinforcement Learning And Planning

    Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games

    Authors: Tong Yang, Bo Dai, Lin Xiao, Yuejie Chi

    Representation Learning

    Contextures: Representations from Contexts

    Authors: Runtian Zhai, Kai Yang, Burak Varici, Che-Ping Tsai, Zico Kolter, Pradeep Ravikumar

    Learning Vision and Language Concepts for Controllable Image Generation

    Authors: Shaoan Xie, Lingjing Kong, Yujia Zheng, Zeyu Tang, Eric Xing, Guangyi Chen, Kun Zhang

    Nonparametric Identification of Latent Concepts

    Authors: Yujia Zheng, Shaoan Xie, Kun Zhang

    Research Priorities, Methodology, And Evaluation

    Position: You Can’t Manufacture a NeRF

    Authors: Marta An Kimmel, Mueed Rehman, Yonatan Bisk, Gary Fedder

    Robotics

    DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

    Authors: Gaoyue Zhou, Hengkai Pan, Yann LeCun, Lerrel Pinto

    Learning Safe Control via On-the-Fly Bandit Exploration

    Authors: Alexandre Capone, Ryan Cosner, Aaron Ames, Sandra Hirche

    Towards Learning to Complete Anything in Lidar

    Authors: Ayça Takmaz, Cristiano Saltori, Neehar Peri, Tim Meinhardt, Riccardo de Lutio, Laura Leal-Taixé, Aljosa Osep

    Safety

    DIS-CO: Discovering Copyrighted Content in VLMs Training Data

    Authors: André Duarte, Xuandong Zhao, Arlindo Oliveira, Lei Li

    Do Not Mimic My Voice: Speaker Identity Unlearning for Zero-Shot Text-to-Speech

    Authors: Taesoo Kim, Jinju Kim, Dongchan Kim, Jong Hwan Ko, Gyeong-Moon Park

    SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior

    Authors: Jing-Jing Li, Valentina Pyatkin, Max Kleiman-Weiner, Liwei Jiang, Nouha Dziri, Anne Collins, Jana Schaich Borg, Maarten Sap, Yejin Choi, Sydney Levine

    WMarkGPT: Watermarked Image Understanding via Multimodal Large Language Models

    Authors: Tan Songbai, Xuerui Qiu, Yao Shu, Gang Xu, Linrui Xu, Xiangyu Xu, Huiping Zhuang, Ming Li, Fei Yu

    Weak-to-Strong Jailbreaking on Large Language Models

    Authors: Xuandong Zhao, Xianjun Yang, Tianyu Pang, Chao Du, Lei Li, Yu-Xiang Wang, William Wang

    Security

    ExpProof: Operationalizing Explanations for Confidential Models with ZKPs

    Authors: Chhavi Yadav, Evan Laufer, Dan Boneh, Kamalika Chaudhuri

    Sequential Models, Time Series

    A Generalizable Physics-Enhanced State Space Model for Long-Term Dynamics Forecasting in Complex Environments

    Authors: Yuchen Wang, Hongjue Zhao, Haohong Lin, Enze Xu, Lifang He, Huajie Shao

    Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization

    Authors: Luca Masserano, Abdul Fatir Ansari, Boran Han, Xiyuan Zhang, Christos Faloutsos, Michael Mahoney, Andrew Wilson, Youngsuk Park, Syama Sundar Yadav Rangapuram, Danielle Maddix, Yuyang Wang

    LSCD: Lomb–Scargle Conditioned Diffusion for Time Series Imputation

    Authors: Elizabeth M Fons Etcheverry, Alejandro Sztrajman, Yousef El-Laham, Luciana Ferrer, Svitlana Vyetrenko, Manuela Veloso

    Understanding and Improving Length Generalization in Recurrent Models

    Authors: Ricardo Buitrago Ruiz, Albert Gu

    Social Aspects

    Data-driven Design of Randomized Control Trials with Guaranteed Treatment Effects

    Authors: Santiago Cortes-Gomez, Naveen Raman, Aarti Singh, Bryan Wilder

    On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents

    Authors: Jen-Tse Huang, Jiaxu Zhou, Tailin Jin, Xuhui Zhou, Zixi Chen, Wenxuan Wang, Youliang Yuan, Michael Lyu, Maarten Sap

    STAMP Your Content: Proving Dataset Membership via Watermarked Rephrasings

    Authors: Saksham Rastogi, Pratyush Maini, Danish Pruthi

    Structure Learning

    Identification of Latent Confounders via Investigating the Tensor Ranks of the Nonlinear Observations

    Authors: Zhengming Chen, Yewei Xia, Feng Xie, Jie Qiao, Zhifeng Hao, Ruichu Cai, Kun Zhang

    Supervised Learning

    Preserving AUC Fairness in Learning with Noisy Protected Groups

    Authors: Mingyang Wu, Li Lin, Wenbin Zhang, Xin Wang, Zhenhuan Yang, Shu Hu

    Theory

    Learning-Augmented Hierarchical Clustering

    Authors: Vladimir Braverman, Jon C. Ergun, Chen Wang, Samson Zhou

    On the Clean Generalization and Robust Overfitting in Adversarial Training from Two Theoretical Views: Representation Complexity and Training Dynamics

    Authors: Binghui Li, Yuanzhi Li

    On the Query Complexity of Verifier-Assisted Language Generation

    Authors: Edoardo Botta, Yuchen Li, Aashay Mehta, Jordan Ash, Cyril Zhang, Andrej Risteski

    Sort Before You Prune: Improved Worst-Case Guarantees of the DiskANN Family of Graphs

    Authors: Siddharth Gollapudi, Ravishankar Krishnaswamy, Kirankumar Shiragur, Harsh Wardhan

    Time Series

    Exploring Representations and Interventions in Time Series Foundation Models

    Authors: Michal Wilinski, Mononito Goswami, Willa Potosnak, Nina Żukowska, Artur Dubrawski
