
    Carnegie Mellon University at NeurIPS 2024

    December 2, 2024

    Carnegie Mellon University is proud to present 194 papers at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024), held December 10-15 at the Vancouver Convention Center. Here is a quick overview of the areas our researchers are working on:

    [Figure: overview of research areas]

    Here are some of our top collaborator institutions:

    [Figure: top collaborator institutions]

    Table of Contents

    • Oral Papers
    • Spotlight Papers
    • Poster Papers
      • Causality
      • Computational Biology
      • Computer Vision
      • Computer Vision (Image Generation)
      • Computer Vision (Video Generation)
      • Computer Vision (Video Understanding)
      • Data-centric AI
      • Data-centric AI (Data Augmentation)
      • Data-centric AI (Data-centric AI Methods And Tools)
      • Deep Learning (Algorithms)
      • Deep Learning (Attention Mechanisms)
      • Deep Learning (Everything Else)
      • Deep Learning (Representation Learning)
      • Deep Learning (Robustness)
      • Fairness
      • Generative Models
      • Generative Models (Diffusion Models)
      • Generative Models (In Context Learning)
      • Generative Models (Misc)
      • Generative Models (Reasoning)
      • Graph Neural Networks
      • Human-computer Interaction
      • Interpretability
      • Language (Dialogue)
      • Language (Generation)
      • Language (Knowledge)
      • Learning Theory
      • Miscellaneous Aspects Of Machine Learning (General Machine Learning Techniques)
      • Miscellaneous Aspects Of Machine Learning (Supervised Learning)
      • Multimodal Models
      • Neuroscience, Cognitive Science
      • Online Learning
      • Optimization
      • Optimization (Convex)
      • Optimization (Large Scale, Parallel And Distributed)
      • Optimization (Learning For Optimization)
      • Other
      • Privacy
      • Reinforcement Learning (Batch Offline)
      • Reinforcement Learning (Everything Else)
      • Reinforcement Learning (Multi-agent)
      • Reinforcement Learning (Planning)
      • Robotics
      • Theory (Everything Else)
      • Theory (Game Theory)
      • Theory (Reinforcement Learning And Planning)
      • Time Series
      • Trustworthy Machine Learning

    Oral Papers

    Stylus: Automatic Adapter Selection for Diffusion Models

    Authors: Michael Luo, Justin Wong, Brandon Trabucco, Yanping Huang, Joseph Gonzalez, Zhifeng Chen, Ruslan Salakhutdinov, Ion Stoica

    This paper explores an alternative approach to generating high-fidelity, customized images at reduced costs using fine-tuned adapters instead of simply scaling base models with additional data or parameters. Over time, the open-source community has created a large collection of more than 100,000 adapters—small modules that fine-tune base models for specific tasks. However, many of these adapters are highly customized and lack clear descriptions, making them challenging to use effectively. To address this, the paper introduces Stylus, a system designed to match prompts with relevant adapters and automatically compose them for better image generation. Building on recent research showing the benefits of combining multiple adapters, Stylus uses a three-stage process: summarizing adapters with improved descriptions and embeddings, retrieving relevant adapters, and composing adapters based on prompt keywords to ensure a strong match. The authors also present StylusDocs, a curated dataset of 75,000 adapters with pre-computed embeddings, for evaluation. Testing Stylus on popular Stable Diffusion checkpoints shows that it achieves better CLIP/FID Pareto efficiency and is twice as preferred by human and multimodal evaluators compared to the base model.

    The Sample-Communication Complexity Trade-off in Federated Q-Learning

    Authors: Sudeep Salgia, Yuejie Chi

    This work examines the problem of Federated Q-learning, where multiple agents collaboratively learn the optimal Q-function for an unknown infinite-horizon Markov Decision Process with finite state and action spaces. The focus is on understanding the trade-off between sample complexity (the number of data samples needed for learning) and communication complexity (the amount of data exchanged between agents) for intermittent communication algorithms, a commonly used approach in federated settings.

    The authors first establish a fundamental limitation: any Federated Q-learning algorithm that achieves linear speedup in sample complexity relative to the number of agents must incur a communication cost of at least Ω(1/(1−γ)), where γ is the discount factor. They then introduce a new algorithm, Fed-DVR-Q, which is the first to achieve both optimal sample complexity and communication complexity simultaneously. Together, these results provide a comprehensive understanding of the trade-offs between sample and communication efficiency in Federated Q-learning.
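    To make the intermittent-communication setting concrete, here is a minimal sketch in which each agent runs ordinary tabular Q-learning on its own samples and a server periodically averages the Q-tables. This is plain periodic averaging, not Fed-DVR-Q; the function names, the `sample` callback, and all hyperparameters are assumptions chosen for illustration. The number of averaging rounds stands in for the communication cost discussed above.

```python
import random
from typing import Callable, List, Tuple

QTable = List[List[float]]  # q[state][action]


def local_q_update(q: QTable, s: int, a: int, r: float, s_next: int,
                   gamma: float, lr: float) -> None:
    """One standard Q-learning step on a single agent's table."""
    target = r + gamma * max(q[s_next])
    q[s][a] += lr * (target - q[s][a])


def average_q_tables(tables: List[QTable]) -> QTable:
    """Server step: entry-wise mean of all agents' Q-tables."""
    n = len(tables)
    return [[sum(t[s][a] for t in tables) / n
             for a in range(len(tables[0][0]))]
            for s in range(len(tables[0]))]


def federated_q_learning(
    num_agents: int,
    num_states: int,
    num_actions: int,
    sample: Callable[[int, int, random.Random], Tuple[float, int]],
    gamma: float = 0.9,
    lr: float = 0.1,
    rounds: int = 20,        # number of communication rounds (hypothetical value)
    local_steps: int = 50,   # local updates between two rounds (hypothetical value)
    seed: int = 0,
) -> Tuple[QTable, int]:
    """Return the averaged Q-table and the number of communication rounds used."""
    rng = random.Random(seed)
    tables = [[[0.0] * num_actions for _ in range(num_states)]
              for _ in range(num_agents)]
    for _ in range(rounds):
        for q in tables:
            for _ in range(local_steps):
                # Generative-model access: query any (state, action) pair.
                s, a = rng.randrange(num_states), rng.randrange(num_actions)
                r, s_next = sample(s, a, rng)
                local_q_update(q, s, a, r, s_next, gamma, lr)
        avg = average_q_tables(tables)
        # Broadcast the average back to every agent.
        tables = [[row[:] for row in avg] for _ in range(num_agents)]
    return tables[0], rounds
```

    Increasing `local_steps` while holding the total number of samples fixed lowers the number of rounds; the trade-off studied in the paper concerns how far that reduction can go without giving up the speedup from having more agents.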

    Spotlight Papers

    Aligner Encoders: Self-Attention Transformers Can Be Self-Transducers

    Authors: Adam Stooke, Rohit Prabhavalkar, Khe Sim, Pedro Moreno Mengibar

    The paper introduces a new transformer-based approach to automatic speech recognition (ASR) that simplifies the alignment process between audio input and text output. Unlike traditional models, the encoder itself aligns audio information internally, reducing the complexity of decoding. The proposed “Aligner-Encoder” model combines efficient training techniques and a lightweight decoder, resulting in significantly faster performance while maintaining competitive accuracy. Notably, the alignment process is evident in the self-attention weights of the model, showcasing its ability to handle the task efficiently.

    Approximating the Top Eigenvector in Random Order Streams

    Authors: Praneeth Kacham, David Woodruff

    This work focuses on streaming algorithms for approximating the top eigenvector of a matrix when its rows are presented in a random order. The authors introduce a new algorithm that works efficiently when there is a sufficient gap between the largest and second-largest eigenvalues of the matrix. Their approach uses a small amount of memory, depending on the number of “heavy rows” (rows with large norms), and produces highly accurate results. They also show that this heavy-row-based parameterization is necessary for achieving high accuracy. The method improves on prior work by reducing the gap requirement for random-order streams, though it relies on the rows arriving in random order rather than in an arbitrary order.

    Connecting Joint-Embedding Predictive Architecture with Contrastive Self-supervised Learning

    Authors: Shentong Mo, Peter Tong

    Recent advancements in unsupervised visual representation learning have highlighted the Joint-Embedding Predictive Architecture (JEPA) as an effective method for extracting visual features from unlabeled images using masking strategies. However, JEPA faces two key challenges: its reliance on Exponential Moving Average (EMA) fails to prevent model collapse, and its predictions struggle to accurately capture the average representation of image patches. To address these issues, this work introduces C-JEPA, a new framework that combines JEPA with a variance-invariance-covariance regularization strategy called VICReg. This approach improves stability, prevents collapse, and ensures better learning of consistent representations. Experiments show that C-JEPA achieves faster convergence and higher performance on standard benchmarks when pre-trained on ImageNet-1K.
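    For readers unfamiliar with VICReg, the sketch below computes its three terms on small batches of embeddings: invariance (two views should agree), variance (each dimension should keep a standard deviation above a margin, which penalizes collapse), and covariance (different dimensions should be decorrelated). It follows the general VICReg formulation as an assumption; how C-JEPA applies or weights these terms is not described here, and the function names and default coefficients are illustrative only.

```python
import math
from typing import List

Matrix = List[List[float]]  # one row per sample, one column per embedding dimension


def _column_means(z: Matrix) -> List[float]:
    return [sum(row[j] for row in z) / len(z) for j in range(len(z[0]))]


def invariance_term(z_a: Matrix, z_b: Matrix) -> float:
    """Mean squared distance between paired embeddings of two views."""
    return sum(sum((a - b) ** 2 for a, b in zip(ra, rb))
               for ra, rb in zip(z_a, z_b)) / len(z_a)


def variance_term(z: Matrix, margin: float = 1.0, eps: float = 1e-4) -> float:
    """Hinge loss pushing each dimension's standard deviation above `margin`."""
    n, d = len(z), len(z[0])
    means = _column_means(z)
    total = 0.0
    for j in range(d):
        var = sum((row[j] - means[j]) ** 2 for row in z) / (n - 1)
        total += max(0.0, margin - math.sqrt(var + eps))
    return total / d


def covariance_term(z: Matrix) -> float:
    """Sum of squared off-diagonal covariances, scaled by the dimension."""
    n, d = len(z), len(z[0])
    means = _column_means(z)
    total = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:
                cov = sum((row[i] - means[i]) * (row[j] - means[j])
                          for row in z) / (n - 1)
                total += cov ** 2
    return total / d


def vicreg_loss(z_a: Matrix, z_b: Matrix, inv_w: float = 25.0,
                var_w: float = 25.0, cov_w: float = 1.0) -> float:
    """Weighted sum of the three terms; the weights are placeholder values."""
    return (inv_w * invariance_term(z_a, z_b)
            + var_w * (variance_term(z_a) + variance_term(z_b))
            + cov_w * (covariance_term(z_a) + covariance_term(z_b)))
```

    A fully collapsed batch, where every embedding is identical, has zero variance in every dimension and therefore pays nearly the full margin in the variance term, which is the failure mode the regularizer is meant to discourage.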

    CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics

    Authors: Jiawei Gao, Ziqin Wang, Zeqi Xiao, Jingbo Wang, Tai Wang, Jinkun Cao, Xiaolin Hu, Si Liu, Jifeng Dai, Jiangmiao Pang

    This work addresses the challenge of enabling humanoid robots to collaborate on tasks like moving large furniture, which require coordination between multiple robots. Existing methods struggle due to a lack of motion capture data for multi-humanoid collaboration and the inefficiency of training multiple agents together. To overcome this, the authors introduce Cooperative Human-Object Interaction (CooHOI), a framework that uses a two-phase learning approach: first, individual humanoids learn object interaction skills from human motion data, and then they learn to work together using multi-agent reinforcement learning. By focusing on shared object dynamics and decentralized execution, the robots achieve coordination through implicit communication. Unlike previous tracking-based methods, CooHOI is efficient, does not rely on multi-humanoid motion data, and can easily scale to more participants and diverse object types.

    DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning

    Authors: Weikang Wan, Ziyu Wang, Yufei Wang, Zackory Erickson, David Held

    This paper presents DiffTORI, a framework that uses differentiable trajectory optimization as a policy representation for reinforcement and imitation learning. Trajectory optimization, a common tool in control, is parameterized by a cost and a dynamics function, and recent advances now allow gradients of the loss to be computed with respect to these parameters. This enables DiffTORI to learn cost and dynamics functions end-to-end, addressing the “objective mismatch” in previous model-based RL methods by aligning the dynamics model with task performance. Benchmarking on robotic manipulation tasks with high-dimensional sensory inputs, DiffTORI demonstrates superior performance over prior methods, including feedforward policies, energy-based models, and diffusion models, across a wide range of reinforcement and imitation learning tasks.

    Don’t Look Twice: Faster Video Transformers with Run-Length Tokenization

    Authors: Rohan Choudhury, Guanglei Zhu, Sihan Liu, Koichiro Niinuma, Kris Kitani, László Jeni

    Video transformers are notoriously slow to train due to the large number of input tokens, many of which are repeated across frames. Existing methods to remove redundant tokens often introduce significant overhead or require dataset-specific tuning, limiting their practicality. This work introduces Run-Length Tokenization (RLT), a simple and efficient method inspired by run-length encoding, which identifies and removes repeated patches in video frames before inference. By replacing repeated patches with a single token and a positional encoding to reflect its duration, RLT reduces redundancy without requiring tuning or adding significant computational cost. It accelerates training by 30%, maintains baseline performance, and increases throughput by 35% with minimal accuracy loss, while reducing token counts by up to 80% on longer videos.
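    The run-length idea can be sketched on toy data as follows. Each frame is a list of patches, and a patch that matches the patch at the same position in the previous frame is folded into the earlier token, which then records how many frames it covers. The patch representation, the `tol` threshold, and the output tuple layout are assumptions made for this illustration and are not taken from the paper.

```python
from typing import List, Tuple

Patch = Tuple[int, ...]          # flattened pixel values of one patch (toy stand-in)
Token = Tuple[int, int, int]     # (start_frame, patch_position, run_length)


def run_length_tokenize(frames: List[List[Patch]], tol: int = 0) -> List[Token]:
    """Collapse temporally repeated patches into single tokens with a duration.

    A patch counts as repeated when its sum of absolute differences to the
    patch at the same position in the previous frame is at most `tol`.
    """
    tokens: List[Token] = []
    num_frames = len(frames)
    for pos in range(len(frames[0])):
        start = 0  # first frame of the current run at this position
        for t in range(1, num_frames):
            diff = sum(abs(a - b)
                       for a, b in zip(frames[t][pos], frames[t - 1][pos]))
            if diff > tol:
                # The content changed: close the current run and open a new one.
                tokens.append((start, pos, t - start))
                start = t
        tokens.append((start, pos, num_frames - start))
    return tokens
```

    In a transformer pipeline, the run length of each kept token would be turned into an additional positional signal so the model still knows how long the patch persisted; that embedding step is omitted here.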

    ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights

    Authors: Gabriel Sarch, Lawrence Jang, Michael Tarr, William Cohen, Kenneth Marino, Katerina Fragkiadaki

    This work introduces In-Context Abstraction Learning (ICAL), a method that enables large-scale language and vision-language models (LLMs and VLMs) to generate high-quality task examples from imperfect demonstrations. ICAL uses a vision-language model to analyze and improve inefficient task trajectories by abstracting key elements like causal relationships, object states, and temporal goals, with iterative refinement through human feedback. These improved examples, when used as prompts, enhance decision-making and reduce reliance on human input over time, making the system more efficient. ICAL outperforms state-of-the-art models in tasks like instruction following, web navigation, and action forecasting, demonstrating its ability to improve performance without heavy manual prompt engineering.

    Is Your LiDAR Placement Optimized for 3D Scene Understanding?

    Authors: Ye Li, Lingdong Kong, Hanjiang Hu, Xiaohao Xu, Xiaonan Huang

    This work focuses on improving the reliability of driving perception systems under challenging and unexpected conditions, particularly with multi-LiDAR setups. Most existing datasets rely on single-LiDAR systems and are collected in ideal conditions, making them insufficient for real-world applications. To address this, the authors introduce Place3D, a comprehensive pipeline that optimizes LiDAR placement, generates data, and evaluates performance. Their approach includes three key contributions: a new metric called the Surrogate Metric of the Semantic Occupancy Grids (M-SOG) for assessing multi-LiDAR configurations, an optimization strategy to improve LiDAR placements based on M-SOG, and the creation of a 280,000-frame dataset capturing both clean and adverse conditions. Experiments show that their optimized placements lead to significant improvements in tasks like semantic segmentation and 3D object detection, even in challenging scenarios with harsh weather or sensor failures.

    Learn To be Efficient: Build Structured Sparsity in Large Language Models

    Authors: Haizhong Zheng, Xiaoyan Bai, Xueshen Liu, Zhuoqing Morley Mao, Beidi Chen, Fan Lai, Atul Prakash

    The paper explores how Large Language Models (LLMs), known for their impressive capabilities but high computational costs, can be made more efficient. It highlights that while activation sparsity—where only some model parameters are used during inference—naturally occurs, current methods fail to maximize its potential during training. The authors propose a novel training algorithm, Learn-To-be-Efficient (LTE), that encourages LLMs to activate fewer neurons, striking a balance between efficiency and performance. Their approach, applicable to models beyond traditional ReLU-based ones, demonstrates improved results across various tasks and reduces inference latency by 25% for LLaMA2-7B at 50% sparsity.

    Learning Social Welfare Functions

    Authors: Kanad Pardeshi, Itai Shapira, Ariel Procaccia, Aarti Singh

    This work explores whether it is possible to understand or replicate a policymaker’s reasoning by analyzing their past decisions. The problem is framed as learning social welfare functions from the family of power mean functions. Two learning tasks are considered: one uses utility vectors of actions and their corresponding social welfare values, while the other uses pairwise comparisons of welfares for different utility vectors. The authors demonstrate that power mean functions can be learned efficiently, even when the social welfare data is noisy. They also propose practical algorithms for these tasks and evaluate their effectiveness.
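    As background, the power mean family is indexed by a single exponent p that interpolates between familiar welfare notions: p = 1 gives the average utility, p = 0 the geometric mean, and p → −∞ the minimum utility. The sketch below evaluates the unweighted power mean for positive utilities; the function name is hypothetical, and individual weights are left out for brevity.

```python
import math
from typing import Sequence


def power_mean(utilities: Sequence[float], p: float) -> float:
    """Unweighted power mean of strictly positive utilities with exponent p."""
    if not utilities or any(u <= 0 for u in utilities):
        raise ValueError("utilities must be a non-empty sequence of positive numbers")
    n = len(utilities)
    if p == math.inf:
        return max(utilities)
    if p == -math.inf:
        return min(utilities)      # egalitarian welfare
    if p == 0:
        # Limit of the power mean as p -> 0: the geometric mean.
        return math.exp(sum(math.log(u) for u in utilities) / n)
    return (sum(u ** p for u in utilities) / n) ** (1.0 / p)
```

    Learning a social welfare function from this family then amounts to estimating p from observed welfare values or from pairwise comparisons between utility vectors.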

    Metric Transforms and Low Rank Representations of Kernels

    Authors: Timothy Chu, Josh Alman, Gary L. Miller, Shyam Narayanan, Mark Sellke, Zhao Song

    The authors introduce a linear-algebraic tool based on group representation theory to solve three important problems in machine learning. First, they investigate fast attention algorithms for large language models and prove that only low-degree polynomials can produce the low-rank matrices required for subquadratic attention, thereby showing that polynomial-based approximations are essential. Second, they extend the classification of positive definite kernels from Euclidean distances to Manhattan distances, offering a broader foundation for kernel methods. Finally, they classify all functions that transform Manhattan distances into Manhattan distances, generalizing earlier work on Euclidean metrics and introducing new results about stable-rank-preserving functions with potential applications in algorithm design.

    Sample-Efficient Private Learning of Mixtures of Gaussians

    Authors: Hassan Ashtiani, Mahbod Majid, Shyam Narayanan

    This work examines the problem of learning mixtures of Gaussians while ensuring approximate differential privacy. The authors demonstrate that it is possible to learn a mixture of k arbitrary d-dimensional Gaussians with significantly fewer samples than previous methods, achieving optimal performance when the dimensionality d is much larger than the number of components k. For univariate Gaussians, they establish the first optimal bound, showing that the sample complexity scales linearly with k, improving upon earlier methods that required a quadratic dependence on k. Their approach leverages advanced techniques, including the inverse sensitivity mechanism, sample compression for distributions, and volume bounding methods, to achieve these results.

    Sequoia: Scalable and Robust Speculative Decoding

    Authors: Zhuoming Chen, Avner May, Ruslan Svirschevski, Yu-hsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen

    As the use of large language models (LLMs) increases, serving them quickly and efficiently has become a critical challenge. Speculative decoding offers a promising solution, but existing methods struggle to scale with larger workloads or adapt to different settings. This paper introduces Sequoia, a scalable and robust algorithm for speculative decoding. By employing a dynamic programming algorithm, Sequoia optimizes the tree structure for speculated tokens, improving scalability. It also introduces a novel sampling and verification method that enhances robustness across various decoding temperatures. Sequoia achieves significant speedups, improving decoding speed on models like Llama2-7B, Llama2-13B, and Vicuna-33B by up to 4.04x, 3.73x, and 2.27x, respectively, and reducing per-token latency for Llama3-70B-Instruct on a single GPU by 9.5x compared to DeepSpeed-Zero-Inference.
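    For context, the basic accept/reject step of speculative sampling for a single drafted token can be sketched as below. This is the standard rule for one token, assumed here as background; it is not Sequoia's tree construction or its sampling and verification method, and the function name is hypothetical.

```python
import random
from typing import Sequence, Tuple


def verify_draft_token(
    target_probs: Sequence[float],
    draft_probs: Sequence[float],
    draft_token: int,
    rng: random.Random,
) -> Tuple[int, bool]:
    """Accept or replace one token proposed by a draft model.

    `draft_token` is assumed to have been sampled from `draft_probs`, so its
    draft probability is positive. Returns (token, accepted).
    """
    p = target_probs[draft_token]
    q = draft_probs[draft_token]
    # Accept with probability min(1, p / q).
    if rng.random() < min(1.0, p / q):
        return draft_token, True
    # On rejection, resample from the normalized positive part of (p - q),
    # which keeps the overall output distributed according to the target.
    residual = [max(0.0, pt - qd) for pt, qd in zip(target_probs, draft_probs)]
    token = rng.choices(range(len(residual)), weights=residual)[0]
    return token, False
```

    When the draft and target distributions agree, every drafted token is accepted, which is why a well-matched draft model yields larger speedups.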

    Slight Corruption in Pre-training Data Makes Better Diffusion Models

    Authors: Hao Chen, Yujin Han, Diganta Misra, Xiang Li, Kai Hu, Difan Zou, Masashi Sugiyama, Jindong Wang, Bhiksha Raj

    Diffusion models have demonstrated impressive capabilities in generating high-quality images, audio, and videos, largely due to pre-training on large datasets that pair data with conditions, such as image-text or image-class pairs. However, even with careful filtering, these datasets often include corrupted pairs where the conditions do not accurately represent the data. This paper provides the first comprehensive study of how such corruption affects diffusion model training. By synthetically corrupting datasets like ImageNet-1K and CC3M, the authors show that slight corruption in pre-training data can surprisingly enhance image quality, diversity, and fidelity across various models. They also provide theoretical insights, demonstrating that slight condition corruption increases entropy and reduces the 2-Wasserstein distance to the ground truth distribution. Building on these findings, the authors propose a method called condition embedding perturbations, which improves diffusion model performance during both pre-training and downstream tasks, offering new insights into the training process.
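    The idea of perturbing condition embeddings can be illustrated with a few lines of code. The sketch assumes additive Gaussian noise with a user-chosen scale; the noise distribution, the `sigma` parameter, and the function name are illustrative assumptions rather than the paper's exact recipe.

```python
import random
from typing import List, Optional


def perturb_condition_embedding(
    embedding: List[float],
    sigma: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a copy of a condition embedding with Gaussian noise added.

    `sigma` controls how strong the perturbation is; sigma = 0 leaves the
    embedding unchanged.
    """
    if rng is None:
        rng = random.Random()
    return [x + rng.gauss(0.0, sigma) for x in embedding]
```

    During training, the perturbed embedding would be passed to the diffusion model in place of the clean class or text embedding.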

    Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models

    Authors: Sanae Lotfi, Yilun Kuang, Marc Finzi, Brandon Amos, Micah Goldblum, Andrew Wilson

    Large language models (LLMs) with billions of parameters are highly effective at predicting the next token in a sequence. While recent research has computed generalization bounds for these models using compression-based techniques, these bounds often fail to apply to billion-parameter models or rely on restrictive methods that produce low-quality text. Existing approaches also tie the tightness of bounds to the number of independent documents in the training set, ignoring the larger number of dependent tokens, which could offer better bounds. This work uses properties of martingales to derive generalization bounds that leverage the vast number of tokens in LLM training sets. By using more flexible compression techniques like Monarch matrices, Kronecker factorizations, and post-training quantization, the authors achieve meaningful generalization bounds for large-scale models, including LLaMA2-70B, marking the first successful bounds for practical, high-quality text-generating models.

    Poster Papers

    Causality

    Causal Inference in the Closed-Loop: Marginal Structural Models for Sequential Excursion Effects

    Authors: Alexander Levis, Gabriel Loewinger, Francisco Pereira

    Causal Temporal Representation Learning with Nonstationary Sparse Transition

    Authors: Xiangchen Song, Zijian Li, Guangyi Chen, Yujia Zheng, Yewen Fan, Xinshuai Dong, Kun Zhang

    Discovery of the Hidden World with Large Language Models

    Authors: Chenxi Liu, Yongqiang Chen, Tongliang Liu, Mingming Gong, James Cheng, Bo Han, Kun Zhang

    From Causal to Concept-Based Representation Learning

    Authors: Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Identifying General Mechanism Shifts in Linear Causal Representations

    Authors: Tianyu Chen, Kevin Bello, Francesco Locatello, Bryon Aragam, Pradeep Ravikumar

    Identifying Selections for Unsupervised Subtask Discovery

    Authors: Yiwen Qiu, Yujia Zheng, Kun Zhang

    Interventional Causal Discovery in a Mixture of DAGs

    Authors: Burak Varıcı, Dmitriy Katz, Dennis Wei, Prasanna Sattigeri, Ali Tajer

    Learning Discrete Concepts in Latent Hierarchical Models

    Authors: Lingjing Kong, Guangyi Chen, Biwei Huang, Eric Xing, Yuejie Chi, Kun Zhang

    Learning Discrete Latent Variable Structures with Tensor Rank Conditions

    Authors: Zhengming Chen, Ruichu Cai, Feng Xie, Jie Qiao, Anpeng Wu, Zijian Li, Zhifeng Hao, Kun Zhang

    Likelihood-based differentiable structure learning

    Authors: Chang Deng, Kevin Bello, Pradeep Ravikumar, Bryon Aragam

    Linear Causal Representation Learning from Unknown Multi-node Interventions

    Authors: Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Ali Tajer

    Multi-Armed Bandits with Network Interference

    Authors: Abhineet Agarwal, Anish Agarwal, Lorenzo Masoero, Justin Whitehouse

    Natural Counterfactuals With Necessary Backtracking

    Authors: Guang-yuan Hao, Jiji Zhang, Biwei Huang, Hao Wang, Kun Zhang

    On Causal Discovery in the Presence of Deterministic Relations

    Authors: Loka Li, Haoyue Dai, Hanin Al Ghothani, Biwei Huang, Jiji Zhang, Shahar Harel, Isaac Bentwich, Guangyi Chen, Kun Zhang

    Sample Complexity of Interventional Causal Representation Learning

    Authors: Emre Acartürk, Burak Varıcı, Karthikeyan Shanmugam, Ali Tajer

    Computational Biology

    Protein-Nucleic Acid Complex Modeling with Frame Averaging Transformer

    Authors: Tinglin Huang, Zhenqiao Song, Rex Ying, Wengong Jin

    Computer Vision

    Adaptive Visual Scene Understanding: Incremental Scene Graph Generation

    Authors: Naitik Khandelwal, Xiao Liu, Mengmi Zhang

    Crafting Hierarchical Strand-based Hair Geometry with Frequency-decomposed Representative Guide Curves

    Authors: Yunlu Chen, Francisco Vicente Carrasco, Christian Häne, Giljoo Nam, Jean-charles Bazin, Fernando D De La Torre

    DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features

    Authors: Letian Wang, Seung Wook Kim, Jiawei Yang, Cunjun Yu, Boris Ivanovic, Steven Waslander, Yue Wang, Sanja Fidler, Marco Pavone, Peter Karkus

    EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding

    Authors: Thanh-dat Truong, Utsav Prabhu, Dongyi Wang, Bhiksha Raj, Susan Gauch, Jeyamkondan Subbiah, Khoa Luu

    Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba

    Authors: Haoye Dong, Aviral Chharia, Wenbo Gou, Francisco Vicente Carrasco, Fernando D De La Torre

    Lexicon3D: Probing Visual Encoding Models for Complex 3D Scene Understanding

    Authors: Yunze Man, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Liangyan Gui, Yu-xiong Wang

    MGF: Mixed Gaussian Flow for Diverse Trajectory Prediction

    Authors: Jiahe Chen, Jinkun Cao, Dahua Lin, Kris Kitani, Jiangmiao Pang

    Metric from Human: Zero-shot Monocular Metric Depth Estimation via Test-time Adaptation

    Authors: Yizhou Zhao, Hengwei Bian, Kaihua Chen, Pengliang Ji, Liao Qu, Shao-yu Lin, Weichen Yu, Haoran Li, Hao Chen, Jun Shen, Bhiksha Raj, Min Xu

    Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis

    Authors: Qitao Zhao, Shubham Tulsiani

    Vision Foundation Model Enables Generalizable Object Pose Estimation

    Authors: Kai Chen, Yiyao Ma, Xingyu Lin, Stephen James, Jianshu Zhou, Yun-hui Liu, Pieter Abbeel, Dou Qi

    Computer Vision (Image Generation)

    Latent Representation Matters: Human-like Sketches in One-shot Drawing Tasks

    Authors: Victor Boutin, Rishav Mukherji, Aditya Agrawal, Sabine Muzellec, Thomas Fel, Thomas Serre, Rufin Vanrullen

    Computer Vision (Video Generation)

    4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models

    Authors: Heng Yu, Chaoyang Wang, Peiye Zhuang, Willi Menapace, Aliaksandr Siarohin, Junli Cao, László Jeni, Sergey Tulyakov, Hsin-ying Lee

    A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

    Authors: Gwanghyun Kim, Alonso Martinez, Yu-chuan Su, Brendan Jou, Jose Lezama, Agrim Gupta, Lijun Yu, Lu Jiang, Aren Jansen, Jacob Walker, Se Young Chun, Krishna Somandepalli

    Computer Vision (Video Understanding)

    DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

    Authors: Wen-hsuan Chu, Lei Ke, Katerina Fragkiadaki

    HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model

    Authors: Khoa Vo, Thinh Phan, Kashu Yamazaki, Minh Tran, Ngan Le

    Data-centric AI

    Data Distribution Valuation

    Authors: Xinyi Xu, Shuaiqi Wang, Chuan Sheng Foo, Bryan Kian Hsiang Low, Giulia Fanti

    Visual Data Diagnosis and Debiasing with Concept Graphs

    Authors: Rwiddhi Chakraborty, Yinong O Wang, Jialu Gao, Runkai Zheng, Cheng Zhang, Fernando D De La Torre

    Data-centric AI (Data Augmentation)

    Turning Indirect Knowledge into Direct Demonstrations for Computer Agents at Scale

    Authors: Tianyue Ou, Frank F. Xu, Aman Madaan, Jiarui Liu, Robert Lo, Abishek Sridhar, Sudipta Sengupta, Dan Roth, Graham Neubig, Shuyan Zhou

    Data-centric AI (Data-centric AI Methods And Tools)

    MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models

    Authors: Zichun Yu, Spandan Das, Chenyan Xiong

    Deep Learning (Algorithms)

    Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios

    Authors: Shantanu Jaiswal, Debaditya Roy, Basura Fernando, Cheston Tan

    On the Inductive Bias of Stacking Towards Improving Reasoning

    Authors: Nikunj Saunshi, Stefani Karp, Shankar Krishnan, Sobhan Miryoosefi, Sashank Jakkam Reddi, Sanjiv Kumar

    RGMDT: Return-Gap-Minimizing Decision Tree Extraction in Non-Euclidean Metric Space

    Authors: Jingdi Chen, Hanhan Zhou, Yongsheng Mei, Carlee Joe-wong, Nathaniel Bastian, Tian Lan

    Deep Learning (Attention Mechanisms)

    Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding

    Authors: Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang “Atlas” Wang

    Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

    Authors: Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou

    Towards Understanding the Mechanisms of Associative Memory in Transformers

    Authors: Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Deep Learning (Everything Else)

    FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations

    Authors: Ziyao Wang, Zheyu Shen, Yexiao He, Guoheng Sun, Hongyi Wang, Lingjuan Lyu, Ang Li

    HORSE: Hierarchical Representation for Large-Scale Neural Subset Selection

    Authors: Binghui Xie, Yixuan Wang, Yongqiang Chen, Kaiwen Zhou, Yu Li, Wei Meng, James Cheng

    Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers

    Authors: Sukjun Hwang, Aakash Sunil Lahoti, Ratish Puduppully, Tri Dao, Albert Gu

    MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training

    Authors: Cheng Luo, Jiawei Zhao, Zhuoming Chen, Beidi Chen, Animashree Anandkumar

    Mixture of Nested Experts: Adaptive Processing of Visual Tokens

    Authors: Gagan Jain, Nidhi Hegde, Aditya Kusupati, Arsha Nagrani, Shyamal Buch, Prateek Jain, Anurag Arnab, Sujoy Paul

    SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning

    Authors: Yexiao He, Ziyao Wang, Zheyu Shen, Guoheng Sun, Yucong Dai, Yongkai Wu, Hongyi Wang, Ang Li

    Deep Learning (Representation Learning)

    Towards Understanding Extrapolation: a Causal Lens

    Authors: Lingjing Kong, Guangyi Chen, Petar Stojanov, Haoxuan Li, Eric Xing, Kun Zhang

    Who Needs Features? On the Surprising Effectiveness of Attention Transfer for Vision Transformers

    Authors: Alex Li, Yuandong Tian, Beidi Chen, Deepak Pathak, Xinlei Chen

    Deep Learning (Robustness)

    Achieving Domain-Independent Certified Robustness via Knowledge Continuity

    Authors: Alan Sun, Chiyu Ma, Kenneth Ge, Soroush Vosoughi

    Predicting the Performance of Foundation Models via Agreement-on-the-Line

    Authors: Rahul Saxena, Taeyoun Kim, Aman Mehra, Christina Baek, J. Zico Kolter, Aditi Raghunathan

    ProTransformer: Robustify Transformers via Plug-and-Play Paradigm

    Authors: Zhichao Hou, Weizhi Gao, Yuchen Shen, Feiyi Wang, Xiaorui Liu

    Fairness

    Fair Wasserstein Coresets

    Authors: Zikai Xiong, Niccolo Dalmasso, Shubham Sharma, Freddy Lecue, Daniele Magazzeni, Vamsi Potluru, Tucker Balch, Manuela Veloso

    Mitigating Biases in Blackbox Feature Extractors for Image Classification Tasks

    Authors: Abhipsa Basu, Saswat Subhajyoti Mallick, Venkatesh Babu R

    On Socially Fair Low-Rank Approximation and Column Subset Selection

    Authors: Zhao Song, Ali Vakilian, David Woodruff, Samson Zhou

    SureMap: Simultaneous mean estimation for single-task and multi-task disaggregated evaluation

    Authors: Misha Khodak, Lester Mackey, Miro Dudik, Alexandra Chouldechova

    Generative Models

    A Critical Evaluation of AI Feedback for Aligning Large Language Models

    Authors: Archit Sharma, Sedrick Scott Keh, Eric Mitchell, Chelsea Finn, Kushal Arora, Thomas Kollar

    Data Attribution for Text-to-Image Models by Unlearning Synthesized Images

    Authors: Sheng-yu Wang, Alexei Efros, Aaron Hertzmann, Jun-yan Zhu, Richard Zhang

    Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching

    Authors: Yasi Zhang, Peiyu Yu, Yaxuan Zhu, Yingshan Chang, Feng Gao, Ying Nian Wu, Oscar Leong

    Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

    Authors: Minghan Li, Xilun Chen, Ari Holtzman, Beidi Chen, Jimmy Lin, Scott Yih, Victoria Lin

    Generative Models (Diffusion Models)

    Diffusing Differentiable Representations

    Authors: Yash Savani, Marc Finzi, J. Zico Kolter

    Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

    Authors: Boyuan Chen, Diego Martí Monsó, Yilun Du, Max Simchowitz, Russ Tedrake, Vincent Sitzmann

    Improving the Training of Rectified Flows

    Authors: Sangyun Lee, Zinan Lin, Giulia Fanti

    Model-based Diffusion for Trajectory Optimization

    Authors: Chaoyi Pan, Zeji Yi, Guanya Shi, Guannan Qu

    Permutation-Invariant Autoregressive Diffusion for Graph Generation

    Authors: Lingxiao Zhao, Xueying Ding, Leman Akoglu

    Understanding Hallucinations in Diffusion Models through Mode Interpolation

    Authors: Sumukh K Aithal, Pratyush Maini, Zachary Lipton, J. Zico Kolter

    Your Diffusion Model is Secretly a Noise Classifier and Benefits from Contrastive Training

    Authors: Yunshu Wu, Yingtao Luo, Xianghao Kong, Vagelis Papalexakis, Greg Ver Steeg

    Generative Models (In Context Learning)

    Can large language models explore in-context?

    Authors: Akshay Krishnamurthy, Keegan Harris, Dylan J Foster, Cyril Zhang, Aleksandrs Slivkins

    Generative Models (Misc)

    Efficient Contextual LLM Cascades through Budget-Constrained Policy Learning

    Authors: Xuechen Zhang, Zijian Huang, Ege Onur Taga, Carlee Joe-wong, Samet Oymak, Jiasi Chen

    MixEval: Fast and Dynamic Human Preference Approximation with LLM Benchmark Mixtures

    Authors: Jinjie Ni, Fuzhao Xue, Xiang Yue, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, Yang You

    Generative Models (Reasoning)

    AutoMix: Automatically Mixing Language Models

    Authors: Pranjal Aggarwal, Aman Madaan, Ankit Anand, Srividya Pranavi Potharaju, Swaroop Mishra, Pei Zhou, Aditya Gupta, Dheeraj Rajagopal, Karthik Kappaganthu, Yiming Yang, Shyam Upadhyay, Manaal Faruqui, Mausam

    Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

    Authors: Zhiqing Sun, Longhui Yu, Yikang Shen, Weiyang Liu, Yiming Yang, Sean Welleck, Chuang Gan

    Recursive Introspection: Teaching Foundation Model Agents How to Self-Improve

    Authors: Yuxiao Qu, Tianjun Zhang, Naman Garg, Aviral Kumar

    Transformers Can Do Arithmetic with the Right Embeddings

    Authors: Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

    Graph Neural Networks

    Even Sparser Graph Transformers

    Authors: Hamed Shirzad, Honghao Lin, Balaji Venkatachalam, Ameya Velingker, David Woodruff, Danica J. Sutherland

    Human-computer Interaction

    Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

    Authors: Zebang Cheng, Zhi-qi Cheng, Jun-yan He, Kai Wang, Yuxiang Lin, Zheng Lian, Xiaojiang Peng, Alexander Hauptmann

    Harmonizing Stochasticity and Determinism: Scene-responsive Diverse Human Motion Prediction

    Authors: Zhenyu Lou, Qiongjie Cui, Tuo Wang, Zhenbo Song, Luoming Zhang, Cheng Cheng, Haofan Wang, Xu Tang, Huaxia Li, Hong Zhou

    Interpretability

    Diffusion PID: Interpreting Diffusion via Partial Information Decomposition

    Authors: Shaurya Dewan, Rushikesh Zawar, Prakanshul Saxena, Yingshan Chang, Andrew Luo, Yonatan Bisk

    Model Lego: Creating Models Like Disassembling and Assembling Building Blocks

    Authors: Jiacong Hu, Jing Gao, Jingwen Ye, Yang Gao, Xingen Wang, Zunlei Feng, Mingli Song

    Language (Dialogue)

    IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering

    Authors: Ruosen Li, Ruochen Li, Barry Wang, Xinya Du

    Language (Generation)

    Aligning to Thousands of Varying Preferences via System Message Generalization

    Authors: Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo

    Language (Knowledge)

    Alignment for Honesty

    Authors: Yuqing Yang, Ethan Chern, Xipeng Qiu, Graham Neubig, Pengfei Liu

    Learning Theory

    Accelerating ERM for data-driven algorithm design using output-sensitive techniques

    Authors: Maria-Florina Balcan, Christopher Seiler, Dravyansh Sharma

    On the Comparison between Multi-modal and Single-modal Contrastive Learning

    Authors: Wei Huang, Andi Han, Yongqiang Chen, Yuan Cao, Zhiqiang Xu, Taiji Suzuki

    Oracle-Efficient Differentially Private Learning with Public Data

    Authors: Adam Block, Mark Bun, Rathin Desai, Abhishek Shetty, Steven Wu

    Sample-Efficient Agnostic Boosting

    Authors: Udaya Ghai, Karan Singh

    Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity

    Authors: Qian Yu, Yining Wang, Baihe Huang, Qi Lei, Jason Lee

    Miscellaneous Aspects Of Machine Learning (General Machine Learning Techniques)

    Post-Hoc Reversal: Are We Selecting Models Prematurely?

    Authors: Rishabh Ranjan, Saurabh Garg, Mrigank Raman, Carlos Guestrin, Zachary Lipton

    Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices

    Authors: Andres Potapczynski, Shikai Qiu, Marc Finzi, Christopher Ferri, Charlie Chen, Micah Goldblum, C. Bayan Bruss, Christopher De Sa, Andrew Wilson

    Miscellaneous Aspects Of Machine Learning (Supervised Learning)

    Implicit Regularization Paths of Weighted Neural Representations

    Authors: Jin-hong Du, Pratik Patil

    Multimodal Models

    Continual Audio-Visual Sound Separation

    Authors: Weiguo Pian, Yiyang Nan, Shijian Deng, Shentong Mo, Yunhui Guo, Yapeng Tian

    Do CLIP Models Always Generalize Better than ImageNet Models?

    Authors: Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang

    Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models

    Authors: Ce Zhang, Simon Stepputtis, Katia Sycara, Yaqi Xie

    FlexCap: Describe Anything in Images in Controllable Detail

    Authors: Debidatta Dwibedi, Vidhi Jain, Jonathan Tompson, Andrew Zisserman, Yusuf Aytar

    Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

    Authors: Brandon Huang, Chancharik Mitra, Leonid Karlinsky, Assaf Arbelle, Trevor Darrell, Roei Herzig

    Neuroscience, Cognitive Science

    Divergences between Language Models and Human Brains

    Authors: Yuchen Zhou, Emmy Liu, Graham Neubig, Michael Tarr, Leila Wehbe

    MiSO: Optimizing brain stimulation to create neural activity states

    Authors: Yuki Minai, Joana Soldado-magraner, Matthew Smith, Byron M Yu

    Online Learning

    Communication Bounds for the Distributed Experts Problem

    Authors: Zhihao Jia, Qi Pang, Trung Tran, David Woodruff, Zhihao Zhang, Wenting Zheng

    Global Rewards in Restless Multi-Armed Bandits

    Authors: Naveen Raman, Zheyuan Shi, Fei Fang

    Optimal Top-Two Method for Best Arm Identification and Fluid Analysis

    Authors: Agniv Bandyopadhyay, Sandeep Juneja, Shubhada Agrawal

    Regret Minimization in Stackelberg Games with Side Information

    Authors: Keegan Harris, Steven Wu, Maria-Florina Balcan

    Optimization

    Binary Search Tree with Distributional Predictions

    Authors: Michael Dinitz, Sungjin Im, Thomas Lavastida, Ben Moseley, Aidin Niaparast, Sergei Vassilvitskii

    SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization

    Authors: Taisuke Yasuda, Kyriakos Axiotis, Gang Fu, Mohammadhossein Bateni, Vahab Mirrokni

    Optimization (Convex)

    John Ellipsoids via Lazy Updates

    Authors: David Woodruff, Taisuke Yasuda

    Optimization (Large Scale, Parallel And Distributed)

    Efficient Federated Learning against Heterogeneous and Non-stationary Client Unavailability

    Authors: Ming Xiang, Stratis Ioannidis, Edmund Yeh, Carlee Joe-wong, Lili Su

    LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing

    Authors: Xiaonan Nie, Liu Qibin, Fangcheng Fu, Shenhan Zhu, Xupeng Miao, Xiaoyang Li, Yang Zhang, Shouda Liu, Bin Cui

    Optimization (Learning For Optimization)

    Warm-starting Push-Relabel

    Authors: Sami Davies, Sergei Vassilvitskii, Yuyan Wang

    Other

    A Local Method for Satisfying Interventional Fairness with Partially Known Causal Graphs

    Authors: Haoxuan Li, Yue Liu, Zhi Geng, Kun Zhang

    Active, anytime-valid risk controlling prediction sets

    Authors: Ziyu Xu, Nikos Karampatziakis, Paul Mineiro

    Aligning Audio-Visual Joint Representations with an Agentic Workflow

    Authors: Shentong Mo, Yibing Song

    Architect: Generating Vivid and Interactive 3D Scenes with Hierarchical 2D Inpainting

    Authors: Yian Wang, Xiaowen Qiu, Jiageng Liu, Zhehuan Chen, Jiting Cai, Yufei Wang, Tsun-hsuan Johnson Wang, Zhou Xian, Chuang Gan

    Convergence of $\log(1/\epsilon)$ for Gradient-Based Algorithms in Zero-Sum Games without the Condition Number: A Smoothed Analysis

    Authors: Ioannis Anagnostides, Tuomas Sandholm

    Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization

    Authors: Kai Hu, Weichen Yu, Tianjun Yao, Xiang Li, Wenhe Liu, Lijun Yu, Yining Li, Kai Chen, Zhiqiang Shen, Matt Fredrikson

    Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning

    Authors: Tong Yang, Shicong Cen, Yuting Wei, Yuxin Chen, Yuejie Chi

    GL-NeRF: Gauss-Laguerre Quadrature Enables Training-Free NeRF Acceleration

    Authors: Silong Yong, Yaqi Xie, Simon Stepputtis, Katia Sycara

    Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

    Authors: Boshi Wang, Xiang Yue, Yu Su, Huan Sun

    Hierarchical and Density-based Causal Clustering

    Authors: Kwangho Kim, Jisu Kim, Larry Wasserman, Edward Kennedy

    Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations

    Authors: Hao Chen, Ankit Shah, Jindong Wang, Ran Tao, Yidong Wang, Xiang Li, Xing Xie, Masashi Sugiyama, Rita Singh, Bhiksha Raj

    In-Context Learning with Representations: Contextual Generalization of Trained Transformers

    Authors: Tong Yang, Yu Huang, Yingbin Liang, Yuejie Chi

    Invisible Image Watermarks Are Provably Removable Using Generative AI

    Authors: Xuandong Zhao, Kexun Zhang, Zihao Su, Saastha Vasan, Ilya Grishchenko, Christopher Kruegel, Giovanni Vigna, Yu-xiang Wang, Lei Li

    MAmmoTH2: Scaling Instructions from the Web

    Authors: Xiang Yue, Tianyu Zheng, Ge Zhang, Wenhu Chen

    MergeMinds: Boosting Multilingual Reasoning with the Built-in Capabilities of LLMs

    Authors: Zixian Huang, Wenhao Zhu, Gong Cheng, Lei Li, Fei Yuan

    Neural Collapse Inspired Feature Alignment for Out-of-Distribution Generalization

    Authors: Zhikang Chen, Min Zhang, Sen Cui, Haoxuan Li, Gang Niu, Mingming Gong, Changshui Zhang, Kun Zhang

    On the Parameter Identifiability of Partially Observed Linear Causal Models

    Authors: Xinshuai Dong, Ignavier Ng, Biwei Huang, Yuewen Sun, Songyao Jin, Roberto Legaspi, Peter Spirtes, Kun Zhang

    One-Step Diffusion Distillation through Score Implicit Matching

    Authors: Weijian Luo, Zemin Huang, Zhengyang Geng, J. Zico Kolter, Guo-jun Qi

    Private and Personalized Frequency Estimation in a Federated Setting

    Authors: Amrith Setlur, Vitaly Feldman, Kunal Talwar

    Provably Robust Score-Based Diffusion Posterior Sampling for Plug-and-Play Image Reconstruction

    Authors: Xingyu Xu, Yuejie Chi

    S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity

    Authors: Xinyu Yang, Jixuan Leng, Geyang Guo, Jiawei Zhao, Ryumei Nakada, Linjun Zhang, Huaxiu Yao, Beidi Chen

    SIRIUS: Contextual Sparsity with Correction for Efficient LLMs

    Authors: Yang Zhou, Zhuoming Chen, Zhaozhuo Xu, Victoria Lin, Beidi Chen

    Sequential Harmful Shift Detection Without Labels

    Authors: Salim I. Amoukou, Tom Bewley, Saumitra Mishra, Freddy Lecue, Daniele Magazzeni, Manuela Veloso

    SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices

    Authors: Ruslan Svirschevski, Avner May, Zhuoming Chen, Beidi Chen, Zhihao Jia, Max Ryabinin

    Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation

    Authors: Ruihan Gao, Kangle Deng, Gengshan Yang, Wenzhen Yuan, Jun-yan Zhu

    Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models

    Authors: Aviv Bick, Kevin Li, Eric Xing, J. Zico Kolter, Albert Gu

    When and How Does Synthetic Data Improve Reasoning Capabilities of Language Models?

    Authors: Amrith Setlur, Saurabh Garg, Naman Garg, Xinyang Geng, Virginia Smith, Aviral Kumar

    Privacy

    LLM Dataset Inference: Detect Datasets, not Strings

    Authors: Pratyush Maini, Hengrui Jia, Nicolas Papernot, Adam Dziedzic

    No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices

    Authors: Qi Pang, Shengyuan Hu, Wenting Zheng, Virginia Smith

    On the Benefits of Public Representations for Private Transfer Learning under Distribution Shift

    Authors: Pratiksha Thaker, Amrith Setlur, Steven Wu, Virginia Smith

    Reconstruction Attacks on Machine Unlearning: Simple Models are Vulnerable

    Authors: Martin Bertran, Shuai Tang, Michael Kearns, Jamie Morgenstern, Aaron Roth, Steven Wu

    Reinforcement Learning (Batch Offline)

    Adaptive $Q$-Aid for Conditional Supervised Learning in Offline Reinforcement Learning

    Authors: Jeonghye Kim, Suyoung Lee, Woojun Kim, Youngchul Sung

    BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning

    Authors: Haohong Lin, Wenhao Ding, Jian Chen, Laixi Shi, Jiacheng Zhu, Bo Li, Ding Zhao

    OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning

    Authors: Yihang Yao, Zhepeng Cen, Wenhao Ding, Haohong Lin, Shiqi Liu, Tingnan Zhang, Wenhao Yu, Ding Zhao

    Reinforcement Learning (Everything Else)

    Incremental Learning of Retrievable Skills For Efficient Continual Task Adaptation

    Authors: Daehee Lee, Minjong Yoo, Woo Kyung Kim, Wonje Choi, Honguk Woo

    REBEL: Reinforcement Learning via Regressing Relative Rewards

    Authors: Zhaolin Gao, Jonathan Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, Drew Bagnell, Jason Lee, Wen Sun

    Understanding Preference Learning Through the Lens of Coverage

    Authors: Yuda Song, Gokul Swamy, Aarti Singh, J. Bagnell, Wen Sun

    Reinforcement Learning (Multi-agent)

    Language Grounded Multi-Agent Communication for Ad-hoc Teamwork

    Authors: Huao Li, Hossein Nourkhiz Mahjoub, Behdad Chalaki, Vaishnav Tadiparthi, Kwonjoon Lee, Ehsan Moradi Pari, Charles Lewis, Katia Sycara

    Multi-Agent Imitation Learning: Value is Easy, Regret is Hard

    Authors: Jingwu Tang, Gokul Swamy, Fei Fang, Steven Wu

    Reinforcement Learning (Planning)

    Identifying Latent State-Transition Processes for Individualized Reinforcement Learning

    Authors: Yuewen Sun, Biwei Huang, Yu Yao, Donghuo Zeng, Xinshuai Dong, Songyao Jin, Boyang Sun, Roberto Legaspi, Kazushi Ikeda, Peter Spirtes, Kun Zhang

    Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference

    Authors: Benjamin Eysenbach, Vivek Myers, Ruslan Salakhutdinov, Sergey Levine

    Robotics

    BehaviorGPT: Smart Agent Simulation for Autonomous Driving with Next-Patch Prediction

    Authors: Zikang Zhou, Hu Haibo, Xinhong Chen, Jianping Wang, Nan Guan, Kui Wu, Yung-hui Li, Yu-kai Huang, Chun Jason Xue

    Simulated Humanoid Grasping on Diverse Objects

    Authors: Zhengyi Luo, Jinkun Cao, Sammy Christen, Alexander Winkler, Kris Kitani, Weipeng Xu

    Theory (Everything Else)

    Adversarially Robust Dense-Sparse Tradeoffs via Heavy-Hitters

    Authors: David Woodruff, Samson Zhou

    Analytically Computing Partial Information Decomposition

    Authors: Chaitanya Goswami, Amanda Merkley

    Theory (Game Theory)

    Aggregating Quantitative Relative Judgments: From Social Choice to Ranking Prediction

    Authors: Yixuan Xu, Hanrui Zhang, Yu Cheng, Vincent Conitzer

    Bias Detection via Signaling

    Authors: Yiling Chen, Tao Lin, Ariel Procaccia, Aaditya Ramdas, Itai Shapira

    Efficient $\Phi$-Regret Minimization with Low-Degree Swap Deviations in Extensive-Form Games

    Authors: Brian Zhang, Ioannis Anagnostides, Gabriele Farina, Tuomas Sandholm

    The Secretary Problem with Predicted Additive Gap

    Authors: Alexander Braun, Sherry Sarkar

    Theory (Reinforcement Learning And Planning)

    A theoretical case-study of Scalable Oversight in Hierarchical Reinforcement Learning

    Authors: Tom Yan, Zachary Lipton

    Time Series

    Con4m: Context-aware Consistency Learning Framework for Segmented Time Series Classification

    Authors: Junru Chen, Tianyu Cao, Jing Xu, Jiahe Li, Zhilong Chen, Tao Xiao, Yang Yang

    Trustworthy Machine Learning

    Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift

    Authors: Jiayun Wu, Jiashuo Liu, Peng Cui, Steven Wu

    Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses

    Authors: Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Jing Jiang, Min Lin

    Improving Alignment and Robustness with Short Circuiting

    Authors: Andy Zou, Long Phan, Justin Wang, Derek Duenas, Maxwell Lin, Maksym Andriushchenko, J. Zico Kolter, Matt Fredrikson, Dan Hendrycks

    Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes

    Authors: Weifeng Liu, Tianyi She, Jiawei Liu, Run Wang, Dongyu Yao, 子游 梁, Boheng Li

    Rethinking LLM Memorization through the Lens of Adversarial Compression

    Authors: Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary Lipton, J. Zico Kolter

    Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line

    Authors: Eungyeup Kim, Mingjie Sun, Christina Baek, Aditi Raghunathan, J. Zico Kolter

    Towards Calibrated Robust Fine-Tuning of Vision-Language Models

    Authors: Changdae Oh, Hyesu Lim, Mijoo Kim, Dongyoon Han, Sangdoo Yun, Jaegul Choo, Alexander Hauptmann, Zhi-qi Cheng, Kyungwoo Song

    WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

    Authors: Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Nouha Dziri, Yejin Choi
