Condition-Aware Neural Network (CAN): A New AI Method for Adding Control to Image Generative Models

Deep neural networks are central to synthesizing photorealistic images and videos with large-scale generative models. Turning these models into productive tools requires a critical step: adding control, so that the model follows human-provided instructions instead of generating random samples. This goal has been studied extensively. For example, in Generative Adversarial Networks (GANs), a widespread solution is adaptive normalization, which dynamically scales and shifts the intermediate feature maps according to the input condition.
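To make the scale-and-shift idea concrete, here is a minimal sketch of adaptive normalization on a single feature vector. It is an illustration only: the function names and the `CONDITION_PARAMS` lookup table are hypothetical, and in a real model the scale and shift would come from a learned layer applied to the condition embedding.

```python
import math


def adaptive_norm(features, gamma, beta, eps=1e-5):
    """Normalize a feature vector, then scale and shift it."""
    mean = sum(features) / len(features)
    var = sum((x - mean) ** 2 for x in features) / len(features)
    normed = [(x - mean) / math.sqrt(var + eps) for x in features]
    return [gamma * x + beta for x in normed]


# Hypothetical table mapping a class label to (scale, shift).
# Assumption for illustration: a trained model would predict these
# values from the condition instead of looking them up.
CONDITION_PARAMS = {"cat": (1.5, 0.2), "dog": (0.5, -0.3)}


def conditioned_features(features, label):
    """Apply condition-dependent scale and shift to the same features."""
    gamma, beta = CONDITION_PARAMS[label]
    return adaptive_norm(features, gamma, beta)


if __name__ == "__main__":
    print(conditioned_features([1.0, 2.0, 3.0], "cat"))
    print(conditioned_features([1.0, 2.0, 3.0], "dog"))
```

Note that the layer weights are untouched here: only the features change with the condition.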

However, these widely used techniques share the same underlying mechanism: despite differences in the specific operations, they all add control by manipulating the feature space. Meanwhile, the network weights (the convolution and linear layers) stay the same for every condition. Two critical questions follow: (a) Can image generative models be controlled by manipulating their weights? (b) Can controlled image generative models benefit from this new conditional control method? The paper aims to answer both questions efficiently.

Researchers from MIT, Tsinghua University, and NVIDIA introduce the Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. CAN controls the image generation process by dynamically manipulating the weights of the neural network. To achieve this, it adds a condition-aware weight generation module that produces conditional weights for convolution and linear layers based on the input condition. Two insights are critical for CAN. First, making only a subset of modules condition-aware, rather than all of them, benefits both efficiency and performance. Second, directly generating the conditional weights is much more effective than the alternative of combining a fixed set of base kernels.
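The weight-generation idea can be sketched for a single linear layer as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the class `ConditionAwareLinear`, its `generator` matrix, and the choice of a plain linear map from condition to weights are all hypothetical.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class ConditionAwareLinear:
    """Linear layer whose weight matrix is generated from the condition.

    Assumption for illustration: the generator is a single linear map
    with out_dim * in_dim rows and cond_dim columns.
    """

    def __init__(self, in_dim, out_dim, generator):
        if len(generator) != in_dim * out_dim:
            raise ValueError("generator needs out_dim * in_dim rows")
        self.in_dim = in_dim
        self.out_dim = out_dim
        self.generator = generator

    def weight(self, cond):
        # Generate every weight entry from the condition, then reshape
        # the flat vector into an (out_dim x in_dim) matrix.
        flat = matvec(self.generator, cond)
        return [flat[r * self.in_dim:(r + 1) * self.in_dim]
                for r in range(self.out_dim)]

    def forward(self, x, cond):
        # The same input is transformed by a different weight matrix
        # for each condition.
        return matvec(self.weight(cond), x)


if __name__ == "__main__":
    layer = ConditionAwareLinear(2, 2, [[1, 0], [0, 1], [1, 1], [0, 0]])
    print(layer.forward([2, 3], [1, 0]))
    print(layer.forward([2, 3], [0, 1]))
```

In contrast to feature-space methods, the input features are left alone and the layer itself changes with the condition.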

CAN is evaluated on two representative diffusion transformer models, DiT and UViT. It delivers significant performance gains on both while adding negligible computational cost. The key takeaways are:

CAN introduces a new mechanism for controlling image generative models and demonstrates the effectiveness of weight manipulation for conditional control.

CAN is a practical conditional control method, made usable by the design insights described above. It outperforms prior conditional control methods by a significant margin.

CAN benefits the deployment of image generative models: it achieves a better FID on ImageNet 512×512 than DiT-XL/2 while using 52× fewer MACs per sampling step.

Instead of directly generating the conditional weight, Adaptive Kernel Selection (AKS) is another possible approach: it maintains a set of base convolution kernels and dynamically generates scaling parameters to combine them. AKS has a smaller parameter overhead than CAN, but it cannot match CAN's performance. This suggests that dynamic parameterization alone is not the key to better performance. Moreover, CAN is tested on class-conditional image generation on ImageNet and text-to-image generation on COCO, yielding significant improvements for diffusion transformer models.
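For comparison, the AKS idea of mixing a fixed set of base kernels can be sketched as below. This is a hedged illustration: the function names, the `scale_generator` matrix, and the plain linear map from condition to scaling parameters are assumptions made for the example, not details taken from the paper.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def combine_kernels(base_kernels, scales):
    """Return the weighted sum of equally shaped 2D base kernels."""
    rows = len(base_kernels[0])
    cols = len(base_kernels[0][0])
    return [[sum(s * k[r][c] for s, k in zip(scales, base_kernels))
             for c in range(cols)]
            for r in range(rows)]


def aks_weight(base_kernels, scale_generator, cond):
    """Build a condition-dependent kernel from fixed base kernels.

    Assumption for illustration: one scaling parameter per base kernel,
    produced by a linear map applied to the condition vector.
    """
    scales = matvec(scale_generator, cond)
    return combine_kernels(base_kernels, scales)


if __name__ == "__main__":
    base = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
    identity = [[1, 0], [0, 1]]
    print(aks_weight(base, identity, [0.25, 0.75]))
```

In this sketch the condition only produces one number per base kernel, which is why the overhead is small: the generator has as many rows as there are base kernels, whereas generating every weight entry directly needs one row per entry.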

In conclusion, CAN is a new conditional control method for adding control to image generative models. To validate CAN's effectiveness, experiments were carried out on class-conditional generation using ImageNet and text-to-image generation using COCO, and they show consistent and significant improvements over prior conditional control methods. In addition, a new family of diffusion transformer models was built by combining CAN with EfficientViT. Future work includes applying CAN to more challenging tasks such as large-scale text-to-image generation and video generation.

Check out the Paper. All credit for this research goes to the researchers of this project.

The post Condition-Aware Neural Network (CAN): A New AI Method for Adding Control to Image Generative Models appeared first on MarkTechPost.
