Researchers at the University College London Unravel the Universal Dynamics of Representation Learning in Deep Neural Networks

Deep neural networks (DNNs) come in many sizes and structures. The chosen architecture, together with the dataset and learning algorithm, is known to influence the representations a network learns. A major open challenge in deep learning theory is scalability. Exact solutions to the learning dynamics exist for simple networks, but changing even a small part of the architecture often requires significant changes to the analysis. Moreover, state-of-the-art models are so complex that they are beyond the reach of practical analytical solutions. The same is true of other complex learning systems, including the brain, which makes theoretical study difficult.

The paper builds on several lines of related work. The first is exact solutions in simple architectures: considerable progress has been made in the theoretical analysis of deep linear neural networks, where the loss landscape is well understood and exact solutions have been obtained for specific initial conditions. The second is the neural tangent kernel, a notable exception among theoretical approaches in that it provides exact solutions applicable to a wide range of models. The third is implicit biases in gradient descent, which studies gradient descent itself as a source of generalization performance in DNNs. The last is local elasticity: a model has this property if updating one feature vector only minimally affects dissimilar feature vectors.

Researchers from University College London have proposed a method for modeling universal representation learning, aiming to explain common phenomena observed across learning systems. They develop an effective theory of how two similar data points interact during training when the neural network is large and complex enough that it is not heavily constrained by its parameters. They also demonstrate universal behavior in representation learning dynamics: the derived theory describes the dynamics of a variety of deep networks with different activation functions and architectures.

The proposed theory looks at the representation dynamics at "some intermediate layer H." Since DNNs have many layers at which representations can be observed, this raises the question of how the dynamics depend on the depth of the chosen intermediate layer. Answering it requires determining at which layers the effective theory remains valid. For the linear approximation to be accurate, the representations must start close to each other. If the initial weights are small, each layer's average activational gain factor is a constant G that is less than 1. The initial representational distance, as a function of the depth n, then scales as G^n times the distance between the two inputs.

This function decreases, so the theory should be more accurate in the later layers of the network.
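
The shrinking distance can be illustrated with a small numerical sketch. This is not the authors' experiment: it assumes a randomly initialized fully connected tanh network, and the width, depth, weight scale, and function name are hypothetical choices made only for illustration. It tracks the distance between two nearby inputs layer by layer and estimates an average per-layer gain.

```python
import numpy as np

def layer_distances(depth=8, width=256, weight_scale=0.5, seed=0):
    """Distance between two nearby inputs after each layer of a random tanh MLP."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=width)
    x2 = x1 + 0.01 * rng.normal(size=width)  # a nearby second input
    distances = [float(np.linalg.norm(x1 - x2))]
    h1, h2 = x1, x2
    for _ in range(depth):
        # Small initial weights (assumption): entry std is weight_scale / sqrt(width)
        W = rng.normal(scale=weight_scale / np.sqrt(width), size=(width, width))
        h1, h2 = np.tanh(W @ h1), np.tanh(W @ h2)
        distances.append(float(np.linalg.norm(h1 - h2)))
    return distances

distances = layer_distances()
# Empirical per-layer gain: geometric mean of successive distance ratios
gain = (distances[-1] / distances[0]) ** (1.0 / (len(distances) - 1))
for n, d in enumerate(distances):
    print(f"layer {n}: distance = {d:.3e}")
print(f"estimated average gain G = {gain:.3f}")
```

In this toy setting the distance falls at every layer, which is the behavior the article attributes to a gain factor below 1. With a larger weight scale the gain can exceed 1 and the distances would grow instead.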

The effective learning rates are expected to differ across hidden layers. In standard gradient descent, the update sums contributions over parameters, so the resulting change is proportional to the number of parameters. The deeper the chosen hidden layer, the more parameters the encoder map (from the input to that layer) contains and the fewer the decoder map (from that layer to the output) contains. As a result, the effective learning rate for the encoder increases with depth, while that for the decoder decreases with depth. This relationship holds in the deeper layers of the network, where the theory is accurate; in the earlier layers, however, the effective learning rate for the decoder appears to increase.
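
The parameter-counting argument can be made concrete with a short sketch. It assumes a plain fully connected network with biases; the layer widths and the function name are hypothetical and not taken from the paper. For each candidate intermediate layer, it counts the parameters in the maps before it (encoder) and after it (decoder).

```python
def encoder_decoder_params(widths):
    """For each hidden layer, count the parameters before and after it.

    widths lists layer sizes from input to output: [input, hidden..., output].
    """
    # Parameters of the dense map from layer i to layer i + 1 (weights plus biases)
    per_map = [(a + 1) * b for a, b in zip(widths, widths[1:])]
    counts = []
    for layer in range(1, len(widths) - 1):  # hidden layers only
        encoder = sum(per_map[:layer])  # maps from the input up to this layer
        decoder = sum(per_map[layer:])  # maps from this layer to the output
        counts.append((layer, encoder, decoder))
    return counts

# Hypothetical widths, chosen only for illustration
counts = encoder_decoder_params([784, 128, 128, 128, 128, 10])
for layer, encoder, decoder in counts:
    print(f"hidden layer {layer}: encoder params = {encoder}, decoder params = {decoder}")
```

The encoder count grows and the decoder count shrinks as the chosen layer moves deeper, matching the direction of the effective learning rates described above. The counts are only a proxy here; the sketch does not compute the learning rates themselves.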

In summary, researchers from University College London have introduced a new theory of how neural networks learn, focusing on learning patterns shared across different architectures. It shows that these networks naturally learn structured representations, especially when they start from small weights. Rather than presenting the theory as a definitive universal model, the researchers highlight that gradient descent, the fundamental method for training neural networks, may itself account for some aspects of representation learning. However, the approach faces challenges when applied to larger datasets, and further research is needed to handle more complex data.

Check out the Paper. All credit for this research goes to the researchers of this project.

The post Researchers at the University College London Unravel the Universal Dynamics of Representation Learning in Deep Neural Networks appeared first on MarkTechPost.
