CAT Seminar
Bridging Theory and Machine Learning: Analytical Insights into Neural Networks and Generative Models
Carlos Couto
Modern artificial intelligence has achieved state-of-the-art performance across a wide range of domains, from protein structure prediction to the discovery of new materials. However, establishing a solid theoretical foundation for machine learning remains an open research challenge. This seminar explores how analytically tractable models can help bridge this gap.
In the first part, we examine the learning behavior of neural networks near the optimum of the loss. Specifically, we analyze the Hessian of the loss function with respect to the learnable parameters in teacher-student setups where both networks share the same architecture. By characterizing the Hessian eigenspectrum for different activation functions, we show that the rank of the Hessian at the optimal solution effectively determines the number of relevant parameters, that is, the number of directions in parameter space along which the loss actually changes.
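As a rough illustration of the kind of rank counting described above, the following sketch computes the Hessian of a mean-squared-error loss for a student that sits exactly on its teacher. Everything specific here is an assumption made for illustration and is not taken from the seminar: a one-hidden-layer network f(x) = sum_k a_k * act(w_k . x), Gaussian inputs, the sizes `d`, `h`, `n`, and the helper names `hessian_at_teacher` and `numerical_rank`. Because the residual vanishes at the optimum, the Hessian reduces to the averaged outer product of the output Jacobian, which is what the code builds.

```python
import numpy as np

def hessian_at_teacher(act, dact, d=3, h=2, n=500, seed=0):
    """Hessian of the MSE loss at student == teacher (hypothetical toy setup).

    Network: f(x) = sum_k a_k * act(w_k . x), parameters (W, a).
    At the optimum the residual is zero, so the Hessian equals J^T J / n,
    where J is the Jacobian of the outputs with respect to the parameters.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(h, d))   # teacher (= student) hidden weights
    a = rng.normal(size=h)        # teacher (= student) output weights
    X = rng.normal(size=(n, d))   # Gaussian inputs (an assumption)
    Z = X @ W.T                   # pre-activations, shape (n, h)

    J_W = (a * dact(Z))[:, :, None] * X[:, None, :]  # d f / d W, shape (n, h, d)
    J_a = act(Z)                                     # d f / d a, shape (n, h)
    J = np.concatenate([J_W.reshape(n, h * d), J_a], axis=1)
    return J.T @ J / n

def numerical_rank(H, rel_tol=1e-8):
    """Count eigenvalues above a relative threshold."""
    eig = np.linalg.eigvalsh(H)
    return int(np.sum(eig > rel_tol * eig.max()))

activations = {
    "linear": (lambda z: z, lambda z: np.ones_like(z)),
    "relu": (lambda z: np.maximum(z, 0.0), lambda z: (z > 0).astype(float)),
    "tanh": (np.tanh, lambda z: 1.0 - np.tanh(z) ** 2),
}

ranks = {
    name: numerical_rank(hessian_at_teacher(f, df))
    for name, (f, df) in activations.items()
}
print(ranks)
```

With these defaults the network has 8 parameters. A simple counting argument suggests what to expect in this toy setting: a linear network only depends on the d = 3 entries of the effective weight vector, ReLU has one rescaling symmetry per hidden unit (leaving h * d = 6 directions), and tanh has no continuous symmetry, so all 8 directions should matter.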
In the second part, we discuss recent research on applying generative diffusion models to classical spin systems, such as the Ising model. Because its analytical properties are well understood, the Ising model serves as a testbed for exploring how different design choices in diffusion models affect generative performance.
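To make the testbed idea concrete, here is a minimal sketch, not the method presented in the seminar. It assumes a 2D square-lattice Ising model with J = 1 and k_B = 1, single-spin-flip Metropolis sampling to produce training configurations, and a continuous Gaussian (variance-preserving) forward noising step; the function names and all parameter values are hypothetical. The exact spontaneous magnetization of the infinite 2D lattice is included as an example of an analytical property that samples can be checked against.

```python
import numpy as np

def metropolis_ising(L=16, T=1.5, sweeps=100, seed=0):
    """Sample one 2D Ising configuration (periodic boundaries, J = 1, k_B = 1)."""
    rng = np.random.default_rng(seed)
    s = np.ones((L, L), dtype=int)  # cold start: all spins up
    for _ in range(sweeps * L * L):
        i, j = rng.integers(L, size=2)
        nb = (s[(i + 1) % L, j] + s[(i - 1) % L, j]
              + s[i, (j + 1) % L] + s[i, (j - 1) % L])
        dE = 2 * s[i, j] * nb       # energy change if spin (i, j) is flipped
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            s[i, j] = -s[i, j]
    return s

def exact_magnetization(T):
    """Spontaneous magnetization of the infinite 2D lattice (zero above T_c)."""
    T_c = 2.0 / np.log(1.0 + np.sqrt(2.0))
    if T >= T_c:
        return 0.0
    return (1.0 - np.sinh(2.0 / T) ** -4) ** 0.125

def forward_noise(x0, alpha_bar, rng):
    """One Gaussian forward-diffusion sample: sqrt(ab) * x0 + sqrt(1 - ab) * eps.

    Treating spins as real numbers is an assumption; discrete-state
    diffusion is another possible design choice.
    """
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

spins = metropolis_ising()
m_sample = abs(spins.mean())
m_exact = exact_magnetization(1.5)
noisy = forward_noise(spins.astype(float), alpha_bar=0.5,
                      rng=np.random.default_rng(1))
print(m_sample, m_exact, noisy.shape)
```

In a setup like this, a trained model's samples could be scored by how well observables such as the magnetization or energy match the known values at each temperature, which is one way a well-understood model helps isolate the effect of individual design choices.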