May 8 seminar: Gangadharan Esakki

May 5, 2026

May 8, 2026

Towards Autonomous Intelligence: Foundation Models, Multimodal Learning, and Real-World Systems

Gangadharan Esakki, Nvidia

3:00 pm, UNM Centennial Engineering Center, Room 1026
Online Guests: Contact Prof. Santhanam <bsanthan@unm.edu> for a Zoom link

Abstract: Artificial intelligence is undergoing a profound transformation from narrowly scoped, task-specific models to systems that unify perception, reasoning, and decision-making across modalities. This talk explores the convergence of large language models, vision systems, and emerging world models, and examines how this shift is redefining the design of real-world intelligent systems.

Drawing on experience spanning video compression, perceptual modeling, and large-scale deep learning systems, the talk highlights how principles from signal processing and system-level optimization continue to shape modern AI architectures. Particular attention is given to multimodal learning, simulation-driven development, and the growing role of world models in enabling robust and scalable autonomy.

Autonomous driving serves as a representative case study, illustrating both the opportunities and the inherent challenges of deploying AI in complex, safety-critical environments. The talk also examines broader industry trends, including the evolving AI ecosystem, emerging business models, and the practical constraints of deploying AI at scale. The session concludes with a forward-looking perspective on the next generation of AI systems, emphasizing the integration of perception, reasoning, and action, and outlining a path toward reliable, deployable, and autonomous intelligence.

Bio: Dr. Gangadharan Esakki is a Senior Video Architect at Nvidia specializing in the design of large-scale intelligent systems for real-world applications. He earned his PhD in Computer Engineering, with research focused on video compression, perceptual modeling, and machine-learning-based optimization. Based in Silicon Valley, he has worked with video compression standards bodies and on advanced deep learning systems, including large language models and multimodal perception architectures. His work explores the convergence of signal processing, foundation models, and system-level optimization, with an emphasis on building scalable, efficient, and reliable AI in domains such as vision, streaming, and autonomous systems.