DTSA 5514 - Modern AI Models for Vision and Multimodal Understanding

Overview

description

  • This course covers generative models for images and videos, including GANs and diffusion models. It also covers multimodal foundation models such as CLIP and their applications in text-to-image and text-to-video generation, and it addresses the issue of deepfakes. Through both practical exercises and theoretical discussion, students will explore ethical considerations, privacy concerns, and future trends in computer vision. Same as CSCA 5422.

instructor(s)

  • Yeh, Tom  
    Primary Instructor - Fall 2025 / Spring 2026