Video coding is a fundamental and ubiquitous technology in modern society. Generations of international video coding standards, such as the widely deployed H.264/AVC and H.265/HEVC and the latest H.266/VVC, provide essential means for enabling video conferencing, video streaming, video sharing, e-commerce, entertainment, and many more video applications. These existing standards all rely on the fundamentals of signal processing and information theory to encode generic video efficiently with favorable rate-distortion behavior.
In recent years, rapid advances in deep learning and artificial intelligence have allowed people to manipulate images and videos using deep generative models. Of particular interest to the field of video coding is the application of deep generative models to compressing talking-face video at ultra-low bit rates. By focusing on talking faces, generative models can effectively learn the inherent structure of the composition, movement, and posture of human faces and deliver promising results using very little bandwidth. At ultra-low bit rates, where even the latest video coding standard, H.266/VVC, is apt to suffer from significant blocking artifacts and blurriness beyond the point of recognition, generative methods can maintain clear facial features and vivid expressions in the reconstructed video. Further, generative face video coding techniques are inherently capable of manipulating the reconstructed face and promise to deliver a more interactive experience.
In this talk, we start with a quick overview of traditional and deep learning-based video coding techniques. We then focus on face video coding with generative networks and present two schemes that send different deep information in the bitstream: one sends compact temporal motion features, and the other sends 3D facial semantics. We compare their compression efficiency and visual quality with those of the latest H.266/VVC standard and showcase the power of deep generative models in preserving vivid facial images with little bandwidth. We also present visualization results that demonstrate the ability of the 3D facial semantics-based scheme to interact with the reconstructed face video and animate virtual faces.
Yan Ye received her Ph.D. from the University of California, San Diego and her B.S. and M.S. from the University of Science and Technology of China. She is currently the Head of the Video Technology Lab of Alibaba’s Damo Academy in Sunnyvale, California. Prior to Alibaba, she held various management and technical positions at InterDigital, Dolby Laboratories, and Qualcomm.
Throughout her career, Dr. Ye has been actively involved in developing international video coding and streaming standards in ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG). She is currently an Associate Rapporteur of the ITU-T VCEG, the Group Chair of INCITS/MPEG task group, and a focus group chair of the ISO/IEC MPEG Visual Quality Assessment. Her research interests include advanced video coding, processing and streaming algorithms, real-time and immersive video communications, AR/VR/MR, and deep learning-based video coding, processing, and quality assessment algorithms.
Zoom Call Details:
SPS SCV IEEE is inviting you to a scheduled Zoom meeting.
Topic: IEEE SPS SCV – Deep learning-based Video Coding, Processing, and Quality Assessment Algorithms
Time: Sep 12, 2023 05:00 PM Pacific Time (US and Canada)
Join Zoom Meeting
Meeting ID: 829 7289 5241
One tap mobile
+16699006833,,82972895241#,,,,*328652# US (San Jose)
Dial by your location
• +1 669 900 6833 US (San Jose)
• +1 669 444 9171 US
• +1 719 359 4580 US
• +1 253 205 0468 US
• +1 253 215 8782 US (Tacoma)
• +1 346 248 7799 US (Houston)
• +1 507 473 4847 US
• +1 564 217 2000 US
• +1 646 931 3860 US
• +1 689 278 1000 US
• +1 929 205 6099 US (New York)
• +1 301 715 8592 US (Washington DC)
• +1 305 224 1968 US
• +1 309 205 3325 US
• +1 312 626 6799 US (Chicago)
• +1 360 209 5623 US
• +1 386 347 5053 US
Find your local number: https://us02web.zoom.us/u/kdT477D4AL