The reconstruction of 3D scenes and their appearance from imagery is one of the longest-standing problems in computer vision. Originally developed to support robotics and artificial intelligence applications, it has found some of its most widespread use in the support of interactive 3D scene visualization.
One of the keys to this success has been the melding of 3D geometric and photometric reconstruction with a heavy re-use of the original imagery, which produces more realistic rendering than a pure 3D model-driven approach. In this talk, I give a retrospective of two decades of research in this area, touching on topics such as sparse and dense 3D reconstruction, the fundamental concepts in image-based rendering and computational photography, applications to virtual reality, as well as ongoing research in the areas of layered decompositions and 3D-enabled video stabilization.
Richard Szeliski is a Research Scientist in the Computational Photography group at Facebook, which he founded in 2015. He is also an Affiliate Professor at the University of Washington, a member of the NAE, and a Fellow of the ACM and IEEE. Dr. Szeliski has done pioneering research in the fields of Bayesian methods for computer vision, image-based modeling, image-based rendering, and computational photography, which lie at the intersection of computer vision and computer graphics. His work on Photo Tourism, Photosynth, and Hyperlapse provides exciting examples of the promise of large-scale image and video-based rendering.
Dr. Szeliski received his Ph.D. degree in Computer Science from Carnegie Mellon University, Pittsburgh, in 1988 and joined Facebook as founding Director of the Computational Photography group in 2015. Prior to Facebook, he worked at Microsoft Research for twenty years, the Cambridge Research Lab of Digital Equipment Corporation for six years, and several other industrial research labs. He has published over 150 research papers in computer vision, computer graphics, neural nets, and numerical analysis, as well as the books Computer Vision: Algorithms and Applications and Bayesian Modeling of Uncertainty in Low-Level Vision. He was a Program Committee Chair for CVPR'2013 and ICCV'2003, served as an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence and on the Editorial Board of the International Journal of Computer Vision, and as Founding Editor of Foundations and Trends in Computer Graphics and Vision.
Each day we are faced with visual puzzles: What is the species of that bird? What script was used to write on this old stone and what does it say? Who painted that picture? In most cases, an expert would be able to quickly inform us, but we do not know whom to ask. I will discuss the challenge of building a universal visual expert -- a network of people, data and machines designed to harvest and organize visual information and make it accessible to anyone anywhere. I will explore the technical challenges arising from Visipedia and discuss their implications for computer vision, machine learning, artificial intelligence, human-machine systems and visual psychology. I will present data from large-scale experiments carried out by building systems that people use: iNaturalist, eBird and regisTree. I will conclude by discussing open challenges for Computer Vision and Machine Learning researchers.
Pietro Perona received his PhD from UC Berkeley, was a post-doctoral fellow at MIT and is now a Professor at the California Institute of Technology in Pasadena. He is currently interested in visual categorization and in the analysis of behavior. He has worked on partial differential equations for image processing, on modeling visual perception, on visual search and attention and on the role of visual mechanisms in art.