Record Details
Field | Value |
---|---|
Title | Efficient Incremental Panorama Reconstruction from Multiple Videos |
Names | Feng, Zhongyuan (creator); Todorovic, Sinisa (advisor) |
Date Issued | 2015-05-22 (iso8601) |
Note | Graduation date: 2015 |
Abstract | Constructing a panorama from a set of videos is a long-standing problem in computer vision. A panorama is an enhanced still-image representation of an entire scene captured in a set of videos, where each video shows only a part of the scene. Importantly, a panorama shows only the scene background; foreground objects appearing in the videos are not of interest. This report presents a new framework for efficient panorama extraction from a large set of 140-150 videos, each showing a play in a given football game. The resulting panorama should show the entire football field seen in the videos, without foreground players. Prior work typically processes all video frames for panorama extraction, which is computationally expensive. In contrast, we design an efficient approach that incrementally builds the panorama by intelligently selecting a sparse set of video frames to process. Our approach consists of the following steps. We first identify the moment of snap (MOS) in each play, because these video frames are usually captured without camera motion and thus provide good-quality snapshots of the field. All detected MOS frames are then used to build an initial panorama using the standard Bundle Adjustment algorithm. The initial panorama is used in our subsequent steps to search for new video frames that will maximally cover yet unreconstructed parts of the football field. For this, we compute a homographic projection of the MOS video frames onto the current panorama, and estimate an area on the reconstructed football field with the lowest confidence score. The score of an area is made directly proportional to the number of video frames covering that area. Then, we identify a new sparse set of video frames that maximally cover the lowest-confidence area in the current panorama. Finally, this new set of frames and the current panorama are jointly input to the Bundle Adjustment algorithm to produce a new panorama estimate. 
This process is iterated until the confidence scores associated with different parts of the panorama stop changing, or until no new video frames would extend the current panorama. In each iteration, we also adaptively build a background model of the football field, and in this way remove the foreground from the panorama. We test our approach on a large dataset of videos showing 10 football games. The results demonstrate that our approach is more efficient than state-of-the-art methods, while yielding competitive results. To the best of our knowledge, prior work has considered only the accuracy of panorama extraction, not its efficiency. Our approach has the advantage of being adaptable to the various time budgets set by real-world applications. |
Genre | Research Paper |
Identifier | http://hdl.handle.net/1957/55882 |
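
The frame-selection step described in the abstract (score each area by how many frames cover it, then pick frames that cover the lowest-scoring area) can be illustrated with a small sketch. This is not the report's implementation. It assumes that each frame has already been projected onto the panorama and that its footprint can be approximated by an axis-aligned rectangle on a coarse grid, whereas a real homographic projection yields a general quadrilateral. The names `coverage_map`, `lowest_confidence_window`, and `pick_frames` are hypothetical, and the fixed-size sliding window and top-k overlap ranking are stand-ins chosen for illustration.

```python
from typing import List, Tuple

# Assumed footprint format: (row0, col0, row1, col1), end-exclusive grid cells.
Rect = Tuple[int, int, int, int]


def coverage_map(height: int, width: int, footprints: List[Rect]) -> List[List[int]]:
    """Count, for every panorama cell, how many frame footprints cover it."""
    counts = [[0] * width for _ in range(height)]
    for r0, c0, r1, c1 in footprints:
        # Clip each footprint to the panorama bounds before counting.
        for r in range(max(r0, 0), min(r1, height)):
            for c in range(max(c0, 0), min(c1, width)):
                counts[r][c] += 1
    return counts


def lowest_confidence_window(counts: List[List[int]], win_h: int, win_w: int) -> Rect:
    """Return the first win_h x win_w window with the smallest total coverage."""
    height, width = len(counts), len(counts[0])
    best, best_sum = None, None
    for r in range(height - win_h + 1):
        for c in range(width - win_w + 1):
            total = sum(sum(counts[rr][c:c + win_w]) for rr in range(r, r + win_h))
            if best_sum is None or total < best_sum:
                best, best_sum = (r, c, r + win_h, c + win_w), total
    return best


def overlap_area(a: Rect, b: Rect) -> int:
    """Number of grid cells shared by two rectangles (0 if disjoint)."""
    rows = min(a[2], b[2]) - max(a[0], b[0])
    cols = min(a[3], b[3]) - max(a[1], b[1])
    return max(rows, 0) * max(cols, 0)


def pick_frames(candidates: List[Rect], target: Rect, k: int) -> List[int]:
    """Indices of the k candidates that overlap the target area the most."""
    ranked = sorted(range(len(candidates)),
                    key=lambda i: -overlap_area(candidates[i], target))
    return ranked[:k]


# Toy example: two already-used frames cover the left part of a 4 x 8 grid.
used = [(0, 0, 4, 4), (0, 2, 4, 6)]
counts = coverage_map(4, 8, used)
target = lowest_confidence_window(counts, win_h=4, win_w=2)
candidates = [(0, 0, 4, 3), (0, 5, 4, 8), (0, 7, 4, 8)]
chosen = pick_frames(candidates, target, k=1)
```

In the iterative scheme the abstract outlines, the chosen frames would be passed to Bundle Adjustment together with the current panorama, and the coverage counts would be recomputed before the next round.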