Title: Visual Structuring and Retrieval Based on Automatic Closed Caption Detection and Caption Processing
Original title: 利用自动化字幕侦测与字幕处理来撷取结构化之视讯内容
Authors: Ming-Ho Hsiao (萧铭和)
Suh-Yin Lee (李素瑛)
Institute of Computer Science and Engineering
Keywords: Video Segmentation; Caption Detection; Font Size Differentiation; Scene Identification; Caption Localization
Issue date: 2001
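Abstract: We propose a hierarchical method for browsing and indexing the content of tennis videos. Through video segmentation, automatic caption detection, and caption font size differentiation, each video is analyzed structurally and stored in a digital video database. For such a database, structured video content provides browsing capability, while the captions provide more meaningful information.
To build the hierarchical structure of a video, we propose and integrate several video processing techniques: video segmentation, selection of suitable shots, detection of whether a shot contains captions, and font size differentiation. We take tennis videos as the case study and use our automatic shot selection method to choose the shots that are passed on to caption detection. Within the shots found to contain captions, we then perform more precise automatic caption localization. With the proposed font size differentiation method, users can filter and select the more meaningful caption information, such as match scores and player names. Meaningful caption information supports high-level video structure analysis and video indexing, and can also serve as content description information in MPEG-7. All of the proposed methods operate directly on MPEG compressed video, which saves computation time and improves the efficiency of video processing. The experimental results show that the proposed methods perform satisfactorily.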
An efficient method for indexing and retrieving tennis video content using a hierarchical structure is proposed. The hierarchical structure is constructed through video segmentation, shot selection, and closed caption detection. The resulting video content representation provides browsing capabilities for digital video databases, and the video indexing supports more efficient content-based query and retrieval.
In this thesis, a novel approach to automatic closed caption detection and font size differentiation among localized text regions in the I-frames of MPEG videos is proposed. The approach consists of five modules: video segmentation, shot selection, caption frame detection, caption localization, and font size differentiation. Tennis videos are selected as the case study, and the shot selection module is designed to automatically select a specific type of shot for further closed caption detection. Noise among the potential captions is filtered out based on the long-term consistency of the potential caption regions detected over consecutive frames. Once the general closed captions are localized, font size differentiation is used as a filter to help users select specific, significant text captions. The significant closed captions, e.g. scores, can support high-level video structuring, video browsing, video indexing, and video content description in MPEG-7. Experimental results show the effectiveness and feasibility of the proposed scheme.
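The abstract does not give implementation details, so the following is only a minimal sketch of two of the ideas it names: filtering candidate caption regions by their consistency across consecutive I-frames, and filtering localized regions by font size. It assumes each candidate region is an exact `(x, y, width, height)` box and that box height is a usable proxy for font size. The names `consistent_regions`, `filter_by_font_size`, `min_frames`, and the sample coordinates are all hypothetical and do not come from the thesis.

```python
from collections import Counter
from typing import List, Set, Tuple

# Hypothetical representation: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def consistent_regions(frames: List[Set[Box]], min_frames: int) -> Set[Box]:
    """Keep candidate boxes that appear in at least min_frames of the given frames.

    Assumption: a stable caption yields the identical box in every frame.
    A real detector would more likely match boxes by overlap.
    """
    counts = Counter(box for boxes in frames for box in boxes)
    return {box for box, n in counts.items() if n >= min_frames}


def filter_by_font_size(boxes: Set[Box], min_height: int, max_height: int) -> List[Box]:
    """Keep boxes whose height (used here as a font size proxy) is within range."""
    return sorted(b for b in boxes if min_height <= b[3] <= max_height)


# Made-up candidate regions from five consecutive I-frames of one shot.
scoreboard = (20, 400, 180, 24)
player_name = (20, 370, 120, 14)
frames = [
    {scoreboard, player_name},
    {scoreboard, player_name, (300, 50, 40, 10)},  # transient false positive
    {scoreboard, player_name},
    {scoreboard},                                  # player name missed once
    {scoreboard, player_name, (10, 10, 30, 12)},   # transient false positive
]

# Regions seen in at least 4 of the 5 frames are treated as real captions.
stable = consistent_regions(frames, min_frames=4)

# Keep only the larger text, e.g. a scoreboard rather than a smaller label.
large = filter_by_font_size(stable, min_height=20, max_height=40)
```

In this sketch the two transient boxes are dropped by the consistency step, and the height range then separates the larger scoreboard text from the smaller player-name text. The thresholds are illustrative and would need tuning on real footage.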
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT900392049
http://hdl.handle.net/11536/68463
Appears in collections: Thesis