CNN Filter based Text Region Segmentation from Lecture Video and Extraction using Neuro OCR
DOI: https://doi.org/10.24113/ijoscience.v5i7.218
Abstract
Lecture videos are rich in textual information, and understanding that text is useful for broader video understanding and analysis applications. Although text recognition from images has been an active research area in computer vision, text in lecture videos has mostly been overlooked. This paper focuses on text extraction from different types of lecture videos, such as slide-based, whiteboard, and paper-based lecture videos. For text extraction, the text regions are segmented in the video frames and then extracted using recurrent neural network based OCR. Finally, the extracted text is converted into audio for convenience. The designed algorithm is tested on different videos from different lectures. The experimental results show that the proposed methodology is more efficient than existing work.
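The three stages named in the abstract (segment text regions, recognize them, convert the text to audio) can be illustrated with a minimal sketch. It is not the authors' implementation: simple thresholding with connected-component grouping stands in for the CNN-filter segmentation, and `recognize` and `speak` are hypothetical callables standing in for the recurrent neural network based OCR and the text-to-speech step. All function names and the `threshold` parameter are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Frame = Sequence[Sequence[int]]   # grayscale frame: rows of 0-255 pixel values
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive


def segment_text_regions(frame: Frame, threshold: int = 128) -> List[Box]:
    """Stand-in for the CNN-filter stage (assumption): return bounding boxes
    of 4-connected groups of dark pixels (value < threshold)."""
    height = len(frame)
    width = len(frame[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or frame[r][c] >= threshold:
                continue
            # Flood-fill one component while tracking its extent.
            stack = [(r, c)]
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and frame[ny][nx] < threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(frame: Frame, box: Box) -> List[List[int]]:
    """Cut the pixels inside an inclusive bounding box out of a frame."""
    top, left, bottom, right = box
    return [list(row[left:right + 1]) for row in frame[top:bottom + 1]]


def lecture_frames_to_speech(
    frames: Iterable[Frame],
    recognize: Callable[[List[List[int]]], str],  # hypothetical OCR model
    speak: Callable[[str], None],                 # hypothetical TTS backend
) -> List[str]:
    """Segment each frame, recognize each region, and voice the result."""
    transcript: List[str] = []
    for frame in frames:
        for box in segment_text_regions(frame):
            text = recognize(crop(frame, box))
            if text:
                transcript.append(text)
                speak(text)
    return transcript
```

Keeping the recognizer and the speech backend as injected callables means the segmentation step can be exercised on its own with tiny synthetic frames before any trained model is plugged in.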
License
Copyright (c) 2019 Ashima Godha, Puja Trivedi

This work is licensed under a Creative Commons Attribution 4.0 International License.
IJOSCIENCE follows an Open Journal Access policy. Authors retain the copyright of the original work and grant the rights of publication to the publisher with the work simultaneously licensed under a Creative Commons CC BY License that allows others to distribute, remix, adapt, and build upon your work, even commercially, as long as they credit you for the original creation. Authors are permitted to post their work in institutional repositories, social media or other platforms.
Under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.