CNN Filter based Text Region Segmentation from Lecture Video and Extraction using Neuro OCR

Authors

  • Ashima Godha, M. Tech. (CTA), RKDF School of Engineering, Indore, India
  • Puja Trivedi, Assistant Professor (CSE), RKDF School of Engineering, Indore, India

DOI:

https://doi.org/10.24113/ijoscience.v5i7.218

Abstract

Lecture videos are rich in textual information, and the ability to read that text is useful for larger video understanding and analysis applications. Although text recognition from images has been an active research area in computer vision, text in lecture videos has mostly been overlooked. This paper focuses on text extraction from different types of lecture videos, such as slide-based, whiteboard, and paper-based lectures. Text regions are first segmented in the video frames and then extracted using recurrent neural network based OCR. Finally, the extracted text is converted into audio for convenience. The designed algorithm is tested on different videos from different lectures. The experimental results show that the proposed methodology is more efficient than existing work.
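
As a rough illustration of the segmentation step only, the sketch below filters a grayscale frame with a fixed 3x3 kernel, thresholds the response, and returns the bounding box of the strong responses. It is not the authors' method: the Laplacian kernel (a hand-picked stand-in for a learned CNN filter), the threshold, the function names, and the synthetic frame are all assumptions made for illustration. The OCR and text-to-audio stages are not shown.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale frame, pixel values 0..255

# Assumption: a fixed Laplacian kernel stands in for a learned CNN filter.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def filter3x3(frame: Frame, kernel: List[List[int]]) -> List[List[int]]:
    """Apply a 3x3 kernel to interior pixels; return absolute responses.

    The output has shape (h - 2, w - 2) because border pixels are skipped.
    """
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = 0
            for dy in range(3):
                for dx in range(3):
                    acc += kernel[dy][dx] * frame[y + dy - 1][x + dx - 1]
            row.append(abs(acc))
        out.append(row)
    return out


def text_region_bbox(frame: Frame,
                     threshold: int = 200) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right), inclusive, in frame coordinates.

    The box covers every pixel whose filter response reaches the threshold;
    None is returned when no pixel does.
    """
    response = filter3x3(frame, LAPLACIAN)
    # Shift by +1 to map response indices back to frame coordinates.
    hits = [(y + 1, x + 1)
            for y, row in enumerate(response)
            for x, value in enumerate(row)
            if value >= threshold]
    if not hits:
        return None
    ys = [p[0] for p in hits]
    xs = [p[1] for p in hits]
    return (min(ys), min(xs), max(ys), max(xs))


def make_demo_frame() -> Frame:
    """Synthetic 12x16 white frame with three dark vertical strokes."""
    frame = [[255] * 16 for _ in range(12)]
    for y in range(4, 7):
        for x in (5, 7, 9):
            frame[y][x] = 0
    return frame


if __name__ == "__main__":
    print(text_region_bbox(make_demo_frame()))
```

In a full pipeline, the frame would be cropped to the returned box and the crop handed to the OCR stage. Because the filter also responds on the white pixels bordering each stroke, the box extends one pixel beyond the strokes on each side.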

Published

07/28/2019

How to Cite

Godha, A., & Trivedi, P. (2019). CNN Filter based Text Region Segmentation from Lecture Video and Extraction using Neuro OCR. SMART MOVES JOURNAL IJOSCIENCE, 5(7), 30–35. https://doi.org/10.24113/ijoscience.v5i7.218