Abstract
In e-Learning, teachers typically record lecture videos in an e-class and upload them to the e-Learning system themselves. If lecture videos and handouts could instead be generated automatically in a traditional classroom, this would help students with self-learning and help teachers develop lecture content for e-Learning services. This paper proposes a computer-vision-based teaching assistant system that supports content development for e-Learning services. Lectures are recorded with two cameras, and the two views are merged so that students can see the teaching content clearly and completely. K-means segmentation is used to extract the board area, and a connected-component technique then fills in the parts of the board area that are covered by the lecturer’s body. An adaptive threshold extracts the handwriting under varying lighting conditions, and a time-series denoising technique is designed to reduce noise. Based on the extracted handwriting, the lecture videos can be automatically structured at a high semantic level: the videos are segmented into clips, and all key-frames are combined into handouts for the lecture videos.
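The handwriting extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes dark strokes on a lighter board, uses a plain local-mean threshold as a stand-in for the adaptive threshold, and uses a simple vote across consecutive frames as a stand-in for the time-series denoising. The function names and the parameters `radius`, `offset`, and `min_hits` are hypothetical.

```python
from typing import List

Frame = List[List[int]]   # grayscale frame: rows of 0-255 intensities
Mask = List[List[bool]]   # True where a pixel is treated as handwriting


def adaptive_threshold(frame: Frame, radius: int = 1, offset: int = 10) -> Mask:
    """Mark a pixel as handwriting if it is darker than the mean of its
    local window by more than `offset` (assumption: dark ink, light board)."""
    h, w = len(frame), len(frame[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Window is clipped at the image border.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [frame[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            mask[y][x] = frame[y][x] < local_mean - offset
    return mask


def temporal_denoise(masks: List[Mask], min_hits: int) -> Mask:
    """Keep a pixel only if it was marked in at least `min_hits` of the
    given masks, so marks that appear in a single frame are dropped."""
    h, w = len(masks[0]), len(masks[0][0])
    return [
        [sum(m[y][x] for m in masks) >= min_hits for x in range(w)]
        for y in range(h)
    ]
```

Because each pixel is compared with its own neighbourhood rather than with a global value, the same stroke can still be detected when the overall brightness of the frame changes, which is the motivation for an adaptive threshold. The vote across frames relies on handwriting persisting over time while noise usually does not.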
Cite this article
Lee, G.C., Yeh, FH., Chen, YJ. et al. Robust handwriting extraction and lecture video summarization. Multimed Tools Appl 76, 7067–7085 (2017). https://doi.org/10.1007/s11042-016-3353-y
DOI: https://doi.org/10.1007/s11042-016-3353-y