Comparison of text-image fusion models for high school diploma certificate classification


Chandra Ramadhan Atmaja Perdana
Hanung Adi Nugroho
Igi Ardiyanto


Scanned documents are commonly used in this digital era. Extracting text and images from scanned documents plays an important role in acquiring information. A document may contain both text and images. Combined text-image classification has been investigated previously, but in those studies the text was provided digitally. In this research, we used a dataset of high school diploma certificates, for which the text must be acquired using optical character recognition (OCR). The certificates were classified under two categories, each with three classes. We used convolutional neural networks for both text and image classification. We then combined the two models using an adaptive fusion model and a weight fusion model to find the best fusion model. We conclude that the weight fusion model, with a performance of 0.927, outperforms the adaptive fusion model, with 0.892.
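As a rough illustration of the weight fusion idea, the sketch below combines the class probabilities of a text model and an image model with a fixed weight. It assumes that weight fusion means a convex combination of the two models' softmax outputs; the abstract does not give the exact formulation, so the function names, the weight of 0.6, and the probability values are all hypothetical.

```python
from typing import List, Sequence


def weight_fusion(text_probs: Sequence[float],
                  image_probs: Sequence[float],
                  text_weight: float = 0.5) -> List[float]:
    """Fuse two class-probability vectors with a fixed convex weight.

    Assumption: fusion is a weighted average of the two models' outputs.
    """
    if len(text_probs) != len(image_probs):
        raise ValueError("both models must score the same classes")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    image_weight = 1.0 - text_weight
    return [text_weight * t + image_weight * i
            for t, i in zip(text_probs, image_probs)]


def predict(fused: Sequence[float]) -> int:
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused)), key=lambda k: fused[k])


# Hypothetical softmax outputs for one scanned certificate (three classes).
text_probs = [0.20, 0.70, 0.10]   # text CNN applied to the OCR output
image_probs = [0.50, 0.30, 0.20]  # image CNN applied to the scan itself

fused = weight_fusion(text_probs, image_probs, text_weight=0.6)
label = predict(fused)
```

In practice the weight would likely be tuned on a validation set, and an adaptive scheme would choose it per sample rather than fixing it in advance.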




How to Cite
Atmaja Perdana, C. R., Adi Nugroho, H., & Ardiyanto, I. (2020). Comparison of text-image fusion models for high school diploma certificate classification. Communications in Science and Technology, 5(1), 5-9.
Author Biographies

Hanung Adi Nugroho, Universitas Gadjah Mada

Department of Electrical and Information Engineering

Igi Ardiyanto, Universitas Gadjah Mada

Department of Electrical and Information Engineering

