Comparison of text-image fusion models for high school diploma certificate classification

Chandra Ramadhan Atmaja Perdana; Hanung Adi Nugroho; Igi Ardiyanto

doi:10.21924/cst.5.1.2020.172

Keywords:

text classification, image classification, text-image classification, convolutional neural network

Chandra Ramadhan Atmaja Perdana

Universitas Gadjah Mada

Hanung Adi Nugroho

Universitas Gadjah Mada

Igi Ardiyanto

Universitas Gadjah Mada

Abstract

Scanned document files are commonly used in this digital era, and extracting text and images from them plays an important role in acquiring information. A document may contain both text and images, and combined text-image classification has been investigated previously. In those studies, however, the text in the datasets was provided digitally. In this research, we used a dataset of high school diploma certificates, for which the text must be acquired using optical character recognition (OCR). The certificates were labeled under two categories, each with three classes. We used convolutional neural networks for both text and image classification, then combined the two models using an adaptive fusion model and a weight fusion model to find the better fusion approach. We conclude that the weight fusion model, with a performance of 0.927, outperforms the adaptive fusion model, with 0.892.
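
As an illustration of the general idea of combining a text classifier and an image classifier, the following is a minimal sketch of a fixed-weight late fusion of two class-probability vectors. It is an assumption for illustration only: the abstract does not give the exact formulation of the weight fusion or adaptive fusion models, and the function names, the weight `w_text`, and the example probabilities are all hypothetical rather than taken from the paper.

```python
def weight_fusion(p_text, p_image, w_text=0.5):
    """Blend two class-probability vectors with one fixed weight.

    Hypothetical helper: w_text is the share given to the text model,
    and the image model receives the remaining 1 - w_text.
    """
    if len(p_text) != len(p_image):
        raise ValueError("probability vectors must have the same length")
    if not 0.0 <= w_text <= 1.0:
        raise ValueError("w_text must lie in [0, 1]")
    return [w_text * t + (1.0 - w_text) * i for t, i in zip(p_text, p_image)]


def predict(p_fused):
    """Return the index of the class with the highest fused probability."""
    return max(range(len(p_fused)), key=lambda k: p_fused[k])


# Made-up softmax outputs for one certificate over three classes.
p_text = [0.2, 0.5, 0.3]   # text model (fed by OCR output) favors class 1
p_image = [0.6, 0.3, 0.1]  # image model favors class 0

fused = weight_fusion(p_text, p_image, w_text=0.3)
print(predict(fused))
```

With a low text weight, the fused prediction follows the image model, which is one plausible way to limit the influence of noisy OCR text. In practice the weight would be tuned on validation data rather than chosen by hand.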

How to Cite

Atmaja Perdana, C. R., Adi Nugroho, H., & Ardiyanto, I. (2020). Comparison of text-image fusion models for high school diploma certificate classification. Communications in Science and Technology, 5(1), 5-9. https://doi.org/10.21924/cst.5.1.2020.172

Issue

Vol. 5 No. 1 (2020)

Section

Articles

Author Biographies

Hanung Adi Nugroho, Universitas Gadjah Mada

Department of Electrical and Information Engineering

Igi Ardiyanto, Universitas Gadjah Mada

Department of Electrical and Information Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright

Open Access authors retain the copyrights of their papers, and all open access articles are distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided that the original work is properly cited.

The use of general descriptive names, trade names, trademarks, and so forth in this publication, even if not specifically identified, does not imply that these names are not protected by the relevant laws and regulations.

While the advice and information in this journal are believed to be true and accurate on the date of its going to press, neither the authors, the editors, nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
