Detection and Recognition of Hindi Text from Natural Scenes and its Transliteration to English
Article Details
COPYRIGHT
Submission of a manuscript implies that the work described has not been published before; that it is not under consideration for publication elsewhere; and that, if and when the manuscript is accepted for publication, the authors agree to automatic transfer of the copyright to the publisher.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as this can lead to productive exchanges, as well as earlier and greater citation of published work.
- The journal allows the author(s) to retain publishing rights without restrictions.
- The journal allows the author(s) to hold the copyright without restrictions.
References
B. Epshtein, E. Ofek, and Y. Wexler, “Detecting text in natural scenes with stroke width transform,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 2963–2970, 2010, doi: 10.1109/CVPR.2010.5540041.
S. Bhargava and E. Yablonovitch, “Lowering HAMR near-field transducer temperature via inverse electromagnetic design,” IEEE Trans. Magn., vol. 51, no. 4, 2015, doi: 10.1109/TMAG.2014.2355215.
S. Karim, A. A. Laghari, A. Halepoto, A. Manzoor, N. Hussain Phulpoto, and A. Ali, “Vehicle detection in satellite imagery using maximally stable extremal regions,” IJCSNS Int. J. Comput. Sci. Netw. Secur., vol. 18, no. 4, pp. 75–78, 2018.
I. Ahmad and G. A. Fink, “Handwritten Arabic text recognition using multi-stage sub-core-shape HMMs,” Int. J. Doc. Anal. Recognit., vol. 22, no. 3, pp. 329–349, 2019, doi: 10.1007/s10032-019-00339-8.
X. Zhou et al., “EAST: An efficient and accurate scene text detector,” Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 5551–5560, 2017.
J. Wang and X. Hu, “Gated recurrent convolution neural network for OCR,” Adv. Neural Inf. Process. Syst. (NIPS), 2017.
P. Shivakumara, D. Tang, M. Asadzadehkaljahi, T. Lu, U. Pal, and M. H. Anisi, “CNN-RNN based method for license plate recognition,” CAAI Trans. Intell. Technol., vol. 3, no. 3, pp. 169–175, 2018, doi: 10.1049/trit.2018.1015.
L. Giridhar, A. Dharani, and V. Guruviah, “A novel approach to OCR using image recognition based classification for ancient Tamil inscriptions in temples,” arXiv, pp. 1–8, 2019.
S. Prajapati, S. R. Joshi, A. Maharjan, and B. Balami, “Evaluating performance of Nepali script OCR using Tesseract and artificial neural network,” Proc. 2018 IEEE 3rd Int. Conf. Comput. Commun. Secur. (ICCCS), pp. 104–107, 2018, doi: 10.1109/CCCS.2018.8586808.
A. S., J. Yankey, and E. O., “An automatic number plate recognition system using OpenCV and Tesseract OCR engine,” Int. J. Comput. Appl., vol. 180, no. 43, pp. 1–5, 2018, doi: 10.5120/ijca2018917150.
P. Duygulu, K. Barnard, J. F. G. de Freitas, and D. A. Forsyth, “Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary,” Lect. Notes Comput. Sci., vol. 2353, pp. 97–112, 2002, doi: 10.1007/3-540-47979-1_7.
T. Deselaers, S. Hasan, O. Bender, and H. Ney, “A deep learning approach to machine transliteration,” Proc. 4th Workshop Stat. Mach. Transl., p. 233, 2009, doi: 10.3115/1626431.1626476.
M. Alam and S. ul Hussain, “Sequence to sequence networks for Roman-Urdu to Urdu transliteration,” arXiv, pp. 1–7, 2017.
Y. Wu et al., “Google's neural machine translation system: Bridging the gap between human and machine translation,” arXiv:1609.08144, 2016. [Online]. Available: http://arxiv.org/abs/1609.08144.
T. Q. Phan, P. Shivakumara, S. Tian, and C. L. Tan, “Recognizing text with perspective distortion in natural scenes,” Proc. IEEE Int. Conf. Comput. Vis., pp. 569–576, 2013, doi: 10.1109/ICCV.2013.76.
P. Dollár, R. Appel, S. Belongie, and P. Perona, “Fast feature pyramids for object detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 8, pp. 1532–1545, 2014, doi: 10.1109/TPAMI.2014.2300479.
K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 770–778, 2016, doi: 10.1109/CVPR.2016.90.
M. Liao, B. Shi, X. Bai, X. Wang, and W. Liu, “TextBoxes: A fast text detector with a single deep neural network,” 31st AAAI Conf. Artif. Intell. (AAAI), pp. 4161–4167, 2017.
Y. Zhu and J. Du, “Sliding line point regression for shape robust scene text detection,” arXiv, pp. 3735–3740, 2018.
S. R. Laskar, A. Dutta, P. Pakray, and S. Bandyopadhyay, “Neural machine translation: English to Hindi,” 2019 IEEE Conf. Inf. Commun. Technol. (CICT), pp. 25–30, 2019, doi: 10.1109/CICT48419.2019.9066238.
W. Wang et al., “Shape robust text detection with progressive scale expansion network,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 9328–9337, 2019, doi: 10.1109/CVPR.2019.00956.
T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature pyramid networks for object detection,” Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 2117–2125, 2017.
F. Milletari, N. Navab, and S. A. Ahmadi, “V-Net: Fully convolutional neural networks for volumetric medical image segmentation,” Proc. 2016 4th Int. Conf. 3D Vision (3DV), pp. 565–571, 2016, doi: 10.1109/3DV.2016.79.
A. Shrivastava, A. Gupta, and R. Girshick, “Training region-based object detectors with online hard example mining,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 761–769, 2016, doi: 10.1109/CVPR.2016.89.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2323, 1998, doi: 10.1109/5.726791.
S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” 32nd Int. Conf. Mach. Learn. (ICML), vol. 1, pp. 448–456, 2015.
X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” J. Mach. Learn. Res., vol. 15, pp. 315–323, 2011.
B. Leibe, J. Matas, N. Sebe, and M. Welling, “Preface,” Lect. Notes Comput. Sci., vol. 9906 LNCS, pp. VII–IX, 2016, doi: 10.1007/978-3-319-46493-0.
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 248–255, 2009, doi: 10.1109/cvprw.2009.5206848.
K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification,” in 2015 IEEE Int. Conf. Comput. Vis. (ICCV), 2015, pp. 1026–1034, doi: 10.1109/ICCV.2015.123.
A. Khan and A. Sarfaraz, “RNN-LSTM-GRU based language transformation,” Soft Comput., vol. 23, no. 24, pp. 13007–13024, 2019, doi: 10.1007/s00500-019-04281-z.