Bangla OCR
Table of Contents
Introduction
Basic Study
Proposed Methodology and Implementation

Introduction

With the advent of computers and Internet technology, the possibilities for collecting data and using it for various purposes have exploded. The possibilities are particularly enticing when it comes to textual data. Converting the vast amount of text accumulated over the course of human history into digital format is vital for preservation, data mining, sentiment analysis, and similar uses, all of which contribute to the progress of our society. The tool used for this purpose is called OCR.

Like many other languages, Bengali can also benefit from OCR technology, especially since it is the seventh most spoken language in the world, with approximately 300 million speakers. The Bengali-speaking population is found mainly in Bangladesh, in the Indian states of West Bengal, Assam, and Tripura, and in the union territory of the Andaman and Nicobar Islands, as well as in the ever-growing diaspora in the United Kingdom (UK), the United States (US), Canada, the Middle East, Australia, Malaysia, etc. Thus, progress in the digital use of the Bangla language is of interest to many countries.

Basic Study

OCR is the short form of optical character recognition. It is a technology for converting images of printed or handwritten text into a machine-readable, i.e. digital, format. Although OCR today is primarily focused on digitizing text, early OCR systems were analog. The world's first OCR is considered to have been invented by American inventor Charles R. Carey, who built an image transmission system using a mosaic of photocells. Later inventions focused on scanning documents to produce more copies or to convert them into telegraphic code, and then the digital format gradually became more popular. In 1966, the IBM laboratory in Rochester developed the IBM 1287, the first scanner capable of reading handwritten digits.
The first commercial OCR was introduced in 1977 by Caere Corporation. OCR began to be available online as a service (WebOCR) in 2000 on various platforms via cloud computing.

Depending on its method, OCR can be divided into two types:

Online OCR (not to be confused with “online” in the Internet sense) involves the automatic conversion of text as it is written on a special digitizer or PDA, where a sensor picks up the movements of the pen tip as well as pen-up/pen-down switching. This type of data is known as digital ink and can be thought of as a digital representation of handwriting. The resulting signal is converted into letter codes usable in computer and word processing applications.

Offline OCR scans an image as a whole and does not process stroke order. It is a kind of image processing because it attempts to recognize character patterns in given image files.

Online OCR can only process text as it is being written in real time, while offline OCR can process images of handwritten and printed text without any special devices.

Most of the successful research on Bengali OCR to date has been carried out on printed text, although researchers are gradually moving towards handwritten text recognition. Sanchez and Pal proposed a classical line-based approach for continuous recognition of Bengali handwriting, based on hidden Markov models and n-gram models. They used both a word-based LM (language model) and a character-based LM in their experiment and found better results with the word-based LM. Garain, Mioulet, Chaudhuri, Chatelain and Paquet developed a recurrent neural network model to recognize unconstrained Bangla handwriting at the character level. They used a BLSTM-CTC-based recognizer on a dataset consisting of 2,338 unconstrained Bengali handwritten lines, or about 21,000 words in total. Instead of horizontal segmentation, they chose vertical segmentation, classifying words into “semi-ortho syllables”.
Their experiment yielded an accuracy of 75.40% without any post-processing. Hasnat, Chowdhury and Khan developed a Tesseract-based OCR for the Bangla script, which they tested on printed documents. They achieved a maximum accuracy of 93% on clean printed documents and a minimum accuracy of 70% on a screen-printed image. Evidently this approach is very sensitive to variations in letter shape and is not well suited to recognizing handwritten Bengali characters. Chowdhury and Rahman proposed an optimal neural network setting for recognizing Bengali handwritten digits, consisting of two convolution layers with tanh activation, a hidden layer with tanh activation, and an output layer with softmax activation. To recognize the 10 Bangla numerals, they used a dataset of 70,000 samples and reported an error rate of 1.22% to 1.33%. Purkayastha, Datta and Islam also used a convolutional neural network for Bangla handwritten character recognition. They were the first to work on handwritten Bengali compound characters. Their recognition experiment also included numerals and alphabet letters. They achieved 98.66% accuracy on numerals and 89.93% on almost all Bengali characters (80 classes).

Several projects have been developed for Bangla OCR, though it should be noted that none of them works on handwritten text:

BanglaOCR is an open source OCR developed by Hasnat, Chowdhury and Khan, which uses the Google Tesseract engine for character recognition and works on printed documents, as discussed in section 3.1.

Puthi OCR, also known as GIGA Text Reader, is a cross-platform Bangla OCR application developed by Giga TECH. This application works on printed materials written in Bengali, English and Hindi. The Android app version is free to download, but the desktop and enterprise versions require payment.

Chitrolekha is another Bangla OCR using Google Tesseract and the OpenCV image library.
The app is free and may have been available on the Google Play Store in the past, but as of July 15, 2018 it is no longer available.

i2OCR is a multilingual OCR supporting over 60 languages, including Bengali.

Proposed Methodology and Implementation

Deep Convolutional Neural Network

First of all, let's try to understand what a convolutional neural network (CNN) is. Neural networks are machine learning tools inspired by the architecture of the human brain. The most basic version of the artificial neuron is called a perceptron, which makes a decision by weighing its inputs and comparing the result against a threshold value. A neural network is made up of interconnected perceptrons whose connectivity can differ in various configurations. The simplest network topology is the feedforward network consisting of three layers: input layer, hidden layer, and output layer. Deep neural networks have more..
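
As a minimal sketch of the perceptron and the three-layer feedforward topology described above, the following Python example may help. The function names (`perceptron`, `feedforward`) and the hand-picked weights and thresholds are assumptions made purely for illustration; they are not taken from the essay's implementation, and a real network would learn its weights from data rather than have them fixed by hand.

```python
from typing import List, Sequence, Tuple

# A layer is a list of (weights, threshold) pairs, one pair per perceptron.
Layer = List[Tuple[Sequence[float], float]]


def perceptron(inputs: Sequence[float], weights: Sequence[float], threshold: float) -> int:
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0


def feedforward(inputs: Sequence[float], hidden_layer: Layer, output_layer: Layer) -> List[int]:
    """Pass the inputs through one hidden layer, then through the output layer."""
    hidden = [perceptron(inputs, w, t) for w, t in hidden_layer]
    return [perceptron(hidden, w, t) for w, t in output_layer]


# Illustrative, hand-picked values (not learned): together they compute XOR,
# which a single perceptron cannot do but a three-layer network can.
HIDDEN: Layer = [
    ([1.0, 1.0], 1.0),  # fires when at least one input is 1 (OR)
    ([1.0, 1.0], 2.0),  # fires only when both inputs are 1 (AND)
]
OUTPUT: Layer = [
    ([1.0, -1.0], 1.0),  # fires when OR is on and AND is off
]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, feedforward([a, b], HIDDEN, OUTPUT)[0])
```

The hidden layer is what lets the network represent a decision that no single threshold on the raw inputs could express, which is the motivation for stacking layers.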