Document recognition on a real-world case: an overview of available paid and free solutions

Hello! Here is a typical situation at the company where I work. Accounting is in an eternal rush, there are not enough people, and everyone is busy with something absolutely important but essentially useless. Management was not happy with this state of affairs.



In more detail: accounting does not have enough resources for its current tasks, and no one wants to fund new positions. So the decision came from above to cut some tasks and free up the accountants' time for more useful things. Scanning and recognizing documents, copying and filing them, and other routine joys came under the knife.



So, as an analyst, I was given the task of finding a way to recognize a document typical for my company, an invoice, and to get its data in structured form into our existing storage and into 1C. The solution had to be convenient, easy to understand, and not cost the company a pretty penny.



The experience turned out to be interesting, so I decided to share what I managed to collect. I may have missed something, so if you have anything to add, welcome to the comments.





Programs for scanning and recognizing documents are nothing new on the market. Recognition is available both in free standalone programs and as a feature built into larger systems.



I started with free programs:



  • gImageReader
  • Paperwork
  • VietOCR
  • CuneiForm.


When I ran our invoice through these programs, I saw the following:



  • In VietOCR, Paperwork, and gImageReader you can configure scanned documents to be stored in specific folders, and Paperwork can even sort them by labels.
  • They generally do well with plain text, and where the text is recognized incorrectly, some programs let you correct the content manually before exporting the file.





However, there are also problems:



  • Working with pdf scans differs from working with png. Converting png to pdf does not always succeed.
  • Most of these programs struggle with tabular documents, even in the simplest format. As a result, we get the recognized text without any marked-up fields.




  • Sometimes the font is detected inaccurately, so after conversion the recognized text overlaps itself.
  • During recognition, you sometimes have to align the document by keywords, rotating it and shifting coordinates.
  • In some programs, the table was recognized as a picture and exported to a new Word document as a picture too, heavily cropped and hard to make out.
  • Editing the recognized content caused problems in some programs: the font, or the text itself, changed.
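To illustrate what "recognized text without marked-up fields" means in practice, here is a minimal sketch of the post-processing it would take to pull invoice details back out of flat OCR text. The sample text, the field names, and the patterns are all hypothetical; a real invoice layout would need its own patterns, and none of the programs above ship this code.

```python
import re

# Hypothetical flat OCR output: the table structure is lost, only text remains.
ocr_text = """
INVOICE No. 1047 dated 12.03.2021
Supplier: Romashka LLC
Item Qty Price Amount
Paper A4 10 250.00 2500.00
Toner 2 1800.00 3600.00
Total: 6100.00
"""

# Hypothetical field names and patterns, chosen for this sample only.
PATTERNS = {
    "number": re.compile(r"INVOICE\s+No\.\s*(\d+)"),
    "date": re.compile(r"dated\s+(\d{2}\.\d{2}\.\d{4})"),
    "supplier": re.compile(r"Supplier:\s*(.+)"),
    "total": re.compile(r"Total:\s*([\d.]+)"),
}


def extract_fields(text):
    """Return a dict of field name -> matched string, or None if not found."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    return fields


fields = extract_fields(ocr_text)
print(fields)
```

Patterns like these break as soon as the OCR misreads a keyword, which is one reason flat text output is a poor fit for a large flow of invoices.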







Overall, the technology worked well enough. Considering that the programs are free, the problems described above are acceptable. However, I was looking for a more streamlined solution.



Next, I looked into recognition in ABBYY FineReader 15 Corporate.



I studied this product during its 7-day trial period.



What I noted:



  • When I opened the png file, it was read perfectly and then converted to pdf without any loss of image or text quality.
  • , . png , .
  • - pdf. .
  • , , .
  • OCR pdf -. - .




  • , , . , , .




  • You can set up automatic conversion of incoming documents, which are picked up regularly from a specified folder on a specified schedule.
  • It lets you compare versions of a document, even if they are in different formats. With a large flow of documents and edits, this is very convenient.
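The "pick up incoming documents from a folder" idea in the list above can be sketched in a few lines. This is not how FineReader implements it; it is only a generic illustration, and the function names, the extension list, and the `handler` callback are my own assumptions.

```python
from pathlib import Path


def find_new_scans(inbox, seen, extensions=(".png", ".jpg", ".pdf")):
    """Return files in `inbox` that were not processed yet, sorted by name."""
    new_files = []
    for path in sorted(Path(inbox).iterdir()):
        is_scan = path.is_file() and path.suffix.lower() in extensions
        if is_scan and path.name not in seen:
            new_files.append(path)
    return new_files


def process_inbox(inbox, seen, handler):
    """Run `handler` on each new file and remember it so it is not redone."""
    processed = []
    for path in find_new_scans(inbox, seen):
        handler(path)  # hypothetical callback, e.g. send the file to OCR
        seen.add(path.name)
        processed.append(path.name)
    return processed
```

Calling `process_inbox` from a scheduler, or from a loop with a pause between passes, would give the "on a schedule" behaviour; the `seen` set is what keeps a file from being converted twice.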


I had a pleasant experience with this software. However, when I looked at the price of the system-level solution, ABBYY FlexiCapture (and a system-level solution is what I need), I found out that it costs a tidy sum, especially when customized: from about 400 thousand rubles per month for 10 thousand pages.



I started looking for an alternative: how to free up an employee's hands, get high-quality document recognition, and not have to worry about the safety and structure of the data.



And then I decided to take a closer look at ELMA RPA, which I had already studied earlier.



The vendor proposes shifting a significant part of the work of exporting data to ERP from the shoulders of accountants to robots. This is exactly what solves the problem I was given. To get acquainted with recognition in this system, I requested a trial version from the vendor.



Here I discovered that recognition is not meant for converting the recognized data into a new document file.



The main goal here is to recognize the details of the document and transfer them to other systems, sites, or applications. In addition, the robots put all the information where it belongs: they automatically find the necessary folders and save files in the required formats.



The types of recognition I looked at in the system:



Template-based recognition



Here the loaded document is recognized against a document template. As far as I know, this type of recognition is free and has the Tesseract engine built in.



What I noted:



  • This type of recognition works with scans in jpg and png formats; it does not handle pdf yet. But the product is still young, so I think everything is still ahead.
  • This type of recognition is included in the free Community Edition.
  • The text is conveniently marked up in blocks that can be mapped to the variables we created in the robot's context. This way we manually configure what exactly gets pulled out during recognition.
  • It recognized our invoice 50/50 and changed some words as it saw fit. :)
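The idea of mapping marked-up blocks to robot variables can be sketched as follows. This is not ELMA RPA's actual implementation or data model; the `Word` and `Zone` classes, the coordinates, and the variable names are hypothetical, and the word boxes stand in for what an OCR engine such as Tesseract reports for each recognized word.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: int  # left edge of the word box, in pixels
    y: int  # top edge of the word box, in pixels


@dataclass
class Zone:
    variable: str  # name of the robot variable this block is mapped to
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, word):
        """True if the word's top-left corner lies inside the zone."""
        return self.left <= word.x < self.right and self.top <= word.y < self.bottom


def fill_variables(words, zones):
    """Join the words that fall into each zone, in reading order."""
    result = {}
    for zone in zones:
        inside = [w for w in words if zone.contains(w)]
        inside.sort(key=lambda w: (w.y, w.x))
        result[zone.variable] = " ".join(w.text for w in inside)
    return result


# Hypothetical template and OCR output, for illustration only.
zones = [
    Zone("invoice_number", 400, 0, 600, 40),
    Zone("supplier", 0, 50, 600, 90),
]
words = [
    Word("Invoice", 300, 10), Word("1047", 420, 10),
    Word("LLC", 120, 60), Word("Romashka", 20, 60),
]
variables = fill_variables(words, zones)
print(variables)
```

A fixed-zone template like this only works when every scan has the same layout and alignment, which fits the vendor's remark that this mode suits simple documents.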








However, the vendor told me that this type of recognition is meant for simple documents with plain text structure or light forms, and advised using another type of recognition for the invoice: Intellect Lab (iLab).



The process is the same: we load a template and recognize documents against it. But here the template is sent to a cloud server.



The server responds whether it recognizes this type of document or not. If it does, it returns the template structure (the variables for mapping), which we match to the variables that will be filled in the RPA process.



During playback, we send the document we want to recognize and receive the recognition result from the iLab server.



What I noted about this recognition:



  • It works with pdf, jpg, and png.
  • . .
  • - .
  • , 1. , , , , .
  • Community Edition . , (, , .), , 100 500 . ( , , .)


The recognition process itself is rather hard to show on video, since it happens under the hood and the screen stays empty for several seconds. So, for visualization, I had the robot write the recognized data into Notepad as a separate step.



Recognizing a document and writing the data to Notepad



In the same way, the robot writes this data to 1C, creating a new document there:



Document recognition and creation in 1C



What we managed to find out about prices: if, for example, we want to use iLab recognition at scale, then for our 10,000 documents we would have to pay:



  • about 180,000 rubles as a one-time payment for recognition,
  • plus, say, 400,000 rubles for a robot with an orchestrator,
  • total: 580,000 rubles.


The robot is unlimited, and 10,000 documents will last for some time. It works out quite profitable, if only because we pay for everything once.
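To make the comparison with the subscription price concrete, here is a rough sketch of the arithmetic using only the figures quoted in this article. The assumptions are mine: one page equals one document, the subscription stays at its lower bound of 400,000 rubles per month for up to 10,000 pages, and recognition for the robot is bought in packs of 10,000 documents at 180,000 rubles each. The function names are hypothetical.

```python
ABBYY_MONTHLY = 400_000    # rubles per month for 10,000 pages (lower bound quoted above)
ELMA_ROBOT_ONCE = 400_000  # rubles, one-time: robot with an orchestrator
ELMA_PACK = 180_000        # rubles per pack of recognized documents
PACK_SIZE = 10_000         # documents per pack


def subscription_cost(months):
    """Cumulative cost of the monthly subscription."""
    return ABBYY_MONTHLY * months


def one_time_cost(months, docs_per_month):
    """Cumulative cost of the robot plus as many packs as the volume needs."""
    total_docs = months * docs_per_month
    packs = -(-total_docs // PACK_SIZE)  # integer ceiling division
    return ELMA_ROBOT_ONCE + ELMA_PACK * packs


def break_even_month(docs_per_month, horizon=60):
    """First month in which the one-time purchase is no more expensive."""
    for month in range(1, horizon + 1):
        if one_time_cost(month, docs_per_month) <= subscription_cost(month):
            return month
    return None


print(one_time_cost(1, 10_000))   # 580000, the total from the list above
print(break_even_month(10_000))   # 2
```

Under these assumptions, at 10,000 documents a month the one-time purchase catches up with the subscription in the second month (760,000 against 800,000 rubles); with a smaller monthly volume a single pack lasts longer, so the gap only widens.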



What we liked about recognition on this platform in general:



  • , , . .
  • , , , . .
  • . 15 , β€” . , .
  • , .


Conclusions:



  • Free programs cope with document recognition better than I expected; however, they will not significantly speed up work with a large volume of documents.
  • ABBYY FineReader handles recognition and subsequent processing of documents well; however, a system-level solution requires a substantial budget.
  • ELMA RPA surprised me with the quality of its document recognition, its flexibility, and its options for storing and transferring data after recognition, but keep in mind that the product is young.


