19:49, 14 December 2025

In Russia, Artificial Intelligence Learns Speed Reading

The new method dramatically accelerates the setup of optical character recognition systems and improves their accuracy when working with real-world documents.

The method was developed by researchers at NUST MISIS in Russia. It cuts OCR training time from several weeks to just 72 hours, making such systems far more practical for business and government document workflows.

Avoiding Errors

Optical character recognition is widely used to digitize contracts, invoices, archival materials, and other documents. In real-world conditions, however, OCR systems often make errors when documents carry stamps, signatures, or non-standard fonts, or when the scan quality is poor. Improving accuracy typically requires lengthy and costly training. MISIS researchers proposed a different approach that combines classical machine learning methods with modern generative neural networks.

At the core of the development is a closed interaction loop between the OCR engine and a language model. The system independently analyzes recognition results, identifies recurring errors, and corrects them, generating new training data in the process.
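The article does not publish the implementation, so the following is only a minimal sketch of what such a loop could look like. The function names, the `ocr` and `correct` callables, and the `min_count` threshold are all hypothetical; `correct` stands in for the language model, and character-level substitutions are just one possible way to spot recurring errors.

```python
from collections import Counter
from difflib import SequenceMatcher


def build_training_pairs(images, ocr, correct):
    """Run OCR, ask the corrector to fix each result, keep the pairs that changed.

    `ocr` and `correct` are hypothetical callables: the first stands in for the
    OCR engine, the second for the language model.
    """
    pairs = []
    for image in images:
        raw = ocr(image)
        fixed = correct(raw)
        if fixed != raw:
            pairs.append((raw, fixed))
    return pairs


def recurring_confusions(pairs, min_count=2):
    """Count same-length character substitutions that occur at least `min_count` times."""
    counts = Counter()
    for raw, fixed in pairs:
        matcher = SequenceMatcher(None, raw, fixed, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "replace" and (i2 - i1) == (j2 - j1):
                for wrong, right in zip(raw[i1:i2], fixed[j1:j2]):
                    counts[(wrong, right)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy stand-ins: the "images" are just keys, and the corrector is a lookup table.
fake_ocr_output = {"scan_1": "1O1", "scan_2": "2O5", "scan_3": "300"}
fake_corrections = {"1O1": "101", "2O5": "205"}

pairs = build_training_pairs(
    fake_ocr_output,
    ocr=fake_ocr_output.get,
    correct=lambda text: fake_corrections.get(text, text),
)
confusions = recurring_confusions(pairs)
```

In this toy run the corrected pairs double as new training examples, and the recurring letter "O" versus digit "0" confusion is the kind of pattern a loop like this could target in the next training round.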

During experiments, this approach reduced model preparation time to three days of continuous operation and achieved Russian text recognition accuracy above 90 percent. This level of accuracy meets widely accepted industry standards.
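The article does not say which metric lies behind the 90 percent figure. One common choice, shown here purely as an assumption, is character-level accuracy: one minus the character error rate, where the error rate is the edit distance between the recognized text and the reference divided by the reference length. The function names are illustrative.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(
                min(
                    previous[j] + 1,         # delete a reference character
                    current[j - 1] + 1,      # insert a hypothesis character
                    previous[j - 1] + cost,  # match or substitute
                )
            )
        previous = current
    return previous[-1]


def character_accuracy(reference, hypothesis):
    """Return 1 - CER, clamped at zero; an empty reference is handled separately."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    cer = edit_distance(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - cer)


# One wrong character out of ten gives an accuracy of about 0.9.
score = character_accuracy("abcdefghij", "abcdefghiX")
```

Under this definition, "above 90 percent" would mean fewer than one character error per ten reference characters on average.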

Under Non-Ideal Conditions

As noted by Kirill Pronin, a master’s student at the MISIS Institute of Computer Science, the use of generative models reduced training costs by nearly one-third and lowered the required size of test datasets. An additional advantage is the ability to simulate “non-ideal” conditions, such as poor print quality, complex layouts, and blurred images, which improves the robustness of neural network training.
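The researchers use generative models for this simulation, and the article gives no details. As a much simpler illustration of the same idea, the sketch below degrades a clean grayscale image with a blur and random speckle using only the Python standard library. The image representation (rows of 0 to 255 integers), the function names, and the default parameters are assumptions made for the example.

```python
import random


def box_blur(image):
    """Apply a 3x3 mean blur to a grayscale image given as rows of 0-255 ints."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(neighbours) // len(neighbours))
        blurred.append(row)
    return blurred


def add_speckle(image, fraction, rng):
    """Set roughly `fraction` of the pixels to pure black or white, like poor print."""
    noisy = [row[:] for row in image]  # copy so the input is left untouched
    height, width = len(image), len(image[0])
    for _ in range(int(fraction * height * width)):
        y, x = rng.randrange(height), rng.randrange(width)
        noisy[y][x] = rng.choice((0, 255))
    return noisy


def degrade(image, seed=0, speckle_fraction=0.05):
    """Blur, then speckle; a fixed seed makes the degradation reproducible."""
    rng = random.Random(seed)
    return add_speckle(box_blur(image), speckle_fraction, rng)


# A 10x10 white page with a dark horizontal stroke across the middle.
clean = [[255] * 10 for _ in range(10)]
clean[5] = [0] * 10
degraded = degrade(clean, seed=1)
```

Pairing each degraded image with the text of its clean original yields labeled training examples for hard conditions without any extra manual annotation.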

Associate Professor Alexander Suleikin of NUST MISIS emphasized:

“This approach brings OCR solutions closer to real operating conditions. The development opens the door to more affordable and accurate tools for document workflow automation.”

The research results were presented at the international ISKE conference in China and will form the basis for new industrial and scientific developments.
