Cognitive's layered sandwich
Natalia Lazareva | 05.06.2011
Cognitive Technologies has introduced ScanPack, a new technology for batch scanning, processing and compression of facsimile copies of contracts, invoices, tables, personnel paperwork and similar documents, that is, documents that combine graphics, stamps, signatures and text.
As Nikolai Nikolsky, Cognitive Technologies' vice president for marketing and public relations, stressed, the documents one has to deal with in Russia are often in very poor condition: illegible stamps and signatures, poor fonts, handwritten notes, highlighter marks and stains of every kind. The company's experts say that after ScanPack processing such a document can become legible again.
As is well known, a high-quality image of a digitized document is bulky and takes up a considerable amount of storage space. The new technology therefore compresses and packages the files automatically. The developers emphasize that ScanPack is more efficient than comparable solutions on the market, which they attribute to building something like an expert system into the technology. According to Vladimir Arlazarov, head of the technology laboratory, ScanPack can analyze the type and structure of a document and then automatically select the most appropriate processing mode. It was also stressed that inside the company this technology is called not "cognitive" but something much simpler and clearer: document understanding technology.
The idea of the new approach is to split an image that was captured without loss into several regions and to compress each region to a different degree. The result is a layered sandwich in which every picture nevertheless remains legible. Note that Cognitive ScanPack can run OCR on the text layers using the built-in CuneiForm (OpenOCR) engine, although the actual recognition engine may be a different one.
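The layered idea can be illustrated with a toy sketch. It is not ScanPack's algorithm, whose details the article does not give. It assumes a grayscale page held as rows of 0-255 values, a fixed ink threshold, and zlib as the codec. The names `split_layers`, `compress_layers` and `make_demo_page` are hypothetical. Dark "ink" pixels go into a full-resolution 1-bit mask that is stored losslessly, while the background is kept only at reduced resolution, so the two layers get different degrees of compression.

```python
import zlib


def pack_bits(bits):
    """Pack a flat list of 0/1 values into bytes, most significant bit first."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        byte <<= 8 - len(chunk)  # pad a short final chunk with zero bits
        out.append(byte)
    return bytes(out)


def split_layers(page, threshold=128, bg_step=4):
    """Split a grayscale page into an ink mask and a coarse background.

    threshold and bg_step are illustrative assumptions, not values from the
    article. The mask keeps full resolution (1 bit per pixel); the background
    keeps only every bg_step-th pixel in each direction (lossy).
    """
    mask_bits = [1 if px < threshold else 0 for row in page for px in row]
    background = bytes(px for row in page[::bg_step] for px in row[::bg_step])
    return pack_bits(mask_bits), background


def compress_layers(page):
    """Compress each layer separately; zlib stands in for a real codec."""
    mask, background = split_layers(page)
    return {
        "mask": zlib.compress(mask, 9),
        "background": zlib.compress(background, 9),
    }


def make_demo_page(width=64, height=64):
    """Synthetic page: light textured paper with one dark horizontal stroke."""
    page = [[200 + (x + y) % 16 for x in range(width)] for y in range(height)]
    for y in range(20, 24):
        for x in range(8, 56):
            page[y][x] = 20
    return page


if __name__ == "__main__":
    demo = make_demo_page()
    layers = compress_layers(demo)
    raw_size = len(demo) * len(demo[0])
    packed_size = sum(len(data) for data in layers.values())
    print(f"raw: {raw_size} bytes, layered: {packed_size} bytes")
```

A real system would segment the page far more carefully (stamps, signatures and photographs behave differently from printed text) and would use image codecs suited to each layer. The sketch only shows why treating layers separately saves space without blurring the ink.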
After processing and compression, all the layers (including the layer with the recognized text) are assembled into a PDF/A file; in other words, ScanPack is a technology for compressing electronic documents and then converting them to PDF/A.
This makes it possible to reduce the size of digitized documents by up to a factor of 10. For example, an insurance policy saved as a JPEG at a quality suitable for use takes up 2.42 MB, while the same policy compressed with ScanPack takes up only 408 KB.
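The figures in the insurance policy example can be checked with a little arithmetic. The sketch below uses a hypothetical helper, `compression_ratio`. Because the article does not say whether a megabyte means 1000 or 1024 kilobytes, it computes the ratio under both assumptions.

```python
def compression_ratio(original_kb, compressed_kb):
    """Return how many times smaller the compressed file is."""
    return original_kb / compressed_kb


JPEG_MB = 2.42      # JPEG size quoted in the article
SCANPACK_KB = 408   # ScanPack size quoted in the article

# The unit convention is an assumption, so both readings are computed.
ratio_decimal = compression_ratio(JPEG_MB * 1000, SCANPACK_KB)
ratio_binary = compression_ratio(JPEG_MB * 1024, SCANPACK_KB)

if __name__ == "__main__":
    print(f"1 MB = 1000 KB: {ratio_decimal:.2f}x smaller")
    print(f"1 MB = 1024 KB: {ratio_binary:.2f}x smaller")
```

Under either reading the policy shrinks roughly sixfold, which is consistent with the stated ceiling of a tenfold reduction.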
According to Nikolai Nikolsky, a global market for systems like ScanPack has not yet formed, but preliminary estimates put its potential volume in the Russian segment alone at about a billion dollars. “We are bringing this system to market because there is a great need for it,” he said. “We are releasing it as a technology, but in the future a product will appear as well. The technology has already been deployed by several insurance companies. There are inquiries from Europe and America.”
In the future, the developers promise to teach the system to work with color pages of magazines and newspapers, since these too need to be stored economically.
Source: IT News No. 8 (May 2011)