News in Version 5.0

General

The biggest but least visible change is the use of 64-bit variables in KADMOS and in all programs needed for data collection and classifier computation. The biggest visible changes are:

Multithreading in the module REP (rep_do()). The standard version of KADMOS now supports parallel processing with two threads. The new server version of KADMOS supports parallel processing with all available threads. This results in a measurable gain in speed. The server version costs four times as much as the standard version. Both variants can be run with the program Famulus from the developer kit, whose result window shows the corresponding recognition speed. The new parameters GENERAL_REPMULTITHREADING and GENERAL_MULTITHREADING activate or deactivate multithreading. This improvement requires the MSVC library wsock32.lib, which now has to be linked to all related applications.
If, after switching to KADMOS 5.0, you get an error message like the following:
"KADMOS is already running on your computer"
then you probably have not ordered enough KADMOS licenses. You should either order an additional license or, in the case of genuine server mode, switch to a KADMOS server license.

New algorithms for classifier computation and discrimination result in a clear improvement of recognition accuracy. rec_value, the confidence value of recognition, now has a clear meaning. Basic classifiers (for instance 'A' - 'B') recognize 99% of all related characters from our huge database with a rec_value below 32, 99.9% with a rec_value below 64, 99.99% with a rec_value below 96, and so on. For classifiers with 100 or 150 character classes the relationship holds analogously: 90% have a rec_value below 32, 99% a rec_value below 64, and 99.9% a rec_value below 96. If the reject level is raised or lowered by 32, the number of rejects falls or rises by a factor of 10.

This of course holds only for the single character recognition REC. With the line recognition REL the rec_values change due to evaluation of the position of the character in the line, evaluation of the character font in relation to the font of the other characters in the line, and other evaluations.

From this relationship between the reject level and the percentage of rejected characters, a simple rule follows: if the percentage of rejected characters at reject level 64 exceeds 5% to 10%, then a data collection and a retraining of the classifier in use is strongly recommended. The program Famulus in the developer kit now contains an additional menu item "Reject rates". After every recognition, the reject rates for the reject levels 32, 64, and 96 can be retrieved there.

In connection with this change, the default value of the parameter parm.reject_limit has been set to 128. Despite this reduction, KADMOS 5.0 generates more alternatives than KADMOS 4.4 did with the former value of 150. If comparable behavior of both KADMOS versions is required, reject_limit in KADMOS 5.0 should be set to a value of about 110.

The result window of Famulus.exe has been extended with the point reject rate.

KADMOS has been prepared for a switch to the internal use of Unicode, including our service and classifier computation programs. We want to complete this within the next year in order to recognize scripts such as Tamil, Korean, and others. The value of REC_CHAR_SIZE in kadmos.h has been extended from 8 to 16 bytes to prepare for the future use of 32-bit Unicode codes.

Two new functions, rel_find() and rel_findg(), have been added. rel_find() tries to find a readable text line in a given image. rel_findg() does the same but is specialized for gray images that are often difficult to binarize, such as serial numbers on banknotes.

For the functions re_(w)readparm(), re_(w)writeparm(), and GetPrivateFileName(), the default directory under Windows had to be changed for the case that no directory is specified. We were forced to do this because Windows Vista and Windows 7 no longer work correctly with the former default (the Windows directory). The new default is the working directory (WorkDir), as has already been the case under Linux. This default can be changed by setting the new environment variable "kadmos_inifiles" to a different default directory.

In kadmos.h, REC_CHAR_SIZE has been expanded from 8 to 16 bytes to prepare for the later use of 32-bit Unicode.

All machine print classifiers (jumbo??.rec, ttf??.rec) have been extended by the double quotation marks Unicode 0x201C and 0x201D, with the replacement (ersatz) representations '"[' and '"]' respectively. The Thai machine print classifier has been extended by the labels 'T2' and 'U2' (replacement representations). These are special forms of the characters Unicode 0x0E14 and 0x0E15.

Parameters, Structures and Functions

New Functions

rel_find() searches black-and-white images for a well-readable line of text with the given number of characters.

rel_findg() searches gray images for a readable line of text with the given number of characters.

re_layout() analyzes the structure and content of a given document. (Version 5.0o)

Changed Functions

For Windows, the default directory has been changed for the case that no path is specified with the given file name.

re_readparm(), re_wreadparm()

re_writeparm(), re_wwriteparm()

GetPrivateFileName()

New Structures

ReLayoutResult, ReLayoutData

Changed Structures

ReParm

RecData