ALC Files

For every classifier file, the KADMOS developer kit contains a related Alc file. This is a text file that lists all labels of the classifier. The labels are divided into groups that correspond to the ALC constants from kadmos.h (ALC_LCALPHA, ALC_UCALPHA, ALC_NUMERIC, ALC_SPECIAL, ALC_GESTURE). For example, the file ttfus.alc contains the following lines under [numeric]:

[numeric]
font=0123456789
size=0:1
font+=0w121|32 02
size+=0:1 0.3:1         
       

This means that under ALC_NUMERIC the classifier contains the labels '0 ', '1 ', '2 ', ... '9 ', '0w', '12', '1|', '32', and '02'. All these characters are expected to occupy the same vertical extent within their text line as an uppercase 'A' (size=0.0:1.0); only the character 02 is smaller. It is the character o, which is often used as zero. If the second byte of the labels is normally not a blank (' '), an additional line is added. For example, the file hand.alc contains the additional line ext=_ under [numeric]. Below the lines font and size (or font+ and size+), additional lines xminmax and yminmax (or xminmax+ and yminmax+) can sometimes be found. They specify the minimum and maximum extent of the characters in the x- and y-direction, measured in pixels. For example:

[numeric]
font=0123456789
size=0.0:1.0
xminmax=10:20
yminmax=15:30         
       

The characters listed under font must be at least 10 and at most 20 pixels wide (xminmax), and at least 15 and at most 30 pixels high (yminmax). If one of these conditions is not satisfied, the character is rejected. The Alc file also shows which labels are treated as characters of equal shape. These character classes are listed under [equivalence] in the lines moma=.... The Alc file of the classifier ttfus.rec contains, among others, the following moma equivalences:

, '   - _   . '2·   * *2*4  / '|,|1|I|\ l||   0 02O o °   12l2

For moma-equivalent character classes, alternatives are not excluded when the parameter OPTIONS_EXCLUDE is set under option in a call to re?_do(). Otherwise only a random (and probably wrong) result would be generated.
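The xminmax and yminmax limits described above amount to a simple range check. The following Python sketch illustrates the rule; the function names are hypothetical and not part of the KADMOS API, and treating both bounds as inclusive is an assumption based on the wording "at least" and "maximum".

```python
def parse_range(value):
    """Parse an Alc range such as '10:20' into a (low, high) tuple of floats."""
    low, high = value.split(":")
    return float(low), float(high)


def passes_extent_limits(width, height, xminmax, yminmax):
    """Return True if a character box lies within both pixel limits.

    Assumption: both bounds are inclusive.
    """
    xmin, xmax = parse_range(xminmax)
    ymin, ymax = parse_range(yminmax)
    return xmin <= width <= xmax and ymin <= height <= ymax


# Limits taken from the [numeric] example with xminmax=10:20 and yminmax=15:30.
print(passes_extent_limits(12, 20, "10:20", "15:30"))  # inside both ranges
print(passes_extent_limits(8, 20, "10:20", "15:30"))   # too narrow, would become a reject
```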

Sample Alc File

[comment] comment section of the Alc file

ttfus.alc
Generated by RecMaker from ttf.rec and ttfus.al0
Generation Time ##-Feb-200# 11:50
crc=0x7ab8

[general]
representation=CODE_ISO_8859_1
[lcalpha] lower case section

font=acemnorsuvwxz bdfhklt gpqy    i      j       ss
size=0.3:1         0:1     0.3:1.4 0.1:1  0.1:1.4 0:1.1
font+=a2    g2    i|    j|      l2l|
size+=0.3:1 0.1:1 0.1:1 0.1:1.4 0:1

font=  characters of the section with 1 byte label.
size=  size (height) of the characters top:bottom, related to top and bottom of an uppercase A as 0:1.
font+= characters of the section with 2 byte label.
size+= see size=

[ucalpha] upper case section
font=ABCDEFGHIJKLMNOPRSTUVWXYZ Q
size=0:1                       0:1.1
font+=I|
size+=0:1

font=  characters of the section with 1 byte label.
size=  size (height) of the characters top:bottom, related to top and bottom of an uppercase A as 0:1.
font+= characters of the section with 2 byte label.
size+= see size=  

[numeric] numbers section
font=0123456789
size=0:1
font+=0w121|32 02
size+=0:1      0.3:1

font=  characters of the section with 1 byte label.
size=  size (height) of the characters top:bottom, related to top and bottom of an uppercase A as 0:1. 
font+= characters of the section with 2 byte label.
size+= see size=
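How the font/size and font+/size+ lines pair up can be sketched in Python as follows. This is a minimal illustration, not the KADMOS parser: the function names are hypothetical, and it assumes that groups in a font line are separated by blanks, that labels in these lines contain no blank themselves, and that the n-th size value belongs to the n-th group.

```python
def split_labels(text, label_bytes):
    """Split a font= (label_bytes=1) or font+= (label_bytes=2) value into groups.

    Every label is returned as a 2-character string; 1 byte labels are
    padded with a blank, matching labels such as '0 ' in the classifier.
    """
    groups = []
    for chunk in text.split():
        labels = [chunk[i:i + label_bytes].ljust(2)
                  for i in range(0, len(chunk), label_bytes)]
        groups.append(labels)
    return groups


def label_sizes(font, size, label_bytes):
    """Map each label to its (top, bottom) size, pairing groups and sizes in order."""
    sizes = [tuple(float(v) for v in s.split(":")) for s in size.split()]
    result = {}
    for group, top_bottom in zip(split_labels(font, label_bytes), sizes):
        for label in group:
            result[label] = top_bottom
    return result


# Values from the [numeric] section of ttfus.alc.
table = label_sizes("0123456789", "0:1", 1)
table.update(label_sizes("0w121|32 02", "0:1 0.3:1", 2))
print(table["7 "], table["1|"], table["02"])
```

With these inputs the label '02' (the small o used as zero) is the only one that gets the size 0.3:1.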

[special] special characters section
font=!#%&()/?[\]{|}£¥ "'    $§       *+      ,       -·      .     :    ...
size=0:1              0:0.2 -0.1:1.1 0.3:0.7 0.8:1.2 0.5:0.6 0.7:1 0.4:1
font+=*4e$ '2'|  *2    ,|
size+=0:1  0:0.2 0:0.4 0.8:1.2

font=  characters of the section with 1 byte label.
size=  size (height) of the characters top:bottom, related to top and bottom of an uppercase A as 0:1. 
font+= characters of the section with 2 byte label.
size+= see size=

[reject] reject section
labels=#X#x0<0[0]1<1[1]2<2[2]3<3[3]4<4[4]5<5[5]6<6[6]7<7[7]8<8[8]9<9[9]

labels= character forms that shall be rejected  

[slant] section of characters with a slant, with 2 byte label
base=64
label=' , F J P d f j p r y   /  L   Q b h k q  \
slant=10                      20 -10 -5         -20
korr='|,|/ 1|I|\ l||

base= assumed character height in pixels.
label= characters of the section with 2 byte label.
slant= slant of the characters listed under label. Negative numbers mean a slant to the left, positive numbers a slant to the right.
korr= characters for which the slant is used for discrimination.

[width] width description of the characters
base=16 supposed character height 16 pixels
blank=4 width of a standard blank
prop=11 normal character width with proportional spacing

prop3=! " ' '2'|, ,|1|. ; I|i|l|| .
prop4=( ) - [ ] j|{ }
prop5=* *2+ . = ° 
prop6=/ j l
prop7=1 12< > \ _ i ¢ 
prop8=? @ I J f l2s ~ § 
prop9=# 020w3 5 7 a2c e o r t u z £ ä ö 
prop10=$ *40 2 324 6 8 9 E F L a b d h k n v x 
prop12=M U V X m y ¥ Ä Ö Ü 
prop13=w
prop14=W

equi=9   normal character width with equidistant spacing
equi3=! " ' '2'|( ) , ,|- 1|: ; I|[ ] i|j|l|{ | } · 
equi6=* *2+ . / 1 12< = > \ _ i j l ¢ ° 
equi12=% & A B C D G H K M N O P Q R S T U V W X Y Z e$g g2m p q w y 
       ¥ Ä Ö Ü ss

prop?=width of the characters in pixels, related to the line height given in base, in the case of proportional spacing.
equi?=width of the characters in pixels, related to the line height given in base in the case of equidistant spacing.
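Because the widths are given relative to the height in base, they presumably have to be scaled to the actual line height. The Python sketch below assumes simple linear scaling, which the text does not state explicitly; the function name is hypothetical.

```python
def expected_width(width_at_base, base, line_height):
    """Scale a width that is valid for `base` pixels of height to another line height.

    Assumption: widths scale linearly with the line height.
    """
    return width_at_base * line_height / base


# prop9 lists labels such as 'a2', 'c ' and 'e ' with width 9 at base=16.
print(expected_width(9, 16, 32))  # 18.0
# blank=4 at base=16, scaled to a line height of 24 pixels.
print(expected_width(4, 16, 24))  # 6.0
```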

[equivalence]        section of equally shaped and equally named characters
moma=, '   - _   . '2·   * *2*4  / '|,|1|I|\ l||   0 02O o °   12l2
moma=C c   i|j|  Ö ö   P p   S s   U u   Ü ü   V v   W w   X x   Y y
moma=Z z

rename=* * *2*4  , , ,|  0 0 0w02  1 1 121|  3 3 32  I I I|  a a a2  g g g2
rename=i i i|  j j j|  l l l2l|  ' ' '2'|

moma=   Groups of characters with equal shape. Groups are separated by blanks, but note that the labels 
        themselves can have a blank as second byte.
rename= The first 2 byte label in every block is the group label; all other labels in the block are the 
        labels of the individual characters (basic labels). For example, the label 9_ is the label 
        of the group of handwritten nines. The basic label 9_ denotes a character with a bend, and 91 denotes a 
        character with a stroke. An application can work with either group labels or basic labels.
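Since labels may carry a blank as second byte, a moma= or rename= line cannot simply be split on whitespace. The Python sketch below shows one way to read such a line. It is an illustration under the assumption that no label starts with a blank, so that a blank at a label boundary always separates groups; the function names are hypothetical.

```python
def parse_groups(value):
    """Split a moma= or rename= value into groups of 2-character labels.

    Assumption: labels never start with a blank. A blank found at a label
    boundary separates groups; a blank as second byte belongs to the label.
    """
    groups, current, i = [], [], 0
    while i < len(value):
        if value[i] == " ":
            if current:          # close the group that just ended
                groups.append(current)
                current = []
            i += 1
            continue
        # Pad in case trailing whitespace was stripped from the last label.
        current.append(value[i:i + 2].ljust(2))
        i += 2
    if current:
        groups.append(current)
    return groups


def rename_map(value):
    """Map every basic label to the group label that leads its block."""
    mapping = {}
    for block in parse_groups(value):
        group_label, basic_labels = block[0], block[1:]
        for label in basic_labels:
            mapping[label] = group_label
    return mapping


# Excerpts from the moma= and rename= lines of the sample file.
moma = parse_groups(r"/ '|,|1|I|\ l||   0 02O o °   12l2")
renames = rename_map("0 0 0w02  1 1 121|")
print(moma)
print(renames["0w"], renames["1|"])
```

The rename map makes it easy to switch between basic labels and group labels, for example to report '0w' and '02' both as '0 '.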

[words] Section of additional characters in words of numbers, lowercase alpha, or uppercase alpha. 
        This is necessary, for instance, to get the result 02 rather than OZ in a string such as "02.04.99". 

lcalpha+=, . / : ; ' '|,|\ | 
ucalpha+=, . / : ; ' '|,|\ | 
numeric+=, . / : ; ' '|,|\ | 


[fontgroup]
machine =[ 24aw|]
latin =[ 24aw|]
ocra =[a]

 
other_norm=! " # $ % & ' '2'|( ) * *2*4+ , ,|- . .2/ 0 020w1 ...

[fontgroup] This section provides information about handprint, machine print, OCRA, or other fonts. It is used 
            to increase the rec_value for handprint characters among machine print, and in similar cases. The blank between 
            the square brackets means that all 2 byte labels with a blank in the second place also describe machine 
            print characters. 

[segmentation]
tryless="#%49HKLMNPTUVWYbdhkmnpquw„”
trymore=!'()*,./1:;<>CIJV[\]cijlnrv{|}

Here characters are listed that may result from wrong segmentation. For example, a letter m might also be 
  recognized as the letters r and n, and vice versa. In these cases the engine therefore always tries an alternative 
  segmentation.