| Normal symbol | SYMBOL_ETC | !:”’‘?^_ ̄&#@´ `‘“ |
| Number symbol | SYMBOL_NUMBER | =+*<>/¥$%±×÷≠≦≧ |
| Middle dot | SYMBOL_NAKAGURO | ・ |
| Minus sign | SYMBOL_MINUS | - |
| Comma | SYMBOL_COMMA | , |
| Period | SYMBOL_PERIOD | . |
| Round brackets | SYMBOL_PAREN | () |
| Brackets | SYMBOL_BRACE | {}[]--〔〕「」『』【】 |
| Punctuation marks | SYMBOL_KUTOTEN | 、。 |
| Circle | SYMBOL_MARU | ○ |
| Prolonged sound mark | SYMBOL_NOBASBO | ー |
| Arabic numerals | CHAR_NUMBER | 0-9 |
| Uppercase letters | CHAR_ALPHABET_CAPITAL | A-Z |
| Lowercase letters | CHAR_ALPHABET_SMALL | a-z |
| Full-size katakana | CHAR_KATAKANA_CAPITAL | ア-ン |
| Small katakana | CHAR_KATAKANA_SMALL | ァ-ヶ |
| Full-size hiragana | CHAR_HIRAGANA_CAPITAL | あ-ん |
| Small hiragana | CHAR_HIRAGANA_SMALL | ぁ-ょ |
| Kanji numerals | CHAR_KANJI_NUMBER | 一二三四五六七八九十百千万億兆 |
| Kanji characters | CHAR_KANJI | |
| ASCII symbols | ASCII_CHARSET | |
| File name | FILE_CHARSET | |
| Macro Constants | Description | Definition |
|---|---|---|
| CHAR_SET_SYMBOL | Symbols | (SYMBOL_ETC \| SYMBOL_NUMBER \| SYMBOL_NAKAGURO \| SYMBOL_MINUS \| SYMBOL_MARU \| SYMBOL_NOBASBO) |
| CHAR_SET_KAKKO | Brackets | (SYMBOL_PAREN \| SYMBOL_BRACE) |
| CHAR_SET_TERMINAL | Punctuation marks | (SYMBOL_PERIOD \| SYMBOL_COMMA \| SYMBOL_KUTOTEN) |
| CHAR_SET_ALPHABET | All alphabets | (CHAR_ALPHABET_SMALL \| CHAR_ALPHABET_CAPITAL) |
| CHAR_SET_HIRAGANA | Hiragana | (CHAR_HIRAGANA_CAPITAL \| CHAR_HIRAGANA_SMALL) |
| CHAR_SET_KATAKANA | Katakana | (CHAR_KATAKANA_CAPITAL \| CHAR_KATAKANA_SMALL) |
| CHAR_SET_KANJI | All kanji characters | (CHAR_KANJI \| CHAR_KANJI_NUMBER) |
| CHAR_SET_ALL | All character types | (CHAR_SET_SYMBOL \| CHAR_SET_KAKKO \| CHAR_SET_TERMINAL \| CHAR_NUMBER \| CHAR_SET_ALPHABET \| CHAR_SET_HIRAGANA \| CHAR_SET_KATAKANA \| CHAR_SET_KANJI) |
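The composite constants in the table are bitwise ORs of the individual character-type constants, so a character-type mask can be built and tested with ordinary bit operations. The sketch below illustrates the idea with stand-in values only: the real numeric values come from the library headers and are not given in this manual, so the `1u << n` assignments, the `sketch` namespace, the `includesType` helper, and `kAlnumMask` are all assumptions made for illustration.

```cpp
#include <cstdint>

namespace sketch {
// Hypothetical bit assignments; the actual values are defined by the library headers.
constexpr std::uint32_t CHAR_NUMBER           = 1u << 0;
constexpr std::uint32_t CHAR_ALPHABET_CAPITAL = 1u << 1;
constexpr std::uint32_t CHAR_ALPHABET_SMALL   = 1u << 2;
constexpr std::uint32_t CHAR_HIRAGANA_CAPITAL = 1u << 3;
constexpr std::uint32_t CHAR_HIRAGANA_SMALL   = 1u << 4;

// Composite sets, combined the same way as in the table.
constexpr std::uint32_t CHAR_SET_ALPHABET = CHAR_ALPHABET_SMALL | CHAR_ALPHABET_CAPITAL;
constexpr std::uint32_t CHAR_SET_HIRAGANA = CHAR_HIRAGANA_CAPITAL | CHAR_HIRAGANA_SMALL;

// True if every bit of `type` is present in `mask`.
constexpr bool includesType(std::uint32_t mask, std::uint32_t type) {
    return (mask & type) == type;
}

// Example mask (hypothetical name): digits and letters only.
constexpr std::uint32_t kAlnumMask = CHAR_NUMBER | CHAR_SET_ALPHABET;
}  // namespace sketch
```

A mask built this way has the same shape as the `CHAR_SET_ALL` argument passed to `mrecognize` in the examples later in this manual.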
// Pattern bounding rectangle
typedef struct {
short x1; // Top left x coordinate of the pattern
short y1; // Top left y coordinate of the pattern
short x2; // Bottom right x coordinate of the pattern
short y2; // Bottom right y coordinate of the pattern
} OCRRect;
// Recognition candidate
typedef struct {
char* code; // Pointer to the pattern string
unsigned char score; // Confidence level
unsigned char filler[3]; // padding
} Candidate;
// OCR result structure
// Fixed size: 128 bytes
typedef struct {
Candidate cand[MAX_CAND]; // Candidate data (MAX_CAND == 10), 80 bytes
OCRRect area; // Recognized area, 8 bytes
long fieldtype;
unsigned long chartype; // Character type of the recognition result
unsigned long maskchartype1;
unsigned long maskchartype2;
unsigned long space; // Number of spaces following the characters in a line recognition
unsigned char cgravx;
unsigned char cgravy;
unsigned char morph;
unsigned char size;
char newcand[KEYSIZE_MAX];
} OCRResult;
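The byte counts in the comments can be checked by adding up the fields. The sketch below mirrors the layout with fixed-width types and assumes a 32-bit build in which pointers and `long` are 4 bytes each. Under that assumption the fields before `newcand` occupy 112 bytes, so the stated 128-byte total implies `KEYSIZE_MAX == 16`. That value is an inference, not something this manual states, and the `layout_sketch` namespace is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

namespace layout_sketch {
constexpr std::size_t MAX_CAND = 10;     // stated in the struct comment
constexpr std::size_t KEYSIZE_MAX = 16;  // assumption inferred from the 128-byte total

struct Candidate {
    std::uint32_t code;      // stands in for a 4-byte char* on a 32-bit build
    std::uint8_t score;      // confidence level
    std::uint8_t filler[3];  // padding
};

struct OCRRect {
    std::int16_t x1, y1, x2, y2;
};

struct OCRResult {
    Candidate cand[MAX_CAND];      // 10 * 8 = 80 bytes
    OCRRect area;                  // 8 bytes
    std::int32_t fieldtype;        // 4-byte long assumed
    std::uint32_t chartype;
    std::uint32_t maskchartype1;
    std::uint32_t maskchartype2;
    std::uint32_t space;
    std::uint8_t cgravx, cgravy, morph, size;  // 4 bytes
    char newcand[KEYSIZE_MAX];     // starts at offset 112
};

static_assert(sizeof(Candidate) == 8, "one candidate is 8 bytes");
static_assert(sizeof(OCRRect) == 8, "the rectangle is 8 bytes");
static_assert(sizeof(OCRResult) == 128, "matches the documented fixed size");
}  // namespace layout_sketch
```

On a 64-bit build the real structure would be larger, because each `char*` in `Candidate` grows to 8 bytes.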
| Dictionary Name | Record File Name | Key File Name | Content |
|---|---|---|---|
| Basic Dictionaries | | | |
| system | system.dbs | system.key | Required |
| systemfat | systemfat.dbs | systemfat.key | Required |
| Optional Dictionaries (Differential Dictionaries) | | | |
| diff0 | diff0.dbf | diff0.kef | Kaisho Font |
| diff1 | diff1.dbf | diff1.kef | Blurry Characters |
| diff2 | diff2.dbf | diff2.kef | Squished Characters |
| diff3 | diff3.dbf | diff3.kef | Numbers |
| diff4 | diff4.dbf | diff4.kef | Alphabets |
| diff5 | diff5.dbf | diff5.kef | Hiragana |
| User Pattern Dictionary | | | |
| Any name | | | |
| kana | kana.dbf | kana.kef | Addition of Hiragana, Katakana |
| optblur | optblur.dbf | optblur.kef | Severe Blurry Characters 1 |
| blur | blur.dbf | blur.kef | Severe Blurry Characters 2 |
| optblot | optblot.dbf | optblot.kef | Severe Squished Characters 1 |
| blot | blot.dbf | blot.kef | Severe Squished Characters 2 |
| ninja001 | ninja001.dbf | ninja001.kef | Addition of Alphanumeric Characters |
About User Dictionary

The user dictionary has the same format as an optional dictionary, except that it also stores the image of each pattern for reference. Only one user dictionary can be used per dictionary class instance. Within the lifetime of one dictionary class instance, at most 1024 patterns can normally be registered; to register more, delete the instance, then create and initialize it again. Because a user dictionary is held exclusively for as long as its instance exists, a user dictionary can be used by only one instance on one machine at a time. To share a user dictionary among multiple recognition class instances, have them share a single dictionary class instance.

One dictionary can hold at most 8192 records. A typical dictionary contains records for several hundred to several thousand characters, and recognition takes longer as the dictionary grows. As of June 1999, the basic and optional dictionaries together contain about 10,000 records.
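The 1024-pattern limit per instance means that a long registration run has to delete and recreate the dictionary instance along the way. One possible way to organize this is sketched below. `RecyclingRegistrar`, `FakeDict`, and the factory callback are hypothetical names and are not part of the library; `FakeDict` merely stands in for a dictionary class instance so that the sketch is self-contained. The limit itself is the figure quoted in the text.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>

// Per-instance registration limit quoted in the text.
constexpr std::size_t kMaxPatternsPerInstance = 1024;

// Stand-in for a dictionary class instance; it only counts registrations.
struct FakeDict {
    std::size_t registered = 0;
    int put() { ++registered; return 0; }  // a non-negative return means success here
};

template <class Dict>
class RecyclingRegistrar {
public:
    using Factory = std::function<std::unique_ptr<Dict>()>;

    // `factory` must create and fully initialize a new dictionary instance.
    explicit RecyclingRegistrar(Factory factory)
        : factory_(std::move(factory)), dict_(factory_()) {}

    // Runs `registerOne(dict)`, recreating the instance first if the limit was reached.
    template <class Fn>
    int registerPattern(Fn registerOne) {
        if (count_ == kMaxPatternsPerInstance) {
            dict_.reset();       // delete the old instance first, since it holds the dictionary exclusively
            dict_ = factory_();  // then create and initialize a new one
            count_ = 0;
            ++recycles_;
        }
        const int rc = registerOne(*dict_);
        if (rc >= 0) ++count_;   // only successful registrations count toward the limit
        return rc;
    }

    std::size_t recycles() const { return recycles_; }
    std::size_t countInCurrentInstance() const { return count_; }

private:
    Factory factory_;
    std::unique_ptr<Dict> dict_;
    std::size_t count_ = 0;
    std::size_t recycles_ = 0;
};
```

The old instance is released before the new one is created because the text says the user dictionary is held exclusively while an instance exists.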
| FILE_OPEN_ERROR | File not found; the file is locked (the dictionary is currently in use); an instance using the same user dictionary already exists; or another instance is loading the same system dictionary. |
| FILE_READ_ERROR | Read error (the dictionary may be corrupted). |
| FILE_SEEK_ERROR | Seek error (the dictionary may be corrupted). |
| MEMORY_SHORTAGE | Memory shortage. |
| FATAL_ERROR | Fatal error (no dictionary records could be loaded). |
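When reporting a failed dictionary load, it can help to translate the return code into the causes listed in the table. The sketch below shows one way to do that. The numeric values are hypothetical: the real ones are defined in `errcode.h`, and the examples in this manual establish only that error returns are negative. The `errsketch` namespace and the `describe` and `stopsProcessing` helpers are likewise assumptions.

```cpp
#include <string>

namespace errsketch {
// Hypothetical values; the real constants are defined in errcode.h.
enum ErrorCode {
    FILE_OPEN_ERROR = -1,
    FILE_READ_ERROR = -2,
    FILE_SEEK_ERROR = -3,
    MEMORY_SHORTAGE = -4,
    FATAL_ERROR     = -5
};

// Maps a return code to the cause given in the table.
inline std::string describe(int code) {
    switch (code) {
    case FILE_OPEN_ERROR: return "file not found, locked, or dictionary already in use";
    case FILE_READ_ERROR: return "read error (the dictionary may be corrupted)";
    case FILE_SEEK_ERROR: return "seek error (the dictionary may be corrupted)";
    case MEMORY_SHORTAGE: return "memory shortage";
    case FATAL_ERROR:     return "fatal error (no dictionary records could be loaded)";
    default:              return code < 0 ? "unknown error" : "success";
    }
}

// The load example marks these two codes as "no further processing".
inline bool stopsProcessing(int code) {
    return code == MEMORY_SHORTAGE || code == FATAL_ERROR;
}
}  // namespace errsketch
```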
#include "ocrdef.h"
#include "ocrco.h"
#include "cjocrstock.h"
#include "cjocrdict98.h"
#include "errcode.h"
.....
// 1...Create an instance of the dictionary class
CJocrDict* pjocrdict = new CJocrDict;
// 2...Set up the basic dictionary
pjocrdict->msetsystemdict("\\dic\\feature\\system");
pjocrdict->msetsystemdict("\\dic\\feature\\systemfat");
// 3...Set up the optional dictionary
pjocrdict->msetdiffdict("\\dic\\feature\\diff1");
pjocrdict->msetdiffdict("\\dic\\feature\\diff2");
pjocrdict->msetdiffdict("\\dic\\feature\\diff3");
pjocrdict->msetdiffdict("\\dic\\feature\\diff4");
pjocrdict->msetdiffdict("\\dic\\feature\\diff5");
// 4...Set up the user dictionary (optional)
pjocrdict->msetuserdict("\\dic\\feature\\userpat");
// 5...Load the dictionaries
int i1 = pjocrdict->mloaddict();
if(i1 < 0) {
    // Error (codes are defined in errcode.h)
    // FILE_OPEN_ERROR The dictionary could not be found
    // FILE_READ_ERROR The dictionary could not be read
    // FILE_SEEK_ERROR Unable to seek to the dictionary record
    // MEMORY_SHORTAGE Out of memory (no further processing)
    // FATAL_ERROR Unrecoverable error (no further processing)
}
// ......Recognition, registration, deletion, reference
// Delete the instance
delete pjocrdict;
// Register a pattern under the key "漢"
int i1 = pjocrdict->mput("漢",pattern);
if(i1 < 0) {
    // Error:
    // FILE_SEEK_ERROR
    // FILE_WRITE_ERROR
}
// Look up the key "漢"
int i1 = pjocrdict->mseek("漢");
if(i1 < 0) {
    // Error
}
if(i1 == 1) {
    // Record found: delete it
    i1 = pjocrdict->mdel();
    if(i1 < 0) {
        // Error
    }
}
int i1 = pjocrdict->mseeknext();
// keysize is an in/out variable:
// on input, the maximum read buffer size;
// on output, the size actually read
unsigned long keysize = KEYSIZE_MAX;
char keybuffer[KEYSIZE_MAX];
if(i1 == 1) {
    i1 = pjocrdict->mgetkey(keysize,keybuffer);
    if(i1 < 0) {
        // Error
    }
    else {
        // REG_FONT_SIZE is the size of the normalized font in bytes.
        // During user dictionary registration, the pattern is resized to the normalized font and saved.
        // In ALOCR Ver.1.0, the normalized font is 48x48 pixels, which is 288 bytes
        // REG_FONT_WIDTH....48
        // REG_FONT_HEIGHT....48
        // REG_FONT_SIZE....48*48/8 (8 pixels = 1 byte)
        // recordsize is an in/out variable:
        // on input, the maximum read buffer size;
        // on output, the size actually read
        unsigned long recordsize = REG_FONT_SIZE;
        char record[REG_FONT_SIZE];
        i1 = pjocrdict->mgetpattern(recordsize,record);
        if(i1 < 0) {
            // Error
        }
        // The obtained pattern is a bitmap of REG_FONT_WIDTH × REG_FONT_HEIGHT pixels.
    }
}
else if(i1 == 0) {
    // No more records
}
else {
    // Error
}
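The pattern returned by `mgetpattern` packs 8 pixels into each byte, so a 48x48 record is 6 bytes per row and 288 bytes in total. The sketch below shows how individual pixels could be read from such a record. The manual does not state the storage order, so row-major rows with the leftmost pixel in the most significant bit are assumptions, as are the `fontsketch` namespace and the `pixelAt` and `countSetPixels` helpers.

```cpp
namespace fontsketch {
constexpr int REG_FONT_WIDTH  = 48;
constexpr int REG_FONT_HEIGHT = 48;
constexpr int REG_FONT_SIZE   = REG_FONT_WIDTH * REG_FONT_HEIGHT / 8;  // 288 bytes
constexpr int kBytesPerRow    = REG_FONT_WIDTH / 8;                    // 6 bytes

// Returns true if pixel (x, y) is set.
// Assumes row-major storage, leftmost pixel in the most significant bit.
inline bool pixelAt(const unsigned char* record, int x, int y) {
    const unsigned char byte = record[y * kBytesPerRow + x / 8];
    return ((byte >> (7 - x % 8)) & 1u) != 0;
}

// Counts the set pixels in one normalized pattern.
inline int countSetPixels(const unsigned char* record) {
    int count = 0;
    for (int y = 0; y < REG_FONT_HEIGHT; ++y) {
        for (int x = 0; x < REG_FONT_WIDTH; ++x) {
            if (pixelAt(record, x, y)) ++count;
        }
    }
    return count;
}
}  // namespace fontsketch
```

If the library stores bits in the opposite order, only the shift expression in `pixelAt` would need to change.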
| Class Name | CJocrDict |
| Header File | ocrdef.h ocrco.h cjocrstock.h cjocrdict98.h errcode.h |
| Class Name | CJocrPattern |
| Header File | ocrdef.h ocrco.h cjocrpat98.h errcode.h |
// 1...Create an instance of the pattern class
CJocrPattern* pattern = new CJocrPattern;
// 2...Allocate memory
int i1 = pattern->mallocmemory();
if(i1 < 0) {
    // Display an error message
    delete pattern;
}
.....
delete pattern;
typedef struct {
unsigned char* top; // Address of the start of the image data
short width; // Width of the image data (in pixels, byte-aligned, i.e. a multiple of 8)
short height; // Height of the image data (in pixels)
} OCRBuffer;
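Because `width` must be byte-aligned, an image whose pixel width is not a multiple of 8 needs its rows padded before the buffer is handed over. The sketch below rounds the width up and sizes the storage accordingly. It assumes a 1-bit-per-pixel image (8 pixels per byte), which the manual does not state explicitly for `OCRBuffer`; the `bufsketch` namespace and the `alignedWidth`, `bufferBytes`, and `makeBuffer` helpers are hypothetical.

```cpp
#include <cstddef>
#include <vector>

namespace bufsketch {
// Mirrors the OCRBuffer typedef so the sketch is self-contained.
struct OCRBuffer {
    unsigned char* top;
    short width;
    short height;
};

// Rounds a pixel width up to the next multiple of 8.
inline short alignedWidth(int widthPixels) {
    return static_cast<short>((widthPixels + 7) / 8 * 8);
}

// Bytes needed for the image, assuming 1 bit per pixel.
inline std::size_t bufferBytes(short alignedW, short height) {
    return static_cast<std::size_t>(alignedW) / 8 * static_cast<std::size_t>(height);
}

// Sizes `storage` and returns an OCRBuffer pointing into it.
// `storage` must outlive the returned buffer.
inline OCRBuffer makeBuffer(std::vector<unsigned char>& storage,
                            int widthPixels, int heightPixels) {
    OCRBuffer buffer;
    buffer.width = alignedWidth(widthPixels);
    buffer.height = static_cast<short>(heightPixels);
    storage.assign(bufferBytes(buffer.width, buffer.height), 0);
    buffer.top = storage.data();
    return buffer;
}
}  // namespace bufsketch
```

For a 100-pixel-wide image, for example, the rows are padded to 104 pixels, or 13 bytes each.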
typedef struct {
short x1; // Top-left x-coordinate of the pattern
short y1; // Top-left y-coordinate of the pattern
short x2; // Bottom-right x-coordinate of the pattern
short y2; // Bottom-right y-coordinate of the pattern
} OCRRect;
| Class Name | CJocrRecognize |
| Header File | ocrdef.h ocrco.h cjocrrec98.h errcode.h |
// 1...Create an instance of the recognition class
// Construct an instance by specifying a 20-digit code supplied by the library license distributor.
// Updated on November 6, 2000
// Alternatively, specify the path to a license code file supplied by the library license distributor.
CJocrRecognize* precognize = new CJocrRecognize("ABCDEFGHJKLMNPQ23456");
// When using a license code file instead:
// CJocrRecognize* precognize = new CJocrRecognize("C:\\Program Files\\Foo\\jocr.kcd");
// 2...Pattern setting
// Let pattern be an instance of the CJocrPattern class
// .....Generate and initialize the pattern
precognize->msetpattern(pattern);
// 3...Dictionary setting
// Let pjocrdict be an instance of the CJocrDict class
// .....Generate and initialize pjocrdict
precognize->msetdict(pjocrdict);
// 4...Memory allocation
int i1 = precognize->mallocmemory();
if(i1 < 0) {
    // MEMORY_SHORTAGE....Insufficient memory (defined in errcode.h)
    // Display an error message
    delete precognize;
}
// ....
// Repeat the single-character recognition process
// ....
delete precognize;
// Recognize a single character
precognize->mrecognize(CHAR_SET_ALL);
// Get the recognition result
// For OCRResult, refer to section 2-2 of the reference manual.
OCRResult aresult;
precognize->mgetresult(&aresult);
| Class Name | CJocrLine |
| Header File | ocrdef.h ocrco.h cjocrline98.h errcode.h |
// 1...Create an instance of the line class
CJocrLine* pjocrline = new CJocrLine;
// 2...Set an instance of the pattern class to the line class (pattern already constructed elsewhere)
// Execution of mallocmemory is required
pjocrline->msetpattern(pattern);
// 3...Set an instance of the recognition class to the line class (precognize already constructed elsewhere)
// Execution of msetpattern, msetdict, and mallocmemory is required
pjocrline->msetrecognize(precognize);
// 4...Initialize the document
// Call this whenever the image buffer for recognition changes
typedef struct {
unsigned char* top; // Starting address of the image data
short width; // Width of the image data (in pixels, byte-aligned, i.e. a multiple of 8)
short height; // Height of the image data (pixel units)
} OCRBuffer;
OCRBuffer aocrbuffer;
aocrbuffer.top = ...; // Buffer address
aocrbuffer.width = ...; // Buffer width (pixel units, multiple of 8)
aocrbuffer.height = ...; // Buffer height (pixel units)
int i1 = pjocrline->msetdocument(&aocrbuffer);
if(i1 < 0) {
    // MEMORY_SHORTAGE....Insufficient memory
    // Display an error message
    delete pjocrline;
}
pjocrline->msetdpi(400); // Set the resolution to 400 dpi
// Repeat line recognition. Call msetdocument whenever the image buffer for recognition changes.
delete pjocrline;
// Line settings
OCRRect aocrrect;
aocrrect.x1 = ...; // Top-left X coordinate of the bounding rectangle of the line
aocrrect.y1 = ...; // Top-left Y coordinate of the bounding rectangle of the line
aocrrect.x2 = ...; // Bottom-right X coordinate of the bounding rectangle of the line
aocrrect.y2 = ...; // Bottom-right Y coordinate of the bounding rectangle of the line
// Call one of the following, depending on the writing direction
pjocrline->msetlineuser(&aocrrect,HORIZONTAL_LINE); // Horizontal writing
// pjocrline->msetlineuser(&aocrrect,VERTICAL_LINE); // Vertical writing
// Line recognition
int i1 = pjocrline->mrecognize(CHAR_SET_ALL);
if(i1 < 0) {
    // Display an error message
}
else {
    int resultnum = ...; // Number of elements in pocrresult (must be set before use)
    OCRResult* pocrresult = new OCRResult[resultnum];
    // Get results
    i1 = pjocrline->mgetresult(resultnum,pocrresult);
    if(i1 < 0) {
        // Display an error message
    }
    delete[] pocrresult;
}
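Once the results of a line have been retrieved, the recognized text can be assembled from them. The sketch below uses the `space` field, documented as the number of spaces following a character in line recognition, to restore word spacing. It works on reduced stand-in structures rather than the real ones. It also assumes that `cand[0]` holds the preferred candidate and that `code` is a NUL-terminated string; the manual confirms neither. The `linesketch` namespace and the `lineText` helper are hypothetical.

```cpp
#include <string>
#include <vector>

namespace linesketch {
// Reduced stand-ins holding only the fields used in this sketch.
struct Candidate {
    const char* code;     // pattern string
    unsigned char score;  // confidence level
};

struct OCRResult {
    Candidate cand[10];   // MAX_CAND == 10
    unsigned long space;  // spaces following this character
};

// Concatenates cand[0] of each result, followed by `space` blanks.
// Assumes cand[0] is the preferred candidate and code is NUL-terminated.
inline std::string lineText(const std::vector<OCRResult>& results) {
    std::string text;
    for (const OCRResult& result : results) {
        if (result.cand[0].code != nullptr) {
            text += result.cand[0].code;
        }
        text.append(result.space, ' ');
    }
    return text;
}
}  // namespace linesketch
```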
| Class Name | CJocrLang |
| Header Files | ocrdef.h ocrco.h cjocrline98.h cjocrlang.h errcode.h |
| Class Name | CJocrBlock |
| Header File | ocrdef.h ocrco.h cjocrblock.h errcode.h |