Improving the accuracy of caption recognition is very difficult, and it is even harder for variety shows than for news programs.
Many researchers have studied and developed algorithms that extract only the character regions of captions, but practical use is still a long way off.
Art Logic has developed a binarization algorithm for color images that contain captions. It combines 14 different algorithms, each of which picks up only part of the caption characters, on the principle that "even a poor marksman will hit the target with enough shots."
The drawback is speed: on a PC with about 2 cores and 4 hardware threads (HT), processing one image takes about 4 seconds. Reaching 1 frame per second requires a CPU with 8 cores and 16 hardware threads. A 64-bit OS is recommended.
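The two figures above are consistent with roughly linear scaling over hardware threads. The following is a back-of-the-envelope sketch only: the function name and the linear-scaling model are assumptions, not something the source states, and the real speedup is capped by the number of independent processes (14).

```python
def seconds_per_image(thread_seconds: float, hardware_threads: int,
                      max_parallel: int = 14) -> float:
    """Estimate wall time per image under ideal linear scaling (an assumption).

    thread_seconds: total work per image, in seconds times hardware threads.
    max_parallel: no more than this many tasks can run at once (14 algorithms).
    """
    usable = min(hardware_threads, max_parallel)
    return thread_seconds / usable


# From the text: about 4 s per image on 4 hardware threads.
WORK = 4.0 * 4  # about 16 thread-seconds per image

print(seconds_per_image(WORK, 4))   # the 2-core / 4-thread PC
print(seconds_per_image(WORK, 16))  # the 8-core / 16-thread CPU, capped at 14 tasks
```

Under this simplified model the 16-thread CPU lands slightly above 1 second per image, which is in line with the "about 1 frame per second" target.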
|High accuracy is achieved by running 14 recognition processes in parallel, each with a low recognition rate on its own, as shown in the figure below,
and integrating their results.
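The idea of running several weak recognizers side by side and merging their outputs can be sketched as follows. This is a minimal illustration, not Art Logic's implementation: the source does not say how the results are integrated, so the majority vote here is an assumption, and `Recognizer`, `recognize_with_all`, and `integrate` are hypothetical names. A thread pool is used to keep the sketch short; for CPU-bound work, `ProcessPoolExecutor` from the same module offers the same `submit` interface.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence

Image = List[List[int]]                      # hypothetical stand-in for one frame
Recognizer = Callable[[Image], Optional[str]]  # returns text, or None on failure


def recognize_with_all(image: Image, recognizers: Sequence[Recognizer],
                       max_workers: int = 14) -> List[Optional[str]]:
    """Run every recognizer on the same image concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(r, image) for r in recognizers]
        return [f.result() for f in futures]


def integrate(candidates: Sequence[Optional[str]]) -> Optional[str]:
    """Majority vote over non-empty results (assumed integration rule)."""
    votes = Counter(c for c in candidates if c)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy recognizers: each one is unreliable on its own.
demo_recognizers: List[Recognizer] = [
    lambda img: "NEWS",
    lambda img: "NEW5",   # misreads one character
    lambda img: None,     # finds nothing
    lambda img: "NEWS",
]

result = integrate(recognize_with_all([[0]], demo_recognizers))
print(result)
```

Voting is only one possible rule; a real system might instead weight each result by a confidence score or merge results character by character.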
For copyright reasons, only the corners of the screen are shown. Thank you for your understanding.
|Binarization with strong emphasis on low luminance, ignoring color components
|Binarization with strong emphasis on high luminance, ignoring color components
|Binarization with yellow and orange colors as character colors
|Binarization targeting outlined characters (the outline is low luminance)
|Binarization using saturation information
|General binarization for color documents (not dedicated to captions)
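A few of the binarization variants listed above can be illustrated per pixel as follows. This is a rough sketch under stated assumptions: all thresholds and hue ranges are illustrative guesses, the function names are hypothetical, and the luminance formula uses the common Rec. 601 weights, which the source does not specify.

```python
import colorsys
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def luminance(p: Pixel) -> float:
    """Rec. 601 luma; color components are otherwise ignored."""
    r, g, b = p
    return 0.299 * r + 0.587 * g + 0.114 * b


def dark_text(p: Pixel, threshold: float = 64) -> int:
    """Emphasis on low luminance: dark pixels count as characters."""
    return 1 if luminance(p) <= threshold else 0


def bright_text(p: Pixel, threshold: float = 192) -> int:
    """Emphasis on high luminance: bright pixels count as characters."""
    return 1 if luminance(p) >= threshold else 0


def _hsv(p: Pixel) -> Tuple[float, float, float]:
    """Hue in degrees, saturation and value in 0..1."""
    h, s, v = colorsys.rgb_to_hsv(p[0] / 255, p[1] / 255, p[2] / 255)
    return h * 360.0, s, v


def yellow_orange_text(p: Pixel) -> int:
    """Yellow and orange as character colors (assumed hue range 20 to 70 degrees)."""
    hue, s, v = _hsv(p)
    return 1 if 20.0 <= hue <= 70.0 and s >= 0.5 and v >= 0.5 else 0


def saturated_text(p: Pixel, threshold: float = 0.6) -> int:
    """Binarization using saturation information."""
    return 1 if _hsv(p)[1] >= threshold else 0


def binarize(image: List[List[Pixel]], rule: Callable[[Pixel], int]) -> List[List[int]]:
    """Apply one per-pixel rule to a whole image."""
    return [[rule(p) for p in row] for row in image]


# One row: white, black, yellow, orange, blue.
row = [[(255, 255, 255), (0, 0, 0), (255, 255, 0), (255, 165, 0), (0, 0, 255)]]
for rule in (dark_text, bright_text, yellow_orange_text, saturated_text):
    print(rule.__name__, binarize(row, rule))
```

Each rule marks a different subset of pixels as characters, which is why no single one is sufficient across varied caption styles.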