Our company has developed an OCR system for patent documents.
It automatically generates links from the text to the figures and from the figures back to the text, and automatically creates multi-frame HTML files.

Strings are selected by longest match. For example, "Figure 5" in the text is linked to "Figure 5", not to "Figure 5a" or "Figure 5b".
If "Figure 5" appears on more than one page, a link to each of those pages is generated automatically. Links from "Figure 5" in the drawing back to the text are generated at the same time.
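
The longest-match rule described above can be sketched roughly as follows. This is only an illustration, not the product's actual implementation: the function names, the `figure_pages` mapping, and the `drawing_pN.html` file naming are all hypothetical, and the sketch covers only the text-to-figure direction.

```python
import re


def build_label_pattern(labels):
    """Compile a regex that prefers the longest known figure label."""
    # Longest labels first, so "Figure 5a" is tried before "Figure 5".
    ordered = sorted(set(labels), key=len, reverse=True)
    alternation = "|".join(re.escape(label) for label in ordered)
    # The lookahead keeps "Figure 5" from matching inside "Figure 50".
    return re.compile(r"(?:%s)(?![0-9A-Za-z])" % alternation)


def link_figure_references(text, figure_pages):
    """Replace each figure label in text with one link per drawing page.

    figure_pages is a hypothetical mapping from a label to the list of
    drawing pages on which that label was recognized.
    """
    pattern = build_label_pattern(figure_pages)

    def to_links(match):
        label = match.group(0)
        anchors = [
            '<a href="drawing_p{0}.html">{1} (p.{0})</a>'.format(page, label)
            for page in figure_pages[label]
        ]
        return " ".join(anchors)

    return pattern.sub(to_links, text)


# Hypothetical OCR result: "Figure 5" was found on two drawing pages.
figure_pages = {"Figure 5": [3, 4], "Figure 5a": [5], "Figure 5b": [5]}
print(link_figure_references("See Figure 5 and Figure 5a.", figure_pages))
```

Sorting the labels by length before building the alternation is what gives the longest-match behavior here: the regex engine tries alternatives left to right, so a shorter label is used only when no longer one fits at that position.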

Beyond patent drawings, we have also developed software that automatically generates HTML files linking a drawing's title block (date, drafter, title, etc.) to the drawing itself.

Demonstration animation of patent document processing