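Computer science
Deep learning
Graphical model
Artificial intelligence
Digitization
Process (computing)
Object (grammar)
Information retrieval
Visualization
Data science
Data mining
Computer vision
Operating system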
Authors
Jwalin Bhatt, Khurram Azeem Hashmi, Muhammad Zeshan Afzal, Didier Stricker
Abstract
In any document, graphical elements such as tables, figures, and formulas contain essential information. Processing and interpreting this information requires specialized algorithms. Off-the-shelf OCR components cannot process it reliably. Detecting these graphical components is therefore an essential step in document analysis pipelines. It leads to a high-level conceptual understanding of documents, which makes their digitization viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved manyfold. This work outlines and summarizes the deep learning approaches for detecting graphical page objects in document images. We discuss the most relevant deep learning-based approaches and the state of the art in graphical page object detection in document images. This work provides a comprehensive understanding of the current state of the art and the related challenges. Furthermore, we discuss the leading datasets along with their quantitative evaluation. Finally, we briefly discuss promising directions for further improvement.