Authors
- Junyi Wu
(College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)
- Chen Ye
(College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China;
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China;
Jubang Group Co., Ltd., Yueqing 325600, China)
- Haoshi Zhi
(College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)
- Shihao Jiang
(College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)
Abstract
Web tables are essential for applications such as data analysis. However, they are often incomplete and lack critical information, which makes their content hard to interpret. Automatically predicting column types for tables without metadata is therefore important for handling the wide variety of tables found on the Internet. This paper proposes CNN-Text, a method for this task that fuses CNN prediction with voting processes. We present data augmentation and synthetic column generation approaches to improve the CNN’s performance, and we use extracted text to obtain better predictions. The experimental results show that CNN-Text outperforms the baseline methods, demonstrating that it is well suited to table column type prediction.
Suggested Citation
Junyi Wu & Chen Ye & Haoshi Zhi & Shihao Jiang, 2023.
"Column-Type Prediction for Web Tables Powered by Knowledge Base and Text,"
Mathematics, MDPI, vol. 11(3), pages 1-15, January.
Handle:
RePEc:gam:jmathe:v:11:y:2023:i:3:p:560-:d:1042851