
Column-Type Prediction for Web Tables Powered by Knowledge Base and Text

Author

Listed:
  • Junyi Wu

    (College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)

  • Chen Ye

    (College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China;
    College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China;
    Jubang Group Co., Ltd., Yueqing 325600, China)

  • Haoshi Zhi

    (College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)

  • Shihao Jiang

    (College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)

Abstract

Web tables are essential for applications such as data analysis. However, they are often incomplete and lack critical information, which makes their content difficult to understand. Automatically predicting column types for tables without metadata is therefore important for handling the variety of tables found on the Internet. This paper proposes CNN-Text, a method for this task that fuses CNN prediction with a voting process. We present data augmentation and synthetic column generation approaches to improve the CNN's performance, and use extracted text to obtain better predictions. Experimental results show that CNN-Text outperforms the baseline methods, demonstrating that it is well suited to table column-type prediction.

Suggested Citation

  • Junyi Wu & Chen Ye & Haoshi Zhi & Shihao Jiang, 2023. "Column-Type Prediction for Web Tables Powered by Knowledge Base and Text," Mathematics, MDPI, vol. 11(3), pages 1-15, January.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:3:p:560-:d:1042851

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/3/560/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/3/560/
    Download Restriction: no
