
A Hybrid Text Generation-Based Query Expansion Method for Open-Domain Question Answering

Author

Listed:
  • Wenhao Zhu

    (School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
    These authors contributed equally to this work.)

  • Xiaoyu Zhang

    (School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
    These authors contributed equally to this work.)

  • Qiuhong Zhai

    (School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China)

  • Chenyun Liu

    (Shanghai Municipal Big Data Center, Shanghai 200433, China)

Abstract

In two-stage open-domain question answering (OpenQA) systems, the retriever identifies a subset of relevant passages, which the reader then uses to extract or generate answers. However, OpenQA performance is often hindered by short and semantically ambiguous queries, which make it difficult for the retriever to find relevant passages quickly. This paper introduces Hybrid Text Generation-Based Query Expansion (HTGQE), an effective method to improve retrieval efficiency. HTGQE combines large language models with pseudo-relevance feedback (PRF) techniques to enhance the input to the generative models, improving both the speed and the quality of text generation. Building on this foundation, HTGQE employs multiple query expansion generators, each trained to provide query expansion contexts from a distinct perspective. This enables the retriever to explore relevant passages from several angles and obtain complementary retrieval results. Under both extractive and generative QA setups, HTGQE achieves promising results on the Natural Questions (NQ) and TriviaQA (Trivia) datasets for passage retrieval and reading tasks.
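
The pipeline the abstract describes (retrieve feedback passages, let several generators expand the query from different perspectives, retrieve again, and combine the results) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the generators, the retriever, or how results are merged. Here the corpus, the term-overlap scorer, the two rule-based "generators" (stand-ins for trained language models), and the reciprocal-rank fusion step are all assumptions chosen only to keep the example self-contained.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical toy corpus standing in for a passage index.
PASSAGES: Dict[str, str] = {
    "p1": "paris is the capital city of france",
    "p2": "the eiffel tower is a landmark in paris",
    "p3": "berlin is the capital of germany",
    "p4": "france borders spain and italy",
}
STOPWORDS = {"is", "the", "of", "a", "in", "and"}


def score(query: str, passage: str) -> int:
    """Term-overlap score; a crude stand-in for BM25 or a dense retriever."""
    q, p = Counter(query.split()), Counter(passage.split())
    return sum(min(q[t], p[t]) for t in q)


def retrieve(query: str, k: int) -> List[str]:
    """Return the ids of the k best passages (ties broken by id)."""
    ranked = sorted(PASSAGES, key=lambda pid: (-score(query, PASSAGES[pid]), pid))
    return ranked[:k]


def novel_terms(query: str, context: str) -> List[str]:
    """Context terms that are neither stopwords nor already in the query."""
    seen = set(query.split())
    return [t for t in context.split() if t not in seen and t not in STOPWORDS]


# Two hypothetical "generators". In the paper these are trained models;
# here each one simply picks a different term from the feedback context.
def first_term_generator(query: str, context: str) -> str:
    return " ".join(novel_terms(query, context)[:1])


def last_term_generator(query: str, context: str) -> str:
    return " ".join(novel_terms(query, context)[-1:])


Generator = Callable[[str, str], str]
GENERATORS: List[Generator] = [first_term_generator, last_term_generator]


def expand_and_retrieve(query: str, generators: List[Generator], k: int) -> List[str]:
    # Pseudo-relevance feedback: treat the top passage as relevant context.
    context = " ".join(PASSAGES[pid] for pid in retrieve(query, 1))
    fused: Dict[str, float] = {}
    for generate in generators:
        expanded = f"{query} {generate(query, context)}".strip()
        # Reciprocal-rank fusion (an assumption, not taken from the paper).
        for rank, pid in enumerate(retrieve(expanded, k), start=1):
            fused[pid] = fused.get(pid, 0.0) + 1.0 / rank
    return sorted(fused, key=lambda pid: (-fused[pid], pid))
```

Calling `expand_and_retrieve("france capital", GENERATORS, k=2)` on this toy corpus expands the query once with "paris" and once with "city"; the two expanded queries retrieve different second-ranked passages, so the fused list covers three passages rather than two.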

Suggested Citation

  • Wenhao Zhu & Xiaoyu Zhang & Qiuhong Zhai & Chenyun Liu, 2023. "A Hybrid Text Generation-Based Query Expansion Method for Open-Domain Question Answering," Future Internet, MDPI, vol. 15(5), pages 1-14, May.
  • Handle: RePEc:gam:jftint:v:15:y:2023:i:5:p:180-:d:1145821

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/15/5/180/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/15/5/180/
    Download Restriction: no