From Proposals to Outcomes: Concept-Aligned Chunking for Cross-Document Relevance Assessment in Research Funding Review

In: AI, Society and Digital Transformation

Authors

  • Fengchi Yuan

    (Tsinghua Shenzhen International Graduate School, Tsinghua University, Institute of Data and Information)

  • Keqin Guan

    (Tsinghua Shenzhen International Graduate School, Tsinghua University, Institute of Data and Information)

  • Siyu Chen

    (Tsinghua Shenzhen International Graduate School, Tsinghua University, Institute of Data and Information)

  • Bokui Chen

    (Tsinghua Shenzhen International Graduate School, Tsinghua University, Institute of Data and Information)

  • Wai Kin Victor Chan

    (Tsinghua Shenzhen International Graduate School, Tsinghua University, Institute of Data and Information)

Abstract

Government-funded science and technology innovation projects are vital for driving industrial development and supporting talent cultivation. However, evaluating their outcomes remains a significant challenge, especially when some researchers misattribute unrelated publications to funded projects, raising concerns about research integrity and transparency. This paper focuses on the challenging task of assessing the relevance between project proposals and research outputs, formulated as a long-text matching problem. Because even valid research outputs often address only subtopics of the original project objectives, traditional methods, which typically compare entire documents, often fail to provide accurate relevance assessments. To address this, we propose ConceptSplitter, a concept-based chunking method inspired by long-text structuring strategies. As part of a retrieval-augmented generation (RAG) pipeline, ConceptSplitter serves as the chunking module that improves retrieval precision and contextual relevance in large language model inference. To support robust evaluation, we also construct a domain-diverse dataset that mirrors real-world funding scenarios. Experiments on this dataset show that ConceptSplitter outperforms traditional methods by enhancing chunking quality, improving the accuracy of relevance classification, and providing more reliable confidence estimation in large language model outputs.

Suggested Citation

  • Fengchi Yuan & Keqin Guan & Siyu Chen & Bokui Chen & Wai Kin Victor Chan, 2026. "From Proposals to Outcomes: Concept-Aligned Chunking for Cross-Document Relevance Assessment in Research Funding Review," Lecture Notes in Operations Research, in: Xiaolei Xie & Kejia Hu & Guiping Hu & Weiwei Chen & Robin Qiu (ed.), AI, Society and Digital Transformation, pages 66-77, Springer.
  • Handle: RePEc:spr:lnopch:978-3-032-13116-4_6
    DOI: 10.1007/978-3-032-13116-4_6