Authors
Listed:
- Anna Nesvijevskaia
(DICEN-IDF - Dispositifs d'Information et de Communication à l'Ère du Numérique - Paris Île-de-France - UPN - Université Paris Nanterre - Cnam - Conservatoire National des Arts et Métiers [Cnam] - Université Gustave Eiffel, HEG - Haute Ecole de Gestion de Genève, ISI 4C - Intelligence Swiss Initiative - HEG - Haute Ecole de Gestion de Genève)
- Stefan Berechet
(Quinten)
Abstract
The recent democratization of Large Language Models (LLMs) has profoundly transformed access to and exploitation of external data, particularly in the field of Strategic and Competitive Intelligence (CI). While LLM-based solutions offer new opportunities to enhance decision-making, innovation detection, and information processing, they also raise significant challenges regarding their evaluation, notably in terms of performance, trust, and operational impact. Existing evaluation frameworks, largely inherited from traditional machine learning, appear insufficient to capture the multifaceted, subjective, and evolving nature of LLM-driven uses. This paper explores the evaluation of LLM-based CI solutions through an in-depth case study of a French data science project aimed at supporting innovation scouting for enterprises. Building on a review of state-of-the-art LLM evaluation metrics and methodologies, the study adopts an anthropocentric qualitative approach, based on eleven semi-structured interviews conducted with data and business stakeholders involved in the co-design of the solution. The analysis follows the CRISP-DM framework to examine how evaluation activities are embedded throughout the project lifecycle. The findings highlight the extreme complexity of defining and operationalizing evaluation criteria, driven by the coexistence of multiple objectives such as relevance, exhaustiveness, creativity, usability, and strategic impact. Beyond statistical metrics, the study reveals the growing importance of behavioral and psychological dimensions, particularly user trust, which strongly influences adoption and perceived value but remains costly and difficult to measure over time. Moreover, the rapid evolution of LLM technologies and the constraints related to sovereignty, confidentiality, and intellectual property further complicate model selection, benchmarking, and long-term governance.
The paper concludes by discussing emerging challenges for performance assessment, project arbitration, and the balance between standardization and flexibility of practices. It calls for renewed evaluation frameworks that integrate technical, organizational, and human factors, and outlines avenues for future research on sustainable and interpretable evaluation of LLM-based CI systems.
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:hal:journl:halshs-05474783. See general information about how to correct material in RePEc.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: CCSD (email available below). General contact details of provider: https://hal.archives-ouvertes.fr/ .
Please note that corrections may take a couple of weeks to filter through the various RePEc services.