Printed from https://ideas.repec.org/p/idb/brikps/12962.html

Automatic Product Classification in International Trade: Machine Learning and Large Language Models

Authors

  • Marra de Artiñano, Ignacio
  • Riottini Depetris, Franco
  • Volpe Martincus, Christian

Abstract

Accurately classifying products is essential in international trade. Virtually all countries categorize products into tariff lines using the Harmonized System (HS) nomenclature for both statistical and duty collection purposes. In this paper, we apply and assess several different algorithms to automatically classify products based on text descriptions. To do so, we use agricultural product descriptions from several public agencies, including customs authorities and the United States Department of Agriculture (USDA). We find that while traditional machine learning (ML) models tend to perform well within the dataset in which they were trained, their precision drops dramatically when implemented outside of it. In contrast, large language models (LLMs) such as GPT 3.5 show a consistently good performance across all datasets, with accuracy rates ranging between 60% and 90% depending on HS aggregation levels. Our analysis highlights the valuable role that artificial intelligence (AI) can play in facilitating product classification at scale and, more generally, in enhancing the categorization of unstructured data.
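The abstract reports accuracy "depending on HS aggregation levels". One plausible way to read this is that a predicted code counts as correct when it matches the true code on its first 2 digits (chapter), 4 digits (heading), or 6 digits (subheading). The sketch below illustrates that reading only; the function names and the sample codes are hypothetical and are not taken from the paper, whose exact evaluation procedure is not described on this page.

```python
from typing import Sequence


def hs_prefix(code: str, digits: int) -> str:
    """Return the first `digits` digits of an HS code, ignoring dots and spaces."""
    cleaned = "".join(ch for ch in code if ch.isdigit())
    return cleaned[:digits]


def accuracy_at_level(
    true_codes: Sequence[str], predicted_codes: Sequence[str], digits: int
) -> float:
    """Share of predictions that match the true code on the first `digits` digits."""
    if len(true_codes) != len(predicted_codes):
        raise ValueError("true_codes and predicted_codes must have the same length")
    if not true_codes:
        raise ValueError("at least one observation is required")
    hits = sum(
        hs_prefix(t, digits) == hs_prefix(p, digits)
        for t, p in zip(true_codes, predicted_codes)
    )
    return hits / len(true_codes)


# Hypothetical labels and predictions, for illustration only (not the paper's data).
true_codes = ["0805.10", "1006.30", "0901.21", "0402.10"]
predicted_codes = ["0805.50", "1006.30", "0902.10", "0402.10"]

for digits in (2, 4, 6):
    score = accuracy_at_level(true_codes, predicted_codes, digits)
    print(f"HS {digits}-digit accuracy: {score:.2f}")
```

Under this definition accuracy can only fall or stay flat as the level becomes more detailed, since a 6-digit match implies a 4-digit and a 2-digit match. That is consistent with the abstract reporting a range of accuracy rates across aggregation levels.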

Suggested Citation

  • Marra de Artiñano, Ignacio & Riottini Depetris, Franco & Volpe Martincus, Christian, 2023. "Automatic Product Classification in International Trade: Machine Learning and Large Language Models," IDB Publications (Working Papers) 12962, Inter-American Development Bank.
  • Handle: RePEc:idb:brikps:12962
    DOI: http://dx.doi.org/10.18235/0005012

    Download full text from publisher

    File URL: https://publications.iadb.org/publications/english/document/Automatic-Product-Classification-in-International-Trade-Machine-Learning-and-Large-Language-Models.pdf
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yuqi Nie & Yaxuan Kong & Xiaowen Dong & John M. Mulvey & H. Vincent Poor & Qingsong Wen & Stefan Zohren, 2024. "A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges," Papers 2406.11903, arXiv.org.
    2. Penaranda, Francisco & Sentana, Enrique, 2024. "Portfolio management with big data," CEPR Discussion Papers 19314, C.E.P.R. Discussion Papers.
    3. Claudia Biancotti & Carolina Camassa, 2023. "Loquacity and visible emotion: ChatGPT as a policy advisor," Questioni di Economia e Finanza (Occasional Papers) 814, Bank of Italy, Economic Research and International Relations Area.
    4. Liping Wang & Jiawei Li & Lifan Zhao & Zhizhuo Kou & Xiaohan Wang & Xinyi Zhu & Hao Wang & Yanyan Shen & Lei Chen, 2023. "Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey," Papers 2308.04947, arXiv.org.
    5. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2023. "From Transcripts to Insights: Uncovering Corporate Risks Using Generative AI," Papers 2310.17721, arXiv.org, revised Mar 2025.
    6. Mostapha Benhenda, 2025. "FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents," Papers 2502.07393, arXiv.org.
    7. Baptiste Lefort & Eric Benhamou & Jean-Jacques Ohana & David Saltiel & Beatrice Guez, 2024. "Optimizing Performance: How Compact Models Match or Exceed GPT's Classification Capabilities through Fine-Tuning," Papers 2409.11408, arXiv.org.
    8. Manish Jha & Jialin Qian & Michael Weber & Baozhong Yang, 2024. "Generative AI, Managerial Expectations, and Economic Activity," Papers 2410.03897, arXiv.org, revised Nov 2025.
    9. Paul Glasserman & Caden Lin, 2023. "Assessing Look-Ahead Bias in Stock Return Predictions Generated By GPT Sentiment Analysis," Papers 2309.17322, arXiv.org.
    10. Kelvin J. L. Koa & Yunshan Ma & Ritchie Ng & Tat-Seng Chua, 2024. "Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models," Papers 2402.03659, arXiv.org, revised Feb 2024.
    11. Alex Kim & Maximilian Muhn & Valeri Nikolaev, 2024. "Financial Statement Analysis with Large Language Models," Papers 2407.17866, arXiv.org, revised Feb 2025.
    12. Han Ding & Yinheng Li & Junhao Wang & Hang Chen, 2024. "Large Language Model Agent in Financial Trading: A Survey," Papers 2408.06361, arXiv.org.
    13. Jaskaran Singh Walia & Aarush Sinha & Srinitish Srinivasan & Srihari Unnikrishnan, 2025. "Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation," Papers 2502.17011, arXiv.org.
    14. Li Zhao & Rui Sun & Zuoyou Jiang & Bo Yang & Yuxiao Bai & Mengting Chen & Xinyang Wang & Jing Li & Zuo Bai, 2025. "ContestTrade: A Multi-Agent Trading System Based on Internal Contest Mechanism," Papers 2508.00554, arXiv.org, revised Aug 2025.
    15. Ko, Hyungjin & Lee, Jaewook, 2024. "Can ChatGPT improve investment decisions? From a portfolio management perspective," Finance Research Letters, Elsevier, vol. 64(C).
    16. Hanshuang Tong & Jun Li & Ning Wu & Ming Gong & Dongmei Zhang & Qi Zhang, 2024. "Ploutos: Towards interpretable stock movement prediction with financial large language model," Papers 2403.00782, arXiv.org.
    17. Udit Gupta, 2023. "GPT-InvestAR: Enhancing Stock Investment Strategies through Annual Report Analysis with Large Language Models," Papers 2309.03079, arXiv.org.
    18. Julian Junyan Wang & Victor Xiaoqi Wang, 2025. "Assessing Consistency and Reproducibility in the Outputs of Large Language Models: Evidence Across Diverse Finance and Accounting Tasks," Papers 2503.16974, arXiv.org, revised Sep 2025.
    19. Thomas R. Cook & Sophia Kazinnik & Anne Lundgaard Hansen & Peter McAdam, 2023. "Evaluating Local Language Models: An Application to Bank Earnings Calls," Research Working Paper RWP 23-12, Federal Reserve Bank of Kansas City.
    20. Yujie Ding & Shuai Jia & Tianyi Ma & Bingcheng Mao & Xiuze Zhou & Liuliu Li & Dongming Han, 2023. "Integrating Stock Features and Global Information via Large Language Models for Enhanced Stock Return Prediction," Papers 2310.05627, arXiv.org.

    More about this item

    JEL classification:

    • F10 - International Economics - - Trade - - - General
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software
