
The Governance Challenge Posed by Large Learning Models

Author

Listed:
  • Susan Ariel Aaronson

    (George Washington University)

Abstract

Only eight months have passed since ChatGPT and the large learning model underpinning it took the world by storm. This article focuses on the data supply chain (the data collected and then used to train large language models) and the governance challenge it presents to policymakers. These challenges include:

  • How web scraping may affect individuals and firms that hold copyrights.
  • How web scraping may affect individuals and groups who are supposed to be protected under privacy and personal data protection laws.
  • How web scraping revealed the lack of protections for content creators and content providers on open access web sites.
  • How the debate over open and closed source LLMs reveals the lack of clear and universal rules to ensure the quality and validity of datasets.

As the US National Institute of Standards and Technology explained, many LLMs depend on "large-scale datasets, which can lead to data quality and validity concerns. The difficulty of finding the 'right' data may lead AI actors to select datasets based more on accessibility and availability than on suitability... Such decisions could contribute to an environment where the data used in processes is not fully representative of the populations or phenomena that are being modeled, introducing downstream risks": in short, problems of quality and validity (NIST: 2023, 80).

The author uses qualitative methods to examine these data governance challenges. In general, this report discusses only those governments that adopted specific steps (actions, policies, new regulations, etc.) to address web scraping, LLMs, or generative AI. The author acknowledges that these examples do not comprise a representative sample based on income, LLM expertise, and geographic diversity. However, the author uses these examples to show that while some policymakers are responsive to rising concerns, they do not seem to be looking at these issues systemically.

A systemic approach has two components. First, policymakers recognize that these AI chatbots are a complex system with different sources of data that are linked to other systems designed, developed, owned, and controlled by different people and organizations. Data and algorithm production, deployment, and use are distributed among a wide range of actors who together produce the system's outcomes and functionality. Hence accountability is diffused and opaque (Cobbe et al.: 2023). Second, as a report for the US National Academy of Sciences notes, the only way to govern such complex systems is to create "a governance ecosystem that cuts across sectors and disciplinary silos and solicits and addresses the concerns of many stakeholders." This assessment is particularly true for LLMs, a global product with a global supply chain and numerous interdependencies among those who supply data, those who control data, and those who are data subjects or content creators (Cobbe et al.: 2023).

Suggested Citation

  • Susan Ariel Aaronson, 2023. "The Governance Challenge Posed by Large Learning Models," Working Papers 2023-07, The George Washington University, Institute for International Economic Policy.
  • Handle: RePEc:gwi:wpaper:2023-07

    Download full text from publisher

    File URL: https://www2.gwu.edu/~iiep/assets/docs/papers/2023WP/AaronsonIIEP2023-07.pdf
    Download Restriction: no

    More about this item

    Keywords

    data; data governance; personal data; property rights; open data; open source; governance;

    JEL classification:

    • P51 - Political Economy and Comparative Economic Systems - - Comparative Economic Systems - - - Comparative Analysis of Economic Systems

