Printed from https://ideas.repec.org/p/fip/fedpwp/99851.html

On the Testability of the Anchor-Words Assumption in Topic Models

Authors

Listed:

  • Simon Freyaldenhoven
  • Shikun Ke
  • Dingyi Li
  • Jose Luis Montiel Olea

Abstract

What does the Fed talk about in its monetary policy discussions? We introduce a new statistical methodology to analyze text documents, and we use that methodology to recover the topics discussed during FOMC meetings. Topic models are a simple and popular tool for the statistical analysis of textual data. Their identification and estimation are typically enabled by assuming the existence of anchor words; that is, words that are exclusive to specific topics. In this paper we show that the existence of anchor words is statistically testable: There exists a hypothesis test with correct size that has nontrivial power. This means that the anchor-words assumption cannot be viewed simply as a convenient normalization. Central to our results is a simple characterization of when a column-stochastic matrix with known nonnegative rank admits a separable factorization. We test for the existence of anchor words in two different datasets derived from monetary policy discussions in the Federal Reserve and reject the null hypothesis that anchor words exist in one of them.
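To make the anchor-words (separability) condition concrete, here is a minimal Python sketch. It assumes a hypothetical word-by-topic matrix `B` whose columns are probability distributions over the vocabulary, and treats a word as an anchor for a topic when its row has exactly one positive entry. The function names and the toy matrices are illustrative assumptions, not taken from the paper. The sketch only checks the definition on a known factor matrix; it is not the hypothesis test the paper proposes, which works from observed documents and asks whether some separable factorization exists.

```python
import numpy as np


def anchor_words(B, tol=1e-12):
    """Map each topic index to the word indices that are exclusive to it.

    B is a (words x topics) matrix; a word is an anchor for topic k when
    its row is positive in column k and (numerically) zero elsewhere.
    """
    B = np.asarray(B, dtype=float)
    anchors = {k: [] for k in range(B.shape[1])}
    for w, row in enumerate(B):
        support = np.flatnonzero(row > tol)
        if support.size == 1:
            anchors[int(support[0])].append(w)
    return anchors


def has_anchor_words(B, tol=1e-12):
    """True when every topic has at least one anchor word."""
    return all(len(words) > 0 for words in anchor_words(B, tol).values())


# Toy example (hypothetical numbers): 3 words, 2 topics, columns sum to one.
B_anchor = np.array([[0.5, 0.0],   # word 0 appears only in topic 0
                     [0.0, 0.4],   # word 1 appears only in topic 1
                     [0.5, 0.6]])  # word 2 is shared by both topics

# Every word has positive probability under both topics: no anchors.
B_none = np.array([[0.5, 0.2],
                   [0.3, 0.4],
                   [0.2, 0.4]])

print(has_anchor_words(B_anchor))
print(has_anchor_words(B_none))
```

A factor matrix without anchor rows, such as `B_none`, does not by itself settle whether the implied word-document matrix admits a different factorization that is separable. That harder question is the one the characterization in the abstract addresses.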

Suggested Citation

  • Simon Freyaldenhoven & Shikun Ke & Dingyi Li & Jose Luis Montiel Olea, 2025. "On the Testability of the Anchor-Words Assumption in Topic Models," Working Papers 25-14, Federal Reserve Bank of Philadelphia.
  • Handle: RePEc:fip:fedpwp:99851
    DOI: 10.21799/frbp.wp.2025.14

    Download full text from publisher

    File URL: https://www.philadelphiafed.org/-/media/FRBP/Assets/working-papers/2025/wp25-14.pdf
    Download Restriction: no


    References listed on IDEAS

    1. Ivan A. Canay & Andres Santos & Azeem M. Shaikh, 2013. "On the Testability of Identification in Some Nonparametric Models With Endogeneity," Econometrica, Econometric Society, vol. 81(6), pages 2535-2559, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jarociński, Marek & Marcet, Albert, 2019. "Priors about observables in vector autoregressions," Journal of Econometrics, Elsevier, vol. 209(2), pages 238-255.
    2. Babii, Andrii, 2020. "Honest Confidence Sets In Nonparametric Iv Regression And Other Ill-Posed Models," Econometric Theory, Cambridge University Press, vol. 36(4), pages 658-706, August.
    3. Rodrigo Adão & Costas Arkolakis & Sharat Ganapati, 2020. "Aggregate Implications of Firm Heterogeneity: A Nonparametric Analysis of Monopolistic Competition Trade Models," Working Papers 2020-161, Becker Friedman Institute for Research In Economics.
    4. Krief, Jerome M., 2017. "Direct instrumental nonparametric estimation of inverse regression functions," Journal of Econometrics, Elsevier, vol. 201(1), pages 95-107.
    5. Yu Zhu, 2020. "Inference in nonparametric/semiparametric moment equality models with shape restrictions," Quantitative Economics, Econometric Society, vol. 11(2), pages 609-636, May.
    6. Daniel Wilhelm, 2018. "Testing for the presence of measurement error," CeMMAP working papers CWP45/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    7. Centorrino Samuele & Feve Frederique & Florens Jean-Pierre, 2017. "Additive Nonparametric Instrumental Regressions: A Guide to Implementation," Journal of Econometric Methods, De Gruyter, vol. 6(1), pages 1-25, January.
    8. Yilin Li & Wang Miao & Ilya Shpitser & Eric J. Tchetgen Tchetgen, 2023. "A self‐censoring model for multivariate nonignorable nonmonotone missing data," Biometrics, The International Biometric Society, vol. 79(4), pages 3203-3214, December.
    9. Manuel Arellano & Stéphane Bonhomme, 2017. "Nonlinear Panel Data Methods for Dynamic Heterogeneous Agent Models," Annual Review of Economics, Annual Reviews, vol. 9(1), pages 471-496, September.
    10. Hidehiko Ichimura & Whitney K. Newey, 2022. "The influence function of semiparametric estimators," Quantitative Economics, Econometric Society, vol. 13(1), pages 29-61, January.
    11. Wang, Ao, 2021. "A BLP Demand Model of Product-Level Market Shares with Complementarity," The Warwick Economics Research Paper Series (TWERPS) 1351, University of Warwick, Department of Economics.
    12. Victor Chernozhukov & Whitney K. Newey & Andres Santos, 2023. "Constrained Conditional Moment Restriction Models," Econometrica, Econometric Society, vol. 91(2), pages 709-736, March.
    13. Enrico Moretti, 2014. "Local Economic Development, Agglomeration Economies, and the Big Push: 100 Years of Evidence from the Tennessee Valley Authority," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 129(1), pages 275-331.
    14. Joachim Freyberger & Joel L. Horowitz, 2012. "Identification and shape restrictions in nonparametric instrumental variables estimation," CeMMAP working papers 15/12, Institute for Fiscal Studies.
    15. Jason R. Blevins & Wei Shi & Donald R. Haurin & Stephanie Moulton, 2020. "A Dynamic Discrete Choice Model Of Reverse Mortgage Borrower Behavior," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 61(4), pages 1437-1477, November.
    16. Hu, Yingyao & Schennach, Susanne M. & Shiu, Ji-Liang, 2017. "Injectivity of a class of integral operators with compactly supported kernels," Journal of Econometrics, Elsevier, vol. 200(1), pages 48-58.
    17. Denis Chetverikov & Daniel Wilhelm, 2017. "Nonparametric Instrumental Variable Estimation Under Monotonicity," Econometrica, Econometric Society, vol. 85, pages 1303-1320, July.
    18. Andrew Chesher & Adam M. Rosen, 2017. "Generalized Instrumental Variable Models," Econometrica, Econometric Society, vol. 85, pages 959-989, May.
    19. Daniel Wilhelm, 2015. "Identification and estimation of nonparametric panel data regressions with measurement error," CeMMAP working papers 34/15, Institute for Fiscal Studies.
    20. Xiaohong Chen & Andres Santos, 2018. "Overidentification in Regular Models," Econometrica, Econometric Society, vol. 86(5), pages 1771-1817, September.

    More about this item

    JEL classification:

    • C39 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Other
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
