
Divide-and-Conquer: A Distributed Hierarchical Factor Approach to Modeling Large-Scale Time Series Data

Author

Listed:
  • Zhaoxing Gao
  • Ruey S. Tsay

Abstract

This article proposes a hierarchical approximate-factor approach to analyzing high-dimensional, large-scale heterogeneous time series data using distributed computing. The new method employs a multiple-fold dimension reduction procedure using Principal Component Analysis (PCA) and shows great promise for modeling large-scale data that can be neither stored nor analyzed by a single machine. Each computer at the basic level performs a PCA to extract common factors among the time series assigned to it and transfers those factors to one and only one node of the second level. Each second-level computer collects the common factors from its subordinates and performs another PCA to select the second-level common factors. This process is repeated until the central server is reached, which collects factors from its direct subordinates and performs a final PCA to select the global common factors. The noise terms of the second-level approximate factor model are the unique common factors of the first-level clusters. We focus on the case of two levels in our theoretical derivations, but the idea can easily be generalized to any finite number of hierarchies, and the proposed method is also applicable to data with heterogeneous and multilevel subcluster structures that are stored and analyzed by a single machine. We introduce a new diffusion index approach to forecasting based on the global and group-specific factors. Some clustering methods for the case in which the group memberships are unknown are discussed in the supplement. We further extend the analysis to unit-root nonstationary time series. Asymptotic properties of the proposed method are derived as both the dimension of the data in each computing unit and the sample size T diverge. We use both simulated and real examples to assess the performance of the proposed method in finite samples, and compare it with commonly used methods in the literature in terms of the forecasting ability of the extracted factors.
Supplementary materials for this article are available online.
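
The two-level procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names pca_factors and two_level_factors, the simulated data, and the choice to treat the numbers of factors (r_local, r_global) as known are all assumptions made for illustration, whereas the article selects the factors as part of its procedure. Each group plays the role of one first-level machine, and the "server" step sees only the stacked group factors, never the raw series.

```python
import numpy as np


def pca_factors(X, r):
    """Return the first r principal-component factors of a T x p panel X.

    Factors are scaled so that F.T @ F / T is the identity, a common
    normalization for PCA-based factor estimates (an assumption here).
    """
    Xc = X - X.mean(axis=0)                       # center each series
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return np.sqrt(X.shape[0]) * U[:, :r]         # T x r factor estimates


def two_level_factors(groups, r_local, r_global):
    """Hypothetical two-level sketch: PCA per group, then PCA on stacked factors."""
    # Level 1: each "machine" reduces its own block to r_local factors.
    local = [pca_factors(X, r_local) for X in groups]
    # Level 2: the server receives only the local factors and runs another PCA.
    stacked = np.hstack(local)
    global_f = pca_factors(stacked, r_global)
    # Group-specific part: remainder of each local factor block after
    # projecting out the global factors (the second-level "noise" terms).
    T = global_f.shape[0]
    specific = [F - global_f @ (global_f.T @ F) / T for F in local]
    return global_f, local, specific


# Simulated usage: one global factor g shared by three groups, plus one
# group-specific factor per group. All sizes and scales are illustrative.
rng = np.random.default_rng(0)
T, p = 200, 20
g = rng.standard_normal(T)
groups = []
for _ in range(3):
    h = rng.standard_normal(T)                    # group-specific factor
    a, b = rng.standard_normal(p), rng.standard_normal(p)
    noise = 0.5 * rng.standard_normal((T, p))
    groups.append(np.outer(g, a) + np.outer(h, b) + noise)

global_f, local, specific = two_level_factors(groups, r_local=2, r_global=1)
print("global factors:", global_f.shape)
print("abs. correlation with true g:", abs(np.corrcoef(global_f[:, 0], g)[0, 1]))
```

In this sketch the sign and scale of the estimated factors are arbitrary, as in any PCA-based factor estimate, so the comparison with the simulated global factor uses the absolute correlation. A diffusion-index forecast in the spirit of the abstract would then regress the target series on lagged values of both global_f and the relevant entries of specific.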

Suggested Citation

  • Zhaoxing Gao & Ruey S. Tsay, 2023. "Divide-and-Conquer: A Distributed Hierarchical Factor Approach to Modeling Large-Scale Time Series Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 118(544), pages 2698-2711, October.
  • Handle: RePEc:taf:jnlasa:v:118:y:2023:i:544:p:2698-2711
    DOI: 10.1080/01621459.2022.2071279
