
Cluster analysis of stocks using price movements of high frequency data from National Stock Exchange

Author

Listed:
  • Charu Sharma

    (Shiv Nadar University, UP)

  • Amber Habib

    (Shiv Nadar University, UP)

  • Sunil Bowry

    (Shiv Nadar University, UP)

Abstract

This paper aims to develop new techniques for describing the joint behavior of stocks, beyond regression and correlation. For example, we want to identify clusters of stocks that move together. Our work applies Kernel Principal Component Analysis (KPCA) and Functional Principal Component Analysis (FPCA) to high-frequency data from the National Stock Exchange (NSE). Since we dealt with high-frequency data with a tick size of 30 seconds, FPCA seems to be an ideal choice. FPCA is a functional variant of PCA in which each sample point is considered to be a function in the Hilbert space L^2. KPCA, on the other hand, is an extension of PCA using kernel methods. The results obtained from FPCA and Gaussian kernel PCA seem to be in synergy, but with a lag. Two prominent clusters showed up in our analysis, one corresponding to the banking sector and the other to the IT sector. Smaller clusters were seen from the automobile industry and the energy sector. The IT sector was seen interacting with these smaller clusters. The learning gained from these interactions is substantial, as it can be used to develop trading strategies for intraday traders.
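
To make the kernel PCA step concrete, here is a minimal sketch of Gaussian kernel PCA applied to return series. It is not the authors' implementation: the data are synthetic (six hypothetical stocks driven by two made-up "sector" factors, 750 ticks standing in for 30-second sampling), and the helper name `gaussian_kernel_pca`, the median-distance bandwidth, and the sign-based grouping are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel_pca(X, n_components=2, gamma=None):
    """Scores of the rows of X on the leading Gaussian-kernel principal components."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between rows (one row per stock).
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    if gamma is None:
        # Assumption: bandwidth from the median off-diagonal distance.
        off_diag = ~np.eye(n, dtype=bool)
        gamma = 1.0 / np.median(sq_dists[off_diag])
    K = np.exp(-gamma * sq_dists)
    # Center the kernel matrix in feature space.
    J = np.eye(n) - np.ones((n, n)) / n
    K_centered = J @ K @ J
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Score of sample i on component k is sqrt(lambda_k) * v_k[i].
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))


# Synthetic data only: two hypothetical sector factors, three stocks each.
rng = np.random.default_rng(0)
n_ticks = 750
factor_a = rng.normal(size=n_ticks)
factor_b = rng.normal(size=n_ticks)
returns = np.vstack(
    [factor_a + 0.5 * rng.normal(size=n_ticks) for _ in range(3)]
    + [factor_b + 0.5 * rng.normal(size=n_ticks) for _ in range(3)]
)

# Standardize each stock's return series before computing distances.
X = (returns - returns.mean(axis=1, keepdims=True)) / returns.std(axis=1, keepdims=True)

scores = gaussian_kernel_pca(X, n_components=2)
# Crude two-way grouping: split on the sign of the first component.
labels = scores[:, 0] > 0
print(labels)
```

In this toy setting the first kernel component separates the two simulated sectors; with real tick data one would typically run a proper clustering step (for example k-means) on several leading components rather than a sign split.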

Suggested Citation

  • Charu Sharma & Amber Habib & Sunil Bowry, 2018. "Cluster analysis of stocks using price movements of high frequency data from National Stock Exchange," Papers 1803.09514, arXiv.org.
  • Handle: RePEc:arx:papers:1803.09514

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1803.09514
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. J. O. Ramsay, 1998. "Estimating smooth monotone functions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(2), pages 365-375.
    2. Jiguo Cao & James Ramsay, 2007. "Parameter cascades and profiling in functional data analysis," Computational Statistics, Springer, vol. 22(3), pages 335-351, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christian Genest & Johanna G. Nešlehová, 2014. "A Conversation with James O. Ramsay," International Statistical Review, International Statistical Institute, vol. 82(2), pages 161-183, August.
    2. Eduardo L. Montoya & Wendy Meiring, 2016. "An F-type test for detecting departure from monotonicity in a functional linear model," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(2), pages 322-337, June.
    3. Cao, Jiguo & Ramsay, James O., 2009. "Generalized profiling estimation for global and adaptive penalized spline smoothing," Computational Statistics & Data Analysis, Elsevier, vol. 53(7), pages 2550-2562, May.
    4. Zhang, Yu Yvette, 2017. "A shape constrained estimator of bidding function of first-price sealed-bid auctions," Economics Letters, Elsevier, vol. 150(C), pages 67-72.
    5. Wenchuan Liu & Yu Zhang & Qi Li, 2015. "A semiparametric varying coefficient model of monotone auction bidding processes," Empirical Economics, Springer, vol. 48(1), pages 313-335, February.
    6. Birke, Melanie & Dette, Holger, 2006. "Testing strict monotonicity in nonparametric regression," Technical Reports 2006,49, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    7. Shively, Thomas S. & Kockelman, Kara & Damien, Paul, 2010. "A Bayesian semi-parametric model to estimate relationships between crash counts and roadway characteristics," Transportation Research Part B: Methodological, Elsevier, vol. 44(5), pages 699-715, June.
    8. Zheng, Tingguo & Xiao, Han & Chen, Rong, 2015. "Generalized ARMA models with martingale difference errors," Journal of Econometrics, Elsevier, vol. 189(2), pages 492-506.
    9. Gattone, Stefano Antonio & Fortuna, Francesca & Evangelista, Adelia & Di Battista, Tonio, 2022. "Simultaneous confidence bands for the functional mean of convex curves," Econometrics and Statistics, Elsevier, vol. 24(C), pages 183-193.
    10. Brian Neelon & David B. Dunson, 2004. "Bayesian Isotonic Regression and Trend Analysis," Biometrics, The International Biometric Society, vol. 60(2), pages 398-406, June.
    11. Daniel R. Jiang & Warren B. Powell, 2015. "An Approximate Dynamic Programming Algorithm for Monotone Value Functions," Operations Research, INFORMS, vol. 63(6), pages 1489-1511, December.
    12. James P. Hughes & Patricia Totten, 2003. "Estimating the Accuracy of Polymerase Chain Reaction–Based Tests Using Endpoint Dilution," Biometrics, The International Biometric Society, vol. 59(3), pages 505-511, September.
    13. Christophe Abraham & Khader Khadraoui, 2015. "Bayesian regression with B-splines under combinations of shape constraints and smoothness properties," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 69(2), pages 150-170, May.
    14. repec:jss:jstsof:18:i04 is not listed on IDEAS
    15. Hazelton, Martin L. & Turlach, Berwin A., 2011. "Semiparametric regression with shape-constrained penalized splines," Computational Statistics & Data Analysis, Elsevier, vol. 55(10), pages 2871-2879, October.
    16. Boudaoud, S. & Rix, H. & Meste, O., 2010. "Core Shape modelling of a set of curves," Computational Statistics & Data Analysis, Elsevier, vol. 54(2), pages 308-325, February.
    17. Jan Humplik & Gašper Tkačik, 2017. "Probabilistic models for neural populations that naturally capture global coupling and criticality," PLOS Computational Biology, Public Library of Science, vol. 13(9), pages 1-26, September.
    18. C Rohrbeck & D A Costain & A Frigessi, 2018. "Bayesian spatial monotonic multiple regression," Biometrika, Biometrika Trust, vol. 105(3), pages 691-707.
    19. N. G. Cadigan & J. Brattey, 2003. "Semiparametric Estimation of Tag Loss and Reporting Rates for Tag-Recovery Experiments Using Exact Time-at-Liberty Data," Biometrics, The International Biometric Society, vol. 59(4), pages 869-876, December.
    20. Wagner, Heiko & Kneip, Alois, 2019. "Nonparametric registration to low-dimensional function spaces," Computational Statistics & Data Analysis, Elsevier, vol. 138(C), pages 49-63.
    21. Levent Kutlu & Shasha Liu & Robin C. Sickles, 2022. "Cost, Revenue, and Profit Function Estimates," Springer Books, in: Subhash C. Ray & Robert G. Chambers & Subal C. Kumbhakar (ed.), Handbook of Production Economics, chapter 16, pages 641-679, Springer.
