Characterization of genome-wide STR variation in 6487 human genomes

Author

Listed:
  • Yirong Shi

    (Chinese Academy of Sciences
    University of Chinese Academy of Sciences)

  • Yiwei Niu

    (Chinese Academy of Sciences
    University of Chinese Academy of Sciences)

  • Peng Zhang

    (Chinese Academy of Sciences)

  • Huaxia Luo

    (Chinese Academy of Sciences)

  • Shuai Liu

    (Chinese Academy of Sciences
    University of Chinese Academy of Sciences)

  • Sijia Zhang

    (Chinese Academy of Sciences
    University of Chinese Academy of Sciences)

  • Jiajia Wang

    (Chinese Academy of Sciences)

  • Yanyan Li

    (Chinese Academy of Sciences)

  • Xinyue Liu

    (Chinese Academy of Sciences
    University of Chinese Academy of Sciences)

  • Tingrui Song

    (Chinese Academy of Sciences)

  • Tao Xu

    (Chinese Academy of Sciences
    Shandong First Medical University & Shandong Academy of Medical Sciences)

  • Shunmin He

    (Chinese Academy of Sciences
    University of Chinese Academy of Sciences)

Abstract

Short tandem repeats (STRs) are abundant and highly mutagenic in the human genome. Many STR loci have been associated with a range of human genetic disorders. However, most population-scale studies on STR variation in humans have focused on European ancestry cohorts or are limited by sequencing depth. Here, we depicted a comprehensive map of 366,013 polymorphic STRs (pSTRs) constructed from 6487 deeply sequenced genomes, comprising 3983 Chinese samples (~31.5x, NyuWa) and 2504 samples from the 1000 Genomes Project (~33.3x, 1KGP). We found that STR mutations were affected by motif length, chromosome context and epigenetic features. We identified 3273 and 1117 pSTRs whose repeat numbers were associated with gene expression and 3′UTR alternative polyadenylation, respectively. We also implemented population analysis, investigated population differentiated signatures, and genotyped 60 known disease-causing STRs. Overall, this study further extends the scale of STR variation in humans and propels our understanding of the semantics of STRs.

Suggested Citation

  • Yirong Shi & Yiwei Niu & Peng Zhang & Huaxia Luo & Shuai Liu & Sijia Zhang & Jiajia Wang & Yanyan Li & Xinyue Liu & Tingrui Song & Tao Xu & Shunmin He, 2023. "Characterization of genome-wide STR variation in 6487 human genomes," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-37690-8
    DOI: 10.1038/s41467-023-37690-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-37690-8
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Swapan Mallick & Heng Li & Mark Lipson & Iain Mathieson & Melissa Gymrek & Fernando Racimo & Mengyao Zhao & Niru Chennagiri & Susanne Nordenfelt & Arti Tandon & Pontus Skoglund & Iosif Lazaridis & Sri, 2016. "The Simons Genome Diversity Project: 300 genomes from 142 diverse populations," Nature, Nature, vol. 538(7624), pages 201-206, October.
    2. Mathys Grapotte & Manu Saraswat & Chloé Bessière & Christophe Menichelli & Jordan A. Ramilowski & Jessica Severin & Yoshihide Hayashizaki & Masayoshi Itoh & Michihira Tagami & Mitsuyoshi Murata & Miki, 2021. "Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network," Nature Communications, Nature, vol. 12(1), pages 1-18, December.
    3. Bo Wen & Hui Li & Daru Lu & Xiufeng Song & Feng Zhang & Yungang He & Feng Li & Yang Gao & Xianyun Mao & Liang Zhang & Ji Qian & Jingze Tan & Jianzhong Jin & Wei Huang & Ranjan Deka & Bing Su & Ranajit, 2004. "Genetic evidence supports demic diffusion of Han culture," Nature, Nature, vol. 431(7006), pages 302-305, September.
    4. Nick Kinney & Lin Kang & Laurel Eckstrand & Arichanah Pulenthiran & Peter Samuel & Ramu Anandakrishnan & Robin T Varghese & P Michalak & Harold R Garner, 2019. "Abundance of ethnically biased microsatellites in human gene regions," PLOS ONE, Public Library of Science, vol. 14(12), pages 1-20, December.
    5. Konrad J. Karczewski & Laurent C. Francioli & Grace Tiao & Beryl B. Cummings & Jessica Alföldi & Qingbo Wang & Ryan L. Collins & Kristen M. Laricchia & Andrea Ganna & Daniel P. Birnbaum & Laura D. Gau, 2020. "The mutational constraint spectrum quantified from variation in 141,456 humans," Nature, Nature, vol. 581(7809), pages 434-443, May.
    6. Peter H. Sudmant & Tobias Rausch & Eugene J. Gardner & Robert E. Handsaker & Alexej Abyzov & John Huddleston & Yan Zhang & Kai Ye & Goo Jun & Markus Hsi-Yang Fritz & Miriam K. Konkel & Ankit Malhotra, 2015. "An integrated map of structural variation in 2,504 human genomes," Nature, Nature, vol. 526(7571), pages 75-81, October.
    7. Ryan L. Collins & Harrison Brand & Konrad J. Karczewski & Xuefang Zhao & Jessica Alföldi & Laurent C. Francioli & Amit V. Khera & Chelsea Lowther & Laura D. Gauthier & Harold Wang & Nicholas A. Watts, 2020. "A structural variation reference for medical and population genetics," Nature, Nature, vol. 581(7809), pages 444-451, May.
    8. Ileena Mitra & Bonnie Huang & Nima Mousavi & Nichole Ma & Michael Lamkin & Richard Yanicky & Sharona Shleizer-Burko & Kirk E. Lohmueller & Melissa Gymrek, 2021. "Patterns of de novo tandem repeat mutations and their role in autism," Nature, Nature, vol. 589(7841), pages 246-250, January.
    9. Tuuli Lappalainen & Michael Sammeth & Marc R. Friedländer & Peter A. C. ‘t Hoen & Jean Monlong & Manuel A. Rivas & Mar Gonzàlez-Porta & Natalja Kurbatova & Thasso Griebel & Pedro G. Ferreira & Matthia, 2013. "Transcriptome and genome sequencing uncovers functional variation in humans," Nature, Nature, vol. 501(7468), pages 506-511, September.
    10. Javier Prado-Martinez & Peter H. Sudmant & Jeffrey M. Kidd & Heng Li & Joanna L. Kelley & Belen Lorente-Galdos & Krishna R. Veeramah & August E. Woerner & Timothy D. O’Connor & Gabriel Santpere & Alex, 2013. "Great ape genetic diversity and population history," Nature, Nature, vol. 499(7459), pages 471-475, July.
    11. Jill E. Moore & Michael J. Purcaro & Henry E. Pratt & Charles B. Epstein & Noam Shoresh & Jessika Adrian & Trupti Kawli & Carrie A. Davis & Alexander Dobin & Rajinder Kaul & Jessica Halow & Eric L. No, 2020. "Expanded encyclopaedias of DNA elements in the human and mouse genomes," Nature, Nature, vol. 583(7818), pages 699-710, July.
    12. Josine L Min & Jennifer M Taylor & J Brent Richards & Tim Watts & Fredrik H Pettersson & John Broxholme & Kourosh R Ahmadi & Gabriela L Surdulescu & Ernesto Lowy & Christian Gieger & Chris Newton-Cheh, 2011. "The Use of Genome-Wide eQTL Associations in Lymphoblastoid Cell Lines to Identify Novel Genetic Pathways Involved in Complex Traits," PLOS ONE, Public Library of Science, vol. 6(7), pages 1-14, July.
    13. David Jakubosky & Erin N. Smith & Matteo D’Antonio & Marc Bonder & William W. Young Greenwald & Agnieszka D’Antonio-Chronowska & Hiroko Matsui & Oliver Stegle & Stephen B. Montgomery & Christopher DeB, 2020. "Discovery and quality analysis of a comprehensive set of structural variants and short tandem repeats," Nature Communications, Nature, vol. 11(1), pages 1-15, December.
    14. David Jakubosky & Matteo D’Antonio & Marc Jan Bonder & Craig Smail & Margaret K. R. Donovan & William W. Young Greenwald & Hiroko Matsui & Agnieszka D’Antonio-Chronowska & Oliver Stegle & Erin N. Smit, 2020. "Properties of structural variants and short tandem repeats associated with gene expression and complex traits," Nature Communications, Nature, vol. 11(1), pages 1-15, December.
    15. Shubham Saini & Ileena Mitra & Nima Mousavi & Stephanie Feupe Fotsing & Melissa Gymrek, 2018. "A reference haplotype panel for genome-wide imputation of short tandem repeats," Nature Communications, Nature, vol. 9(1), pages 1-11, December.
    16. Frank R. Wendt & Gita A. Pathak & Renato Polimanti, 2022. "Phenome-wide association study of loci harboring de novo tandem repeat mutations in UK Biobank exomes," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    17. Carles A. Boix & Benjamin T. James & Yongjin P. Park & Wouter Meuleman & Manolis Kellis, 2021. "Regulatory genomic circuitry of human disease loci by integrative epigenomics," Nature, Nature, vol. 590(7845), pages 300-307, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhuoran Xu & Quan Li & Luigi Marchionni & Kai Wang, 2023. "PhenoSV: interpretable phenotype-aware model for the prioritization of genes affected by structural variants," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    2. Jeffrey D. Wall & J. Fah Sathirapongsasuti & Ravi Gupta & Asif Rasheed & Radha Venkatesan & Saurabh Belsare & Ramesh Menon & Sameer Phalke & Anuradha Mittal & John Fang & Deepak Tanneeru & Manjari Des, 2023. "South Asian medical cohorts reveal strong founder effects and high rates of homozygosity," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    3. Ramesh Rajaby & Dong-Xu Liu & Chun Hang Au & Yuen-Ting Cheung & Amy Yuet Ting Lau & Qing-Yong Yang & Wing-Kin Sung, 2023. "INSurVeyor: improving insertion calling from short read sequencing data," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    4. Yuichi Shiraishi & Ai Okada & Kenichi Chiba & Asuka Kawachi & Ikuko Omori & Raúl Nicolás Mateos & Naoko Iida & Hirofumi Yamauchi & Kenjiro Kosaki & Akihide Yoshimi, 2022. "Systematic identification of intron retention associated variants from massive publicly available transcriptome sequencing data," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    5. Timothy D. Arthur & Jennifer P. Nguyen & Agnieszka D’Antonio-Chronowska & Hiroko Matsui & Nayara S. Silva & Isaac N. Joshua & André D. Luchessi & William W. Young Greenwald & Matteo D’Antonio & Martin, 2024. "Complex regulatory networks influence pluripotent cell state transitions in human iPSCs," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    6. Sudha Sunil Rajderkar & Kitt Paraiso & Maria Luisa Amaral & Michael Kosicki & Laura E. Cook & Fabrice Darbellay & Cailyn H. Spurrell & Marco Osterwalder & Yiwen Zhu & Han Wu & Sarah Yasmeen Afzal & Ma, 2024. "Dynamic enhancer landscapes in human craniofacial development," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    7. Jinlong Shi & Zhilong Jia & Jinxiu Sun & Xiaoreng Wang & Xiaojing Zhao & Chenghui Zhao & Fan Liang & Xinyu Song & Jiawei Guan & Xue Jia & Jing Yang & Qi Chen & Kang Yu & Qian Jia & Jing Wu & Depeng Wa, 2023. "Structural variants involved in high-altitude adaptation detected using single-molecule long-read sequencing," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    8. Qingbo S. Wang & Ryuya Edahiro & Ho Namkoong & Takanori Hasegawa & Yuya Shirai & Kyuto Sonehara & Hiromu Tanaka & Ho Lee & Ryunosuke Saiki & Takayoshi Hyugaji & Eigo Shimizu & Kotoe Katayama & Masahir, 2022. "The whole blood transcriptional regulation landscape in 465 COVID-19 infected samples from Japan COVID-19 Task Force," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    9. Xiaoling Tong & Min-Jin Han & Kunpeng Lu & Shuaishuai Tai & Shubo Liang & Yucheng Liu & Hai Hu & Jianghong Shen & Anxing Long & Chengyu Zhan & Xin Ding & Shuo Liu & Qiang Gao & Bili Zhang & Linli Zhou, 2022. "High-resolution silkworm pan-genome provides genetic insights into artificial selection and ecological adaptation," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    10. Parithi Balachandran & Isha A. Walawalkar & Jacob I. Flores & Jacob N. Dayton & Peter A. Audano & Christine R. Beck, 2022. "Transposable element-mediated rearrangements are prevalent in human genomes," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    11. Manako Yamaguchi & Hirofumi Nakaoka & Kazuaki Suda & Kosuke Yoshihara & Tatsuya Ishiguro & Nozomi Yachida & Kyota Saito & Haruka Ueda & Kentaro Sugino & Yutaro Mori & Kaoru Yamawaki & Ryo Tamura & Sun, 2022. "Spatiotemporal dynamics of clonal selection and diversification in normal endometrial epithelium," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    12. Marsha M. Wheeler & Adrienne M. Stilp & Shuquan Rao & Bjarni V. Halldórsson & Doruk Beyter & Jia Wen & Anna V. Mihkaylova & Caitlin P. McHugh & John Lane & Min-Zhi Jiang & Laura M. Raffield & Goo Jun, 2022. "Whole genome sequencing identifies structural variants contributing to hematologic traits in the NHLBI TOPMed program," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    13. Junho Kim & August Yue Huang & Shelby L. Johnson & Jenny Lai & Laura Isacco & Ailsa M. Jeffries & Michael B. Miller & Michael A. Lodato & Christopher A. Walsh & Eunjung Alice Lee, 2022. "Prevalence and mechanisms of somatic deletions in single human neurons during normal aging and in DNA repair disorders," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    14. Ludovica Montanucci & David Lewis-Smith & Ryan L. Collins & Lisa-Marie Niestroj & Shridhar Parthasarathy & Julie Xian & Shiva Ganesan & Marie Macnee & Tobias Brünger & Rhys H. Thomas & Michael Talkows, 2023. "Genome-wide identification and phenotypic characterization of seizure-associated copy number variations in 741,075 individuals," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    15. Vincent Michaud & Eulalie Lasseaux & David J. Green & Dave T. Gerrard & Claudio Plaisant & Tomas Fitzgerald & Ewan Birney & Benoît Arveiler & Graeme C. Black & Panagiotis I. Sergouniotis, 2022. "The contribution of common regulatory and protein-coding TYR variants to the genetic architecture of albinism," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    16. Maxime Estavoyer & Olivier François, 2022. "Theoretical analysis of principal components in an umbrella model of intraspecific evolution," Theoretical Population Biology, Elsevier, vol. 148(C), pages 11-21.
    17. Zhikun Wu & Zehang Jiang & Tong Li & Chuanbo Xie & Liansheng Zhao & Jiaqi Yang & Shuai Ouyang & Yizhi Liu & Tao Li & Zhi Xie, 2021. "Structural variants in the Chinese population and their impact on phenotypes, diseases and population adaptation," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    18. Chi-Fen Chang & Shu-Pin Huang & Yu-Mei Hsueh & Jiun-Hung Geng & Chao-Yuan Huang & Bo-Ying Bao, 2022. "Genetic Analysis Implicates Dysregulation of SHANK2 in Renal Cell Carcinoma Progression," IJERPH, MDPI, vol. 19(19), pages 1-9, September.
    19. Alexendar R. Perez & Laura Sala & Richard K. Perez & Joana A. Vidigal, 2021. "CSC software corrects off-target mediated gRNA depletion in CRISPR-Cas9 essentiality screens," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    20. Kai Yuan & Xumin Ni & Chang Liu & Yuwen Pan & Lian Deng & Rui Zhang & Yang Gao & Xueling Ge & Jiaojiao Liu & Xixian Ma & Haiyi Lou & Taoyang Wu & Shuhua Xu, 2021. "Refining models of archaic admixture in Eurasia with ArchaicSeeker 2.0," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
