
Exploring British accents: Modelling the trap–bath split with functional data analysis

Authors

  • Aranya Koshy
  • Shahin Tavakoli

Abstract

The sound of our speech is influenced by the places we come from. Great Britain contains a wide variety of distinctive accents which are of interest to linguistics. In particular, the ‘a’ vowel in words like ‘class’ is pronounced differently in the North and the South. Speech recordings of this vowel can be represented as formant curves or as mel‐frequency cepstral coefficient curves. Functional data analysis and generalised additive models offer techniques to model the variation in these curves. Our first aim is to model the difference between typical Northern and Southern vowels /æ/ and /ɑ/, by training two classifiers on the North‐South Class Vowels dataset collected for this paper. Our second aim is to visualise geographical variation of accents in Great Britain. For this we use speech recordings from a second dataset, the British National Corpus (BNC) audio edition. The trained models are used to predict the accent of speakers in the BNC, and then we model the geographical patterns in these predictions using a soap film smoother. This work demonstrates a flexible and interpretable approach to modelling phonetic accent variation in speech recordings.
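
As a rough illustration of the kind of curve classification the abstract describes, the sketch below reduces discretised curves to functional principal component scores and fits a logistic classifier on those scores. It is not the authors' method or code: the data are simulated stand-ins rather than formant or MFCC curves, plain principal components and logistic regression are swapped in for the paper's functional and additive models, and every name in it (simulate_curves, fit_fpca, fpca_scores, fit_logistic, predict) is hypothetical.

```python
import numpy as np

def simulate_curves(n_per_class, grid, rng):
    """Simulate two classes of noisy curves (synthetic stand-ins, not real vowel data)."""
    base = np.sin(np.pi * grid)
    means = [base, 1.5 * base]  # assumed class mean curves, chosen for illustration only
    X = np.vstack([m + 0.2 * rng.standard_normal((n_per_class, grid.size)) for m in means])
    y = np.repeat([0, 1], n_per_class)
    return X, y

def fit_fpca(X, n_components):
    """Estimate the mean curve and leading principal components of discretised curves."""
    mean_curve = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean_curve, full_matrices=False)
    return mean_curve, vt[:n_components]

def fpca_scores(X, mean_curve, components):
    """Project centred curves onto the principal components."""
    return (X - mean_curve) @ components.T

def fit_logistic(Z, y, lr=0.5, n_iter=2000):
    """Fit logistic regression (with intercept) by batch gradient descent."""
    Zb = np.column_stack([np.ones(len(Z)), Z])
    w = np.zeros(Zb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Zb @ w)))
        w -= lr * Zb.T @ (p - y) / len(y)
    return w

def predict(Z, w):
    """Predict class 1 when the linear predictor is positive."""
    Zb = np.column_stack([np.ones(len(Z)), Z])
    return (Zb @ w > 0).astype(int)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)           # common time grid for all curves
X_train, y_train = simulate_curves(100, grid, rng)
X_test, y_test = simulate_curves(100, grid, rng)

# Components are estimated on the training curves only, then reused for new curves.
mean_curve, components = fit_fpca(X_train, n_components=3)
w = fit_logistic(fpca_scores(X_train, mean_curve, components), y_train)
y_pred = predict(fpca_scores(X_test, mean_curve, components), w)
accuracy = float((y_pred == y_test).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

Fitting the components on one set of curves and applying them to another loosely mirrors the workflow in the abstract, where models trained on one dataset are used to predict accents in a second one.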

Suggested Citation

  • Aranya Koshy & Shahin Tavakoli, 2022. "Exploring British accents: Modelling the trap–bath split with functional data analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(4), pages 773-805, August.
  • Handle: RePEc:bla:jorssc:v:71:y:2022:i:4:p:773-805
    DOI: 10.1111/rssc.12555

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssc.12555
    Download Restriction: no


    References listed on IDEAS

    1. Davide Pigoli & Pantelis Z. Hadjipantelis & John S. Coleman & John A. D. Aston, 2018. "The statistical analysis of acoustic phonetic data: exploring differences between spoken Romance languages," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 67(5), pages 1103-1145, November.
    2. Shahin Tavakoli & Davide Pigoli & John A. D. Aston & John S. Coleman, 2019. "A Spatial Modeling Approach for Linguistic Object Data: Analyzing Dialect Sound Variations Across Great Britain," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(527), pages 1081-1096, July.
    3. Simon N. Wood, 2011. "Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(1), pages 3-36, January.
    4. Clara Happ & Sonja Greven, 2018. "Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(522), pages 649-659, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Edward Gunning & Steven Golovkine & Andrew J. Simpkin & Aoife Burke & Sarah Dillon & Shane Gore & Kieran Moran & Siobhan O’Connor & Enda White & Norma Bargary, 2025. "Analysing kinematic data from recreational runners using functional data analysis," Computational Statistics, Springer, vol. 40(4), pages 1825-1847, April.
    2. Gerhard Tutz & Moritz Berger, 2018. "Tree-structured modelling of categorical predictors in generalized additive regression," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(3), pages 737-758, September.
    3. Tommaso Luzzati & Angela Parenti & Tommaso Rughi, 2017. "Spatial error regressions for testing the Cancer-EKC," Discussion Papers 2017/218, Dipartimento di Economia e Management (DEM), University of Pisa, Pisa, Italy.
    4. Davide Fiaschi & Andrea Mario Lavezzi & Angela Parenti, 2020. "Deep and Proximate Determinants of the World Income Distribution," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 66(3), pages 677-710, September.
    5. Amira Elayouty & Marian Scott & Claire Miller, 2022. "Time-Varying Functional Principal Components for Non-Stationary EpCO₂ in Freshwater Systems," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 27(3), pages 506-522, September.
    6. Christina Kassara & Christos Barboutis & Anastasios Bounas, 2025. "Favorable stopover sites and fuel load dynamics of spring bird migrants under a changing climate," Climatic Change, Springer, vol. 178(1), pages 1-19, January.
    7. Longhi, Christian & Musolesi, Antonio & Baumont, Catherine, 2014. "Modeling structural change in the European metropolitan areas during the process of economic integration," Economic Modelling, Elsevier, vol. 37(C), pages 395-407.
    8. Dillon T. Fogarty & Caleb P. Roberts & Daniel R. Uden & Victoria M. Donovan & Craig R. Allen & David E. Naugle & Matthew O. Jones & Brady W. Allred & Dirac Twidwell, 2020. "Woody Plant Encroachment and the Sustainability of Priority Conservation Areas," Sustainability, MDPI, vol. 12(20), pages 1-15, October.
    9. Daniel Melser & Robert J. Hill, 2019. "Residential Real Estate, Risk, Return and Diversification: Some Empirical Evidence," The Journal of Real Estate Finance and Economics, Springer, vol. 59(1), pages 111-146, July.
    10. repec:grz:wpaper:2014-05 is not listed on IDEAS
    11. Sara Moscatelli & Simone Pesaresi & Martin Wikelski & Federico Maria Tardella & Andrea Catorci & Giacomo Quattrini, 2025. "Influence of Pasture Diversity and NDVI on Sheep Foraging Behavior in Central Italy," Geographies, MDPI, vol. 5(2), pages 1-14, June.
    12. Devin Kirk & Samantha Straus & Marissa L Childs & Mallory Harris & Lisa Couper & T Jonathan Davies & Coreen Forbes & Alyssa-Lois Gehman & Maya L Groner & Christopher Harley & Kevin D Lafferty & Van Sa, 2024. "Temperature impacts on dengue incidence are nonlinear and mediated by climatic and socioeconomic factors: A meta-analysis," PLOS Climate, Public Library of Science, vol. 3(3), pages 1-18, March.
    13. Ronald E. Gangnon & Natasha K. Stout & Oguzhan Alagoz & John M. Hampton & Brian L. Sprague & Amy Trentham-Dietz, 2018. "Contribution of Breast Cancer to Overall Mortality for US Women," Medical Decision Making, vol. 38(1_suppl), pages 24-31, April.
    14. Yuko Araki & Atsushi Kawaguchi & Fumio Yamashita, 2013. "Regularized logistic discrimination with basis expansions for the early detection of Alzheimer’s disease based on three-dimensional MRI data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 7(1), pages 109-119, March.
    15. Weishampel, Anthony & Staicu, Ana-Maria & Rand, William, 2023. "Classification of social media users with generalized functional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    16. Megan K. Jennings & Emily Haeuser & Diane Foote & Rebecca L. Lewison & Erin Conlisk, 2020. "Planning for Dynamic Connectivity: Operationalizing Robust Decision-Making and Prioritization Across Landscapes Experiencing Climate and Land-Use Change," Land, MDPI, vol. 9(10), pages 1-18, September.
    17. Robert J. Hill & Alicia N. Rambaldi & Michael Scholz, 2021. "Higher frequency hedonic property price indices: a state-space approach," Empirical Economics, Springer, vol. 61(1), pages 417-441, July.
    18. Adam R. Pines & Bart Larsen & Zaixu Cui & Valerie J. Sydnor & Maxwell A. Bertolero & Azeez Adebimpe & Aaron F. Alexander-Bloch & Christos Davatzikos & Damien A. Fair & Ruben C. Gur & Raquel E. Gur & H, 2022. "Dissociable multi-scale patterns of development in personalized brain networks," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    19. Marra, Giampiero & Wood, Simon N., 2011. "Practical variable selection for generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 55(7), pages 2372-2387, July.
    20. Wenyi Lin & Jingjing Zou & Chongzhi Di & Dorothy D. Sears & Cheryl L. Rock & Loki Natarajan, 2023. "Longitudinal Associations Between Timing of Physical Activity Accumulation and Health: Application of Functional Data Methods," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(2), pages 309-329, July.
    21. Archimbaud, Aurore & Boulfani, Feriel & Gendre, Xavier & Nordhausen, Klaus & Ruiz-Gazen, Anne & Virta, Joni, 2025. "ICS for multivariate functional anomaly detection with applications to predictive maintenance and quality control," Econometrics and Statistics, Elsevier, vol. 33(C), pages 282-303.
