
Online obituaries as a complementary source of data for mortality in Canada

Authors

  • Pietro Violo (Université de Montréal)
  • Nadine Ouellette (Université de Montréal)

Abstract

Background: Obituaries and death notices have existed for centuries as a form of commemoration, particularly in Western countries. With the rise of the internet, these records have become more accessible, presenting a valuable, largely untapped source for mortality research.

Objective: We aim to collect online obituaries through web scraping and evaluate their representativeness, advantages, and limitations for use in mortality studies in Canada’s two largest provinces: Quebec and Ontario.

Methods: We web scraped 236,290 and 288,623 obituaries for Quebec and Ontario, respectively, spanning the years 2017 to 2022. Using regular expressions, a formal language for defining text-search patterns, we derived demographic variables from the text to compute mortality measures, which we then compared to a gold-standard vital statistics dataset.

Results: Although obituaries in Quebec and Ontario respectively account for only half and one-third of all recorded deaths, the age and gender distributions they capture closely align with those of the general population. Infant deaths remain notably underrepresented. Life expectancy estimates derived from obituaries exceed official figures by 0.4 years for women and 0.5 for men, while the modal age at death is slightly underestimated. Despite these limitations, the timeliness and demographic representativeness of online obituaries make them a valuable supplement to conventional mortality datasets in Canada.

Contribution: This study draws attention to an underused data source by leveraging Canada’s bilingual context and developing methods for extracting demographic information from both French and English obituaries. We contribute to digital and computational demography by detailing techniques for web scraping, data cleaning, extraction, and validation, and by assessing coverage, age structures, gender disparities, and inherent biases in this type of textual data.
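
The regular-expression step described under Methods can be illustrated with a minimal Python sketch. The abstract does not give the authors' actual expressions, so the patterns, the `extract_age` function name, and the plausibility bound of 122 years below are all assumptions chosen for illustration, covering only a few common English and French phrasings.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical patterns (not taken from the paper): each captures a stated age at death.
AGE_PATTERNS = [
    # English, e.g. "at the age of 87", "aged 87", "age 87"
    re.compile(r"\b(?:at\s+the\s+age\s+of|aged|age)\s+(\d{1,3})\b", re.IGNORECASE),
    # French, e.g. "à l'âge de 92 ans" (straight or typographic apostrophe)
    re.compile(r"\bà\s+l['’]âge\s+de\s+(\d{1,3})\s+ans\b", re.IGNORECASE),
]

MAX_PLAUSIBLE_AGE = 122  # assumed upper bound used to reject obvious mismatches


def extract_age(text: str) -> Optional[int]:
    """Return the first plausible age at death stated in the text, or None."""
    # Normalize to NFC so accented letters match whether or not the
    # scraped page stored them as base letter plus combining mark.
    text = unicodedata.normalize("NFC", text)
    for pattern in AGE_PATTERNS:
        match = pattern.search(text)
        if match:
            age = int(match.group(1))
            if 0 <= age <= MAX_PLAUSIBLE_AGE:
                return age
    return None


if __name__ == "__main__":
    samples = [
        "Passed away peacefully at the age of 87.",
        "Est décédée à l'âge de 92 ans, entourée des siens.",
        "In loving memory of a devoted father.",
    ]
    for sample in samples:
        print(extract_age(sample), "<-", sample)
```

A real pipeline would need many more patterns (for example ages given in months for infants, or ages implied by birth and death dates) and a validation step against a hand-coded sample; this sketch only shows the general shape of pattern-based extraction from bilingual text.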

Suggested Citation

  • Pietro Violo & Nadine Ouellette, 2025. "Online obituaries as a complementary source of data for mortality in Canada," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 53(22), pages 661-704.
  • Handle: RePEc:dem:demres:v:53:y:2025:i:22
    DOI: 10.4054/DemRes.2025.53.22

    Download full text from publisher

    File URL: https://www.demographic-research.org/volumes/vol53/22/53-22.pdf
    Download Restriction: no


    References listed on IDEAS

    1. Debra J. Bassett, 2015. "Who Wants to Live Forever? Living, Dying and Grieving in Our Digital Society," Social Sciences, MDPI, vol. 4(4), pages 1-13, November.
    2. Jeremy Ginsberg & Matthew H. Mohebbi & Rajan S. Patel & Lynnette Brammer & Mark S. Smolinski & Larry Brilliant, 2009. "Detecting influenza epidemics using search engine query data," Nature, Nature, vol. 457(7232), pages 1012-1014, February.
    3. Diego Alburez-Gutierrez & Ugofilippo Basellini & Emilio Zagheni, 2025. "When do mothers bury a child? Heterogeneity in the maternal age at offspring loss," Population Studies, Taylor & Francis Journals, vol. 79(1), pages 45-57, January.
    4. Ariel Karlinsky, 2024. "International completeness of death registration," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 50(38), pages 1151-1170.
    5. Nadine Ouellette & Robert Bourbeau, 2011. "Changes in the age-at-death distribution in four low mortality countries: A nonparametric approach," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 25(19), pages 595-628.
    6. Camarda, Carlo G., 2012. "MortalitySmooth: An R Package for Smoothing Poisson Counts with P-Splines," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 50(i01).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ugofilippo Basellini & Søren Kjærgaard & Carlo Giovanni Camarda, 2020. "An age-at-death distribution approach to forecast cohort mortality," Working Papers axafx5_3agsuwaphvlfk, French Institute for Demographic Studies.
    2. Basellini, Ugofilippo & Kjærgaard, Søren & Camarda, Carlo Giovanni, 2020. "An age-at-death distribution approach to forecast cohort mortality," Insurance: Mathematics and Economics, Elsevier, vol. 91(C), pages 129-143.
    3. Marco Bonetti & Ugofilippo Basellini, 2021. "Epilocal: A real-time tool for local epidemic monitoring," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 44(12), pages 307-332.
    4. Cosmo Strozza & Marie-Pier Bergeron-Boucher & Julia Callaway & Sven Drefahl, 2024. "Forecasting Inequalities in Survival to Retirement Age by Socioeconomic Status in Denmark and Sweden," European Journal of Population, Springer;European Association for Population Studies, vol. 40(1), pages 1-28, December.
    5. Paola Vázquez-Castillo & Trifon Missov & Marie-Pier Bergeron-Boucher, 2024. "Longevity à la mode: A discretized derivative tests method for accurate estimation of the adult modal age at death," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 50(11), pages 325-346.
    6. Carlo Giovanni Camarda, 2019. "Smooth constrained mortality forecasting," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 41(38), pages 1091-1130.
    7. Viorela Diaconu & Nadine Ouellette & Robert Bourbeau, 2020. "Modal lifespan and disparity at older ages by leading causes of death: a Canada-U.S. comparison," Journal of Population Research, Springer, vol. 37(4), pages 323-344, December.
    8. Viorela Diaconu & Robert Bourbeau & Nadine Ouellette & Carlo Giovanni Camarda, 2016. "Insight on 'typical' longevity: An analysis of the modal lifespan by leading causes of death in Canada," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 35(17), pages 471-504.
    9. David H Chae & Sean Clouston & Mark L Hatzenbuehler & Michael R Kramer & Hannah L F Cooper & Sacoby M Wilson & Seth I Stephens-Davidowitz & Robert S Gold & Bruce G Link, 2015. "Association between an Internet-Based Measure of Area Racism and Black Mortality," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-12, April.
    10. Xiaoli Wang & Shuangsheng Wu & C Raina MacIntyre & Hongbin Zhang & Weixian Shi & Xiaomin Peng & Wei Duan & Peng Yang & Yi Zhang & Quanyi Wang, 2015. "Using an Adjusted Serfling Regression Model to Improve the Early Warning at the Arrival of Peak Timing of Influenza in Beijing," PLOS ONE, Public Library of Science, vol. 10(3), pages 1-14, March.
    11. Ishani Chaudhuri & Parthajit Kayal, 2022. "Predicting Power of Ticker Search Volume in Indian Stock Market," Working Papers 2022-214, Madras School of Economics,Chennai,India.
    12. Yang, Xin & Pan, Bing & Evans, James A. & Lv, Benfu, 2015. "Forecasting Chinese tourist volume with search engine data," Tourism Management, Elsevier, vol. 46(C), pages 386-397.
    13. Kuchler, Theresa & Russel, Dominic & Stroebel, Johannes, 2022. "JUE Insight: The geographic spread of COVID-19 correlates with the structure of social networks as measured by Facebook," Journal of Urban Economics, Elsevier, vol. 127(C).
    14. Markowitz, Sara & Nesson, Erik & Robinson, Joshua J., 2019. "The effects of employment on influenza rates," Economics & Human Biology, Elsevier, vol. 34(C), pages 286-295.
    15. Bentzen, Jeanet Sinding, 2021. "In crisis, we pray: Religiosity and the COVID-19 pandemic," Journal of Economic Behavior & Organization, Elsevier, vol. 192(C), pages 541-583.
    16. Jesse T. Richman & Ryan J. Roberts, 2023. "Assessing Spurious Correlations in Big Search Data," Forecasting, MDPI, vol. 5(1), pages 1-12, February.
    17. Linus Schiöler & Marianne Frisén, 2012. "Multivariate outbreak detection," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(2), pages 223-242, April.
    18. Sasikiran Kandula & Jeffrey Shaman, 2019. "Reappraising the utility of Google Flu Trends," PLOS Computational Biology, Public Library of Science, vol. 15(8), pages 1-16, August.
    19. Daniel E. O'Leary, 2024. "Toward an extended framework of exhaust data for predictive analytics: An empirical approach," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 31(2), June.
    20. Grechyna, Daryna, 2025. "Raising awareness of climate change: Nature, activists, politicians?," Ecological Economics, Elsevier, vol. 227(C).

    More about this item

    JEL classification:

    • J1 - Labor and Demographic Economics - - Demographic Economics
    • Z0 - Other Special Topics - - General
