Printed from https://ideas.repec.org/a/dem/demres/v53y2025i22.html

Online obituaries as a complementary source of data for mortality in Canada

Author

Listed:
  • Pietro Violo

    (Université de Montréal)

  • Nadine Ouellette

    (Université de Montréal)

Abstract

Background: Obituaries and death notices have existed for centuries as a form of commemoration, particularly in Western countries. With the rise of the internet, these records have become more accessible, presenting a valuable, largely untapped source for mortality research.

Objective: We aim to collect online obituaries through web scraping and evaluate their representativeness, advantages, and limitations for use in mortality studies in Canada’s two largest provinces: Quebec and Ontario.

Methods: We web scraped 236,290 and 288,623 obituaries for Quebec and Ontario, respectively, spanning the years 2017 to 2022. Using regular expressions, a formal language for defining text-search patterns, we derived demographic variables from the text to compute mortality measures, which we then compared to a gold-standard vital statistics dataset.

Results: Although obituaries in Quebec and Ontario respectively account for only half and one-third of all recorded deaths, the age and gender distributions they capture closely align with those of the general population. Infant deaths remain notably underrepresented. Life expectancy estimates derived from obituaries exceed official figures by 0.4 years for women and 0.5 for men, while the modal age at death is slightly underestimated. Despite these limitations, the timeliness and demographic representativeness of online obituaries make them a valuable supplement to conventional mortality datasets in Canada.

Contribution: This study draws attention to an underused data source by leveraging Canada’s bilingual context and developing methods for extracting demographic information from both French and English obituaries. We contribute to digital and computational demography by detailing techniques for web scraping, data cleaning, extraction, and validation, and by assessing coverage, age structures, gender disparities, and inherent biases in this type of textual data.
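
The regular-expression step described under Methods can be illustrated with a minimal sketch. It is not the authors' code: the patterns, the function name `extract_age`, the plausibility bound of 122 years, and the sample sentences are all assumptions made for illustration, and real obituaries use far more varied phrasing than these two patterns cover.

```python
import re
from typing import Optional

# Hypothetical patterns (assumptions, not taken from the paper).
AGE_PATTERNS = [
    # English: "at the age of 87", "aged 87"
    re.compile(r"\b(?:at\s+the\s+age\s+of|aged)\s+(\d{1,3})\b", re.IGNORECASE),
    # French: "à l'âge de 87 ans" (straight or typographic apostrophe)
    re.compile(r"\bà\s+l['’]âge\s+de\s+(\d{1,3})\s+ans\b", re.IGNORECASE),
]

MAX_PLAUSIBLE_AGE = 122  # assumed upper bound used to discard parsing errors


def extract_age(text: str) -> Optional[int]:
    """Return the first plausible age at death found in the text, or None."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(text)
        if match:
            age = int(match.group(1))
            if 0 <= age <= MAX_PLAUSIBLE_AGE:
                return age
    return None


if __name__ == "__main__":
    # Invented sample sentences, for demonstration only.
    samples = [
        "Peacefully at the age of 87, Mary Smith passed away in Toronto.",
        "Est décédé à l'âge de 91 ans, M. Jean Tremblay.",
        "She will be dearly missed.",
    ]
    for sample in samples:
        print(extract_age(sample), "<-", sample)
```

Returning `None` when no pattern matches keeps unparsed obituaries visible, so the share of records with a missing age can be reported alongside the derived measures rather than silently dropped.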

Suggested Citation

  • Pietro Violo & Nadine Ouellette, 2025. "Online obituaries as a complementary source of data for mortality in Canada," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 53(22), pages 661-704.
  • Handle: RePEc:dem:demres:v:53:y:2025:i:22
    DOI: 10.4054/DemRes.2025.53.22

    Download full text from publisher

    File URL: https://www.demographic-research.org/volumes/vol53/22/53-22.pdf
    Download Restriction: no


    References listed on IDEAS

    1. Debra J. Bassett, 2015. "Who Wants to Live Forever? Living, Dying and Grieving in Our Digital Society," Social Sciences, MDPI, vol. 4(4), pages 1-13, November.
    2. Camarda, Carlo G., 2012. "MortalitySmooth: An R Package for Smoothing Poisson Counts with P-Splines," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 50(i01).
    3. Jeremy Ginsberg & Matthew H. Mohebbi & Rajan S. Patel & Lynnette Brammer & Mark S. Smolinski & Larry Brilliant, 2009. "Detecting influenza epidemics using search engine query data," Nature, Nature, vol. 457(7232), pages 1012-1014, February.
    4. Diego Alburez-Gutierrez & Ugofilippo Basellini & Emilio Zagheni, 2025. "When do mothers bury a child? Heterogeneity in the maternal age at offspring loss," Population Studies, Taylor & Francis Journals, vol. 79(1), pages 45-57, January.
    5. Ariel Karlinsky, 2024. "International completeness of death registration," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 50(38), pages 1151-1170.
    6. Nadine Ouellette & Robert Bourbeau, 2011. "Changes in the age-at-death distribution in four low mortality countries: A nonparametric approach," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 25(19), pages 595-628.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ugofilippo Basellini & Søren Kjærgaard & Carlo Giovanni Camarda, 2020. "An age-at-death distribution approach to forecast cohort mortality," Working Papers axafx5_3agsuwaphvlfk, French Institute for Demographic Studies.
    2. Basellini, Ugofilippo & Kjærgaard, Søren & Camarda, Carlo Giovanni, 2020. "An age-at-death distribution approach to forecast cohort mortality," Insurance: Mathematics and Economics, Elsevier, vol. 91(C), pages 129-143.
    3. Marco Bonetti & Ugofilippo Basellini, 2021. "Epilocal: A real-time tool for local epidemic monitoring," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 44(12), pages 307-332.
    4. Cosmo Strozza & Marie-Pier Bergeron-Boucher & Julia Callaway & Sven Drefahl, 2024. "Forecasting Inequalities in Survival to Retirement Age by Socioeconomic Status in Denmark and Sweden," European Journal of Population, Springer;European Association for Population Studies, vol. 40(1), pages 1-28, December.
    5. Paola Vázquez-Castillo & Trifon Missov & Marie-Pier Bergeron-Boucher, 2024. "Longevity à la mode: A discretized derivative tests method for accurate estimation of the adult modal age at death," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 50(11), pages 325-346.
    6. Carlo Giovanni Camarda, 2019. "Smooth constrained mortality forecasting," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 41(38), pages 1091-1130.
    7. Viorela Diaconu & Nadine Ouellette & Robert Bourbeau, 2020. "Modal lifespan and disparity at older ages by leading causes of death: a Canada-U.S. comparison," Journal of Population Research, Springer, vol. 37(4), pages 323-344, December.
    8. Viorela Diaconu & Robert Bourbeau & Nadine Ouellette & Carlo Giovanni Camarda, 2016. "Insight on 'typical' longevity: An analysis of the modal lifespan by leading causes of death in Canada," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 35(17), pages 471-504.
    9. Xiaoli Wang & Shuangsheng Wu & C Raina MacIntyre & Hongbin Zhang & Weixian Shi & Xiaomin Peng & Wei Duan & Peng Yang & Yi Zhang & Quanyi Wang, 2015. "Using an Adjusted Serfling Regression Model to Improve the Early Warning at the Arrival of Peak Timing of Influenza in Beijing," PLOS ONE, Public Library of Science, vol. 10(3), pages 1-14, March.
    10. Markowitz, Sara & Nesson, Erik & Robinson, Joshua J., 2019. "The effects of employment on influenza rates," Economics & Human Biology, Elsevier, vol. 34(C), pages 286-295.
    11. Bentzen, Jeanet Sinding, 2021. "In crisis, we pray: Religiosity and the COVID-19 pandemic," Journal of Economic Behavior & Organization, Elsevier, vol. 192(C), pages 541-583.
    12. Jesse T. Richman & Ryan J. Roberts, 2023. "Assessing Spurious Correlations in Big Search Data," Forecasting, MDPI, vol. 5(1), pages 1-12, February.
    13. repec:plo:pone00:0018687 is not listed on IDEAS
    14. Grechyna, Daryna, 2025. "Raising awareness of climate change: Nature, activists, politicians?," Ecological Economics, Elsevier, vol. 227(C).
    15. Yangkun Huang & Xiaoping Xu & Sini Su, 2021. "Diverging from News Media: An Exploratory Study on the Changing Dynamics between Media and Public Attention on Cancer in China from 2011–2020," IJERPH, MDPI, vol. 18(16), pages 1-13, August.
    16. Sean Coogan & Zhixian Sui & David Raubenheimer, 2018. "Gluttony and guilt: monthly trends in internet search query data are comparable with national-level energy intake and dieting behavior," Humanities and Social Sciences Communications, Palgrave Macmillan, vol. 4(1), pages 1-9, December.
    17. Borah, Abhishek & Rutz, Oliver, 2024. "Enhanced sales forecasting model using textual search data: Fusing dynamics with big data," International Journal of Research in Marketing, Elsevier, vol. 41(4), pages 632-647.
    18. Tobias Preis & Federico Botta & Helen Susannah Moat, 2020. "Sensing global tourism numbers with millions of publicly shared online photographs," Environment and Planning A, vol. 52(3), pages 471-477, May.
    19. Liwen Ling & Dabin Zhang & Shanying Chen & Amin W. Mugera, 2020. "Can online search data improve the forecast accuracy of pork price in China?," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 39(4), pages 671-686, July.
    20. Klaus Ackermann & Simon D Angus & Paul A Raschky, 2020. "Estimating Sleep and Work Hours from Alternative Data by Segmented Functional Classification Analysis, SFCA," SoDa Laboratories Working Paper Series 2020-04, Monash University, SoDa Laboratories.
    21. Daniele Barchiesi & Helen Susannah Moat & Christian Alis & Steven Bishop & Tobias Preis, 2015. "Quantifying International Travel Flows Using Flickr," PLOS ONE, Public Library of Science, vol. 10(7), pages 1-8, July.

    More about this item

    JEL classification:

    • J1 - Labor and Demographic Economics - - Demographic Economics
    • Z0 - Other Special Topics - - General
