
Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data

Author

Listed:
  • Veale, Michael
  • Binns, Reuben

Abstract

Cite as: Veale, Michael and Binns, Reuben (2017) Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data. Big Data & Society 4(2). doi:10.1177/2053951717743530

Decisions based on algorithmic, machine learning models can be unfair, reproducing biases in historical data used to train them. While computational techniques are emerging to address aspects of these concerns through communities such as discrimination-aware data mining (DADM) and fair, accountable and transparent machine learning (FATML), their practical implementation faces real-world challenges. For legal, institutional or commercial reasons, organisations might not hold the data on sensitive attributes such as gender, ethnicity, sexuality or disability needed to diagnose and mitigate emergent indirect discrimination-by-proxy, such as redlining. Such organisations might also lack the knowledge and capacity to identify and manage fairness issues that are emergent properties of complex sociotechnical systems.

This paper presents and discusses three potential approaches to deal with such knowledge and information deficits in the context of fairer machine learning. Trusted third parties could selectively store data necessary for performing discrimination discovery and incorporating fairness constraints into model-building in a privacy-preserving manner. Collaborative online platforms would allow diverse organisations to record, share and access contextual and experiential knowledge to promote fairness in machine learning systems. Finally, unsupervised learning and pedagogically interpretable algorithms might allow fairness hypotheses to be built for further selective testing and exploration.

Real-world fairness challenges in machine learning are not abstract, constrained optimisation problems, but are institutionally and contextually grounded. Computational fairness tools are useful, but must be researched and developed in and with the messy contexts that will shape their deployment, rather than just for imagined situations. Not doing so risks real, near-term algorithmic harm.

Suggested Citation

  • Veale, Michael & Binns, Reuben, 2017. "Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data," SocArXiv ustxg, Center for Open Science.
  • Handle: RePEc:osf:socarx:ustxg
    DOI: 10.31219/osf.io/ustxg

    Download full text from publisher

    File URL: https://osf.io/download/59f3559c9ad5a1026d107902/
    Download Restriction: no


    References listed on IDEAS

    1. Munawer Sultan Khwaja & Rajul Awasthi & Jan Loeprick, 2011. "Risk-Based Tax Audits : Approaches and Country Experiences," World Bank Publications - Books, The World Bank Group, number 2314, December.
    2. Veale, Michael, 2017. "Logics and practices of transparency and opacity in real-world applications of public sector machine learning," SocArXiv 6cdhe, Center for Open Science.
    3. repec:elg:eebook:14251 is not listed on IDEAS
    4. Christian Bizer & Tom Heath & Tim Berners-Lee, 2009. "Linked Data - The Story So Far," International Journal on Semantic Web and Information Systems (IJSWIS), IGI Global, vol. 5(3), pages 1-22, July.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Matus, Kira & Veale, Michael, 2021. "Certification Systems for Machine Learning: Lessons from Sustainability," SocArXiv pm3wy, Center for Open Science.
    2. Brielle Lillywhite & Gregor Wolbring, 2022. "Emergency and Disaster Management, Preparedness, and Planning (EDMPP) and the ‘Social’: A Scoping Review," Sustainability, MDPI, vol. 14(20), pages 1-50, October.
    3. Alexandru Constantin Ciobanu & Gabriela Meșniță, 2021. "AI Ethics in Business - A Bibliometric Approach," Review of Economic and Business Studies, Alexandru Ioan Cuza University, Faculty of Economics and Business Administration, issue 28, pages 169-202, December.
    4. Moritz Zahn & Stefan Feuerriegel & Niklas Kuehl, 2022. "The Cost of Fairness in AI: Evidence from E-Commerce," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 64(3), pages 335-348, June.
    5. Veale, Michael & Brass, Irina, 2019. "Administration by Algorithm? Public Management meets Public Sector Machine Learning," SocArXiv mwhnb, Center for Open Science.
    6. Veale, Michael & Van Kleek, Max & Binns, Reuben, 2018. "Fairness and Accountability Design Needs for Algorithmic Support in High-Stakes Public Sector Decision-Making," SocArXiv 8kvf4, Center for Open Science.
    7. Alina Köchling & Marius Claus Wehner, 2020. "Discriminated by an algorithm: a systematic review of discrimination and fairness by algorithmic decision-making in the context of HR recruitment and HR development," Business Research, Springer;German Academic Association for Business Research, vol. 13(3), pages 795-848, November.
    8. Veale, Michael & Binns, Reuben & Van Kleek, Max, 2018. "Some HCI Priorities for GDPR-Compliant Machine Learning," LawArXiv wm6yk, Center for Open Science.
    9. Irene Unceta & Jordi Nin & Oriol Pujol, 2020. "Risk mitigation in algorithmic accountability: The role of machine learning copies," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-26, November.
    10. Kira J.M. Matus & Michael Veale, 2022. "Certification systems for machine learning: Lessons from sustainability," Regulation & Governance, John Wiley & Sons, vol. 16(1), pages 177-196, January.
    11. Simerta Gill & Gregor Wolbring, 2022. "Auditing the ‘Social’ Using Conventions, Declarations, and Goal Setting Documents: A Scoping Review," Societies, MDPI, vol. 12(6), pages 1-100, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dina Pomeranz & José Vila-Belda, 2019. "Taking State-Capacity Research to the Field: Insights from Collaborations with Tax Authorities," Annual Review of Economics, Annual Reviews, vol. 11(1), pages 755-781, August.
    2. Anne E Thessen & Cynthia Sims Parr, 2014. "Knowledge Extraction and Semantic Annotation of Text from the Encyclopedia of Life," PLOS ONE, Public Library of Science, vol. 9(3), pages 1-10, March.
    3. Kurt Sandkuhl & Hans-Georg Fill & Stijn Hoppenbrouwers & John Krogstie & Florian Matthes & Andreas Opdahl & Gerhard Schwabe & Ömer Uludag & Robert Winter, 2018. "From Expert Discipline to Common Practice: A Vision and Research Agenda for Extending the Reach of Enterprise Modeling," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 60(1), pages 69-80, February.
    4. Dai, Zhixin & Hogarth, Robin M. & Villeval, Marie Claire, 2015. "Ambiguity on audits and cooperation in a public goods game," European Economic Review, Elsevier, vol. 74(C), pages 146-162.
    5. Phillip Lord & Simon Cockell & Robert Stevens, 2012. "Three Steps to Heaven: Semantic Publishing in a Real World Workflow," Future Internet, MDPI, vol. 4(4), pages 1-12, November.
    6. Henselmann, Klaus & Haller, Stefanie, 2017. "Potentielle Risikofaktoren für die Erhöhung der Betriebsprüfungswahrscheinlichkeit - Eine analytische und empirische Untersuchung auf Basis der E-Bilanz-Taxonomie 6.0 -," Working Papers in Accounting Valuation Auditing 2017-1, Friedrich-Alexander University Erlangen-Nuremberg, Chair of Accounting and Auditing.
    7. Wuhui Chen & Incheon Paik, 2013. "Improving efficiency of service discovery using Linked data-based service publication," Information Systems Frontiers, Springer, vol. 15(4), pages 613-625, September.
    8. E. G. Stephan & T. O. Elsethagen & L. K. Berg & M. C. Macduff & P. R. Paulson & W. J. Shaw & C. Sivaraman & W. P. Smith & A. Wynne, 2016. "Semantic catalog of things, services, and data to support a wind data management facility," Information Systems Frontiers, Springer, vol. 18(4), pages 679-691, August.
    9. Hossein Hassani & Xu Huang & Mansi Ghodsi, 2018. "Big Data and Causality," Annals of Data Science, Springer, vol. 5(2), pages 133-156, June.
    10. Muhammad Sajid Qureshi & Ali Daud, 2021. "Fine-grained academic rankings: mapping affiliation of the influential researchers with the top ranked HEIs," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(10), pages 8331-8361, October.
    11. Costantino Thanos, 2017. "Research Data Reusability: Conceptual Foundations, Barriers and Enabling Technologies," Publications, MDPI, vol. 5(1), pages 1-19, January.
    12. Raymond Y. K. Lau & J. Leon Zhao & Wenping Zhang & Yi Cai & Eric W. T. Ngai, 2015. "Learning Context-Sensitive Domain Ontologies from Folksonomies: A Cognitively Motivated Method," INFORMS Journal on Computing, INFORMS, vol. 27(3), pages 561-578, August.
    13. Muhammad Ahtisham Aslam & Naif Radi Aljohani, 2017. "SPedia: A Central Hub for the Linked Open Data of Scientific Publications," International Journal on Semantic Web and Information Systems (IJSWIS), IGI Global, vol. 13(1), pages 128-147, January.
    14. Costantino Thanos, 2016. "A Vision for Open Cyber-Scholarly Infrastructures," Publications, MDPI, vol. 4(2), pages 1-18, May.
    15. Geser, G. & Jaques, Y. & Manouselis, Nikos & Protonotarios, Vassilis & Keizer, J. & Sicilia, M., 2012. "Building Blocks for a Data Infrastructure and Services to Empower Agricultural Research Communities," AGRIS on-line Papers in Economics and Informatics, Czech University of Life Sciences Prague, Faculty of Economics and Management, vol. 4(4), pages 1-8, December.
    16. Heimstädt, Maximilian & Saunderson, Fredric & Heath, Tom, 2014. "Conceptualizing Open Data ecosystems: A timeline analysis of Open Data development in the UK," Discussion Papers 2014/12, Free University Berlin, School of Business & Economics.
    17. Alex Coletti & Antonio De Nicola & Maria Luisa Villani, 2016. "Building climate change into risk assessments," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 84(2), pages 1307-1325, November.
    18. Eberhartinger, Eva & Safaei, Reyhaneh & Sureth, Caren & Wu, Yuchen, 2021. "Are risk-based tax audit strategies rewarded? An analysis of corporate tax avoidance," arqus Discussion Papers in Quantitative Tax Research 267, arqus - Arbeitskreis Quantitative Steuerlehre.
    19. Heimstädt, Maximilian, 2017. "Openwashing: A decoupling perspective on organizational transparency," Technological Forecasting and Social Change, Elsevier, vol. 125(C), pages 77-86.
    20. Semjén, András, 2017. "Az adózói magatartás különféle magyarázatai [Various explanations for tax compliance]," Közgazdasági Szemle (Economic Review - monthly of the Hungarian Academy of Sciences), Közgazdasági Szemle Alapítvány (Economic Review Foundation), vol. 0(2), pages 140-184.

