Social Determinants in Machine Learning Cardiovascular Disease Prediction Models: A Systematic Review

  • Yuan Zhao
    Department of Epidemiology, NYU School of Global Public Health, New York University, New York, New York
  • Erica P. Wood
    Department of Social and Behavioral Sciences, NYU School of Global Public Health, New York University, New York, New York
  • Nicholas Mirin
    Department of Social and Behavioral Sciences, NYU School of Global Public Health, New York University, New York, New York
  • Stephanie H. Cook
    Department of Social and Behavioral Sciences, NYU School of Global Public Health, New York University, New York, New York

    Department of Biostatistics, NYU School of Global Public Health, New York University, New York, New York
  • Rumi Chunara
    Address correspondence to: Rumi Chunara, SM, PhD, Department of Computer Science & Engineering, Tandon School of Engineering, New York University, 370 Jay Street, 1106, Brooklyn NY 11201.
    Department of Biostatistics, NYU School of Global Public Health, New York University, New York, New York

    Department of Computer Science and Engineering, NYU Tandon School of Engineering, New York University, Brooklyn, New York


      Cardiovascular disease is the leading cause of death worldwide, and cardiovascular disease burden is increasing in low-resource settings and for lower socioeconomic groups. Machine learning algorithms are being developed rapidly and incorporated into clinical practice for cardiovascular disease prediction and treatment decisions. Significant opportunities for reducing death and disability from cardiovascular disease worldwide lie with accounting for the social determinants of cardiovascular outcomes. This study reviews how social determinants of health are being included in machine learning algorithms to inform best practices for the development of algorithms that account for social determinants.


      A systematic review using 5 databases was conducted in 2020. English language articles from any location published from inception to April 10, 2020, which reported on the use of machine learning for cardiovascular disease prediction that incorporated social determinants of health, were included.


      Most studies that compared machine learning algorithms and regression showed increased performance of machine learning, and most studies that compared performance with or without social determinants of health showed increased performance with them. The most frequently included social determinants of health variables were gender, race/ethnicity, marital status, occupation, and income. Studies were largely from North America, Europe, and China, limiting the diversity of the included populations and variance in social determinants of health.


      Given their flexibility, machine learning approaches may provide an opportunity to incorporate the complex nature of social determinants of health. The limited variety of sources and data in the reviewed studies emphasizes that there is an opportunity to include more social determinants of health variables, especially environmental ones, that are known to impact cardiovascular disease risk, and that recording such data in electronic databases will enable their use.
      Published in: American Journal of Preventive Medicine

