Poor Individual Risk Classification From Adverse Childhood Experiences Screening

Introduction: Adverse childhood experiences confer an increased risk for physical and mental health problems across the population, prompting calls for routine clinical screening based on reported adverse childhood experience exposure. However, recent longitudinal research has questioned whether adverse childhood experiences can accurately identify ill health at an individual level. Methods: Revisiting data collected for the Adverse Childhood Experience Study between 1995 and 1997, this study derived approximate area under the curve estimates to test the ability of the retrospectively reported adverse childhood experience score to discriminate between adults with and without a range of common health risk factors and disease conditions. Furthermore, the classification accuracy of a recommended clinical definition for high-risk exposure (≥4 versus 0−3 adverse childhood experiences) was evaluated on the basis of sensitivity, specificity, positive and negative predictive values, and positive likelihood ratios. Results: Across all health outcomes, the levels of discrimination for the continuous adverse childhood experience score ranged from very poor to fair (area under the curve=0.50−0.76). The binary classification of ≥4 versus 0−3 adverse childhood experiences yielded high specificity (true-negative detection) and negative predictive values (absence of ill health among low-risk adverse childhood experience groups). However, sensitivity (true-positive detection) and positive predictive values (presence of ill health among high-risk adverse childhood experience groups) were low, whereas positive likelihood ratios suggested only minimal-to-moderate increases in health risks among individuals reporting ≥4 adverse childhood experiences versus that among those reporting 0−3. Conclusions: These findings suggest that screening based on the adverse childhood experience score does not accurately identify those individuals at high risk of health problems. This can lead to both allocation of unnecessary interventions and lack of provision of necessary support.


INTRODUCTION
Adverse childhood experiences (ACEs), including abuse, neglect, and household dysfunction (e.g., parental incarceration, mental illness, substance use), are consistently associated with elevated risk for physical and mental health conditions. 1,2 Given the global ubiquity of violence against children and broader family adversity, various regional, national, and international health bodies have advocated for routine ACE screening. 3,4 These efforts, they argue, would help to mitigate detrimental health effects by identifying individuals who may benefit from preventive interventions or other support services.
However, the clinical utility of widespread ACE screening has been questioned. 5−8 In particular, evidence linking high-risk ACE exposure (i.e., experiencing ≥4 different types of ACEs) 2,9 with ill health is largely based on population-level risk estimates (ORs/risk ratios). These metrics can quantify the mean differences in health risks between groups exposed to ≥4 ACEs and those exposed to 0−3 ACEs. However, to identify at-risk individuals, ACE scores must demonstrate sufficient discrimination (i.e., accurately differentiate between those with and without ill health). 7 Recent analyses of 2 population-representative birth cohorts found that the ability of the established high-risk cut off to discriminate individuals with later health problems from those without was universally poor, casting doubt on the ability of ACE screening to inform individual risk classification. 10 However, because these general population findings relied on ad hoc ACE measures that only partly reflect current screening practices, there is a need to estimate generalizability to higher-risk, treatment-seeking samples that utilize the conventional ACE questionnaire.

METHODS
Analyses revisited the seminal ACE Study, the first to systematically link ACEs and ill health. 9 In brief, among 8,506 adults (mean age=56.1 years, 52.1% female) who retrospectively reported ACEs after medical evaluation between 1995 and 1997, those reporting ≥4 ACEs (versus those reporting none) were significantly more likely to exhibit the most common health risk factors (e.g., smoking, obesity) and disease conditions (e.g., cancer, stroke) in the U.S.
In 2021, data from the original publication 9 were used to quantify individual discrimination and classification in 2 ways. First, for the continuous ACE score, the approximate area under the curve (AUC) 11 statistics were derived by trapezoidal integration of observed discriminative ability across several dichotomous cut offs. Second, for high-risk ACE exposure (≥4 versus 0−3 ACEs), individual classification was quantified using the following measures: sensitivity, specificity, positive/negative predictive values, and positive likelihood ratios. 12
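The two quantification steps described above can be illustrated with a minimal Python sketch. It is not the authors' code (the original analysis used R), and all counts and ROC points in the demonstration are hypothetical values chosen only to show the arithmetic. The function names are illustrative, and anchoring the curve at (0, 0) and (1, 1) before integrating is an assumption that the source does not confirm.

```python
from typing import Dict, List, Tuple


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Screening metrics from a 2x2 table.

    "Positive" means >=4 ACEs; tp/fp/fn/tn count people by screen result
    and by presence of the health outcome.
    """
    sensitivity = tp / (tp + fn)            # positives among those with the outcome
    specificity = tn / (tn + fp)            # negatives among those without it
    false_positive_rate = fp / (fp + tn)    # equals 1 - specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),              # outcome present given a positive screen
        "npv": tn / (tn + fn),              # outcome absent given a negative screen
        "lr_pos": (sensitivity / false_positive_rate
                   if false_positive_rate > 0 else float("inf")),
    }


def approximate_auc(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under ROC points given as (1 - specificity, sensitivity).

    One point per cut off (e.g., >=1, >=2, >=3, >=4 ACEs). The end points
    (0, 0) and (1, 1) are added here as an assumption of this sketch.
    """
    roc = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(roc, roc[1:]):
        area += (x1 - x0) * (y0 + y1) / 2   # area of one trapezoid
    return area


if __name__ == "__main__":
    # Hypothetical 2x2 counts for a single outcome (not ACE Study data).
    print(classification_metrics(tp=30, fp=70, fn=170, tn=730))
    # Hypothetical ROC points for the four cut offs (not ACE Study data).
    print(approximate_auc([(0.5, 0.6), (0.3, 0.4), (0.2, 0.3), (0.1, 0.2)]))
```

With only a handful of cut offs, the trapezoidal estimate is a coarse approximation of the full receiver operating characteristic curve, which is presumably why the estimates are described as approximate.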
Predictive values (which, unlike sensitivity and specificity, are influenced by the outcome's prevalence) can be used to estimate the probability of detecting an outcome on the basis of a particular test result (i.e., where disease status is unknown). Positive predictive values were low across outcomes (range=0.03 [drug injection] to 0.51 [depressed mood]; mean=0.15), indicating that adults reporting ≥4 ACEs had low rates of health problems. 12,13 Positive likelihood ratio values indicated that individuals reporting ≥4 ACEs showed only minimal-to-moderate increases in their likelihood of exhibiting poor versus good health, suggesting limited value in using this definition of high-risk ACE exposure to classify ill health.
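The dependence of predictive values on prevalence follows from Bayes' rule and can be sketched as below. This is an illustrative example rather than part of the original analysis: the function names are hypothetical, and the sensitivity, specificity, and prevalence figures are made-up inputs, not ACE Study estimates.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value implied by sensitivity, specificity, and prevalence."""
    true_positive_share = sensitivity * prevalence
    false_positive_share = (1 - specificity) * (1 - prevalence)
    return true_positive_share / (true_positive_share + false_positive_share)


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


if __name__ == "__main__":
    # Hypothetical screen with low sensitivity and high specificity.
    sens, spec = 0.20, 0.90
    lr_pos = sens / (1 - spec)
    for prevalence in (0.05, 0.20, 0.50):
        ppv = ppv_from_prevalence(sens, spec, prevalence)
        # The likelihood-ratio route gives the same value as the direct formula.
        via_lr = post_test_probability(prevalence, lr_pos)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  via LR+={via_lr:.3f}")
```

Holding sensitivity and specificity fixed, the positive predictive value falls as the outcome becomes rarer, so a small positive likelihood ratio shifts a low pre-test probability only slightly.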

DISCUSSION
In replicating recent general population findings, 10 results obtained from the seminal ACE Study sample raise further concerns around the ACE score's ability to support individual classification of health risk. Proposed screening based on the advocated ≥4 ACEs cut off may help to rule out ill health among low-risk individuals (0−3 ACEs) on the basis of high specificity and negative predictive values. However, low sensitivity and positive predictive values suggest that such screening would not accurately detect ill health among high-risk individuals (≥4 ACEs), undermining this cut off's utility in allocating preventative interventions. Poor classification estimates may have been influenced by relatively low prevalence rates among some health outcomes (Figure 1, light gray and black tiles), which can stifle detection. However, predictive performance was similarly poor for more prevalent outcomes (e.g., depressed mood, physical inactivity). Classification accuracy is also influenced by the distribution of ACEs in the sample: despite the greater increases in risk associated with inclusion in the high-risk group (≥4 ACEs), because most individuals are classed as low risk (0−3 ACEs) (Figure 1, white and light gray tiles), most of those with ill health are similarly found within these low-risk groups (Figure 1, light gray tiles).
From these findings, implementation of ACE screening may have detrimental consequences for resource allocation. Specifically, because most high-risk adults (≥4 ACEs) do not exhibit ill health, screening based on ACE scores alone would generate substantial numbers of false-positive results, potentially leading to inappropriate or unnecessary interventions (e.g., invasive testing/imaging, referral for psychiatric assessment). At the same time, because most adults with ill health were in the low-risk group (0−3 ACEs), such screening efforts would generate a substantial number of false-negative results, potentially excluding many individuals from beneficial health interventions.
Note: Data were derived from Felitti et al. 9 with ≥4 ACEs used to define a high-risk ACE exposure group.
a AUC was calculated by comparing sensitivity (true positive rate) and 1 − specificity (false positive rate) at each individual ACE cut off (≥1, ≥2, ≥3, ≥4) before consolidating these in an omnibus AUC using trapezoidal integration by the trapz function within the caTools R package. Values range from 0.50 (random chance) to 1.00 (perfect discrimination), with performance interpreted using the following thresholds: very poor (0.5−0.6), poor (0.6−0.7), fair (0.7−0.8), good (0.8−0.9), and excellent (0.9−1.0). 11
b Denotes the presence of at least 1 of the 10 individual health risk factors (smoking, severe obesity, physical inactivity, depressed mood, suicide attempt, alcoholism, illicit drug use, injected drug use, ≥50 lifetime sexual partners, or history of any sexually transmitted disease).
ACE, adverse childhood experience; AUC, area under the (receiver operating characteristic) curve; LR+, positive likelihood ratio; NPV, negative predictive value; PPV, positive predictive value; Sens., sensitivity; Spec., specificity.

Figure 1. For each mosaic plot, the relative size of individual tiles corresponds to cell frequencies from 2-way contingency tables for each health outcome in the ACE Study. Specifically, tile height is proportional to the number of individuals with 0−3 ACEs versus those with ≥4 ACEs, whereas tile width is proportional to the absence versus presence of the health outcome. As illustrated by the example plot in the bottom right-hand corner, this provides a visualization of the overall proportion of true negative (white tiles), false negative (light gray tiles), false positive (dark gray tiles), and true positive (black tiles) cases for each outcome. ACEs, adverse childhood experiences.

The continuous ACE score's poor discrimination suggests that similar consequences would arise from any cut off applied to this questionnaire. Unlike successful screening tools within cardiology and oncology, 14,15 no AUC values met the suggestive threshold for clinical utility (AUC>0.80). Although reliance on the ACE Study's retrospective adult data limits the ability to draw firm conclusions on the utility of ACE screening for secondary prevention, prospective findings suggest that childhood ACE screening offers a similarly poor prediction of health problems. 10 To enhance the clinical utility of ACE screening, prediction modeling techniques could be employed to ascribe differential weights to ACE questionnaire items on the basis of their relative contribution to discrimination of people with ill health from those without. Instead of relying on a simple count of exposures, this individualized approach could facilitate more nuanced risk indices from underlying components of the ACE score, without necessitating additional data collection. 6 In addition, multivariable prediction models that combine ACEs with other established health risk and protective factors may enhance individual classification by better capturing the multifactorial causes of disease. Several examples of such models have recently been derived for trauma-exposed children. 16−18 Of note, these ACE-based predictive algorithms should address common methodologic challenges (e.g., overfitting) and demonstrate reliable generalizability to new individuals. 7
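The contrast between a simple count and differentially weighted items can be written out as follows. The logistic form is only one possible choice and is an assumption of this sketch; the source does not specify a model. Here x_j ∈ {0, 1} indicates whether ACE item j was reported, J is the number of questionnaire items, Y = 1 denotes the presence of the health outcome, and the β terms are coefficients that would need to be estimated and validated in new samples.

```latex
\[
S_{\text{count}} = \sum_{j=1}^{J} x_j ,
\qquad
\log \frac{\Pr(Y = 1 \mid x)}{1 - \Pr(Y = 1 \mid x)} = \beta_0 + \sum_{j=1}^{J} \beta_j x_j
\]
```

Under this form, the count score corresponds to the special case in which every item receives the same weight (β_1 = ... = β_J), whereas a weighted model allows items to contribute unequally to discrimination.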

CONCLUSIONS
Although ACE scores are important risk indicators across general and clinical populations, their utility in identifying individuals at heightened risk of ill health appears to be overstated. By shifting toward individualized classification metrics (i.e., discrimination, calibration), considering more sophisticated operationalizations of current ACE measures, and capitalizing on prediction modeling approaches, future research has the potential to improve the identification of individual health risks and, in turn, ensure that ACE screening can provide clinically meaningful health benefits.

No financial disclosures were reported by the authors of this paper.