Harvard Medical School, Boston, Massachusetts; Division of General Pediatrics, Boston Children's Hospital, Boston, Massachusetts; Computational Statistics and Machine Learning Group, Department of Statistics, University of Oxford, Oxford, United Kingdom; Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Previously estimated effects of social distancing do not account for changes in individual behavior before the implementation of stay-at-home policies or model this behavior in relation to the burden of disease. This study aims to assess the asynchrony between individual behavior and government stay-at-home orders, quantify the true impact of social distancing using mobility data, and explore the sociodemographic variables linked to variation in social distancing practices.
This study was a retrospective investigation that leveraged mobility data to quantify the time to behavioral change in relation to the initial presence of COVID-19 and the implementation of government stay-at-home orders. The impact of social distancing that accounts for both individual behavior and testing data was calculated using generalized mixed models. The role of sociodemographics in accounting for variation in social distancing behavior was modeled using a 10-fold cross-validated elastic net (linear machine learning model). Analysis was conducted in April‒July 2020.
Across all the 1,124 counties included in this analysis, individuals began to socially distance at a median of 5 days (IQR=3−8) after 10 cumulative cases of COVID-19 were confirmed in their state, with state governments taking a median of 15 days (IQR=12−19) to enact stay-at-home orders. Overall, people began social distancing at a median of 12 days (IQR=8−17) before their state enacted stay-at-home orders. Of the 16 studies included in the review, 13 exclusively used government dates as a proxy for social distancing behavior, and none accounted for both testing and mobility. Using government stay-at-home dates as a proxy for social distancing (10.2% decrease in the number of daily cases) accounted for only 55% of the true impact of the intervention when compared with estimates using mobility (18.6% reduction). Using 10-fold cross-validation, 23 of 43 sociodemographic variables were significantly and independently predictive of variation in individual social distancing, with delays corresponding to an increase in a county's proportion of people without a high school diploma and proportion of racial and ethnic minorities.
This retrospective analysis of mobility patterns found that social distancing behavior occurred well before the onset of government stay-at-home dates. This asynchrony leads to the underestimation of the impact of social distancing. Sociodemographic characteristics associated with delays in social distancing can help explain the disproportionate case burden and mortality among vulnerable communities.
More than 90% of the global population has been under some form of social lockdown since the beginning of the coronavirus disease 2019 (COVID-19) pandemic, with >7,062,464 confirmed cases and 403,921 deaths reported worldwide.
In the U.S., social distancing has been the primary nonpharmaceutical intervention employed to minimize the spread of the virus. Quantifying the mitigating impact, if any, of social distancing policies on COVID-19 disease spread is critical for evaluating the efficacy of social restrictions and informing future health policy decisions.
The purpose of this study is to characterize the potential asynchrony between individual social distancing behavior and government stay-at-home policies. In particular, the time it took individuals to change their behavior in relation to the initial presence of COVID-19 within their state and the implementation of government stay-at-home orders is quantified. A systematic review reveals that of 16 studies focused on analyzing the impact of social distancing, 13 use government stay-at-home dates for modeling, and none explicitly quantify this asynchrony.
This asynchrony between individual behaviors and government actions may impact social distancing among the socially vulnerable. In particular, it can explain why vulnerable communities faced a higher disease burden and the risk of serious complications during the pandemic.
Building on these observations, machine learning techniques were subsequently used to model which sociodemographic variables can predict a community's mobility response to the pandemic. The models provide the strongest evidence to date that variation in individual behavior across certain groups may partially account for the disproportionate impact of COVID-19 on vulnerable communities throughout the U.S.
This was a 4-part study that included (1) quantification of the timeframe between changes in county-level mobility relative to both state-level COVID-19 prevalence and gubernatorial action, (2) a systematic literature review of previous studies examining social distancing efficacy, (3) an assessment of the degree of underestimation of the impact of social distancing as estimated using government stay-at-home dates as proxies for behavior when compared with estimates using mobility data, and (4) a machine learning modeling of the sociodemographic variables to explain the variation in county-level delays in social distancing.
Investigators obtained COVID-19 cases, deaths, and state policy dates from the open-access New York Times GitHub and website.
Mobility data came from Google Mobility Reports, which use anonymized cell phone data to track mobility across grocery and pharmacy, parks, transit stations, retail and recreation, residential, and workplace domains.
A computerized search spanning from January 1, 2020 to May 9, 2020 was conducted in PubMed and medRxiv. In both databases, the following search terms were used: social distancing AND COVID-19. A total of 2 authors independently examined the titles and abstracts of potentially relevant studies identified by the computerized search. A detailed evaluation of manuscripts was performed for eligible studies in the systematic review. The exclusion criteria were the following: absence of empirical modeling, strictly international studies, literature reviews, opinion pieces, articles in other languages without English translations, and studies that did not examine an intervention effect. The following information was recorded from the included trials: first author, date of publication, observed effect, testing data, use of government stay-at-home dates, and use of mobility data. Nonpeer-reviewed studies from medRxiv were included to fully encompass the most recent research related to the impact of social distancing and COVID-19.
Google Mobility Reports provide mobility trends for 1,569 counties. Residential mobility was used as a proxy for social distancing behavior. Counties were included only if they had ≥45 days of residential mobility reported. Only states that enacted a stay-at-home order were analyzed. In total, 42 states and 1,124 counties that fit these inclusion criteria were included. Of those, 767 are classified as urban by the Rural–Urban Continuum Code, and 357 are classified as rural.
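The two inclusion criteria above can be expressed as a simple filter. The following is a minimal sketch, assuming a hypothetical `County` record with made-up field names and example rows; it is not the authors' data pipeline.

```python
from dataclasses import dataclass

MIN_MOBILITY_DAYS = 45  # threshold stated in the text


@dataclass
class County:
    name: str               # hypothetical identifier
    state: str              # hypothetical state label
    residential_days: int   # days with a residential mobility value reported
    state_has_order: bool   # whether the state enacted a stay-at-home order


def is_included(county: County) -> bool:
    """Apply the two inclusion criteria described in the text."""
    return county.residential_days >= MIN_MOBILITY_DAYS and county.state_has_order


# Made-up example rows, not data from the study.
counties = [
    County("County A", "State 1", 60, True),
    County("County B", "State 1", 30, True),   # too few reported days
    County("County C", "State 2", 80, False),  # state had no order
    County("County D", "State 3", 45, True),   # exactly at the threshold
]
included = [c.name for c in counties if is_included(c)]
print(included)  # ['County A', 'County D']

# The urban and rural counts reported in the text sum to the analyzed total.
assert 767 + 357 == 1124
```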
After data curation, the first step in this analysis was to calculate each county's changepoint for residential mobility, that is, when individuals in a given county rapidly began to increase the amount of time spent at home. Changepoints in the mobility trends for the 1,124 counties were estimated using the pruned exact linear time algorithm and a standard modified Bayes information criterion penalty with the changepoint package in R.
The pruned exact linear time algorithm minimizes a linear cost function to allow for exact calculation of a changepoint, with the modified Bayes information criterion penalty guarding against overfitting.
The transition points for all 1,124 counties were confirmed visually (Appendix Table 1, available online).
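To make the idea of a mobility changepoint concrete, the sketch below finds a single shift in the mean of a short series by exhaustive search. It is a simplified stand-in, not the pruned exact linear time algorithm or the R changepoint package used in the study: it assumes exactly one changepoint, uses a squared-error cost, and applies no penalty term. The mobility values are made up for illustration.

```python
def single_changepoint(series):
    """Return the index k that best splits series into two constant-mean segments.

    The cost of a segment is its sum of squared deviations from its own mean.
    Index k is the first position of the second segment.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")

    # Prefix sums let each segment cost be computed in constant time.
    prefix = [0.0]
    prefix_sq = [0.0]
    for value in series:
        prefix.append(prefix[-1] + value)
        prefix_sq.append(prefix_sq[-1] + value * value)

    def segment_cost(start, end):
        # Sum of squared deviations for series[start:end].
        length = end - start
        total = prefix[end] - prefix[start]
        total_sq = prefix_sq[end] - prefix_sq[start]
        return total_sq - total * total / length

    best_k, best_cost = None, float("inf")
    for k in range(1, n):
        cost = segment_cost(0, k) + segment_cost(k, n)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Made-up residential mobility values (percent change from baseline):
# flat near 0 for 10 days, then a jump to around 15.
mobility = [0, 1, -1, 0, 1, 0, -1, 1, 0, 0, 14, 15, 16, 15, 14, 16, 15, 15]
changepoint_index = single_changepoint(mobility)
print(changepoint_index)  # 10
```

A penalized multiple-changepoint method such as the one named in the text generalizes this search to an unknown number of segments, with the penalty discouraging spurious splits.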
A total of 3 different measurements of time (Public Response, Unaccounted Distancing, and Government Response) were subsequently created (Appendix Figure 1, available online). Public Response was defined as the difference in days between individuals’ mobility changepoint and the date at which their state reaches 10 cumulative cases. This describes how long it takes for a community to socially distance after COVID-19 has been confirmed in their state. Unaccounted Distancing was defined as the difference in days between the mobility changepoint and the state's stay-at-home date, representing the unaccounted time of individual social distancing behavior that predates government intervention. Government Response was defined as the difference in days between a government's stay-at-home order and the date at which the state reaches 10 cumulative cases. This describes how long it takes state governments to enact stay-at-home policies after COVID-19’s prevalence was confirmed in their state.
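The three intervals defined above reduce to date differences. The sketch below assumes a sign convention in which Unaccounted Distancing is positive when the mobility changepoint precedes the stay-at-home order; the function name and the example dates are hypothetical.

```python
from datetime import date


def response_intervals(ten_case_date, changepoint_date, order_date):
    """Return the three intervals, in days, for one county.

    Sign convention (an assumption of this sketch): each value is positive
    when the later-named event happens after the earlier-named one.
    """
    public_response = (changepoint_date - ten_case_date).days
    unaccounted_distancing = (order_date - changepoint_date).days
    government_response = (order_date - ten_case_date).days
    return public_response, unaccounted_distancing, government_response


# Hypothetical dates for a single county, not taken from the study.
public, unaccounted, government = response_intervals(
    ten_case_date=date(2020, 3, 10),
    changepoint_date=date(2020, 3, 15),
    order_date=date(2020, 3, 27),
)
print(public, unaccounted, government)  # 5 12 17
```

For any single county under this convention, Government Response equals Public Response plus Unaccounted Distancing. The same identity need not hold for medians taken across counties.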
Generalized mixed models (Poisson) quantified the degree of underestimation of the impact of social distancing when using government stay-at-home dates as a proxy for social distancing compared with efficacy estimates using actual mobility data. These models were fitted for the daily number of cases with state-specific intercepts and interaction terms between social distancing policy and time (day) by region and adjusted for number of tests and demographic characteristics (population density and proportion of individuals aged >65 years). The assumed COVID-19 incubation period was 5 days on the basis of previous studies, and a 28-day window (14 days before and 14 days after incubation) was used.
In sensitivity analyses, the analysis was repeated with an extended incubation period of 7 days and a shorter window period (20-day window: 10 days before and 10 days after the end of the incubation period).
The machine learning approach involved constructing linear regression models with combined L1 and L2 priors as regularizers (elastic nets) using 43 a priori sociodemographic variables for the 1,124 counties. These methods add restrictions when fitting the models to eliminate variables that contain limited information and select the most impactful variables. Public Response and Government Response were modeled separately, but the approach was the same in both cases. Before analysis, no pairwise correlation exceeded an absolute magnitude of 0.90. Model performance was assessed using the R² score and 10-fold cross-validation; state effects were accounted for a priori with a simple linear model, and, thus, state of origin did not have an impact on predictive power. This adjustment also facilitated the use of linear regression models (the independent variables were largely linear). For each of the constructed models, approximately 10% of the counties (n=113 or 112) were withheld for testing. The model was subsequently trained on the remaining 90% of counties (n=1,011 or 1,012). Model performance was then assessed on the approximately 10% test set of counties that had been withheld from the entire training and parameter optimization process. Coefficients for the socioeconomic variables were estimated by averaging across all 10 folds. CIs were estimated by calculating the SE across the 10 folds. Data were analyzed with R, version 3.6.3, and Python 3.
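The fold sizes quoted above follow directly from splitting 1,124 counties into 10 groups. The sketch below reproduces that arithmetic and shows one way to average a coefficient across folds. The SE formula (sample SD divided by the square root of the number of folds) is an assumption of this sketch, since the text does not spell out the calculation, and the per-fold coefficients are made up.

```python
import math
import statistics


def fold_sizes(n_items, n_folds):
    """Split n_items into n_folds groups whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_folds)
    return [base + 1 if i < extra else base for i in range(n_folds)]


def mean_and_se(per_fold_values):
    """Average a coefficient over folds and compute its standard error.

    Assumption: SE = sample standard deviation / sqrt(number of folds).
    """
    mean = statistics.fmean(per_fold_values)
    se = statistics.stdev(per_fold_values) / math.sqrt(len(per_fold_values))
    return mean, se


sizes = fold_sizes(1124, 10)
print(sizes)  # [113, 113, 113, 113, 112, 112, 112, 112, 112, 112]

# Training-set sizes are whatever remains after withholding one fold.
train_sizes = sorted({1124 - s for s in sizes})
print(train_sizes)  # [1011, 1012]

# Hypothetical per-fold coefficients for one variable, not study results.
per_fold = [0.10, 0.12, 0.11, 0.13, 0.12, 0.11, 0.12, 0.13, 0.12, 0.14]
mean, se = mean_and_se(per_fold)
print(round(mean, 3), round(se, 4))
```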
Across all 42 states with government stay-at-home orders, individuals began to spend more time at home before the enactment of government lockdowns and continued to stay at home after the expiration of the lockdown policy. Individuals in the 8 remaining states without stay-at-home orders also spent more time at home without any government intervention (Figure 1). After COVID-19 was confirmed in the U.S. (January 21, 2020), it took a median of 54 days (IQR=50−56) (Appendix Figure 2, available online) for a state to reach 10 cumulative cases; 90% of states fell within an 18-day window (5th percentile=30.2 days–95th percentile=48 days). Hereafter, locally prevalent is defined as the point in time when 10 cumulative cases were confirmed in the state.
Across all the 1,124 counties, individuals began to socially distance a median of 5 days after COVID-19 became locally prevalent (IQR=3−8 days) (Figure 2A). The quickest state to socially distance was West Virginia, with a mobility changepoint 5 days before COVID-19 was locally prevalent. The slowest states to change mobility behavior were Texas and California, both reacting 20 days after COVID-19 became locally prevalent. State governments took a median of 15 days to enact stay-at-home orders after COVID-19 became locally prevalent (IQR=12−19 days) (Figure 2B). The fastest state government to issue a stay-at-home order was West Virginia (3 days), and the slowest was Texas (38 days).
The time difference between the engagement of social distancing and government policy implementation also varied across states (Figure 2C). Overall, people began social distancing at a median of 12 days before their state enacted stay-at-home orders (IQR=8−17 days). Individuals in South Carolina and Missouri began to socially distance well before their governments intervened at a median of 22 and 20 days before implementation of stay-at-home orders, respectively. People in Illinois and California began to socially distance much closer to the onset of government intervention, with a mobility changepoint 4 and 5 days before implementation of government policy, respectively.
Between states, there was significant variation in both how long it took individuals to quarantine once COVID-19 became locally prevalent (Figure 2A) and the time taken for governments to enact stay-at-home orders (Figure 2B). Within states, variation also existed between counties in their respective mobility response. Howard, Texas took the longest to socially distance (24 days) after COVID-19 became locally prevalent, whereas Cascade, Montana and Berkeley, West Virginia were quick responders, with a changepoint 6 days before COVID-19 became locally prevalent (Appendix Table 1, available online).
After noting the asynchrony between people's behavior and government action, the authors were interested in exploring how previous studies of social distancing have accounted for this discrepancy (Appendix Text 1 and Appendix Table 2, available online). Briefly, of 16 studies included in the systematic review, 13 studies exclusively used government dates for modeling. No studies quantified the asynchrony between individual behavior and government, and none directly accounted for both mobility and testing data when estimating the impact of social distancing policy. Using generalized mixed models, government stay-at-home dates as a proxy for social distancing (10.2% decrease in the number of daily cases, 95% CI=10.1%, 10.3%) accounts for only 55% of the true impact of the intervention when compared with generalized mixed modeling estimates using mobility changepoints (18.6% reduction in the number of daily cases, 95% CI=18.1%, 19.1%). The raw unexponentiated coefficients are tabulated in Table 1, and marginal plots are depicted in Appendix Figures 4 and 5 (available online). Sensitivity analyses based on varying the incubation periods (extending to 7 days) and shorter postpolicy implementation analysis period (10-day window) yielded similar results (Table 1).
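The 55% figure is the ratio of the two reported effect estimates. The short check below uses only the two percentages quoted above; the variable names are illustrative.

```python
# Reported reductions in the number of daily cases (percent), from the text.
government_estimate = 10.2  # using government stay-at-home dates
mobility_estimate = 18.6    # using mobility changepoints

share_captured = government_estimate / mobility_estimate
share_missed = 1 - share_captured
print(f"{share_captured:.1%} captured, {share_missed:.1%} missed")
# 54.8% captured, 45.2% missed
```

In other words, the government-date estimate recovers a little over half of the mobility-based estimate and leaves roughly 45% of it unaccounted for.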
Table 1. Generalized Mixed Model Coefficients and Sensitivity Analyses for the Impact of Social Distancing Using Government Stay-At-Home Dates Versus Residential Mobility as Proxies
Analysis (incubation; window): interaction terms reported
Incubation: 5 days; Window: ±14 days: Mobility × time; Government × time
Incubation: 5 days; Window: ±10 days: Mobility × time; Government × time
Incubation: 7 days; Window: ±14 days: Mobility × time; Government × time
Incubation: 7 days; Window: ±10 days: Mobility × time; Government × time
Note: Boldface indicates statistical significance (p<0.01). “A × B” denotes an interaction term between A and B.
After adjusting for state and county effects, sociodemographic variables were predictive of both individual response to the local prevalence of COVID-19 and the delay of state government action in relation to this behavior change (explaining 11.8% variation, range=2.10%–20.8%, 10-fold cross-validated). Of the 43 a priori specified sociodemographic variables (Appendix Table 3, available online), the 10-fold cross-validated elastic net model found 23 to be significantly and independently predictive (Table 2). Notably, accounting for all other variables, an SD (10.3%) decrease in a county's proportion of people with a bachelor's degree or higher corresponded to a 0.12-day delay in mobility changepoint (95% CI= −0.14, −0.10). This equates to a 1-day difference between the county with the lowest proportion of bachelor's degrees (8.2%) and the one with the highest (74.6%). Independently, an SD (5.2%) increase in a county's proportion of people without a high school diploma corresponded to a 0.10-day delay in mobility changepoint (95% CI=0.08, 0.13). This also equates to an approximate 1-day difference between the county with the highest (48.5%) and that with the lowest proportion (2%) of individuals without a high school diploma. Similarly, after accounting for all other sociodemographic variables, for each SD (19.0%) increase in a county's proportion of racial and ethnic minorities, there was a 0.11-day delay in mobility changepoint (95% CI=0.08, 0.15), which corresponds to a half-day difference between the county with the highest (99.3%) and lowest proportion (2.8%) of non-White, non-Hispanic constituents. The models for delay in Government Response, relative to individual behavior change, yielded identical results, with flipped coefficient signs (Appendix Table 4, available online).
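The range comparisons above follow from scaling each per-SD coefficient by the number of SDs that separate the lowest and highest counties. The sketch below redoes that arithmetic with the figures quoted in the text, using coefficient magnitudes only; the helper name `range_effect` is hypothetical.

```python
def range_effect(coef_per_sd, sd, low, high):
    """Days of delay implied across the observed range of a variable.

    coef_per_sd: magnitude of the delay (days) per one-SD change
    sd, low, high: SD and observed extremes of the variable (percent)
    """
    return coef_per_sd * (high - low) / sd


# Figures quoted in the text.
bachelor = range_effect(0.12, 10.3, 8.2, 74.6)    # bachelor's degree or higher
no_diploma = range_effect(0.10, 5.2, 2.0, 48.5)   # no high school diploma
minority = range_effect(0.11, 19.0, 2.8, 99.3)    # racial and ethnic minorities

print(round(bachelor, 2), round(no_diploma, 2), round(minority, 2))
# 0.77 0.89 0.56
```

The first two values round to the approximately 1-day differences described in the text, and the third to the half-day difference.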
Table 2. Time to Mobility Changepoint After Disease Becomes Locally Prevalent (i.e., Public Response) and 23 Predictive Sociodemographic Variables, Coefficients, and CI
Columns: Variable; Elastic net coefficient (averaged over 10-fold cross-validation); SE (calculated over 10-fold cross-validation)
Variables:
Bachelor or higher, %
Commute worked at home, %
Different house in the U.S. 1 year ago, %
Households male householder no wife present family, %
As of June 8, 2020, the U.S. had 1,997,695 COVID-19 cases and 112,558 COVID-19 deaths. Every state has now lifted some aspect of social restriction owing to the growing economic pressures facing their communities.
These decisions might have been partly based on previous models of social distancing efficacy that, as demonstrated in this study, have not properly accounted for testing capacity or social mobility predating government intervention. The systematic review only yielded 1 manuscript that incorporated both variables but still used government dates as a proxy for social distancing. Previous studies have highlighted a lack of testing data as a limitation because as testing capacity inevitably increases, more cases will be diagnosed. This observed increase in cases will blunt the flattening of the curve potential of social distancing.
Although this study does not aim to predict the future of the pandemic, it is concerning that the current predictive Susceptible–Exposed–Infectious–Recovered models may not be properly accounting for these factors.
As demonstrated in this study, people began social distancing well before governments took strong action against COVID-19, and people continue to socially distance after reopening. Furthermore, the analysis demonstrated that the use of government stay-at-home dates as the set point for the start of social distancing underestimates the true impact of social distancing, capturing only approximately 55% of it, because individuals began to change mobility at a median of 12 days before stay-at-home orders, which is more than twice the incubation period of the disease.
Across the 1,124 counties included in this analysis, the median changepoint in mobility occurred 5 days after COVID-19 became locally prevalent. This is especially true for many southern states, including South Carolina and Missouri, where the use of government dates fails to capture almost 3 weeks’ worth of public distancing efforts. Although the changepoint analysis was limited to the U.S., this methodology can be reproduced in other countries as governments strive to properly track the efficacy of their respective interventions. Furthermore, the slight increase in case growth rate observed in states that reopened (with southern states now largely seeing the most case growth
This study is the first to directly analyze variation in social distancing practices as a contributor to the aforementioned disparities. This study also highlights critical sociodemographic variables that independently explain individual social distancing response to the local presence of COVID-19. Of the 43 variables, 23 were found to be significantly and independently predictive, explaining 11.8% variation (range =2.10%–20.8%, 10-fold cross-validated). These variables fell into 6 broad categories: education, health status, nationality, income/occupation, military status, and household characteristics.
It was observed that counties with lower educational attainment took longer to socially distance. Researchers at the University of Southern California Schaeffer (Los Angeles, CA) have noted that people in lower educational brackets perceive that their risk of COVID-19 infection is lower.
Furthermore, others have shown that education fosters a trust in science, and a lack of trust in science is associated with less compliance with COVID-19 prevention guidelines (N Plohl, unpublished data, April 2020).
This lack of scientific trust may explain the perceived lack of infection risk, which, combined, may explain why these counties with lower average educational attainment took longer to socially distance. Lower educational attainment is also associated with lower medical literacy, making navigating the dynamic guidelines surrounding the COVID-19 pandemic more difficult.
All 3 hypotheses may account for the fact that 2 separate variables of educational attainment (percentage with a bachelor's degree or higher and percentage without a high school diploma) were independently significant.
Counties with a greater proportion of non-English speakers had greater delays in social distancing. Pandemics require rapid dissemination of information, and non-English translations of these resources lead to an inevitable delay in access by non-English speakers. Efforts such as Contra COVID and the COVID-19 Health Literacy project are essential to mitigating potential inequalities surrounding information accessibility.
Furthermore, counties with a greater proportion of non-White residents, independent of other covariates, had greater delays in social distancing as well. Black and Hispanic populations have historically reported higher levels of physician distrust than their White counterparts.
This distrust can manifest in the context of the COVID-19 pandemic as a predominantly White healthcare field attempts to prescribe social distancing practices to minority populations who have been victims of medical injustices throughout history. Because delays in social distancing contribute to a greater risk of infection, the higher case burden and mortality observed in these populations may be partially explained by the findings described in the study by N Dreher (unpublished data, May 2020). Further interpretation of these results should be mindful of potential misrepresentation of causal links.
The remaining significant variables and limitations of this study are discussed in Appendix Text 2 (available online).
Changes in mobility are asynchronous to government policy. Using government dates as a set point to determine the impact of social distancing fails to capture social distancing that predated government intervention and underestimates the true impact of social distancing. Counties with lower educational attainment, a higher proportion of minorities, and non-English speakers exhibited greater delays in social distancing, which may explain the disproportionate case burden and mortality among these vulnerable communities. Future investigation on the effects of social distancing should not solely rely on policy timepoints but instead should take into account predated awareness and action.
Moustafa Abdalla, Arjan Abar, and Evan R. Beiter contributed equally to this work.
Per the Common Rule, IRB review was not required for this study, which used deidentified, publicly available data.
This study did not have any sponsors or funding sources.
No financial disclosures were reported by the authors of this paper.