Can a Single Variable Predict Early Dropout from Digital Health Interventions? Comparison of Predictive Models from Two Large Randomized Trials
Friday, March 17, 2023
Posted by: Natalia Gromov
Bricker J, Miao Z, Mull K, Santiago-Torres M, Vock DM. Can a Single Variable
Predict Early Dropout from Digital Health Interventions? Comparison of
Predictive Models from Two Large Randomized Trials. J Med Internet Res. 2023
Jan 20;25:e43629. doi: 10.2196/43629. PMID: 36662550; PMCID: PMC9898835.
Background. A single generalizable metric that accurately predicts early dropout from digital
health interventions has the potential to readily inform intervention targets
and treatment augmentations that could boost retention and intervention
outcomes. We recently identified a type of early dropout from digital health
interventions for smoking cessation, specifically, users who logged in during
the first week of the intervention and had little to no activity thereafter.
These users also had a substantially lower smoking cessation rate with our
iCanQuit smoking cessation app compared with users who used the app for longer
periods.
Objective. This study aimed to explore whether log-in count data, using standard statistical
methods, can precisely predict whether an individual will become an iCanQuit
early dropout while validating the approach using other statistical methods and
randomized trial data from 3 other digital interventions for smoking cessation
(combined randomized N=4529).
Methods. Standard logistic regression models were used to predict early dropouts for individuals
receiving the iCanQuit smoking cessation intervention app, the National Cancer
Institute QuitGuide smoking cessation intervention app, the WebQuit.org smoking
cessation intervention website, and the Smokefree.gov smoking cessation
intervention website. The main predictors were the number of times a
participant logged in per day during the first 7 days following randomization.
The area under the curve (AUC) assessed the performance of the logistic
regression models, which were compared with decision tree, support vector
machine, and neural network models. We also examined whether 13 baseline
variables that included a variety of demographics (eg, race and ethnicity,
gender, and age) and smoking characteristics (eg, use of e-cigarettes and
confidence in being smoke free) might improve this prediction.
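
The modeling setup described in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the data are synthetic, the per-day log-in probabilities (0.06 and 0.30), the 40% dropout share, and the sample sizes are made-up assumptions, and the helper names (`simulate`, `fit_logistic`, `predict_proba`, `auc`) are hypothetical. It fits a plain logistic regression on 7 daily log-in counts by gradient descent and scores it with the AUC, computed as the probability that a randomly chosen early dropout receives a higher predicted risk than a randomly chosen non-dropout.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, rng):
    # Synthetic participants (assumption, not trial data): each has 7 daily
    # log-in counts and a label, 1 = early dropout, 0 = not an early dropout.
    X, y = [], []
    for _ in range(n):
        dropout = 1 if rng.random() < 0.4 else 0
        p = 0.06 if dropout else 0.30  # hypothetical per-attempt log-in chance
        counts = [sum(rng.random() < p for _ in range(5)) for _ in range(7)]
        X.append(counts)
        y.append(dropout)
    return X, y


def fit_logistic(X, y, lr=0.1, epochs=500):
    # Batch gradient descent on the mean logistic (cross-entropy) loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    # Predicted probability of early dropout for each participant.
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def auc(scores, labels):
    # Share of (dropout, non-dropout) pairs ranked correctly; ties count 0.5.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
X_train, y_train = simulate(300, rng)
X_test, y_test = simulate(200, rng)
w, b = fit_logistic(X_train, y_train)
test_auc = auc(predict_proba(X_test, w, b), y_test)
print(f"Held-out AUC on synthetic data: {test_auc:.2f}")
```

Because the synthetic groups are deliberately well separated, the AUC printed here says nothing about the trials themselves; the sketch only shows the shape of the pipeline (7 count predictors, one binary outcome, AUC on held-out data).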
Results. The AUC for each logistic regression model using only the first 7 days of log-in
count variables was 0.94 (95% CI 0.90-0.97) for iCanQuit, 0.88 (95% CI
0.83-0.93) for QuitGuide, 0.85 (95% CI 0.80-0.88) for WebQuit.org, and 0.60
(95% CI 0.54-0.66) for Smokefree.gov. Replacing logistic regression models with
more complex decision trees, support vector machines, or neural network models
did not significantly increase the AUC, nor did including additional baseline
variables as predictors. The sensitivity and specificity were generally good,
and they were excellent for iCanQuit (ie, 0.91 and 0.85, respectively, at the
0.5 classification threshold).
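
For readers unfamiliar with these metrics, the following Python sketch shows how sensitivity and specificity follow from predicted probabilities once a classification threshold is fixed. The function name `sensitivity_specificity`, the example probabilities, and the convention that a probability equal to the threshold counts as a predicted dropout are illustrative assumptions; the numbers are made up and are not the study's data.

```python
def sensitivity_specificity(probs, labels, threshold=0.5):
    # labels: 1 = early dropout, 0 = not an early dropout.
    # A participant is classified as a dropout when prob >= threshold
    # (the handling of exact ties is an assumption of this sketch).
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn)  # share of true dropouts that were flagged
    specificity = tn / (tn + fp)  # share of non-dropouts correctly not flagged
    return sensitivity, specificity


# Made-up predicted probabilities for 4 dropouts followed by 4 non-dropouts.
example_probs = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
example_labels = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(example_probs, example_labels)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the threshold flags more participants, which raises sensitivity at the cost of specificity; the AUC summarizes this trade-off across all thresholds.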
Conclusions. Logistic regression models using only the first 7 days of log-in count data were
generally good at predicting early dropouts. These models performed well when
using simple, automated, and readily available log-in count data, whereas
including self-reported baseline variables did not improve the prediction. The
results will inform the early identification of people at risk of early dropout
from digital health interventions with the goal of intervening further by
providing them with augmented treatments to increase their retention and,
ultimately, their intervention outcomes.