Advances in Methods and Practices in Psychological Science

Just in Time or Just a Guess? Addressing Challenges in Validating Prediction Models Based on Longitudinal Data

Abstract

A common goal of researchers using intensive longitudinal data is to develop models that predict emotions or behaviors, often using passively collected data from smartphone sensors or wearable devices. A frequent use case for such models is the development of just-in-time adaptive interventions (JITAIs). However, the real-world effectiveness of these models depends on rigorous evaluation. Previous research has highlighted challenges in selecting appropriate evaluation methods. To address these challenges, we review key pitfalls in predictive modeling and provide recommendations for avoiding them. We focus on a common problem, the mismatch between model development, evaluation, and application, and use simulations to illustrate three pitfalls. First, although models may perform well under group-level validation (area under the curve [AUC] = .82), they may be unable to predict within-person change (mean AUC = .54, SD = .13). For JITAIs, such a model will be unable to identify intervention-delivery moments and will discriminate only between individuals. Second, ensuring adequate variability in the outcome variable is critical: If outcomes remain stable, frequent prediction may offer little practical benefit. Third, selecting appropriate baseline models is essential; models that appear effective may underperform compared with simple baselines (e.g., AUC = .82 vs. AUC = .96). To address these pitfalls, we present recommendations for matching validation and evaluation strategies to the intended use-case scenario and provide a tool that can help researchers investigate whether their strategy and goal are misaligned. These steps can help improve the effectiveness of predictive models and increase their utility in real-world applications.
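The first pitfall can be made concrete with a small simulation. The sketch below is not the authors' simulation or tool; it is a minimal illustration under stated assumptions: each simulated person has a stable outcome propensity, and the model's score tracks that propensity plus noise that is unrelated to the momentary outcome. The function names (`auc`, `simulate`) and all parameter values are hypothetical choices made for this example. Under these assumptions, the pooled (group-level) AUC should look respectable, whereas the mean within-person AUC should sit near chance.

```python
import random
from statistics import mean


def auc(scores, labels):
    """Probability that a random positive case outscores a random negative
    case (ties count as 0.5). Returns None if only one class is present."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(n_persons=50, n_obs=40, seed=1):
    """Hypothetical data-generating process (an assumption of this sketch):
    the score reflects who the person is, not what happens at the moment."""
    rng = random.Random(seed)
    rows = []
    for pid in range(n_persons):
        base_rate = rng.uniform(0.05, 0.95)  # stable person-level propensity
        for _ in range(n_obs):
            outcome = int(rng.random() < base_rate)
            score = base_rate + rng.gauss(0, 0.1)  # noise independent of outcome
            rows.append((pid, score, outcome))
    return rows


rows = simulate()

# Group-level evaluation: pool all observations across persons.
pooled = auc([r[1] for r in rows], [r[2] for r in rows])

# Within-person evaluation: one AUC per person, then average.
by_person = {}
for pid, score, outcome in rows:
    by_person.setdefault(pid, []).append((score, outcome))

within = []
for obs in by_person.values():
    value = auc([s for s, _ in obs], [y for _, y in obs])
    if value is not None:  # skip persons whose outcome never varies
        within.append(value)
within_mean = mean(within)

print(f"pooled AUC: {pooled:.2f}")
print(f"mean within-person AUC: {within_mean:.2f} (n = {len(within)} persons)")
```

Persons whose outcome never varies are dropped from the within-person average because an AUC is undefined for them, which also relates to the second pitfall: without variability in the outcome, there is nothing for a momentary prediction to discriminate.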