Classification of lapses in smokers attempting to stop: A supervised machine learning approach using data from a popular smoking cessation smartphone app

Perski, Olga, Li, Kezhi, Pontikos, Nikolas, Simons, David, Goldstein, Stephanie P., Naughton, Felix and Brown, Jamie (2023) Classification of lapses in smokers attempting to stop: A supervised machine learning approach using data from a popular smoking cessation smartphone app. Nicotine and Tobacco Research, 25 (7). 1330–1339. ISSN 1462-2203

PDF (Perski_etal_2023_NaTR)
Available under License Creative Commons Attribution.


Abstract

Introduction: Smoking lapses after the quit date often lead to full relapse. To inform the development of real-time, tailored lapse prevention support, we used observational data from a popular smoking cessation app to develop supervised machine learning algorithms to distinguish lapse from non-lapse reports.
Aims and Methods: We used data from app users with ≥20 unprompted data entries, which included information about craving severity, mood, activity, social context, and lapse incidence. A series of group-level supervised machine learning algorithms (eg, Random Forest, XGBoost) were trained and tested. Their ability to classify lapses for out-of-sample (1) observations and (2) individuals was evaluated. Next, a series of individual-level and hybrid algorithms were trained and tested.
Results: Participants (N = 791) provided 37 002 data entries (7.6% lapses). The best-performing group-level algorithm had an area under the receiver operating characteristic curve (AUC) of 0.969 (95% confidence interval [CI] = 0.961 to 0.978). Its ability to classify lapses for out-of-sample individuals ranged from poor to excellent (AUC = 0.482–1.000). Individual-level algorithms could be constructed for 39/791 participants with sufficient data, with a median AUC of 0.938 (range: 0.518–1.000). Hybrid algorithms could be constructed for 184/791 participants and had a median AUC of 0.825 (range: 0.375–1.000).
Conclusions: Using unprompted app data appeared feasible for constructing a high-performing group-level lapse classification algorithm, but its performance was variable when applied to unseen individuals. Algorithms trained on each individual’s dataset, in addition to hybrid algorithms trained on the group plus a proportion of each individual’s data, had improved performance but could only be constructed for a minority of participants.
Implications: This study used routinely collected data from a popular smartphone app to train and test a series of supervised machine learning algorithms to distinguish lapse from non-lapse events. Although a high-performing group-level algorithm was developed, it had variable performance when applied to new, unseen individuals. Individual-level and hybrid algorithms had somewhat greater performance but could not be constructed for all participants because of the lack of variability in the outcome measure. Triangulation of results with those from a prompted study design is recommended prior to intervention development, with real-world lapse prediction likely requiring a balance between unprompted and prompted app data.
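The link between outcome variability and per-participant evaluation can be illustrated with a minimal sketch. It is not the authors' analysis (their R code is in the GitHub repository listed under Additional Information); the function names, participant IDs, and scores below are hypothetical. The sketch computes AUC as the probability that a randomly chosen lapse entry receives a higher predicted score than a randomly chosen non-lapse entry, and returns no value for a participant whose entries are all lapses or all non-lapses, since AUC is undefined in that case.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the fraction of (lapse, non-lapse) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when only one
    outcome class is present, because AUC is then undefined.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_individual_auc(records):
    """Group (participant_id, lapse_label, predicted_score) by participant
    and compute one AUC per participant."""
    by_person = defaultdict(lambda: ([], []))
    for pid, label, score in records:
        by_person[pid][0].append(label)
        by_person[pid][1].append(score)
    return {pid: auc(ys, ss) for pid, (ys, ss) in by_person.items()}


# Hypothetical entries: participant "b" never reports a lapse.
records = [
    ("a", 1, 0.9), ("a", 0, 0.2), ("a", 0, 0.4),
    ("b", 0, 0.1), ("b", 0, 0.3),
    ("c", 1, 0.3), ("c", 0, 0.6),
]
results = per_individual_auc(records)
```

In this toy data, participant "b" has no lapse entries, so no per-participant AUC can be computed, which mirrors why individual-level evaluation is only possible for participants whose outcome varies.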

Item Type: Article
Additional Information: Data Availability: The data and R code underpinning the analyses are available on GitHub (https://github.com/OlgaPerski/lapses_smokefree_ml). Funding information: OP and JB receive salary support from Cancer Research UK (PRCRPG-Nov21\100002). OP and JB are members of SPECTRUM, a UK Prevention Research Partnership Consortium (MR/S037519/1). UKPRP is an initiative funded by the UK Research and Innovation Councils, the Department of Health and Social Care (England) and the UK devolved administrations, and leading health research charities. DS is supported by a PhD studentship from the UK Biotechnology and Biological Sciences Research Council [BB/M009513/1]. KL is supported by the Rosetrees Trust (UCL-IHE-2020\102), NIHR (AI AWARD 01786) and EPSRC (EP/S021612/1). NP is supported by an NIHR AI Award (AI_AWARD02488).
Uncontrolled Keywords: medicine (all) (ASJC subject area 2700)
Faculty \ School: Faculty of Medicine and Health Sciences > School of Health Sciences
UEA Research Groups: Faculty of Medicine and Health Sciences > Research Centres > Norwich Institute for Healthy Aging
Faculty of Medicine and Health Sciences > Research Groups > Behavioural and Implementation Science
Faculty of Medicine and Health Sciences > Research Groups > Health Promotion
Faculty of Medicine and Health Sciences > Research Centres > Lifespan Health
Depositing User: LivePure Connector
Date Deposited: 17 Apr 2023 16:30
Last Modified: 19 Oct 2023 03:35
URI: https://ueaeprints.uea.ac.uk/id/eprint/91822
DOI: 10.1093/ntr/ntad051
