Leveraging Machine Learning to Advance Alcohol Research: Current Applications, Challenges, and Opportunities

Qingyu Zhao¹ and Kilian M. Pohl^2,3

¹Department of Radiology, Weill Cornell Medicine, New York, New York

²Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, California

³Department of Electrical Engineering, Stanford University, Stanford, California

Volume 46, Issue 1 ⦁ Article Number: 03 ⦁ https://doi.org/10.35946/arcr.v46.1.03

Abstract

PURPOSE: The review surveys the type of machine learning approaches currently used in the alcohol literature, reviews challenges in applying machine learning tools to alcohol data, and explores how overcoming these challenges could advance personalized medicine for alcohol use disorder (AUD).

SEARCH METHODS: The authors conducted a search of publications on PubMed, ScienceDirect, and EBSCO Academic Search Premier published from 2015 to April 15, 2025, for articles that used machine learning to analyze alcohol-related outcomes. Search terms were (“drinking” OR “alcohol”) AND (“machine learning” OR “deep learning” OR “predict” OR “classify”) in the title or abstract.

SEARCH RESULTS: The search returned 2,618 manuscripts. Keeping those that predicted alcohol-related outcomes and excluding those that merely used alcohol as a predictor for other outcomes reduced the selection to 567 manuscripts. A final manual selection resulted in 110 original peer-reviewed human research studies that primarily analyzed alcohol consumption behaviors and tested their models on data that they were not trained on.

DISCUSSION AND CONCLUSIONS: Predictions focused on alcohol consumption or AUD diagnosis in cohorts with a mean age of 50 years or younger (i.e., when long-term drinking behaviors are being or have been established). Most studies confined the data-driven searches to a single modality and relied on conventional machine learning approaches, which tended to produce accurate and transparent predictions on the relatively small datasets typically collected by AUD studies. The small number of available samples was the most common limitation mentioned by the reviewed articles. Investigators also wished for machine learning models to provide insights about causality. Gaining these insights will be essential to improve diagnosis and treatment of AUD, for which the field must foster multidisciplinary research teams to build rigorous and trustworthy machine learning models and quantitative benchmarks that can capture the multifaceted nature of alcohol use and its comorbidities.

Key Takeaways

Alcohol-related publications using machine learning almost exclusively relied on conventional techniques, whereas current public discourse emphasizes state-of-the-art models.
A majority of models predicted alcohol consumption or alcohol use disorder (AUD) diagnosis, which is generally easier to forecast than, for example, disorder or treatment outcome.
Only 42% of models utilized multimodal data, which is needed for encoding the complexity of AUD and related clinical outcomes.
Addressing the complexity of AUD requires creating machine learning models and quantitative benchmarks that accurately capture the multifaceted nature of alcohol use and its comorbidities.

Introduction

Alcohol use disorder (AUD) currently affects 27.9 million Americans¹ and costs the United States more than $250 billion annually.² It is the fourth leading preventable cause of death,³ underscoring the urgent need for effective strategies to understand, predict, and prevent alcohol misuse. At the mechanistic level, AUD arises from complex interactions between neural, genetic, and environmental factors that collectively influence vulnerability and resilience to alcohol’s effects. Genetic and epigenetic factors account for 40% to 60% of a person’s addiction risk,^4,5 although psychological and environmental factors (such as family history,^6,7 traumatic events,⁸ peer pressure,^9,10 other drug use,^11,12 and externalizing behaviors^13,14) also play crucial roles in shaping drinking behaviors. However, findings about these predictive factors are often inconsistent or weak: many individuals without known risk factors develop AUD,¹⁵ whereas many at-risk individuals remain resilient¹⁶ (e.g., approximately 50% of maltreated youth¹⁷). Equally pressing is the challenge of predicting treatment response and long-term abstinence. Although pharmacological and behavioral interventions are available, only 7.9% of people with AUD receive alcohol use treatment in the United States,¹⁸ and of those, only 16% achieve abstinence.¹⁹

One major hurdle in alcohol research has been the predominant reliance on fragmented, small-scale data sets unable to reveal the complex and heterogeneous nature of AUD.^20,21 The need for larger heterogeneous data sets for AUD is recognized by funding agencies.²² Since 2012, the National Institute on Alcohol Abuse and Alcoholism has funded the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) study²³ to annually collect brain magnetic resonance imaging (MRI) data, neuropsychology testing, alcohol use, and related data of 831 individuals who were age 12 to 21 at baseline. NCANDA was the first to report on in vivo disruption due to alcohol of white matter microstructural development during adolescence.²⁴ An even larger sample (more than 11,000 children age 9 to 10 years at baseline) has been recruited by the Adolescent Brain Cognitive Development (ABCD) Study,²⁵ which is tracking brain development, behavior, and environmental changes to gain insights into how early life factors shape substance use and mental health outcomes. Over three decades, the Collaborative Studies on the Genetics of Alcoholism (COGA) has collected phenotypic data²⁶ from nearly 18,000 individuals ages 7 to 97 across more than 2,200 families, as well as DNA and electrophysiological measures on a large subset. The longitudinal data set provides a unique opportunity for researchers to identify genes influencing the risk of AUD and related outcomes. Another key effort is the Monitoring the Future (MTF) study, an ongoing, nationally representative study that has been surveying more than 25,000 8th, 10th, and 12th grade students²⁷ and approximately 20,000 adults ages 19 to 65 each year since 1975.²⁸ By capturing longitudinal behavioral and attitudinal changes, MTF offers a unique resource for understanding population-level patterns and risk factors for substance use. Finally, aiming to advance personalized medicine, the All of Us Research Program²⁹ is planning to collect multimodal data (including genomic profiles, electronic health records, and environmental exposure) from at least 1 million people in the United States to discern how individual differences in lifestyle, environment, and biology affect health outcomes, such as the relationship between risky drinking behavior and cancer diagnosis.³⁰

In addition to increasing sample sizes, alcohol studies have expanded the scope and diversity of data collected from each participant to characterize individuals more precisely. Current studies increasingly integrate more comprehensive self-reports,³¹ medical records,³² genetic profiling,³³ neuropsychological testing,³⁴ and imaging.³⁵ The advent of wearable technologies and ubiquitous mobile devices (such as smartphones³⁶ and biosensors measuring physical activity³⁷ and intoxication³⁸) has made it possible to conduct ecological momentary assessments.^39,40 This surge in data volume introduces substantial computational and methodological challenges, regardless of cohort size.

The emergence of larger and more diverse data sets calls for a transformation in fundamental analytical approaches. Alcohol research has traditionally relied on hypothesis-driven analyses, which involve preselecting a narrow set of variables^41,42 and applying univariate regression models at the group level to test pairwise associations. Although conceptually straightforward and statistically interpretable, this approach fragments understanding into siloed single factors that each explain only a small fraction of variance in alcohol outcomes, failing to capture the intricate multivariate interactions among behavioral traits, cognitive functions, environmental influences, and neural mechanisms that underlie AUD.^43,44 Furthermore, the findings only reflect group-level trends that generally do not translate to individual-level insights.⁴⁵

A promising alternative is machine learning (Figure 1), which is designed to translate complex, multimodal data into predictions on an individual basis.⁴⁶ Machine learning is a branch of artificial intelligence (AI) that enables computers to identify interactions among measurements (i.e., patterns) predictive of the outcome directly from data rather than relying on predefined statistical formulas or human-crafted rules. Typically, a machine learning model needs to be trained on a large number of samples, where each sample is described by a set of features (predictor variables) and a target outcome variable of interest (e.g., diagnosis, symptom severity, drinking level, or treatment response). In alcohol research, predictors may span multiple domains, including neuroimaging (e.g., functional connectivity, cortical thickness, regional volume, or microstructural integrity), behavioral and cognitive traits (e.g., measures of impulsivity, decision-making, or sensation seeking), environmental exposures (e.g., peer drinking, family history, or neighborhood stressors), and genetic or physiological markers (e.g., polygenic risk scores, heart rate, sleep disturbance). The goal of the model is to integrate information across these domains to generate a single prediction of the chosen target outcome specific to each individual.

Figure 1. Diagram of workflows in studies using machine learning to predict alcohol use disorder frequency and outcomes. Sections depict raw data, machine learning, outcomes, training, and testing.

Figure 1. Machine learning methods can transform complex, multimodal data into individualized predictions of alcohol use disorder (AUD) and related outcomes. Data modalities used by studies reviewed here included, among others, behavior, biological specimens, demographics, sensor data, neuroimaging, genes, and mental health assessments. Raw data were either directly analyzed by deep learning models or first extracted into aggregate measurements for use in conventional machine learning approaches, which were used in most studies reviewed. AUD-related outcomes predicted by these methods encompassed AUD diagnosis, consumption patterns, prognosis, relapse, and treatment response. To obtain the predictions, the studies first trained the machine learning models by determining the parameter setting that minimized the difference between predicted and observed outcomes on the training data set. The studies then measured the accuracy of the predictions and the constellation of measurements (i.e., pattern) that drove the predictions, on a separate test set.

During training, the model learns (or is trained) by iteratively adjusting its internal parameters to minimize the prediction error, which is the difference between actual and predicted outcome of the training samples. Once this error is minimized, the parameterized model is evaluated on a set of test samples (i.e., participants and their data not used for training). This evaluation assesses the degree to which the model generalizes to new individuals. Common evaluation metrics⁴⁷ include accuracy (i.e., quantifying the proportion of correct predictions) and the area under the receiver operating characteristic curve (AUC), which measures how well the model discriminates between outcome classes over the full range of decision thresholds. Lastly, one can identify the pattern driving inference, offering insights into potential mechanisms or risk factors underlying alcohol outcomes.
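As a minimal sketch of the two evaluation metrics described above, the following Python code computes accuracy and AUC for a small held-out test set. The labels, the predicted scores, and the 0.5 decision threshold are hypothetical values chosen for illustration and do not come from any reviewed study. The AUC is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Proportion of correct predictions at one decision threshold."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auc(y_true, y_score):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test set: 1 = outcome present, 0 = outcome absent.
y_test = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical model outputs (predicted probability of the outcome).
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

test_accuracy = accuracy(y_test, scores)
test_auc = auc(y_test, scores)
print(test_accuracy, test_auc)
```

In this toy example, the accuracy is 0.75 at the single threshold of 0.5, whereas the AUC is 0.9375 because it summarizes how well the scores rank cases across all thresholds.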

Conventional machine learning models (e.g., random forest, support vector machine [SVM]) typically confine analysis to tabular data, where summary scores are extracted from raw measurements before training the model. For example, brain MRI studies relied on region-level summaries (such as brain volume or cortical thickness) as input to machine learning models, which confined prediction accuracy to the anatomical granularity of the brain regions defined by human experts beforehand.⁴⁸ Removing this constraint necessitates performing predictions directly from raw higher-dimensional data (e.g., all voxels of a 3D brain MRI). This is the domain of deep learning models (Figure 1), a subfield of machine learning that gained traction starting in 2012.⁴⁹ Deep learning models jointly learn to derive measurements (i.e., features) from the raw data and to predict the outcome using large-scale artificial neural networks. Treating feature extraction as part of the optimization process has led to significant improvements in predictive accuracy across numerous domains, fueling growing interest in applying deep learning approaches to data-driven medical research.⁵⁰

This review provides a systematic overview of how machine learning methods have been applied across the landscape of alcohol research over the past decade. Rather than emphasizing the specific constellations of neurobiological or psychological factors identified by machine learning analyses, the review focuses on the methodological trends, data modalities, and modeling choices that characterize current machine learning practice across alcohol-related research. Specifically, the review examines which populations and clinical outcomes have been most frequently targeted, what level of prediction accuracy has been achieved, what types of data have been used to predict outcomes, and how different machine learning models (including deep learning) have been adopted. The review also highlights the challenges associated with applying these methods to AUD studies and discusses how overcoming those challenges could transform diagnosis and treatment from subjective observations to objective, individualized assessments that lead to accurate precision medicine.

Search Method

In April and July 2025, the authors conducted a literature search in PubMed, ScienceDirect, and EBSCO Academic Search Premier for publications that used machine learning to analyze alcohol-related outcomes. Across the three databases, a key word search using (alcohol[Title] OR drinking[Title]) AND (alcohol[Title/Abstract]) AND (“machine learning”[Title/Abstract] OR “deep learning”[Title/Abstract] OR predict[Title/Abstract] OR classify[Title/Abstract]) returned 2,618 manuscripts published from 2015 to April 15, 2025.

A rule-based text-mining Python script (available upon request from the authors) searched for the co-occurrence of prediction-related terms (e.g., predict, model, classify, identify) and alcohol-related terms (e.g., alcohol, drinking, dependence, relapse), while removing articles in which alcohol was a predictor rather than an outcome (e.g., “Alcohol use predicts depression”). This automatic screening reduced the selection to 567 manuscripts. A final manual selection removed eight preprints that had not been peer-reviewed, 21 articles that used animal models, 27 articles that studied fetal alcohol spectrum disorders, 139 articles that did not use machine learning approaches, and 262 articles that did not primarily analyze alcohol consumption behaviors (e.g., withdrawal symptoms, alcohol-related liver diseases, drunk driving). This selection resulted in 110 original peer-reviewed human research studies^7,51-161 that tested the model on data that it was not trained on, which is a key difference between machine learning and population-level statistical analysis.
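The authors' screening script is not reproduced here. The following is only a rough sketch of how a rule-based co-occurrence filter of this kind could look. The term lists extend the examples given above, and the pattern for detecting alcohol used as a predictor, the function name `keep_title`, and the sample titles are all hypothetical.

```python
import re

# Hypothetical term lists built from the examples named in the text.
PREDICTION = re.compile(r"\b(predict\w*|model\w*|classif\w*|identif\w*)\b", re.I)
ALCOHOL = re.compile(r"\b(alcohol\w*|drinking|dependence|relapse)\b", re.I)

# Hypothetical rule: the title starts with an alcohol term that is the
# subject of "predicts", e.g. "Alcohol use predicts depression".
ALCOHOL_AS_PREDICTOR = re.compile(
    r"^\s*(alcohol|drinking)\b[^.:]*?\b(predicts?|as an? predictor of)\b", re.I
)


def keep_title(title):
    """Keep titles where prediction and alcohol terms co-occur,
    unless alcohol appears to be the predictor rather than the outcome."""
    if not (PREDICTION.search(title) and ALCOHOL.search(title)):
        return False
    return ALCOHOL_AS_PREDICTOR.search(title) is None


titles = [
    "Machine learning models predict relapse in alcohol use disorder",
    "Predicting adolescent binge drinking from neuroimaging",
    "Alcohol use predicts depression",
    "Liver enzymes in a rodent cohort",
]
kept = [t for t in titles if keep_title(t)]
print(kept)
```

A filter this simple will misclassify some titles, which is consistent with the manual selection step that followed the automatic screening.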

Results of the Literature Search

The search identified 2,618 articles for initial examination. Of those, 2,508 were excluded as described above, and 110 were included in the review. For each selected article, Appendix 1 records the number of subjects, mean age, sex ratio, prediction task (e.g., predicting relapse, AUD diagnosis classification), number of predictors, machine learning model used (e.g., random forest, SVM), and reported model accuracy (e.g., AUC and classification accuracy). If multiple alcohol-related prediction tasks were explored, the average prediction performance was recorded. If multiple machine learning models were explored, the model with the highest performance was recorded.

With respect to demographic factors, 85 of the 110 reviewed articles (or 77%) reported sex ratios: eight studies (9% of 85 articles) focused exclusively on men, 40 manuscripts (47%) were somewhat balanced between the sexes (i.e., the percentage of men was 40% to 60%), and three only studied women. Of the 110 reviewed articles, 89 manuscripts (81%) reported the mean age of cohorts, with most examining samples with a mean age from late childhood to middle adulthood: specifically, 82 articles studied cohorts with mean ages of 10 to 50 years. Only seven studies focused on cohorts with a mean age greater than 50 years.

Among all prediction outcomes (Figure 2A), AUD diagnosis and alcohol consumption patterns were the two categories most often evaluated (both 31% of articles). Articles focusing on classifying AUD diagnosis reported a median AUC of 89% (Figure 2B), which was the highest reported among categories with at least two articles. In comparison, articles on predicting alcohol consumption had a median AUC of 78%. Treatment response was the third most frequently predicted outcome but had the lowest median AUC (71%), followed by prognosis (14%, median AUC: 80%). Relapse and alcohol concentration both were assessed in only 4% of studies but had high median AUCs (85% and 96%, respectively). However, the very high AUC (96%) for sensor-based prediction of alcohol concentration requires confirmation because it was reported by only one article.

Figure 2. Four charts describing the studies covered in the review: a pie chart of Prediction Outcomes Assessed (A); a floating bar chart showing AUCs for Different Outcomes (B); a bar chart listing Predictors Used, from most to least commonly used (C): and a bar chart comparing Machine Learning Methods Used, from most to least common (D).

Figure 2. Characteristics of the studies identified by the review. (A) Articles are categorized based on the clinical outcome targeted by the machine learning prediction: more than half the articles focus on predicting alcohol use disorder (AUD) diagnosis or consumption patterns. (B) Area under the curve (AUC) of the models of various outcomes. The most accurate clinical outcomes to predict were alcohol concentration and AUD diagnosis. (C) Number of studies that used a certain type of input predictors. The most common predictors were neuroimaging data, alcohol and other substance use patterns, demographics, mental health symptoms, and behavioral data from neuropsychological testing and self-reports. (D) Number of studies that used a certain type of machine learning model. The most frequently adopted models were random forest, linear regression, and support vector machine. Note. EHR, electronic health record; kNN, k-nearest neighbor; MLP, multilayer perceptron; SVM, support vector machine.

Of the 110 articles, 46 (42%) used more than one type of predictor to predict outcomes. The most frequently used predictors were imaging data (32% of articles), followed by self-reported substance use (27%), demographics (24%), mental health (24%), and behavioral assessment (23%) (Figure 2C). Other data types were rarely used for prediction (10% or less). Conventional machine learning methods¹⁶² were the most popular approaches for prediction, including the random forest approach (used in 24% of 110 articles), linear regression (17%), and SVM (16%) (Figure 2D). Other approaches, such as logistic regression, gradient boosting, decision tree, multilayer perceptron, ensemble methods, k nearest neighbor, and clustering were used in 10% of studies or less. Deep learning approaches were used in 3% of investigations. Studies with larger sample sizes tended to use higher-dimensional input features for prediction (i.e., used larger numbers of predictors) (Figure 3A; r = .20, p = .05); they also resulted in lower prediction accuracy (Figure 3B; r = −.26, p = .03).

Figure 3. Two scatter plots showing (A) a positive correlation between the study sample size and the number of inputs used in a model and (B) a negative correlation between sample size and the accuracy of the model’s predictions.

Figure 3. Correlation of sample size with number of predictors and accuracy. The sample size showed (A) a positive correlation with the number of input features to the machine learning models; and (B) a negative correlation with model prediction accuracy.

Results of the Studies Reviewed

Demographics

Historically, AUD has been mostly diagnosed in men.¹⁶³ However, in recent years, the prevalence of AUD among women has been rising. This is a concerning trend from a public health standpoint, as the adverse effects of alcohol misuse tend to be more severe in women than in men.^164,165 This shift is evident in the articles reviewed herein, with only 9% of studies focusing exclusively on men, and the proportion of male participants in studies published since 2023 (n = 49) significantly lower (p = .032) compared to those published before 2023 (n = 36). Accordingly, the reviewed articles identified sex differences in the correlations of neural substrates and behavioral correlates with alcohol misuse and treatment response.^77,97 For example, childhood trauma and microstructural integrity in temporal and motor structures were female-exclusive predictors of alcohol misuse in young adulthood, whereas social recognition ability, personality traits, and microstructural integrity in cerebellar, motor, and occipital structures were more important predictors for young males.^77,97 Liver function test results predicted treatment response exclusively in males, whereas mental health symptoms were more informative predictors for females.¹⁵⁴ Discovery of these distinctions was performed either by training a single machine learning algorithm on both sexes and comparing important predictors between males and females,⁷⁷ or by training separate models for males and females and comparing the outcomes of the two models.⁹⁷

In addition to sex, age is a critical demographic factor in alcohol use.¹⁶⁶ Most recent machine learning studies have focused on adolescents through middle-aged adults, a critical window when drinking behaviors typically initiate^167,168 and consolidate.^169-171 As acknowledged in the reviewed articles, adolescence and early adulthood are critical for brain development^60,146 and encompass major life transitions (e.g., transition from high school to college, entry into the workforce, and early parenthood). These transitions often involve shifts in social norms and heightened stress, both of which can drive alcohol use.^97,140 Notably, binge drinking (defined as five or more drinks per occasion) peaks during the late teens to early 30s¹⁷² and is one of the strongest predictors of developing AUD later in life.¹⁷³ By contrast, alcohol misuse in adulthood poses the greatest public health burden,^77,96 including alcohol-related injuries (e.g., accidents, violence, fatalities)^174-176 and economic costs (e.g., lost productivity and increased health care expenditure).^165,177 Thus, machine learning studies are particularly valuable for simultaneously analyzing tens or hundreds of factors drawn from many conceptually different domains with the goal of identifying early markers of vulnerability and understanding later-life health and social outcomes to inform prevention and intervention strategies.

Prediction Outcome

Distinguishing individuals with AUD from controls was a prevalent prediction task probed by machine learning analysis; it was addressed in 35 of the 110 articles (31%; Figure 2A). The high proportion may be due to the relatively high accuracy achieved by machine learning approaches (median AUC: 89.0%; Figure 2B). Predicting AUD is relatively simple compared to other clinical outcomes because it is a binary decision process based on clear diagnostic framing (i.e., meets the criteria of the Diagnostic and Statistical Manual of Mental Disorders: yes/no). Furthermore, AUD is typically associated with severe and enduring alterations in brain regions and behavior, as also documented by the reviewed articles.^84,129 Accordingly, the prediction is often based on neuroimaging data (20 out of 35 articles). For example, prediction models based on structural and functional MRI have yielded converging evidence of AUD-related abnormalities in prefrontal circuits (executive dysfunction),^55,69,96,138 ventral striatum (reward sensitivity),^84,96 the cingulate cortex,^24,55 default mode network,^138,139 and the sensory cortex^96,107 (Appendix 1). Although these regions also have been previously reported by literature based on traditional group-level analyses, the effect sizes revealed by machine learning approaches far exceed those identified by univariate statistical tests. For example, a neural network classifier achieved an AUC of 79% in classifying 51 people with AUD and 51 control subjects using whole-brain resting-state functional connectivity.⁹⁶ In contrast, individual functional connectivity features typically exhibited group differences of t < 4.0, corresponding to an accuracy of only about 0.65.⁹⁶
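One way to see how a group difference of t = 4.0 can correspond to an accuracy of only about 0.65 is the back-of-the-envelope calculation below. It is an illustrative sketch, not the derivation used in the cited study. It assumes a two-sample t-test with 51 participants per group, a normally distributed feature with equal variance in both groups, and classification with a single optimal threshold on that one feature. Under these assumptions, the standardized group difference is d = t·sqrt(1/n1 + 1/n2), and the best achievable accuracy is Φ(d/2), where Φ is the standard normal cumulative distribution function.

```python
import math


def accuracy_from_t(t, n1, n2):
    """Best single-threshold accuracy implied by a two-sample t statistic.

    Assumes equal-variance normal distributions and equal class priors.
    """
    d = t * math.sqrt(1.0 / n1 + 1.0 / n2)  # standardized mean difference
    # Standard normal CDF evaluated at d / 2, written with math.erf.
    return 0.5 * (1.0 + math.erf((d / 2.0) / math.sqrt(2.0)))


acc = accuracy_from_t(4.0, 51, 51)
print(round(acc, 2))
```

With these inputs the standardized difference is roughly 0.79 and the implied accuracy is roughly 0.65, which illustrates why a statistically strong group difference in one feature can still be a weak individual-level predictor.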

Predicting alcohol consumption patterns that did not meet the criteria for AUD was another task frequently investigated by machine learning models (35 studies, or 31% of the articles). Because alcohol misuse is among the strongest predictors of future AUD onset, this task holds significant value for prevention and policy design. Prediction models for this task often involved mixed data sources beyond imaging-based predictors. Common predictors of alcohol misuse identified by machine learning included sociodemographic factors (e.g., male sex or lower socioeconomic status),^51,119,130,142 poor executive control,^51,119 social behavior change,^83,119 psychological dysfunction,^83,119 and other substance use.^51,83 The accuracy of these predictors in differentiating people who drink heavily from control subjects (median AUC: 78%) was substantially higher than the accuracy of single predictors, which typically yielded only marginal improvements above chance.^51,61,119 Collectively, these findings position machine learning approaches as a promising and potentially clinically actionable framework for early risk quantification and targeted prevention of AUD.

The third most frequently predicted outcome was treatment response. Common predictors identified by the reviewed articles included severity of dependence and craving at baseline,^7,98,127 self-efficacy,^127,154 and psychiatric comorbidity.^7,98,154 Initial evidence also suggests that the quantitative, individualized prediction by machine learning models aggregating multiple predictors was more accurate than predictions made by human experts.¹²⁷ Despite the potential, this prediction task was associated with the lowest median AUC (71%) among all prediction outcomes. One reason is that treatment response often is assessed using nonstandardized subjective measures (e.g., craving scales and self-reported relapse) that are known to exhibit low test-retest reliability and high individual variability.⁷¹ Further increasing variability was the wide difference in treatment duration, from a few days¹³² to a few years,¹³⁶ resulting in inconsistent timing of follow-up assessments. Thus, clear neurobiological and behavioral correlates of treatment outcomes remain elusive. In addition, studies on treatment outcomes suffer from only a limited proportion of enrolled individuals completing the full course of treatment (e.g., 78% dropout rate⁶⁶). This low completion rate contributes to significant class imbalance between individuals who do or do not respond to treatment. Collectively, these factors make it difficult to robustly train machine learning models and hinder their generalizability,¹⁷⁸ which is reflected in treatment response emerging as the clinical outcome with the lowest median predictive accuracy (Figure 2B).
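The effect of class imbalance on model evaluation can be illustrated with a small sketch. The counts below (80 nonresponders and 20 responders) are hypothetical and are not taken from any reviewed study. The code shows that a trivial model that always predicts the majority class reaches high accuracy but only chance-level balanced accuracy, and it computes inverse-frequency class weights, one common way to counteract imbalance during training.

```python
from collections import Counter

# Hypothetical outcome labels: 0 = nonresponder, 1 = responder.
y_true = [0] * 80 + [1] * 20
# Trivial model that always predicts the majority class.
y_pred = [0] * len(y_true)

accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)


def recall(label):
    """Fraction of cases with this true label that were predicted correctly."""
    hits = sum(p == t for p, t in zip(y_pred, y_true) if t == label)
    return hits / y_true.count(label)


# Balanced accuracy is the mean of the per-class recalls.
balanced_accuracy = (recall(0) + recall(1)) / 2

# Inverse-frequency weights: n_samples / (n_classes * class_count).
counts = Counter(y_true)
class_weight = {c: len(y_true) / (len(counts) * n) for c, n in counts.items()}

print(accuracy, balanced_accuracy, class_weight)
```

Here the accuracy is 0.8 although no responder is identified, the balanced accuracy is 0.5, and the minority class receives a weight four times that of the majority class (2.5 versus 0.625).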

Types of Data Used for Training Machine Learning Models

This review revealed that predictors of alcohol-related clinical outcomes encompass a diverse array of data types (Figure 2C), with the most popular one being medical imaging data. As mentioned, chronic heavy alcohol use results in significant volume loss in brain regions⁵⁵ and disrupts brain function.^69,84,96 Neuroimaging can capture these alterations noninvasively, making it appealing for investigating addiction mechanisms.¹⁷⁹ Beyond improving the understanding of addiction mechanisms, the objectivity of neuroimaging transcends traditional self-report measures and thus has the potential for providing quantitative biomarkers for identifying individuals at elevated risk for drinking onset, predicting relapse,¹⁸⁰ and assessing treatment outcomes.¹⁸¹

However, neuroimaging data also have some disadvantages. They are relatively expensive to acquire, can be susceptible to measurement variability and artifacts (e.g., head motion, scanner differences, and physiological fluctuations), and are associated with small effect sizes (typically explaining less than 5% to 10% of the variance),^182,183 because they are indirect proxies reflecting intermediate phenotypes (such as functional connectivity or activation patterns) rather than direct measures of behaviors, such as craving or relapse.^184,185 In contrast to neuroimaging data, neuropsychological assessments, self-reports, and demographic variables (e.g., alcohol use history, age of onset, or family history) are more proximally aligned with AUD outcomes and easier to collect, often translating to stronger predictive accuracy. This may explain why the 18 studies using only neuroimaging as predictors had a lower median AUC (79%) than the 22 studies using a single type of nonneural predictor (median AUC: 91%).

No single data type (whether neuroimaging, genetic, behavioral, or self-report) fully captures the complexity of AUD and related clinical outcomes. In this review, 42% of studies investigated predictors from multiple modalities, which machine learning models are primed to do because their data-driven search does not require prior knowledge about the modalities they analyze. Each modality—whether it be genetic data, brain imaging, behavioral traits, or environmental exposures—generally captures a distinct yet complementary layer of risk or resilience.^26,128,186 By seamlessly integrating these diverse information sources, the reviewed studies consistently indicated that multimodal machine learning models not only provide a more comprehensive representation of individual differences in alcohol-related outcomes but also enhance prediction accuracy and generalizability compared to single-modal models.^7,55,119,129

Discussion

Model Design Choices

In general, the choice of machine learning models is heavily influenced by data set characteristics, particularly sample size and number of predictors, which tend to be positively correlated (Figure 3A).¹⁸⁷ Training highly complex models on a limited number of samples with many predictors can lead to “overfitting,” where the model captures noise patterns in the training data rather than generalizable relationships based on reliable measurements. Therefore, given that most alcohol-related data sets include fewer than 1,000 subjects, conventional machine learning methods (e.g., random forest, SVM, linear regression)¹⁶² are more common because they have relatively few model parameters to learn. In contrast, state-of-the-art deep learning approaches require far larger data sets to generalize.^188,189 Empirical studies have shown that these “sophisticated” models often yield overestimated accuracy and are easily overfitted on small data sets.¹⁸⁹
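Overfitting can be demonstrated with a deliberately extreme toy example. In the sketch below, all features are random noise and the labels are coin flips, so there is no real signal to learn. A 1-nearest-neighbor classifier stands in for a highly flexible model; the sample sizes, number of features, and random seed are arbitrary choices made for illustration.

```python
import random

random.seed(0)
N_TRAIN, N_TEST, N_FEATURES = 40, 200, 500  # few samples, many predictors


def make_sample():
    # Features are pure noise and the label is a coin flip: nothing to learn.
    features = [random.gauss(0, 1) for _ in range(N_FEATURES)]
    return features, random.randint(0, 1)


train = [make_sample() for _ in range(N_TRAIN)]
test = [make_sample() for _ in range(N_TEST)]


def predict_1nn(x):
    """Return the label of the closest training sample (squared distance)."""
    nearest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return nearest[1]


def accuracy(samples):
    return sum(predict_1nn(x) == y for x, y in samples) / len(samples)


train_acc = accuracy(train)  # each training point is its own nearest neighbor
test_acc = accuracy(test)    # expected to be near chance on unseen samples
print(train_acc, test_acc)
```

The model reproduces every training label because it has memorized the training samples, yet its accuracy on the separate test samples is expected to hover around chance. The gap between the two numbers is the signature of overfitting and is the reason accuracy must be reported on data not used for training.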

Moreover, the data in alcohol studies were often structured tabular data (e.g., demographic variables, self-report scores, or brain regional measurements) rather than raw, high-dimensional inputs (e.g., minimally processed structural MRI data). In these tabular-feature scenarios, traditional models often performed on par with or even better than deep learning because the key signals of the precomputed features (e.g., regional brain volumes or cognitive scores) and relatively simple prediction targets (often binary outcomes, such as diagnosis vs. control) did not require complex modeling. Recent evaluations on structured data sets found that tree-based ensembles (such as random forest and XGBoost¹⁹⁰) consistently outperformed deep neural networks (which are considered the state-of-the-art in AI) for both purely numerical features and mixed data types.^191,192 Thus, although deep learning has only begun to gain a foothold in alcohol research, its broader potential will depend on the availability of large, well-curated data sets.

Beyond prediction accuracy, the decision process of conventional models is generally easier to interpret compared with deep learning. This is essential for results from alcohol studies to be useful to clinicians and researchers who need to understand why a model makes a prediction to trust and act on it. For example, random forest can provide ranked feature importance¹⁹³ to indicate which features are more influential to the model prediction. By contrast, deep neural networks are often regarded as “black boxes” with complex internal computations that defy straightforward explanations.^188,194 This opacity can lead to reluctance in adopting deep learning models for medical decisions, because clinicians cannot easily verify what the model has learned or identify potential biases.¹⁹⁵
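
As an illustration of the ranked feature importance mentioned above, the following sketch fits scikit-learn's `RandomForestClassifier` to synthetic data and prints its impurity-based importances. The data, the feature names (`strong_signal`, `weak_signal`, `noise_1`, and so on), and the effect sizes are assumptions made for this example and do not correspond to measures from any reviewed study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical feature names, used for illustration only.
feature_names = ["strong_signal", "weak_signal", "noise_1", "noise_2", "noise_3"]

n = 400
X = rng.normal(size=(n, len(feature_names)))

# Synthetic assumption: the binary outcome depends strongly on the first
# feature, weakly on the second, and not at all on the remaining three.
score = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
y = (score > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# Impurity-based importances are nonnegative and sum to 1 across features.
importances = forest.feature_importances_
ranking = np.argsort(importances)[::-1]
for idx in ranking:
    print(f"{feature_names[idx]:>14s}: {importances[idx]:.3f}")
```

Impurity-based importances are computed on the training data and can favor features with many distinct values, so a ranking of this kind is best read as a description of what the fitted model relies on rather than as evidence of a causal role.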

Another important advantage of conventional machine learning methods over deep learning for alcohol researchers is the relatively low barrier to deployment and maintenance.¹⁹⁶ Conventional methods are typically easier to set up, requiring fewer design choices and less parameter tuning than deep learning approaches. They handle mixed data types and occasional missing values with minimal preprocessing, whereas deep networks usually demand careful data cleaning and customization for each new task. Conventional machine learning algorithms also run efficiently on standard computational resources, with training of the models often completed in seconds or minutes on a normal personal computer. In contrast, deep learning approaches necessitate high-performance GPUs or cloud infrastructure, which are expensive and harder to operate.¹⁹⁴

In summary, the dominance of conventional machine learning approaches in alcohol-related neuroimaging and behavioral prediction can be attributed to a combination of data limitations, interpretability needs, and ease of use. Although this observation might be biased by the search criteria of this review, which were confined to nonspecific key words, such as “machine learning” and “deep learning,” none of the deep learning approaches used in the studies were based on state-of-the-art architectures (e.g., transformers, generative models, and large language models) that recently have been the focus of public discourse.

Key Methodological Challenges and Future Directions for AI in Alcohol Research

Despite the increased use of machine learning analyses in alcohol research, several methodological limitations continue to constrain the full potential of this technology. The limiting factor most commonly mentioned among the reviewed articles is the number of available samples (25 articles). As mentioned, generalizability of machine learning methods relies on training them on heterogeneous data sets, which requires collecting data from a large number of samples. Funding institutions have identified this need and have recently funded several large, multicenter studies, such as NCANDA, ABCD, COGA, and MTF. However, simply increasing sample size does not overcome existing limitations; machine learning methodology also needs to be tailored to study such large samples in alcohol research. This review revealed a negative correlation between sample size and model prediction accuracy (Figure 3B), contradicting the general understanding that the accuracy of machine learning should increase with the size of training data. This paradox reflects a broader challenge in psychiatry increasingly recognized in the literature,^197,198 namely the trade-off between predictive accuracy and population heterogeneity.¹⁹⁷ Smaller data sets tend to be more homogeneous, allowing models to learn a single predictive pattern with high accuracy. However, this pattern is likely specific to a restricted subpopulation, thus resulting in a lower accuracy in larger heterogeneous data sets; this was pointed out by 35 articles of this review. A promising way forward lies in the development of foundation models.¹⁹⁹ These are large-scale, general-purpose models that are first trained on broader data sets, including those not necessarily related to alcohol research. Once pretrained, these models can be fine-tuned to specific subpopulations or clinical questions (e.g., predicting relapse in young females or treatment response in people with chronic AUD with liver disease). Creating such models could be essential for moving from subjective, population-level generalizations toward truly individualized, quantitative, and context-sensitive predictions that are the foundational promise of precision psychiatry.

Another challenge in using machine learning to advance alcohol research is the modeling of confounders, which were stringently accounted for by only 27 of the 110 articles (25%) reviewed. When confounding variables (e.g., age, sex, or comorbid mental health conditions) influence both the input features (e.g., neural or behavioral measures) and the outcome (e.g., alcohol consumption or treatment response), models may learn spurious associations that lead to misinterpretation of machine learning findings,²⁰⁰ as also acknowledged by 24 articles of this review. For instance, age affects both brain connectivity and drinking patterns.²⁰¹ If age is not adjusted for, the model may “predict” alcohol misuse by detecting brain maturation, not alcohol effects. Similarly, alcohol consumption patterns significantly differ between males and females; consequently, a machine learning model not controlling for sex might simply detect sex differences in the input predictors. Another type of confound is caused by comorbidity (mentioned by nine articles), because alcohol use often co-occurs with other physical and psychiatric symptoms (e.g., liver disease and depression) and other substance use. The identified predictors hence might not be linked to alcohol outcomes but to other confounding phenotypes. This type of signal leakage caused by confounds^202,203 often results in misleading model interpretations²⁰⁴ and inflates accuracy scores (e.g., AUC, accuracy). This is especially problematic in nonrandomized, observational data sets typically encountered in alcohol research.
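
The age example above can be made concrete with a short simulation. In the following Python sketch, which relies entirely on synthetic data, a hypothetical brain measure and drinking status both depend on age but have no direct link to each other. The age range, effect sizes, and variable names are assumptions chosen for illustration, and the AUC helper assumes continuous scores without ties.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formula.
    Assumes continuous scores without ties."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
n = 8000

# Synthetic assumption: age (the confounder) drives both variables below.
age = rng.uniform(12, 22, size=n)

# Hypothetical brain measure: depends on age only, never on drinking.
brain = 0.5 * (age - 17) + rng.normal(size=n)

# Drinking status: probability increases with age only, never with brain.
drinks = rng.random(n) < 1.0 / (1.0 + np.exp(-(age - 17)))

# Across the full age range, the brain measure appears to "predict" drinking.
auc_all = auc(brain, drinks)

# Within a narrow age band, most of that apparent signal disappears.
band = (age >= 16.5) & (age < 17.5)
auc_band = auc(brain[band], drinks[band])

print(f"AUC across all ages:        {auc_all:.2f}")
print(f"AUC within ages 16.5-17.5:  {auc_band:.2f}")
```

Comparing the pooled AUC with the AUC inside a narrow age band is a simple form of stratification: if the apparent predictive value shrinks toward chance once age is held roughly constant, the pooled result mostly reflects the confounder.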

Mitigation of confounding effects is well established in traditional statistics but often underdeveloped or overlooked in machine learning research, which has historically only focused on maximizing predictive accuracy. In particular, modern deep learning models have the capacity to encode complex nonlinear confounding effects that are hard to detect or adjust for.²⁰³ This gap could be addressed by treating modeling confounders as a core component of the machine learning pipeline. Researchers could apply appropriate preprocessing techniques (e.g., residualization, stratification, or harmonization²⁰⁵) before training, and may want to consider models specifically designed to reduce confounding, such as domain-invariant²⁰⁶ or confounder-aware approaches.^61,207,208 Also, evaluation of the models could be enhanced by going beyond reporting a single AUC and incorporating fairness metrics²⁰⁹ (e.g., subgroup AUCs) and sensitivity analyses²¹⁰ (e.g., performance changes with and without confounders). Validation of models on external data sets where the distribution of individual confounders differs would be crucial for ensuring generalizability and robustness.
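
One way to sketch the preprocessing and evaluation steps named above is linear residualization estimated on training data only, followed by a comparison of AUCs with and without adjustment and by subgroup AUCs. The Python example below uses synthetic data; the variable names, the 0/1 sex coding, and the effect sizes are assumptions for illustration, and linear residualization is only one of several possible adjustment strategies.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum formula (no ties assumed)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def fit_residualizer(confounds, feature):
    """Estimate, on training data only, how the confounds linearly predict the feature."""
    design = np.column_stack([np.ones(len(confounds)), confounds])
    beta, *_ = np.linalg.lstsq(design, feature, rcond=None)
    return beta

def residualize(confounds, feature, beta):
    """Remove the confound-predicted part of the feature using fixed coefficients."""
    design = np.column_stack([np.ones(len(confounds)), confounds])
    return feature - design @ beta

rng = np.random.default_rng(1)
n = 6000
age = rng.uniform(12, 22, size=n)
sex = rng.integers(0, 2, size=n)  # hypothetical 0/1 coding
drinks = rng.random(n) < 1.0 / (1.0 + np.exp(-(age - 17)))

# Synthetic feature: mostly age-driven, with a modest genuine link to drinking.
feature = 0.5 * (age - 17) + 0.6 * drinks + rng.normal(size=n)

train = np.arange(n) < n // 2
test = ~train
confounds = np.column_stack([age, sex])

# Coefficients come from the training half and are applied unchanged to the test half.
beta = fit_residualizer(confounds[train], feature[train])
feature_adj = residualize(confounds[test], feature[test], beta)

# Sensitivity analysis: performance with and without confound adjustment.
auc_raw = auc(feature[test], drinks[test])
auc_adj = auc(feature_adj, drinks[test])

# Subgroup AUCs as a simple fairness-style check.
sex_test, drinks_test = sex[test], drinks[test]
subgroup_auc = {}
for group in (0, 1):
    mask = sex_test == group
    subgroup_auc[group] = auc(feature_adj[mask], drinks_test[mask])

print(f"AUC without adjustment: {auc_raw:.2f}")
print(f"AUC after adjustment:   {auc_adj:.2f}")
print("Subgroup AUCs after adjustment:", {g: round(float(v), 2) for g, v in subgroup_auc.items()})
```

Because the adjustment here is linear, it would not remove nonlinear confounding effects, and because the confounder is correlated with the outcome, it may also remove part of the genuine signal. Both points are reasons to report adjusted and unadjusted results side by side.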

Beyond modeling confounding effects, six of the 110 articles reviewed here explicitly mentioned the limitation that the adopted machine learning approach could only reveal statistical association between predictors and outcome but could not reveal causality between them. This issue highlights an urgent need for a paradigm shift in machine learning applications within alcohol research. Rather than focusing solely on optimizing prediction accuracy, models could also be designed to reveal dependencies and potential causal pathways among variables. New models designed to discern how external environmental and sociodemographic factors (i.e., those that give rise to population heterogeneity) moderate genetic, neural, and behavioral underpinnings of AUD in a data-driven fashion could move the field forward.^211,212 A promising future direction is the application of canonical correlation analysis²¹³ and its variants, which can uncover shared patterns across domains that are not only interpretable but can also serve as input features for downstream phenotype prediction.²¹⁴ To further capture directional influences, structural equation modeling²¹⁵ can be integrated to model hypothesized causal pathways and test how upstream factors (e.g., genes, environment) propagate their effects through intermediate phenotypes (e.g., brain function) to shape behavioral outcomes. This causal approach might offer a powerful new avenue in generative modeling²¹² that could ultimately enable virtual interventions, allowing researchers to test “what-if” scenarios without carrying out real-world in vivo experiments. For example, a generative model could simulate the downstream effects on alcohol-related behaviors of removing environmental risk, improving modifiable behaviors (such as sleep hygiene), or applying neural stimulation. Achieving this vision will require alcohol researchers to work closely with AI methodologists and implementation scientists, a collaboration facilitated, for example, by Stanford’s AI for Mental Health Initiative.²¹⁶
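
To make the canonical correlation analysis mentioned above more tangible, the following Python sketch computes canonical correlations between two synthetic data blocks using a QR decomposition of each centered block followed by a singular value decomposition. The block names (`brain`, `behavior`), the shared latent factor, and the noise levels are assumptions for illustration. The sketch quantifies shared variation across domains and does not by itself establish causal direction.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between two data blocks (rows = participants).

    Each block is centered and orthonormalized with a reduced QR decomposition;
    the singular values of Qx.T @ Qy are then the canonical correlations.
    Assumes both blocks have full column rank and more rows than columns.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    singular_values = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(singular_values, 0.0, 1.0)

rng = np.random.default_rng(0)
n = 500

# Synthetic assumption: one latent factor is shared by both domains.
latent = rng.normal(size=n)

# Hypothetical "brain" block: one column reflects the latent factor, two are noise.
brain = np.column_stack([
    latent + 0.3 * rng.normal(size=n),
    rng.normal(size=n),
    rng.normal(size=n),
])

# Hypothetical "behavior" block: one column reflects the latent factor, one is noise.
behavior = np.column_stack([
    latent + 0.3 * rng.normal(size=n),
    rng.normal(size=n),
])

rho = canonical_correlations(brain, behavior)
print("Canonical correlations:", np.round(rho, 2))
```

The number of canonical correlations equals the smaller of the two column counts. In this setup only the first is expected to be large, reflecting the single shared factor; the projections associated with it could then serve as input features for a downstream predictor.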

As mentioned, compared to statistical group analysis that only quantifies a population average trend, a strength of machine learning is to condense multivariate patterns across many features to an individualized score that can be mapped directly onto clinical decisions (e.g., risk of AUD onset or likelihood of relapse). However, none of the reviewed studies discussed the deployment of machine learning approaches in clinical settings. This issue is symptomatic of psychiatry in general, whereas other areas of medicine are beginning to integrate AI models into clinical workflows and even clinical trials.^217,218 Unlike other fields of medicine that rely on measurable physiological indicators, psychiatry still relies heavily on subjective reports and assessment for diagnosis and treatment. This lack of objective, quantifiable targets poses a major barrier to training and validating clinically actionable machine learning models. Consequently, as of March 2026, none of the FDA-approved AI-enabled medical devices were tailored toward psychiatric conditions.^219,220 Unique to the field of alcohol research is that studies soon will be able to replace self-reported alcohol use with real-time quantitative measurements provided by noninvasive alcohol biosensors.^221,222 Such real-time, quantitative monitoring of alcohol use will provide objective, continuous data that can transform model development and clinical validation.

To fully realize this potential, rigorous evaluation standards must be established. The studies reviewed exhibit substantial variability in validation practices: 18% of studies relied on a single train/test split without cross-validating results on all samples, only 12% of studies validated findings on external data sets, and 16% of articles adopted potentially flawed methodological designs, such as double dipping (e.g., feature selection on the full data set before training) and data leakage (e.g., information from test samples was used for training). The establishment of standardized benchmark data sets and data splits, as well as the preregistration of machine learning analysis and evaluation plans,²²³ including training and testing data construction, can help remedy these concerns. Once quantitative assessments are coupled with stringent model evaluation, researchers will be able to easily test the validity of their models, allowing the field to gain a deeper insight into addiction. Ensuring the responsible deployment of this technology in clinical settings will enable practitioners to diagnose patients based on quantitative markers and objectively assess the progress of alcohol treatments. Finally, prevention programs might be able to accurately determine risk of alcohol misuse in individuals. By doing so, the field of alcohol research could be a trailblazer in psychiatry as it would use AI technology to radically improve the diagnosis and treatment of a psychiatric disease, namely AUD.
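
The effect of double dipping can be demonstrated on data that contain no signal at all. The Python sketch below, built on assumptions chosen purely for illustration (60 synthetic samples, 3,000 noise features, a nearest-centroid classifier, and five folds), compares cross-validated accuracy when features are selected on the full data set with accuracy when selection is repeated inside each training fold.

```python
import numpy as np

def select_top_features(X, y, k):
    """Indices of the k features most correlated (in absolute value) with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    return np.argsort(-np.abs(corr))[:k]

def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test sample to the class with the closer training mean."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def cross_validated_accuracy(X, y, k, folds, select_inside_fold):
    """k-fold accuracy with feature selection done inside or outside the folds."""
    if not select_inside_fold:
        # Double dipping: selection sees every sample, including future test folds.
        global_features = select_top_features(X, y, k)
    correct = 0
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        if select_inside_fold:
            features = select_top_features(X[train_mask], y[train_mask], k)
        else:
            features = global_features
        pred = nearest_centroid_predict(
            X[train_mask][:, features], y[train_mask], X[test_idx][:, features]
        )
        correct += (pred == y[test_idx]).sum()
    return correct / len(y)

rng = np.random.default_rng(0)
n_samples, n_features, k = 60, 3000, 20

X = rng.normal(size=(n_samples, n_features))  # pure noise features
y = np.repeat([0, 1], n_samples // 2)         # balanced labels
rng.shuffle(y)                                # labels carry no information about X
folds = np.array_split(rng.permutation(n_samples), 5)

acc_leaky = cross_validated_accuracy(X, y, k, folds, select_inside_fold=False)
acc_proper = cross_validated_accuracy(X, y, k, folds, select_inside_fold=True)

print(f"Selection on full data (double dipping): {acc_leaky:.2f}")
print(f"Selection inside each training fold:     {acc_proper:.2f}")
```

Because the labels are unrelated to the features, any accuracy well above chance in the first setting is an artifact of letting test samples influence feature selection. Keeping every data-dependent step inside the training folds avoids this inflation.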

Conclusion

This review focused on studies employing machine learning methods that, in contrast to hypothesis-driven analyses, can uncover complex multivariate patterns predictive of alcohol-related outcomes on an individual basis. Most of the 110 reviewed articles trained conventional machine learning approaches on a single data modality to predict alcohol consumption or AUD diagnosis in cohorts with mean ages between 12 and 50 years. Studies were limited by the small number of available samples, not being able to gain insights about causality, and failing to account for confounders. Significantly improving the diagnosis and treatment of AUD will require harnessing the full potential of recent advances in deep learning (such as foundation models).²²⁴ Specifically, fostering multidisciplinary research teams to create rigorous and trustworthy models can help analyze large, multimodal data sets that can capture the multifaceted nature of alcohol use and its comorbidities based on quantitative measures. Additionally, setting guidelines for the responsible development and deployment of these machine learning models would help ensure that these approaches will improve precision alcohol treatment.

Acknowledgments

The work was partly supported by the National Institutes of Health grants R00AA028840 (to Q.Z.); R01DA057567, U24AA021697, R01AA010723, R01AA05965, and R01AA017347 (to K.M.P.); the Brain and Behavior Research Foundation Young Investigator Grant (to Q.Z.), and the 2024 Stanford HAI Hoffman-Yee Grant (to K.M.P.). The authors thank Haopeng Xue for helping with formatting the article.

Correspondence

Address correspondence concerning this article to Kilian M. Pohl, 1070 Arastradero Road, Palo Alto, CA 94304. Email: [email protected]

Disclosures

The authors declare no competing financial or nonfinancial interests.

Publisher's note

Opinions expressed in contributed articles do not necessarily reflect the views of the National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health. The U.S. government does not endorse or favor any specific commercial product or commodity. Any trade or proprietary names appearing in Alcohol Research: Current Reviews are used only because they are considered essential in the context of the studies reported herein.

References

Substance Abuse and Mental Health Services Administration (SAMHSA). Center for Behavioral Statistics and Quality. National Survey on Drug Use and Health Table 5.9A. 2024. Alcohol use disorder in past year: among people aged 12 or older; by age group and demographic characteristics, numbers in thousands, 2023 and 2024. https://www.samhsa.gov/data/report/2024-nsduh-detailed-tables.
Bose J, Hedden SL, Lipari RN, Park-Lee E. Key substance use and mental health indicators in the United States: Results from the 2017 National Survey on Drug Use and Health. Substance Abuse and Mental Health Services Administration. 2018. https://www.samhsa.gov/data.
Pilar MR, Eyler AA, Moreland-Russell S, Brownson RC. Actual causes of death in relation to media, policy, and funding attention: Examining public health priorities. Front Public Health. 2020;8:279. doi:10.3389/fpubh.2020.00279
Bevilacqua L, Goldman D. Genes and addictions. Clin Pharmacol Ther. 2009;85(4):359-361. doi:10.1038/clpt.2009.6
Zhou H, Gelernter J. Human genetics and epigenetics of alcohol use disorder. J Clin Invest. 2024;134(16):e172885. doi:10.1172/JCI172885
Goldman D, Oroszi G, Ducci F. The genetics of addictions: Uncovering the genes. Nat Rev Genet. 2005;6(7):521-532. doi:10.1038/nrg1635
Kinreich S, Meyers JL, Maron-Katz A, et al. Predicting risk for Alcohol Use Disorder using longitudinal data with multimodal biomarkers and family history: A machine learning study. Mol Psychiatry. 2021;26(4):1133-1141. doi:10.1038/s41380-019-0534-x
Konkolÿ Thege B, Horwood L, Slater L, Tan MC, Hodgins DC, Wild TC. Relationship between interpersonal trauma exposure and addictive behaviors: A systematic review. BMC Psychiatry. 2017;17(1):164. doi:10.1186/s12888-017-1323-1
Simons-Morton B, Haynie DL, Crump AD, Eitel SP, Saylor KE. Peer and parent influences on smoking and drinking among early adolescents. Health Educ Behav. 2001;28(1):95-107. doi:10.1177/109019810102800109
Morris H, Larsen J, Catterall E, Moss AC, Dombrowski SU. Peer pressure and alcohol consumption in adults living in the UK: A systematic qualitative review. BMC Public Health. 2020;20(1):1014. doi:10.1186/s12889-020-09060-2
Blanco C, Hasin DS, Wall MM, et al. Cannabis use and risk of psychiatric disorders: Prospective evidence from a U.S. national longitudinal study. JAMA Psychiatry. 2016;73(4):388-395. doi:10.1001/jamapsychiatry.2015.3229
Kohut SJ. Interactions between nicotine and drugs of abuse: A review of preclinical findings. Am J Drug Alcohol Abuse. 2017;43(2):155-170. doi:10.1080/00952990.2016.1209513
Benegal V, Antony G, Venkatasubramanian G, Jayakumar PN. Gray matter volume abnormalities and externalizing symptoms in subjects at high risk for alcohol dependence. Addict Biol. 2007;12(1):122-132. doi:10.1111/j.1369-1600.2006.00043.x
Savage JE, Spit for Science Working Group, Dick DM. Internalizing and externalizing subtypes of alcohol misuse and their relation to drinking motives. Addict Behav. 2023;136:107461. doi:10.1016/j.addbeh.2022.107461
Merline A, Jager J, Schulenberg JE. Adolescent risk factors for adult alcohol use and abuse: Stability and change of predictive value across early and middle adulthood. Addiction. 2008;103(suppl 1):84-99. doi:10.1111/j.1360-0443.2008.02178.x
Cusack SE, Wright AW, Amstadter AB. Resilience and alcohol use in adulthood in the United States: A scoping review. Prev Med. 2023;168:107442. doi:10.1016/j.ypmed.2023.107442
Perini I, Mayo LM, Capusan AJ, et al. Resilience to substance use disorder following childhood maltreatment: Association with peripheral biomarkers of endocannabinoid function and neural indices of emotion regulation. Mol Psychiatry. 2023;28(6):2563-2571. doi:10.1038/s41380-023-02033-y
Substance Abuse and Mental Health Services Administration. National Survey on Drug Use and Health. 2023. https://www.samhsa.gov/data/report/2023-nsduh-detailed-tables.
Fan AZ, Chou SP, Zhang H, Jung J, Grant BF. Prevalence and correlates of past-year recovery from DSM-5 alcohol use disorder: Results from National Epidemiologic Survey on Alcohol and Related Conditions-III. Alcohol Clin Exp Res. 2019;43(11):2406-2420. doi:10.1111/acer.14192
Whelan R, Watts R, Orr CA, et al. Neuropsychosocial profiles of current and future adolescent alcohol misusers. Nature. 2014;512(7513):185-189. doi:10.1038/nature13402
Friske MM, Torrico EC, Haas MJW, et al. A systematic review and meta-analysis on the transcriptomic signatures in alcohol use disorder. Mol Psychiatry. 2025;30(1):310-326. doi:10.1038/s41380-024-02719-x
National Institute on Alcohol Abuse and Alcoholism. National Institute on Alcohol Abuse and Alcoholism Strategic Plan: Fiscal Years 2024-2028. https://www.niaaa.nih.gov/sites/default/files/NIAAA-2024-2028-Strategic-Plan.pdf.
Brown SA, Brumback T, Tomlinson K, et al. The National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA): A multisite study of adolescent development and substance use. J Stud Alcohol Drugs. 2015;76(6):895-908. doi:10.15288/jsad.2015.76.895
Zhao Q, Sullivan EV, Honnorat N, et al. Association of heavy drinking with deviant fiber tract development in frontal brain systems in adolescents. JAMA Psychiatry. 2021;78(4):407-415. doi:10.1001/jamapsychiatry.2020.4064
Volkow ND, Koob GF, Croyle RT, et al. The conception of the ABCD Study: From substance use to a broad NIH collaboration. Dev Cogn Neurosci. 2018;32:4-7. doi:10.1016/j.dcn.2017.10.002
Agrawal A, Brislin SJ, Bucholz KK, et al. The collaborative study on the genetics of alcoholism: Overview. Genes Brain Behav. 2023;22(5):e12864. doi:10.1111/gbb.12864
Miech RA, Johnston LD, Patrick ME, O’Malley PM. Monitoring the Future national survey results on drug use, 1975-2023: Overview and detailed results for secondary school students. Ann Arbor, MI: University of Michigan. 2024. https://monitoringthefuture.org/wp-content/uploads/2024/01/mtfoverview2024.pdf.
Patrick ME, Miech RA, Johnston LD, O’Malley PM. Monitoring the Future Panel Study annual report: National data on substance use among adults ages 19 to 65, 1976-2023. Ann Arbor, MI: University of Michigan. 2024. https://monitoringthefuture.org/wp-content/uploads/2024/07/mtfpanel2024.pdf.
All of Us Research Program Investigators. The “All of Us” research program. N Engl J Med. 2019;381(7):668-676. doi:10.1056/NEJMsr1809937
Shi M, Luo C, Oduyale OK, Zong X, LoConte NK, Cao Y. Alcohol consumption among adults with a cancer diagnosis in the All of Us research program. JAMA Netw Open. 2023;6(8):e2328328. doi:10.1001/jamanetworkopen.2023.28328
Tevik K, Bergh S, Selbæk G, Johannessen A, Helvik AS. A systematic review of self-report measures used in epidemiological studies to assess alcohol consumption among older adults. PLOS One. 2021;16(12):e0261292. doi:10.1371/journal.pone.0261292
Singer A, Kosowan L, Loewen S, Spitoff S, Greiver M, Lynch J. Who is asked about alcohol consumption? A retrospective cohort study using a national repository of Electronic Medical Records. Prev Med Rep. 2021;22:101346. doi:10.1016/j.pmedr.2021.101346
Gupta I, Dandavate R, Gupta P, Agrawal V, Kapoor M. Recent advances in genetic studies of alcohol use disorders. Curr Genet Med Rep. 2020;8(2):27-34. doi:10.1007/s40142-020-00185-9
Day AM, Kahler CW, Ahern DC, Clark US. Executive functioning in alcohol use studies: A brief review of findings and challenges in assessment. Curr Drug Abuse Rev. 2015;8(1):26-40. doi:10.2174/1874473708666150416110515
Zahr NM, Pfefferbaum A. Alcohol’s effects on the brain: Neuroimaging results in humans and animal models. Alcohol Res. 2017;38(2):183-206. doi:10.35946/arcr.v38.2.04
Haucke M, Heinzel S, Liu S. Social mobile sensing and problematic alcohol consumption: Insights from smartphone metadata. Int J Med Inform. 2024;188:105486. doi:10.1016/j.ijmedinf.2024.105486
De Rosa O, Menghini L, Kerr E, et al. Exploring the relationship between sleep patterns, alcohol and other substances consumption in young adults: Insights from wearables and Mobile surveys in the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) cohort. Int J Psychophysiol. 2025;209:112524. doi:10.1016/j.ijpsycho.2025.112524
Davis-Martin RE, Alessi SM, Boudreaux ED. Alcohol use disorder in the age of technology: A review of wearable biosensors in alcohol use disorder treatment. Front Psychiatry. 2021;12:642813. doi:10.3389/fpsyt.2021.642813
Piasecki TM. Assessment of alcohol use in the natural environment. Alcohol Clin Exp Res. 2019;43(4):564-577. doi:10.1111/acer.13975
Stone C, Adams S, Wootton RE, Skinner A. Smartwatch-based ecological momentary assessment for high-temporal-density, longitudinal measurement of alcohol use (AlcoWatch): Feasibility evaluation. JMIR Form Res. 2025;9:e63184. doi:10.2196/63184
Stevenson BL, Kunicki ZJ, Brick L, Blevins CE, Stein M, Abrantes AM. Using ecological momentary assessments and Fitbit data to examine daily associations between physical activity, affect and alcohol cravings in patients with alcohol use disorder. Int J Behav Med. 2022;29(5):543-552. doi:10.1007/s12529-021-10039-5
Abrantes AM, Blevins CE, Battle CL, Read JP, Gordon AL, Stein MD. Developing a Fitbit-supported lifestyle physical activity intervention for depressed alcohol dependent women. J Subst Abuse Treat. 2017;80:88-97. doi:10.1016/j.jsat.2017.07.006
Vilar-Ribó L, Cabana-Domínguez J, Alemany S, et al. Disentangling heterogeneity in substance use disorder: Insights from genome-wide polygenic scores. Transl Psychiatry. 2024;14(1):221. doi:10.1038/s41398-024-02923-x
Yang W, Singla R, Maheshwari O, Fontaine CJ, Gil-Mohapel J. Alcohol use disorder: Neurobiology and therapeutics. Biomedicines. 2022;10(5):1192. doi:10.3390/biomedicines10051192
Mattoni M, Fisher AJ, Gates KM, Chein J, Olino TM. Group-to-individual generalizability and individual-level inferences in cognitive neuroscience. Neurosci Biobehav Rev. 2025;169:106024. doi:10.1016/j.neubiorev.2025.106024
Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Science. 2015;349(6245):255-260. doi:10.1126/science.aaa8415
Jiang T, Gradus JL, Rosellini AJ. Supervised machine learning: A brief primer. Behav Ther. 2020;51(5):675-687. doi:10.1016/j.beth.2020.05.002
Wu X, Liang C, Bustillo J, et al. The impact of atlas parcellation on functional connectivity analysis across six psychiatric disorders. Hum Brain Mapp. 2025;46(5):e70206. doi:10.1002/hbm.70206
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
Yang Y, Zhang H, Gichoya JW, Katabi D, Ghassemi M. The limits of fair medical imaging AI in real-world generalization. Nat Med. 2024;30(10):2838-2848. doi:10.1038/s41591-024-03113-4
O’Halloran L, Pennie B, Jollans L, et al. A combination of impulsivity subdomains predict alcohol intoxication frequency. Alcohol Clin Exp Res. 2018;42(8):1530-1540. doi:10.1111/acer.13779
Kim SY, Park T, Kim K, Oh J, Park Y, Kim DJ. A deep learning algorithm to predict hazardous drinkers and the severity of alcohol-related problems using K-NHANES. Front Psychiatry. 2021;12:684406. doi:10.3389/fpsyt.2021.684406
Bonnell LN, Littenberg B, Wshah SR, Rose GL. A machine learning approach to identification of unhealthy drinking. J Am Board Fam Med. 2020;33(3):397-406. doi:10.3122/jabfm.2020.03.190421
Johnson KA, McDaniel JT, Okine J, et al. A machine learning model for the prediction of unhealthy alcohol use among women of childbearing age in Alabama. Alcohol Alcohol. 2024;59(2):agad075. doi:10.1093/alcalc/agad075
Guggenmos M, Schmack K, Veer IM, et al. A multimodal neuroimaging classifier for alcohol dependence. Sci Rep. 2020;10(1):298. doi:10.1038/s41598-019-56923-9
May AC, Jacobus J, Simmons AN, Tapert SF. A prospective investigation of youth alcohol experimentation and reward responsivity in the ABCD study. Front Psychiatry. 2022;13:886848. doi:10.3389/fpsyt.2022.886848
Fairbairn CE, Han J, Caumiant EP, Benjamin AS, Bosch N. A wearable alcohol biosensor: Exploring the accuracy of transdermal drinking detection. Drug Alcohol Depend. 2025;266:112519. doi:10.1016/j.drugalcdep.2024.112519
Wyant K, Moshontz H, Ward SB, Fronk GE, Curtin JJ. Acceptability of personal sensing among people with alcohol use disorder: Observational study. JMIR Mhealth Uhealth. 2023;11:e41833. doi:10.2196/41833
Mulholland PJ, Berto S, Wilmarth PA, McMahan C, Ball LE, Woodward JJ. Adaptor protein complex 2 in the orbitofrontal cortex predicts alcohol use disorder. Mol Psychiatry. 2023;28(11):4766-4776. doi:10.1038/s41380-023-02236-3
Sun D, Adduru VR, Phillips RD, et al. Adolescent alcohol use is linked to disruptions in age-appropriate cortical thinning: An unsupervised machine learning approach. Neuropsychopharmacology. 2023;48(2):317-326. doi:10.1038/s41386-022-01457-4
Park SH, Zhang Y, Kwon D, et al. Alcohol use effects on adolescent brain development revealed by simultaneously removing confounding factors, identifying morphometric patterns, and classifying individuals. Sci Rep. 2018;8(1):8297. doi:10.1038/s41598-018-26627-7
Oszkinat C, Luczak SE, Rosen IG. An abstract parabolic system-based physics-informed long short-term memory network for estimating breath alcohol concentration from transdermal alcohol biosensor data. Neural Comput Appl. 2022;34(21):18933-18951. doi:10.1007/s00521-022-07505-w
Lin Y, Kranzler HR, Farrer LA, Xu H, Henderson DC, Zhang H. An analysis of the effect of mu-opioid receptor gene (OPRM1) promoter region DNA methylation on the response of naltrexone treatment of alcohol dependence. Pharmacogenomics J. 2020;20(5):672-680. doi:10.1038/s41397-020-0158-1
Mumtaz W, Vuong PL, Xia L, Malik AS, Rashid RBA. An EEG-based machine learning method to screen alcohol use disorder. Cogn Neurodyn. 2017;11(2):161-171. doi:10.1007/s11571-016-9416-y
Smink WAC, Sools AM, Postel MG, et al. Analysis of the Emails from the Dutch web-based intervention “Alcohol de Baas”: Assessment of early indications of drop-out in an online alcohol abuse intervention. Front Psychiatry. 2021;12:575931. doi:10.3389/fpsyt.2021.575931
Collin A, Ayuso-Muñoz A, Tejera-Nevado P, et al. Analyzing dropout in alcohol recovery programs: A machine learning approach. J Clin Med. 2024;13(16):4825. doi:10.3390/jcm13164825
Zhang Z, Zhang S, Huang J, et al. Association between abnormal plasma metabolism and brain atrophy in alcohol-dependent patients. Front Mol Neurosci. 2022;15:999938. doi:10.3389/fnmol.2022.999938
Komarnyckyj M, Retzler C, Cao Z, et al. At-risk alcohol users have disrupted valence discrimination during reward anticipation. Addict Biol. 2022;27(3):e13174. doi:10.1111/adb.13174
Song H, Yang P, Zhang X, et al. Atypical effective connectivity from the frontal cortex to striatum in alcohol use disorder. Transl Psychiatry. 2024;14(1):381. doi:10.1038/s41398-024-03083-8
Besong OTO, Koo JS, Zhang H. Brain lncRNA-mRNA co-expression regulatory networks and alcohol use disorder. Genomics. 2024;116(5):110928. doi:10.1016/j.ygeno.2024.110928
Curtis B, Giorgi S, Buffone AEK, et al. Can Twitter be used to predict county excessive alcohol consumption rates? PLOS One. 2018;13(4):e0194290. doi:10.1371/journal.pone.0194290
Uceta M, Cerro-León AD, Shpakivska-Bilán D, García-Moreno LM, Maestú F, Antón-Toro LF. Clustering electrophysiological predisposition to binge drinking: An unsupervised machine learning analysis. Brain Behav. 2024;14(11):e70157. doi:10.1002/brb3.70157
Stevely AK, Holmes J, Meier PS. Combinations of drinking occasion characteristics associated with units of alcohol consumed among British adults: An event-level decision tree modeling study. Alcohol Clin Exp Res. 2021;45(3):630-637. doi:10.1111/acer.14560
Zhu X, Huang J, Huang S, et al. Combining metabolomics and interpretable machine learning to reveal plasma metabolic profiling and biological correlates of alcohol-dependent inpatients: What about tryptophan metabolism regulation? Front Mol Biosci. 2021;8:760669. doi:10.3389/fmolb.2021.760669
Peng Q, Wilhelmsen KC, Ehlers CL. Common genetic substrates of alcohol and substance use disorder severity revealed by pleiotropy detection against GWAS catalog in two populations. Addict Biol. 2021;26(1):e12877. doi:10.1111/adb.12877
Pinar-Sanchez J, Bermejo López P, Solís García Del Pozo J, et al. Common laboratory parameters are useful for screening for alcohol use disorder: Designing a predictive model using machine learning. J Clin Med. 2022;11(7):2061. doi:10.3390/jcm11072061
Li Y, Li G, Yang L, et al. Connectomics modeling of regional networks of white-matter fractional anisotropy to predict the severity of young adult drinking. Quant Imaging Med Surg. 2025;15(3):2405-2419. doi:10.21037/qims-24-2131
Ebrahimi A, Wiil UK, Mansourvar M, Naemi A, Andersen K, Nielsen AS. Deep neural network to identify patients with alcohol use disorder. Stud Health Technol Inform. 2021;281:238-242. doi:10.3233/SHTI210156
Miranda O, Fan P, Qi X, et al. DeepBiomarker2: Prediction of alcohol and substance use disorder risk in post-traumatic stress disorder patients using electronic medical records and multiple social determinants of health. J Pers Med. 2024;14(1):94. doi:10.3390/jpm14010094
Crocamo C, Viviani M, Bartoli F, Carrà G, Pasi G. Detecting binge drinking and alcohol-related risky behaviours from Twitter’s users: An exploratory content- and topology-based analysis. Int J Environ Res Public Health. 2020;17(5):1510. doi:10.3390/ijerph17051510
Bharat C, Glantz MD, Aguilar-Gaxiola S, et al. Development and evaluation of a risk algorithm predicting alcohol dependence after early onset of regular alcohol use. Addiction. 2023;118(5):954-966. doi:10.1111/add.16122
Bush NJ, Cushnie AK, Sinclair M, et al. Development of an accelerometer-based wearable sensor approach for alcohol consumption detection. Alcohol Clin Exp Res. 2024;48(12):2341-2351. doi:10.1111/acer.15465
Lee S. Development of deep learning auto-encoder algorithms for predicting alcohol use in Korean adolescents based on cross-sectional data. Soc Sci Med. 2025;367:117690. doi:10.1016/j.socscimed.2025.117690
Kamarajan C, Ardekani BA, Pandey AK, et al. Differentiating individuals with and without alcohol use disorder using resting-state fMRI functional connectivity of reward network, neuropsychological performance, and impulsivity measures. Behav Sci. 2022;12(5):128. doi:10.3390/bs12050128
Chen F, Xiao M, Chen C, et al. Discrimination of alcohol dependence based on the convolutional neural network. PLOS One. 2020;15(10):e0241268. doi:10.1371/journal.pone.0241268
Liang X, Justice AC, So-Armah K, Krystal JH, Sinha R, Xu K. DNA methylation signature on phosphatidylethanol, not on self-reported alcohol consumption, predicts hazardous alcohol consumption in two distinct populations. Mol Psychiatry. 2021;26(6):2238-2253. doi:10.1038/s41380-020-0668-x
Derksen M, van Beek M, Blankers M, et al. Effectiveness of machine learning-based adjustments to an eHealth intervention targeting mild alcohol use. Eur Addict Res. 2025;31(1):47-59. doi:10.1159/000543252
Li R, Balakrishnan GP, Nie J, et al. Estimation of blood alcohol concentration from smartphone gait data using neural networks. IEEE Access. 2021;9:61237-61255. doi:10.1109/access.2021.3054515
Ariss T, Fairbairn CE, Bosch N. Examining new-generation transdermal alcohol biosensor performance across laboratory and field contexts. Alcohol Clin Exp Res. 2023;47(1):50-59. doi:10.1111/acer.14977
Marengo D, Azucar D, Giannotta F, Basile V, Settanni M. Exploring the association between problem drinking and language use on Facebook in young adults. Heliyon. 2019;5(10):e02523. doi:10.1016/j.heliyon.2019.e02523
Lin Y, Sharma B, Thompson HM, et al. External validation of a machine learning classifier to identify unhealthy alcohol use in hospitalized patients. Addiction. 2022;117(4):925-933. doi:10.1111/add.15730
Schwebel FJ, Wilson AD, Pearson MR, McCool MW, Witkiewitz K. Finding purpose: Integrated latent profile and machine learning analyses identify purpose in life as an important predictor of high-functioning recovery after alcohol treatment. Addict Behav. 2025;165:108273. doi:10.1016/j.addbeh.2025.108273
Huang T, Elghafari A, Relia K, Chunara R. High-resolution temporal representations of alcohol and tobacco behaviors from social media data. Proc ACM Hum Comput Interact. 2017;1(54):1-26. doi:10.1145/3134689
Ebrahimi A, Wiil UK, Naemi A, Mansourvar M, Andersen K, Nielsen AS. Identification of clinical factors related to prediction of alcohol use disorder from electronic health records using feature selection methods. BMC Med Inform Decis Mak. 2022;22(1):304. doi:10.1186/s12911-022-02051-w
Zhu T, Becquey C, Chen Y, Lejuez CW, Li CR, Bi J. Identifying alcohol misuse biotypes from neural connectivity markers and concurrent genetic associations. Transl Psychiatry. 2022;12(1):253. doi:10.1038/s41398-022-01983-1
Vergara VM, Espinoza FA, Calhoun VD. Identifying alcohol use disorder with resting state functional magnetic resonance imaging data: A comparison among machine learning classifiers. Front Psychol. 2022;13:867067. doi:10.3389/fpsyg.2022.867067
Zhao Q, Paschali M, Dehoney J, et al. Identifying high school risk factors that forecast heavy drinking onset in understudied young adults. Dev Cogn Neurosci. 2024;68:101413. doi:10.1016/j.dcn.2024.101413
Cavicchioli M, Calesella F, Cazzetta S, et al. Investigating predictive factors of dialectical behavior therapy skills training efficacy for alcohol and concurrent substance use disorders: A machine learning study. Drug Alcohol Depend. 2021;224:108723. doi:10.1016/j.drugalcdep.2021.108723
Guleken Z, Sarıbal D, Mırsal H, Cebulski J, Ceylan Z, Depciuch J. Investigating the impact of long-term alcohol consumption on serum chemical changes: Fourier transform infrared spectroscopy for human blood serum. J Biophotonics. 2025;18(5):e202400550. doi:10.1002/jbio.202400550
Ruberu TLM, Kenyon EA, Hudson KA, et al. Joint risk prediction for hazardous use of alcohol, cannabis, and tobacco among adolescents: A preliminary study using statistical and machine learning. Prev Med Rep. 2022;25:101674. doi:10.1016/j.pmedr.2021.101674
Morris LS, Kundu P, Baek K, et al. Jumping the gun: Mapping neural correlates of waiting impulsivity and relevance across alcohol misuse. Biol Psychiatry. 2016;79(6):499-507. doi:10.1016/j.biopsych.2015.06.009
Sania A, Pini N, Nelson ME, et al. K-nearest neighbor algorithm for imputing missing longitudinal prenatal alcohol data. Adv Drug Alcohol Res. 2024;4:13449. doi:10.3389/adar.2024.13449
Andrade FC, Meyerson WU, Hoyle RH. Large-scale longitudinal analysis of the progression of alcohol use among members of a social media platform: An observational study. Am J Drug Alcohol Abuse. 2024;51(1):116-126. doi:10.1080/00952990.2024.2414324
Bae SW, Suffoletto B, Zhang T, et al. Leveraging mobile phone sensors, machine learning, and explainable artificial intelligence to predict imminent same-day binge-drinking events to support just-in-time adaptive interventions: Algorithm development and validation study. JMIR Form Res. 2023;7:e39862. doi:10.2196/39862
Zhang Z, Robinson L, Whelan R, et al. Machine learning models for diagnosis and risk prediction in eating disorders, depression, and alcohol use disorder. J Affect Disord. 2025;379:889-899. doi:10.1016/j.jad.2024.12.053
Wyant K, Sant’Ana SJ, Fronk GE, Curtin JJ. Machine learning models for temporally precise lapse prediction in alcohol use disorder. J Psychopathol Clin Sci. 2024;133(7):527-540. doi:10.1037/abn0000901
Zhu T, Wang W, Chen Y, Kranzler HR, Li CR, Bi J. Machine learning of functional connectivity to biotype alcohol and nicotine use disorders. Biol Psychiatry Cogn Neurosci Neuroimaging. 2024;9(3):326-336. doi:10.1016/j.bpsc.2023.08.010
Symons M, Feeney GFX, Gallagher MR, Young RM, Connor JP. Machine learning vs addiction therapists: A pilot study predicting alcohol dependence treatment outcome from patient data in behavior therapy with adjunctive medication. J Subst Abuse Treat. 2019;99:156-162. doi:10.1016/j.jsat.2019.01.020
Rezapour M, Niazi MKK, Gurcan MN. Machine learning-based analytics of the impact of the Covid-19 pandemic on alcohol consumption habit changes among United States healthcare workers. Sci Rep. 2023;13(1):6003. doi:10.1038/s41598-023-33222-y
Afzali MH, Sunderland M, Stewart S, et al. Machine-learning prediction of adolescent alcohol use: A cross-study, cross-cultural validation. Addiction. 2019;114(4):662-671. doi:10.1111/add.14504
Hinton DJ, Vázquez MS, Geske JR, et al. Metabolomics biomarkers to predict acamprosate treatment response in alcohol-dependent subjects. Sci Rep. 2017;7(1):2496. doi:10.1038/s41598-017-02442-4
Kummerfeld E, Anker JA, Rix A, Kushner MG. Methodological advances in the study of hidden variables: A demonstration on clinical alcohol use disorder data. AMIA Annu Symp Proc. 2018;2018:710-719.
Sangle SB, Kachare PH, Puri DV, Al-Shoubarji I, Jabbari A, Kirner R. Explaining electroencephalogram channel and subband sensitivity for alcoholism detection. Comput Biol Med. 2025;188:109826. doi:10.1016/j.compbiomed.2025.109826
Anuragi A, Singh Sisodia D. Alcohol use disorder detection using EEG signal features and flexible analytical wavelet transform. Biomed Signal Process Control. 2019;52:384-393. doi:10.1016/j.bspc.2018.10.017
Bae S, Chung T, Ferreira D, Dey AK, Suffoletto B. Mobile phone sensors and supervised machine learning to identify alcohol use events in young adults: Implications for just-in-time adaptive interventions. Addict Behav. 2018;83:42-47. doi:10.1016/j.addbeh.2017.11.039
Grodin EN, Montoya AK, Bujarski S, Ray LA. Modeling motivation for alcohol in humans using traditional and machine learning approaches. Addict Biol. 2021;26(3):e12949. doi:10.1111/adb.12949
Lee JY, Song MS, Yoo SY, et al. Multimodal-based machine learning approach to classify features of internet gaming disorder and alcohol use disorder: A sensor-level and source-level resting-state electroencephalography activity and neuropsychological study. Compr Psychiatry. 2024;130:152460. doi:10.1016/j.comppsych.2024.152460
Afshar M, Phillips A, Karnik N, et al. Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: Development and internal validation. J Am Med Inform Assoc. 2019;26(3):254-261. doi:10.1093/jamia/ocy166
Squeglia LM, Ball TM, Jacobus J, et al. Neural predictors of initiating alcohol use during adolescence. Am J Psychiatry. 2017;174(2):172-185. doi:10.1176/appi.ajp.2016.15121587
Sekutowicz M, Guggenmos M, Kuitunen-Paul S, et al. Neural response patterns during Pavlovian-to-instrumental transfer predict alcohol relapse and young adult drinking. Biol Psychiatry. 2019;86(11):857-863. doi:10.1016/j.biopsych.2019.06.028
Mohd Nazri AK, Yahya N, Khan DM, et al. Partial directed coherence analysis of resting-state EEG signals for alcohol use disorder detection using machine learning. Front Neurosci. 2024;18:1524513. doi:10.3389/fnins.2024.1524513
Witkiewitz K, Kirouac M, Baurley JW, McMahan CS. Patterns of drinking behavior around a treatment episode for alcohol use disorder: Predictions from pre-treatment measures. Alcohol Clin Exp Res. 2023;47(11):2138-2148. doi:10.1111/acer.15183
Marcon G, de Ávila Pereira F, Zimerman A, et al. Patterns of high-risk drinking among medical students: A web-based survey with machine learning. Comput Biol Med. 2021;136:104747. doi:10.1016/j.compbiomed.2021.104747
Leenaerts N, Soyster P, Ceccarini J, Sunaert S, Fisher A, Vrieze E. Person-specific and pooled prediction models for binge eating, alcohol use and binge drinking in bulimia nervosa and alcohol use disorder. Psychol Med. 2024;54(10):2758-2773. doi:10.1017/S0033291724000862
Yang JJ, Luo X, Trucco EM, Buu A. Polygenic risk prediction based on singular value decomposition with applications to alcohol use disorder. BMC Bioinform. 2022;23(1):28. doi:10.1186/s12859-022-04566-5
Soyster PD, Ashlock L, Fisher AJ. Pooled and person-specific machine learning models for predicting future alcohol consumption, craving, and wanting to drink: A demonstration of parallel utility. Psychol Addict Behav. 2022;36(3):296-306. doi:10.1037/adb0000666
Symons M, Feeney GFX, Gallagher MR, Young RM, Connor JP. Predicting alcohol dependence treatment outcomes: A prospective comparative study of clinical psychologists versus “trained” machine learning models. Addiction. 2020;115(11):2164-2175. doi:10.1111/add.15038
Kinreich S, McCutcheon VV, Aliev F, et al. Predicting alcohol use disorder remission: A longitudinal multimodal multi-featured machine learning approach. Transl Psychiatry. 2021;11(1):166. doi:10.1038/s41398-021-01281-2
Kamarajan C, Pandey AK, Chorlian DB, et al. Predicting alcohol-related memory problems in older adults: A machine learning study with multi-domain features. Behav Sci. 2023;13(5):427. doi:10.3390/bs13050427
Leaks K, Norden-Krichmar T, Brody JP. Predicting moderate drinking behaviors in National Health and Nutrition Examination Survey participants using biochemical and demographical factors with machine learning. Alcohol. 2023;113:1-10. doi:10.1016/j.alcohol.2023.07.005
Zhang J, Qian S, Su G, Deng C, Yu P. Predicting readmission following hospital treatment for patients with alcohol related diagnoses in an Australian regional health district. Stud Health Technol Inform. 2022;290:1072-1073. doi:10.3233/SHTI220273
Ramos LA, Blankers M, van Wingen G, de Bruijn T, Pauws SC, Goudriaan AE. Predicting success of a digital self-help intervention for alcohol and substance use with machine learning. Front Psychol. 2021;12:734633. doi:10.3389/fpsyg.2021.734633
Seo S, Mohr J, Beck A, Wüstenberg T, Heinz A, Obermayer K. Predicting the future relapse of alcohol-dependent patients from structural and functional brain images. Addict Biol. 2015;20(6):1042-1055. doi:10.1111/adb.12302
Agarwal K, Chaudhary S, Tomasi D, Volkow ND, Joseph PV. Prediction of alcohol intake patterns with olfactory and gustatory brain connectivity networks. Neuropsychopharmacology. 2025;50(7):1167-1175. doi:10.1038/s41386-025-02058-7
Chung T, Suffoletto B, Feldstein Ewing SW, Bhurosy T, Jiang Y, Valera P. Prediction rules identify which young adults have higher rates of heavy episodic drinking after exposure to 12-week text message interventions. Subst Use Addctn J. 2024;45(1):144-149. doi:10.1177/29767342231206653
Gueorguieva R, Wu R, Fucito LM, O’Malley SS. Predictors of abstinence from heavy drinking during follow-up in COMBINE. J Stud Alcohol Drugs. 2015;76(6):935-941. doi:10.15288/jsad.2015.76.935
Wallach JD, Gueorguieva R, Phan H, Witkiewitz K, Wu R, O’Malley SS. Predictors of abstinence, no heavy drinking days, and a 2-level reduction in World Health Organization drinking levels during treatment for alcohol use disorder in the COMBINE study. Alcohol Clin Exp Res. 2022;46(7):1331-1339. doi:10.1111/acer.14877
Zhu X, Du X, Kerich M, Lohoff FW, Momenan R. Random forest based classification of alcohol dependence patients and healthy controls using resting state MRI. Neurosci Lett. 2018;676:27-33. doi:10.1016/j.neulet.2018.04.007
Kamarajan C, Ardekani BA, Pandey AK, et al. Random forest classification of alcohol use disorder using fMRI functional connectivity, neuropsychological functioning, and impulsivity measures. Brain Sci. 2020;10(2):115. doi:10.3390/brainsci10020115
Schwebel FJ, Pearson MR, Richards DK, et al. Regression tree applications to studying alcohol-related problems among college students. Exp Clin Psychopharmacol. 2024;32(5):542-553. doi:10.1037/pha0000718
Duadi D, Yosovich A, Beiderman M, et al. Remote sensing of alcohol consumption using machine learning speckle pattern analysis. J Biomed Opt. 2025;30(3):037001. doi:10.1117/1.JBO.30.3.037001
Fede SJ, Grodin EN, Dean SF, Diazgranados N, Momenan R. Resting state connectivity best predicts alcohol use severity in moderate to heavy alcohol users. NeuroImage Clin. 2019;22:101782. doi:10.1016/j.nicl.2019.101782
Rosato AJ, Chen X, Tanaka Y, et al. Salivary microRNAs identified by small RNA sequencing and machine learning as potential biomarkers of alcohol dependence. Epigenomics. 2019;11(7):739-749. doi:10.2217/epi-2018-0177
Didier NA, King AC, Polley EC, Fridberg DJ. Signal processing and machine learning with transdermal alcohol concentration to predict natural environment alcohol consumption. Exp Clin Psychopharmacol. 2024;32(2):245-254. doi:10.1037/pha0000683
Rane RP, de Man EF, Kim J, et al. Structural differences in adolescent brains can predict alcohol misuse. eLife. 2022;11:e77545. doi:10.7554/eLife.77545
Weidacker K, Kim SG, Buhl-Callesen M, et al. The prediction of resilience to alcohol consumption in youths: Insular and subcallosal cingulate myeloarchitecture. Psychol Med. 2022;52(11):2032-2042. doi:10.1017/S0033291720003852
Dagnew TM, Tseng CJ, Yoo CH, et al. Toward AI-driven neuroepigenetic imaging biomarker for alcohol use disorder: A proof-of-concept study. iScience. 2024;27(7):110159. doi:10.1016/j.isci.2024.110159
Rane RP, Musial MPM, Beck A, et al. Uncontrolled eating and sensation-seeking partially explain the prediction of future binge drinking from adolescent brain structure. NeuroImage Clin. 2023;40:103520. doi:10.1016/j.nicl.2023.103520
Lindner P, Johansson M, Gajecki M, Berman AH. Using alcohol consumption diary data from an internet intervention for outcome and predictive modeling: A validation and machine learning study. BMC Med Res Methodol. 2020;20(1):111. doi:10.1186/s12874-020-00995-z
Amialchuk A, Sapci O, Elhai JD. Applying machine learning methods to model social interactions in alcohol consumption among adolescents. Addict Res Theory. 2021;29(5):436-443. doi:10.1080/16066359.2021.1887147
Lee MR, Sankar V, Hammer A, et al. Using machine learning to classify individuals with alcohol use disorder based on treatment seeking status. EClinicalMedicine. 2019;12:70-78. doi:10.1016/j.eclinm.2019.05.008
Schwebel FJ, Emery NN, Pfund RA, Pearson MR, Witkiewitz K. Using machine learning to examine predictors of treatment goal change among individuals seeking treatment for alcohol use disorder. J Subst Abuse Treat. 2022;140:108825. doi:10.1016/j.jsat.2022.108825
Walters ST, Businelle MS, Suchting R, Li X, Hébert ET, Mun EY. Using machine learning to identify predictors of imminent drinking and create tailored messages for at-risk drinkers experiencing homelessness. J Subst Abuse Treat. 2021;127:108417. doi:10.1016/j.jsat.2021.108417
Roberts W, Zhao Y, Verplaetse T, et al. Using machine learning to predict heavy drinking during outpatient alcohol treatment. Alcohol Clin Exp Res. 2022;46(4):657-666. doi:10.1111/acer.14802
To D, Sharma B, Karnik N, Joyce C, Dligach D, Afshar M. Validation of an alcohol misuse classifier in hospitalized patients. Alcohol. 2020;84:49-55. doi:10.1016/j.alcohol.2019.09.008
Rehm J, Manthey J, Struzzo P, Gual A, Wojnar M. Who receives treatment for alcohol use disorders in the European Union? A cross-sectional representative study in primary and specialized health care. Eur Psychiatry. 2015;30(8):885-893. doi:10.1016/j.eurpsy.2015.07.012
Foster S, Gmel G, Mohler-Kuo M. Young Swiss men’s risky single-occasion drinking: Identifying those who do not respond to stricter alcohol policy environments. Drug Alcohol Depend. 2022;234:109410. doi:10.1016/j.drugalcdep.2022.109410
Kumari D, Swetapadma A. A novel method for predicting time of alcohol use based on personality traits and demographic information. IETE J Res. 2023;69(11):7846-7855. doi:10.1080/03772063.2022.2060874
Ruiz-España S, Ortiz-Ramón R, Pérez-Ramírez Ú, et al. MRI texture-based radiomics analysis for the identification of altered functional networks in alcoholic patients and animal models. Comput Med Imaging Graph. 2023;104:102187. doi:10.1016/j.compmedimag.2023.102187
Adeli E, Zahr NM, Pfefferbaum A, Sullivan EV, Pohl KM. Novel machine learning identifies brain patterns distinguishing diagnostic membership of human immunodeficiency virus, alcoholism, and their comorbidity of individuals. Biol Psychiatry Cogn Neurosci Neuroimaging. 2019;4(6):589-599. doi:10.1016/j.bpsc.2019.02.003
Mumtaz W, Vuong PL, Xia L, Malik AS, Rashid RBA. Automatic diagnosis of alcohol use disorder using EEG features. Knowl-Based Syst. 2016;105:48-59. doi:10.1016/j.knosys.2016.04.026
Bishop CM. Pattern Recognition and Machine Learning. New York, NY: Springer-Verlag; 2006.
White AM. Gender differences in the epidemiology of alcohol use and related harms in the United States. Alcohol Res. 2020;40(2):01. doi:10.35946/arcr.v40.2.01
Carlini LE, Fernandez AC, Mellinger JL. Sex and gender in alcohol use disorder and alcohol-associated liver disease in the United States: A narrative review. Hepatology. 2024;83(1):178-194. doi:10.1097/HEP.0000000000000905
National Institute on Alcohol Abuse and Alcoholism. Alcohol’s Effects on Health: Women and Alcohol. Updated 2025. https://www.niaaa.nih.gov/publications/brochures-and-fact-sheets/women-and-alcohol.
Keyes KM. Age, period, and cohort effects in alcohol use in the United States in the 20th and 21st centuries: Implications for the coming decades. Alcohol Res. 2022;42(1):02. doi:10.35946/arcr.v42.1.02
Zucker RA. Anticipating problem alcohol use developmentally from childhood into middle adulthood: What have we learned? Addiction. 2008;103(suppl 1):100-108. doi:10.1111/j.1360-0443.2008.02179.x
Dawson DA, Goldstein RB, Chou SP, Ruan WJ, Grant BF. Age at first drink and the first incidence of adult-onset DSM-IV alcohol use disorders. Alcohol Clin Exp Res. 2008;32(12):2149-2160. doi:10.1111/j.1530-0277.2008.00806.x
Merrill JE, Carey KB. Drinking over the lifespan: Focus on college ages. Alcohol Res. 2016;38(1):103-114. doi:10.35946/arcr.v38.1.13
Rohde P, Lewinsohn PM, Kahler CW, Seeley JR, Brown RA. Natural course of alcohol use disorders from adolescence to young adulthood. J Am Acad Child Adolesc Psychiatry. 2001;40(1):83-90. doi:10.1097/00004583-200101000-00020
Sher KJ, Gotham HJ. Pathological alcohol involvement: A developmental disorder of young adulthood. Dev Psychopathol. 1999;11(4):933-956. doi:10.1017/s0954579499002394
Patrick ME, Terry-McElrath YM, Lanza ST, Jager J, Schulenberg JE, O’Malley PM. Shifting age of peak binge drinking prevalence: Historical changes in normative trajectories among young adults aged 18 to 30. Alcohol Clin Exp Res. 2019;43(2):287-298. doi:10.1111/acer.13933
Patrick ME, Pang YC, Jang BJ, Arterberry BJ, Terry-McElrath YM. Alcohol use disorder symptoms reported during midlife: Results from the Monitoring the Future study among US adults at modal ages 50, 55, and 60. Subst Use Misuse. 2023;58(3):380-388. doi:10.1080/10826084.2022.2161826
Hingson RW, Zha W. Age of drinking onset, alcohol use disorders, frequent heavy drinking, and unintentionally injuring oneself and others after drinking. Pediatrics. 2009;123(6):1477-1484. doi:10.1542/peds.2008-2176
Chikritzhs T, Livingston M. Alcohol and the risk of injury. Nutrients. 2021;13(8):2777. doi:10.3390/nu13082777
Institute for Health Metrics and Evaluation. Findings from the Global Burden of Disease Study 2017. 2018. https://www.healthdata.org/sites/default/files/files/policy_report/2019/GBD_2017_Booklet.pdf.
Manthey J, Hassan SA, Carr S, Kilian C, Kuitunen-Paul S, Rehm J. What are the economic costs to society attributable to alcohol use? A systematic review and modelling study. Pharmacoeconomics. 2021;39(7):809-822. doi:10.1007/s40273-021-01031-8
Thabtah F, Hammoud S, Kamalov F, Gonsalves A. Data imbalance in classification: Experimental evaluation. Inf Sci. 2020;513:429-441. doi:10.1016/j.ins.2019.11.004
Ekhtiari H, Sangchooli A, Carmichael O, et al. Neuroimaging biomarkers in addiction. medRxiv. 2024. doi:10.1101/2024.09.02.24312084
Volkow ND, Baler RD. Brain imaging biomarkers to predict relapse in alcohol addiction. JAMA Psychiatry. 2013;70(7):661-663. doi:10.1001/jamapsychiatry.2013.1141
Hargreaves TL, McIntyre-Wood C, Vandehei E, et al. Brain structural magnetic resonance imaging predictors of brief intervention response in individuals with alcohol use disorder. Alcohol Alcohol. 2025;60(3):agaf009. doi:10.1093/alcalc/agaf009
Elliott ML, Knodt AR, Ireland D, et al. What is the test-retest reliability of common task-functional MRI measures? New empirical evidence and a meta-analysis. Psychol Sci. 2020;31(7):792-806. doi:10.1177/0956797620916786
Marek S, Tervo-Clemmens B, Calabro FJ, et al. Reproducible brain-wide association studies require thousands of individuals. Nature. 2022;603(7902):654-660. doi:10.1038/s41586-022-04492-9
Tomasi D, Volkow ND. Association between brain activation and functional connectivity. Cereb Cortex. 2019;29(5):1984-1996. doi:10.1093/cercor/bhy077
Passaro AD, Vettel JM, McDaniel J, Lawhern V, Franaszczuk PJ, Gordon SM. A novel method linking neural connectivity to behavioral fluctuations: Behavior-regressed connectivity. J Neurosci Methods. 2017;279:60-71. doi:10.1016/j.jneumeth.2017.01.010
Mascarell Maričić L, Walter H, Rosenthal A, et al. The IMAGEN study: A decade of imaging genetics in adolescents. Mol Psychiatry. 2020;25(11):2648-2671. doi:10.1038/s41380-020-0822-5
Hua J, Xiong Z, Lowey J, Suh E, Dougherty ER. Optimal number of features as a function of sample size for various classification rules. Bioinformatics. 2005;21(8):1509-1515. doi:10.1093/bioinformatics/bti171
Ebrahimi A, Wiil UK, Schmidt T, et al. Predicting the risk of alcohol use disorder using machine learning: A systematic literature review. IEEE Access. 2021;9:151697-151712. doi:10.1109/ACCESS.2021.3126777
Zantvoort K, Nacke B, Görlich D, Hornstein S, Jacobi C, Funk B. Estimation of minimal data sets sizes for machine learning predictions in digital mental health interventions. NPJ Digit Med. 2024;7(1):361. doi:10.1038/s41746-024-01360-w
Chen T, Guestrin C. XGBoost: A scalable tree boosting system. Presented at Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 13-17, 2016; San Francisco, CA. doi:10.1145/2939672.2939785
Shwartz-Ziv R, Armon A. Tabular data: Deep learning is not all you need. Inf Fusion. 2022;81:84-90. doi:10.1016/j.inffus.2021.11.011
Grinsztajn L, Oyallon E, Varoquaux G. Why do tree-based models still outperform deep learning on typical tabular data? Presented at Proceedings of the 36th International Conference on Neural Information Processing Systems. NeurIPS Proceedings. November 28-December 9, 2022; New Orleans, LA. https://proceedings.neurips.cc/paper_files/paper/2022/file/0378c7692da36807bdec87ab043cdadc-Paper-Datasets_and_Benchmarks.pdf.
Strobl C, Boulesteix AL, Zeileis A, Hothorn T. Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinform. 2007;8:25. doi:10.1186/1471-2105-8-25
Stihec J. Choose your AI weapon: Deep learning or traditional machine learning. 2024. https://shelf.io/blog/choose-your-ai-weapon-deep-learning-or-traditional-machine-learning.
Abgrall G, Holder AL, Chelly Dagdia Z, Zeitouni K, Monnet X. Should AI models be explainable to clinicians? Crit Care. 2024;28(1):301. doi:10.1186/s13054-024-05005-y
Shivashankar K, Al Hajj GA, Martini A. Maintainability and scalability in machine learning: Challenges and solutions. ACM Comput Surv. 2025;57(12):Article 318. doi:10.1145/3736751
Woo C-W, Chang LJ, Lindquist MA, Wager TD. Building better biomarkers: Brain models in translational neuroimaging. Nat Neurosci. 2017;20(3):365-377. doi:10.1038/nn.4478
Rashid B, Calhoun V. Towards a brain-based predictome of mental illness. Hum Brain Mapp. 2020;41(12):3468-3535. doi:10.1002/hbm.25013
Delile J, Mukherjee S, Mueller J, Khalil I, Zhukov L, Meier C. Foundation models in drug discovery: Phenomenal growth today, transformative potential tomorrow? Drug Discov Today. 2025;30(12):104518. doi:10.1016/j.drudis.2025.104518
Duffy G, Clarke SL, Christensen M, et al. Confounders mediate AI prediction of demographics in medical imaging. NPJ Digit Med. 2022;5(1):188. doi:10.1038/s41746-022-00720-8
Mayhugh RE, Moussa MN, Simpson SL, et al. Moderate-heavy alcohol consumption lifestyle in older adults is associated with altered central executive network community structure during cognitive task. PLOS One. 2016;11(8):e0160214. doi:10.1371/journal.pone.0160214
Rosenblatt M, Tejavibulya L, Jiang R, Noble S, Scheinost D. Data leakage inflates prediction performance in connectome-based machine learning models. Nat Commun. 2024;15(1):1829. doi:10.1038/s41467-024-46150-w
Hamdan S, Love BC, von Polier GG, et al. Confound-leakage: Confound removal in machine learning leads to leakage. GigaScience. 2023;12:giad071. doi:10.1093/gigascience/giad071
Zhao Q, Adeli E, Pfefferbaum A, Sullivan EV, Pohl KM. Confounder-aware visualization of ConvNets. In: Machine Learning in Medical Imaging. Cham, Switzerland: Springer; 2019;11861:328-336. doi:10.1007/978-3-030-32692-0_38
Eshaghzadeh Torbati M, Minhas DS, Ahmad G, et al. A multi-scanner neuroimaging data harmonization using RAVEL and ComBat. Neuroimage. 2021;245:118703. doi:10.1016/j.neuroimage.2021.118703
Zhao H, des Combes RT, Zhang K, Gordon GJ. On learning invariant representations for domain adaptation. Presented at 36th International Conference on Machine Learning; June 9-15, 2019; Long Beach, CA.
Zhao Q, Adeli E, Pohl KM. Training confounder-free deep learning models for medical applications. Nat Commun. 2020;11(1):6010. doi:10.1038/s41467-020-19784-9
Chen RJ, Wang JJ, Williamson DFK, et al. Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat Biomed Eng. 2023;7(6):719-742. doi:10.1038/s41551-023-01056-8
Castelnovo A, Crupi R, Greco G, Regoli D, Penco IG, Cosentini AC. A clarification of the nuances in the fairness metrics landscape. Sci Rep. 2022;12(1):4209. doi:10.1038/s41598-022-07939-1
Chyzhyk D, Varoquaux G, Milham M, Thirion B. How to remove or control confounds in predictive models, with applications to brain biomarkers. GigaScience. 2022;11:giac014. doi:10.1093/gigascience/giac014
Zhao Q, Nooner KB, Tapert SF, et al. The transition from homogeneous to heterogeneous machine learning in neuropsychiatric research. Biol Psychiatry Glob Open Sci. 2025;5(1):100397. doi:10.1016/j.bpsgos.2024.100397
Peng W, Bosschieter T, Ouyang J, et al. Metadata-conditioned generative models to synthesize anatomically-plausible 3D brain MRIs. Med Image Anal. 2024;98:103325. doi:10.1016/j.media.2024.103325
Hardoon DR, Szedmak S, Shawe-Taylor J. Canonical correlation analysis: An overview with application to learning methods. Neural Comput. 2004;16(12):2639-2664. doi:10.1162/0899766042321814
Zhao Q, Milecki L, Kuceyeski A, et al. Socioemotional and executive control mismatch in adolescence heightens risks for initiating drinking. JAMA Netw Open. 2025;8(9):e2531378. doi:10.1001/jamanetworkopen.2025.31378
Ullman JB, Bentler PM. Structural equation modeling. In: Weiner IB, ed. Handbook of Psychology. 2nd ed. Hoboken, NJ: John Wiley & Sons, Inc.; 2012:661-690. doi:10.1002/9781118133880.hop202023
AI for Mental Health. Stanford. http://ai4mh.stanford.edu.
Han R, Acosta JN, Shakeri Z, Ioannidis JPA, Topol EJ, Rajpurkar P. Randomised controlled trials evaluating artificial intelligence in clinical practice: A scoping review. Lancet Digit Health. 2024;6(5):e367-e373. doi:10.1016/S2589-7500(24)00047-5
Wu K, Wu E, Theodorou B, et al. Characterizing the clinical adoption of medical AI devices through U.S. insurance claims. NEJM AI. 2024;1(1):AIoa2300030. doi:10.1056/AIoa2300030
Muralidharan V, Adewale BA, Huang CJ, et al. A scoping review of reporting gaps in FDA-approved AI medical devices. NPJ Digit Med. 2024;7(1):273. doi:10.1038/s41746-024-01270-x
U.S. Food and Drug Administration. Artificial intelligence-enabled medical devices. 2025. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-enabled-medical-devices.
Sempionatto JR, Brazaca LC, García-Carmona L, et al. Eyeglasses-based tear biosensing system: Non-invasive detection of alcohol, vitamins and glucose. Biosens Bioelectron. 2019;137:161-170. doi:10.1016/j.bios.2019.04.058
Campbell AS, Kim J, Wang J. Wearable electrochemical alcohol biosensors. Curr Opin Electrochem. 2018;10:126-135. doi:10.1016/j.coelec.2018.05.014
Gallitto G, Englert R, Kincses B, et al. External validation of machine learning models-registered models and adaptive sample splitting. GigaScience. 2025;14:giaf036. doi:10.1093/gigascience/giaf036
Wiggins WF, Tejani AS. On the opportunities and risks of foundation models for natural language processing in radiology. Radiol Artif Intell. 2022;4(4):e220119. doi:10.1148/ryai.220119

Appendices

Appendix 1. Characteristics of the 110 Peer-Reviewed Human Studies Identified in This Article
Author	Title	AI Model	Sample Size	Predictor Category	Number of Predictor Variables	Area Under the Curve	External Data Use	Mean Age (Years)	Male Proportion
Alcohol Concentration
Fairbairn (2025)⁵⁷	A wearable alcohol biosensor: exploring the accuracy of transdermal drinking detection	Ensemble method	100	Sensor		96%	No	24.2	50%
Oszkinat (2022)⁶²	An abstract parabolic system-based physics-informed long short-term memory network for estimating breath alcohol concentration from transdermal alcohol biosensor data	LSTM	40	Sensor	26		No	27	50%
Li (2021)⁸⁸	Estimation of blood alcohol concentration from smartphone gait data using neural networks	Deep learning	65	Sensor	48		No	30.8	38%
Ariss (2023)⁸⁹	Examining new-generation transdermal alcohol biosensor performance across laboratory and field contexts	ExtraTrees	256	Sensor			Yes	25.5	45%
AUD Diagnosis
Guggenmos (2020)⁵⁵	A multimodal neuroimaging classifier for alcohol dependence	Ensemble method	216	Neuroimaging	1,791	79%	No	45	16%
Mulholland (2023)⁵⁹	Adapter protein complex 2 in the orbitofrontal cortex predicts alcohol use disorder	sPLS-DA	28	Specimen	4,503	100%	No	55	50%
Anuragi (2019)¹¹⁴	Alcohol use disorder detection using EEG signal features and flexible analytic wavelet transform	SVM	122	Neuroimaging	15	99%	No
Mumtaz (2017)⁶⁴	An EEG-based machine learning method to screen alcohol use disorder	Logistic regression	45	Neuroimaging	4		No	51.1
Zhang (2022)⁶⁷	Association between abnormal plasma metabolism and brain atrophy in alcohol-dependent patients	XGBoost	226	Specimen	26	77%	No	40	100%
Komarnyckyj (2022)⁶⁸	At-risk alcohol users have disrupted valence discrimination during reward anticipation	Multivariate discriminant analysis	44	Neuroimaging			No	23.8	54%
Song (2024)⁶⁹	Atypical effective connectivity from the frontal cortex to striatum in alcohol use disorder	Uncertain	62	Neuroimaging		97%	No	38	100%
Mumtaz (2016)¹⁶¹	Automatic diagnosis of alcohol use disorder using EEG features	Logistic model trees	45	Neuroimaging	133	97%	No	49
Besong (2024)⁷⁰	Brain lncRNA-mRNA co-expression regulatory networks and alcohol use disorder	Random forest	24	Gene	10	89%	No	48.4	50%
Zhu (2021)⁷⁴	Combining metabolomics and interpretable machine learning to reveal plasma metabolic profiling and biological correlates of alcohol-dependent inpatients: What about tryptophan metabolism regulation?	Decision tree	85	Specimen	39	93%	No	45.5	100%
Ebrahimi (2021)⁷⁸	Deep neural network to identify patients with alcohol use disorder	MLP	2,571	EHR	857	97%	No
Kamarajan (2022)⁸⁴	Differentiating individuals with and without alcohol use disorder using resting-state fMRI functional connectivity of reward network, neuropsychological performance, and impulsivity measures	Random forest	60	Neuroimaging, behavior	47	93%	No	34	100%
Chen (2020)⁸⁵	Discrimination of alcohol dependence based on the convolutional neural network	Deep learning	317	Gene, demographic	20	92%	No	40
Sangle (2025)¹¹³	Explaining electroencephalogram channel and subband sensitivity for alcoholism detection	Neural network	122	Neuroimaging	320	99%	No	35.8
Ebrahimi (2022)⁹⁴	Identification of clinical factors related to prediction of alcohol use disorder from electronic health records using feature selection methods	Random forest	2,571	EHR	272	96%	No	70	47%
Vergara (2022)⁹⁶	Identifying alcohol use disorder with resting state functional magnetic resonance imaging data: A comparison among machine learning classifiers	SVM, MLP, logistic regression, random forest	102	Neuroimaging	561	79%	No	34.6	65%
Guleken (2025)⁹⁹	Investigating the impact of long-term alcohol consumption on serum chemical changes: Fourier-transform infrared spectroscopy for human blood serum	SVM	71	Specimen	2,493	100%	No
Morris (2016)¹⁰¹	Jumping the gun: mapping neural correlates of waiting impulsivity and relevance across alcohol misuse	SVM	134	Neuroimaging		60%	No	32.4	53%
Zhang (2025)¹⁰⁵	Machine learning models for diagnosis and risk prediction in eating disorders, depression, and alcohol use disorder	Elastic net	258	Behavior, environment, mental health, substance use, physical	33	80%	Yes	22.6	42%
Zhu (2024)¹⁰⁷	Machine learning of functional connectivity to biotype alcohol and nicotine use disorders	MLP	850	Neuroimaging	35,778	76%	No	54	72%
Kummerfeld (2018)¹¹²	Methodological advances in the study of hidden variables: A demonstration on clinical alcohol use disorder data	GFCI	362	Substance use, mental health	10		No
Ruiz-España (2023)¹⁵⁹	MRI texture-based radiomics analysis for the identification of altered functional networks in alcoholic patients and animal model	SVM	68	Neuroimaging	43	82%	Yes	43.9	100%
Lee (2024)¹¹⁷	Multimodal-based machine learning approach to classify features of Internet gaming disorder and alcohol use disorder: A sensor-level and source-level resting-state electroencephalography activity and neuropsychological study	Logistic regression	124	Neuroimaging, behavior, mental health, demographic	45		No	26.4	82%
Adeli (2019)¹⁶⁰	Novel machine learning identifies brain patterns distinguishing diagnostic membership of human immunodeficiency virus, alcoholism, and their comorbidity of individuals	Logistic regression	421	Neuroimaging	298	70%	No	47.5	62.4%
Mohd Nazri (2025)¹²¹	Partial directed coherence analysis of resting-state EEG signals for alcohol use disorder detection using machine learning	SVM	70	Neuroimaging	324	98.8%	No	48	56%
Yang (2022)¹²⁵	Polygenic risk prediction based on singular value decomposition with applications to alcohol use disorder	Linear regression	11,982	Gene			Yes
Kamarajan (2023)¹²⁹	Predicting alcohol-related memory problems in older adults: A machine learning study with multidomain features	Random forest	94	Neuroimaging, behavior, alcohol use, gene	72	88%	No	40	55%
Kinreich (2021)⁷	Predicting risk for alcohol use disorder using longitudinal data with multimodal biomarkers and family history: A machine learning study	SVM	656	Neuroimaging, substance use, gene	371	91%	No		57%
Zhu (2018)¹³⁸	Random forest based classification of alcohol dependence patients and healthy controls using resting state MRI	Random forest	92	Neuroimaging	496	73%	No	36	64%
Kamarajan (2020)¹³⁹	Random forest classification of alcohol use disorder using fMRI functional connectivity, neuropsychological functioning, and impulsivity measures	Random forest	60	Neuroimaging, behavior	83	76%	No	34	100%
Rosato (2019)¹⁴³	Salivary microRNAs identified by small RNA sequencing and machine learning as potential biomarkers of alcohol dependence	Random forest	120	Gene	5	75%	No	41	46%
Dagnew (2024)¹⁴⁷	Toward AI-driven neuro-epigenetic imaging biomarker for alcohol use disorder: A proof-of-concept study	SVM	27	Neuroimaging	5	83%	No	38.8	70%
Lee (2019)¹⁵¹	Using machine learning to classify individuals with alcohol use disorder based on treatment seeking status	Decision tree	1,014	Environment, behavior, mental health, specimen	178	73%	Yes	43.3	70.6%
To (2020)¹⁵⁵	Validation of an alcohol misuse classifier in hospitalized patients	LASSO	1,000	EHR	10,000	91%	No	49
Consumption Pattern
O’Halloran (2018)⁵¹	A combination of impulsivity subdomains predict alcohol intoxication frequency	Elastic net	106	Substance use, behavior	18		No	19.5	56%
Kim (2021)⁵²	A deep learning algorithm to predict hazardous drinkers and the severity of alcohol-related problems using K-NHANES	MLP	69,187	Physical, specimen	325	87%	No
Bonnell (2020)⁵³	A machine learning approach to identification of unhealthy drinking	Random forest	43,545	Demographic, specimen	15	78%	No	49	48.6%
Johnson (2024)⁵⁴	A machine learning model for the prediction of unhealthy alcohol use among women of childbearing age in Alabama	SVM	2,397	Demographic, mental health	2		No	30.6	0%
Kumari (2022)¹⁵⁸	A novel method for predicting time of alcohol use based on personality traits and demographic information	Random forest		Behavior, demographic		77%
May (2022)⁵⁶	A prospective investigation of youth alcohol experimentation and reward responsivity in the ABCD study	SVM, linear regression	7,409	Neuroimaging, substance use	9	56%	No	10	50%
Sun (2023)⁶⁰	Adolescent alcohol use is linked to disruptions in age-appropriate cortical thinning: an unsupervised machine learning approach	NMF	657	Neuroimaging			No	15.6	50%
Park (2018)⁶¹	Alcohol use effects on adolescent brain development revealed by simultaneously removing confounding factors, identifying morphometric patterns, and classifying individuals	Linear regression	750	Neuroimaging	144	83.2%	No	15.8	47%
Curtis (2018)⁷¹	Can Twitter be used to predict county excessive alcohol consumption rates?	Ridge regression	1,384	Social media	2,000		No
Stevely (2021)⁷³	Combinations of drinking occasion characteristics associated with units of alcohol consumed among British adults: an event-level decision tree modeling study	Linear regression, decision tree	18,409	Substance use, behavior	53		No
Li (2025)⁷⁷	Connectomics modeling of regional networks of white-matter fractional anisotropy to predict the severity of young adult drinking	Linear regression	949	Neuroimaging	13,456		No	28.8	48%
Crocamo (2020)⁸⁰	Detecting binge drinking and alcohol-related risky behaviors from Twitter’s users: an exploratory content- and topology-based analysis	SVM	500	Social media	11	76%	No
Bush (2024)⁸²	Development of an accelerometer-based wearable sensor approach for alcohol consumption detection	Random forest	194	Sensor	3		No	37.1	33%
Lee (2025)⁸³	Development of deep learning auto-encoder algorithms for predicting alcohol use in Korean adolescents based on cross-sectional data	MLP	41,239	Physical, behavior, environment	46	72%	No	15	57%
Liang (2021)⁸⁶	DNA methylation signature on phosphatidylethanol, not on self-reported alcohol consumption, predicts hazardous alcohol consumption in two distinct populations	Elastic net	1,549	Gene	143	74%	Yes	42	79%
Marengo (2019)⁹⁰	Exploring the association between problem drinking and language use on Facebook in young adults	Random forest	296	Social media	69		No	28.4	33%
Lin (2022)⁹¹	External validation of a machine learning classifier to identify unhealthy alcohol use in hospitalized patients	Logistic regression	57,605	EHR	54,000	91%	Yes	61	42%
Huang (2017)⁹³	High-resolution temporal representations of alcohol and tobacco behaviors from social media data	Logistic regression		Social media		80%	No
Zhu (2022)⁹⁵	Identifying alcohol misuse biotypes from neural connectivity markers and concurrent genetic associations	MLP	739	Neuroimaging	521	70%	No	28.5	43%
Sania (2025)¹⁰²	K-nearest neighbor algorithm for imputing missing longitudinal prenatal alcohol data	kNN	11,083	Substance use	325		No		0%
Rezapour (2023)¹⁰⁹	Machine learning-based analytics of the impact of the Covid-19 pandemic on alcohol consumption habit changes among U.S. healthcare workers	XGBoost	273	Mental health, behavior	25	91%	No	38	30%
Bae (2018)¹¹⁵	Mobile phone sensors and supervised machine learning to identify alcohol use events in young adults: implications for just-in-time adaptive interventions	Random forest	38	Sensor	56	96%	No	23	60%
Grodin (2021)¹¹⁶	Modeling motivation for alcohol in humans using traditional and machine learning approaches	Random forest	67	Demographic, substance use, behavior	32		No	29	54%
Afshar (2019)¹¹⁸	Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: development and internal validation	Logistic regression	1,422	EHR	16	78%	No	44	70%
Squeglia (2017)¹¹⁹	Neural predictors of initiating alcohol use during adolescence	Random forest	137	Neuroimaging, demographic, mental health, behavior	34		No	13	55%
Marcon (2021)¹²³	Patterns of high-risk drinking among medical students: A web-based survey with machine learning	Elastic net	4,840	Demographic, environment, substance use	33	72%	No	21.8	24%
Soyster (2022)¹²⁶	Pooled and person-specific machine learning models for predicting future alcohol consumption, craving, and wanting to drink: A demonstration of parallel utility	Elastic net	33	Behavior, mental health, substance use	40	76%	No	19	7%
Leaks (2023)¹³⁰	Predicting moderate drinking behaviors in National Health and Nutrition Examination Survey participants using biochemical and demographical factors with machine learning	Ensemble method	4,219	Demographic, specimen	33	80%	No	51.9	43%
Agarwal (2025)¹³⁴	Prediction of alcohol intake patterns with olfactory and gustatory brain connectivity networks	Linear regression	1,003	Neuroimaging	41		No	28.7	54%
Schwebel (2024)¹⁴⁰	Regression tree applications to studying alcohol-related problems among college students	Decision tree	5,090	Demographic, substance use, behavior, mental health	71		Yes	20.8	29%
Duadi (2025)¹⁴¹	Remote sensing of alcohol consumption using machine learning speckle pattern analysis	XGBoost	5	Sensor	4	88%	No	35.2	30%
Fede (2019)¹⁴²	Resting state connectivity best predicts alcohol use severity in moderate to heavy alcohol users	Random forest	83	Neuroimaging, demographic	922		No	41	63%
Didier (2024)¹⁴⁴	Signal processing and machine learning with transdermal alcohol concentration to predict natural environment alcohol consumption	Elastic net	36	Sensor	11	98%	No	27.3	50%
Foster (2022)¹⁵⁷	Young Swiss men’s risky single-occasion drinking: identifying those who do not respond to stricter alcohol policy environments	Random forest	5,986	Demographic, mental health, behavior, environment	21		No	20	100%
Prognostic
Amialchuk (2021)¹⁵⁰	Applying machine learning methods to model social interactions in alcohol consumption among adolescents	Gradient boosting	4,686	Demographic, substance use	27	80%	No	14.9	48%
Uceta (2024)⁷²	Clustering electrophysiological predisposition to binge drinking: an unsupervised machine learning analysis	Clustering	103	Neuroimaging	78		No	13.7	48%
Pinar-Sanchez (2022)⁷⁶	Common laboratory parameters are useful for screening for alcohol use disorder: designing a predictive model using machine learning	Naive Bayes	337	Specimen	60		No	44	75%
Miranda (2024)⁷⁹	DeepBiomarker2: prediction of alcohol and substance use disorder risk in post-traumatic stress disorder patients using electronic medical records and multiple social determinants of health	Deep learning	15,612	Environment, behavior, mental health		93%	No	36	27%
Bharat (2023)⁸¹	Development and evaluation of a risk algorithm predicting alcohol dependence after early onset of regular alcohol use	Ensemble method	6,526	Demographic, substance use, mental health, environment	59	78%	No	15	50%
Zhao (2024)⁹⁷	Identifying high school risk factors that forecast heavy drinking onset in understudied young adults	SVM	106	Mental health, behavior, environment	27	74%	No	16.1	41%
Ruberu (2021)¹⁰⁰	Joint risk prediction for hazardous use of alcohol, cannabis, and tobacco among adolescents: A preliminary study using statistical and machine learning	LASSO regression	270	Demographic, environment, substance use	18		No	15.5	74%
Andrade (2024)¹⁰³	Large-scale longitudinal analysis of the progression of alcohol use among members of a social media platform: an observational study	Random forest	4,160	Social media	10,006	92%	No
Bae (2023)¹⁰⁴	Leveraging mobile phone sensors, machine learning, and explainable artificial intelligence to predict imminent same-day binge-drinking events to support just-in-time adaptive interventions: algorithm development and validation study	XGBoost	75	Sensor	70		No	22.4	29%
Afzali (2019)¹¹⁰	Machine-learning prediction of adolescent alcohol use: a cross-study, cross-cultural validation	Elastic net	3,826	Demographic, mental health, behavior	27	87%	Yes	12.8	50.8%
Leenaerts (2024)¹²⁴	Person-specific and pooled prediction models for binge eating, alcohol use and binge drinking in bulimia nervosa and alcohol use disorder	Elastic net	70	Environment, social, behavior, mental health, substance use	110	91.2%	No	21	0%
Chung (2024)¹³⁵	Prediction rules identify which young adults have higher rates of heavy episodic drinking after exposure to 12-week text message interventions	Ensemble method	1,131	Substance use	21		No	22.1	32%
Rane (2022)¹⁴⁵	Structural differences in adolescent brains can predict alcohol misuse	SVM, logistic regression, gradient boosting	1,182	Neuroimaging	719	75%	No	14	47%
Weidacker (2022)¹⁴⁶	The prediction of resilience to alcohol consumption in youths: insular and subcallosal cingulate myeloarchitecture	RVR	86	Neuroimaging	3		No	21.76	67%
Rane (2023)¹⁴⁸	Uncontrolled eating and sensation-seeking partially explain the prediction of future binge drinking from adolescent brain structure	SVM	555	Neuroimaging	719	73%	No	28	47%
Relapse
Lin (2020)⁶³	An analysis of the effect of mu-opioid receptor gene (OPRM1) promoter region DNA methylation on the response of naltrexone treatment of alcohol dependence	Random forest	93	Gene	32		No	50	100%
Wyant (2024)¹⁰⁶	Machine learning models for temporally precise lapse prediction in alcohol use disorder	XGBoost	151	Substance use, mental health, behavior	286	91%	No	41	51%
Sekutowicz (2019)¹²⁰	Neural response patterns during Pavlovian-to-instrumental transfer predict alcohol relapse and young adult drinking	SVM	52	Neuroimaging	350		No	44.5	27%
Seo (2015)¹³³	Predicting the future relapse of alcohol-dependent patients from structural and functional brain images	RSLVQ	46	Neuroimaging	48	79%	No	40.7	65%
Schwebel (2022)¹⁵²	Using machine learning to examine predictors of treatment goal change among individuals seeking treatment for alcohol use disorder	Random forest, decision tree	441	Demographic, substance use, mental health, environment	111		No	34.5	58.3%
Treatment Response
Smink (2021)⁶⁵	Analysis of the emails from the Dutch web-based intervention “Alcohol de Baas”: assessment of early indications of drop-out in an online alcohol abuse intervention	Decision tree	770	Social media	304		No	46	44%
Collin (2024)⁶⁶	Analyzing dropout in alcohol recovery programs: A machine learning approach	SVM	31,087	Mental health	19		Yes		82.7%
Derksen (2025)⁸⁷	Effectiveness of machine learning-based adjustments to an eHealth intervention targeting mild alcohol use	Random forest	234	Behavior	6		No	58.2
Schwebel (2025)⁹²	Finding purpose: integrated latent profile and machine learning analyses identify purpose in life as an important predictor of high-functioning recovery after alcohol treatment	Random forest	809	Mental health, physical health, environment, alcohol use, other substance use, demographics	28	69%	No	40.3	70%
Cavicchioli (2021)⁹⁸	Investigating predictive factors of dialectical behavior therapy skills training efficacy for alcohol and concurrent substance use disorders: A machine learning study	Elastic net	275	Substance use, mental health	30	71%	No	47	59%
Symons (2019)¹⁰⁸	Machine learning vs. addiction therapists: A pilot study predicting alcohol dependence treatment outcome from patient data in behavior therapy with adjunctive medication	Decision tree	830	Demographic, substance use, mental health		57%	Yes	41
Hinton (2017)¹¹¹	Metabolomics biomarkers to predict acamprosate treatment response in alcohol-dependent subjects	LASSO regression	120	Demographic, mental health, substance use, specimen		65%	No	45	68%
Witkiewitz (2023)¹²²	Patterns of drinking behavior around a treatment episode for alcohol use disorder: predictions from pretreatment measures	XGBoost	1,726	Demographic, substance use, behavior, physical, mental health, environment	33		Yes	20	75%
Symons (2020)¹²⁷	Predicting alcohol dependence treatment outcomes: a prospective comparative study of clinical psychologists vs. “trained” machine learning models	Logistic regression	1,236	Demographic, substance use	31	64%	Yes	40	66%
Kinreich (2021)¹²⁸	Predicting alcohol use disorder remission: a longitudinal multimodal multifeatured machine learning approach	SVM	1,376	Neuroimaging, gene, demographic	10,391	87%	No	33	60%
Zhang (2022)¹³¹	Predicting readmission following hospital treatment for patients with alcohol related diagnoses in an Australian regional health district	SVM, random forest					No
Ramos (2021)¹³²	Predicting success of a digital self-help intervention for alcohol and substance use with machine learning	Random forest	2,126	Behavior	31	71%	No
Gueorguieva (2015)¹³⁶	Predictors of abstinence from heavy drinking during follow-up in COMBINE	Logistic regression	1,150	Demographic, substance use, behavior, physical	100	74%	No
Wallach (2022)¹³⁷	Predictors of abstinence, no heavy drinking days, and a 2-level reduction in World Health Organization drinking levels during treatment for alcohol use disorder in the COMBINE study	Decision tree	1,168	Substance use, physical, demographic, specimen, mental health	89		No
Lindner (2020)¹⁴⁹	Using alcohol consumption diary data from an Internet intervention for outcome and predictive modeling: a validation and machine learning study	Random forest	607	Substance use	18	60%	No
Walters (2021)¹⁵³	Using machine learning to identify predictors of imminent drinking and create tailored messages for at-risk drinkers experiencing homelessness	Gradient boosting	78	Behavior	7	87%	No	46.2	84%
Roberts (2022)¹⁵⁴	Using machine learning to predict heavy drinking during outpatient alcohol treatment	Random forest	1,383	Demographic, substance use, mental health	154	77%	No	44	69%
Rehm (2015)¹⁵⁶	Who receives treatment for alcohol use disorders in the European Union? A cross-sectional representative study in primary and specialized health care	Logistic regression	8,476	Substance use, mental health			No

Note. ABCD, Adolescent Brain Cognitive Development; AI, artificial intelligence; AUD, alcohol use disorder; EHR, electronic health record; GFCI, greedy fast causal inference; kNN, k-nearest neighbor; LASSO, least absolute shrinkage and selection operator; LSTM, long short-term memory; MLP, multilayer perceptron; MRI, magnetic resonance imaging; NMF, nonnegative matrix factorization; RSLVQ, robust soft learning vector quantization; RVR, relevance vector regression; sPLS-DA, sparse partial least squares discriminant analysis; SVM, support vector machine.