Abstract
PURPOSE: The review surveys the types of machine learning approaches currently used in the alcohol literature, examines challenges in applying machine learning tools to alcohol data, and explores how overcoming these challenges could advance personalized medicine for alcohol use disorder (AUD).
SEARCH METHODS: The authors conducted a search of publications on PubMed, ScienceDirect, and EBSCO Academic Search Premier published from 2015 to April 15, 2025, for articles that used machine learning to analyze alcohol-related outcomes. Search terms were (“drinking” OR “alcohol”) AND (“machine learning” OR “deep learning” OR “predict” OR “classify”) in the title or abstract.
SEARCH RESULTS: The search returned 2,618 manuscripts. Keeping those that predicted alcohol-related outcomes and excluding those that merely used alcohol as a predictor for other outcomes reduced the selection to 567 manuscripts. A final manual selection resulted in 110 original peer-reviewed human research studies that primarily analyzed alcohol consumption behaviors and tested their models on data that they were not trained on.
DISCUSSION AND CONCLUSIONS: Predictions focused on alcohol consumption or AUD diagnosis in cohorts with a mean age of 50 years or younger (i.e., when long-term drinking behaviors are being or have been established). Most studies confined the data-driven searches to a single modality and relied on conventional machine learning approaches, which tended to produce accurate and transparent predictions on the relatively small datasets typically collected by AUD studies. The small number of available samples was the most common limitation mentioned by the reviewed articles. Investigators also wished for machine learning models to provide insights about causality. Gaining these insights will be essential to improve diagnosis and treatment of AUD, for which the field must foster multidisciplinary research teams to build rigorous and trustworthy machine learning models and quantitative benchmarks that can capture the multifaceted nature of alcohol use and its comorbidities.
Key Takeaways
- Alcohol-related publications using machine learning almost exclusively relied on conventional techniques, whereas current public discourse emphasizes state-of-the-art models.
- A majority of models predicted alcohol consumption or alcohol use disorder (AUD) diagnosis, which is generally easier to forecast than, for example, disorder or treatment outcome.
- Only 42% of studies used multimodal data, which is needed for encoding the complexity of AUD and related clinical outcomes.
- Addressing the complexity of AUD requires creating machine learning models and quantitative benchmarks that accurately capture the multifaceted nature of alcohol use and its comorbidities.
Introduction
Alcohol use disorder (AUD) currently affects 27.9 million Americans1 and costs the United States more than $250 billion annually.2 It is the fourth leading preventable cause of death,3 underscoring the urgent need for effective strategies to understand, predict, and prevent alcohol misuse. At the mechanistic level, AUD arises from complex interactions between neural, genetic, and environmental factors that collectively influence vulnerability and resilience to alcohol’s effects. Genetic and epigenetic factors account for 40% to 60% of a person’s addiction risk,4,5 although psychological and environmental factors (such as family history,6,7 traumatic events,8 peer pressure,9,10 other drug use,11,12 and externalizing behaviors13,14) also play crucial roles in shaping drinking behaviors. However, findings about these predictive factors are often inconsistent or weak: many individuals without known risk factors develop AUD,15 whereas at-risk individuals remain resilient16 (e.g., approximately 50% of maltreated youth17). Equally pressing is the challenge of predicting treatment response and long-term abstinence. Although pharmacological and behavioral interventions are available, only 7.9% of people with AUD receive alcohol use treatment in the United States,18 and of those, only 16% achieve abstinence.19
One major hurdle in alcohol research has been the predominant reliance on fragmented, small-scale data sets unable to reveal the complex and heterogeneous nature of AUD.20,21 The need for larger heterogeneous data sets for AUD is recognized by funding agencies.22 Since 2012, the National Institute on Alcohol Abuse and Alcoholism has funded the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) study23 to annually collect brain magnetic resonance imaging (MRI) data, neuropsychological testing results, alcohol use, and related data from 831 individuals who were ages 12 to 21 at baseline. NCANDA was the first to report in vivo disruption of white matter microstructural development by alcohol during adolescence.24 An even larger sample (more than 11,000 children ages 9 to 10 at baseline) has been recruited by the Adolescent Brain Cognitive Development (ABCD) Study,25 which is tracking brain development, behavior, and environmental changes to gain insights into how early life factors shape substance use and mental health outcomes. Over three decades, the Collaborative Study on the Genetics of Alcoholism (COGA) has collected phenotypic data26 from nearly 18,000 individuals ages 7 to 97 across more than 2,200 families, as well as DNA and electrophysiological measures on a large subset. This longitudinal data set provides a unique opportunity for researchers to identify genes influencing the risk of AUD and related outcomes. Another key effort is the Monitoring the Future (MTF) study, an ongoing, nationally representative study that has been surveying more than 25,000 8th, 10th, and 12th grade students27 and approximately 20,000 adults ages 19 to 65 each year since 1975.28 By capturing longitudinal behavioral and attitudinal changes, MTF offers a unique resource for understanding population-level patterns and risk factors for substance use.
Finally, aiming to advance personalized medicine, the All of Us Research Program29 is planning to collect multimodal data (including genomic profiles, electronic health records, and environmental exposure) from at least 1 million people in the United States to discern how individual differences in lifestyle, environment, and biology affect health outcomes, such as the relationship between risky drinking behavior and cancer diagnosis.30
In addition to increasing sample sizes, alcohol studies have expanded the scope and diversity of data collected from each participant to characterize individuals more precisely. Current studies increasingly integrate more comprehensive self-reports,31 medical records,32 genetic profiling,33 neuropsychological testing,34 and imaging.35 The advent of wearable technologies and ubiquitous mobile devices (such as smartphones36 and biosensors measuring physical activity37 and intoxication38) has made it possible to conduct ecological momentary assessments.39,40 This surge in data volume introduces substantial computational and methodological challenges, regardless of cohort size.
The emergence of larger and more diverse data sets calls for a transformation in fundamental analytical approaches. Alcohol research has traditionally relied on hypothesis-driven analyses, which involve preselecting a narrow set of variables41,42 and applying univariate regression models at the group level to test pairwise associations. Although conceptually straightforward and statistically interpretable, this approach fragments understanding into siloed single factors that each explain only a small fraction of variance in alcohol outcomes, failing to capture the intricate multivariate interactions among behavioral traits, cognitive functions, environmental influences, and neural mechanisms that underlie AUD.43,44 Furthermore, the findings only reflect group-level trends that generally do not translate to individual-level insights.45
A promising alternative is machine learning (Figure 1), which is designed to translate complex, multimodal data into predictions on an individual basis.46 Machine learning is a branch of artificial intelligence (AI) that enables computers to identify interactions among measurements (i.e., patterns) predictive of the outcome directly from data rather than relying on predefined statistical formulas or human-crafted rules. Typically, a machine learning model needs to be trained on a large number of samples, where each sample is described by a set of features (predictor variables) and a target outcome variable of interest (e.g., diagnosis, symptom severity, drinking level, or treatment response). In alcohol research, predictors may span multiple domains, including neuroimaging (e.g., functional connectivity, cortical thickness, regional volume, or microstructural integrity), behavioral and cognitive traits (e.g., measures of impulsivity, decision-making, or sensation seeking), environmental exposures (e.g., peer drinking, family history, or neighborhood stressors), and genetic or physiological markers (e.g., polygenic risk scores, heart rate, sleep disturbance). The goal of the model is to integrate information across these domains to generate a single prediction of the chosen target outcome specific to each individual.
Figure 1. Machine learning methods can transform complex, multimodal data into individualized predictions of alcohol use disorder (AUD) and related outcomes. Data modalities used by studies reviewed here included, among others, behavior, biological specimens, demographics, sensor data, neuroimaging, genes, and mental health assessments. Raw data were either directly analyzed by deep learning models or first extracted into aggregate measurements for use in conventional machine learning approaches, which were used in most studies reviewed. AUD-related outcomes predicted by these methods encompassed AUD diagnosis, consumption patterns, prognosis, relapse, and treatment response. To obtain the predictions, the studies first trained the machine learning models by determining the parameter setting that minimized the difference between predicted and observed outcomes on the training data set. The studies then measured the accuracy of the predictions and the constellation of measurements (i.e., pattern) that drove the predictions, on a separate test set.
During training, the model learns (or is trained) by iteratively adjusting its internal parameters to minimize the prediction error, which is the difference between actual and predicted outcome of the training samples. Once this error is minimized, the parameterized model is evaluated on a set of test samples (i.e., participants and their data not used for training). This evaluation assesses the degree to which the model generalizes to new individuals. Common evaluation metrics47 include accuracy (i.e., quantifying the proportion of correct predictions) and the area under the receiver operating characteristic curve (AUC), which measures how well the model discriminates between outcome classes over the full range of decision thresholds. Lastly, one can identify the pattern driving inference, offering insights into potential mechanisms or risk factors underlying alcohol outcomes.
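The train-then-test procedure and the two evaluation metrics described above can be illustrated with a minimal Python sketch. It is not drawn from any reviewed study: the cohort is synthetic, the single "risk score" feature is hypothetical, and the "model" is deliberately reduced to one trainable parameter (a decision threshold) so that the roles of the training and test samples stay visible.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(scores, labels, threshold):
    """Proportion of correct predictions when scores >= threshold are called positive."""
    hits = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)

# Synthetic cohort (assumption): one hypothetical risk score per participant,
# shifted upward by one standard deviation in the positive class.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(1000)]
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

# Hold out participants that are never seen during training.
train_x, train_y = scores[:700], labels[:700]
test_x, test_y = scores[700:], labels[700:]

# "Training": pick the single parameter (a threshold) that minimizes
# the prediction error on the training samples.
best_threshold = max(train_x, key=lambda t: accuracy(train_x, train_y, t))

# Evaluation uses the held-out test samples only.
test_accuracy = accuracy(test_x, test_y, best_threshold)
test_auc = auc(test_x, test_y)
print(f"threshold={best_threshold:.2f} accuracy={test_accuracy:.2f} AUC={test_auc:.2f}")
```

Accuracy depends on the chosen threshold, whereas AUC is computed from the ranking of scores alone, which is why it summarizes discrimination over the full range of decision thresholds.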
Conventional machine learning models (e.g., random forest, support vector machine [SVM]) typically confine analysis to tabular data, where summary scores are extracted from raw measurements before training the model. For example, brain MRI studies relied on region-level summaries (such as brain volume or cortical thickness) as input to machine learning models, which confined prediction accuracy to the anatomical granularity of the brain regions defined by human experts beforehand.48 Removing this constraint necessitates performing predictions directly from raw higher-dimensional data (e.g., all voxels of a 3D brain MRI). This is the domain of deep learning models (Figure 1), a subfield of machine learning that gained traction starting in 2012.49 Deep learning models jointly learn to derive measurements (i.e., features) from the raw data and to predict the outcome using large-scale artificial neural networks. Treating feature extraction as part of the optimization process has led to significant improvements in predictive accuracy across numerous domains, fueling growing interest in applying deep learning approaches to data-driven medical research.50
This review provides a systematic overview of how machine learning methods have been applied across the landscape of alcohol research over the past decade. Rather than emphasizing the specific constellations of neurobiological or psychological factors identified by machine learning analyses, the review focuses on the methodological trends, data modalities, and modeling choices that characterize current machine learning practice across alcohol-related research. Specifically, the review examines which populations and clinical outcomes have been most frequently targeted, what level of prediction accuracy has been achieved, what types of data have been used to predict outcomes, and how different machine learning models (including deep learning) have been adopted. The review also highlights the challenges associated with applying these methods to AUD studies and discusses how overcoming those challenges could transform diagnosis and treatment from subjective observations to objective, individualized assessments that lead to accurate precision medicine.
Search Method
In April and July 2025, the authors conducted a literature search in PubMed, ScienceDirect, and EBSCO Academic Search Premier for publications that used machine learning to analyze alcohol-related outcomes. Across the three databases, a key word search using (alcohol[Title] OR drinking[Title]) AND (alcohol[Title/Abstract]) AND (“machine learning”[Title/Abstract] OR “deep learning”[Title/Abstract] OR predict[Title/Abstract] OR classify[Title/Abstract]) returned 2,618 manuscripts published from 2015 to April 15, 2025.
A rule-based text-mining Python script (available upon request from the authors) searched for the co-occurrence of prediction-related terms (e.g., predict, model, classify, identify) and alcohol-related terms (e.g., alcohol, drinking, dependence, relapse), while removing articles in which alcohol was a predictor rather than an outcome (e.g., “Alcohol use predicts depression”). This automatic screening reduced the selection to 567 manuscripts. A final manual selection removed eight preprints that had not been peer-reviewed, 21 articles that used animal models, 27 articles that studied fetal alcohol spectrum disorders, 139 articles that did not use machine learning approaches, and 262 articles that did not primarily analyze alcohol consumption behaviors (e.g., withdrawal symptoms, alcohol-related liver diseases, drunk driving). This selection resulted in 110 original peer-reviewed human research studies7,51-161 that tested the model on data that it was not trained on, which is a key difference between machine learning and population-level statistical analysis.
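The screening script itself is not reproduced in the review, so the following is only a minimal Python sketch of what a rule-based co-occurrence filter of this kind might look like. The term lists extend the examples given above, and the regular expressions and the function name `keep_title` are illustrative assumptions rather than the authors' implementation.

```python
import re

# Hypothetical term lists built from the examples named in the text.
PREDICTION_TERMS = re.compile(r"\b(predict\w*|model\w*|classif\w*|identif\w*)\b", re.IGNORECASE)
ALCOHOL_TERMS = re.compile(r"\b(alcohol\w*|drinking|dependence|relapse)\b", re.IGNORECASE)

# Assumed exclusion rule: an alcohol term followed (within two words) by a
# prediction verb suggests alcohol is the predictor, not the outcome,
# as in "Alcohol use predicts depression".
ALCOHOL_AS_PREDICTOR = re.compile(
    r"\b(alcohol\w*|drinking)\b(\s+\w+){0,2}\s+(predicts?|forecasts?)\b",
    re.IGNORECASE,
)

def keep_title(title):
    """Return True if the title passes the automatic screening rules."""
    has_prediction = PREDICTION_TERMS.search(title) is not None
    has_alcohol = ALCOHOL_TERMS.search(title) is not None
    if not (has_prediction and has_alcohol):
        return False  # required co-occurrence is missing
    return ALCOHOL_AS_PREDICTOR.search(title) is None

titles = [
    "Machine learning to predict relapse in alcohol use disorder",
    "Alcohol use predicts depression in adolescents",
    "Sleep quality and academic performance",
]
for title in titles:
    print(keep_title(title), "-", title)
```

A filter this simple will misjudge some titles, which is consistent with the review following the automatic step with a manual selection.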
Results of the Literature Search
The search identified 2,618 articles for initial examination. Of those, 2,508 were excluded as described above, and 110 were included in the review. For each selected article, Appendix 1 records the number of subjects, mean age, sex ratio, prediction task (e.g., predicting relapse, AUD diagnosis classification), number of predictors, machine learning model used (e.g., random forest, SVM), and reported model accuracy (e.g., AUC and classification accuracy). If multiple alcohol-related prediction tasks were explored, the average prediction performance was recorded. If multiple machine learning models were explored, the model with the highest performance was recorded.
With respect to demographic factors, 85 of the 110 reviewed articles (77%) reported sex ratios: eight studies (9% of 85 articles) focused exclusively on men, 40 manuscripts (47%) were roughly balanced between the sexes (i.e., the percentage of men was 40% to 60%), and three studied only women. Of the 110 reviewed articles, 89 (81%) reported the mean age of cohorts, with most examining samples with a mean age from late childhood to middle adulthood: specifically, 82 articles studied cohorts with mean ages of 10 to 50 years. Only seven studies focused on cohorts with a mean age greater than 50 years.
Among all prediction outcomes (Figure 2A), AUD diagnosis and alcohol consumption patterns were the two categories most often evaluated (both 31% of articles). Articles focusing on classifying AUD diagnosis reported a median AUC of 89% (Figure 2B), which was the highest reported among categories with at least two articles. In comparison, articles on predicting alcohol consumption had a median AUC of 78%. Treatment response was the third most frequently predicted outcome but had the lowest median AUC (71%), followed by prognosis (14%, median AUC: 80%). Relapse and alcohol concentration both were assessed in only 4% of studies but had high median AUCs (85% and 96%, respectively). However, the very high AUC (96%) for sensor-based prediction of alcohol concentration requires confirmation because it was reported by only one article.
Figure 2. Characteristics of the studies identified by the review. (A) Articles are categorized based on the clinical outcome targeted by the machine learning prediction: more than half the articles focus on predicting alcohol use disorder (AUD) diagnosis or consumption patterns. (B) Area under the curve (AUC) of the models of various outcomes. The most accurate clinical outcomes to predict were alcohol concentration and AUD diagnosis. (C) Number of studies that used a certain type of input predictors. The most common predictors were neuroimaging data, alcohol and other substance use patterns, demographics, mental health symptoms, and behavioral data from neuropsychological testing and self-reports. (D) Number of studies that used a certain type of machine learning model. The most frequently adopted models were random forest, linear regression, and support vector machine. Note. EHR, electronic health record; kNN, k-nearest neighbor; MLP, multilayer perceptron; SVM, support vector machine.
Of the 110 articles, 46 (42%) used more than one type of predictor to predict outcomes. The most frequently used predictors were imaging data (32% of articles), followed by self-reported substance use (27%), demographics (24%), mental health (24%), and behavioral assessment (23%) (Figure 2C). Other data types were rarely used for prediction (10% or less). Conventional machine learning methods162 were the most popular approaches for prediction, including the random forest approach (used in 24% of 110 articles), linear regression (17%), and SVM (16%) (Figure 2D). Other approaches, such as logistic regression, gradient boosting, decision tree, multilayer perceptron, ensemble methods, k-nearest neighbor, and clustering, were used in 10% of studies or less. Deep learning approaches were used in 3% of investigations. Studies with larger sample sizes tended to use higher-dimensional input features for prediction (i.e., used larger numbers of predictors) (Figure 3A; r = .20, p = .05); they also resulted in lower prediction accuracy (Figure 3B; r = −.26, p = .03).
Figure 3. Correlation of sample size with number of predictors and accuracy. The sample size showed (A) a positive correlation with the number of input features to the machine learning models; and (B) a negative correlation with model prediction accuracy.
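For readers less familiar with the statistic reported in Figure 3, the following Python sketch shows how a Pearson correlation coefficient of this kind is computed. The five data points are made-up illustrative values, not entries from Appendix 1, and the review does not state whether sample sizes were transformed before the correlation was computed.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the
    standard deviations (the shared 1/n factors cancel)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical studies: sample size and reported AUC (illustrative only).
sample_sizes = [50, 102, 250, 800, 3000]
reported_auc = [0.92, 0.88, 0.85, 0.80, 0.74]

r = pearson_r(sample_sizes, reported_auc)
print(f"r = {r:.2f}")  # negative: larger studies report lower AUC in this toy data
```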
Results of the Studies Reviewed
Demographics
Historically, AUD has been mostly diagnosed in men.163 However, in recent years, the prevalence of AUD among women has been rising. This is a concerning trend from a public health standpoint, as the adverse effects of alcohol misuse tend to be more severe in women than in men.164,165 This shift is evident in the articles reviewed herein, with only 9% of studies focusing exclusively on men and the proportion of male participants in studies published since 2023 (n = 49) being significantly lower (p = .032) than in those published before 2023 (n = 36). Accordingly, the reviewed articles identified sex differences in how neural substrates and behavioral measures relate to alcohol misuse and treatment response.77,97 For example, childhood trauma and microstructural integrity in temporal and motor structures were female-exclusive predictors of alcohol misuse in young adulthood, whereas social recognition ability, personality traits, and microstructural integrity in cerebellar, motor, and occipital structures were more important predictors for young males.77,97 Liver function test results predicted treatment response exclusively in males, whereas mental health symptoms were more informative predictors for females.154 These distinctions were identified either by training a single machine learning algorithm on both sexes and comparing important predictors between males and females,77 or by training separate models for males and females and comparing the outcomes of the two models.97
In addition to sex, age is a critical demographic factor in alcohol use.166 Most recent machine learning studies have focused on adolescents through middle-aged adults, a critical window when drinking behaviors typically initiate167,168 and consolidate.169-171 As acknowledged in the reviewed articles, adolescence and early adulthood are critical for brain development60,146 and encompass major life transitions (e.g., transition from high school to college, entry into the workforce, and early parenthood). These transitions often involve shifts in social norms and heightened stress, both of which can drive alcohol use.97,140 Notably, binge drinking (defined as five or more drinks per occasion) peaks during the late teens to early 30s172 and is one of the strongest predictors of developing AUD later in life.173 By contrast, alcohol misuse in adulthood poses the greatest public health burden,77,96 including alcohol-related injuries (e.g., accidents, violence, fatalities)174-176 and economic costs (e.g., lost productivity and increased health care expenditure).165,177 Thus, machine learning studies are particularly valuable for simultaneously analyzing tens or hundreds of factors drawn from many conceptually different domains with the goal of identifying early markers of vulnerability and understanding later-life health and social outcomes to inform prevention and intervention strategies.
Prediction Outcome
Distinguishing individuals with AUD from controls was a prevalent prediction task probed by machine learning analysis; it was addressed in 35 of the 110 articles (31%; Figure 2A). The high proportion may be due to the relatively high accuracy achieved by machine learning approaches (median AUC: 89.0%; Figure 2B). Predicting AUD is relatively simple compared to other clinical outcomes because it is a binary decision process based on clear diagnostic framing (i.e., meets the criteria of the Diagnostic and Statistical Manual of Mental Disorders: yes/no). Furthermore, AUD is typically associated with severe and enduring alterations in brain regions and behavior, as also documented by the reviewed articles.84,129 Accordingly, the prediction is often based on neuroimaging data (20 out of 35 articles). For example, prediction models based on structural and functional MRI have yielded converging evidence of AUD-related abnormalities in prefrontal circuits (executive dysfunction),55,69,96,138 ventral striatum (reward sensitivity),84,96 the cingulate cortex,24,55 default mode network,138,139 and the sensory cortex96,107 (Appendix 1). Although these regions also have been previously reported by literature based on traditional group-level analyses, the effect sizes revealed by machine learning approaches far exceed those identified by univariate statistical tests. For example, a neural network classifier achieved an AUC of 79% in classifying 51 people with AUD and 51 control subjects using whole-brain resting-state functional connectivity.96 In contrast, individual functional connectivity features typically exhibited group differences of t < 4.0, corresponding to an accuracy of only about 0.65.96
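The review does not spell out how a t statistic maps onto a classification accuracy. One way to arrive at a figure of about 0.65 is sketched below in Python, under assumptions that are not stated in the source: the feature is normally distributed with equal variance in both groups, the groups are balanced (51 and 51, as in the cited example), and cases are classified with a threshold midway between the group means.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def accuracy_from_t(t, n1, n2):
    """Best single-feature accuracy implied by a two-sample t statistic,
    assuming equal-variance Gaussian groups and a midpoint threshold."""
    cohens_d = t * math.sqrt(1.0 / n1 + 1.0 / n2)  # standardized mean difference
    return normal_cdf(cohens_d / 2.0)

acc = accuracy_from_t(4.0, 51, 51)
print(f"accuracy at t = 4.0: {acc:.3f}")  # roughly 0.65
```

Under these assumptions, t = 4.0 is an upper bound for the individual features described above, so most single features would classify participants less accurately still.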
Predicting alcohol consumption patterns that did not meet the criteria for AUD was another task frequently investigated by machine learning models (35 studies, or 31% of the articles). Because alcohol misuse is among the strongest predictors of future AUD onset, this task holds significant value for prevention and policy design. Prediction models for this task often involved mixed data sources beyond imaging-based predictors. Common predictors of alcohol misuse identified by machine learning included sociodemographic factors (e.g., male sex or lower socioeconomic status),51,119,130,142 poor executive control,51,119 social behavior change,83,119 psychological dysfunction,83,119 and other substance use.51,83 The accuracy of models combining these predictors in differentiating people who drink heavily from control subjects (median AUC 78%) was substantially higher than the accuracy of single predictors, which typically yielded only marginal improvements above chance.51,61,119 Collectively, these findings position machine learning approaches as a promising and potentially clinically actionable framework for early risk quantification and targeted prevention of AUD.
The third most predicted outcome of machine learning was treatment response. Common predictors identified by the reviewed articles included severity of dependence and craving at baseline,7,98,127 self-efficacy,127,154 and psychiatric comorbidity.7,98,154 Initial evidence also suggests that the quantitative, individualized prediction by machine learning models aggregating multiple predictors was more accurate than predictions made by human experts.127 Despite this potential, the prediction task was associated with the lowest median AUC (71%) among all prediction outcomes. One reason is that treatment response often is assessed using nonstandardized subjective measures (e.g., craving scales and self-reported relapse) that are known to exhibit low test-retest reliability and high individual variability.71 Further increasing variability was the wide difference in treatment duration, from a few days132 to a few years,136 resulting in inconsistent timing of follow-up assessments. Thus, clear neurobiological and behavioral correlates of treatment outcomes remain elusive. In addition, studies on treatment outcomes are limited by the small proportion of enrolled individuals who complete the full course of treatment (e.g., 78% dropout rate66). This low completion rate contributes to significant class imbalance between individuals who do or do not respond to treatment. Collectively, these factors make it difficult to robustly train machine learning models and hinder their generalizability,178 which is reflected in treatment response emerging as the clinical outcome with the lowest median AUC (Figure 2B).
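Why class imbalance complicates evaluation can be seen in a small Python sketch. The cohort is hypothetical: 78 of 100 participants fall in one outcome class, a split borrowed from the dropout figure above purely for illustration. Balanced accuracy (the mean of the per-class recalls) is used here as one common imbalance-aware metric; the review does not report it.

```python
def accuracy(y_true, y_pred):
    """Proportion of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in indices) / len(indices))
    return sum(recalls) / len(recalls)

# Hypothetical cohort: 78 participants in class 0, 22 in class 1.
y_true = [0] * 78 + [1] * 22

# A "model" that ignores all predictors and always outputs the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))           # 0.78, despite learning nothing
print(balanced_accuracy(y_true, y_pred))  # 0.5, i.e., chance level
```

A model that has learned nothing can therefore look accurate on an imbalanced test set, which is one reason threshold-independent or class-balanced metrics are preferred in this setting.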
Types of Data Used for Training Machine Learning Models
This review revealed that predictors of alcohol-related clinical outcomes encompass a diverse array of data types (Figure 2C), with the most popular one being medical imaging data. As mentioned, chronic heavy alcohol use results in significant volume loss in brain regions55 and disrupts brain function.69,84,96 Neuroimaging can capture these alterations noninvasively, making it appealing for investigating addiction mechanisms.179 Beyond improving the understanding of addiction mechanisms, the objectivity of neuroimaging transcends traditional self-report measures and thus has the potential for providing quantitative biomarkers for identifying individuals at elevated risk for drinking onset, predicting relapse,180 and assessing treatment outcomes.181
However, neuroimaging data also have some disadvantages. They are relatively expensive to acquire, can be susceptible to measurement variability and artifacts (e.g., head motion, scanner differences, and physiological fluctuations), and are associated with small effect sizes (typically explaining < 5% to 10% of the variance),182,183 because they are indirect proxies reflecting intermediate phenotypes (such as functional connectivity or activation patterns) rather than direct measures of behaviors, such as craving or relapse.184,185 In contrast to neuroimaging data, neuropsychological assessments, self-reports, and demographic variables (e.g., alcohol use history, age of onset, or family history) are more proximally aligned with AUD outcomes and easier to collect, often translating to stronger predictive accuracy. This may explain why the 18 studies using only neuroimaging as predictors had a lower median AUC (79%) than the 22 studies using a single type of nonneuroimaging predictor (median AUC: 91%).
No single data type (whether neuroimaging, genetic, behavioral, or self-report) fully captures the complexity of AUD and related clinical outcomes. In this review, 42% of studies investigated predictors from multiple modalities, which machine learning models are primed to do because their data-driven search does not require prior knowledge about the modalities they analyze. Each modality—whether it be genetic data, brain imaging, behavioral traits, or environmental exposures—generally captures a distinct yet complementary layer of risk or resilience.26,128,186 By seamlessly integrating these diverse information sources, the reviewed studies consistently indicated that multimodal machine learning models not only provide a more comprehensive representation of individual differences in alcohol-related outcomes but also enhance prediction accuracy and generalizability compared to single-modal models.7,55,119,129
Discussion
Model Design Choices
In general, the choice of machine learning models is heavily influenced by data set characteristics, particularly sample size and number of predictors, which tend to be positively correlated (Figure 3A).187 Training highly complex models on a limited number of samples with many predictors can lead to “overfitting,” where the model captures noise patterns in the training data rather than generalizable relationships based on reliable measurements. Therefore, given that most alcohol-related data sets include fewer than 1,000 subjects, conventional machine learning methods (e.g., random forest, SVM, linear regression)162 are more common because they have relatively few model parameters to learn. In contrast, state-of-the-art deep learning approaches require far larger data sets to generalize.188,189 Empirical studies have shown that these “sophisticated” models often have overestimated accuracy and are easily overfitted on small data sets.189
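Overfitting can be demonstrated with a deliberately extreme Python sketch. All data are pure noise (200 random predictors and random labels for 40 hypothetical participants), so there is nothing generalizable to learn. A one-nearest-neighbor classifier stands in here for any model flexible enough to memorize its training samples; it is not a deep learning model and was chosen only because it keeps the example short.

```python
import random

rng = random.Random(42)

def make_noise_data(n_samples, n_features):
    """Random predictors and random binary labels: no true signal exists."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [rng.randint(0, 1) for _ in range(n_samples)]
    return X, y

def predict_1nn(train_X, train_y, x):
    """Return the label of the closest training sample (squared Euclidean distance)."""
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[nearest]

def accuracy(train_X, train_y, X, y):
    hits = sum(predict_1nn(train_X, train_y, x) == label for x, label in zip(X, y))
    return hits / len(y)

# Few samples, many predictors: the setting described in the text.
train_X, train_y = make_noise_data(40, 200)
test_X, test_y = make_noise_data(400, 200)

train_acc = accuracy(train_X, train_y, train_X, train_y)  # memorized noise
test_acc = accuracy(train_X, train_y, test_X, test_y)     # new individuals

print(f"training accuracy: {train_acc:.2f}")
print(f"test accuracy:     {test_acc:.2f}")
```

Training accuracy is perfect because every training sample is its own nearest neighbor, while accuracy on new samples stays near chance. This gap is what evaluation on held-out data is designed to expose.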
Moreover, the data in alcohol studies were often structured tabular data (e.g., demographic variables, self-report scores, or brain regional measurements) rather than raw, high-dimensional inputs (e.g., minimally processed structural MRI data). In these tabular-feature scenarios, traditional models often performed on par with or even better than deep learning because the key signals of the precomputed features (e.g., regional brain volumes or cognitive scores) and relatively simple prediction targets (often binary outcomes, such as diagnosis vs. control) did not require complex modeling. Recent evaluations on structured data sets found that tree-based ensembles (such as random forest and XGBoost190) consistently outperformed deep neural networks (which are considered the state-of-the-art in AI) for both purely numerical features and mixed data types.191,192 Thus, although deep learning has only begun to gain a foothold in alcohol research, its broader potential will depend on the availability of large, well-curated data sets.
Beyond prediction accuracy, the decision process of conventional models is generally easier to interpret compared with deep learning. This is essential for results from alcohol studies to be useful to clinicians and researchers who need to understand why a model makes a prediction to trust and act on it. For example, random forest can provide ranked feature importance193 to indicate which features are more influential to the model prediction. By contrast, deep neural networks are often regarded as “black boxes” with complex internal computations that defy straightforward explanations.188,194 This opacity can lead to reluctance in adopting deep learning models for medical decisions, because clinicians cannot easily verify what the model has learned or identify potential biases.195
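A ranked feature importance can be sketched in a few lines of Python. The example below uses permutation importance (the drop in test accuracy after one predictor's values are shuffled across participants), which is a model-agnostic technique and not necessarily the importance measure used by the reviewed random forest studies. The nearest-centroid classifier, the synthetic data, and the feature names are all hypothetical stand-ins.

```python
import random

rng = random.Random(1)
FEATURES = ["craving_score", "impulsivity", "noise"]  # hypothetical names

def make_sample():
    """One synthetic participant: strong, weak, and absent signal."""
    y = rng.randint(0, 1)
    x = [rng.gauss(2.0 * y, 1.0), rng.gauss(0.5 * y, 1.0), rng.gauss(0.0, 1.0)]
    return x, y

data = [make_sample() for _ in range(1000)]
train, test = data[:600], data[600:]

def centroid(rows):
    """Feature-wise mean of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

# "Training": one centroid per outcome class.
c0 = centroid([x for x, y in train if y == 0])
c1 = centroid([x for x, y in train if y == 1])

def predict(x):
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return 1 if d1 < d0 else 0

def accuracy(rows, labels):
    return sum(predict(x) == y for x, y in zip(rows, labels)) / len(labels)

test_X = [x for x, _ in test]
test_y = [y for _, y in test]
baseline = accuracy(test_X, test_y)

# Shuffle one predictor at a time and record how much accuracy is lost.
importance = {}
for j, name in enumerate(FEATURES):
    column = [x[j] for x in test_X]
    rng.shuffle(column)
    permuted = [x[:j] + [v] + x[j + 1:] for x, v in zip(test_X, column)]
    importance[name] = baseline - accuracy(permuted, test_y)

ranking = sorted(importance, key=importance.get, reverse=True)
print("baseline accuracy:", round(baseline, 2))
print("ranked features:", ranking)
```

The predictor whose shuffling costs the most accuracy is ranked first, giving clinicians a direct, if coarse, answer to which measurements a model relies on.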
Another important advantage of conventional machine learning methods over deep learning for alcohol researchers is the relatively low barrier to deployment and maintenance.196 Conventional methods are typically easier to set up, requiring fewer design choices and less parameter tuning than deep learning approaches. They handle mixed data types and occasional missing values with minimal preprocessing, whereas deep networks usually demand careful data cleaning and customization for each new task. Conventional machine learning algorithms also run efficiently on standard computational resources, with training of the models often completed in seconds or minutes on a normal personal computer. In contrast, deep learning approaches necessitate high-performance GPUs or cloud infrastructure, which is expensive and harder to operate.194
In summary, the dominance of conventional machine learning approaches in alcohol-related neuroimaging and behavioral prediction can be attributed to a combination of data limitations, interpretability needs, and ease of use. Although this observation might be biased by the search criteria of this review, which were confined to nonspecific key words, such as “machine learning” and “deep learning,” none of the deep learning approaches used in the studies were based on state-of-the-art architectures (e.g., transformers, generative models, and large language models) that recently have been the focus of public discourse.
Key Methodological Challenges and Future Directions for AI in Alcohol Research
Despite the increased use of machine learning analyses in alcohol research, several methodological limitations continue to constrain the full potential of this technology. The limiting factor most commonly mentioned among the reviewed articles is the number of available samples (25 articles). As mentioned, generalizability of machine learning methods relies on training them on heterogeneous data sets, which requires collecting data from a large number of samples. Funding institutions have identified this need and have recently funded several large, multicenter studies, such as NCANDA, ABCD, COGA, and MTF. However, simply increasing sample size does not overcome existing limitations; machine learning methodology also needs to be tailored to study such large samples in alcohol research. This review revealed a negative correlation between sample size and model prediction accuracy (Figure 3B), contradicting the general understanding that the accuracy of machine learning should increase with the size of training data. This paradox reflects a broader challenge in psychiatry increasingly recognized in the literature,197,198 namely the trade-off between predictive accuracy and population heterogeneity.197 Smaller data sets tend to be more homogeneous, allowing models to learn a single predictive pattern with high accuracy. However, this pattern is likely specific to a restricted subpopulation, thus resulting in a lower accuracy in larger heterogeneous data sets, a point raised by 35 of the reviewed articles. A promising way forward lies in the development of foundation models.199 These are large-scale, general-purpose models that are first trained on broader data sets, including those not necessarily related to alcohol research. Once pretrained, these models can be fine-tuned to specific subpopulations or clinical questions (e.g., predicting relapse in young females or treatment response in people with chronic AUD with liver disease).
Creating such models could be essential for moving from subjective, population-level generalizations toward truly individualized, quantitative, and context-sensitive predictions that are the foundational promise of precision psychiatry.
Another challenge in using machine learning to advance alcohol research is the modeling of confounders, which were stringently accounted for by only 27 of the 110 articles (25%) reviewed. When confounding variables (e.g., age, sex, or comorbid mental health conditions) influence both the input features (e.g., neural or behavioral measures) and the outcome (e.g., alcohol consumption or treatment response), models may learn spurious associations that lead to misinterpretation of machine learning findings,200 as also acknowledged by 24 of the reviewed articles. For instance, age affects both brain connectivity and drinking patterns.201 If age is not adjusted for, the model may “predict” alcohol misuse by detecting brain maturation, not alcohol effects. Similarly, alcohol consumption patterns significantly differ between males and females; consequently, a machine learning model not controlling for sex might simply detect sex differences in the input predictors. Another type of confound is caused by comorbidity (mentioned by nine articles), because alcohol use often co-occurs with other physical and psychiatric symptoms (e.g., liver disease and depression) and other substance use. The identified predictors hence might not be linked to alcohol outcomes but to other confounding phenotypes. This type of signal leakage caused by confounds202,203 often results in misleading model interpretations204 and inflates performance scores (e.g., AUC, accuracy). This is especially problematic in nonrandomized, observational data sets typically encountered in alcohol research.
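The age example above can be made concrete with a small simulation. The sketch below assumes a synthetic cohort in which age drives both a brain measure and drinking, while the brain measure has no direct link to drinking; the effect sizes, the variable names, and the use of 20 age bands are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)


def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formula."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    rank_sum = ranks[labels == 1].sum()
    return float((rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))


# Simulated cohort (assumption): age influences BOTH variables below.
n = 20000
age = rng.uniform(12, 22, size=n)
age_z = (age - age.mean()) / age.std()
brain_measure = age_z + 0.5 * rng.standard_normal(n)   # matures with age
p_drinking = 1.0 / (1.0 + np.exp(-2.0 * age_z))          # rises with age
drinking = (rng.uniform(size=n) < p_drinking).astype(int)

# Ignoring age, the brain measure appears to "predict" drinking.
raw_auc = auc(brain_measure, drinking)

# Stratification: evaluate within narrow age bands, where age is almost constant.
n_bands = 20
edges = np.quantile(age, np.linspace(0, 1, n_bands + 1))
band = np.clip(np.searchsorted(edges, age, side="right") - 1, 0, n_bands - 1)
stratified_auc = float(np.mean(
    [auc(brain_measure[band == b], drinking[band == b]) for b in range(n_bands)]
))

print(f"AUC ignoring age:       {raw_auc:.2f}")
print(f"AUC within age bands:   {stratified_auc:.2f}")
```

In this sketch the apparent predictive value of the brain measure disappears once participants of similar age are compared, which is the signature of a confound rather than an alcohol effect.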
Mitigation of confounding effects is well established in traditional statistics but often underdeveloped or overlooked in machine learning research, which has historically focused only on maximizing predictive accuracy. In particular, modern deep learning models have the capacity to encode complex nonlinear confounding effects that are hard to detect or adjust for.203 This gap could be addressed by treating the modeling of confounders as a core component of the machine learning pipeline. Researchers could apply appropriate preprocessing techniques (e.g., residualization, stratification, or harmonization205) before training, and may want to consider models specifically designed to reduce confounding, such as domain-invariant206 or confounder-aware approaches.61,207,208 Also, evaluation of the models could be enhanced by going beyond reporting a single AUC and incorporating fairness metrics209 (e.g., subgroup AUCs) and sensitivity analyses210 (e.g., performance changes with and without confounders). Validation of models on external data sets where the distribution of individual confounders differs would be crucial for ensuring generalizability and robustness.
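Residualization and a simple sensitivity analysis, two of the options named above, can be sketched as follows. The sketch assumes simulated data with a linear age confound and hypothetical helper names (`fit_residualizer`, `residualize`); the regression is fitted on training samples only so that no test information leaks into preprocessing. Nonlinear confounding effects would require a more flexible adjustment than this linear one.

```python
import numpy as np

rng = np.random.default_rng(3)


def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formula."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    rank_sum = ranks[labels == 1].sum()
    return float((rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))


def fit_residualizer(confounds, feature):
    """Least-squares regression of the feature on the confounds (plus intercept)."""
    design = np.column_stack([np.ones(len(confounds)), confounds])
    coef, *_ = np.linalg.lstsq(design, feature, rcond=None)
    return coef


def residualize(confounds, feature, coef):
    """Remove the part of the feature that the fitted confound model explains."""
    design = np.column_stack([np.ones(len(confounds)), confounds])
    return feature - design @ coef


# Simulated cohort (assumption): age drives both the feature and the outcome.
n = 4000
age = rng.uniform(12, 22, size=n)
age_z = (age - age.mean()) / age.std()
feature = age_z + 0.5 * rng.standard_normal(n)
drinking = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-2.0 * age_z))).astype(int)

train, test = np.arange(n // 2), np.arange(n // 2, n)

# Fit the confound model on training data only, then apply it to test data.
coef = fit_residualizer(age[train], feature[train])
feature_test_adjusted = residualize(age[test], feature[test], coef)

# Sensitivity analysis: performance with and without confound adjustment.
auc_unadjusted = auc(feature[test], drinking[test])
auc_adjusted = auc(feature_test_adjusted, drinking[test])
print(f"test AUC without adjustment: {auc_unadjusted:.2f}")
print(f"test AUC after residualizing age: {auc_adjusted:.2f}")
```

A large gap between the two scores, as in this simulation, suggests that the unadjusted performance was driven by the confounder rather than by the feature itself.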
Beyond modeling confounding effects, six of the 110 articles reviewed here explicitly mentioned the limitation that the adopted machine learning approach could only reveal statistical association between predictors and outcome but could not reveal causality between them. This issue highlights an urgent need for a paradigm shift in machine learning applications within alcohol research. Rather than focusing solely on optimizing prediction accuracy, models could also be designed to reveal dependencies and potential causal pathways among variables. New models designed to discern how external environmental and sociodemographic factors (i.e., those that give rise to population heterogeneity) moderate genetic, neural, and behavioral underpinnings of AUD in a data-driven fashion could move the field forward.211,212 A promising future direction is the application of canonical correlation analysis213 and its variants, which can uncover shared patterns across domains that are not only interpretable but can also serve as input features for downstream phenotype prediction.214 To further capture directional influences, structural equation modeling215 can be integrated to model hypothesized causal pathways and test how upstream factors (e.g., genes, environment) propagate their effects through intermediate phenotypes (e.g., brain function) to shape behavioral outcomes. This causal approach might offer a powerful new avenue in generative modeling212 that could ultimately enable virtual interventions, allowing researchers to test “what-if” scenarios without carrying out real-world in vivo experiments. For example, a generative model could simulate how removing environmental risk, improving modifiable behaviors (such as sleep hygiene), or neural simulation would affect downstream alcohol-related behaviors.
Achieving this vision will require alcohol researchers to work closely with AI methodologists and implementation scientists, as facilitated, for example, by Stanford’s AI for Mental Health Initiative.216
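The canonical correlation analysis mentioned above can be sketched in a few lines. This is a minimal textbook implementation under stated assumptions: two simulated domains (labeled here, hypothetically, as brain and behavioral measures) share a single latent factor, and the sample is much larger than the number of features so that no regularization is needed. Variants used in practice (e.g., sparse or regularized CCA) differ from this plain version.

```python
import numpy as np

rng = np.random.default_rng(4)


def inverse_sqrt(matrix):
    """Inverse square root of a symmetric positive-definite matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    return eigenvectors @ np.diag(eigenvalues ** -0.5) @ eigenvectors.T


def cca(X, Y):
    """Classical CCA: canonical correlations and per-domain weight matrices."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = len(X)
    cov_xx = X.T @ X / (n - 1)
    cov_yy = Y.T @ Y / (n - 1)
    cov_xy = X.T @ Y / (n - 1)
    whiten_x, whiten_y = inverse_sqrt(cov_xx), inverse_sqrt(cov_yy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, correlations, Vt = np.linalg.svd(whiten_x @ cov_xy @ whiten_y)
    return correlations, whiten_x @ U, whiten_y @ Vt.T


# Simulated data (assumption): one latent factor shared by both domains.
n = 1000
latent = rng.standard_normal(n)
brain = latent[:, None] + rng.standard_normal((n, 5))      # 5 brain measures
behavior = latent[:, None] + rng.standard_normal((n, 4))   # 4 behavioral scores

correlations, brain_weights, behavior_weights = cca(brain, behavior)

# First pair of canonical variates: one interpretable score per domain.
brain_score = (brain - brain.mean(axis=0)) @ brain_weights[:, 0]
behavior_score = (behavior - behavior.mean(axis=0)) @ behavior_weights[:, 0]
check = float(np.corrcoef(brain_score, behavior_score)[0, 1])

print("canonical correlations:", np.round(correlations, 2))
```

The canonical variates (`brain_score`, `behavior_score`) are the shared patterns that could then be passed to a downstream predictor; note that CCA by itself captures association across domains, not direction of influence.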
As mentioned, compared with statistical group analysis, which only quantifies a population-average trend, a strength of machine learning is its ability to condense multivariate patterns across many features into an individualized score that can be mapped directly onto clinical decisions (e.g., risk of AUD onset or likelihood of relapse). However, none of the reviewed studies discussed the deployment of machine learning approaches in clinical settings. This issue is symptomatic of psychiatry in general, whereas other areas of medicine are beginning to integrate AI models into clinical workflows and even clinical trials.217,218 Unlike other fields of medicine that rely on measurable physiological indicators, psychiatry still relies heavily on subjective reports and assessment for diagnosis and treatment. This lack of objective, quantifiable targets poses a major barrier to training and validating clinically actionable machine learning models. Consequently, as of March 2026, none of the FDA-approved AI-enabled medical devices were tailored toward psychiatric conditions.219,220 Unique to the field of alcohol research is that studies soon will be able to replace self-reported alcohol use with real-time quantitative measurements provided by noninvasive alcohol biosensors.221,222 Such real-time, quantitative monitoring of alcohol use will provide objective, continuous data that can transform model development and clinical validation.
To fully realize this potential, rigorous evaluation standards must be established. The studies reviewed exhibit substantial variability in validation practices: 18% of studies relied on a single train/test split without cross-validating results on all samples, only 12% of studies validated findings on external data sets, and 16% of articles adopted potentially flawed methodological designs, such as double dipping (e.g., feature selection on the full data set before training) and data leakage (e.g., information from test samples was used for training). The establishment of standardized benchmark data sets and data splits, as well as the preregistration of machine learning analysis and evaluation plans,223 including training and testing data construction, can help remedy these concerns. Once quantitative assessments are coupled with stringent model evaluation, researchers will be able to easily test the validity of their models, allowing the field to gain a deeper insight into addiction. Ensuring the responsible deployment of this technology in clinical settings will enable practitioners to diagnose patients based on quantitative markers and objectively assess the progress of alcohol treatments. Finally, prevention programs might be able to accurately determine risk of alcohol misuse in individuals. By doing so, the field of alcohol research could be a trailblazer in psychiatry as it would use AI technology to radically improve the diagnosis and treatment of a psychiatric disease, namely AUD.
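The double-dipping design flaw described above can be demonstrated with a short simulation. The sketch assumes pure-noise data (so the true accuracy is 50%), a nearest-centroid classifier, and feature selection by class-mean difference; these are illustrative choices and not the methods of any reviewed study.

```python
import numpy as np

rng = np.random.default_rng(5)

# Pure-noise data (assumption): no feature carries information about the label.
n_samples, n_features, n_selected, n_folds = 80, 4000, 20, 5
X = rng.standard_normal((n_samples, n_features))
y = rng.permutation(np.repeat([0, 1], n_samples // 2))


def select_features(X, y, k):
    """Indices of the k features with the largest absolute class-mean difference."""
    difference = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return np.argsort(np.abs(difference))[-k:]


def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test sample to the class with the closer training centroid."""
    centroid_0 = X_train[y_train == 0].mean(axis=0)
    centroid_1 = X_train[y_train == 1].mean(axis=0)
    dist_0 = np.linalg.norm(X_test - centroid_0, axis=1)
    dist_1 = np.linalg.norm(X_test - centroid_1, axis=1)
    return (dist_1 < dist_0).astype(int)


def cross_validated_accuracy(X, y, select_inside_folds):
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    if not select_inside_folds:
        # Double dipping: features are chosen using ALL samples, test folds included.
        leaked_columns = select_features(X, y, n_selected)
    correct = 0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        if select_inside_folds:
            # Proper design: selection sees only the training part of each fold.
            columns = select_features(X[train_idx], y[train_idx], n_selected)
        else:
            columns = leaked_columns
        predictions = nearest_centroid_predict(
            X[train_idx][:, columns], y[train_idx], X[test_idx][:, columns]
        )
        correct += int(np.sum(predictions == y[test_idx]))
    return correct / len(y)


leaky_acc = cross_validated_accuracy(X, y, select_inside_folds=False)
proper_acc = cross_validated_accuracy(X, y, select_inside_folds=True)
print(f"feature selection on full data set: {leaky_acc:.2f}")
print(f"feature selection inside each fold: {proper_acc:.2f}")
```

Because the data contain no signal, any accuracy well above chance in the first design is produced entirely by the leak, whereas repeating the selection inside each training fold yields an estimate near chance.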
Conclusion
This review focused on studies employing machine learning methods that, in contrast to hypothesis-driven analyses, can uncover complex multivariate patterns predictive of alcohol-related outcomes on an individual basis. Most of the 110 reviewed articles trained conventional machine learning approaches on a single data modality to predict alcohol consumption or AUD diagnosis in cohorts with mean ages between 12 and 50 years. Studies were limited by the small number of available samples, not being able to gain insights about causality, and failing to account for confounders. Significantly improving the diagnosis and treatment of AUD will require harnessing the full potential of recent advances in deep learning (such as foundation models).224 Specifically, fostering multidisciplinary research teams to create rigorous and trustworthy models can help analyze large, multimodal data sets that can capture the multifaceted nature of alcohol use and its comorbidities based on quantitative measures. Additionally, setting guidelines for the responsible development and deployment of these machine learning models would help ensure that these approaches will improve precision alcohol treatment.
Acknowledgments
The work was partly supported by the National Institutes of Health grants R00AA028840 (to Q.Z.); R01DA057567, U24AA021697, R01AA010723, R01AA05965, and R01AA017347 (to K.M.P.); the Brain and Behavior Research Foundation Young Investigator Grant (to Q.Z.), and the 2024 Stanford HAI Hoffman-Yee Grant (to K.M.P.). The authors thank Haopeng Xue for helping with formatting the article.
Correspondence
Address correspondence concerning this article to Kilian M. Pohl, 1070 Arastradero Road, Palo Alto, CA 94304. Email: [email protected]
Disclosures
The authors declare no competing financial or nonfinancial interests.
Publisher's note
Opinions expressed in contributed articles do not necessarily reflect the views of the National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health. The U.S. government does not endorse or favor any specific commercial product or commodity. Any trade or proprietary names appearing in Alcohol Research: Current Reviews are used only because they are considered essential in the context of the studies reported herein.
References
- Substance Abuse and Mental Health Services Administration (SAMHSA). Center for Behavioral Statistics and Quality. National Survey on Drug Use and Health Table 5.9A. 2024. Alcohol use disorder in past year: among people aged 12 or older; by age group and demographic characteristics, numbers in thousands, 2023 and 2024. https://www.samhsa.gov/data/report/2024-nsduh-detailed-tables.
- Bose J, Hedden SL, Lipari RN, Park-Lee E. Key substance use and mental health indicators in the United States: Results from the 2017 National Survey on Drug Use and Health. Substance Abuse and Mental Health Services Administration. 2018. https://www.samhsa.gov/data.
- Pilar MR, Eyler AA, Moreland-Russell S, Brownson RC. Actual causes of death in relation to media, policy, and funding attention: Examining public health priorities. Front Public Health. 2020;8:279. doi:10.3389/fpubh.2020.00279
- Bevilacqua L, Goldman D. Genes and addictions. Clin Pharmacol Ther. 2009;85(4):359-361. doi:10.1038/clpt.2009.6
- Zhou H, Gelernter J. Human genetics and epigenetics of alcohol use disorder. J Clin Invest. 2024;134(16):e172885. doi:10.1172/JCI172885
- Goldman D, Oroszi G, Ducci F. The genetics of addictions: Uncovering the genes. Nat Rev Genet. 2005;6(7):521-532. doi:10.1038/nrg1635
- Kinreich S, Meyers JL, Maron-Katz A, et al. Predicting risk for Alcohol Use Disorder using longitudinal data with multimodal biomarkers and family history: A machine learning study. Mol Psychiatry. 2021;26(4):1133-1141. doi:10.1038/s41380-019-0534-x
- Konkolÿ Thege B, Horwood L, Slater L, Tan MC, Hodgins DC, Wild TC. Relationship between interpersonal trauma exposure and addictive behaviors: A systematic review. BMC Psychiatry. 2017;17(1):164. doi:10.1186/s12888-017-1323-1
- Simons-Morton B, Haynie DL, Crump AD, Eitel SP, Saylor KE. Peer and parent influences on smoking and drinking among early adolescents. Health Educ Behav. 2001;28(1):95-107. doi:10.1177/109019810102800109
- Morris H, Larsen J, Catterall E, Moss AC, Dombrowski SU. Peer pressure and alcohol consumption in adults living in the UK: A systematic qualitative review. BMC Public Health. 2020;20(1):1014. doi:10.1186/s12889-020-09060-2
- Blanco C, Hasin DS, Wall MM, et al. Cannabis use and risk of psychiatric disorders: Prospective evidence from a U.S. national longitudinal study. JAMA Psychiatry. 2016;73(4):388-395. doi:10.1001/jamapsychiatry.2015.3229
- Kohut SJ. Interactions between nicotine and drugs of abuse: A review of preclinical findings. Am J Drug Alcohol Abuse. 2017;43(2):155-170. doi:10.1080/00952990.2016.1209513
- Benegal V, Antony G, Venkatasubramanian G, Jayakumar PN. Gray matter volume abnormalities and externalizing symptoms in subjects at high risk for alcohol dependence. Addict Biol. 2007;12(1):122-132. doi:10.1111/j.1369-1600.2006.00043.x
- Savage JE, Spit for Science Working Group, Dick DM. Internalizing and externalizing subtypes of alcohol misuse and their relation to drinking motives. Addict Behav. 2023;136:107461. doi:10.1016/j.addbeh.2022.107461
- Merline A, Jager J, Schulenberg JE. Adolescent risk factors for adult alcohol use and abuse: Stability and change of predictive value across early and middle adulthood. Addiction. 2008;103(suppl 1):84-99. doi:10.1111/j.1360-0443.2008.02178.x
- Cusack SE, Wright AW, Amstadter AB. Resilience and alcohol use in adulthood in the United States: A scoping review. Prev Med. 2023;168:107442. doi:10.1016/j.ypmed.2023.107442
- Perini I, Mayo LM, Capusan AJ, et al. Resilience to substance use disorder following childhood maltreatment: Association with peripheral biomarkers of endocannabinoid function and neural indices of emotion regulation. Mol Psychiatry. 2023;28(6):2563-2571. doi:10.1038/s41380-023-02033-y
- Substance Abuse and Mental Health Services Administration. National Survey on Drug Use and Health. 2023. https://www.samhsa.gov/data/report/2023-nsduh-detailed-tables.
- Fan AZ, Chou SP, Zhang H, Jung J, Grant BF. Prevalence and correlates of past-year recovery from DSM-5 alcohol use disorder: Results from National Epidemiologic Survey on Alcohol and Related Conditions-III. Alcohol Clin Exp Res. 2019;43(11):2406-2420. doi:10.1111/acer.14192
- Whelan R, Watts R, Orr CA, et al. Neuropsychosocial profiles of current and future adolescent alcohol misusers. Nature. 2014;512(7513):185-189. doi:10.1038/nature13402
- Friske MM, Torrico EC, Haas MJW, et al. A systematic review and meta-analysis on the transcriptomic signatures in alcohol use disorder. Mol Psychiatry. 2025;30(1):310-326. doi:10.1038/s41380-024-02719-x
- National Institute on Alcohol Abuse and Alcoholism. National Institute on Alcohol Abuse and Alcoholism Strategic Plan: Fiscal Years 2024-2028. https://www.niaaa.nih.gov/sites/default/files/NIAAA-2024-2028-Strategic-Plan.pdf.
- Brown SA, Brumback T, Tomlinson K, et al. The National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA): A multisite study of adolescent development and substance use. J Stud Alcohol Drugs. 2015;76(6):895-908. doi:10.15288/jsad.2015.76.895
- Zhao Q, Sullivan EV, Honnorat N, et al. Association of heavy drinking with deviant fiber tract development in frontal brain systems in adolescents. JAMA Psychiatry. 2021;78(4):407-415. doi:10.1001/jamapsychiatry.2020.4064
- Volkow ND, Koob GF, Croyle RT, et al. The conception of the ABCD Study: From substance use to a broad NIH collaboration. Dev Cogn Neurosci. 2018;32:4-7. doi:10.1016/j.dcn.2017.10.002
- Agrawal A, Brislin SJ, Bucholz KK, et al. The collaborative study on the genetics of alcoholism: Overview. Genes Brain Behav. 2023;22(5):e12864. doi:10.1111/gbb.12864
- Miech RA, Johnston LD, Patrick ME, O’Malley PM. Monitoring the Future national survey results on drug use, 1975-2023: Overview and detailed results for secondary school students. Ann Arbor, MI: University of Michigan. 2024. https://monitoringthefuture.org/wp-content/uploads/2024/01/mtfoverview2024.pdf.
- Patrick ME, Miech RA, Johnston LD, O’Malley PM. Monitoring the Future Panel Study annual report: National data on substance use among adults ages 19 to 65, 1976-2023. Ann Arbor, MI: University of Michigan. 2024. https://monitoringthefuture.org/wp-content/uploads/2024/07/mtfpanel2024.pdf.
- All of Us Research Program Investigators. The “All of Us” research program. N Engl J Med. 2019;381(7):668-676. doi:10.1056/NEJMsr1809937
- Shi M, Luo C, Oduyale OK, Zong X, LoConte NK, Cao Y. Alcohol consumption among adults with a cancer diagnosis in the All of Us research program. JAMA Netw Open. 2023;6(8):e2328328. doi:10.1001/jamanetworkopen.2023.28328
- Tevik K, Bergh S, Selbæk G, Johannessen A, Helvik AS. A systematic review of self-report measures used in epidemiological studies to assess alcohol consumption among older adults. PLOS One. 2021;16(12):e0261292. doi:10.1371/journal.pone.0261292
- Singer A, Kosowan L, Loewen S, Spitoff S, Greiver M, Lynch J. Who is asked about alcohol consumption? A retrospective cohort study using a national repository of Electronic Medical Records. Prev Med Rep. 2021;22:101346. doi:10.1016/j.pmedr.2021.101346
- Gupta I, Dandavate R, Gupta P, Agrawal V, Kapoor M. Recent advances in genetic studies of alcohol use disorders. Curr Genet Med Rep. 2020;8(2):27-34. doi:10.1007/s40142-020-00185-9
- Day AM, Kahler CW, Ahern DC, Clark US. Executive functioning in alcohol use studies: A brief review of findings and challenges in assessment. Curr Drug Abuse Rev. 2015;8(1):26-40. doi:10.2174/1874473708666150416110515
- Zahr NM, Pfefferbaum A. Alcohol’s effects on the brain: Neuroimaging results in humans and animal models. Alcohol Res. 2017;38(2):183-206. doi:10.35946/arcr.v38.2.04
- Haucke M, Heinzel S, Liu S. Social mobile sensing and problematic alcohol consumption: Insights from smartphone metadata. Int J Med Inform. 2024;188:105486. doi:10.1016/j.ijmedinf.2024.105486
- De Rosa O, Menghini L, Kerr E, et al. Exploring the relationship between sleep patterns, alcohol and other substances consumption in young adults: Insights from wearables and Mobile surveys in the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA) cohort. Int J Psychophysiol. 2025;209:112524. doi:10.1016/j.ijpsycho.2025.112524
- Davis-Martin RE, Alessi SM, Boudreaux ED. Alcohol use disorder in the age of technology: A review of wearable biosensors in alcohol use disorder treatment. Front Psychiatry. 2021;12:642813. doi:10.3389/fpsyt.2021.642813
- Piasecki TM. Assessment of alcohol use in the natural environment. Alcohol Clin Exp Res. 2019;43(4):564-577. doi:10.1111/acer.13975
- Stone C, Adams S, Wootton RE, Skinner A. Smartwatch-based ecological momentary assessment for high-temporal-density, longitudinal measurement of alcohol use (AlcoWatch): Feasibility evaluation. JMIR Form Res. 2025;9:e63184. doi:10.2196/63184
- Stevenson BL, Kunicki ZJ, Brick L, Blevins CE, Stein M, Abrantes AM. Using ecological momentary assessments and Fitbit data to examine daily associations between physical activity, affect and alcohol cravings in patients with alcohol use disorder. Int J Behav Med. 2022;29(5):543-552. doi:10.1007/s12529-021-10039-5
- Abrantes AM, Blevins CE, Battle CL, Read JP, Gordon AL, Stein MD. Developing a Fitbit-supported lifestyle physical activity intervention for depressed alcohol dependent women. J Subst Abuse Treat. 2017;80:88-97. doi:10.1016/j.jsat.2017.07.006
- Vilar-Ribó L, Cabana-Domínguez J, Alemany S, et al. Disentangling heterogeneity in substance use disorder: Insights from genome-wide polygenic scores. Transl Psychiatry. 2024;14(1):221. doi:10.1038/s41398-024-02923-x
- Yang W, Singla R, Maheshwari O, Fontaine CJ, Gil-Mohapel J. Alcohol use disorder: Neurobiology and therapeutics. Biomedicines. 2022;10(5):1192. doi:10.3390/biomedicines10051192
- Mattoni M, Fisher AJ, Gates KM, Chein J, Olino TM. Group-to-individual generalizability and individual-level inferences in cognitive neuroscience. Neurosci Biobehav Rev. 2025;169:106024. doi:10.1016/j.neubiorev.2025.106024
- Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Science. 2015;349(6245):255-260. doi:10.1126/science.aaa8415
- Jiang T, Gradus JL, Rosellini AJ. Supervised machine learning: A brief primer. Behav Ther. 2020;51(5):675-687. doi:10.1016/j.beth.2020.05.002
- Wu X, Liang C, Bustillo J, et al. The impact of atlas parcellation on functional connectivity analysis across six psychiatric disorders. Hum Brain Mapp. 2025;46(5):e70206. doi:10.1002/hbm.70206
- LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
- Yang Y, Zhang H, Gichoya JW, Katabi D, Ghassemi M. The limits of fair medical imaging AI in real-world generalization. Nat Med. 2024;30(10):2838-2848. doi:10.1038/s41591-024-03113-4
- O’Halloran L, Pennie B, Jollans L, et al. A combination of impulsivity subdomains predict alcohol intoxication frequency. Alcohol Clin Exp Res. 2018;42(8):1530-1540. doi:10.1111/acer.13779
- Kim SY, Park T, Kim K, Oh J, Park Y, Kim DJ. A deep learning algorithm to predict hazardous drinkers and the severity of alcohol-related problems using K-NHANES. Front Psychiatry. 2021;12:684406. doi:10.3389/fpsyt.2021.684406
- Bonnell LN, Littenberg B, Wshah SR, Rose GL. A machine learning approach to identification of unhealthy drinking. J Am Board Fam Med. 2020;33(3):397-406. doi:10.3122/jabfm.2020.03.190421
- Johnson KA, McDaniel JT, Okine J, et al. A machine learning model for the prediction of unhealthy alcohol use among women of childbearing age in Alabama. Alcohol Alcohol. 2024;59(2):agad075. doi:10.1093/alcalc/agad075
- Guggenmos M, Schmack K, Veer IM, et al. A multimodal neuroimaging classifier for alcohol dependence. Sci Rep. 2020;10(1):298. doi:10.1038/s41598-019-56923-9
- May AC, Jacobus J, Simmons AN, Tapert SF. A prospective investigation of youth alcohol experimentation and reward responsivity in the ABCD study. Front Psychiatry. 2022;13:886848. doi:10.3389/fpsyt.2022.886848
- Fairbairn CE, Han J, Caumiant EP, Benjamin AS, Bosch N. A wearable alcohol biosensor: Exploring the accuracy of transdermal drinking detection. Drug Alcohol Depend. 2025;266:112519. doi:10.1016/j.drugalcdep.2024.112519
- Wyant K, Moshontz H, Ward SB, Fronk GE, Curtin JJ. Acceptability of personal sensing among people with alcohol use disorder: Observational study. JMIR Mhealth Uhealth. 2023;11:e41833. doi:10.2196/41833
- Mulholland PJ, Berto S, Wilmarth PA, McMahan C, Ball LE, Woodward JJ. Adaptor protein complex 2 in the orbitofrontal cortex predicts alcohol use disorder. Mol Psychiatry. 2023;28(11):4766-4776. doi:10.1038/s41380-023-02236-3
- Sun D, Adduru VR, Phillips RD, et al. Adolescent alcohol use is linked to disruptions in age-appropriate cortical thinning: An unsupervised machine learning approach. Neuropsychopharmacology. 2023;48(2):317-326. doi:10.1038/s41386-022-01457-4
- Park SH, Zhang Y, Kwon D, et al. Alcohol use effects on adolescent brain development revealed by simultaneously removing confounding factors, identifying morphometric patterns, and classifying individuals. Sci Rep. 2018;8(1):8297. doi:10.1038/s41598-018-26627-7
- Oszkinat C, Luczak SE, Rosen IG. An abstract parabolic system-based physics-informed long short-term memory network for estimating breath alcohol concentration from transdermal alcohol biosensor data. Neural Comput Appl. 2022;34(21):18933-18951. doi:10.1007/s00521-022-07505-w
- Lin Y, Kranzler HR, Farrer LA, Xu H, Henderson DC, Zhang H. An analysis of the effect of mu-opioid receptor gene (OPRM1) promoter region DNA methylation on the response of naltrexone treatment of alcohol dependence. Pharmacogenomics J. 2020;20(5):672-680. doi:10.1038/s41397-020-0158-1
- Mumtaz W, Vuong PL, Xia L, Malik AS, Rashid RBA. An EEG-based machine learning method to screen alcohol use disorder. Cogn Neurodyn. 2017;11(2):161-171. doi:10.1007/s11571-016-9416-y
- Smink WAC, Sools AM, Postel MG, et al. Analysis of the Emails from the Dutch web-based intervention “Alcohol de Baas”: Assessment of early indications of drop-out in an online alcohol abuse intervention. Front Psychiatry. 2021;12:575931. doi:10.3389/fpsyt.2021.575931
- Collin A, Ayuso-Muñoz A, Tejera-Nevado P, et al. Analyzing dropout in alcohol recovery programs: A machine learning approach. J Clin Med. 2024;13(16):4825. doi:10.3390/jcm13164825
- Zhang Z, Zhang S, Huang J, et al. Association between abnormal plasma metabolism and brain atrophy in alcohol-dependent patients. Front Mol Neurosci. 2022;15:999938. doi:10.3389/fnmol.2022.999938
- Komarnyckyj M, Retzler C, Cao Z, et al. At-risk alcohol users have disrupted valence discrimination during reward anticipation. Addict Biol. 2022;27(3):e13174. doi:10.1111/adb.13174
- Song H, Yang P, Zhang X, et al. Atypical effective connectivity from the frontal cortex to striatum in alcohol use disorder. Transl Psychiatry. 2024;14(1):381. doi:10.1038/s41398-024-03083-8
- Besong OTO, Koo JS, Zhang H. Brain lncRNA-mRNA co-expression regulatory networks and alcohol use disorder. Genomics. 2024;116(5):110928. doi:10.1016/j.ygeno.2024.110928
- Curtis B, Giorgi S, Buffone AEK, et al. Can Twitter be used to predict county excessive alcohol consumption rates? PLOS One. 2018;13(4):e0194290. doi:10.1371/journal.pone.0194290
- Uceta M, Cerro-León AD, Shpakivska-Bilán D, García-Moreno LM, Maestú F, Antón-Toro LF. Clustering electrophysiological predisposition to binge drinking: An unsupervised machine learning analysis. Brain Behav. 2024;14(11):e70157. doi:10.1002/brb3.70157
- Stevely AK, Holmes J, Meier PS. Combinations of drinking occasion characteristics associated with units of alcohol consumed among British adults: An event-level decision tree modeling study. Alcohol Clin Exp Res. 2021;45(3):630-637. doi:10.1111/acer.14560
- Zhu X, Huang J, Huang S, et al. Combining metabolomics and interpretable machine learning to reveal plasma metabolic profiling and biological correlates of alcohol-dependent inpatients: What about tryptophan metabolism regulation? Front Mol Biosci. 2021;8:760669. doi:10.3389/fmolb.2021.760669
- Peng Q, Wilhelmsen KC, Ehlers CL. Common genetic substrates of alcohol and substance use disorder severity revealed by pleiotropy detection against GWAS catalog in two populations. Addict Biol. 2021;26(1):e12877. doi:10.1111/adb.12877
- Pinar-Sanchez J, Bermejo López P, Solís García Del Pozo J, et al. Common laboratory parameters are useful for screening for alcohol use disorder: Designing a predictive model using machine learning. J Clin Med. 2022;11(7):2061. doi:10.3390/jcm11072061
- Li Y, Li G, Yang L, et al. Connectomics modeling of regional networks of white-matter fractional anisotropy to predict the severity of young adult drinking. Quant Imaging Med Surg. 2025;15(3):2405-2419. doi:10.21037/qims-24-2131
- Ebrahimi A, Wiil UK, Mansourvar M, Naemi A, Andersen K, Nielsen AS. Deep neural network to identify patients with alcohol use disorder. Stud Health Technol Inform. 2021;281:238-242. doi:10.3233/SHTI210156
- Miranda O, Fan P, Qi X, et al. DeepBiomarker2: Prediction of alcohol and substance use disorder risk in post-traumatic stress disorder patients using electronic medical records and multiple social determinants of health. J Pers Med. 2024;14(1):94. doi:10.3390/jpm14010094
- Crocamo C, Viviani M, Bartoli F, Carrà G, Pasi G. Detecting binge drinking and alcohol-related risky behaviours from Twitter’s users: An exploratory content- and topology-based analysis. Int J Environ Res Public Health. 2020;17(5):1510. doi:10.3390/ijerph17051510
- Bharat C, Glantz MD, Aguilar-Gaxiola S, et al. Development and evaluation of a risk algorithm predicting alcohol dependence after early onset of regular alcohol use. Addiction. 2023;118(5):954-966. doi:10.1111/add.16122
- Bush NJ, Cushnie AK, Sinclair M, et al. Development of an accelerometer-based wearable sensor approach for alcohol consumption detection. Alcohol Clin Exp Res. 2024;48(12):2341-2351. doi:10.1111/acer.15465
- Lee S. Development of deep learning auto-encoder algorithms for predicting alcohol use in Korean adolescents based on cross-sectional data. Soc Sci Med. 2025;367:117690. doi:10.1016/j.socscimed.2025.117690
- Kamarajan C, Ardekani BA, Pandey AK, et al. Differentiating individuals with and without alcohol use disorder using resting-state fMRI functional connectivity of reward network, neuropsychological performance, and impulsivity measures. Behav Sci. 2022;12(5):128. doi:10.3390/bs12050128
- Chen F, Xiao M, Chen C, et al. Discrimination of alcohol dependence based on the convolutional neural network. PLOS One. 2020;15(10):e0241268. doi:10.1371/journal.pone.0241268
- Liang X, Justice AC, So-Armah K, Krystal JH, Sinha R, Xu K. DNA methylation signature on phosphatidylethanol, not on self-reported alcohol consumption, predicts hazardous alcohol consumption in two distinct populations. Mol Psychiatry. 2021;26(6):2238-2253. doi:10.1038/s41380-020-0668-x
- Derksen M, van Beek M, Blankers M, et al. Effectiveness of machine learning-based adjustments to an eHealth intervention targeting mild alcohol use. Eur Addict Res. 2025;31(1):47-59. doi:10.1159/000543252
- Li R, Balakrishnan GP, Nie J, et al. Estimation of blood alcohol concentration from smartphone gait data using neural networks. IEEE Access. 2021;9:61237-61255. doi:10.1109/access.2021.3054515
- Ariss T, Fairbairn CE, Bosch N. Examining new-generation transdermal alcohol biosensor performance across laboratory and field contexts. Alcohol Clin Exp Res. 2023;47(1):50-59. doi:10.1111/acer.14977
- Marengo D, Azucar D, Giannotta F, Basile V, Settanni M. Exploring the association between problem drinking and language use on Facebook in young adults. Heliyon. 2019;5(10):e02523. doi:10.1016/j.heliyon.2019.e02523
- Lin Y, Sharma B, Thompson HM, et al. External validation of a machine learning classifier to identify unhealthy alcohol use in hospitalized patients. Addiction. 2022;117(4):925-933. doi:10.1111/add.15730
- Schwebel FJ, Wilson AD, Pearson MR, McCool MW, Witkiewitz K. Finding purpose: Integrated latent profile and machine learning analyses identify purpose in life as an important predictor of high-functioning recovery after alcohol treatment. Addict Behav. 2025;165:108273. doi:10.1016/j.addbeh.2025.108273
- Huang T, Elghafari A, Relia K, Chunara R. High-resolution temporal representations of alcohol and tobacco behaviors from social media data. Proc ACM Hum Comput Interact. 2017;1(54):1-26. doi:10.1145/3134689
- Ebrahimi A, Wiil UK, Naemi A, Mansourvar M, Andersen K, Nielsen AS. Identification of clinical factors related to prediction of alcohol use disorder from electronic health records using feature selection methods. BMC Med Inform Decis Mak. 2022;22(1):304. doi:10.1186/s12911-022-02051-w
- Zhu T, Becquey C, Chen Y, Lejuez CW, Li CR, Bi J. Identifying alcohol misuse biotypes from neural connectivity markers and concurrent genetic associations. Transl Psychiatry. 2022;12(1):253. doi:10.1038/s41398-022-01983-1
- Vergara VM, Espinoza FA, Calhoun VD. Identifying alcohol use disorder with resting state functional magnetic resonance imaging data: A comparison among machine learning classifiers. Front Psychol. 2022;13:867067. doi:10.3389/fpsyg.2022.867067
- Zhao Q, Paschali M, Dehoney J, et al. Identifying high school risk factors that forecast heavy drinking onset in understudied young adults. Dev Cogn Neurosci. 2024;68:101413. doi:10.1016/j.dcn.2024.101413
- Cavicchioli M, Calesella F, Cazzetta S, et al. Investigating predictive factors of dialectical behavior therapy skills training efficacy for alcohol and concurrent substance use disorders: A machine learning study. Drug Alcohol Depend. 2021;224:108723. doi:10.1016/j.drugalcdep.2021.108723
- Guleken Z, Sarıbal D, Mırsal H, Cebulski J, Ceylan Z, Depciuch J. Investigating the impact of long-term alcohol consumption on serum chemical changes: Fourier transform infrared spectroscopy for human blood serum. J Biophotonics. 2025;18(5):e202400550. doi:10.1002/jbio.202400550
- Ruberu TLM, Kenyon EA, Hudson KA, et al. Joint risk prediction for hazardous use of alcohol, cannabis, and tobacco among adolescents: A preliminary study using statistical and machine learning. Prev Med Rep. 2022;25:101674. doi:10.1016/j.pmedr.2021.101674
- Morris LS, Kundu P, Baek K, et al. Jumping the gun: Mapping neural correlates of waiting impulsivity and relevance across alcohol misuse. Biol Psychiatry. 2016;79(6):499-507. doi:10.1016/j.biopsych.2015.06.009
- Sania A, Pini N, Nelson ME, et al. K-nearest neighbor algorithm for imputing missing longitudinal prenatal alcohol data. Adv Drug Alcohol Res. 2024;4:13449. doi:10.3389/adar.2024.13449
- Andrade FC, Meyerson WU, Hoyle RH. Large-scale longitudinal analysis of the progression of alcohol use among members of a social media platform: An observational study. Am J Drug Alcohol Abuse. 2024;51(1):116-126. doi:10.1080/00952990.2024.2414324
- Bae SW, Suffoletto B, Zhang T, et al. Leveraging mobile phone sensors, machine learning, and explainable artificial intelligence to predict imminent same-day binge-drinking events to support just-in-time adaptive interventions: Algorithm development and validation study. JMIR Form Res. 2023;7:e39862. doi:10.2196/39862
- Zhang Z, Robinson L, Whelan R, et al. Machine learning models for diagnosis and risk prediction in eating disorders, depression, and alcohol use disorder. J Affect Disord. 2025;379:889-899. doi:10.1016/j.jad.2024.12.053
- Wyant K, Sant’Ana SJ, Fronk GE, Curtin JJ. Machine learning models for temporally precise lapse prediction in alcohol use disorder. J Psychopathol Clin Sci. 2024;133(7):527-540. doi:10.1037/abn0000901
- Zhu T, Wang W, Chen Y, Kranzler HR, Li CR, Bi J. Machine learning of functional connectivity to biotype alcohol and nicotine use disorders. Biol Psychiatry Cogn Neurosci Neuroimaging. 2024;9(3):326-336. doi:10.1016/j.bpsc.2023.08.010
- Symons M, Feeney GFX, Gallagher MR, Young RM, Connor JP. Machine learning vs addiction therapists: A pilot study predicting alcohol dependence treatment outcome from patient data in behavior therapy with adjunctive medication. J Subst Abuse Treat. 2019;99:156-162. doi:10.1016/j.jsat.2019.01.020
- Rezapour M, Niazi MKK, Gurcan MN. Machine learning-based analytics of the impact of the Covid-19 pandemic on alcohol consumption habit changes among United States healthcare workers. Sci Rep. 2023;13(1):6003. doi:10.1038/s41598-023-33222-y
- Afzali MH, Sunderland M, Stewart S, et al. Machine-learning prediction of adolescent alcohol use: A cross-study, cross-cultural validation. Addiction. 2019;114(4):662-671. doi:10.1111/add.14504
- Hinton DJ, Vázquez MS, Geske JR, et al. Metabolomics biomarkers to predict acamprosate treatment response in alcohol-dependent subjects. Sci Rep. 2017;7(1):2496. doi:10.1038/s41598-017-02442-4
- Kummerfeld E, Anker JA, Rix A, Kushner MG. Methodological advances in the study of hidden variables: A demonstration on clinical alcohol use disorder data. AMIA Annu Symp Proc. 2018;2018:710-719.
- Sangle SB, Kachare PH, Puri DV, Al-Shoubarji I, Jabbari A, Kirner R. Explaining electroencephalogram channel and subband sensitivity for alcoholism detection. Comput Biol Med. 2025;188:109826. doi:10.1016/j.compbiomed.2025.109826
- Anuragi A, Singh Sisodia D. Alcohol use disorder detection using EEG signal features and flexible analytical wavelet transform. Biomed Signal Process Control. 2019;52:384-393. doi:10.1016/j.bspc.2018.10.017
- Bae S, Chung T, Ferreira D, Dey AK, Suffoletto B. Mobile phone sensors and supervised machine learning to identify alcohol use events in young adults: Implications for just-in-time adaptive interventions. Addict Behav. 2018;83:42-47. doi:10.1016/j.addbeh.2017.11.039
- Grodin EN, Montoya AK, Bujarski S, Ray LA. Modeling motivation for alcohol in humans using traditional and machine learning approaches. Addict Biol. 2021;26(3):e12949. doi:10.1111/adb.12949
- Lee JY, Song MS, Yoo SY, et al. Multimodal-based machine learning approach to classify features of internet gaming disorder and alcohol use disorder: A sensor-level and source-level resting-state electroencephalography activity and neuropsychological study. Compr Psychiatry. 2024;130:152460. doi:10.1016/j.comppsych.2024.152460
- Afshar M, Phillips A, Karnik N, et al. Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: Development and internal validation. J Am Med Inform Assoc. 2019;26(3):254-261. doi:10.1093/jamia/ocy166
- Squeglia LM, Ball TM, Jacobus J, et al. Neural predictors of initiating alcohol use during adolescence. Am J Psychiatry. 2017;174(2):172-185. doi:10.1176/appi.ajp.2016.15121587
- Sekutowicz M, Guggenmos M, Kuitunen-Paul S, et al. Neural response patterns during Pavlovian-to-instrumental transfer predict alcohol relapse and young adult drinking. Biol Psychiatry. 2019;86(11):857-863. doi:10.1016/j.biopsych.2019.06.028
- Mohd Nazri AK, Yahya N, Khan DM, et al. Partial directed coherence analysis of resting-state EEG signals for alcohol use disorder detection using machine learning. Front Neurosci. 2024;18:1524513. doi:10.3389/fnins.2024.1524513
- Witkiewitz K, Kirouac M, Baurley JW, McMahan CS. Patterns of drinking behavior around a treatment episode for alcohol use disorder: Predictions from pre-treatment measures. Alcohol Clin Exp Res. 2023;47(11):2138-2148. doi:10.1111/acer.15183
- Marcon G, de Ávila Pereira F, Zimerman A, et al. Patterns of high-risk drinking among medical students: A web-based survey with machine learning. Comput Biol Med. 2021;136:104747. doi:10.1016/j.compbiomed.2021.104747
- Leenaerts N, Soyster P, Ceccarini J, Sunaert S, Fisher A, Vrieze E. Person-specific and pooled prediction models for binge eating, alcohol use and binge drinking in bulimia nervosa and alcohol use disorder. Psychol Med. 2024;54(10):2758-2773. doi:10.1017/S0033291724000862
- Yang JJ, Luo X, Trucco EM, Buu A. Polygenic risk prediction based on singular value decomposition with applications to alcohol use disorder. BMC Bioinform. 2022;23(1):28. doi:10.1186/s12859-022-04566-5
- Soyster PD, Ashlock L, Fisher AJ. Pooled and person-specific machine learning models for predicting future alcohol consumption, craving, and wanting to drink: A demonstration of parallel utility. Psychol Addict Behav. 2022;36(3):296-306. doi:10.1037/adb0000666
- Symons M, Feeney GFX, Gallagher MR, Young RM, Connor JP. Predicting alcohol dependence treatment outcomes: A prospective comparative study of clinical psychologists versus “trained” machine learning models. Addiction. 2020;115(11):2164-2175. doi:10.1111/add.15038
- Kinreich S, McCutcheon VV, Aliev F, et al. Predicting alcohol use disorder remission: A longitudinal multimodal multi-featured machine learning approach. Transl Psychiatry. 2021;11(1):166. doi:10.1038/s41398-021-01281-2
- Kamarajan C, Pandey AK, Chorlian DB, et al. Predicting alcohol-related memory problems in older adults: A machine learning study with multi-domain features. Behav Sci. 2023;13(5):427. doi:10.3390/bs13050427
- Leaks K, Norden-Krichmar T, Brody JP. Predicting moderate drinking behaviors in National Health and Nutrition Examination Survey participants using biochemical and demographical factors with machine learning. Alcohol. 2023;113:1-10. doi:10.1016/j.alcohol.2023.07.005
- Zhang J, Qian S, Su G, Deng C, Yu P. Predicting readmission following hospital treatment for patients with alcohol related diagnoses in an Australian regional health district. Stud Health Technol Inform. 2022;290:1072-1073. doi:10.3233/SHTI220273
- Ramos LA, Blankers M, van Wingen G, de Bruijn T, Pauws SC, Goudriaan AE. Predicting success of a digital self-help intervention for alcohol and substance use with machine learning. Front Psychol. 2021;12:734633. doi:10.3389/fpsyg.2021.734633
- Seo S, Mohr J, Beck A, Wüstenberg T, Heinz A, Obermayer K. Predicting the future relapse of alcohol-dependent patients from structural and functional brain images. Addict Biol. 2015;20(6):1042-1055. doi:10.1111/adb.12302
- Agarwal K, Chaudhary S, Tomasi D, Volkow ND, Joseph PV. Prediction of alcohol intake patterns with olfactory and gustatory brain connectivity networks. Neuropsychopharmacology. 2025;50(7):1167-1175. doi:10.1038/s41386-025-02058-7
- Chung T, Suffoletto B, Feldstein Ewing SW, Bhurosy T, Jiang Y, Valera P. Prediction rules identify which young adults have higher rates of heavy episodic drinking after exposure to 12-week text message interventions. Subst Use Addctn J. 2024;45(1):144-149. doi:10.1177/29767342231206653
- Gueorguieva R, Wu R, Fucito LM, O’Malley SS. Predictors of abstinence from heavy drinking during follow-up in COMBINE. J Stud Alcohol Drugs. 2015;76(6):935-941. doi:10.15288/jsad.2015.76.935
- Wallach JD, Gueorguieva R, Phan H, Witkiewitz K, Wu R, O’Malley SS. Predictors of abstinence, no heavy drinking days, and a 2-level reduction in World Health Organization drinking levels during treatment for alcohol use disorder in the COMBINE study. Alcohol Clin Exp Res. 2022;46(7):1331-1339. doi:10.1111/acer.14877
- Zhu X, Du X, Kerich M, Lohoff FW, Momenan R. Random forest based classification of alcohol dependence patients and healthy controls using resting state MRI. Neurosci Lett. 2018;676:27-33. doi:10.1016/j.neulet.2018.04.007
- Kamarajan C, Ardekani BA, Pandey AK, et al. Random forest classification of alcohol use disorder using fMRI functional connectivity, neuropsychological functioning, and impulsivity measures. Brain Sci. 2020;10(2):115. doi:10.3390/brainsci10020115
- Schwebel FJ, Pearson MR, Richards DK, et al. Regression tree applications to studying alcohol-related problems among college students. Exp Clin Psychopharmacol. 2024;32(5):542-553. doi:10.1037/pha0000718
- Duadi D, Yosovich A, Beiderman M, et al. Remote sensing of alcohol consumption using machine learning speckle pattern analysis. J Biomed Opt. 2025;30(3):037001. doi:10.1117/1.JBO.30.3.037001
- Fede SJ, Grodin EN, Dean SF, Diazgranados N, Momenan R. Resting state connectivity best predicts alcohol use severity in moderate to heavy alcohol users. NeuroImage Clin. 2019;22:101782. doi:10.1016/j.nicl.2019.101782
- Rosato AJ, Chen X, Tanaka Y, et al. Salivary microRNAs identified by small RNA sequencing and machine learning as potential biomarkers of alcohol dependence. Epigenomics. 2019;11(7):739-749. doi:10.2217/epi-2018-0177
- Didier NA, King AC, Polley EC, Fridberg DJ. Signal processing and machine learning with transdermal alcohol concentration to predict natural environment alcohol consumption. Exp Clin Psychopharmacol. 2024;32(2):245-254. doi:10.1037/pha0000683
- Rane RP, de Man EF, Kim J, et al. Structural differences in adolescent brains can predict alcohol misuse. Elife. 2022;11:e77545. doi:10.7554/eLife.77545
- Weidacker K, Kim SG, Buhl-Callesen M, et al. The prediction of resilience to alcohol consumption in youths: Insular and subcallosal cingulate myeloarchitecture. Psychol Med. 2022;52(11):2032-2042. doi:10.1017/S0033291720003852
- Dagnew TM, Tseng CJ, Yoo CH, et al. Toward AI-driven neuroepigenetic imaging biomarker for alcohol use disorder: A proof-of-concept study. iScience. 2024;27(7):110159. doi:10.1016/j.isci.2024.110159
- Rane RP, Musial MPM, Beck A, et al. Uncontrolled eating and sensation-seeking partially explain the prediction of future binge drinking from adolescent brain structure. NeuroImage Clin. 2023;40:103520. doi:10.1016/j.nicl.2023.103520
- Lindner P, Johansson M, Gajecki M, Berman AH. Using alcohol consumption diary data from an internet intervention for outcome and predictive modeling: A validation and machine learning study. BMC Med Res Methodol. 2020;20(1):111. doi:10.1186/s12874-020-00995-z
- Amialchuk A, Sapci O, Elhai JD. Applying machine learning methods to model social interactions in alcohol consumption among adolescents. Addict Res Theory. 2021;29(5):436-443. doi:10.1080/16066359.2021.1887147
- Lee MR, Sankar V, Hammer A, et al. Using machine learning to classify individuals with alcohol use disorder based on treatment seeking status. EClinicalmedicine. 2019;12:70-78. doi:10.1016/j.eclinm.2019.05.008
- Schwebel FJ, Emery NN, Pfund RA, Pearson MR, Witkiewitz K. Using machine learning to examine predictors of treatment goal change among individuals seeking treatment for alcohol use disorder. J Subst Abuse Treat. 2022;140:108825. doi:10.1016/j.jsat.2022.108825
- Walters ST, Businelle MS, Suchting R, Li X, Hébert ET, Mun EY. Using machine learning to identify predictors of imminent drinking and create tailored messages for at-risk drinkers experiencing homelessness. J Subst Abuse Treat. 2021;127:108417. doi:10.1016/j.jsat.2021.108417
- Roberts W, Zhao Y, Verplaetse T, et al. Using machine learning to predict heavy drinking during outpatient alcohol treatment. Alcohol Clin Exp Res. 2022;46(4):657-666. doi:10.1111/acer.14802
- To D, Sharma B, Karnik N, Joyce C, Dligach D, Afshar M. Validation of an alcohol misuse classifier in hospitalized patients. Alcohol. 2020;84:49-55. doi:10.1016/j.alcohol.2019.09.008
- Rehm J, Manthey J, Struzzo P, Gual A, Wojnar M. Who receives treatment for alcohol use disorders in the European Union? A cross-sectional representative study in primary and specialized health care. Eur Psychiatry. 2015;30(8):885-893. doi:10.1016/j.eurpsy.2015.07.012
- Foster S, Gmel G, Mohler-Kuo M. Young Swiss men’s risky single-occasion drinking: Identifying those who do not respond to stricter alcohol policy environments. Drug Alcohol Depend. 2022;234:109410. doi:10.1016/j.drugalcdep.2022.109410
- Kumari D, Swetapadma A. A novel method for predicting time of alcohol use based on personality traits and demographic information. IETE J Res. 2023;69(11):7846-7855. doi:10.1080/03772063.2022.2060874
- Ruiz-España S, Ortiz-Ramón R, Pérez-Ramírez Ú, et al. MRI texture-based radiomics analysis for the identification of altered functional networks in alcoholic patients and animal models. Comput Med Imaging Graph. 2023;104:102187. doi:10.1016/j.compmedimag.2023.102187
- Adeli E, Zahr NM, Pfefferbaum A, Sullivan EV, Pohl KM. Novel machine learning identifies brain patterns distinguishing diagnostic membership of human immunodeficiency virus, alcoholism, and their comorbidity of individuals. Biol Psychiatry Cogn Neurosci Neuroimaging. 2019;4(6):589-599. doi:10.1016/j.bpsc.2019.02.003
- Mumtaz W, Vuong PL, Xia L, Malik AS, Rashid RBA. Automatic diagnosis of alcohol use disorder using EEG features. Knowl-Based Syst. 2016;105:48-59. doi:10.1016/j.knosys.2016.04.026
- Bishop CM. Pattern Recognition and Machine Learning. New York, NY: Springer-Verlag; 2006.
- White AM. Gender differences in the epidemiology of alcohol use and related harms in the United States. Alcohol Res. 2020;40(2):01. doi:10.35946/arcr.v40.2.01
- Carlini LE, Fernandez AC, Mellinger JL. Sex and gender in alcohol use disorder and alcohol-associated liver disease in the United States: A narrative review. Hepatology. 2024;83(1):178-194. doi:10.1097/HEP.0000000000000905
- National Institute on Alcohol Abuse and Alcoholism. Alcohol’s Effects on Health: Women and Alcohol. Updated 2025. https://www.niaaa.nih.gov/publications/brochures-and-fact-sheets/women-and-alcohol.
- Keyes KM. Age, period, and cohort effects in alcohol use in the United States in the 20th and 21st centuries: Implications for the coming decades. Alcohol Res. 2022;42(1):02. doi:10.35946/arcr.v42.1.02
- Zucker RA. Anticipating problem alcohol use developmentally from childhood into middle adulthood: What have we learned? Addiction. 2008;103(suppl 1):100-108. doi:10.1111/j.1360-0443.2008.02179.x
- Dawson DA, Goldstein RB, Chou SP, Ruan WJ, Grant BF. Age at first drink and the first incidence of adult-onset DSM-IV alcohol use disorders. Alcohol Clin Exp Res. 2008;32(12):2149-2160. doi:10.1111/j.1530-0277.2008.00806.x
- Merrill JE, Carey KB. Drinking over the lifespan: Focus on college ages. Alcohol Res. 2016;38(1):103-114. doi:10.35946/arcr.v38.1.13
- Rohde P, Lewinsohn PM, Kahler CW, Seeley JR, Brown RA. Natural course of alcohol use disorders from adolescence to young adulthood. J Am Acad Child Adolesc Psychiatry. 2001;40(1):83-90. doi:10.1097/00004583-200101000-00020
- Sher KJ, Gotham HJ. Pathological alcohol involvement: A developmental disorder of young adulthood. Dev Psychopathol. 1999;11(4):933-956. doi:10.1017/s0954579499002394
- Patrick ME, Terry-McElrath YM, Lanza ST, Jager J, Schulenberg JE, O’Malley PM. Shifting age of peak binge drinking prevalence: Historical changes in normative trajectories among young adults aged 18 to 30. Alcohol Clin Exp Res. 2019;43(2):287-298. doi:10.1111/acer.13933
- Patrick ME, Pang YC, Jang BJ, Arterberry BJ, Terry-McElrath YM. Alcohol use disorder symptoms reported during midlife: Results from the Monitoring the Future study among US adults at modal ages 50, 55, and 60. Subst Use Misuse. 2023;58(3):380-388. doi:10.1080/10826084.2022.2161826
- Hingson RW, Zha W. Age of drinking onset, alcohol use disorders, frequent heavy drinking, and unintentionally injuring oneself and others after drinking. Pediatrics. 2009;123(6):1477-1484. doi:10.1542/peds.2008-2176
- Chikritzhs T, Livingston M. Alcohol and the risk of injury. Nutrients. 2021;13(8):2777. doi:10.3390/nu13082777
- Institute for Health Metrics and Evaluation. Findings from the Global Burden of Disease Study 2017. 2018. https://www.healthdata.org/sites/default/files/files/policy_report/2019/GBD_2017_Booklet.pdf.
- Manthey J, Hassan SA, Carr S, Kilian C, Kuitunen-Paul S, Rehm J. What are the economic costs to society attributable to alcohol use? A systematic review and modelling study. Pharmacoeconomics. 2021;39(7):809-822. doi:10.1007/s40273-021-01031-8
- Thabtah F, Hammoud S, Kamalov F, Gonsalves A. Data imbalance in classification: Experimental evaluation. Inf Sci. 2020;513:429-441. doi:10.1016/j.ins.2019.11.004
- Ekhtiari H, Sangchooli A, Carmichael O, et al. Neuroimaging biomarkers in addiction. medRxiv. 2024. doi:10.1101/2024.09.02.24312084
- Volkow ND, Baler RD. Brain imaging biomarkers to predict relapse in alcohol addiction. JAMA Psychiatry. 2013;70(7):661-663. doi:10.1001/jamapsychiatry.2013.1141
- Hargreaves TL, McIntyre-Wood C, Vandehei E, et al. Brain structural magnetic resonance imaging predictors of brief intervention response in individuals with alcohol use disorder. Alcohol Alcohol. 2025;60(3):agaf009. doi:10.1093/alcalc/agaf009
- Elliott ML, Knodt AR, Ireland D, et al. What is the test-retest reliability of common task-functional MRI measures? New empirical evidence and a meta-analysis. Psychol Sci. 2020;31(7):792-806. doi:10.1177/0956797620916786
- Marek S, Tervo-Clemmens B, Calabro FJ, et al. Reproducible brain-wide association studies require thousands of individuals. Nature. 2022;603(7902):654-660. doi:10.1038/s41586-022-04492-9
- Tomasi D, Volkow ND. Association between brain activation and functional connectivity. Cereb Cortex. 2019;29(5):1984-1996. doi:10.1093/cercor/bhy077
- Passaro AD, Vettel JM, McDaniel J, Lawhern V, Franaszczuk PJ, Gordon SM. A novel method linking neural connectivity to behavioral fluctuations: Behavior-regressed connectivity. J Neurosci Methods. 2017;279:60-71. doi:10.1016/j.jneumeth.2017.01.010
- Mascarell Maričić L, Walter H, Rosenthal A, et al. The IMAGEN study: A decade of imaging genetics in adolescents. Mol Psychiatry. 2020;25(11):2648-2671. doi:10.1038/s41380-020-0822-5
- Hua J, Xiong Z, Lowey J, Suh E, Dougherty ER. Optimal number of features as a function of sample size for various classification rules. Bioinformatics. 2005;21(8):1509-1515. doi:10.1093/bioinformatics/bti171
- Ebrahimi A, Wiil UK, Schmidt T, et al. Predicting the risk of alcohol use disorder using machine learning: A systematic literature review. IEEE Access. 2021;9:151697-151712. doi:10.1109/ACCESS.2021.3126777
- Zantvoort K, Nacke B, Görlich D, Hornstein S, Jacobi C, Funk B. Estimation of minimal data sets sizes for machine learning predictions in digital mental health interventions. NPJ Digit Med. 2024;7(1):361. doi:10.1038/s41746-024-01360-w
- Chen T, Guestrin C. XGBoost: A scalable tree boosting system. Presented at Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 13-17, 2016; San Francisco, CA. doi:10.1145/2939672.2939785
- Shwartz-Ziv R, Armon A. Tabular data: Deep learning is not all you need. Inf Fusion. 2022;81:84-90. doi:10.1016/j.inffus.2021.11.011
- Grinsztajn L, Oyallon E, Varoquaux G. Why do tree-based models still outperform deep learning on typical tabular data? Presented at Proceedings of the 36th International Conference on Neural Information Processing Systems. NeurIPS Proceedings. November 28-December 9, 2022; New Orleans, LA. https://proceedings.neurips.cc/paper_files/paper/2022/file/0378c7692da36807bdec87ab043cdadc-Paper-Datasets_and_Benchmarks.pdf.
- Strobl C, Boulesteix AL, Zeileis A, Hothorn T. Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinform. 2007;8:25. doi:10.1186/1471-2105-8-25
- Stihec J. Choose your AI weapon: Deep learning or traditional machine learning. 2024. https://shelf.io/blog/choose-your-ai-weapon-deep-learning-or-traditional-machine-learning.
- Abgrall G, Holder AL, Chelly Dagdia Z, Zeitouni K, Monnet X. Should AI models be explainable to clinicians? Crit Care. 2024;28(1):301. doi:10.1186/s13054-024-05005-y
- Shivashankar K, Al Hajj GA, Martini A. Maintainability and scalability in machine learning: Challenges and solutions. ACM Comput Surv. 2025;57(12):Article 318. doi:10.1145/3736751
- Woo C-W, Chang LJ, Lindquist MA, Wager TD. Building better biomarkers: Brain models in translational neuroimaging. Nat Neurosci. 2017;20(3):365-377. doi:10.1038/nn.4478
- Rashid B, Calhoun V. Towards a brain-based predictome of mental illness. Hum Brain Mapp. 2020;41(12):3468-3535. doi:10.1002/hbm.25013
- Delile J, Mukherjee S, Mueller J, Khalil I, Zhukov L, Meier C. Foundation models in drug discovery: Phenomenal growth today, transformative potential tomorrow? Drug Discov Today. 2025;30(12):104518. doi:10.1016/j.drudis.2025.104518
- Duffy G, Clarke SL, Christensen M, et al. Confounders mediate AI prediction of demographics in medical imaging. NPJ Digit Med. 2022;5(1):188. doi:10.1038/s41746-022-00720-8
- Mayhugh RE, Moussa MN, Simpson SL, et al. Moderate-heavy alcohol consumption lifestyle in older adults is associated with altered central executive network community structure during cognitive task. PLOS One. 2016;11(8):e0160214. doi:10.1371/journal.pone.0160214
- Rosenblatt M, Tejavibulya L, Jiang R, Noble S, Scheinost D. Data leakage inflates prediction performance in connectome-based machine learning models. Nat Commun. 2024;15(1):1829. doi:10.1038/s41467-024-46150-w
- Hamdan S, Love BC, von Polier GG, et al. Confound-leakage: Confound removal in machine learning leads to leakage. GigaScience. 2023;12:giad071. doi:10.1093/gigascience/giad071
- Zhao Q, Adeli E, Pfefferbaum A, Sullivan EV, Pohl KM. Confounder-aware visualization of ConvNets. In: Machine Learning in Medical Imaging. Cham, Switzerland: Springer; 2019;11861:328-336. doi:10.1007/978-3-030-32692-0_38
- Eshaghzadeh Torbati M, Minhas DS, Ahmad G, et al. A multi-scanner neuroimaging data harmonization using RAVEL and ComBat. Neuroimage. 2021;245:118703. doi:10.1016/j.neuroimage.2021.118703
- Zhao H, des Combes RT, Zhang K, Gordon GJ. On learning invariant representations for domain adaptation. Presented at 36th International Conference on Machine Learning; June 9-15, 2019; Long Beach, CA.
- Zhao Q, Adeli E, Pohl KM. Training confounder-free deep learning models for medical applications. Nat Commun. 2020;11(1):6010. doi:10.1038/s41467-020-19784-9
- Chen RJ, Wang JJ, Williamson DFK, et al. Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat Biomed Eng. 2023;7(6):719-742. doi:10.1038/s41551-023-01056-8
- Castelnovo A, Crupi R, Greco G, Regoli D, Penco IG, Cosentini AC. A clarification of the nuances in the fairness metrics landscape. Sci Rep. 2022;12(1):4209. doi:10.1038/s41598-022-07939-1
- Chyzhyk D, Varoquaux G, Milham M, Thirion B. How to remove or control confounds in predictive models, with applications to brain biomarkers. GigaScience. 2022;11:giac014. doi:10.1093/gigascience/giac014
- Zhao Q, Nooner KB, Tapert SF, et al. The transition from homogeneous to heterogeneous machine learning in neuropsychiatric research. Biol Psychiatry Glob Open Sci. 2025;5(1):100397. doi:10.1016/j.bpsgos.2024.100397
- Peng W, Bosschieter T, Ouyang J, et al. Metadata-conditioned generative models to synthesize anatomically-plausible 3D brain MRIs. Med Image Anal. 2024;98:103325. doi:10.1016/j.media.2024.103325
- Hardoon DR, Szedmak S, Shawe-Taylor J. Canonical correlation analysis: An overview with application to learning methods. Neural Comput. 2004;16(12):2639-2664. doi:10.1162/0899766042321814
- Zhao Q, Milecki L, Kuceyeski A, et al. Socioemotional and executive control mismatch in adolescence heightens risks for initiating drinking. JAMA Netw Open. 2025;8(9):e2531378. doi:10.1001/jamanetworkopen.2025.31378
- Ullman JB, Bentler PM. Structural equation modeling. In: Weiner I, ed. Handbook of Psychology. 2nd ed. Hoboken, NJ: John Wiley & Sons, Inc.; 2012:661-690. doi:10.1002/9781118133880.hop202023
- AI for Mental Health. Stanford. http://ai4mh.stanford.edu.
- Han R, Acosta JN, Shakeri Z, Ioannidis JPA, Topol EJ, Rajpurkar P. Randomised controlled trials evaluating artificial intelligence in clinical practice: A scoping review. Lancet Digit Health. 2024;6(5):e367-e373. doi:10.1016/S2589-7500(24)00047-5
- Wu K, Wu E, Theodorou B, et al. Characterizing the clinical adoption of medical AI devices through U.S. insurance claims. NEJM AI. 2024;1(1):AIoa2300030. doi:10.1056/AIoa2300030
- Muralidharan V, Adewale BA, Huang CJ, et al. A scoping review of reporting gaps in FDA-approved AI medical devices. NPJ Digit Med. 2024;7(1):273. doi:10.1038/s41746-024-01270-x
- U.S. Food and Drug Administration. Artificial intelligence-enabled medical devices. 2025. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-enabled-medical-devices.
- Sempionatto JR, Brazaca LC, García-Carmona L, et al. Eyeglasses-based tear biosensing system: Non-invasive detection of alcohol, vitamins and glucose. Biosens Bioelectron. 2019;137:161-170. doi:10.1016/j.bios.2019.04.058
- Campbell AS, Kim J, Wang J. Wearable electrochemical alcohol biosensors. Curr Opin Electrochem. 2018;10:126-135. doi:10.1016/j.coelec.2018.05.014
- Gallitto G, Englert R, Kincses B, et al. External validation of machine learning models: Registered models and adaptive sample splitting. GigaScience. 2025;14:giaf036. doi:10.1093/gigascience/giaf036
- Wiggins WF, Tejani AS. On the opportunities and risks of foundation models for natural language processing in radiology. Radiol Artif Intell. 2022;4(4):e220119. doi:10.1148/ryai.220119
Appendices
Note. ABCD, Adolescent Brain Cognitive Development; AI, artificial intelligence; AUD, alcohol use disorder; EHR, electronic health record; GFCI, greedy fast causal inference; kNN, k-nearest neighbor; LASSO, least absolute shrinkage and selection operator; LSTM, long short-term memory; MLP, multilayer perceptron; MRI, magnetic resonance imaging; NMF, nonnegative matrix factorization; RSLVQ, robust soft learning vector quantization; RVR, relevance vector regression; sPLS-DA, sparse partial least squares discriminant analysis; SVM, support vector machine.