PsyResearch
ψ   Psychology Research on the Web   



Psychological Methods - Vol 29, Iss 5

Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues.
Copyright 2024 American Psychological Association
  • Evaluating classification performance: Receiver operating characteristic and expected utility.
    One primary advantage of receiver operating characteristic (ROC) analysis is considered to be its ability to quantify classification performance independently of factors such as prior probabilities and utilities of classification outcomes. This article argues the opposite. When evaluating classification performance, ROC analysis should consider prior probabilities and utilities. By developing expected utility lines (EU lines), this article shows the connection between a classifier’s ROC curve and the expected utility of classification. In particular, EU lines can be used to estimate expected utilities when classifiers operate at any ROC point for any given prior probabilities and utilities. EU lines are useful across all situations: whether one examines a single classifier or compares multiple classifiers, whether one compares classifiers’ potential to maximize expected utilities or their actual expected utilities, and whether the ROC curves are full or partial, continuous or discrete. The connection between ROC and expected utility analyses reveals the common objective underlying these two methods: to maximize the expected utility of classification. Notably, ROC analysis is useful in choosing an optimal classifier and its optimal operating point to maximize expected utility. Yet choosing a classifier and its operating point (i.e., changing conditional probabilities) is not the only way to increase expected utility. Inspired by the parameters involved in estimating expected utility, this article also discusses other approaches to increasing expected utility beyond ROC analysis. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
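    A small worked sketch of the core quantity may help: the R function below computes the expected utility of operating a classifier at a given ROC point for a given prevalence and set of outcome utilities. All numeric values are illustrative assumptions, not figures from the article.
      # Expected utility of operating at ROC point (TPR, FPR), given the prevalence
      # (prior probability of a positive case) and the utilities of the four outcomes.
      expected_utility <- function(tpr, fpr, prev, u_tp, u_fn, u_fp, u_tn) {
        prev       * (tpr * u_tp + (1 - tpr) * u_fn) +   # hits and misses among positives
        (1 - prev) * (fpr * u_fp + (1 - fpr) * u_tn)     # false alarms and correct rejections among negatives
      }
      # The same ROC point can yield very different expected utilities under different priors:
      expected_utility(tpr = .80, fpr = .20, prev = .10, u_tp = 1, u_fn = -10, u_fp = -1, u_tn = 0)
      expected_utility(tpr = .80, fpr = .20, prev = .50, u_tp = 1, u_fn = -10, u_fp = -1, u_tn = 0)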

  • Sample size planning for replication studies: The devil is in the design.
    Replication is central to scientific progress. Because of widely reported replication failures, replication has received increased attention in psychology, sociology, education, management, and related fields in recent years. Replication studies have generally been assessed dichotomously, designated either a “success” or “failure” based entirely on the outcome of a null hypothesis significance test (i.e., p < .05 or p > .05, respectively). However, alternative definitions of success depend on researchers’ goals for the replication. Previous work on alternative definitions of success has focused on the analysis phase of replication. However, the design of the replication is also important, as emphasized by the adage, “an ounce of prevention is better than a pound of cure.” One critical component of design often ignored or oversimplified in replication studies is sample size planning; indeed, the details here are crucial. Sample size planning for replication studies should correspond to the method by which success will be evaluated. Researchers have received little guidance, some of which is misguided, on sample size planning for replication goals other than the aforementioned dichotomous null hypothesis significance testing approach. In this article, we describe four different replication goals. Then, we formalize sample size planning methods for each of the four goals. This article aims to provide clarity on the procedures for sample size planning for each goal, with examples and syntax provided to show how each procedure can be used in practice. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
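    As a deliberately narrow illustration of one such goal, planning the replication sample size to detect the original standardized effect with a two-sample t test, a one-line base R calculation suffices; the effect size, power, and alpha below are assumed values, and the article's other replication goals call for different calculations.
      # Sample size per group to detect an assumed original effect of d = 0.40
      # with 90% power at alpha = .05 in a two-sample, two-sided t test (base R).
      power.t.test(delta = 0.40, sd = 1, sig.level = .05, power = .90,
                   type = "two.sample", alternative = "two.sided")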

  • Selecting scaling indicators in structural equation models (SEMs).
    It is common practice for psychologists to specify models with latent variables to represent concepts that are difficult to directly measure. Each latent variable needs a scale, and the most popular method of scaling as well as the default in most structural equation modeling (SEM) software uses a scaling or reference indicator. Much of the time, the choice of which indicator to use for this purpose receives little attention, and many analysts use the first indicator without considering whether there are better choices. When all indicators of the latent variable have essentially the same properties, then the choice matters less. But when this is not true, we could benefit from scaling indicator guidelines. Our article first demonstrates why latent variables need a scale. We then propose a set of criteria and accompanying diagnostic tools that can assist researchers in making informed decisions about scaling indicators. The criteria for a good scaling indicator include high face validity, high correlation with the latent variable, factor complexity of one, no correlated errors, no direct effects with other indicators, a minimal number of significant overidentification equation tests and modification indices, and invariance across groups and time. We demonstrate these criteria and diagnostics using two empirical examples and provide guidance on navigating conflicting results among criteria. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
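    To make the scaling-indicator choice concrete, here is a minimal lavaan sketch contrasting the software default (first indicator) with an explicitly chosen scaling indicator; it uses lavaan's bundled HolzingerSwineford1939 example data, not data from this article.
      library(lavaan)
      # Default: lavaan fixes the loading of the first listed indicator (x1) to 1.
      default_scale <- 'visual =~ x1 + x2 + x3'
      # Explicit choice: free x1 and put the latent variable on the scale of x3 instead.
      chosen_scale  <- 'visual =~ NA*x1 + x2 + 1*x3'
      fit_default <- cfa(default_scale, data = HolzingerSwineford1939)
      fit_chosen  <- cfa(chosen_scale,  data = HolzingerSwineford1939)
      # Overall fit is identical; what changes is the metric of the latent variable,
      # and hence the interpretation and standard errors of individual loadings.
      c(fitMeasures(fit_default, "chisq"), fitMeasures(fit_chosen, "chisq"))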

  • Extending the actor-partner interdependence model to accommodate multivariate dyadic data using latent variables.
    This study extends the traditional Actor-Partner Interdependence model (APIM; Kenny, 1996) to incorporate dyadic data with multiple indicators reflecting latent constructs. Although the APIM has been widely used to model interdependence in dyads, the method and its applications have largely been limited to single sets of manifest variables. This article presents three extensions of the APIM that can be applied to multivariate dyadic data: a manifest APIM linking multiple indicators as manifest variables, a composite-score APIM relating univariate sums of multiple variables, and a latent APIM connecting underlying constructs of multiple indicators. The properties of the three methods in analyzing data with various dyadic patterns are investigated through a simulation study. It is found that the latent APIM adequately estimates dyadic relationships and holds reasonable power when measurement reliability is not too low, whereas the manifest APIM yields poor power and high Type I error rates in general. The composite-score APIM, even though it is found to be a better alternative to the manifest APIM, fails to correctly reflect latent dyadic interdependence, raising inferential concerns. We illustrate the APIM extensions for multivariate dyadic data analysis with an example study on relationship commitment and happiness among married couples in Wisconsin. In cases where the measures are reliable reflections of psychological constructs, we suggest using the latent APIM for examining research hypotheses that discuss implications beyond observed variables. We conclude by stressing the importance of carefully examining measurement models when designing and conducting dyadic data analyses. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
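    A rough lavaan-style sketch of a latent APIM along the lines described above; the construct and indicator names are hypothetical placeholders (loosely echoing the commitment/happiness example), not the authors' specification.
      library(lavaan)
      latent_apim <- '
        # latent commitment and happiness for each partner (three indicators each; names hypothetical)
        commit_1 =~ c1_1 + c2_1 + c3_1
        commit_2 =~ c1_2 + c2_2 + c3_2
        happy_1  =~ h1_1 + h2_1 + h3_1
        happy_2  =~ h1_2 + h2_2 + h3_2
        # actor effects (a) and partner effects (p)
        happy_1 ~ a1*commit_1 + p21*commit_2
        happy_2 ~ a2*commit_2 + p12*commit_1
        # nonindependence: predictors and outcome residuals covary across partners
        commit_1 ~~ commit_2
        happy_1  ~~ happy_2
      '
      # fit <- sem(latent_apim, data = couples)   # 'couples': one row per dyad (hypothetical data frame)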

  • Estimating and investigating multiple constructs multiple indicators social relations models with and without roles within the traditional structural equation modeling framework: A tutorial.
    The present contribution provides a tutorial for the estimation of the social relations model (SRM) by means of structural equation modeling (SEM). In the overarching SEM framework, the SRM without roles (with interchangeable dyads) is derived as a more restrictive form of the SRM with roles (with noninterchangeable dyads). Starting with the simplest type of the SRM for one latent construct assessed by one manifest round-robin indicator, we show how the model can be extended to multiple constructs each measured by multiple indicators. We illustrate a multiple constructs multiple indicators SEM SRM both with and without roles with simulated data and explain the parameter interpretations. We present how testing the substantial model assumptions can be disentangled from testing the interchangeability of dyads. Additionally, we point out modeling strategies that address cases in which only some members of a group can be differentiated with regard to their roles (i.e., only some group members are noninterchangeable). In the online supplemental materials, we provide concrete examples of specific modeling problems and their implementation into statistical software (Mplus, lavaan, and OpenMx). Advantages, caveats, possible extensions, and limitations in comparison with alternative modeling options are discussed. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
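    For readers new to the SRM, a tiny simulation of the decomposition it assumes (not the authors' SEM specification) may help fix ideas: each round-robin rating is a group mean plus an actor effect, a partner effect, and a relationship residual. All parameter values below are arbitrary assumptions.
      # Simulate one round-robin group under the SRM decomposition (interchangeable members).
      set.seed(1)
      n <- 6                                    # group size (arbitrary)
      actor   <- rnorm(n, 0, 1.0)               # actor (perceiver) effects
      partner <- rnorm(n, 0, 0.8)               # partner (target) effects
      y <- matrix(NA, n, n)                     # y[i, j]: rating of target j by perceiver i
      for (i in 1:n) for (j in 1:n) if (i != j)
        y[i, j] <- 3 + actor[i] + partner[j] + rnorm(1, 0, 0.5)   # plus relationship residual
      round(y, 2)                               # diagonal stays NA (no self-ratings)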

  • Data-driven covariate selection for confounding adjustment by focusing on the stability of the effect estimator.
    Valid inference of cause-and-effect relations in observational studies necessitates adjusting for common causes of the focal predictor (i.e., treatment) and the outcome. When such common causes, henceforth termed confounders, remain unadjusted for, they generate spurious correlations that lead to biased causal effect estimates. But routine adjustment for all available covariates, when only a subset are truly confounders, is known to yield potentially inefficient and unstable estimators. In this article, we introduce a data-driven confounder selection strategy that focuses on stable estimation of the treatment effect. The approach exploits the causal knowledge that after adjusting for confounders to eliminate all confounding biases, adding any remaining non-confounding covariates associated with only treatment or outcome, but not both, should not systematically change the effect estimator. The strategy proceeds in two steps. First, we prioritize covariates for adjustment by probing how strongly each covariate is associated with treatment and outcome. Next, we gauge the stability of the effect estimator by evaluating its trajectory adjusting for different covariate subsets. The smallest subset that yields a stable effect estimate is then selected. Thus, the strategy offers direct insight into the (in)sensitivity of the effect estimator to the chosen covariates for adjustment. The ability to correctly select confounders and yield valid causal inferences following data-driven covariate selection is evaluated empirically using extensive simulation studies. Furthermore, we compare the introduced method empirically with routine variable selection methods. Finally, we demonstrate the procedure using two publicly available real-world datasets. A step-by-step practical guide with user-friendly R functions is included. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
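    The two-step logic lends itself to a compact sketch. The base R code below is a rough rendering of the general idea (a crude priority score, then a coefficient path across growing adjustment sets); it is not the user-friendly R functions that accompany the article, and it assumes numeric treatment, outcome, and covariates.
      # Rank covariates by a crude treatment-and-outcome association score, then track
      # the treatment coefficient as covariates are added one at a time in that order.
      stability_path <- function(data, treat, outcome, covars) {
        score <- sapply(covars, function(v)
          abs(cor(data[[v]], data[[treat]])) * abs(cor(data[[v]], data[[outcome]])))
        ordered <- covars[order(score, decreasing = TRUE)]
        sapply(seq_along(ordered), function(k) {
          f <- reformulate(c(treat, ordered[1:k]), response = outcome)
          coef(lm(f, data = data))[[treat]]     # effect estimate after adjusting for the top k covariates
        })
      }
      # A plateau in the returned path suggests the smallest adjustment set that stabilizes the estimate.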

  • Updated guidelines on selecting an intraclass correlation coefficient for interrater reliability, with applications to incomplete observational designs.
    Several intraclass correlation coefficients (ICCs) are available to assess the interrater reliability (IRR) of observational measurements. Selecting an ICC is complicated, and existing guidelines have three major limitations. First, they do not discuss incomplete designs, in which raters partially vary across subjects. Second, they provide no coherent perspective on the error variance in an ICC, clouding the choice between the available coefficients. Third, the distinction between fixed or random raters is often misunderstood. Based on generalizability theory (GT), we provide updated guidelines on selecting an ICC for IRR, which are applicable to both complete and incomplete observational designs. We challenge conventional wisdom about ICCs for IRR by claiming that raters should seldom (if ever) be considered fixed. Also, we clarify how to interpret ICCs in the case of unbalanced and incomplete designs. We explain four choices a researcher needs to make when selecting an ICC for IRR, and guide researchers through these choices by means of a flowchart, which we apply to three empirical examples from clinical and developmental domains. In the Discussion, we provide guidance in reporting, interpreting, and estimating ICCs, and propose future directions for research into the ICCs for IRR. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
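    As a small illustration of what an ICC computes in the simplest case (a complete design, one-way model), the base R sketch below derives ICC(1) from ANOVA mean squares; the article's guidelines cover many more design choices than this, and the toy data are invented.
      # One-way ICC(1) for a complete subjects-by-raters matrix, from ANOVA mean squares.
      icc1 <- function(ratings) {               # ratings: subjects in rows, raters in columns
        k    <- ncol(ratings)
        long <- data.frame(score   = as.vector(ratings),
                           subject = factor(rep(seq_len(nrow(ratings)), times = k)))
        ms <- summary(aov(score ~ subject, data = long))[[1]][["Mean Sq"]]
        (ms[1] - ms[2]) / (ms[1] + (k - 1) * ms[2])   # (MS_between - MS_within) / (MS_between + (k - 1) * MS_within)
      }
      icc1(matrix(c(4, 5, 3, 4, 4, 5, 2, 4), ncol = 2))   # toy data: 4 subjects rated by 2 raters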

  • A tutorial on ordinary differential equations in behavioral science: What does physics teach us?
    The present tutorial draws on concepts from physics and mathematics to help behavioral scientists use differential equations in their studies. It focuses on the first-order and the second-order (damped oscillator) differential equation. Simple examples detail the meaning of the coefficients, the conditions under which these differential equations apply, the underlying hypotheses, and their consequences for researchers who wish to use them. More complex psychological examples demonstrate the importance of interpreting the parameters. Particular attention is paid to how potential external perturbations should be considered. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
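    To make the damped-oscillator case tangible, here is a minimal base R integration of x'' = -2*zeta*omega*x' - omega^2*x using a plain Euler scheme; the frequency, damping ratio, and initial conditions are assumed values, and this is illustrative code rather than the tutorial's own.
      # Damped oscillator integrated with a simple explicit Euler scheme.
      omega <- 2; zeta <- 0.15                  # natural frequency and damping ratio (assumed)
      dt <- 0.01; t <- seq(0, 20, by = dt)
      x <- v <- numeric(length(t)); x[1] <- 1; v[1] <- 0   # initial position and velocity
      for (i in seq_along(t)[-1]) {
        a    <- -2 * zeta * omega * v[i - 1] - omega^2 * x[i - 1]   # acceleration from the ODE
        v[i] <- v[i - 1] + a * dt
        x[i] <- x[i - 1] + v[i - 1] * dt
      }
      plot(t, x, type = "l", xlab = "time", ylab = "x(t)")          # decaying oscillation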

  • How survey scoring decisions can influence your study’s results: A trip through the IRT looking glass.
    Though much effort is often put into designing psychological studies, the measurement model and scoring approach employed are often an afterthought, especially when short survey scales are used (Flake & Fried, 2020). One possible reason that measurement gets downplayed is that there is generally little understanding of how calibration/scoring approaches could impact common estimands of interest, including treatment effect estimates, beyond random noise due to measurement error. Another possible reason is that the process of scoring is complicated, involving selecting a suitable measurement model, calibrating its parameters, and then deciding how to generate a score, all steps that occur before the score is even used to examine the desired psychological phenomenon. In this study, we provide three motivating examples where surveys are used to understand individuals’ underlying social-emotional and/or personality constructs to demonstrate the potential consequences of measurement/scoring decisions. These examples also allow us to walk through the different measurement decision stages and, hopefully, begin to demystify them. As we show in our analyses, the decisions researchers make about how to calibrate and score a survey have consequences that are often overlooked, with likely implications both for conclusions drawn from individual psychological studies and replications of studies. (PsycInfo Database Record (c) 2024 APA, all rights reserved)
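    A compact sketch of the calibration-then-scoring pipeline the abstract refers to, assuming the mirt package and a hypothetical persons-by-items matrix of 0/1 responses named resp; none of this is the authors' code, and the comparison with a sum score is only one of many possible scoring contrasts.
      library(mirt)
      sum_score <- rowSums(resp)                            # naive scoring: unweighted sum of item responses
      fit       <- mirt(resp, model = 1, itemtype = "2PL")  # calibrate a unidimensional 2PL IRT model
      irt_score <- fscores(fit, method = "EAP")             # expected a posteriori trait estimates
      cor(sum_score, irt_score)                             # the two scorings need not order people identically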


