Psychological Assessment
Psychological Assessment publishes mainly empirical articles concerning clinical assessment. Papers that fall within the domain of the journal include research on the development, validation, application, and evaluation of psychological assessment instruments. Diverse modalities (e.g., cognitive, physiologic, and motoric) and methods of assessment (e.g., questionnaires, interviews, natural environment and analog environment observation, self-monitoring, participant observation, physiological measurement, instrument-assisted and computer-assisted assessment) are within the domain of the journal, especially as they relate to clinical assessment. Also included are topics on clinical judgment and decision making (including diagnostic assessment), methods of measurement of treatment process and outcome, and dimensions of individual differences (e.g., race, ethnicity, age, gender, sexual orientation, economic status) as they relate to clinical assessment.
Copyright 2017 American Psychological Association
  • The importance and acceptability of general and maladaptive personality trait computerized assessment feedback.
    Personality traits are a useful component of clinical assessment, and have been associated with positive and negative life outcomes. Assessment of both general and maladaptive personality traits may be beneficial practice, as they may complement each other to comprehensively and accurately describe one’s personality. Notably, personal preferences regarding assessment feedback have not been studied. The current study examined the acceptability of personality assessment feedback from the perspective of the examinee. Treatment-seeking participants from a university (n = 72) and MTurk (n = 101) completed measures of the 5-factor model and the DSM–5 alternative model of personality disorder, and were then provided feedback on their general and maladaptive personality traits. Individuals then provided feedback on which aspects they found most useful. Results demonstrated strong participant agreement that the personality trait feedback was accurate and relevant. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • The Inventory of Psychotic-Like Anomalous Self-Experiences (IPASE): Development and validation.
    Anomalous self-experiences (ASEs) are among the first symptoms to appear in the prodrome, predict the development of psychosis over and above clinical symptoms, and are common in people with schizophrenia. Although there are well-validated phenomenological interviews for assessing ASEs, there are no self-report measures. The current research describes 4 studies designed to develop and validate a new scale to assess ASEs: the Inventory of Psychotic-Like Anomalous Self-Experiences (IPASE). In Study 1, an overinclusive item pool was generated based on phenomenological descriptions of ASEs, and items were kept or discarded based on factor loadings in an exploratory factor analysis. Five factors were extracted including disturbances in Cognition, Consciousness, Self-Awareness and Presence, Somatization, and Transitivism/Demarcation. The 5-factor structure was confirmed in Study 2, and the scale showed measurement invariance between sexes. IPASE scores were correlated with self-report and task measures of self-processing including self-concept clarity, self-consciousness, and self-esteem as well as measures of psychotic-like experiences. In Study 3, people with positive schizotypy had higher IPASE scores than a negative schizotypy and comparison group. In Study 4, people with schizophrenia had higher IPASE scores than healthy controls. Overall, the IPASE displayed strong psychometric qualities and is a brief alternative to resource-intensive phenomenological interviews in clinical, at-risk, and general population samples. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • Does staff see what experts see? Accuracy of front line staff in scoring juveniles’ risk factors.
    Although increasingly complex risk assessment tools are being marketed, little is known about “real world” practitioners’ capacity to score them accurately. In this study, we assess the extent to which 78 staff members’ scoring of juveniles on the California-Youth Assessment and Screening Instrument (CA-YASI; Orbis Partners, Inc., 2008) agrees with experts’ criterion scores for those cases. There are 3 key findings. First, at the total score level, practitioners manifest limited agreement (M ICC = .63) with the criterion: Only 59.0% of staff score the tool with “good” accuracy. Second, at the subscale level, practitioners’ accuracy is particularly weak for treatment-relevant factors that require substantial judgment, such as procriminal attitudes (M ICC = .52), but good for such straightforward factors as legal history (M ICC = .72). Third, practitioners’ accuracy depends on their experience: relatively new staff’s scores are more consistent with the criterion than those of staff with more years of experience. Results suggest that attention to parsimony (for tools) and meaningful training and monitoring (for staff) are necessary to realize the promise of risk assessment for informing risk reduction. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • A daily diary study on adolescent emotional experiences: Measurement invariance and developmental trajectories.
    Adolescence is an important time for emotional development. Recently, daily diary methods have been increasingly employed in research on emotional development and are used to explore the development of and sex differences in emotions during adolescence. However, before drawing conclusions about sex differences and developmental trends, one needs to ensure that the same construct is measured across sex and time. The present study tested measurement invariance of daily emotion assessments across sex, short-term periods (days within weeks), and long-term periods (days across years) in a sample of 394 adolescents (55.6% male) who were followed from ages 13 to 18. Moreover, the study examined the developmental trajectories of adolescent emotional experiences. Adolescents rated their daily emotions (happiness, anger, sadness, anxiety) during each day of a normal school week (Monday to Friday) for 3 weeks per year for 5 years (i.e., 15 weeks × 5 days = 75 assessments in total). Measurement invariance analyses suggest that the measurement of adolescent daily mood was invariant between boys and girls and across shorter and longer time intervals. Moreover, latent growth curve analyses showed that happiness decreased from early to middle adolescence, whereas anger, sadness, and anxiety increased. Anger returned to baseline toward late adolescence. In contrast, the decrease of happiness and the increase of anxiety leveled off without reversing, whereas sadness continued to increase. The discussion highlights the implications of measurement invariance in research on individual and developmental differences and discusses the findings in light of normative emotional development. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • Ratings of Everyday Executive Functioning (REEF): A parent-report measure of preschoolers’ executive functioning skills.
    Executive functioning (EF) facilitates the development of academic, cognitive, and social-emotional skills and deficits in EF are implicated in a broad range of child psychopathologies. Although EF has clear implications for early development, the few questionnaires that assess EF in preschoolers tend to ask parents for global judgments of executive dysfunction and thus do not cover the full range of EF within the preschool age group. Here we present a new measure of preschoolers’ EF—the Ratings of Everyday Executive Functioning (REEF)—that capitalizes on parents’ observations of their preschoolers’ (i.e., 3- to 5-year-olds) behavior in specific, everyday contexts. Over 4 studies, items comprising the REEF were refined and the measure’s reliability and validity were evaluated. Factor analysis of the REEF yielded 1 factor, with items showing strong internal reliability. More important, children’s scores on the REEF related to both laboratory measures of EF and another parent-report EF questionnaire. Moreover, reflecting divergent validity, the REEF was more strongly related to measures of EF as opposed to measures of affective styles. The REEF also captured differences in children’s executive skills across the preschool years, and norms at 6-month intervals are reported. In summary, the REEF is a new parent-report measure that provides researchers with an efficient, valid, and reliable means of assessing preschoolers’ executive functioning. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • Preliminary validation of the Rating of Outcome Scale and equivalence of ultra-brief measures of well-being.
    Three brief psychotherapy outcome measures were assessed for equivalence. The Rating of Outcome Scale (ROS), a 3-item patient-reported outcome measure, was evaluated for interitem consistency, test–retest reliability, discriminant validity, repeatability, sensitivity to change, and agreement with the Outcome Rating Scale (ORS) and Outcome Questionnaire (OQ) in 1 clinical sample and 3 community samples. Clinical cutoffs, reliable change indices, and Bland-Altman repeatability coefficients were calculated. Week-to-week change on each instrument was compared via repeated-measures-corrected effect size. Community-normed T scores and Bland-Altman plots were generated to aid comparisons between instruments. The ROS showed good psychometric properties, sensitivity to change in treatment, and discrimination between outpatients and nonpatients. Agreement between the ROS and ORS was good, but neither that agreement nor the agreement between the ultrabrief instruments and the OQ was as good as correlations might suggest. The ROS showed incremental advantages over the ORS: improvements in concordance with the OQ, better absolute reliability, and less oversensitivity to change. The ROS had high patient acceptance and usability, and scores showed good reliability, cross-instrument validity, and responsiveness to change for the routine monitoring of clinical outcomes. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • Measurement invariance across administration mode: Examining the Posttraumatic Stress Disorder (PTSD) Checklist.
    The Posttraumatic Stress Disorder (PTSD) Checklist (PCL) is commonly used to screen for PTSD in clinical and research contexts. While the PCL is utilized within numerous settings and populations, research has not yet established the extent to which individuals respond similarly across different modes of administration. The use of both telephone and web survey administration modes has numerous potential benefits, including data quality improvement, but may introduce an additional source of measurement error. The current study examined the psychometric properties, including factor structure and measurement invariance, of the PCL across telephone and web administration modes among 455 wounded, ill, or injured airmen who were medically retired or undergoing evaluation for disability caused by injuries and illnesses of a physical or psychological nature. Findings suggest the properties of the PCL were invariant with regard to the mode of administration, such that the overall scale structure and size of the loadings were similar across groups. Corrections were applied to the computation of probable PTSD diagnosis to account for partial scalar invariance. The lack of complete invariance did not affect probable PTSD diagnosis. Finally, differences in latent means across the telephone and web group were nonsignificant and modest in magnitude. These results indicate that although the PCL only achieved partial scalar invariance across administration modes, the practical impact of this difference on rates of probable PTSD is negligible. The practical benefits of administering the PCL over the telephone and on the web do not appear to be outweighed by the potential cost of additional measurement error. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • The cultural fairness of the 12-item General Health Questionnaire among diverse adolescents.
    The 12-item General Health Questionnaire (GHQ-12) was used in the Longitudinal Study of Young People in England (LSYPE; N = 15,770) to collect measures on adolescent mental health. Given the debate in current literature regarding the dimensionality of the GHQ-12, this study examined the cultural sensitivity of the instrument at the item level for each of the 7 major ethnic groups within the database. This study used a hybrid approach of ordinal logistic regression and item response theory (IRT) to examine the presence of differential item functioning (DIF) on the questionnaire. Results demonstrated that uniform, nonuniform, and overall DIF were present on items between White and Asian adolescents (7 items), White and Black Caribbean adolescents (1 item), and White and Black African adolescents (7 items); however, all McFadden’s pseudo R² effect size estimates indicated that the DIF was negligible. Overall, there were cumulative small scale-level effects for the Mixed/Biracial, Asian, and Black African groups, but in each case the bias was only marginal. Findings demonstrate that the GHQ-12 can be considered culturally sensitive for adolescents from diverse ethnic groups in England, but follow-up studies are necessary. Implications for future education and health policies as well as the use of IRT-based approaches for psychological instruments are discussed. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • Test order in teacher-rated behavior assessments: Is counterbalancing necessary?
    Counterbalancing treatment order in experimental research design is well established as an option to reduce threats to internal validity, but in educational and psychological research, the effect of varying the order in which multiple tests are presented to a single rater has not been examined, and counterbalancing is rarely adhered to in practice. The current study examines the effect of test order on measures of student behavior by teachers as raters, utilizing data from a behavior measure validation study. Using multilevel modeling to control for students nested within teachers, the effect of rating an earlier measure on the intercept or slope of a later behavior assessment was statistically significant in 22% of predictor main effects for the spring test period. Test order effects had potential for high-stakes consequences, with differences large enough to change risk classification. Results suggest that researchers and practitioners in classroom settings using multiple measures should evaluate the potential impact of test order. Where possible, they should counterbalance when the risk of an order effect exists and report justification for the decision to not counterbalance. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • Factorial structure and long-term stability of the Autonomy Preference Index.
    The Autonomy Preference Index (API) was designed to measure patient preference for 2 dimensions of autonomy: their desire to take part in making medical decisions (decision making [DM]) and their desire to be informed about their illness and the treatment (information seeking [IS]). The DM dimension is measured by 6 general items together with 9 items related to 3 clinical vignettes (3 × 3 items). The IS dimension is measured by 8 items. While the API is widely used, a review of the literature has identified several inconsistencies in the way it is scored. The first aim of this study was to determine the best scoring structure of the API on the basis of validity and reliability evidence. The second aim was to investigate the long-term stability of API scores. Two hundred eighty-five patients with a diagnosis of psychosis were assessed as they were about to be discharged from involuntary psychiatric hospitalization, and they were reassessed after 6 and 12 months. Confirmatory factor analysis (CFA) revealed that a 3-factor solution was most adequate and that 2 distinct DM subscales should be preferred to 1 total DM score. While internal consistency estimates of the 3 subscales were good, the long-term stability of API scores was only modest. Multigroup CFA revealed scalar invariance, indicating that API scores kept the same meaning longitudinally. In conclusion, a 3-factor structure seemed to be most adequate for the API scale. Long-term stability estimates suggested that clinicians should regularly assess patients’ preferences for autonomy because API scores fluctuate over time. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
  • Evaluating empathy in Colombian ex-combatants: Examination of the internal structure of the Interpersonal Reactivity Index (IRI) in Spanish.
    The Republic of Colombia has a long-standing history of internal armed conflict, further complicated by the ideological assumptions underlying the war. In recent years, its government designed the Program for Reincorporation to Civilian Life (Programa para la Reincorporación a la Vida Civil, PRVC), aimed at the demobilization of thousands of insurgents who were involved in guerrilla and paramilitary forces. One PRVC goal involves the psychological characterization of its reincorporated members, with the aim of informing the design of effective and efficacious interventions to improve their adjustment. We are interested in the examination of empathy in this population. Empathy refers to the ability to predict, understand, and experience others’ feelings. Empathy appears to have an effect on level of aggressive behavior. The Interpersonal Reactivity Index (IRI; Davis, 1980, 1983) is a well-established 28-item self-report tool for the assessment of empathy, including 4 scales: Perspective Taking, Fantasy, Empathic Concern, and Personal Distress. Versions in Spanish were validated in Spain and Chile, but no norms for Colombians exist. We examined the factorial structure of the IRI in a sample of 548 (83.4% male) members of the PRVC. Ten items with low factor loadings were eliminated following a series of confirmatory factor analyses (CFA). The final 4-factor model (Model 2) reached an acceptable fit (e.g., CFI = .898). A second-order CFA demonstrated that empathic concern correlated too highly with a common “empathy” latent factor. With these results at hand, our 18-item IRI version in Spanish achieved a factorial structure comparable to that previously validated for Spanish speakers from other countries. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
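
Note on a statistic named above: the Rating of Outcome Scale abstract mentions reliable change indices without giving a formula. The sketch below assumes the widely used Jacobson-Truax formulation, which the abstract does not confirm was the one applied. The function name and all input values are hypothetical and are not taken from any study in this listing.

```python
import math


def reliable_change_index(pre, post, sd, reliability):
    """Jacobson-Truax reliable change index (assumed formulation).

    pre, post   -- a person's scores at two time points
    sd          -- standard deviation of the measure in a reference sample
    reliability -- reliability coefficient of the measure (e.g., test-retest)
    """
    # Standard error of measurement for a single score.
    sem = sd * math.sqrt(1.0 - reliability)
    # Standard error of the difference between two scores.
    s_diff = math.sqrt(2.0) * sem
    return (post - pre) / s_diff


# Hypothetical illustration only.
rci = reliable_change_index(pre=20.0, post=28.0, sd=6.0, reliability=0.84)
print(round(rci, 2), abs(rci) > 1.96)
```

Under this formulation, an absolute value above 1.96 is conventionally read as change unlikely to be due to measurement error alone (two-tailed, 5% level).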
