PsyResearch
ψ   Psychology Research on the Web   



Psychological Assessment - Vol 26, Iss 3

Psychological Assessment publishes mainly empirical articles concerning clinical assessment. Papers that fall within the domain of the journal include research on the development, validation, application, and evaluation of psychological assessment instruments. Diverse modalities (e.g., cognitive, physiologic, and motoric) and methods of assessment (e.g., questionnaires, interviews, natural environment and analog environment observation, self-monitoring, participant observation, physiological measurement, instrument-assisted and computer-assisted assessment) are within the domain of the journal, especially as they relate to clinical assessment. Also included are topics on clinical judgment and decision making (including diagnostic assessment), methods of measurement of treatment process and outcome, and dimensions of individual differences (e.g., race, ethnicity, age, gender, sexual orientation, economic status) as they relate to clinical assessment.
Copyright 2014 American Psychological Association
  • A comparison of the predictive properties of nine sex offender risk assessment instruments.
    Sex offender treatment is most effective when tailored to risk-need-responsivity principles, which dictate that treatment levels should match risk levels as assessed by structured risk assessment instruments. The predictive properties, missing values, and interrater agreement of the scores of 9 structured risk assessment instruments were compared in a national sample of 397 Dutch convicted sex offenders. The instruments included the Rapid Risk Assessment for Sexual Offense Recidivism, Static-99, Static-99R, a slightly modified version of Static-2002 and Static-2002R, Structured Anchored Clinical Judgments Minimum, Risk Matrix 2000, Sexual Violence Risk 20, and a modified version of the Sex Offender Risk Appraisal Guide; sexual and violent (including sexual) recidivism was assessed over 5- and 10-year fixed and variable follow-up periods. In general, the instrument scores showed moderate to large predictive accuracy for the occurrence of reoffending and the number of reoffenses in this sample. Predictive accuracy regarding latency showed more variability across instrument scores. Static-2002R and Static-99R scores showed a slight but consistent advantage in predictive properties over the other instrument scores across outcome measures and follow-up periods in this sample. The results of Sexual Violence Risk 20 and Rapid Risk Assessment for Sexual Offense Recidivism scores were the least positive. A positive association between predictive accuracy and interrater agreement at the item level was found for both sexual recidivism (r = .28, p = .01) and violent (including sexual) recidivism (r = .45, p <.001); no significant association was found between predictive accuracy and missing values at the item level. Results underscore the feasibility and utility of these instruments for informing treatment selection according to the risk-need-responsivity principles. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • The heritability of psychopathic personality in 14- to 15-year-old twins: A multirater, multimeasure approach.
    Until now, no study has examined the genetic and environmental influences on psychopathic personality across different raters and method of assessment. Participants were part of a community sample of male and female twins born between 1990 and 1995. The Child Psychopathy Scale and the Antisocial Process Screening Device were administered to the twins and their parents when the twins were 14–15 years old. The Psychopathy Checklist: Youth Version (PCL:YV) was administered and scored by trained testers. Results showed that a 1-factor common pathway model was the best fit for the data. Genetic influences explained 69% of the variance in the latent psychopathic personality factor, while nonshared environmental influences explained 31%. Measurement-specific genetic effects accounted for between 9% and 35% of the total variance in each of the measures, except for PCL:YV, where all genetic influences were in common with the other measures. Measure-specific nonshared environmental influences were found for all measures, explaining between 17% and 56% of the variance. These findings provide further evidence of the heritability in psychopathic personality among adolescents, although these effects vary across the ways in which these traits are measured, in terms of both informant and instrument used. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Childhood Trauma Questionnaire: Factor structure, measurement invariance, and validity across emotional disorders.
    To study the psychometric properties of the Childhood Trauma Questionnaire–Short Form (CTQ–SF), we determined its dimensional structure, measurement invariance across presence of emotional disorders, the association of the CTQ–SF with an analogous interview-based measure (CTI) across presence of emotional disorders, and the incremental value of combining both instruments in determining associations with severity of psychopathology. The sample included 2,308 adults, ages 18–65, consisting of unaffected controls and chronically affected and intermittently affected persons with an emotional disorder at Time 0 (T0) or 4 years later at T4. Childhood maltreatment was measured at T0 with an interview and at T4 with the CTQ–SF. At each wave, patients were assessed for Diagnostic and Statistical Manual of Mental Disorders (4th ed., or DSM–IV; American Psychiatric Association, 1994)-based emotional disorders (Composite Interview Diagnostic Instrument) and symptom severity (Inventory of Depressive Symptomatology, Beck Anxiety Inventory, Fear Questionnaire). Besides the correlated original 5-factor solution, an indirect higher order and direct bifactorial model also showed a good fit to the data. The 5-factor solution proved to be invariant across disordered–control comparison groups. The CTQ–SF was moderately associated with the CTI, and this association was not attenuated by disorder status. The CTQ–SF was more sensitive in detecting emotional abuse and emotional neglect than the CTI. Combined CTQ–SF/CTI factor scores showed a higher association with severity of psychopathology. We conclude that although the original 5-factor model fits the data well, results of the hierarchical analyses suggest that the total CTQ scale adequately captures a broad dimension of childhood maltreatment. A 2-step measurement approach in the assessment of childhood trauma is recommended in which screening by a self-report questionnaire is followed by a (semi-)structured diagnostic interview. 
    (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Validating new summary indices for the Childhood Trauma Interview: Associations with first onsets of major depressive disorder and anxiety disorders.
    Childhood and adolescent adversity is of great interest in relation to risk for psychopathology, and interview measures of adversity are thought to be more reliable and valid than their questionnaire counterparts. One interview measure, the Childhood Trauma Interview (CTI; Fink et al., 1995), has been positively evaluated relative to similar measures, but there are some psychometric limitations to an existing scoring approach that limit the full potential of this measure. We propose several new summary indices for the CTI that permit examination of different types of adversity and different developmental periods. Our approach creates several summary indices: one sums the severity scores of adversities endorsed; another utilizes the number of minor and major (moderate to severe) adversities. The new indices were examined in association with first onsets of major depressive disorder (MDD) and anxiety disorders across a 5-year period using annual clinical diagnostic interviews (Structured Clinical Interview for DSM–IV–TR). Summary scores derived with the previously used approach were also examined for comparison. Data on 332 participants came from the Youth Emotion Project, a longitudinal study of risk for emotional disorders. Results support the predictive validity of the proposed summary scoring methods and indicate that several forms of major (but typically not minor) adversity are significantly associated with first onsets of MDD and anxiety disorders. Finally, multivariate regression models show that, in many instances, the new indices contributed significant unique variance predicting disorder onsets over and above the previously used summary indices. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Examining the latent structure of anxiety sensitivity in adolescents using factor mixture modeling.
    Anxiety sensitivity has been implicated as an important risk factor, generalizable to most anxiety disorders. In adults, factor mixture modeling has been used to demonstrate that anxiety sensitivity is best conceptualized as categorical between individuals. That is, whereas most adults appear to possess normative levels of anxiety sensitivity, a small subset of the population appears to possess abnormally high levels of anxiety sensitivity. Further, those in the high anxiety sensitivity group are at increased risk of having high levels of anxiety and of having an anxiety disorder. This study was designed to determine whether these findings extend to adolescents. Factor mixture modeling was used to examine the best fitting model of anxiety sensitivity in a sample of 277 adolescents (M age = 11.0 years, SD = 0.81). Consistent with research in adults, the best fitting model consisted of 2 classes, 1 containing adolescents with high levels of anxiety sensitivity (n = 25) and another containing adolescents with normative levels of anxiety sensitivity (n = 252). Examination of anxiety sensitivity subscales revealed that the social concerns subscale was not important for classification of individuals. Convergent and discriminant validity of anxiety sensitivity classes were found in that membership in the high anxiety sensitivity class was associated with higher mean levels of anxiety symptoms, controlling for depression and externalizing problems, and was not associated with higher mean levels of depression or externalizing symptoms controlling for anxiety problems. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Validity of the Short Mood and Feelings Questionnaire in late adolescence.
    Studies examining the validity of the Short Mood and Feelings Questionnaire (SMFQ; Angold, Costello, & Messer, 1995) have largely focused on selected or clinical samples in childhood (6–11 years) or early to midadolescence (12–16 years) and have not investigated misclassifications or how the SMFQ relates to adult depression measures. Using data from the Avon Longitudinal Study of Parents and Children (2012), we assessed the validity of the SMFQ in relation to an adult depression measure administered in late adolescence (age 17–18 years). We also investigated sociodemographic and clinical variables previously shown to affect misclassification on short self-administered questionnaires compared with more detailed assessments of depression. We assessed construct validity using factor and item response theory analysis. To investigate content validity, we tabulated SMFQ items against the International Classification of Diseases (ICD–10; World Health Organization, 1992) and Diagnostic and Statistical Manual of Mental Disorders (4th ed.; American Psychiatric Association, 1994) depressive symptoms. Criterion validity was examined using receiver operating characteristic (ROC) analysis. Potential misclassifications were investigated using logistic regression and multiple-indicator multiple-cause modeling. Factor analysis produced high loadings, low residual variances, and appropriate model fit indices. Seven of the 10 ICD–10 depressive symptoms were covered by at least 1 SMFQ item. The discriminatory ability of the SMFQ for meeting ICD–10 diagnostic criteria for depression was very high (area under ROC curve = 0.90). Individuals with anxiety symptoms, females, and less well-educated individuals overreported depressive symptoms on the SMFQ in relation to ICD–10 depression. We conclude the SMFQ is a valid instrument capturing a latent trait of depression in a community-based sample in late adolescence. 
    Further work should be carried out to increase understanding of variables associated with misclassification. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
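The area under the ROC curve reported for the SMFQ (0.90) can be read as the probability that a randomly chosen person meeting diagnostic criteria scores higher than a randomly chosen person who does not. The sketch below illustrates that interpretation only; the function name `roc_auc` and the score lists are hypothetical and are not data or code from the study.

```python
def roc_auc(scores_cases, scores_non_cases):
    """AUC as the share of (case, non-case) pairs where the case scores higher.

    Ties count as half a win. Assumes both lists are non-empty.
    """
    wins = 0.0
    for case_score in scores_cases:
        for non_case_score in scores_non_cases:
            if case_score > non_case_score:
                wins += 1.0
            elif case_score == non_case_score:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_non_cases))


# Hypothetical questionnaire totals, invented for illustration only.
cases = [14, 9, 17, 11]
non_cases = [3, 9, 5, 2, 7]
auc = roc_auc(cases, non_cases)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in sample size, so it is meant for reading the statistic, not for large datasets.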

  • Rural parents’ perceived stigma of seeking mental health services for their children: Development and evaluation of a new instrument.
    The purpose of our research was to examine the validity of score interpretations of an instrument developed to measure parents’ perceptions of stigma about seeking mental health services for their children. The validity of the score interpretations of the instrument was tested in 2 studies. Study 1 employed confirmatory factor analysis (CFA), using a split half approach, and construct and criterion validity on data from the entire sample of parents in rural Appalachia whose children were experiencing psychosocial concerns (N = 347), while Study 2 employed CFA, construct and criterion validity, and predictive validity of the scores on data from a general sample of parents in rural Appalachia (N = 184). Results of exploratory and confirmatory factor analyses revealed support for a 2-factor model of parents’ perceived stigma, which represented both self and public forms of stigma associated with seeking mental health services for their children, and correlated with existing measures of stigma and other psychosocial variables. Further, the new self and public stigma scale significantly predicted parents’ willingness to seek services for children. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Developing a fluid intelligence scale through a combination of Rasch modeling and cognitive psychology.
    Ability testing has been criticized because understanding of the construct being assessed is incomplete and because the testing has not yet been satisfactorily improved in accordance with new knowledge from cognitive psychology. This article contributes to the solution of this problem through the application of item response theory and Susan Embretson’s cognitive design system for test development in the development of a fluid intelligence scale. This study is based on findings from cognitive psychology; instead of focusing on the development of a test, it focuses on the definition of a variable for the creation of a criterion-referenced measure for fluid intelligence. A geometric matrix item bank with 26 items was analyzed with data from 2,797 undergraduate students. The main result was a criterion-referenced scale that was based on information from item features that were linked to cognitive components, such as storage capacity, goal management, and abstraction; this information was used to create the descriptions of selected levels of a fluid intelligence scale. The scale proposed that the levels of fluid intelligence range from the ability to solve problems containing a limited number of bits of information with obvious relationships through the ability to solve problems that involve abstract relationships under conditions that are confounded with an information overload and distraction by mixed noise. This scale can be employed in future research to provide interpretations for the measurements of the cognitive processes mastered and the types of difficulty experienced by examinees. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Comparing Cattell–Horn–Carroll factor models: Differences between bifactor and higher order factor models in predicting language achievement.
    Previous research using the Cattell–Horn–Carroll (CHC) theory of cognitive abilities has shown a relationship between cognitive ability and academic achievement. Most of this research, however, has been done using the Woodcock-Johnson family of instruments with a higher order factor model. For CHC theory to grow, research should be done with other assessment instruments and tested with other factor models. This study examined the relationship between different factor models of CHC theory and the factors’ relationships with language-based academic achievement (i.e., reading and writing). Using the co-norming sample for the Wechsler Intelligence Scale for Children—4th Edition and the Wechsler Individual Achievement Test—2nd Edition, we found that bifactor and higher order models of the subtests of the Wechsler Intelligence Scale for Children—4th Edition produced a different set of Stratum II factors, which, in turn, have very different relationships with the language achievement variables of the Wechsler Individual Achievement Test—2nd Edition. We conclude that the factor model used to represent CHC theory makes little difference when general intelligence is of major interest, but it makes a large difference when the Stratum II factors are of primary concern, especially when they are used to predict other variables. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Correction to Patrick et al. (2013).
    Reports an error in "Optimizing efficiency of psychopathology assessment through quantitative modeling: Development of a brief form of the Externalizing Spectrum Inventory" by Christopher J. Patrick, Mark D. Kramer, Robert F. Krueger and Kristian E. Markon (Psychological Assessment, 2013[Dec], Vol 25[4], 1332-1348). A line of data from Table 2, “Marijuana Problems,” was missing. The completed table is provided. (The following abstract of the original article appeared in record 2013-42967-001.) The Externalizing Spectrum Inventory (ESI; Krueger, Markon, Patrick, Benning, & Kramer, 2007) provides for integrated, hierarchical assessment of a broad range of problem behaviors and traits in the domain of deficient impulse control. The ESI assesses traits and problems in this domain through 23 lower order facet scales organized around 3 higher order dimensions, reflecting general disinhibition, callous aggression, and substance abuse. The full-form ESI contains 415 items, and a shorter form would be useful for questionnaire screening studies or multimethod research protocols. In the current work, we employed item response theory and structural modeling methods to create a 160-item brief form (ESI–BF) that provides for efficient measurement of the ESI’s lower order facets and quantification of its higher order dimensions either as scale-based factors or as item-based composites. The ESI–BF is recommended for use in research on psychological or neurobiological correlates of problems such as risk-taking, delinquency, aggression, and substance abuse, and studies of general and specific mechanisms that give rise to problems of these kinds. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Evidence of convergent and discriminant validity of the Student School Engagement Measure.
    The purpose of this study was to investigate the convergent and discriminant validity of the Student School Engagement Measure (SSEM) with 3 other measures of student well-being: (a) the School Engagement Scale, (b) the Student Engagement Instrument, and (c) the Student Life Satisfaction Survey. The data were analyzed from 370 8th-grade students from 3 middle schools in an urban school district. As hypothesized, strong and significant positive correlations (.80) were found between the SSEM and the 2 measures of engagement (the School Engagement Measure and the Student Engagement Instrument). Also as hypothesized, a weak but significant positive correlation (.35) was found between the SSEM and a measure of life satisfaction (the Student Life Satisfaction Survey). These findings provide additional support for using the SSEM as a valid measure of adolescents’ engagement with school. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Development and validation of the Overall Depression Severity and Impairment Scale.
    The need to capture severity and impairment of depressive symptomatology is widespread. Existing depression scales are lengthy and largely focus on individual symptoms rather than resulting impairment. The Overall Depression Severity and Impairment Scale (ODSIS) is a 5-item, continuous measure designed for use across heterogeneous mood disorders and with subthreshold depressive symptoms. This study examined the psychometric properties of the ODSIS in outpatients in a clinic for emotional disorders (N = 100), undergraduate students (N = 566), and community-based adults (N = 189). Internal consistency, latent structure, item response theory, classification accuracy, convergent and discriminant validity, and differential item functioning analyses were conducted. ODSIS scores exhibited excellent internal consistency, and confirmatory factor analyses supported a unidimensional structure. Item response theory results demonstrated that the ODSIS provides more information about individuals with high levels of depression than those with low levels of depression. Responses on the ODSIS discriminated well between individuals with and without a mood disorder and depression-related severity across clinical and subclinical levels. A cut score of 8 correctly classified 82% of outpatients as with or without a mood disorder; it evidenced a favorable balance of sensitivity and specificity and of positive and negative predictive values. The ODSIS demonstrated good convergent and discriminant validity, and results indicate that items function similarly across clinical and nonclinical samples. Overall, findings suggest that the ODSIS is a valid tool for measuring depression-related severity and impairment. The brevity and ease of use of the ODSIS support its utility for screening and monitoring treatment response across a variety of settings. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
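The ODSIS abstract evaluates a cut score by classification accuracy, sensitivity, specificity, and positive and negative predictive values. As a rough sketch of how those quantities relate to a cut score, the example below tallies them from a small set of scores. The data are invented, the helper `classify_at_cut` is hypothetical, and treating scores at or above the cut as positive is an assumption, since the abstract does not state the direction of the rule.

```python
def classify_at_cut(scores, has_disorder, cut):
    """Tally screening metrics for the rule `score >= cut` (assumed direction).

    Assumes each of the four cells' denominators is non-zero.
    """
    tp = fp = tn = fn = 0
    for score, diagnosed in zip(scores, has_disorder):
        flagged = score >= cut
        if flagged and diagnosed:
            tp += 1
        elif flagged and not diagnosed:
            fp += 1
        elif not flagged and diagnosed:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # diagnosed people who are flagged
        "specificity": tn / (tn + fp),  # non-diagnosed people who are not flagged
        "ppv": tp / (tp + fp),          # flagged people who are diagnosed
        "npv": tn / (tn + fn),          # unflagged people who are not diagnosed
    }


# Hypothetical scores and diagnoses, for illustration only.
scores = [2, 9, 12, 5, 8, 7, 15, 3, 10, 6]
has_disorder = [False, True, True, False, True, True, True, False, False, False]
metrics = classify_at_cut(scores, has_disorder, cut=8)
print(metrics)
```

Rerunning the function over a range of `cut` values shows the trade-off between sensitivity and specificity that a "favorable balance" refers to.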

  • Probing the implicit suicidal mind: Does the Death/Suicide Implicit Association Test reveal a desire to die, or a diminished desire to live?
    Assessment of implicit self-associations with death relative to life, measured by a death/suicide implicit association test (d/s-IAT), has shown promise in the prediction of suicide risk. The current study examined whether the d/s-IAT reflects an individual’s desire to die or a diminished desire to live and whether the predictive utility of implicit cognition is mediated by life-oriented beliefs. Four hundred eight undergraduate students (285 female; M age = 20.36 years, SD = 4.72) participated. Participants completed the d/s-IAT and self-report measures assessing 6 indicators of suicide risk (suicide ideation frequency and intensity, depression, nonsuicidal self-harm thoughts frequency and intensity, and nonsuicidal self-harm attempts), as well as survival and coping beliefs and history of prior suicide attempts. The d/s-IAT significantly predicted 5 out of the 6 indicators of suicide risk above and beyond the strongest traditional indicator of risk, history of prior suicide attempts. However, the effect of the d/s-IAT on each of the risk indicators was mediated by individuals’ survival and coping beliefs. Moreover, the distribution of d/s-IAT scores primarily reflected variability in self-associations with life. Implicit suicide-related cognition appears to reflect a gradual diminishing of the desire to live, rather than a desire to die. Contemporary theories of suicide and risk assessment protocols need to account for the dynamic relationship between both risk and life-oriented resilience factors, and intervention strategies aimed at enhancing engagement with life should be a routine part of suicide risk management. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • The use of immersive virtual reality (VR) to predict the occurrence 6 months later of paranoid thinking and posttraumatic stress symptoms assessed by self-report and interviewer methods: A study of individuals who have been physically assaulted.
    Presentation of social situations via immersive virtual reality (VR) has the potential to be an ecologically valid way of assessing psychiatric symptoms. In this study we assess the occurrence of paranoid thinking and of symptoms of posttraumatic stress disorder (PTSD) in response to a single neutral VR social environment as predictors of later psychiatric symptoms assessed by standard methods. One hundred six people entered an immersive VR social environment (a train ride), presented via a head-mounted display, 4 weeks after having attended hospital because of a physical assault. Paranoid thinking about the neutral computer-generated characters and the occurrence of PTSD symptoms in VR were assessed. Reactions in VR were then used to predict the occurrence 6 months later of symptoms of paranoia and PTSD, as assessed by standard interviewer and self-report methods. Responses to VR predicted the severity of paranoia and PTSD symptoms as assessed by standard measures 6 months later. The VR assessments also added predictive value to the baseline interviewer methods, especially for paranoia. Brief exposure to environments presented via virtual reality provides a symptom assessment with predictive ability over many months. VR assessment may be of particular benefit for difficult to assess problems, such as paranoia, that have no gold standard assessment method. In the future, VR environments may be used in the clinic to complement standard self-report and clinical interview methods. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Psychometric properties of the Coping Inventory for Stressful Situations (CISS) in patients with acquired brain injury.
    Information on the psychometric properties of the Coping Inventory for Stressful Situations (CISS) in acquired brain injury (ABI) is currently unavailable. Therefore, we investigated the construct and discriminant, convergent, and divergent validity of the CISS in a Dutch adult sample with newly ABI (N = 139). Patients were recruited at the start of outpatient neurorehabilitation (time since diagnosis ≤ 4 months) or after discharge home from hospital or inpatient neurorehabilitation. The original 3-factor solution of the CISS (Task-Oriented, Emotion-Oriented, Avoidance) showed a borderline fit, which slightly improved after removal of 3 problematic items. We found borderline support for a 4-factor model. Internal consistency was good. Discriminant validity was only partial as we found a moderate correlation between the Task-Oriented and Avoidance scales. Emotion-Oriented Coping correlated strongly with the anxiety and depression subscale of the Hospital Anxiety and Depression Scale. Of the 2 scales of the Assimilative/Accommodative Coping Questionnaire, Tenacious Goal Pursuit correlated strongest with Task-Oriented Coping, whereas Flexible Goal Adjustment correlated negatively with Emotion-Oriented Coping. In summary, the psychometric properties of the CISS in patients with ABI ranged from acceptable to good. The classical 3-factor structure is appropriate, but some items might be problematic in patients with ABI. Replication of the restricted 3-factor model in larger samples is needed, together with further exploration of discriminant validity and the relationship of the CISS with other coping measures, but for now we recommend using the original CISS in patients with ABI. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Validating indicators of treatment response: Application to trichotillomania.
    Different studies of the treatment of trichotillomania (TTM) have used varying standards to determine the proportion of patients who obtain clinically meaningful benefits, but there is little information on the similarity of results yielded by these methods or on their comparative validity. Data from a stepped-care (Step 1: Web-based self-help; Step 2: Individual behavior therapy; N = 60) treatment study of TTM were used to evaluate 7 potential standards: complete abstinence, ≥25% symptom reduction, recovery of normal functioning, and clinical significance (recovery + statistically reliable change), each of the last 3 being measured by self-report (Massachusetts General Hospital Hairpulling Scale; MGH–HPS) or interview (Psychiatric Institute Trichotillomania Scale). Depending on the metric, response rates ranged from 25 to 68%. All standards were significantly associated with one another, though less strongly for the 25% symptom reduction metrics. Concurrent (with deciding to enter Step 2 treatment) and predictive (with 3-month follow-up treatment satisfaction, TTM-related impairment, quality of life, and diagnosis) validity results were variable but generally strongest for clinical significance as measured via self-report. Routine reporting of the proportion of patients who make clinically significant improvement on the MGH–HPS, supplemented by data on complete abstinence, would bolster the interpretability of TTM treatment outcome findings. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
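The trichotillomania abstract compares response standards such as ≥25% symptom reduction and clinical significance (recovery plus statistically reliable change). The sketch below shows one common way such criteria are computed, using the Jacobson and Truax reliable change index. The abstract does not say which formula or cutoffs the authors used, so the formula choice, the `recovery_cutoff` parameter, and all numbers here are assumptions for illustration.

```python
import math


def percent_reduction(pre, post):
    """Symptom reduction from pre to post, as a percentage of the pre score."""
    return 100.0 * (pre - post) / pre


def reliable_change_index(pre, post, sd_pre, reliability):
    """Jacobson-Truax RCI: change divided by the standard error of the difference."""
    standard_error = sd_pre * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0) * standard_error
    return (post - pre) / s_diff


def clinically_significant(pre, post, sd_pre, reliability, recovery_cutoff):
    """Reliable improvement (RCI below -1.96) AND post score at or below a
    recovery cutoff. The cutoff is a hypothetical parameter."""
    rci = reliable_change_index(pre, post, sd_pre, reliability)
    return rci < -1.96 and post <= recovery_cutoff


# Hypothetical patient and scale properties, invented for illustration.
pre, post = 20.0, 10.0
print(percent_reduction(pre, post))                      # 50.0
print(round(reliable_change_index(pre, post, 5.0, 0.75), 3))
print(clinically_significant(pre, post, 5.0, 0.75, recovery_cutoff=12.0))
```

A patient can meet the percentage criterion without meeting the stricter combined criterion, which is one reason response rates differ across the standards compared in the abstract.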

  • Accuracy of self-reported versus actual online gambling wins and losses.
    This study is the first to compare the accuracy of self-reported with actual monetary outcomes of online fixed odds sports betting, live action sports betting, and online casino gambling at the individual level of analysis. Subscribers to bwin.party digital entertainment’s online gambling service volunteered to respond to the Brief Bio-Social Gambling Screen and questions about their estimated gambling results on specific games for the last 3 or 12 months. We compared the estimated results of each subscriber with his or her actual betting results data. On average, between 34% and 40% of the participants expressed a favorable distortion of their gambling outcomes (i.e., they underestimated losses or overestimated gains) depending on the time period and game. The size of the discrepancy between actual and self-reported results was consistently associated with the self-reported presence of gambling-related problems. However, the specific direction of the reported discrepancy (i.e., favorable vs. unfavorable bias) was not associated with gambling-related problems. (PsycINFO Database Record (c) 2014 APA, all rights reserved)

  • Psychometric properties for the Balanced Inventory of Desirable Responding: Dichotomous versus polytomous conventional and IRT scoring.
    [Correction Notice: An Erratum for this article was reported in Vol 26(3) of Psychological Assessment (see record 2014-16017-001). The mean, standard deviation and alpha coefficient originally reported in Table 1 should be 74.317, 10.214 and .802, respectively. The validity coefficients in the last column of Table 4 are affected as well. Correcting this error did not change the substantive interpretations of the results, but did increase the mean, standard deviation, alpha coefficient, and validity coefficients reported for the Honesty subscale in the text and in Tables 1 and 4. The corrected versions of Tables 1 and Table 4 are shown in the erratum.] Item response theory (IRT) models were applied to dichotomous and polytomous scoring of the Self-Deceptive Enhancement and Impression Management subscales of the Balanced Inventory of Desirable Responding (Paulhus, 1991, 1999). Two dichotomous scoring methods reflecting exaggerated endorsement and exaggerated denial of socially desirable behaviors were examined. The 1- and 2-parameter logistic models (1PLM, 2PLM, respectively) were applied to dichotomous responses, and the partial credit model (PCM) and graded response model (GRM) were applied to polytomous responses. For both subscales, the 2PLM fit dichotomous responses better than did the 1PLM, and the GRM fit polytomous responses better than did the PCM. Polytomous GRM and raw scores for both subscales yielded higher test–retest and convergent validity coefficients than did PCM, 1PLM, 2PLM, and dichotomous raw scores. Information plots showed that the GRM provided consistently high measurement precision that was superior to that of all other IRT models over the full range of both construct continuums. 
    Dichotomous scores reflecting exaggerated endorsement of socially desirable behaviors provided noticeably weak precision at low levels of the construct continuums, calling into question the use of such scores for detecting instances of “faking bad.” Dichotomous models reflecting exaggerated denial of the same behaviors yielded much better precision at low levels of the constructs, but it was still less precision than that of the GRM. These results support polytomous over dichotomous scoring in general, alternative dichotomous scoring for detecting faking bad, and extension of GRM scoring to situations in which IRT offers additional practical advantages over classical test theory (adaptive testing, equating, linking, scaling, detecting differential item functioning, and so forth). (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source
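
    The precision comparison in the abstract above rests on item information. As a minimal sketch, assuming the standard textbook form of the 2-parameter logistic model (the function names and parameter values below are hypothetical and are not taken from the article), the following shows why a dichotomous item located high on the trait continuum contributes little information at low trait levels:

```python
import math

def p_2pl(theta, a, b):
    """Probability of endorsing a dichotomous item under the 2PLM.

    theta: latent trait level; a: discrimination; b: location.
    Fixing a to a common value across items gives the 1PLM.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info_2pl(theta, a, b):
    """Fisher information of a 2PLM item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item: discrimination 1.5, location 1.0.
for theta in (-2.0, 0.0, 1.0):
    print(theta, round(info_2pl(theta, 1.5, 1.0), 4))
```

    Information peaks where theta equals the item location (here a² / 4 = 0.5625) and falls off toward the low end of the continuum, which is the pattern the abstract describes for exaggerated-endorsement scoring.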

  • Race/ethnicity and measurement equivalence of the Everyday Discrimination Scale.
    The present study examines the effect of race/ethnicity on measurement equivalence of the Everyday Discrimination Scale (EDS; Williams, Yu, Jackson, & Anderson, 1997). Drawn from the Collaborative Psychiatric Epidemiology Surveys (CPES; Alegría, Jackson, Kessler, & Takeuchi, 2008), adults aged 18 and older from four racial/ethnic groups were selected for analyses: 884 non-Hispanic Whites, 4,950 Blacks, 2,733 Hispanics/Latinos, and 2,089 Asians. Multiple-group confirmatory factor analyses were conducted. After adjusting for age and gender, the underlying construct of the EDS was invariant across four racial/ethnic groups, with Item 7 (“People act as if they’re better than you are”) associated with lower intercepts for the Hispanic/Latino and Asian groups relative to the non-Hispanic White and Black groups. In terms of latent factor differences, Blacks tended to score higher on the latent construct compared to other racial/ethnic groups, whereas Asians tended to score lower on the latent construct compared to Whites and Hispanics/Latinos. Findings suggest that although the EDS in general assesses the underlying construct of perceived discrimination equivalently across diverse racial/ethnic groups, caution is needed when Item 7 is used among Hispanics/Latinos or Asians. Implications are discussed in cultural and methodological contexts. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Correspondence between psychometric and clinical high risk for psychosis in an undergraduate population.
    Despite the common use of either psychometric or clinical methods for identifying individuals at risk for psychosis, previous research has not examined the correspondence and extent of convergence of these 2 approaches. Undergraduates (n = 160), selected from a larger pool, completed 3 self-report schizotypy scales: the Magical Ideation Scale, the Perceptual Aberration Scale, and the Revised Social Anhedonia Scale. They were administered the Structured Interview for Prodromal Syndromes. First, high correlations were observed for self-report and interview-rated psychotic-like experiences (rs between .48 and .61, p <.001). Second, 77% of individuals who identified as having a risk for psychosis with the self-report measures reported at least 1 clinically meaningful psychotic-like experience on the Structured Interview for Prodromal Syndromes. Third, receiver operating characteristic curve analyses showed that the self-report scales can be used to identify which participants report clinically meaningful positive symptoms. These results suggest that mostly White undergraduate participants who identify as at risk with the psychometric schizotypy approach report clinically meaningful psychotic-like experiences in an interview format and that the schizotypy scales are moderately to strongly correlated with interview-rated psychotic-like experiences. The results of the current research provide a baseline for comparing research between these 2 approaches. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Diagnostic reliability of MMPI-2 computer-based test interpretations.
    Reflecting the common use of the MMPI-2 to provide diagnostic considerations, computer-based test interpretations (CBTIs) also typically offer diagnostic suggestions. However, these diagnostic suggestions can sometimes be shown to vary widely across different CBTI programs even for identical MMPI-2 profiles. The present study evaluated the diagnostic reliability of 6 commercially available CBTIs using a 20-item Q-sort task developed for this study. Four raters each sorted diagnostic classifications based on these 6 CBTI reports for 20 MMPI-2 profiles. Two questions were addressed. First, do users of CBTIs understand the diagnostic information contained within the reports similarly? Overall, diagnostic sorts of the CBTIs showed moderate inter-interpreter diagnostic reliability (mean r = .56), with sorts for the 1/2/3 profile showing the highest inter-interpreter diagnostic reliability (mean r = .67). Second, do different CBTI programs vary with respect to diagnostic suggestions? It was found that diagnostic sorts of the CBTIs had a mean inter-CBTI diagnostic reliability of r = .56, indicating moderate but not strong agreement across CBTIs in terms of diagnostic suggestions. The strongest inter-CBTI diagnostic agreement was found for sorts of the 1/2/3 profile CBTIs (mean r = .71). Limitations and future directions are discussed. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source
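
    A "mean r" of the kind reported above can be read as an average of pairwise correlations between sorts. The sketch below is one plausible way to compute such a value, assuming plain Pearson correlations over Q-sort placements; the abstract does not state the exact procedure, and the function names and example sorts are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length lists of placements."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

def mean_pairwise_r(sorts):
    """Average Pearson r over every pair of sorts (raters or reports)."""
    pairs = list(combinations(sorts, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

# Hypothetical placements of five Q-sort items by three raters.
rater_sorts = [
    [1, 2, 3, 4, 5],
    [1, 3, 2, 4, 5],
    [2, 1, 3, 5, 4],
]
print(round(mean_pairwise_r(rater_sorts), 2))
```

    The same function applies whether the sorts being compared come from different raters reading one report or from one rater reading different CBTI reports.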

  • Factor analysis of the Achievement of Therapeutic Objectives Scale (ATOS) in short-term dynamic psychotherapy and cognitive therapy.
    This study examined the factor structure of the Achievement of Therapeutic Objectives Scale (ATOS; McCullough, Larsen, et al., 2003) in short-term dynamic psychotherapy (STDP) and cognitive therapy (CT). The ATOS is a process scale that has shown promise as a measure of patients’ achievements of treatment objectives in STDP and CT and is conceptualized as comprising 7 subscales hypothesized to cluster according to 3 main treatment objectives (defense restructuring, affect restructuring, and restructuring of sense of self and others). However, the factor structure of the ATOS has not been examined empirically previously. Data were derived from ratings of videotaped therapy sessions from a randomized controlled trial, comparing STDP and CT for patients with Cluster C personality disorders. The model fit of a 2- and 3-factor solution was examined in the combined patient sample, as well as in each treatment separately, utilizing structural equation modeling. Both a 2- and 3-factor model provided acceptable fit to the data. The results add to the psychometric soundness of the ATOS as an innovative observer-based instrument for examining process in STDP and CT. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Assessment of nonsuicidal self-injury: Development and initial validation of the Non-Suicidal Self-Injury–Assessment Tool (NSSI-AT).
    Research tools for assessing nonsuicidal self-injury (NSSI) epidemiology in community populations are few and are either limited in the scope of NSSI characteristics assessed or included as part of suicide assessment. Though these surveys have been immensely useful in establishing the presence of NSSI and in documenting basic epidemiological characteristics, they have been less useful in describing secondary NSSI features such as NSSI context, habituation, or perceived life impact. The aim of the current study was to examine the reliability of the test scores and validity of test score interpretations in a university population for the Non-Suicidal Self-Injury–Assessment Tool (NSSI-AT), a web-based measure of NSSI designed to assess primary (such as form, frequency, and function) and secondary (including but not limited to NSSI habituation; contexts in which NSSI is practiced; and NSSI perceived life interference, treatment, and impacts) NSSI characteristics for research purposes. Data for these analyses were drawn from 3 samples, all of which were originally part of a 2007 study of randomly selected students from 8 northeast and midwest public and private universities that participated in a web-based study entitled the Survey of Student Wellbeing. Overall, results provide support for the reliability of NSSI-AT test scores (as assessed by test–retest) and validity of NSSI-AT test score interpretations for the behavior and frequency modules (as assessed using concurrent, convergent, and discriminant evidence) in this population. Implications for research as well as next steps are discussed. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Feasibility of text messaging for ecological momentary assessment of marijuana use in college students.
    Measuring self-reported substance use behavior is challenging due to issues related to memory recall and patterns of bias in estimating behavior. Limited research has focused on the use of ecological momentary assessment (EMA) to evaluate marijuana use. This study assessed the feasibility of using short message service (SMS) texting as a method of EMA with college-age marijuana users. Our goals were to evaluate overall response/compliance rates and trends of data missingness, response time, baseline measures (e.g., problematic use) associated with compliance rates and response times, and differences between EMA responses of marijuana use compared to timeline followback (TLFB) recall. Nine questions were texted to participants on their personal cell phones 3 times a day over a 2-week period. Overall response rate was high (89%). When examining predictors of the probability of data missingness with a hierarchical logistic regression model, we found evidence of a higher propensity for missingness for Week 2 of the study compared to Week 1. Self-regulated learning was significantly associated with an increase in mean response time. A model fit at the participant level to explore response time found that more time spent smoking marijuana related to higher response times, while more time spent studying and greater “in the moment” academic motivation and craving were associated with lower response times. Significant differences were found between the TLFB and EMA, with greater reports of marijuana use reported through EMA. Overall, results support the feasibility of using SMS text messaging as an EMA method for college-age marijuana users. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • A comparison of the criterion validity of popular measures of narcissism and narcissistic personality disorder via the use of expert ratings.
    The growing interest in the study of narcissism has resulted in the development of a number of assessment instruments that manifest only modest to moderate convergence. The present studies adjudicate among these measures with regard to criterion validity. In the 1st study, we compared multiple narcissism measures to expert consensus ratings of the personality traits associated with narcissistic personality disorder (NPD; Study 1; N = 98 community participants receiving psychological/psychiatric treatment) according to the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM–IV–TR; American Psychiatric Association, 2000) using 5-factor model traits as well as the traits associated with the pathological trait model according to the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; American Psychiatric Association, 2013). In Study 2 (N = 274 undergraduates), we tested the criterion validity of an even larger set of narcissism instruments by examining their relations with measures of general and pathological personality, as well as psychopathology, and compared the resultant correlations to the correlations expected by experts for measures of grandiose and vulnerable narcissism. Across studies, the grandiose dimensions from the Five-Factor Narcissism Inventory (FFNI; Glover, Miller, Lynam, Crego, & Widiger, 2012) and the Narcissistic Personality Inventory (Raskin & Terry, 1988) provided the strongest match to expert ratings of DSM–IV–TR NPD and grandiose narcissism, whereas the vulnerable dimensions of the FFNI and the Pathological Narcissism Inventory (Pincus et al., 2009), as well as the Hypersensitive Narcissism Scale (Hendin & Cheek, 1997), provided the best match to expert ratings of vulnerable narcissism. These results should help guide researchers toward the selection of narcissism instruments that are most well suited to capturing different aspects of narcissism. 
    (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Activation as an overlooked factor in the BDI–II: A factor model based on core symptoms and qualitative aspects of depression.
    An adequate assessment of depression has been of concern to many researchers over the last half-century. These efforts have brought forth a manifold of depression rating scales, of which the Beck Depression Inventory (BDI) is 1 of the most commonly used self-assessment scales. Since its revision, the item structure of the BDI–II has been examined in many factor analytic studies, yet it has not been possible to achieve a consensus about the underlying factor structure. Recent findings from a nonmetric multidimensional scaling (NMDS) analysis (Bühler, Keller, & Läge, 2012) of the German norming sample of the BDI–II emphasized a structure with different qualitative aspects of depression, which suggested that the existing factor models do not adequately represent the data. The NMDS results were reviewed, and on the basis of these findings, a different factor model is proposed. In contrast to the common factor models in the literature, the presented model includes an additional factor, which is associated with the activation level of the BDI–II symptoms. The model was evaluated with a 2nd sample of patients diagnosed with a primary affective disorder (N = 569) and obtained good fit indices that even exceeded the fit of the most reliable factor model (Ward, 2006) described in the literature so far. Furthermore, emphasis is placed on the methodological question of how factor models may be derived from the results of NMDS analyses. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Psychometric evaluation of the Short Form 36 Health Survey (SF-36) and the World Health Organization Quality of Life Scale Brief Version (WHOQOL-BREF) for patients with schizophrenia.
    Quality-of-life (QoL) instruments measure the overall health status of people with schizophrenia, for whom the activities of daily life are often difficult. However, information on the psychometric properties of scores from the Short Form 36 Health Survey (SF-36) and the World Health Organization Quality of Life Scale Brief Version (WHOQOL-BREF), 2 commonly used generic QoL instruments in this population, is limited. Thus, we used a multitrait–multimethod analysis plus confirmatory factor analysis (CFA) to examine their psychometric properties. To test the reliability of their scores, we used methods of absolute reliability (standard error of measurement [SEM] and smallest real difference [SRD]) and relative reliability (i.e., intraclass correlation coefficient [ICC]). We recruited 100 patients with schizophrenia from a psychiatric hospital in southern Taiwan. All participants filled out the SF-36 and the WHOQOL-BREF at baseline and 2 weeks later. The participants’ QoL scores were lower than those of the Taiwan general population (ps <.01), and CFA indicated that the constructs of QoL scores for the SF-36 (comparative fit index [CFI] = .918; incremental fit index [IFI] = .919; Tucker–Lewis index [TLI] = .885) and the WHOQOL-BREF (CFI = .967; IFI = .967; TLI = .900) were acceptable. The SEM and SRD analyses suggested that the total scores of the SF-36 (SEM% = 10.03%; SRD% = 27.80%) and of the WHOQOL-BREF (SEM% = 5.55%; SRD% = 15.40%) were reliable. Also, our results demonstrated that the WHOQOL-BREF scores were more reliable and valid than the SF-36 scores for assessing people with schizophrenia. The scores of both questionnaires were valid and reliable and detected different aspects of QOL in the population with schizophrenia. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source
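
    The absolute reliability indices named above are linked by simple formulas. As a minimal sketch, assuming the conventional definitions SEM = SD × √(1 − ICC) and SRD = 1.96 × √2 × SEM (the abstract does not spell out its formulas, and the function names are hypothetical), the reported SRD% values can be recovered from the reported SEM% values:

```python
import math

def sem(sd, icc):
    """Standard error of measurement from a score SD and a test-retest ICC."""
    return sd * math.sqrt(1.0 - icc)

def srd(sem_value, z=1.96):
    """Smallest real difference at the 95% level for a test-retest design."""
    return z * math.sqrt(2.0) * sem_value

# SEM% values reported in the abstract for the two total scores.
print(round(srd(10.03), 2))  # SF-36: 27.8, matching the reported 27.80%
print(round(srd(5.55), 2))   # WHOQOL-BREF: 15.38, close to the reported 15.40%
```

    The small gap for the WHOQOL-BREF is consistent with the published figure having been computed from an unrounded SEM%.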

  • Short-Term Assessment of Risk and Treatability (START): Systematic review and meta-analysis.
    This article describes a systematic review of the psychometric properties of the Short-Term Assessment of Risk and Treatability (START) and a meta-analysis to assess its predictive efficacy for the 7 risk domains identified in the manual (violence to others, self-harm, suicide, substance abuse, victimization, unauthorized leave, and self-neglect) among institutionalized patients with mental disorder and/or personality disorder. Comprehensive terms were used to search 5 electronic databases up to January 2013. Additional articles were located by examining reference lists and hand-searching. Twenty-three papers were selected for inclusion in the narrative review of START’s properties, whereas 9 studies involving 543 participants were included in the meta-analysis. Studies about the feasibility and utility of the tool had positive results but lacked comparators. START ratings demonstrated high internal consistency, interrater reliability, and convergent validity with other risk measures. There was a lack of information about the variability of START ratings over time. Its use in an intervention to reduce violence in forensic psychiatric outpatients was not better than standard care. START risk estimates demonstrated strong predictive validity for various aggressive outcomes and good predictive validity for self-harm. Predictive validity for self-neglect and victimization was no better than chance, whereas evidence for the remaining outcomes is derived from a single, small study. Only 3 of the studies included in the meta-analysis were rated to be at a low risk of bias. Future research should aim to investigate the predictive validity of the START for the full range of adverse outcomes, using well-designed methodologies and validated outcome tools. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Psychometrically improved, abbreviated versions of three classic measures of impulsivity and self-control.
    Self-reported impulsivity confers risk for substance abuse. However, the psychometric properties of many self-report impulsivity measures have been questioned, thereby undermining the interpretability of study findings using these measures. To better understand these measurement limitations and to suggest a path to assessing self-reported impulsivity with greater psychometric stability, we conducted a comprehensive psychometric evaluation of the Barratt Impulsiveness Scale–11 (BIS–11), the Behavioral Inhibition and Activation Scales (BIS/BAS), and the Brief Self-Control Scale (BSCS) using data from 1,449 individuals who participated in substance use research. For each measure, we evaluated (a) latent factor structure, (b) measurement invariance, (c) test-criterion relationships between the measures, and (d) test-criterion relations with drinking and smoking outcomes. Notably, we could not replicate the originally published latent structure for the BIS, BIS/BAS, or BSCS or any previously published alternative factor structure (English language). Using exploratory and confirmatory factor analysis, we identified psychometrically improved, abbreviated versions of each measure: 8-item, 2-factor BIS–11 (root-mean-square error of approximation [RMSEA] = .06, comparative fit index [CFI] = .95); 13-item, 4-factor BIS/BAS (RMSEA = .04, CFI = .96); and 7-item, 2-factor BSCS (RMSEA = .05, CFI = .96). These versions evidenced (a) stable, replicable factor structures, (b) scalar measurement invariance, ensuring our ability to make statistically interpretable comparisons across subgroups of interest (e.g., sex, race, drinking/smoking status), and (c) test-criterion relationships with each other and with drinking/smoking. This study provides strong support for using these psychometrically improved impulsivity measures, which improve data quality directly through better scale properties and indirectly through reducing response burden. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • The impact of ambiguous response categories on the factor structure of the GHQ–12.
    Previous research has suggested multiple factor structures for the 12-item General Health Questionnaire (GHQ–12), with contradictory evidence arising across different studies on the validity of these models. In the present research, it was hypothesized that these inconsistent findings were due to the interaction of 3 main methodological factors: ambiguous response categories in the negative items, multiple scoring schemes, and inappropriate estimation methods. Using confirmatory factor analysis with appropriate estimation methods and scores obtained from a large (n = 27,674) representative Spanish sample, we tested this hypothesis by evaluating the fit and predictive validities of 4 GHQ–12 factor models—unidimensional, Hankins’ (2008a) response bias model, Andrich and Van Schoubroeck’s (1989) 2-factor model, and Graetz’s (1991) 3-factor model—across 3 scoring methods: standard, corrected, and Likert. In addition, the impact of method effects on the reliability of the global GHQ–12 scores was also evaluated. The combined results of this study support the view that the GHQ–12 is a unidimensional measure that contains spurious multidimensionality under certain scoring schemes (corrected and Likert) as a result of ambiguous response categories in the negative items. Therefore, it is suggested that the items be scored using the standard method and that only a global score be derived from the instrument. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Wording effects and the factor structure of the 12-item General Health Questionnaire (GHQ-12).
    The 12-item version of the General Health Questionnaire (GHQ-12) has become a popular screening instrument with which to measure general psychological health in different settings. Previous studies into the factorial structure of the GHQ-12 have mainly supported multifactor solutions, and only a few recent works have shown that the GHQ-12 was best represented by a single substantive factor when method effects associated with negatively worded items were considered. Confirmatory factor analysis was applied to compare competing measurement models from previous research, including correlated traits–correlated methods and correlated traits–correlated uniquenesses approaches, to obtain further evidence about the factorial structure of the GHQ-12. This goal was achieved with data from 3,050 participants who completed the GHQ-12 included in the Catalonian Survey of Working Conditions (Catalonian Labor Relations and Quality of Work Department, 2012). The results showed additional evidence that the GHQ-12 has a unidimensional structure after controlling for method effects associated with negatively worded items. Furthermore, we found evidence for our hypothesis about the spurious nature of the 3-factor solution in Graetz’s (1991) model after comparing its fit with that found for alternative models resulting from different combinations of the negatively worded items. An implication of our results is that future research about the factor structure of the GHQ-12 should take method effects associated with negative wording into account in order to avoid reaching inaccurate conclusions about its dimensionality. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Resilience in a sample of Mexican American adolescents with substance use disorders.
    Resolving the many tasks of adolescent development requires resilience. However, understanding the role that resilience plays in adolescent development involves adequate measurement of the construct. The Connor–Davidson Resilience Scale (CD-RISC) is a widely used measure of resilience, but a stable latent factor structure has not been identified across studies. The measure has typically been examined in adult samples while little attention has been given to its use with adolescents in general and ethnic minority adolescents in particular. The primary purpose of the current study is to identify a latent factor structure of the CD-RISC in a sample of primarily Mexican American adolescents (N = 106). Two competing model structures were tested via confirmatory factor analysis and results supported a 7-item unidimensional factor model. Support was also found for the construct validity of the measure in relation to ethnic identity and depressive symptoms for adolescents in this sample. Implications of the study findings for adolescents and avenues of future research are discussed. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Underreporting on the MMPI–2–RF in a high-demand police officer selection context: An illustration.
    Positive response distortion is common in the high-demand context of employment selection. This study examined positive response distortion, in the form of underreporting, on the Minnesota Multiphasic Personality Inventory—2—Restructured Form (MMPI–2–RF). Police officer job applicants completed the MMPI–2–RF under high-demand and low-demand conditions, once during the preemployment psychological evaluation and once without contingencies after completing the police academy. Demand-related score elevations were evident on the Uncommon Virtues (L-r) and Adjustment Validity (K-r) scales. Underreporting was evident on the Higher-Order scales Emotional/Internalizing Dysfunction and Behavioral/Externalizing Dysfunction; 5 of 9 Restructured Clinical scales; 6 of 9 Internalizing scales; 3 of 4 Externalizing scales; and 3 of 5 Personality Psychopathology 5 scales. Regression analyses indicated that L-r predicted demand-related underreporting on behavioral/externalizing scales, and K-r predicted underreporting on emotional/internalizing scales. Select scales of the MMPI–2–RF are differentially associated with different types of underreporting among police officer applicants. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • An item response theory analysis of the Psychological Inventory of Criminal Thinking Styles: Comparing male and female probationers and prisoners.
    An item response theory (IRT) analysis of the Psychological Inventory of Criminal Thinking Styles (PICTS) was performed on 26,831 (19,067 male and 7,764 female) federal probationers and compared with results obtained on 3,266 (3,039 male and 227 female) prisoners from previous research. Despite the fact male and female federal probationers scored significantly lower on the PICTS thinking style scales than male and female prisoners, discrimination and location parameter estimates for the individual PICTS items were comparable across sex and setting. Consistent with the results of a previous IRT analysis conducted on the PICTS, the current results did not support sentimentality as a component of general criminal thinking. Findings from this study indicate that the discriminative power of the individual PICTS items is relatively stable across sex (male, female) and correctional setting (probation, prison) and that the PICTS may be measuring the same criminal thinking construct in male and female probationers and prisoners. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Optimists or optimistic? A taxometric study of optimism.
    Although most researchers have assumed that optimism exists on a continuum, it is not uncommon for researchers to dichotomize their data into optimists and pessimists, thus treating optimism as a categorical or taxonic variable. To address the question of whether optimism is dimensional or taxonic, the authors performed a set of taxometric analyses on 3 indicators derived from measures of hope and optimism using data from 510 college students. The results provided consistent evidence that optimism is dimensional. Colloquially, people may speak of optimists and pessimists, but researchers should avoid dichotomizing this continuous variable. (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source

  • Correction to Vispoel and Kim (2014).
    Reports an error in "Psychometric properties for the Balanced Inventory of Desirable Responding: Dichotomous versus polytomous conventional and IRT scoring" by Walter P. Vispoel and Han Yi Kim (Psychological Assessment, Advanced Online Publication, Apr 7, 2014, np). The mean, standard deviation and alpha coefficient originally reported in Table 1 should be 74.317, 10.214 and .802, respectively. The validity coefficients in the last column of Table 4 are affected as well. Correcting this error did not change the substantive interpretations of the results, but did increase the mean, standard deviation, alpha coefficient, and validity coefficients reported for the Honesty subscale in the text and in Tables 1 and 4. The corrected versions of Tables 1 and Table 4 are shown in the erratum. (The following abstract of the original article appeared in record 2014-12154-001.) Item response theory (IRT) models were applied to dichotomous and polytomous scoring of the Self-Deceptive Enhancement and Impression Management subscales of the Balanced Inventory of Desirable Responding (Paulhus, 1991, 1999). Two dichotomous scoring methods reflecting exaggerated endorsement and exaggerated denial of socially desirable behaviors were examined. The 1- and 2-parameter logistic models (1PLM, 2PLM, respectively) were applied to dichotomous responses, and the partial credit model (PCM) and graded response model (GRM) were applied to polytomous responses. For both subscales, the 2PLM fit dichotomous responses better than did the 1PLM, and the GRM fit polytomous responses better than did the PCM. Polytomous GRM and raw scores for both subscales yielded higher test–retest and convergent validity coefficients than did PCM, 1PLM, 2PLM, and dichotomous raw scores. Information plots showed that the GRM provided consistently high measurement precision that was superior to that of all other IRT models over the full range of both construct continuums. 
    Dichotomous scores reflecting exaggerated endorsement of socially desirable behaviors provided noticeably weak precision at low levels of the construct continuums, calling into question the use of such scores for detecting instances of “faking bad.” Dichotomous models reflecting exaggerated denial of the same behaviors yielded much better precision at low levels of the constructs, but it was still less precision than that of the GRM. These results support polytomous over dichotomous scoring in general, alternative dichotomous scoring for detecting faking bad, and extension of GRM scoring to situations in which IRT offers additional practical advantages over classical test theory (adaptive testing, equating, linking, scaling, detecting differential item functioning, and so forth). (PsycINFO Database Record (c) 2014 APA, all rights reserved)
    Citation link to source


