Reprinted with permission from the American Psychology Law Society News (APLSN). Written by Michael Gamache, Private Practice / Tampa, Florida. Created by Richard I. Frederick, PhD. Available from Pearson Assessments at 800.627.7271, ext. 3225.
Overview and Background
Richard Rogers is generally credited with stimulating a dramatic increase in efforts to detect malingering and deception in the context of psychological and neuropsychological examinations. Since the publication of the first edition of Clinical Assessment of Malingering and Deception (Rogers, 1988, 1997), there has been a concerted effort to establish new methods, criteria, and technology for the detection of effort in general and malingering in particular. Rogers, Harrell, and Liff (1993) later encouraged the development of specific measures to detect malingering in neuropsychological examination and identified six strategies for detecting feigned neuropsychological impairment: floor effects, symptom validity testing, performance curves, magnitude of error, atypical presentation, and psychological sequelae.
The importance of specific examination of malingering, deception, and effort in neuropsychological examination is underscored by recent research suggesting that the base rate of malingering or deception among individuals claiming cognitive impairment may range from 20% (Griffin et al., 1996) to 40% (Greiffenstein, Baker, & Gola, 1994), with some studies suggesting much higher rates of dissimulation in select populations. Multimodal assessment of effort, motivation, and deception is a necessary component of evaluations involving presenting complaints of a cognitive or neuropsychological nature where there is potential for secondary gain.
The Validity Indicator Profile (VIP) was designed to be incorporated as one element of a multimodal assessment strategy that would enable the examiner to conduct a general assessment of response style, including invalid response styles in addition to outright malingering or intentional deception. By design, it incorporates floor effect, symptom validity testing, performance curve and atypical presentation strategies.
Early work by Richard Frederick, the author of the VIP test, building on the forced-choice technique for detecting malingering (Frederick & Foster, 1991), served as the foundation for the Validity Indicator Profile. Rather than relying strictly on the binomial theorem and below-chance performance, as is typically the case with Symptom Validity Testing, Frederick and Foster set out to establish multiple measures, or indices, of malingering within the forced-choice paradigm.
In 1994, Frederick and colleagues (Frederick et al., 1994) published a validation study of response bias using a forced-choice modification of the Test of Nonverbal Intelligence (Brown, Sherbenou, & Johnsen, 1982). This forced-choice modification of the TONI, following the general paradigm of Symptom Validity Testing, became the foundation for the nonverbal subtest of the VIP test. To complement the nonverbal portion of the test, the authors simultaneously developed a verbal, forced-choice subtest following the same basic principles for administration and assessment. The verbal and nonverbal subtests, together or independently, are intended to provide a broad spectrum of information about an individual's performance on an assessment battery and to indicate whether the testing should be considered representative of his or her true overall capacities. The Validity Indicator Profile is promoted as an objective measure to evaluate an individual's motivation and effort during cognitive testing.
In formulating the VIP test, Frederick conceptualized the factors affecting cognitive test validity as varying along two dimensions: motivation and effort. He described motivation as ranging from the intention to fail the testing at one extreme to the motivation to excel at the other, and effort as ranging from low to high. The combination of these two dimensions yields one of four classifications of response style, only one of which is considered valid. The valid classification combines the motivation to excel with high effort; it is only under these circumstances that an examiner can put full faith in the validity of the test results as an indication of the examinee's true functional abilities. When the motivation to excel is present but effort is poor, the patient is classified as careless. By contrast, a test-taking set in which the examinee applies high effort toward appearing impaired is classified as malingering. An irrelevant response style involves both poor effort and poor motivation, and can occur for a variety of reasons, including apathy, lack of examiner/examinee rapport, confusion, random responding, or patterned responding.
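Frederick's two-by-two model can be sketched as a simple lookup. This is purely an illustration of the taxonomy described above; the function name and the boolean encoding of the two dimensions are my own, not part of the VIP materials:

```python
def classify_response_style(motivated_to_excel: bool, high_effort: bool) -> str:
    """Map Frederick's two dimensions (motivation x effort) to one of the
    four response-style classifications described for the VIP (illustrative)."""
    if motivated_to_excel and high_effort:
        return "compliant"      # the only valid response style
    if motivated_to_excel:
        return "careless"       # wants to perform well, but effort is poor
    if high_effort:
        return "malingering"    # high effort applied to appearing impaired
    return "irrelevant"         # poor effort and poor motivation
```

Only the compliant cell of the grid supports interpreting the accompanying cognitive results at face value; the other three cells each call for a different interpretive caution.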
Hall and Pritchard (1996), among others, have criticized the traditional application of Symptom Validity Testing that relies exclusively on cutoffs based on statistically worse-than-chance performance. This method may produce excessively high false negative rates, missing a substantial portion of individuals who employ variable test-taking strategies or who do not exaggerate their impairment enough to be detected. The VIP test attempts to overcome this criticism by incorporating multiple measures of test performance rather than relying simply on an index of performance in comparison with pure chance. For example, the VIP test incorporates both verbal and nonverbal measures of response style to maximize opportunities to detect invalid responding, which may vary depending on the test administered. The VIP test also incorporates a gradient of difficulty: within both the verbal and nonverbal subtests there is a full range of item difficulty, from easy items on which correct performance is typically well above 80%, even for genuinely impaired individuals with documented head injury or brain damage, to much more difficult items on which the rate of correct responding deteriorates to 30% because of a built-in pull toward incorrect responses in compliant subjects. This gradient of difficulty helps to reveal the noncompliant, random responding of the poorly motivated subject or the examinee making less than a reasonable effort.
The VIP test is a 178-item instrument administered in paper-and-pencil format. It comprises a 100-item nonverbal subtest and a 78-item verbal subtest, which can be administered individually or in combination. Scoring and generation of interpretive reports are available through Pearson Assessments via mail-in scoring, fax scoring, and the MICROTEST™ Q Assessment System software for on-site scoring and interpretation. Prices range from $13 to $16 per report depending on volume and scoring method. A word of warning for the hurried, cost-conscious psychologist: the price of the scoring and interpretive report more than triples for the quick turnaround available by fax transmission, and these added costs are not identified in the promotional literature.
The VIP test is presented to the test subject as a measure of cognitive ability. The spiral-bound test booklet contains two practice items and the 100 forced-choice items making up the nonverbal subtest. These items were borrowed from the TONI and essentially involve picture-matrix problems. The directions for this portion of the test advise examinees that it is a test of nonverbal ability and that they will be solving picture puzzles by choosing the picture or pictures that best complete each puzzle. Examinees are to select the correct answer from each pair of two choices and are warned that they may not know the answers to all 100 items, but that when they do not, they should at least make their best guess. Examinees are discouraged from leaving any items blank and are not given any feedback regarding the accuracy of their responses. The manual suggests that the nonverbal subtest typically takes approximately thirty minutes to complete.
Upon completion of the nonverbal subtest, the spiral test booklet is removed, and the back side of the four-page answer sheet presents the instructions for the verbal subtest and each of its 78 items. For each verbal subtest item, the examinee is presented with a stimulus word ranging from very familiar to very difficult and is instructed to choose the one of two adjacent words that is more similar in meaning to the stimulus word. The 78-item verbal subtest is estimated to take approximately twenty minutes to complete; thus administration of the entire VIP test can reasonably be expected to take approximately one hour.
The VIP test is partly derived from the Symptom Validity Test approach developed by Pankratz in the 1970s (Pankratz, Fausti, & Peed, 1975), and its development was stimulated in part by the recommendations of Faust et al. (1988). The author's objective was to extend the SVT approach beyond the limitations of assessment and interpretation confined to below-chance responding, and to increase sensitivity for the identification of malingering and other invalid response styles.
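The below-chance logic that classic SVT relies on can be illustrated with an exact binomial tail probability. This is a sketch of the general statistical reasoning, not the VIP's scoring; the item counts are illustrative:

```python
from math import comb

def below_chance_p(score: int, n_items: int, p_chance: float = 0.5) -> float:
    """Exact binomial probability of getting `score` or fewer items correct
    by guessing alone on a two-alternative forced-choice test."""
    return sum(comb(n_items, k) * p_chance**k * (1 - p_chance)**(n_items - k)
               for k in range(score + 1))

# On a 100-item two-choice test, 38 or fewer correct falls well below the
# chance expectation of 50; the tail probability is under .05, suggesting
# deliberate failure rather than genuine inability plus guessing.
print(below_chance_p(38, 100) < 0.05)  # True
```

As the review notes, confining interpretation to this tail test is highly specific but insensitive: a feigner who keeps his or her score at or near 50% correct is never flagged, which is precisely the gap the VIP's additional indicators aim to close.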
The initial development of the VIP test involved a total sample of 1,048 subjects, of whom 104 were classified as clinical participants; the remaining 944 were classified as non-clinical participants and were drawn from populations of college students and employees of the test publisher. The development sample ranged in age from 15 to 71 years; however, the vast majority of subjects were relatively young, in the range of 18 to 25. The clinical participants were described as adults undergoing neuropsychological evaluation, some of whom were actively involved in litigation at the time of the evaluation. These subjects took the VIP test in the context of a larger battery of cognitive and neuropsychological tests. By contrast, the non-clinical group, for the most part, took the VIP test only, and the vast majority of these subjects (N = 909) took only the nonverbal portion.
The non-clinical subjects were randomly assigned to either compliant or noncompliant criterion groups. The researchers instructed the compliant criterion group to give their best effort on the test battery, whereas the noncompliant criterion group was instructed to fake believable cognitive impairment. The noncompliant sample was further subdivided into naive malingering and informed malingering groups: the naive malingering group was instructed to fake believable cognitive impairment without being obvious, and the informed malingering group was additionally provided with specific strategies to avoid detection. The sample of coached normals was supplemented by a set of computer-generated cases intended to represent entirely random responding. The clinical participants were subdivided on the basis of a combination of subjective and objective assessments of their compliance. Of the 110 clinical participants, 49 were classified as suspected malingerers on the basis of performance on three screening tests of malingering (the Rey Memory Test, the Rey Word Recognition Test, and the Rey Dot Counting Test; Lezak, 1995) and a priori clinician ratings of the likelihood that the patient would malinger. A clinical participant was ultimately classified as a suspected malingerer if either the clinician rating or the screening-test classification indicated invalid responding.
Based on this initial validation study, the authors developed six measures, referred to as Primary Validity Indicators, together with cut scores derived from these samples to discriminate between valid and invalid test performance. The cut scores were set so that 90% of the compliant participants would be correctly classified by the VIP test (i.e., a 10% false positive rate).
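One common way to set a cut score that yields a specified false positive rate is to take the corresponding percentile of the compliant group's score distribution. The sketch below assumes that approach; the VIP manual's exact derivation may differ, and the function is my own illustration:

```python
def cut_score(compliant_scores: list, false_positive_rate: float = 0.10) -> float:
    """Return a cutoff such that roughly `false_positive_rate` of the
    compliant sample falls strictly below it (i.e., ~90% are classified
    as valid). A hypothetical percentile-based derivation."""
    ordered = sorted(compliant_scores)
    return ordered[int(len(ordered) * false_positive_rate)]

# With 100 compliant scores of 1..100, the cutoff is 11: exactly 10 of the
# 100 compliant scores (10%) fall strictly below it.
print(cut_score(list(range(1, 101))))  # 11
```

The trade-off is the usual one: lowering the cutoff reduces false positives among genuinely compliant examinees but lets more noncompliant examinees pass undetected.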
Following the initial validation study, the authors conducted a cross validation study utilizing 152 non-clinical participants, 61 patients with well documented traumatic brain injury, 49 suspected malingerers, and 40 adults with diagnosed mental retardation. Clinical participants were paid between $25 and $40 for completing the study, and a $40 contribution was made to the school of the non-clinical participants. To boost motivation of the non-compliant criterion group in the cross validation study, these subjects were offered an opportunity to earn an additional $50 contribution to their school if they could fake believable impairment and not be detected by the VIP test.
The results of the validation study are reported in the manual for the individual subtests, for agreement between subtests, for diagnostic efficiency, and in comparison with other measures of malingering. The results indicate that 93% of the non-clinical participants instructed to respond honestly were correctly classified by the VIP test as exhibiting a valid response style on the nonverbal subtest. Of the clinical participants, 74% were correctly classified as valid responders; the authors add, however, that 14 of the 16 clinical participants incorrectly classified as invalid responders were classified as such because of a careless response style, as opposed to a malingered or irrelevant one. The implication is that genuinely brain-injured subjects can respond carelessly, perhaps as a combination of genuine impairment and lack of effort, or genuine impairment causing lack of effort.
In the noncompliant group, all 50 of the computer-generated random protocols were correctly classified as invalid, with 96% of these being more specifically classified as the product of an irrelevant response style. Seventy-five percent of the coached-normal protocols (i.e., those of subjects instructed in how to feign impairment) were correctly classified as invalid. The least effective discrimination was among the 49 suspected malingerers, of whom only 45% were correctly classified as invalid on the nonverbal subtest, most of them classified as responding in a careless style.
The results on the verbal subtest were quite similar to those on the nonverbal subtest for all classifications except the suspected malingerer group. The VIP verbal subtest proved even less effective than the nonverbal subtest in correctly classifying this group, with only 27% of the suspected malingerers producing results classified as invalid. The overall agreement between the subtests was 85.1%, corresponding to a correlation coefficient of .70.
The overall efficiency of the VIP test in classifying the initial validation sample as valid or invalid responders yielded, for the nonverbal subtest, sensitivity of 73.5% and specificity of 85.7%; the nonverbal subtest correctly classified a total of 79.8% of the sample. For the verbal subtest, sensitivity was 67.3% and specificity was 83.1%, resulting in a correct classification rate of 75.5%.
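These diagnostic-efficiency figures follow from the standard definitions, which can be made concrete with a short sketch. The counts below are illustrative, chosen to reproduce the reported nonverbal sensitivity and specificity; they are not the actual sample sizes, and the overall rate differs slightly from the reported 79.8% because the study's two groups were not equal in size:

```python
def diagnostic_efficiency(tp: int, fn: int, tn: int, fp: int) -> tuple:
    """Sensitivity, specificity, and overall correct-classification rate
    for a validity indicator, treating an invalid protocol as 'positive'."""
    sensitivity = tp / (tp + fn)              # invalid protocols flagged invalid
    specificity = tn / (tn + fp)              # valid protocols classified valid
    overall = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, overall

# Hypothetical counts reproducing the nonverbal subtest's reported rates:
sens, spec, overall = diagnostic_efficiency(tp=735, fn=265, tn=857, fp=143)
print(round(sens, 3), round(spec, 3), round(overall, 3))  # 0.735 0.857 0.796
```

Because the overall rate is a sample-size-weighted blend of sensitivity and specificity, it will shift toward whichever group (valid or invalid responders) dominates the sample, which is why it should not be compared across studies with different base rates.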
The VIP test results were compared with several other measures of malingering, including the Portland Digit Recognition Test (Binder, 1993), the Rey Memory Test (Rey, 1958), the Word Recognition Test (Rey, 1941), and the Dot Counting Test (Rey, 1941). Although the correlations were generally low (.07 to .20 between the VIP nonverbal subtests and alternate measures of malingering, and .04 to .16 between the VIP verbal subtests and alternate measures of malingering), the agreement between test classifications of valid versus invalid was reasonably high, ranging from 69% to 73% agreement for each of the VIP subtests.
The low rate of agreement between the VIP subtests and other measures of malingering is dismissed by the author on the grounds that these alternative measures may be specific in identifying malingering but are quite weak in terms of sensitivity. This does appear to be one of the unique advantages of the VIP test: its sensitivity ranges from 67.3% to 73.5%, versus 17% or less for the alternate measures used in this study. However, it is worth pointing out that the alternate measures of malingering used in this validation study do not necessarily reflect the state of the science in tests of malingering. Other, more recent and sophisticated measures may have much greater sensitivity (e.g., the Word Memory Test, Green, Astner, & Allen, 1995; the Colorado Malingering Test, Davis et al., 1994; the Test of Memory Malingering, Tombaugh, 1996; see review in this issue).
The manual for the VIP test recommends that test results first be interpreted via the Primary Validity Indicators, which allow for the initial classification of the test response style as valid or invalid. The Primary Validity Indicators include three consistency measures and three interaction measures, for each of which there is a derived score and classification as either pass or fail. The consistency measures include a consistency ratio, nonconformity index, and individual consistency index. The interaction measures include score by correlation, slope by consistency ratio, and curvature.
The manual indicates that each of these Primary Validity Indicators was designed to detect departures from an expected progression of responses. When the test taker responds with good effort and the motivation to perform well throughout the entire test, performance is high for items within his or her range of ability and essentially random for items outside that range. Each of these measures captures a different element of deviation from this expected pattern.
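The expected pattern can be caricatured with a toy model, purely my own illustration and not the VIP's actual scoring model: a compliant examinee's accuracy is near ceiling for items within his or her ability and drops to chance beyond it, whereas a random responder sits flat at chance across all difficulty levels:

```python
def expected_accuracy(difficulty: float, ability: float, chance: float = 0.5) -> float:
    """Toy model of a compliant examinee on two-choice items: near-ceiling
    accuracy within ability (difficulty on a 0-1 scale), chance beyond it."""
    return 0.95 if difficulty <= ability else chance

# A compliant examinee's performance curve falls as difficulty increases;
# a random responder would produce 0.5 everywhere, betraying poor effort.
curve = [expected_accuracy(d / 10, ability=0.6) for d in range(10)]
print(curve[0], curve[-1])  # 0.95 0.5
```

A flat curve at chance, or a curve that sags on easy items while holding up on hard ones, departs from this expected progression and is the kind of deviation the consistency and performance-curve measures are built to flag.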
It is recommended that if the protocol is classified as invalid on the basis of the Primary Validity Indicators, the examiner then analyze its other characteristics, including the performance curve measures, which may reflect the particular style of responding that resulted in the invalid classification.
The scoring options are limited to the mail-in service and computerized scoring, so it is essentially impossible for individual examiners to score the test and generate an interpretation on their own. While this may not be especially desirable, the interpretive report generated by Pearson Assessments is concise, appropriate, and offers sufficient information for the examiner to incorporate the VIP test results into the overall diagnosis and interpretation of the rest of the neuropsychological battery.
A particularly strong and useful part of the interpretive report is the graph plotting the examinee's performance curve with the proportion of correct responses on the Y axis and the running mean serial position on the X axis. The plot of the examinee's curve versus the expected performance curve is informative and would be useful, particularly in forensic settings, for illustrating deviations from normal, expected performance.
In summary, the VIP test represents a well-conceived, sophisticated, and unobtrusive instrument for the assessment of response style and the validity of test-taking attitude. It is appropriate for use in the context of neuropsychological and cognitive assessment. It is well designed, easy to administer, and generally cost effective, assuming that rush processing can be avoided. The VIP test is particularly useful in identifying the specific response style adopted by examinees, and it allows for solid inferences regarding the results of neuropsychological testing. It seems to circumvent some of the problems of other measures based on Symptom Validity Testing, which are so easy that they are readily recognized by sophisticated or coached malingerers, diminishing their overall effectiveness. As with any of the instruments now available for the assessment of malingering, it is recommended that the VIP test not be used alone as the sole measure for determining response style, validity, or dissimulation. In conjunction with other instruments intended for this purpose, it offers a potentially very valuable addition to the armamentarium of the careful and competent clinician and deserves strong consideration for inclusion in future research concerning the validity of neuropsychological testing.
About the author of this review
Michael Gamache was awarded a doctoral degree in clinical psychology from the University of Missouri at Columbia. He has a private practice in Tampa focused on forensic psychology and neuropsychology. He can be reached at firstname.lastname@example.org or 500 N. Westshore Blvd., Suite 520, Tampa, FL 33609.
About the American Psychology Law Society
Comments or questions regarding the American Psychology Law Society or its newsletter can be directed to Randy Otto at: email@example.com
References

Binder, L. (1993). Assessment of malingering after mild head trauma with the Portland Digit Recognition Test. Journal of Clinical and Experimental Neuropsychology, 15, 170–182.
Brown, L., Sherbenou, R., & Johnsen, S. (1982). Test of Nonverbal Intelligence: A language-free measure of cognitive ability. Austin, TX: Pro-Ed.
Davis, H., King, J., Bajszar, G., & Squire, L. (1994). Colorado Malingering Test. Colorado Springs, CO: Colorado Neuropsychology Tests, Co.
Faust, D., Hart, K., Guilmette, T., & Arkes, H. (1988). Neuropsychologists' capacity to detect adolescent malingerers. Professional Psychology: Research and Practice, 19, 508–515.
Frederick, R., Sarfaty, S., Johnston, J., & Powel, J. (1994). Validation of a detector of response bias on a forced-choice test of nonverbal ability. Neuropsychology, 8, 118–125.
Frederick, R., & Foster, H. (1991). Multiple measures of malingering on a forced-choice test of cognitive ability. Psychological Assessment, 3, 596–602.
Green, W., Astner, K., & Allen, L. (1995). The Word Memory Test: A manual for the oral and computer-administered forms. Durham, NC: Cognisyst, Inc.
Greiffenstein, M., Baker, W., & Gola, T. (1994). Validation of malingered amnesia measures with a large clinical sample. Psychological Assessment, 6, 218–224.
Griffin, G., Normington, J., May, R., & Glassmire, D. (1996). Assessing dissimulation among Social Security disability income claimants. Journal of Consulting and Clinical Psychology, 64, 1425–1430.
Hall, H., & Pritchard, D. (1996). Detecting malingering and deception: Forensic distortion analysis. Florida: St. Lucie Press.
Lezak, M. (1995). Neuropsychological Assessment (Third Edition). New York: Oxford University Press.
Pankratz, L., Fausti, S., & Peed, S. (1975). A forced-choice technique to evaluate deafness in the hysterical or malingering patient. Journal of Consulting and Clinical Psychology, 43, 421–422.
Rey, A. (1941). L'examen psychologique dans les cas d'encéphalopathie traumatique. Archives de Psychologie, 28, 286–340.
Rey, A. (1958). L'examen clinique en psychologie. Paris: Presses Universitaires de France.
Rogers, R., Harrell, E. H., & Liff, C. D. (1993). Feigning neuropsychological impairment: A critical review of methodological and clinical considerations. Clinical Psychology Review, 13, 255–274.
Rogers, R. (1988). Clinical assessment of malingering and deception. New York: The Guilford Press.
Tombaugh, T. (1996). Test of Memory Malingering. Toronto, Canada: Multi-Health Systems.