Neuropsychological assessment is more than just administering a number of tests to a patient. Although each test is presented and scored in a standardized manner, raw test results have no meaning in themselves. Only a few tests measure functions at a ceiling level (that is, unimpaired individuals obtain near-maximal scores), allowing immediate interpretation. However, ceiling effects in test performance often result in a lack of sensitivity. Subtle impairments are easily missed, resulting in a high proportion of false-negative cases (people classified as cognitively unimpaired when in fact they are impaired). Hence, good neuropsychological tests are sensitive across a range of functioning, both at very low levels of cognitive functioning and in people with above-average cognitive abilities. In addition, test results are always affected by confounding factors such as age, intelligence, education level, sex, cultural background, and native language. Consequently, performance on neuropsychological tests cannot be classified as impaired using a simple cut-off score; instead, normative data are required that adjust for these confounding factors. Extensive normative datasets are available for the widely used neuropsychological tests, published either in test manuals or in comprehensive textbooks (1, 2, 61). Basically, two types of norm sets exist: stratified norms and regression-based norms. In stratified norm sets, an individual test score is compared to the mean performance of a matched norm group, for example, people of comparable age and education level. In contrast, in regression-based norms the individual's expected score is computed from a number of potentially confounding variables, such as age, IQ, and gender, by means of a regression formula. The difference between the individual's actual and expected score (the residual score) is then compared to a frequency table to determine the probability that this residual score is found in a normal population.
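The regression-based approach can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the regression weights, the choice of age and years of education as predictors, and the small table of normative residual scores are invented and do not come from any published norm set.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical regression weights (illustrative only, not from a real norm set).
INTERCEPT = 50.0
B_AGE = -0.25        # change in expected score per year of age
B_EDUCATION = 1.5    # change in expected score per year of education

# Made-up residual scores (actual minus expected) from an imaginary norm group.
NORMATIVE_RESIDUALS = [
    -12.0, -9.0, -7.0, -5.0, -4.0, -3.0, -2.0, -1.0, 0.0, 0.0,
    1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0, 12.0,
]


def expected_score(age: float, education_years: float) -> float:
    """Expected raw score for a person with these characteristics."""
    return INTERCEPT + B_AGE * age + B_EDUCATION * education_years


def residual_percentile(residual: float, normative_residuals: Sequence[float]) -> float:
    """Percentage of the norm group with a residual at or below the given one."""
    ordered = sorted(normative_residuals)
    return 100.0 * bisect_right(ordered, residual) / len(ordered)


# A 70-year-old with 10 years of education who obtains a raw score of 38.
expected = expected_score(age=70, education_years=10)
residual = 38.0 - expected
percentile = residual_percentile(residual, NORMATIVE_RESIDUALS)
print(f"expected={expected}, residual={residual}, percentile={percentile}")
```

In this invented example the expected score is 47.5, the residual score is -9.5, and only 1 of the 20 normative residuals is that low or lower, which places the performance at the 5th percentile of the hypothetical norm group.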
Both stratified and regression-based norms convert the raw score that an individual obtains on a specific task to a standard score that is corrected for factors such as age, education level, intelligence, and sex. This standard score can subsequently be interpreted using the normal distribution, often referred to as the bell curve (see Fig. 4), which indicates the probability that a given performance occurs in a normal population. For example, z-scores have a mean of 0 and a standard deviation (SD) of 1, which means that a performance that equals a z-score of -1.5 is 1.5 standard deviations below the normative mean. Figure 4 shows that this corresponds to approximately the 7th percentile, indicating that 7% of the normative group (which represents the normal population) obtains this (or a lower) score. Clinically, the cut-off point that is used for determining whether a given performance is "impaired" depends on this percentile range. There is, however, no strict rule for which cut-off point should be used. In most cases, a performance more than 2 SDs below the mean is used, which corresponds to the 2.3rd percentile (1), but less conservative cut-off points are also widely used, such as 1.5 SD or 1.65 SD (62), the latter corresponding to the generally accepted 5% probability of incorrectly rejecting a statistical hypothesis (equaling the 5th percentile). Test scores that fall more than 1 SD below the normative mean, but above the cut-off score for an impaired performance, are typically classified as below average (but not impaired).
Whether below-average performances are clinically significant cannot be determined on the basis of statistical analyses alone. As a rule of thumb applied by most clinical neuropsychologists, below-average test scores are clinically significant if they (1) represent a consistent pattern (e.g., below-average scores on various memory tests and average scores on other tests, or a below-average performance on one executive function test and impaired scores on other tests of executive functioning), (2) are related to the subjective complaints of either the patient or his/her significant other(s) (e.g., below-average performance on tests of speed of information processing in a patient who experiences a decline in mental speed), or (3) can be linked to typical neuropsychological findings in patients with a specific disease or syndrome (e.g., a below-average performance on attention tests in a patient with multiple sclerosis or a below-average performance on executive functioning in an older patient with type 2 diabetes). In addition, the premorbid performance level of a patient provides information about a possible decline in cognitive function. This level can be estimated using crystallized intelligence tests, such as the National Adult Reading Test (NART), or from education level, professional history, and socioeconomic background.
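The conversion from z-scores to percentiles and the resulting classification can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a clinical tool: the function names and the label "average or above" are chosen here for convenience, the default cut-off of 2 SDs is only one of the options mentioned above, and the sketch assumes that standard scores follow the standard normal distribution.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def percentile_from_z(z: float) -> float:
    """Percentage of the normal population scoring at or below this z-score."""
    return 100.0 * STANDARD_NORMAL.cdf(z)


def classify(z: float, impaired_cutoff: float = -2.0) -> str:
    """Label a z-score; the cut-off is a choice, not a fixed rule."""
    if z < impaired_cutoff:
        return "impaired"
    if z < -1.0:
        return "below average"
    return "average or above"  # label assumed here for the remaining scores


for z in (-1.5, -1.65, -2.0):
    print(f"z = {z}: percentile = {percentile_from_z(z):.1f}")

print(classify(-1.5))                        # with the 2 SD cut-off
print(classify(-1.7, impaired_cutoff=-1.5))  # with the less conservative 1.5 SD cut-off
```

The loop reproduces the percentiles quoted in the text (about 6.7, 5, and 2.3), and the two calls to classify show that the same kind of score can be labeled "below average" or "impaired" depending on the cut-off that is chosen.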
Apart from using test results to classify a patient at an individual level, group differences can also be measured. Where statistical testing is performed to indicate whether two sample means differ significantly, a statistically significant finding does not automatically mean that such a between-group difference is clinically meaningful. The size of the group difference is relevant here: the larger the effect, the more relevant it will be for clinical practice. However, given the variety among neuropsychological tests and the variables that are used (e.g., reaction times, number correct, % incorrect), raw test scores are not suitable for assessing the magnitude of an effect. For the interpretation of a difference between two groups, a standardized effect size (d) is mostly used, expressed as the difference between the two means divided by the pooled standard deviation. In practice, a somewhat arbitrary classification is used, in which effect sizes are considered small if d is about 0.2, medium if d is about 0.5, and large if d is approximately 0.8 (63). Consequently, small effect sizes would probably not be clinically significant, in that it is unlikely that many individuals in the group would be classified as "impaired" at an individual level, whereas large effect sizes would indicate clinically meaningful differences, i.e., resulting in an "impaired" performance in many of the individuals.
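The standardized effect size can be computed as in the following Python sketch. The two groups of scores are invented for illustration, and the pooled standard deviation is computed here with the common formula that weights each group's sample variance by its degrees of freedom, which is an assumption about the exact variant intended.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Difference between two group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Invented test scores for a control group and a patient group.
controls = [52, 48, 50, 54, 46]
patients = [47, 43, 45, 49, 41]

d = cohens_d(controls, patients)
print(f"d = {d:.2f}")
```

In this made-up example the group means differ by 5 points and the pooled standard deviation is about 3.16, giving d of about 1.58, which would count as a large effect under the classification above.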