We conducted a large, single-center prospective study of the performance characteristics of CSF 14-3-3, tau and S100B in a clinical population with low average pre-test probability of sCJD (~13%). A number of our findings are relevant to the practising clinician, who may need to make diagnostic decisions in a range of clinical contexts with varying amounts of pre-existing information, and we discuss some of these implications below. Methodologically, however, we first note that although sensitivity, specificity, and predictive values are convenient summary statistics for understanding the accuracy of diagnostic tests, a more general, versatile and practical approach employs likelihood ratios (LRs) as conversion factors between pre- and post-test odds, which can in turn easily be converted to diagnostic probabilities [20, 21]. LRs offer a number of advantages, as they (i) make use of all of the information in a 2 × 2 table; (ii) are less dependent than predictive values on disease prevalence; (iii) enable sequential modification of diagnostic probabilities for individual patients in light of accruing information; and (iv) support the more complete extraction of diagnostic information from quantitative test data by calculating multi-level (interval) LRs [21, 28]. They can also serve as convenient generic indices for performance comparisons among diagnostic tests, even those with fundamentally different principles or formats, with LR+ values in the ranges 1-2, 2-5, 5-10, and > 10 representing non-useful, low, moderate and high diagnostic power, respectively; for LR- values the corresponding ranges are 0.5-1, 0.2-0.5, 0.1-0.2, and < 0.1.
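As an illustration of the LR ranges quoted above, the following minimal Python sketch maps an LR value to its diagnostic power category. The function name `lr_power` is hypothetical, and the treatment of values falling exactly on a range boundary (assigned here to the stronger category, except that exactly 10 or 0.1 counts as moderate) is an assumption, since the ranges as stated share their endpoints.

```python
def lr_power(lr: float) -> str:
    """Classify a likelihood ratio using the ranges quoted in the text.

    LR- values (below 1) are folded onto the LR+ scale by taking the
    reciprocal, since 1/0.5 = 2, 1/0.2 = 5 and 1/0.1 = 10.
    Boundary handling is an assumption made for this sketch.
    """
    if lr <= 0:
        raise ValueError("a likelihood ratio must be positive")
    strength = lr if lr >= 1 else 1 / lr
    if strength > 10:
        return "high"
    if strength >= 5:
        return "moderate"
    if strength >= 2:
        return "low"
    return "non-useful"


# LR+ point estimates reported in this study
for marker, lr in [("14-3-3", 3.1), ("tau", 7.4), ("S100B", 6.6)]:
    print(marker, lr_power(lr))
```

Folding LR- onto the LR+ scale keeps a single set of cut points, which works because the two sets of ranges quoted above are reciprocals of one another.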
A number of groups have studied CSF 14-3-3, tau and S100B in sCJD patients and non-CJD controls [13–15, 29–36], and some use has been made of LRs. However, to our knowledge this is the first use of LRs to estimate and compare the power of these markers to screen for sCJD in a large, low-prevalence patient population. Our estimates indicated clearly that a positive result for 14-3-3, with its LR+ estimate of 3.1 (95% CI: 2.8-3.6), offers low diagnostic power relative to tau and S100B, with their LR+ estimates of 7.4 (95% CI: 6.9-7.8) and 6.6 (95% CI: 6.1-7.1), respectively. This result is attributable to the lower specificity (0.72) of a positive 14-3-3 test result in our patient population. A similarly low estimate of specificity (0.74) for 14-3-3 was also reported recently in a clinical population with ~4.5-fold higher pre-test probability (i.e., prevalence) of sCJD (~59%). Noting this, and because our estimates of sensitivity for tau and S100B (0.91 and 0.87, respectively) were comparable to those of 14-3-3 but specificities of these two markers (0.88 and 0.87, respectively) were significantly higher in the same patient population, we interpret the observed performance disparities to be more likely due to inherent differences among the markers than to study-specific patient selection, technical factors or choice of scoring thresholds.
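The relationship between the sensitivities and specificities quoted above and the corresponding LRs can be sketched as follows. This is a minimal Python illustration using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity; the function name is hypothetical, and because the inputs are the rounded values given in the text, the outputs reproduce the reported LR estimates only approximately.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) for a dichotomous test."""
    if not (0 <= sensitivity <= 1 and 0 < specificity < 1):
        raise ValueError("sensitivity must be in [0, 1], specificity in (0, 1)")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Rounded sensitivity/specificity values quoted in the text
tau_lr_pos, tau_lr_neg = likelihood_ratios(0.91, 0.88)
s100b_lr_pos, s100b_lr_neg = likelihood_ratios(0.87, 0.87)
print(f"tau:   LR+ ~ {tau_lr_pos:.1f}, LR- ~ {tau_lr_neg:.2f}")
print(f"S100B: LR+ ~ {s100b_lr_pos:.1f}, LR- ~ {s100b_lr_neg:.2f}")
```

Because the denominator of LR+ is 1 - specificity, a modest drop in specificity (for example from 0.88 to 0.72) lowers LR+ substantially even when sensitivity is unchanged.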
We also presented quantitative evidence that tau and S100B each show moderate power to modify diagnostic probabilities of sCJD at their respective optimal cutoff thresholds, and high power when the quantitative assay result is taken into account with the use of interval likelihood ratios calculated at different thresholds. Thus, the extremely high CSF tau concentrations (> 10,000 pg/mL) observed in 34% (41/120) of the sCJD patients tested yielded an interval LR of 56.4, which converts a pre-test sCJD probability of 0.13 to a post-test probability of 0.89. Conversely, very low CSF tau concentrations (< 500 pg/mL) corresponded to an interval LR of 0.06, which converts the same pre-test probability to a post-test probability of 0.01. Regarding the choice of threshold value, we found that in our patient population total diagnostic accuracy as defined by a maximized Youden Index was robustly optimal at values of 976 pg/mL for tau, and 2.5 ng/mL for S100B. Both of these values differ significantly from the widely used consensus thresholds of 1300 pg/mL and 4.2 ng/mL for tau and S100B respectively using the same ELISA kits. However, selection of optima that would be generally applicable among laboratories requires further study.
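The probability conversions described above follow from the odds form of Bayes' theorem: post-test odds = pre-test odds × LR. A minimal Python sketch, with a hypothetical function name, applies the two interval LRs quoted in the text to the study's pre-test probability of 0.13.

```python
def post_test_probability(pre_test_probability: float, lr: float) -> float:
    """Convert a pre-test probability to a post-test probability via an LR."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if lr < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


pre_test = 0.13
# Interval LRs reported for tau > 10,000 pg/mL and tau < 500 pg/mL
print(round(post_test_probability(pre_test, 56.4), 2))  # 0.89
print(round(post_test_probability(pre_test, 0.06), 2))  # 0.01
```

The same function can be applied with a different pre-test probability, which is how an LR estimated in one population can be carried over to a patient whose prior probability of disease differs.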
Strikingly, tau and S100B values jointly above their respective optimal thresholds yielded a joint LR+ of 18.0, significantly higher than achieved by either individual marker at the same thresholds. This effect may reflect the combined severity of distinct underlying pathogenetic processes in sCJD, as accumulation of CSF tau and 14-3-3 proteins is believed to indicate rapid neuronal death, while S100B is a largely extraneuronal protein actively secreted from glial cells, suggesting it is primarily a marker of astrogliosis. Point estimates of LR- were also lowered by combining tau and S100B, decreasing from 0.10 and 0.15 for tau and S100B respectively at their individual optimal thresholds to 0.03 at the corresponding bivariate threshold. Similar effects were observed in another recent study, although these were not quantified in terms of LRs.
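For context, if two markers were conditionally independent given disease status, their LRs would simply multiply. The short Python sketch below compares that naive product with the joint LR+ reported above. The independence assumption is introduced here purely for illustration and is not a claim made in the study; the comparison only shows why a joint LR is better estimated directly from bivariate data than by multiplying single-marker estimates.

```python
# Point estimates quoted in the text
lr_pos_tau = 7.4
lr_pos_s100b = 6.6
joint_lr_pos_reported = 18.0

# Naive product, valid only under conditional independence
# (an assumption made for this illustration, not by the study)
naive_product = lr_pos_tau * lr_pos_s100b

print(f"naive product of single-marker LR+: {naive_product:.1f}")
print(f"reported joint LR+:                 {joint_lr_pos_reported:.1f}")
print("joint exceeds either marker alone:",
      joint_lr_pos_reported > max(lr_pos_tau, lr_pos_s100b))
print("joint is below the naive product:  ",
      joint_lr_pos_reported < naive_product)
```

That the reported joint value lies between the single-marker LRs and their product would be consistent with the two markers carrying partly overlapping diagnostic information, although other explanations (such as threshold effects) are possible.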
All studies of diagnostic test performance are potentially subject to limitations of design or execution that can lead to imprecision or bias, and thus to an inability to interpret or generalize results. We believe that our large, prospective, sequential, autopsy-based design reduced many of these potential effects, but two specific questions merit discussion. One of these concerns accuracy of case classification. For example, for 55 of our 127 patients with autopsy-confirmed prion disease, despite a CJD-like clinicopathological phenotype and no family history of a similar disease, DNA sequencing information was not available. Conceivably, some of these may have been cases of genetic prion disease rather than sCJD. However, we note that 3 of 77 genetically analyzed cases with a CJD phenotype proved to carry a PRNP mutation (E200K in each case); application of the resulting estimate of gCJD prevalence in the study population [3/77 = 0.039 (95% CI: 0.013-0.108)] to the 55 cases without genetic information yields an expected number of unrecognized gCJD cases of 2.1 (95% CI: 0.7-5.9), or < 5% (5.9/127) of the sCJD group. Noting also that diagnostic sensitivity and specificity of CSF 14-3-3, tau and S100B in gCJD are similar to those seen in sCJD, we believe that this residual uncertainty is unlikely to detract significantly from our main conclusions. Similarly, the low autopsy rate in our non-sCJD group suggests that some of these 873 cases may indeed have had prion disease. However, we also expect this proportion to be small. More specifically, during the 6 calendar years (2004-2009) overlapping the current study interval for which surveillance data are complete, the CJDSS reported annual sCJD mortality rates in Canada of 1.32, 1.30, 1.20, 0.96, 1.21 and 1.36 per million, respectively (mean, 1.23), suggesting a low, albeit undetermined, number of undetected sCJD cases.
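The estimate of unrecognized gCJD cases above can be retraced with a short calculation. The Python sketch below assumes a Wilson score interval for the proportion 3/77; the text does not state which interval method was used, so this choice is an assumption, made because it gives bounds close to those reported. Function and variable names are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (default z gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


carriers, genotyped, untyped = 3, 77, 55  # counts quoted in the text

prevalence = carriers / genotyped
low, high = wilson_interval(carriers, genotyped)

# Scale the proportion and its bounds to the cases lacking genetic data
expected = untyped * prevalence
expected_low, expected_high = untyped * low, untyped * high

print(f"gCJD prevalence: {prevalence:.3f} (95% CI: {low:.3f}-{high:.3f})")
print(f"expected unrecognized gCJD cases: {expected:.1f} "
      f"({expected_low:.1f}-{expected_high:.1f})")
```

Scaling the unrounded upper bound by 55 gives a value marginally above the reported 5.9, which is what multiplying the rounded bound 0.108 by 55 yields; either way the upper limit remains below 5% of the 127 cases.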
A second potential issue concerns the limited control that a reference laboratory such as ours has over pre-analytic factors related to sample collection, storage and handling that can in principle affect analytic results. Previous studies have suggested that 14-3-3 and tau proteins are unusually stable in CSF, yielding highly comparable results with the same methods we have employed when samples were subjected to ambient temperatures and/or repeated freeze-thaw cycles [29, 40]. Similar methodological studies on the stability of S100B in CSF appear to be lacking, but our estimates of diagnostic sensitivity in our patient population (ca. 90% for all three of the studied markers) suggest that losses of sample reactivity caused by suboptimal pre-analytic conditions did not have a large deleterious effect on the interpretability of our study. That said, it is conceivable that suboptimal pre-analytic sample handling had quantitative effects in some cases, perhaps explaining some false-negative results or even lowering the estimated optimum cutoff thresholds. Future studies should address these questions.
Lastly, if characteristics of a study population do not adequately reflect those of the patient population to which the clinician wishes to apply the resulting information, it can be difficult to generalize to clinical practice, an effect sometimes called "spectrum bias", or simply "spectrum effect". Because we studied a heterogeneous population of patients sharing a broad common rationale for CSF testing (i.e., suspicion of sCJD), we suggest that our overall results, which represent a weighted average of test performance characteristics for all constituent patient subgroups, could prove useful over a broad range of clinical situations. However, we have also provided one illustrative example of how the clinician can further enhance the diagnostic power of CSF markers when the suspected sCJD patient can be placed into a particular well-represented subgroup, for example according to membership in an etiologically defined disease category (e.g., neurodegenerative disease), thereby refining the pre-test probability. Using this subgroup criterion with tau protein as the example, we demonstrated that, with nearly identical test cutoff thresholds, PTP+ values rose from 0.52 (95% credible interval: 0.45-0.58) to 0.85 (95% credible interval: 0.79-0.91) for patients judged to have neurodegenerative dementia. Given that clinical examination and diagnostic investigations commonly undertaken for subacute encephalopathies should often enable the placement of a particular patient into such a subgroup, this type of illustrative example may prove relevant to clinical practice by helping to better define the meaning of "appropriate clinical context" in relation to use of CSF markers to diagnose sCJD.
As we found that 14-3-3 performed least well among the 3 individual markers studied and that the combination of tau and S100B yielded as much information as all 3 markers combined, focusing on tau and S100B among existing CSF markers may be sufficient for most clinical investigations of sCJD. It may also be appropriate to consider formally incorporating tau and S100B into enhanced WHO surveillance case definitions for sCJD. Apart from diagnostic power, another important criterion of marker utility is availability of a suitable technical format. This is particularly relevant for 14-3-3 proteins, which continue to be assayed using immunoblot methods [18, 43] that are inherently difficult to control, optimize and standardize in comparison with ELISA. Although ELISA-format 14-3-3 immunoassays with demonstrated diagnostic utility have been developed [32, 44–46], these have not yet seen widespread use or commercial distribution.