Avoiding Catch-22: validating the PainDETECT in a in a population of patients with chronic pain

Background Neuropathic pain is defined as pain caused by a lesion or disease of the somatosensory nervous system and is a major therapeutic challenge. Several screening tools have been developed to help physicians detect patients with neuropathic pain. These have typically been validated in populations pre-stratified for neuropathic pain, leading to a so called “Catch-22 situation:” “a problematic situation for which the only solution is denied by a circumstance inherent in the problem or by a rule”. The validity of screening tools needs to be proven in patients with pain who were not pre-stratified on basis of the target outcome: neuropathic pain or non-neuropathic pain. This study aims to assess the validity of the Dutch PainDETECT (PainDETECT-Dlv) in a large population of patients with chronic pain. Methods A cross-sectional multicentre design was used to assess PainDETECT-Dlv validity. Included where patients with low back pain radiating into the leg(s), patients with neck-shoulder-arm pain and patients with pain due to a suspected peripheral nerve damage. Patients’ pain was classified as having a neuropathic pain component (yes/no) by two experienced physicians (“gold standard”). Physician opinion based on the Grading System was a secondary comparison. Results In total, 291 patients were included. Primary analysis was done on patients where both physicians agreed upon the pain classification (n = 228). Compared to the physician’s classification, PainDETECT-Dlv had a sensitivity of 80% and specificity of 55%, versus the Grading System it achieved 74 and 46%. Conclusion Despite its internal consistency and test-retest reliability the PainDETECT-Dlv is not an effective screening tool for a neuropathic pain component in a population of patients with chronic pain because of its moderate sensitivity and low specificity. Moreover, the indiscriminate use of the PainDETECT-Dlv as a surrogate for clinical assessment should be avoided in daily clinical practice as well as in (clinical-) research. Catch-22 situations in the validation of screening tools can be prevented by not pre-stratifying the patients on basis of the target outcome before inclusion in a validation study for screening instruments. Trial registration The protocol was registered prospectively in the Dutch National Trial Register: NTR 3030. Electronic supplementary material The online version of this article (10.1186/s12883-018-1094-4) contains supplementary material, which is available to authorized users.


Background
The International Association for the Study of Pain defines neuropathic pain as "pain caused by a lesion or disease of the somatosensory nervous system" and states that "neuropathic pain is not a medical diagnosis but a clinical description which requires a demonstrable lesion or a disease that satisfies established neurological diagnostic criteria" [1]. In the clinical context it is better to speak of a present or an absent neuropathic pain component (present-or absent-NePC) with respect to so called mixed-pain conditions [2,3] in which neuropathic pain and nociceptive pain both exist. Clinically, NePC is considered to manifest specific symptoms and signs [4,5]. The classification of NePC is usually based on history and physical examination including (bedside-) sensory testing [6,7]. The correct classification of NePC is important for patients because NePC has a considerable impact on the quality of daily life [8] and for physicians since the treatment differs strongly from that of patients without NePC [6,9,10].
An easy to use and validated screening tool for clinical triage and epidemiological purposes could aid uniform classification and quantification of NePC and hence lead to better therapy, particularly when used by non-specialists [6][7][8][11][12][13][14][15].
The PainDETECT is such a patient friendly screening tool for the screening for neuropathic pain. It was originally developed and validated in Germany [2] based on two groups of patients (patients with pain of predominantly neuropathic origin or of predominantly nociceptive origin) with at least a 40% score on a visual analogue scale for pain (VAS;0-100). The gold standard used in this study was the assessment of the pain type based on the examination by two experienced pain specialists. This resulted in a percentage of correctly identified patients of 83% for neuropathic pain, a sensitivity of 85 and 80% specificity [2]. Subsequently, validation studies were performed in Spain [16], Turkey [17], Japan [18], India (Hindi) [19] and Korea [20]. Since the introduction of the PainDETECT this instrument has been used in many clinical and epidemiological studies [21]. In a Danish study, based on PainDETECT outcome, NePC was present [22] in about 40% of the patients with musculoskeletal pain.
In the above-mentioned validation studies [2,[16][17][18][19][20], the validity of the PainDETECT as a screening tool was performed in pre-stratified groups of patients based on the target outcome (pain of predominantly neuropathic origin or of predominantly nociceptive origin and limitation to pain scores). The inclusion of only patients with a known pain classification on forehand might lead to a prerequisite for the determination of validity of the Pain-DETECT. For this situation, the term "Catch-22" is used in the English language for" a problematic situation for which the only solution is denied by a circumstance inherent in the problem or by a rule" [23]. It was firstly described in Joseph Heller's novel Catch-22 which describes a general situation in which an individual has to accomplish two actions that are mutually dependent on the other action that must be completed first.
The objective of this study is to further validate the Pain-DETECT as a screening tool for use in daily outpatient practices for detecting a NePC. The current validation study is being conducted in a general patient population having common chronic pain syndromes, not pre-stratified on the target outcome: low back with leg pain (LBLP), neck-shoulder-arm pain (NSA pain) or a suspected peripheral nerve damage pain (suspected PND pain).

Methods
The study was conducted in a cross-sectional, observational, research design with two weeks and three months follow up to study the clinimetric quality (i.e. reliability and validity) of the PainDETECT. This study, to detect a NePC in patients suffering from chronic pain, was approved by the medical and ethical review board Committee on Research Involving Human Subjects region Arnhem-Nijmegen, Nijmegen, the Netherlands, Dossier number: 2008/348; NL 25343.091.08 and conducted in accordance with the declaration of Helsinki and the declaration of the World Medical Association. As required, written informed consent was obtained from patients prior to study participation. The protocol is registered in the Dutch National Trial Register: NTR 3030. The PainDETECT was translated and cross-culturally adapted into the Dutch language (PainDETECT -Dlv ) (© Pfizer Pharma GmbH 2005, Pfizer bv 2008. Cappelle a/d IJssel, the Netherlands) in a separate study [24] before the commencement of the present validation study. In this study, the same methodology was used as in the previously published protocol [25] and as employed in a simultaneous study regarding the validity of the DN4 [26] .

Patients
The patients were recruited from October 2009 until July 2013. Multicenter recruitment took place in the Netherlands in three academic centers specialized in pain medicine, three non-academic centers specialized in pain medicine and one non-academic department of neurology. The question to participate in the study was asked by the patients' own physician. At that moment they only had a provisional diagnosis: LBLP, NSA pain or pain due to a suspected PND (Conditions associated with a lesion of the peripheral somatosensory system). These three groups of patients include a majority of the patients referred towards an academic or peripheral pain clinic from the general practitioner. Patients had to be diagnosed for the initial cause of the pain as classified according to the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10)-2015-WHO Version 2015 [27]. Importantly, patients were not pre-stratified on the target outcome: the existence of NePC yes or no [28]. Patients, when willing to participate, were included when they met the following inclusion criteria: Male or female adult patients (> 18 years of age) with chronic (≥3 months) LBLP or NSA pain radiating into leg (s) or arm (s) respectively or patients with chronic pain due to a suspected PND. Exclusion criteria were: Patients diagnosed with an active malignant disorder, compression fractures, patients with diffuse pains (pains with an origin in muscles, bones or joints: such as fibromyalgia or ankylosing spondylitis), severe mental illness, chronic alcoholism or substance abuse, inability to fill in the questionnaire adequately or incapable of understanding the Dutch language.

Physicians' assessment
Patients were examined for the presence of NePC by two physicians which was considered to be the "gold standard" in this study. The physicians (pain specialists, pain specialist in training or neurologists always operated in differently composed pairs) worked independently from each other and were blinded to the classification made by the other physician. The physicians were not selected on basis of age, years of experience as a physician or other criteria. A full medical history was taken followed by a thorough clinical examination. A bedside examination (touch, pin prick, pressure, cold, heat and temporal summation) to assess patients' pain [25] was based on the European Federation of Neurological Societies (EFNS) guidelines [29,30], the IASP Neuropathic Pain Special Interest Group (NeuPSIG) guidelines on neuropathic pain assessment [6] and the guidelines for assessment of neuropathic pain in primary care [7]. Patients' pain was classified by the physician as pain with present-or absent-NePC. The NeuPSIG Grading System for neuropathic pain as proposed by Treede et al. [31] was used as a secondary comparison with the outcome of the PainDETECT -Dlv . The assessment of the Grading System was implemented in the standardized assessment protocol and thus included in the diagnostic work-up of the patients [25]. The outcomes "probable" and "definite" were regarded as "present-NePC". "Unlikely" and "possible" were rated as "absent-NePC" [32][33][34]. All participating physicians underwent standard medical training, belonging to the classic medical curriculum, and examination of the (central) nervous system in particular. To achieve standardization of history and assessment of NePC presence in patients included in this study all participating physicians underwent a training in the performance of the clinical examination of the patients (including sensory (bedside) examination and use of the NeupSIG Grading System) [25]. Training of the physicians took place at the participating center. During the execution of the study, the study coordinator (HT) visited the participating centers on a regularly basis to answer questions, to see if the necessary equipment was always available and to keep an eye on the inclusion of patients. Based on the order of assessment, the physician who performed the first assessment was called physician A and the physician who performed the assessment as a second physician was named physician B. However, the order of the physicians was based on availability during the study.

PainDETECT -Dlv and other questionnaires
The PainDETECT -Dlv (© Pfizer Pharma GmbH 2005, Pfizer bv 2008. Cappelle a/d IJssel, the Netherlands) [2,24] was designed as a simple, patient self-administered screening tool to screen for the presence of neuropathic pain without physical examination. This instrument consists of one item about the pain course pattern, one about radiating pain and seven items about the gradation of pain. An overall score is generated and ranges between − 1 and 38. Additionally, there are three items about pain severity (current, worst and average pain) included in the PainDE-TECT. For the original German version [2] the outcome was as follows: '-1 -12: negative' , neuropathic pain is unlikely; 13-18: 'unclear'; result is ambiguous, however neuropathic pain can be present; 19-38 'positive' , neuropathic pain is likely.
The patient completed five questionnaires (including the PainDETECT -Dlv directly after the clinical assessment by the participating physicians but without any interference by the physicians. The researcher (HT) was available for help via telephone or in person when it was not clear how to fill in the questionnaires. Besides screening for NePC via the PainDETECT -Dlv [24], the disability of the patient was assessed via the Disability Rating Index (DRI) [35]. The existence of an anxiety disorder and/or depression were assessed via the Hospital Anxiety Depression Scale (HADS) [36][37][38] and the Pain Attribution Scale (PAS) was used to assess patients attribution of his or hers pain. Quality of life was determined via the RAND 36-item Health Survey (RAND-36) [39][40][41]. Two weeks and three months after the initial visit the follow-up questionnaires (the Patients Global Impression of Change (PGIC) [42][43][44] and the PainDETECT -Dlv ) were sent to the patient by mail.

Data
All data gathered from patients and physicians was collected on paper and stored at the Radboudumc, Nijmegen, The Netherlands. Data management and monitoring were performed within MACRO (MACRO, version 4.1.1.3720, Infermed, London, United Kingdom).

Statistical methods
Power calculation for this study was based on a expected NePC prevalence of 37% in an unselected cohort of patients with chronic low back pain [2]. Sensitivity and specificity of the PainDETECT were assessed in the original validation study as respectively 85 and 80% [2]. The sensitivity and specificity of the PainDETECT -Dlv was, on forehand, expected to be 80% with a prevalence of 37%. The lower 95% confidence limit was required to be > 0.55. According to the calculations following the formulas by Flahault et al. [45] 132 patients with LBLP, NSA pain or suspected PND pain were needed so that the sample size contained a sufficient numbers of cases and controls [25].
Qualitative variables were presented as frequencies and percentages. The quantitative variables were presented as mean and standard deviation (SD) or as median and inter quartile range (IQR). Based on the classifications of the two physicians, all patients were categorized as absent-NePC, NePC or 'undetermined' (i.e. the classification by both physicians jointly was not equal).
One-way ANOVA (with additional Tukey's studentized range post-hoc test) or Kruskal-Wallis test were used to study differences between the three groups (NePC, absent-NePC, Undetermined).
Intraclass correlation (ICC) was used to assess reproducibility ('test-retest reliability') of the PainDETECT -Dlv between the fixed time points (baseline versus two weeks & baseline versus three months) . The ICC and responsiveness of the PainDETECT -Dlv were assessed between each point of measurement.
A receiver operating characteristic (ROC) curve was calculated and the area under the curve (AUC) with 95% confidence interval is presented to indicate the discriminatory power of the PainDETECT -Dlv to discriminate patients classified as with or without a NePC. The classification was based on the physicians' assessment outcome or based on the Grading System outcome, respectively. The theoretical maximum of the AUC is 100%, indicating a perfect discrimination and 50% is equal to tossing a coin. The optimal cut-off point of the PainDETECT -Dlvsum score was calculated under the condition of equal-costs of misclassification, using the Youden-index. Sensitivity, specificity, positive and negative predictive values and the likelihood ratio in the population in this study was calculated at this cut-off point. Also, the 'number needed to diagnose (NND)' was assessed [46] by use of the formula: NND = 1/[Sensitivity -(1-specificity)]. A clinical screening tool for the demonstration of a neuropathic pain component is considered valid if it has a high sensitivity, specificity and a high positive predictive value. For the measurement of the usefulness of the screening tool the likelihood ratio will be used [47].
The agreement between the pain classification by the physicians, the NeuPSIG Grading Systems and the Pain-DETECT -Dlv (yes: ≥11, no:< 11) outcome was evaluated by using Cohen's kappa (K), prevalence index (Pi) and percentage of pair wise agreement (PA) [25]. A Κ ≥ 0.40 and a PA ≥ 70% is considered indicative of interobserver reliability which is acceptable for use in clinical practice [48].
Data analysis and statistics were performed by use of Statistical Package for the Social Sciences (SPSS version 20.0, SPSS Inc., Chicago, Illinois, USA). Two-tailed p-value below 0.05 was considered statistically significant.

Patient population
In this study 330 patients, not pre-stratified on the target outcome, with chronic LBLP, NSA pain or suspected PND pain were assessed for eligibility. Two patients did not give their informed consent. Exclusion (n = 37) was due to not fulfilling the in-and exclusion criteria (n = 13); not returning the baseline questionnaires by the patient (n = 16); missing pain classification by one physician (n = 5) or both physicians (n = 3). In eight patients the assessment of the grading system (secondary comparison) was missing by one or both physicians. Finally, 291 patients participated in the study between October 2009 and July 2013. According to the international classification of diseases (ICD-10, version 2015) [27] these patients were classified as follows: 8 patients suffered from pain related to endocrine, nutritional and metabolic diseases (chapter IV); 75 patients from diseases of the nervous system (chapter VI); 1 patient from diseases of the circulatory system (chapter IX);189 patients from diseases of the musculoskeletal system and connective tissue (chapter XIII); 1 patient from diseases of the genito-urinary system (chapter XIV); 3 patients from symptoms, signs and ill-defined conditions, and 14 patient from injury, poisoning or other consequences of external causes.
Numbers of recruitment in the different participating hospitals (all in the Netherlands) were as follows: Reinier de Graaf Gasthuis, Delft n = 86; ErasmusMC, Rotterdam n = 62; Radboudumc, Nijmegen n = 59; Bernhoven Ziekenhuis, Oss n = 56; Rijnstate Ziekenhuis, Arnhem n = 15; St. Anna ziekenhuis, Geldrop n = 12 and UMC Utrecht, Utrecht n = 1. 132 patients had LBLP with radiation in one or two legs (45.4%), 51 NSA pain with radiation into one or both arms (17.5%) and 108 (37.1%) had suspected PND pain. The group of patients with suspected PND consisted of 86 patients with pain who were treated because of breast cancer (surgery and/or radiation and/or chemotherapy and/or hormonal therapy). The remaining 22 patients had pain because of various reasons: peripheral nerve damage (n = 12), polyneuropathy (n = 3), central post stroke pain (n = 2), Complex Regional Pain Syndrom (n = 2) and spinal radicular pain (n = 3).
After assessment by physicians A and physicians B, 170 patients were classified as having present-NePC, 58 as absent-NePC. In 63 patients the two physicians made a non-concordant pain classification, so the outcome based on the physicians assessment was classified as 'undetermined'. Based on the NeuPSIG Grading System in 139 patients NePC was classified as present, in 93 patients NePC was absent and in 51 patients the two physicians made a non-concordant pain classification in which the outcome was classified as 'undetermined' (see Fig. 1: Flow Diagram).
Social-demographic and clinical details of the 291 patients were analyzed and divided from each other based on the pain classification (see Table 1). No statistically significant differences were found between absent-NePC, present-NePC and undetermined for gender, age, height, weight, body mass index (BMI), education, medication, duration of pain, quality of life, disability, pain attribution, anxiety disorder and depression. Moreover, no statistically significant differences were observed between absent-NePC and present-NePC for pain (current, worst and average pain).

Physicians
During this study 62 various physicians (pain specialist, pain specialist-fellow or neurologist), from seven different hospitals, assessed all included patients. All patients were assessed two times by two different physicians. Of all participating physicians, 21 physicians assessed ≤2 patients during the execution of the study, 23 physicians saw ≤9 patients, 10 physicians saw ≥10 patients and 8 physicians saw ≥20 patients.

Evaluation of the PainDETECT -Dlv
The mean score of the PainDETECT -Dlv (Range − 1;38) for patients classified as absent-NePC was 10.7 (SD ± 5.7); for patients classified as present-NePC it was 15.7 (SD ± 6.3) and for patients with an undetermined outcome it was 11.8 (SD ± 5). As calculated based on a one-way ANOVA with Tukey's studentized range post-hoc test, there was a statistical significant difference between absent-NePC and present-NePC (P < 0.001) and between present-NePC and undetermined (P < 0.001). No significant difference was seen between absent-NePC and undetermined (P = 0.57). Patients pain course pattern and if the pain was radiating to other regions of the body were not significantly different between the three groups. Pain descriptors (burning, tingling or prickling, painful light touching, sudden pain attacks, temperature evoked pain, numbness sensation and pressure evoked pain) were all statistically significant discriminators for the presence of NePC (P ≤ 0.005) except for pressure evoked pain (P = 0.07). See Table 2 for the PainDE-TECT -Dlv outcomes divided according to the pain classification by the physicians (present-NePC, absent-NePC or undetermined) (See Table 2).

Validity
The gold standard for presence of the NePC in this study was the concordant opinion of both physicians. On basis of this gold standard, patients with an identical pain classification were included in the initial analysis (n = 228): 58 patients were classified as absent-NePC (25.4%) and 170 were classified as present-NePC (74.6%)(see Table 3 and Additional file 1: Table S1). A ROC-curve was constructed for PainDETECT -Dlv (see Fig. 2). Based on the gold standard, PainDETECT -Dlv sensitivity was (at maximal Youden-index) 80%, specificity 55.2%, positive predictive value 84% and the positive likelihood ratio was 1.78. Based on the neuropathic pain Grading System, the sensitivity was 74.1%, specificity 46.2%, positive predictive value 67.3%, and positive likelihood ratio of 1.38. We also constructed ROC-curves for the classification by solely physicians A or B and according to the neuropathic pain Grading System by physicians A or B and all the combinations. Except for classification of patients' pain based on the description of physicians A and the outcome of the Grading System by physicians B all cut-off scores were calculated at 11-points out of 38: The sensitivity ranges from 57.6-86.1% and specificity from 43.9-59.2%. The classification of patients' pain based on the classification of physicians A resulted in a cut-off score of 9-points: Sensitivity 86.1% and specificity 45.8%. The classification of patients' pain based on the Grading System according to physicians B resulted in a cut-off score of 14-points: Sensitivity of 57.6% and specificity of 59.2%. In Table 3 and Additional file 1: Table S1 we present the number of patients with LBLP, NSA pain or suspected PND pain, in total and per group based on physicians' assessment and/or the Grading System. Values of the AUC, cut-off value, sensitivity and specificity are provided (see also Additional file 1: Table S1 for a more detailed analysis of the diagnostic accuracy of the PainDETECT -Dlv : AUC, cut-off value, sensitivity, specificity positive and negative predictive values, positive likelihood ratios and the number needed to diagnose (NND).
Patients were screened on a NePC (positive outcome) by two physicians, two times the Grading System, and the patient completed the PainDETECT -Dlv . All the possible outcome combinations were computed based on the outcome: Is a NePC present, or not? In 283 patients all the five outcome variables were available and are displayed in a Venn-diagram [49] (see Fig. 3). In 92 patients (32.5%), five times a positive outcome variable was found, indicating presence of NePC. In 23 patients all outcome variables were negative (8.1%), thus indicating absence of NePC. One positive outcome was detected in 39 patients (13.8%), two positive outcomes in 28 patients (9.9%), three in 49 patients (17.3%), and four in 52 patients (18.4%).

Reliability
To determine the interobserver reliability between the physicians, the Grading System and the outcome of the PainDETECT -Dlv for the classification of a (absent-) NePC, Cohen's kappa (K) and percentage of pair wise agreement (PA) were assessed (see Table 4 Stability and responsiveness of the PainDETECT -Dlv over time was assessed over a period of two weeks. The mean sum score of the PainDETECT -Dlv at baseline for the total group was 13.8 ± 6.3. The mean sum score after two weeks was 14.1 ± 6.1. Test-retest reliability via ICC was 0.83 (95%CI 0.79-0.87; n = 268). Taking into consideration the fact that patients' pain should not have changed (outcome based on the PGIC), because otherwise the ICC would not reflect the consistency of the

Discussion
This study demonstrates the clinimetric quality of the PainDETECT -Dlv , a screening instrument for the presence of a NePC, on a large population of patients, with chronic pain due to low back with leg pain, neck-shoulder-arm pain or pain due to a suspected peripheral nerve damage as normally present in a physician's daily practice. Because the patients were included without pre-stratification on the target outcome, previous Catch-22 situations in the assessment of the validity of screening instruments were avoided. Under these conditions, the PainDETECT -Dlv failed to be predictive for the existence of a NePC due to a moderate sensitivity and low specificity, irrespective of comparison with the expert opinion via the classification by two physicians (gold standard) as well as with the outcome of the NeuPSIG Grading System. Moreover, the predictive values were also not indicating that the PainDETECT-Dlv is a valid screening tool for the assessment of a NePC. The likelihood ratios were also not suggestive for the usefulness of this instrument.

Validation studies with patients pre-stratified for NePC
We found an optimal cut-off score for the PainDE-TECT -Dlv of ≥11 points corresponding to a sensitivity of 80% and a specificity of 55%. In the original development and validation study of the PainDETECT by Freynhagen et al. [2] a sensitivity and specificity of 84% was found. The gold standard in their study was the examination by two experienced pain specialists. The study was performed at ten different specialized pain centers. Only patients with 'typical' neuropathic or nociceptive entities (i.e. no 'unclear' outcome) and only patients with a VAS of > 40 mm (0 -100 mm) were included. In the Spanish validation study by De Andrés et al. [16] only patients with a VAS ≥40 mm and a known classification (by one experienced specialist) of neuropathic pain, mixed pain or nociceptive pain were included. It revealed a sensitivity and specificity of 81% when patients with the classification of neuropathic pain or nociceptive pain were included. The inclusion of patients with mixed pain in the neuropathic pain group resulted in a sensitivity of 84% and specificity of 78%. The Korean version of the PainDETECT [20] was validated based on the study by De Andrés [16] in patients with chronic pain and with a NRS ≥ 3 (NRS 0-10). The gold standard was the independent diagnosis of the patient by two experienced pain physicians. It revealed a sensitivity of 82% and a specificity of 92% based on a cut-off score of ≥19 (range − 1; 38). In the validation of the Turkish version of the PainDETECT [17] patients were included with the classification of pain type (i.e. NePC) being assessed beforehand (based on the opinion of two expert pain physicians) and patients suffering from pain of three centimeters or more (VAS 0-10 cm). Sensitivity and specificity were respectively 78 and 83%. The Hindi version of the PainDETECT [19] was validated in patients with neuropathic and in patients with non-neuropathic pain based on a conventional single assessment by one physician. At a optimal cut off point of ≥18 sensitivity was 83% and the specificity was 84%.
In a cohort of patients with a spinal cord injury for more than one year, pain lasting more than six months and a pain intensity of than more three on a NRS (0-10) a sensitivity was found of 68% and specificity of 83% [50].
The present study included patients with chronic pain without limits to the minimal pain intensity or other limitations. At the moment of inclusion in the study our patients had only a provisional diagnosis (LBLP, NSA pain, suspected PND pain) established in primary or secondary care without further refinement or confirmation. Then, after referral to a (non-) academic pain clinic, they were assessed as to their complaints for the first time at study inclusion. Thus this was a 'real-life' clinical out-patient population. Avoiding patient selection due to pre-stratification to the outcome target makes our study unique and clinically more relevant as compared to other studies on the same topic and is crucial for the validation of a screening instrument.

Validation studies with patients not pre-stratified for NePC
In a study by Gauffin et al. [51] in patients diagnosed with fibromyalgia (n = 158) a cut-off score of 17 was found with a sensitivity of 79% and specificity of 53% (Gold standard: the classification by one experienced physician). This study, like ours, did not pre-stratify patients according to the pain classification either, and patients were not excluded because of a low pain level. The outcome of Gauffin's sensitivity analyses in this fibromyalgia study was comparable to our study. Tampin et al. [34] found, based on the examination by a physical therapist, a sensitivity and specificity of the PainDETECT of respectively 64 and 62% (cut-off score 18.5) in a population of patients with neck/upper limb pain (n = 122). In our study, the outcome for patients with neck-shoulder-arm pain was 83 and 44% respectively (cut-off score of ≥9). Classification of NePC is based on physicians' assessment of the patients and on the Grading Systems n = number of patients in the analysis; K = Cohen's kappa coefficient; PA (%) = percentage of agreement between two outcome variables; Pi = Prevalence index

Grading system
In this study the physicians assessed patients for the presence of a NePC according to the Grading System [31]. Probable neuropathic pain and definite neuropathic pain were combined as present-NePC, and non neuropathic pain and possible neuropathic pain were combined in absent-NePC. Sensitivity and specificity of the PainDETECT -Dlv resulted to be 74 and 46% respectively (Cut-off score 11 out of 38, n = 232). Using the classification of patients' pain based solely on the Grading System by one physician results in a lower validity than based on the physicians assessment. This might suggest that the classification of patients' pain based on the Grading System is less accurate than the classification based on the physicians' assessment in respect to the outcome of the PainDETECT -Dlv . However, the grading system was assessed by the same physician who also performed the physician's assessment so it is also possible that the physician had difficulties to classify patients pain based on the Grading System or vice versa. When using the physicians' assessments as well as the Grading Systems of both physicians (n = 161), sensitivity was 78%, specificity was 53% and the cut-off score was 11: The same poor result as for the gold standard. In the papers by Vaegter et al. [33] and Tampin et al. [34] the PainDE-TECT was also compared with the NeuPSIG Grading System. In both papers, like in ours, the outcome of the Grading System was not comparable to the outcome of the PainDETECT. As stated by Finnerup et al. [52] the PainDETECT (and other screening tools for the assessment of neuropathic pain) is only to alert the physician to further assess the patient who may have a NePC.

NePC classification
The initial classification of patients' pain in our study was based on an interview and (clinical/physical) examination by trained (pain-) physicians. There is a lack of consensus with respect to the classification of a NePC in patients with pain of different origins [53]. Moreover, a lack of standardization of assessment methods increases the number of undetected or poorly classified patients which leads to a variation in the classification accuracy (i.e. sensitivity and specificity) of screening tools caused by differences in strategy and patient population [15,54]. Bouhassira and Attal recently stated that neuropathic pain is "a consistent clinical entity, but it is multidimensional in terms of its clinical expression, with different sensory profiles, potentially reflecting specific pathophysiological mechanisms" [55]. As stated by Scholz et al. [53] physical tests are more useful to identify patients with neuropathic back pain than interview questions. To reach a more unified classification system to differentiate between present-NePC and absent-NePC a standardized assessment of symptoms and signs is necessary [53]. However, these tests are not able to confirm the relation between the potential lesion or disease of the nerve and the pain directly: The classification of neuropathic pain should be based on clinical examination and the interpretation should be placed in the clinical context of patients' pain [55]. In this study we used a mandatory standardized assessment [25] in addition to the medical history and physical examination which were performed according to the physicians' standards. The clinical assessment and the use of the Grading System showed that in 18-22% of the patients a non-consistent assessments was present resulting in an 'undetermined' status. In Freynhagens paper [2] it was 5%. This difference might be due to the inclusion of patients with a less clear absent or present NePC in our study which might reflect what happens in the assessment of a NePC in usual clinical care. Moreover, this also might occur in the treatment of patients with chronic pain. Based on both the physician's assessments, almost 75% of the patients in this study had a neuropathic pain component. This might be due to several facts. (1) Patients with LBLP or NSA pain were only included when the pain was radiating into the leg(−s) respectively the arm(−s) and were not removed from this study when they had mixed pain. Moreover, patients with radiating pain are more suspected to have a NePC.
(2) There is a possibility that neuroplastic changes are interpreted as neuropathy in patients with chronic LBLP.
(3) Patients were recruited in secondary and tertiary pain clinics. This might have led to a inclusion of patients who were more difficult to treat in primary care and (4) we included 108 patients with suspected peripheral nerve damage. Almost 60% of the patients after treatment for breast cancer has pain [56]. Based on the recent review by Ilhan et al. [57], in patients who reported pain following breast cancer treatment the pooled prevalence of neuropathic pain from screening questionnaires ranged from 32.6 to 58.2%. Following the NeuPSIG Grading System the prevalence ranged from 29.5 to 57.1%. Based on these numbers, patients after breast cancer can be regarded as patients suspected of neuropathic pain due to peripheral nerve damage. However, the PainDETECT -Dlv (compared to the gold standard and the NeuPSIG Grading System) as used in our study seems not valid for the assessment of patients with neuropathic pain based on a suspected PND in which the majority of patients was suffering of pain after treatment for breast cancer.

Strengths and weaknesses
There are several strengths in this study. Firstly, we included a large population of patients with diagnoses who are regularly seen in daily clinical practice. Secondly, there was no pre-stratification on the target outcome, clear inclusion criteria and almost no exclusion criteria.
Thirdly, we used the NeuPSIG Grading System [31,52] as a secondary comparison. The main purpose of the Grading System is to help in the classification of the pain as neuropathic [52]. In our study, the Grading System was added to the standardized assessment form which had to be filled in by the physician. There are also some weaknesses in this study. The use of the Grading System within the clinical assessment (including bed-side examination) is a strong aspect of our study, but the outcome of the clinical examination as well as the outcome of the Grading System might be influenced by each other. However, combining the physicians' assessment with the Grading System might have made the 'gold standard' even stronger but also might have led to a cross-contamination. Secondly, diagnosing NePC by assessing patients' pain by two separate physicians in our and in other studies is considered as the 'Gold Standard'. However, classifying patients' pain may be done more objectively by establishing a damaged nerve and by diagnosticating in a more detailed clinical way. Moreover, the breakdown of clinical grounds for in-and exclusion could also have been assessed and captured in more detail. Thirdly, 62 physicians participated. This might have led to the inclusion of younger, less clinical experienced physicians. However, it reflects 'real life' practice and limits the risk of systematic bias in the classification of patients' pain and bias based on assumptions about the existence of a NePC. Moreover, all physicians followed the standardized training as described. Fourthly, almost only patients with peripheral causes of pain were selected. This can be considered as a methodological drawback. Moreover, because we did not include patients with, by example, low back pain without irradiation to the leg who would probably be diagnosed as absent-NePC the specificity might decrease. Fifthly, there is an apparent lack of objective tests to determine whether the somatosensory fibers were affected, in particular the small fibers. This can be seen as crucial since objective data are mandatory to reach a definite neuropathic pain classification in the grading system. Lastly, in a following study we would collect data from the patients who were not able to participate in the study to prevent inclusion bias. In this study this was not possible because of ethical regulations.

Conclusions
The PainDETECT -Dlv has a good internal consistency and test-retest reliability but is not an effective screening tool for the assessment of a neuropathic pain component in a population of patients with chronic pain, irrespective of the chosen comparison because of its moderate sensitivity and low specificity. However, the agreement by both the physicians and the agreement with the grading systems (performed by the physicians) were also not impressive. Moreover, the differences in the cut-off scores for the different comparisons reflects the fact that agreement in a not pre-stratified to the target outcome patient population is not easy to accomplish. Using the PainDETECT -Dlv (for screening purposes or as a surrogate for clinical assessment) may result in unreliably separating NePC presence from non-presence in patients with chronic pain in clinical outpatient practices and in research settings. Catch-22 situations in the validation of screening tools can be prevented by not pre-stratifying the patients on basis of the target outcome before inclusion in a validation study for screening instruments. For now, classifying patients pain still needs the clinical assessment based on history and physical examination including bed-side sensory testing by the physician and cannot be replaced by the use of the PainDETECT.

Additional file
Additional file 1: Table S1. The AUC and the sensitivity / specificity at the optimal cut-off point of the PainDETECT under the condition of equal costs of misclassification to classify a NePC by the diagnosis and the grading system of the physicians for the total group and according the pain locations. (PDF 423 kb)