The reliability and validity of a slightly revised Chinese version simplified modified Rankin scale questionnaire

Background: The slightly revised English version of the simplified modified Rankin Scale questionnaire, smRSq(2011), was shown to be reliable, valid, and useful in scoring the modified Rankin Scale (mRS) after stroke. Our aim was to assess the inter-rater reliability and validity of a novel Chinese version of the smRSq(2011). Methods: The English version smRSq(2011) was translated into Chinese by a standard process. We recruited 300 consecutive hospitalized ischemic stroke patients in the department of neurology, Beijing Chaoyang Hospital. Six randomly paired raters scored the conventional mRS, the novel Chinese version smRSq(2011), the National Institutes of Health Stroke Scale (NIHSS), and the Barthel Index (BI) in person. Inter-rater reliability and validity were assessed. Results: Among the 300 ischemic stroke patients, mean age was 64.9±12.1 years, and 220 (73%) were male. For inter-rater reliability of the smRSq(2011), the percent agreement among the paired raters was 87%, the kappa (κ) was 0.84 (95% CI, 0.79-0.88), and the weighted kappa (κw) was 0.96 (95% CI, 0.95-0.98). The percent agreement between the smRSq(2011) scores by the first rater and the conventional mRS scores by the second rater in each pair was 55%, κ=0.47 (95% CI, 0.40-0.54), and κw=0.91 (95% CI, 0.89-0.93). In construct validity testing, the Spearman’s correlation coefficients comparing the smRSq(2011) scores by the first rater with the NIHSS and the BI scores by the second rater were 0.83 (P<0.001) and -0.86 (P<0.001), respectively. Conclusions: Our results suggest that the novel Chinese version smRSq(2011) is useful for scoring the mRS in Chinese stroke patients. Further validation in other clinical settings, including in communities and by remote methods in China, is warranted.


Background
Stroke is quite common throughout the world [1,2], including China [3]. Assessing functional outcomes after stroke accurately and reliably is a critical part of clinical research and stroke registries. The modified Rankin Scale (mRS) is the most commonly utilized scale for assessing functional outcome after stroke [4]. However, scoring the mRS involves some subjectivity and its reliability is limited [5]. Multiple mRS scoring aids have been reported to improve its reliability [6,7,8,9,10].
The simplified mRS questionnaire, smRSq(2010), was first reported in 2010. It showed good reliability among a wide variety of raters, was relatively simple to use, and took on average <2 minutes to score the mRS [7]. The initial version of the smRSq has been translated and validated in Chinese stroke patients [11].
A slightly revised version of the smRSq(2010), the smRSq(2011), was reported in 2011 and showed improved agreement for mRS scores 3 to 5 (moderate to severe disability) [9]. A panel of experts for the International Consortium for Health Outcomes Measurement has recommended the smRSq(2011) for standardized mRS scoring [12]. In this study we test the clinimetric properties of a novel Chinese version of the smRSq(2011). A useful Chinese version of the smRSq(2011) could facilitate the collection of internationally standardized functional outcome data in Chinese stroke patients. More accurate and standardized data could lead to a better understanding of stroke prognosis in China.

Materials And Methods
First, we translated the smRSq(2011) from English to Chinese using forward-backward translation to allow detection of inconsistencies, and the draft questionnaire was checked for face validity, as in our prior translation of the original smRSq [11]. We anticipated that the Chinese version
We enrolled 300 consecutive ischemic stroke patients in the department of neurology, Beijing Chaoyang Hospital, between July and December 2014. We used the World Health Organization definition of stroke [13]. All strokes were confirmed by CT or MRI.
Six stroke-experienced physicians performed all the ratings within 7 days after admission, blinded to the patients' clinical data and to the other raters' scores. The six raters were randomly allocated into 300 pairs, one pair for each patient. The first rater in each pair assessed a patient on day one and the second rater on day two, in order to minimize the risk of change in the patient's condition. If patients could not answer the questionnaire, we interviewed their caregivers when possible. Each rater scored the conventional mRS first, followed by the Chinese version smRSq(2011) and the National Institutes of Health Stroke Scale (NIHSS). Only the second rater scored the Barthel Index (BI), either before or after the NIHSS. The NIHSS indicates stroke severity, and the BI measures activities of daily living. Each rater estimated their average time to score the smRSq.
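The allocation of raters to patients can be illustrated with a minimal sketch. The text does not describe the randomization mechanism, so the uniform draw over ordered rater pairs, the rater labels, and the seed below are all assumptions made for illustration only.

```python
import random
from itertools import permutations


def allocate_rater_pairs(raters, n_patients, seed=0):
    """Draw one ordered (first rater, second rater) pair per patient.

    Assumption: each ordered pair of distinct raters is equally likely;
    the actual allocation procedure is not specified in the text.
    """
    rng = random.Random(seed)
    ordered_pairs = list(permutations(raters, 2))  # 6 raters give 30 ordered pairs
    return [rng.choice(ordered_pairs) for _ in range(n_patients)]


# Hypothetical rater labels; 300 patients as in the study.
pairs = allocate_rater_pairs(["R1", "R2", "R3", "R4", "R5", "R6"], n_patients=300)
```

Each element of `pairs` names the rater who would interview the patient on day one and the rater who would interview on day two.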

Our study was approved by the ethics committee of Beijing Chaoyang Hospital, Capital Medical University. Every patient gave valid informed consent to participate.

Statistical Analysis
For inter-rater reliability of the conventional mRS and the smRSq(2011), we compared scores between the first and the second rater. We calculated the percent agreement and determined kappa (κ) and weighted kappa (κw) with 95% confidence intervals (CI). For validity of the smRSq(2011) against the conventional mRS, we compared the smRSq(2011) scores of the first rater to the mRS scores of the second rater. We correlated the Chinese version smRSq(2011) scores by the first rater with the NIHSS scores and the BI scores by the second rater using Spearman's correlation. We used the Statistical Package for Social Sciences (SPSS) version 16.0 (SPSS Inc., Chicago, IL, USA) for data analysis. P values less than 0.05 were considered statistically significant.
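As an illustration of the agreement statistics named above, the following sketch computes percent agreement, Cohen's κ, and a weighted κ from two raters' scores. It is not the SPSS procedure used in the study. The weighting scheme is not stated in the text, so quadratic weights are an assumption (linear weights are available through the `weights` argument), scores are assumed to range from 0 to 5, and confidence intervals are omitted. The example scores are hypothetical.

```python
def agreement_stats(first, second, categories=(0, 1, 2, 3, 4, 5), weights="quadratic"):
    """Return (percent agreement, kappa, weighted kappa) for two raters' scores."""
    k, n = len(categories), len(first)
    index = {c: i for i, c in enumerate(categories)}

    # Cross-tabulate: rows are the first rater's scores, columns the second's.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(first, second):
        table[index[a]][index[b]] += 1

    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]

    # Unweighted kappa: observed vs chance-expected exact agreement.
    # (Undefined if both raters use a single category, since p_exp is then 1.)
    p_obs = sum(table[i][i] for i in range(k)) / n
    p_exp = sum(row[i] * col[i] for i in range(k))
    kappa = (p_obs - p_exp) / (1 - p_exp)

    # Weighted kappa: disagreements are penalized by their distance.
    def penalty(i, j):
        d = abs(i - j) / (k - 1)
        return d * d if weights == "quadratic" else d

    d_obs = sum(penalty(i, j) * table[i][j] / n for i in range(k) for j in range(k))
    d_exp = sum(penalty(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    kappa_w = 1 - d_obs / d_exp
    return p_obs, kappa, kappa_w


# Hypothetical scores for eight patients, for illustration only.
rater_1 = [0, 0, 1, 2, 3, 3, 4, 5]
rater_2 = [0, 1, 1, 2, 3, 4, 4, 5]
p_obs, kappa, kappa_w = agreement_stats(rater_1, rater_2)
```

Because the weighted statistic gives partial credit to near misses, it is typically higher than the unweighted κ when most disagreements are by a single mRS grade.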

Results
In construct validity testing, comparing the smRSq(2011) scores by the first rater with the NIHSS scores by the second rater, the Spearman correlation coefficient was 0.83 (P<0.001). Comparing the smRSq(2011) scores by the second rater with the NIHSS scores by the first rater gave a similar result (Spearman correlation coefficient 0.82). Comparing the smRSq(2011) scores by the first rater with the BI scores by the second rater, the Spearman correlation coefficient was -0.86 (P<0.001).
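The direction of these coefficients follows from how the scales are scored: higher mRS and NIHSS scores both indicate worse status, whereas higher BI scores indicate greater independence. A minimal sketch of Spearman's correlation (the Pearson correlation of tie-averaged ranks) is given below; it is an illustration rather than the SPSS routine used in the study, and the scores are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this rank
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for six patients, for illustration only.
smrsq = [0, 1, 2, 3, 4, 5]
nihss = [0, 2, 3, 7, 12, 20]
barthel = [100, 95, 80, 60, 30, 5]
rho_nihss = spearman_rho(smrsq, nihss)      # positive: both rise with severity
rho_barthel = spearman_rho(smrsq, barthel)  # negative: BI falls with disability
```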

Discussion
Our primary objective in this study was to test a novel Chinese version smRSq(2011) (Figure 1.e). We assessed the inter-rater reliability of the smRSq(2011) and validated it against the conventional mRS interview. We tested the construct validity of the smRSq(2011) against the NIHSS and the BI. We found good reliability and validity of the Chinese version smRSq(2011).
The inter-rater reliability of the Chinese version smRSq(2011) was slightly better than that of the conventional mRS (κ=0.84 vs κ=0.76, respectively) and that of the Chinese version of the original smRSq(2010) [11]. The inter-rater reliability in this study was similar to that reported with a structured interview mRS [5,6,14].
Construct validity testing showed good correlations between the smRSq(2011) and the NIHSS and the BI. This result is consistent with other validity studies using the conventional mRS [15], the English version smRSq(2011) [16], and our prior Chinese smRSq(2010) study [11].
As in our prior study with the Chinese smRSq(2010), the Chinese smRSq(2011) questions were understood by the majority of patients and caregivers with little or no explanation, and the scale was easily performed by the raters. Time to score the smRSq was relatively brief (average 70 seconds).
This study has limitations. First, the paired ratings were done on two consecutive days, which may have introduced some recall bias. To limit recall bias we instructed the patients to treat each interview independently of the others. Second, the mRS should ideally be scored after some period of recovery from stroke and in a community setting. Thus, the scores in our patients likely do not represent their ultimate functional outcome; however, the paired ratings were done under similar circumstances. Third, we did not test the novel Chinese version smRSq(2011) over the telephone or via telemedicine, although remote outcome assessments are often more practical than in-person assessments.

Conclusion
In conclusion, this study demonstrates good reliability and validity of the novel Chinese version

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author upon request.

Authors' contributions
JLY, WLH and AB conceived and designed the experiments. JLY analyzed the data and drafted the manuscript. JLY and YXW collected and analyzed data. All authors have read and approved the final manuscript to be published.

Ethics approval and consent to participate
Our work was approved by the Ethics Committee of Beijing Chaoyang Hospital. This study was performed in accordance with the Declaration of Helsinki, and all authors agreed to the publication statements of BMC Neurology.

Consent for publication
Written informed consent was obtained from all the subjects.

Tables
Table 1. Cross-tabulation between the smRSq(2011) scores by the first and the second rater in each pair.
Total  99  26  26  33  56  60  300
Table 2. Cross-tabulation between the smRSq(2011) scores by the first rater and the conventional mRS scores by the second rater.

Bubble plot of agreements between the smRSq(2011) scores by the first and second rater in each pair (diameter of bubbles represents count at each point).
