Longitudinal randomised controlled trials in rehabilitation post-stroke: a systematic review on the quality of reporting and use of baseline outcome values
© Sauzet et al. 2015
Received: 22 April 2015
Accepted: 27 May 2015
Published: 1 July 2015
The World Health Organisation stresses the need to collect high quality longitudinal data on rehabilitation and to improve the comparability between studies. This implies using all the information available and transparent reporting. We therefore investigated the quality of reported or planned randomised controlled trials on rehabilitation post-stroke with a repeated measure of physical functioning, provided recommendations on the presentation of results using regression parameters, and focused on the difficulties of adjustment for baseline outcome measures.
We performed a systematic review of the literature from 2011 to 2013 and collected information on the way the data was analysed. Moreover, we described various approaches to analysing the data using mixed models, illustrated with real data.
Eighty-four eligible studies were identified, of which 61 % (51/84) failed to analyse the data longitudinally. Moreover, for 30 % (25/83) the method of adjustment for baseline was unknown or non-existent. Using real data, we were able to show how much difference an adjustment for baseline data can make to the results. We showed how to provide interpretable intervention effects using regression coefficients while making use of all the information available in the data.
Our review showed that improvements were needed in the analysis of longitudinal trials in rehabilitation post-stroke in order to maximise the use of collected data and improve comparability between studies. Reporting fully the method used (including baseline adjustment) and using methods like mixed models could easily achieve this.
Keywords: Stroke Rehabilitation Physical functioning Longitudinal analysis Baseline values Regression
In 2011, the World Health Organisation (WHO) published its World Report on Disability [1], providing a framework “for disability data collection related to policy goals of participation, inclusion, and health. [Using it] will help create better data design and also ensure that different sources of data relate well to each other” (p. 45). In the rehabilitation chapter of this report, the lack of randomised trials in rehabilitation research is mentioned and the necessity of collecting comparable outcomes from various sources is pointed out. The report mentions the importance of longitudinal data to understand the “dynamic of disability”. Consequently, it is important in rehabilitation research not only to collect quality data but also to make the best use of it. This includes using all the (statistical) information contained in the data collected, providing maximal transparency in the description of the methodology, and presenting informative intervention effects.
In order to reflect the dynamic nature of an intervention, the analysis of repeated measures must take the longitudinal nature of the data into account. This presents some difficulties due to the dependence of the measures reported by the same patients. Another, less well-known difficulty concerns adjusting the effect of the intervention for the reduction to the mean (more commonly called regression to the mean) using baseline outcome values [2]. Moreover, the interpretability of results is paramount for the comparability between studies. Reporting regression parameters with confidence intervals rather than p-values allows the effectiveness of an intervention to be interpreted in terms of outcome measures. This form of reporting, however, is rare [3, 4].
The aim of this paper is to present the results of a systematic review of the analysis of measures of physical functioning in randomised controlled trials evaluating interventions in rehabilitation post-stroke. The reasons why some approaches are sub-optimal are discussed, and we provide recommendations on how to present results using regression coefficients and confidence intervals [5–7]. Those recommendations are illustrated with data from the BOMeN study (Berufliche Orientierung in der Medizinischen Neurorehabilitation [Occupational Orientation in Medical Neurorehabilitation]), an RCT to evaluate the effectiveness of a return-to-work-oriented intervention during residential rehabilitation of stroke and brain-damaged patients [8, 9].
In December 2013, the databases Medline, Medpilot, Cochrane Library, and Scopus/SciVerse were searched for articles reporting RCTs or protocols of RCTs on the rehabilitation of stroke patients with a measure of physical functioning. Studies with only one post-intervention measure, with no measure of physical functioning, or concerning brain injuries not due to a stroke were excluded from the review. Systematic reviews were also excluded. In order to reflect recent practices, we restricted our search to articles published in 2011 or later. The MeSH terms are given in the online supplement (see Additional file 1). All extracted studies were screened independently by two of the authors for eligibility by reading the title and abstract. The full texts of all eligible studies were obtained.
Table 1 Description of studies
Body function outcome
19 % (16/84)
6 % (5/84)
Primary and secondary
56 % (47/84)
13 % (11/84)
Multiple primary outcomes
29 % (24/84)
Number of arms
81 % (68/84)
Number of patients per arm
Type of study
Comparison of treatment
35 % (29/84)
Comparison with placebo/usual care
80 % (67/84)
5 % (4/84)
Duration of follow-up
median (range) of duration
3 (0.3–60) months
Number of follow-up measures
55 % (46/84)
35 % (29/84)
Other (3,5, unclear)
11 % (9/84)
Table 2 Method of analysis
Measurement of outcome
Baseline- 1. After intervention- 2. Follow-up
46 % (39/84)
54 % (45/84)
a More follow-ups after intervention or assessment during intervention
Repeated measure data was analysed
Cross sectional at each time-point
38 % (32/84)
39 % (33/84)
Both longitudinal and cross sectional
8 % (7/84)
Repeated data not fully analysed
14 % (12/84)
Method of analysis
22 % (10/46)
35 % (16/46)
7 % (3/46)
Non parametric test/dichotomised data
37 % (17/46)
Correction for multiple testing due to repeated measures
76 % (31/41)
21 % (7/33)
Repeated measure ANOVA
72 % (24/33)
Generalised Estimating Equations
6 % (2/33)
Mean (SD)b at each time-point per group
66 % (47/71)
F values (for ANOVA/ANCOVA)
54 % (19/35)
71 % (6/12)
8 % (6/71a)
Use of baseline data in the primary analysis of physical functioning
99 % (83/84)
Method of adjustment
Mentioned in Methods
64 % (53/83)
If not, mentioned in Results
6 % (5/83)
13 % (11/83)
17 % (14/83)
Use of baseline data in the
Difference from baseline
33 % (19/58)
31 % (18/58)
29 % (17/58)
Used to compute a dichotomised outcome
3 % (2/58)
3 % (2/58)
The results of the review are presented in descriptive tables with absolute and relative numbers of articles for each item. The report of this review follows the PRISMA checklist [10]. Because this article is based on a review of the literature and is methodological in nature, no ethical approval was required.
Models for the analysis of longitudinal data on rehabilitation
The discussion of the systematic review’s results is illustrated with examples and recommendations using mixed models. We show that the intervention effects can be reported using regression parameters. We provide suggestions on the presentation of methods and results, illustrated with data from the BOMeN study analysed using Stata 12 [11]. The BOMeN study (Berufliche Orientierung in der Medizinischen Neurorehabilitation [Occupational Orientation in Medical Neurorehabilitation]) was an RCT performed from 2007 to 2009 in two residential neurological rehabilitation clinics in Germany, which evaluated the effectiveness of a return-to-work-oriented intervention during residential rehabilitation of stroke and brain-damaged patients. For the BOMeN study, the approvals of the ethics committee of the Medical Chamber of Westfalen-Lippe and of the Faculty of Medicine of the Westfälischen Wilhelms-Universität Münster were obtained. Patients recruited included 93 women and 205 men, aged 22 to 60 years and 15 to 60 years respectively. The total duration of follow-up was 15 months after the rehabilitation was concluded. The intervention consisted, among other things, of a patient education programme and better inclusion of workplace-related needs in the therapeutic plan. For more detail see [8, 9]. While the primary aim of the study was to compare the proportions of patients in work at each time point, the questionnaire FS-36 was also used to collect information about quality of life. The questionnaire was answered at one or more follow-up times by 295 patients. We computed the physical functioning sub-score and used it in the examples presented here.
We identified 84 eligible studies, 13 of which were protocols. The complete flowchart is given as an online supplement (see Additional file 2). The study characteristics are presented in Table 1. Most studies had a measure of physical functioning as a primary outcome (68/84, 81 %) and 29 % (24/84) presented multiple primary outcomes, in line with the recommendation of the WHO report on disability to reflect the diversity of the aspects of the International Classification of Functioning.
All results regarding the statistical analysis are presented in Table 2. Only 39 % (33/84) of studies performed a longitudinal analysis of the data. Other studies analysed the data cross-sectionally (32/84, 38 %), mostly at each measurement time-point, thus losing the dynamics contained in the data. In twelve studies (14 %) not all the collected longitudinal data was analysed, so a considerable amount of the available information was ignored.
The results presented for the 71 studies which were not protocols mostly included means and standard deviations at each time-point and for each group (47/71, 66 %), but no overall effect of the intervention over time was ever presented. For almost a third of studies (25/83, 30 %) it is unclear whether baseline outcome values were used in the analysis. For data analysed longitudinally, the most common model estimated a time-group interaction, and 14 of 21 studies used baseline as a time-point in the regression. Two studies used a change from baseline in a longitudinal analysis, which means that outcomes at different time-points were not comparable. In all other longitudinal analyses, baseline was a covariate in the model.
Our review has shown that baseline measures are consistently collected but not always adjusted for. Moreover, 52 % of studies ignored the longitudinal nature of the data, including the 14 % that did not use all the follow-up data available. This is evidence that a lot of the information collected and available is not used. Furthermore, analyses based on the analysis of variance (for example repeated measures ANOVA) seem to remain popular even though their limitations relative to regression-based methods such as mixed models have often been presented in the literature [4, 12].
We outline the difference between analysis of variance (ANOVA) and regression models. The ANOVA is a generalisation of the t-test and compares the means of several groups of patients. A regression model provides a relationship between an outcome and some predictors with an error term. The regression coefficient for the group effect is the effect of the intervention.
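The last point can be checked with a minimal sketch in Python. It uses made-up scores (not data from the review or from the BOMeN study) and a hypothetical helper `ols_slope`; with a 0/1 group indicator as the only predictor, the least-squares coefficient for group equals the difference between the two group means.

```python
from statistics import mean

def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical outcome scores; group 0 = control, 1 = intervention.
control = [2.1, 2.4, 1.9, 2.6]
intervention = [1.8, 2.0, 1.7, 2.3]

group = [0] * len(control) + [1] * len(intervention)
score = control + intervention

slope = ols_slope(group, score)
mean_difference = mean(intervention) - mean(control)

print(f"regression coefficient for group: {slope:.3f}")
print(f"difference in group means:        {mean_difference:.3f}")
```

Unlike a comparison of means, the regression formulation extends directly to further predictors such as baseline outcome values or time.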
Mixed models are regression models in which the non-independence of the measures taken on the same patient is accounted for, in the simplest form, by allowing the constant in the model (intercept) to vary between patients (random intercept model). This is described as a random effect. In such models the effect of the intervention is the same for all patients and is given by the coefficient obtained for the intervention group. This is a fixed effect. Because these models consist of fixed and random effects, they are called mixed models. In studies with long-term community-based follow-up, the number of patients with some missed follow-up measurements can be large. Valuable information is nevertheless available for those patients. Repeated measures ANOVA can only take into account patients with all follow-up measurements. Mixed models use data from all patients with at least one post-baseline measure [5–7], thus making the most of the data available. Of course, missing measures and loss to follow-up should be avoided in the first place by careful planning and by making an effort to track down patients who have moved. Another advantage of mixed models is that large variability between patients in the actual time at which the measurements were taken can be taken into account. This is done by including time as a continuous variable in the model.
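One way to write the random intercept model described above is sketched below. The notation is our own assumption and does not come from the original analysis: $y_{ij}$ is the outcome of patient $i$ at measurement $j$, $g_i$ is the 0/1 intervention indicator, and $t_{ij}$ is the actual time of measurement.

```latex
\[
  y_{ij} = \beta_0 + \beta_1\, g_i + \beta_2\, t_{ij} + u_i + \varepsilon_{ij},
  \qquad
  u_i \sim N(0, \sigma_u^2), \quad
  \varepsilon_{ij} \sim N(0, \sigma^2)
\]
```

Here $\beta_1$ is the fixed intervention effect shared by all patients, $u_i$ is the patient-specific deviation of the intercept (the random effect), and $\beta_2$ is the change in outcome per unit of time. A patient contributes one equation per available measurement, which is why patients with incomplete follow-up remain in the analysis.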
A particular study design is motivated by the aims and the setting of the intervention. An intervention that is limited in time because it is delivered in an inpatient medical institution (hospital or rehabilitation clinic) may have good short-term effects, but its long-term effects remain to be evaluated. For long-term community-based interventions, the dynamic of the intervention may be of more interest. We illustrate the limitations of various approaches encountered in the review with the physical functioning score of the FS-36 questionnaire at three weeks and at six, 12, and 15 months from the BOMeN study.
Reduction to the mean
We illustrate the effects of reduction to the mean using a mixed model to estimate the overall effect of the intervention over time. Reduction to the mean occurs when some patients have extreme outcome values at baseline; these values tend to show larger changes than values close to the mean. Consider three subsets of our dataset: A. only patients with scores in a middle range; B. patients with scores in a middle to upper range (adding patients with worse conditions at baseline than in A); C. patients with scores in a lower to middle range (adding patients with better conditions at baseline than in A). An overall reduction in score (negative regression coefficient) indicates overall better physical functioning.
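The phenomenon, usually called regression to the mean in the statistical literature, can be reproduced with a small simulation. The sketch below is illustrative only: it uses simulated scores rather than BOMeN data, and the score scale, sample size, and cut-offs are arbitrary assumptions. No intervention effect is simulated, yet patients selected for extreme baseline values change on average towards the mean.

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical setting: each patient has a stable "true" score and every
# measurement adds independent noise. No intervention effect is simulated.
n = 5000
true_score = [random.gauss(3.0, 0.5) for _ in range(n)]
baseline = [t + random.gauss(0.0, 0.5) for t in true_score]
follow_up = [t + random.gauss(0.0, 0.5) for t in true_score]
change = [f - b for b, f in zip(baseline, follow_up)]

def mean_change(keep):
    """Mean (follow-up minus baseline) among patients selected by keep(baseline)."""
    return mean(c for b, c in zip(baseline, change) if keep(b))

high = mean_change(lambda b: b > 3.5)    # worse condition at baseline
low = mean_change(lambda b: b < 2.5)     # better condition at baseline
everyone = mean_change(lambda b: True)

print(f"high baseline: {high:+.3f}")
print(f"low baseline:  {low:+.3f}")
print(f"all patients:  {everyone:+.3f}")
```

If the groups of a trial differ in their share of such extreme patients, an unadjusted comparison attributes part of this artefactual change to the intervention, which is what adjusting for baseline values is meant to prevent.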
Illustration of the effect of reduction to the mean when no adjustment for baseline outcome values is performed
Patient characterisation at baseline
Intervention’s effect (standard error) obtained with:
Consequence of the reduction to the mean
No adjustment for baseline values
Baseline values as a covariate
No major consequence
Middle + worse condition
Middle + better condition
Difference from baseline
Redefining the outcome as the difference from baseline is problematic and can easily be avoided. If the data is analysed longitudinally, then the outcome value has a different meaning at each time-point because of the varying time elapsed between baseline and that time-point. If the analysis is cross-sectional (i.e., one post-intervention measure), then the best approach is to include baseline as a covariate in the model.
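For the cross-sectional case, the advantage of using baseline as a covariate can be illustrated by simulation. The sketch below is a hedged example under our own assumptions (simulated data, a true effect of −0.1, 100 patients per trial, 500 simulated trials, hypothetical function names); it is not the analysis performed on the BOMeN data. Both estimators are centred on the true effect, but the covariate-adjusted estimate varies less from trial to trial.

```python
import random
from statistics import mean, stdev

def centred_cross(x, y):
    """Sum of products of deviations from the means of x and y."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def ancova_effect(group, base, post):
    """Coefficient of group in the least-squares fit post ~ group + base."""
    s_gg = centred_cross(group, group)
    s_bb = centred_cross(base, base)
    s_gb = centred_cross(group, base)
    s_gy = centred_cross(group, post)
    s_by = centred_cross(base, post)
    return (s_gy * s_bb - s_gb * s_by) / (s_gg * s_bb - s_gb ** 2)

def change_effect(group, base, post):
    """Difference between groups in mean change from baseline."""
    diff = [p - b for b, p in zip(base, post)]
    treated = [d for g, d in zip(group, diff) if g == 1]
    control = [d for g, d in zip(group, diff) if g == 0]
    return mean(treated) - mean(control)

def simulate_trial(rng, n=100, effect=-0.1):
    """One simulated two-arm trial with a noisy baseline and one follow-up."""
    group = [i % 2 for i in range(n)]  # alternating allocation, for brevity
    true = [rng.gauss(3.0, 0.5) for _ in range(n)]
    base = [t + rng.gauss(0.0, 0.5) for t in true]
    post = [t + effect * g + rng.gauss(0.0, 0.5) for t, g in zip(true, group)]
    return group, base, post

rng = random.Random(2015)
trials = [simulate_trial(rng) for _ in range(500)]
ancova = [ancova_effect(*t) for t in trials]
change = [change_effect(*t) for t in trials]

print(f"baseline as covariate: mean {mean(ancova):+.3f}, SD {stdev(ancova):.3f}")
print(f"change from baseline:  mean {mean(change):+.3f}, SD {stdev(change):.3f}")
```

The smaller spread of the covariate-adjusted estimates corresponds to narrower confidence intervals for the same number of patients.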
Results of data analysis for specific endpoints (N = 295)
Difference from baseline
Difference between the groups in score decrease per week.
Cross-sectional effect at each time point
Group difference at t1
Change in group difference t2 - t1
Change in group difference t3 - t1
Cross-sectional analysis at each time-point
Cross-sectional analysis at each time-point should be avoided because the dynamic within each patient is lost. There is also a loss of power due to the necessary correction for multiple testing. A correct procedure is to use a mixed model with time as a categorical variable, time-group interactions, and baseline outcome values as a covariate. The model provides an estimate of the intervention effect at each time-point, making maximal use of the data available (Table 5).
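The model recommended above could be written as follows. This is a sketch in our own notation, which is an assumption and not taken from the original analysis: $g_i$ is the 0/1 intervention indicator, $y_{i0}$ the baseline outcome value, $K$ the number of follow-up time-points, and $\mathbb{1}[j = k]$ the dummy variable for time-point $k$, with the first follow-up as reference.

```latex
\[
  y_{ij} = \beta_0 + \beta_1\, g_i
         + \sum_{k=2}^{K} \gamma_k\, \mathbb{1}[j = k]
         + \sum_{k=2}^{K} \delta_k\, g_i\, \mathbb{1}[j = k]
         + \lambda\, y_{i0} + u_i + \varepsilon_{ij}
\]
```

With this parameterisation, $\beta_1$ is the group difference at the first follow-up, $\delta_k$ is the change in the group difference between time-point $k$ and the first follow-up, and $\beta_1 + \delta_k$ is the intervention effect at time-point $k$.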
Using data from the BOMeN study, we found that the intervention group scored 0.097 (SD: 0.072) points lower than the control group at the first time-point (three weeks). At six months this difference had changed by 0.0002, i.e., it was essentially unchanged compared with three weeks. At twelve months, the difference between the groups had decreased by 0.033 score points compared with the first time-point, to −0.064 score points. This means that the maximum effect of the intervention is seen directly at the end of the intervention (three weeks), is sustained for the first six months, and then decreases.
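The arithmetic behind these figures is simply the first-time-point difference plus the interaction coefficient for each later time-point. The short sketch below recomputes it from the values quoted in the text. The positive sign of the six-month change is our assumption, chosen by analogy with the twelve-month change, and the variable names are illustrative.

```python
# Values quoted in the text. The sign of the six-month change is an
# assumption made by analogy with the twelve-month change.
group_diff_t1 = -0.097  # group difference at three weeks
change_vs_t1 = {"six months": 0.0002, "twelve months": 0.033}

# Intervention effect at each time-point = difference at t1 + change since t1.
effects = {"three weeks": group_diff_t1}
for label, delta in change_vs_t1.items():
    effects[label] = group_diff_t1 + delta

for label, value in effects.items():
    print(f"{label}: {value:+.4f}")
```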
Conclusion and suggestions
Our review has shown that not only the reporting of RCTs in rehabilitation post-stroke needs improvement (see the recommendations of the CONSORT statement [13]) but also the method of analysis itself. A lot of collected information was lost. More methods are available for analysing longitudinal data which were not discussed here [6, 11]. We have attempted, using real data as an example, to show the consequences of using some sub-optimal approaches. We also showed how the results of a regression analysis can be presented in an informative way using regression parameters and confidence intervals. We recommend that, despite limited publication space, the primary research question should be clearly stated and the overall intervention effect over the duration of follow-up should always be reported. Secondly, the intervention effect at the particular follow-up measurements (estimated from a longitudinal model with time represented by dummy variables and the interaction between time and the intervention variable) can be reported. In addition, by using time as a continuous variable, an estimate of the overall rate of change can be obtained. This also applies to study protocols. All covariates and the method of adjustment for baseline should also be clearly indicated because they influence the estimated intervention effect.
We thank the reviewers for their very constructive comments, which helped improve the manuscript a great deal.
We acknowledge the financial contribution granted to OS for this review from the Research Centre for Mathematical Modelling, Bielefeld University.
We acknowledge support of the publication fee by Deutsche Forschungsgemeinschaft and the Open Access Publication Funds of Bielefeld University.
1. WHO. World report on disability. 2011. http://www.who.int/disabilities/world_report/2011/en/.
2. Vickers AJ, Altman DG. Analysing controlled trials with baseline and follow up measurements. BMJ. 2001;323:1123–4.
3. Gibbons RD, Hedeker D, DuToit S, editors. Advances in Analysis of Longitudinal Data, vol. 6. 2010.
4. Ma Y, Mazumdar M, Memtsoudis SG. Beyond Repeated-Measures Analysis of Variance: Advanced Statistical Methods for the Analysis of Longitudinal Data in Anesthesia Research. Reg Anesth Pain Med. 2012;37:99–105.
5. Brown H, Prescott R. Applied mixed models in medicine. 2nd ed. Chichester, England and Hoboken, NJ: John Wiley; 2006 [Statistics in practice].
6. Fitzmaurice GM, Laird NM, Ware JH. Applied longitudinal analysis [Wiley series in probability and statistics]. 2nd ed. Hoboken, NJ: Wiley; 2011.
7. Hox JJ, Roberts JK. Handbook of advanced multilevel analysis. New York: Routledge; 2011 [European Association of Methodology].
8. Menzel-Begemann A. Work-Related Medical Rehabilitation after Neurological Diseases. Aktuelle Neurol. 2013;40:507–12.
9. Menzel-Begemann A, Honemeyer S. BOMeN occupational orientation in medical neurorehabilitation. Gesundheitswesen. 2008;70:462.
10. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Grp. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. Ann Intern Med. 2009;151:264–W64.
11. StataCorp. Stata Statistical Software: Release 12. StataCorp LP: College Station, TX; 2011.
12. Twisk JWR. Applied longitudinal data analysis for epidemiology: A practical guide. 2013. Cambridge medicine.
13. Schulz KF, Altman DG, Moher D. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. BMC Med. 2010;8:18.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.