Deflazacort for the treatment of Duchenne Dystrophy: A systematic review

Background To complete a systematic review and meta-analysis based on the clinical question: Is Deflazacort (DFZ), a prednisolone derivative, an effective therapy for improving strength, with acceptable side effects, in children with Duchenne Dystrophy (DD)? Methods MEDLINE, EMBASE, Current Contents, Dissertation Abstracts, Health Star, PsychINFO and Cochrane were searched using the following inclusion criteria: 1) A randomized controlled trial comparing DFZ with placebo or another therapy; 2) Male participants age 2–18 years with DD; 3) Outcomes of (a) any form of strength or functional testing, or (b) any form of side effect. Results Fifteen studies of potential relevance were identified, with five meeting the inclusion criteria. These five studies included 291 children and were published in English language journals between 1994 and 2000. Two studies compared DFZ with placebo, two studies compared DFZ with prednisone and one study had both placebo and prednisone comparisons. Two large trials were identified that have not been published in article format. Due to the heterogeneity in outcome measures and the inconsistent reporting of summary statistics, a meta-analytic approach could not be taken. Conclusions From examination of the individual studies, it appears that DFZ improves strength and functional outcomes compared to placebo, but it remains unclear if it has a benefit over prednisone on similar outcomes. Two trials found that DFZ causes less weight gain than prednisone.


Background
Duchenne Dystrophy (DD) is a chronic degenerative muscle disorder that becomes evident in mid-childhood. Several therapies have been examined with the aim of slowing the progression of muscle weakness in children with DD. Corticosteroids have been the drugs of main interest, with prednisone demonstrating evidence of improvement in some patients and a lack of deterioration in others [1]. Multiple randomized trials have found improved function and strength in children treated with prednisone [2][3][4][5]. Unfortunately, in these studies prednisone caused numerous side effects, which may temper its usefulness. A methyloxazoline derivative of prednisolone, deflazacort (DFZ), has shown some promise in providing similar effects to prednisone with a less concerning side effect profile [6,7]. If both drugs are similarly effective in improving strength, and if weight gain is less evident with DFZ, then improvements in functional strength may exceed those seen with prednisone. A systematic examination of the evidence for the effectiveness and safety of DFZ in the DD population has recently been called for [8].
Thus, three clinical questions arise: Is DFZ an effective therapy for improving strength in children with DD? Is DFZ more effective than other therapies? Does DFZ have fewer side effects than other steroid agents used in the treatment of DD? This systematic review attempted to answer these questions by identifying randomized controlled trials of DFZ in the DD population with the goal of completing a meta-analysis of the treatment effects and safety profile.

Methods
The review was conducted adhering to the principles of the QUOROM statement [9]. Several procedures were used to identify relevant studies. First, electronic database sources were searched using Ovid without restriction on language or publication status. The original search strategy is given in Appendix A [see Additional file 1]. At first, the search included specifics to identify randomized trials, but the search was broadened to find any articles with the population terms "Duchenne muscular dystrophy", "muscular dystrophy" or "myopathy" and the treatment terms "deflazacort", "21-deacetyl deflazacort" or "steroid". The databases searched were MEDLINE, EMBASE, Current Contents, Dissertation Abstracts, Health Star, PsychINFO and Cochrane. Second, reference lists from the selected trials were reviewed as well as reference sections from textbooks and review articles. Third, four content experts were contacted to try to identify unpublished trials. Finally, the company that is manufacturing DFZ for use in Canada, Aventis, was contacted to identify trials that were conducted internally.

Trial Selection
The authors assessed the title and abstract during the searches and retained references that appeared relevant. Every effort was made to be liberal in the approach to reference selection. A similar approach was taken with reference list searching. Once a reference was selected the full text article was obtained and assessed for eligibility criteria.
Trials were eligible for inclusion in the systematic review if they met the following criteria: 1) A randomized controlled trial comparing DFZ with placebo or another standard therapy; 2) The participants were male children age 2-18 years with genetic or muscle biopsy proven DD; 3) The outcome measure was either (a) any form of strength testing, including functional testing, or (b) any form of side effect.

Trial Quality
All trials meeting eligibility criteria were then assessed for quality using two measures. The first was the Jadad scale [10], which assigns one point each for a trial described as 1) randomized, 2) double-blind, and 3) that outlines withdrawals and drop-outs. An additional point is given if the method of randomization and double-blinding (one point each) is described and is correct and conversely a point is deducted for each aspect if the method is incorrect. A total score of five is possible with a score of three or greater considered high quality. The second was a rating of treatment allocation concealment including three categories: adequate concealment, inadequate concealment and unclear [11]. The quality ratings were done by both authors independently and one author (PJ) was blinded to the study authors and journal of publication. The Jadad scale and rating of allocation concealment were applied to abstracts, as well, recognizing that these scales have not been developed nor validated for this purpose but finding no other suitable way to assess the quality of abstracts.
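The Jadad scoring rules described above can be sketched in a few lines of Python. This is an illustrative sketch only: the class and field names are hypothetical, and applying the method bonus or deduction only when the trial is described as randomized (or double-blind) is an assumption rather than something stated in this review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrialReport:
    """Features of a trial report needed for scoring (hypothetical field names)."""
    described_as_randomized: bool
    described_as_double_blind: bool
    withdrawals_described: bool
    # None: method not described; True: described and appropriate;
    # False: described but inappropriate.
    randomization_method_ok: Optional[bool] = None
    blinding_method_ok: Optional[bool] = None


def _method_adjustment(claimed: bool, method_ok: Optional[bool]) -> int:
    """Return +1 for an appropriate method, -1 for an inappropriate one, else 0."""
    if not claimed or method_ok is None:  # assumption: no adjustment without the base claim
        return 0
    return 1 if method_ok else -1


def jadad_score(report: TrialReport) -> int:
    """Total score on the 0 to 5 scale described in the text."""
    score = (int(report.described_as_randomized)
             + int(report.described_as_double_blind)
             + int(report.withdrawals_described))
    score += _method_adjustment(report.described_as_randomized,
                                report.randomization_method_ok)
    score += _method_adjustment(report.described_as_double_blind,
                                report.blinding_method_ok)
    return score


def is_high_quality(score: int) -> bool:
    """A score of three or greater is considered high quality."""
    return score >= 3
```

For example, a report described as randomized that also accounts for dropouts, but gives no methods and no blinding, would receive two points under this sketch and fall below the high quality threshold.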

Data Abstraction
One author (CC) reviewed all studies meeting eligibility criteria and abstracted data in the following areas: study description (language, year, centres, publication status, funding source); trial design (design, quality, sample size); population (method of diagnosis, age, stage of disease at baseline); intervention (dosage, timing, duration, compliance, co-interventions); comparator (placebo vs. drug, dose, timing, duration); outcomes (strength measures, task speed, change over baseline); adverse effects (frequency and severity of weight gain, osteoporosis, cataract, etc.).
In cases where little numerical data was presented the study authors were contacted and asked to provide quantitative summary data with standard deviations. In cases where no standard deviations were given these were imputed based on presented p values assuming they were derived by t tests with common variance.
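One way such an imputation could be carried out is sketched below in Python. It assumes a two-sample t test with common variance, so that t = (difference in means) / (pooled SD × sqrt(1/n1 + 1/n2)), and solves this for the pooled SD. The function names are hypothetical, and the numerical integration and bisection are a standard-library stand-in for a statistical package, not the procedure the review authors necessarily used.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t: float, df: int) -> float:
    """Two-sided p value for |t|, integrating the density with Simpson's rule."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    steps = 2 * max(1000, math.ceil(t * 100))  # even number of intervals
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # probability that |T| <= t
    return max(0.0, 1.0 - central_mass)


def t_from_p(p: float, df: int) -> float:
    """Find the |t| whose two-sided p value equals p, by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while two_sided_p(hi, df) > p:  # expand until the root is bracketed
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("p is too small for this numerical sketch")
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if two_sided_p(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def pooled_sd_from_p(mean_diff: float, n1: int, n2: int, p: float) -> float:
    """Back-calculate the common SD from a reported two-sided p value."""
    t = t_from_p(p, n1 + n2 - 2)
    return abs(mean_diff) / (t * math.sqrt(1 / n1 + 1 / n2))
```

When a report gives only an inequality such as p < 0.05, using the bound itself yields the largest standard deviation compatible with the report, which is the conservative choice.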

Outcomes
Outcome data included any measure of strength, including direct muscle strength testing, timed functional tasks and time to loss of ambulation. Outcomes in trials comparing DFZ to placebo were examined separately from those comparing DFZ to another steroid. Data for side effect and safety assessment was primarily focused on weight gain; however, other side effects were included when reported.

Sensitivity, Subgroup Analysis and Publication Bias
In the initial planning of the systematic review the goal was to explore differences in trials with respect to quality and design characteristics such as blinding. A secondary goal of the review was to present findings in different age groups and stages of disease. Because there were relatively few studies, and the data were not presented in a form amenable to such analysis, these analyses could not be completed.
The assessment of publication bias was abandoned given the identified publication issues in this particular group of studies. This will be discussed further in the results and discussion.

Results
The citations identified during the various search techniques are presented in Table 1. For electronic database searching a broad search strategy was used, as a narrow search strategy (including filters for randomized trials) identified very few studies to review. Reassuringly, although the broad search strategy in MEDLINE initially elicited 15 times more studies, exactly the same relevant studies were identified using both techniques. As might be expected given its thorough hand searching techniques, the Cochrane Library was the most fruitful database to search and had the highest ratio of potentially relevant studies to identified citations. Cochrane searching also identified four abstracts, none of which were identified in other searches.
Searching article and textbook reference lists identified two additional references, however neither resulted in studies meeting eligibility criteria. Two of four content experts responded to our correspondence, with one identifying a study not previously identified via other methods. Aventis did not know of any trials, conducted or sponsored, that were not already in the literature.
After eliminating duplicate studies identified via different search techniques, 15 studies of potential relevance were collected in full text; the study characteristics are listed in Table 2. One non-English language trial (Study J) was not retrieved. This decision was not based on the language of publication; rather, the report appeared to be a preliminary report on a trial that was also published several times in English (Studies D, H, L and O). Ultimately, five of these studies met the inclusion criteria (Studies A, B, F, G and N). Inclusion criteria violations are reported for the other 11 studies in Table 2.
The characteristics of the studies meeting inclusion criteria are presented in Table 3. The included studies were published between 1994 and 2000. All were published in English language journals, with three of the five reports appearing in one journal. Three of the trials were described as double blind while the two others made no mention of blinding at all. None of the trials provided information on pre-defined clinically important differences or on sample size calculations. One trial used 2:1 randomization into treatment and control groups respectively, but no reason for this design was provided. One trial used a four-arm randomization with placebo, prednisone and two doses of DFZ. In that study, the placebo group was dismantled at three months and these patients were rerandomized into the other arms.
A total of 291 patients were randomized; however, because two trials did not specify the size of the groups, it is not clear how many received treatment compared to control. One trial made up 73% (n = 196) of the total participant number. Four of the included trials appeared to arise from one group of investigators, and these appeared to be financially supported by the same agency.
The Jadad scale score and allocation concealment status are given in Table 3. Quality assessment was consistent between authors, with identical scores on four of five studies. The disagreement concerned the description of dropouts in Study B, with one author assigning no point and the other one point. Following review of the article, the consensus was that an adequate account of dropouts had been given in the text, and the study was given one point for this criterion. Two studies received a score of 3 on the Jadad scale. None of the studies made any comment on allocation concealment techniques. Attempts to clarify methodological information from the authors of the two studies found only in abstract form were unsuccessful.
The dose of DFZ was relatively uniform over the four trials, ranging between the equivalent of 0.9 and 1.2 mg/kg each day. In only one trial was any mention made of co-interventions, and in this case it was dietary advice to help control weight gain. Three trials compared DFZ to prednisone. Of these, two trials reported follow-up data to 12 months and one to three months. Three trials compared DFZ to placebo. One had 24 months of follow-up; however, marked dropout occurred in the second year, so reliable data was available only to 12 months. The second trial comparing DFZ to placebo was the four-arm trial (Study N), in which the placebo arm was discontinued at three months because of substantial differences in strength favoring steroids.
Fourteen different strength measurements were used in the four trials. As anticipated, three major categories of testing were evident: (1) Direct strength measures, (2) Functional measures and (3) Time to loss of ambulation. Table 4 describes each outcome measure and in which study it was used.

The results of the three trials comparing DFZ and prednisone are given in Table 5. Not one measure of strength overlapped between studies. Only two studies reported numerical forms of summary statistics but none of the studies reported standard errors. Only one study provided p-values for statistical significance but these were for the steroid groups versus placebo. In this case standard errors were imputed, and sample size had to be estimated, but given the similarity between the point estimate for mean change in strength between the placebo and prednisone group the standard errors would have had to be quite small to invoke comparing the values through a t-test. Because of the limited data and heterogeneity in strength measures meta-analysis could not be performed. Two of the studies were presented in the form of abstracts and this may have limited the ability to present appropriate figures for meta-analysis. The author of the largest study was contacted in order to obtain original data but indicated that the data was unavailable for analysis.
Similar difficulties arose in comparing the results in the three studies using placebo as the control, as seen in Table 6. Although one study used a wide array of strength measures and included standard errors and p values, the other studies only provided point estimates or no numeric values. Attempts to convert the data to mean percentage change from baseline were not successful, as one study did not present the baseline values. Again, it was impossible to use this data for meta-analysis. The results from Study B provided the most thorough and convincing evidence for the superiority of DFZ to placebo. In seven of the 11 tests performed to measure strength there was a significant difference in favor of DFZ, and in the other four tests, which did not reach statistical significance, the point estimates for change from baseline all favored DFZ. The most clinically important result was the 12.7 month difference in time to wheelchair use between the groups.
Table 7 outlines the five studies that reported on side effects. Measures of side effects focused on weight gain, as expected, but other documented side effects included osteoporosis, hirsutism, cushingoid appearance, increased appetite, behavior changes and cataract formation. All five studies commented on weight gain. Two studies did not find a significant mean difference in weight gain compared to placebo but did not present the numerical values. This result was surprising, as these studies compared DFZ to placebo. The other two studies found an 11.3% and an 8.4% difference in mean weight gain (expressed as a percentage increase from baseline weight) between the DFZ and prednisone groups, favoring DFZ. The final study reported the difference between prednisone and DFZ as minimal. No standard errors were presented in these studies, and in one study group sizes were unknown, so these results were not combined using meta-analytic techniques.

Discussion
This review included five randomized controlled trials comparing DFZ treatment to a control group. One further RCT was identified (Studies D, H, L, O and likely J) but was not included because the outcomes have never been published. These six studies have included 391 participants in at least three different countries. Unfortunately, the two largest trials, including 196 [12] and 100 [13] patients, have never been published in journal article format for peer review. The first of these two trials (Study N) was published in abstract form but, as was demonstrated in Tables 5 and 6, the reporting of the trial in this format did not allow the reader to gain insight into the efficacy of DFZ. The second large trial (Studies D, H, L, O and likely J), a multi-centre study, was last reported on in 1998 and compared prednisolone with DFZ. At the time of the meeting proceeding the randomization code had been broken; however, the proceeding gave no numeric indication of the results and suggested that "both steroids slowed the degradation equally well" [13]. The investigator of this study confirmed that the results remain unpublished and that data is not currently available for use in meta-analysis.
The fact that data remains unpublished from these two trials is unfortunate from both a clinical and a research ethics perspective. As this review has demonstrated, there is no convincing body of evidence to support either the superiority or the equivalence of prednisone and DFZ for improving or maintaining strength or function in children with DD. This issue would certainly be clarified by publication of this data. Access to DFZ is difficult in Canada and the United States as there is currently very little information to support its use. From an ethical perspective, participants in research studies deserve to have the results of their trials used to improve care for others in similar circumstances. In this scenario the participants have not been well served for the risks they incurred by entering the studies. From a methodologic perspective, the findings of this review support the belief that unpublished data should be sought and included in systematic reviews and meta-analyses [14].
Table 4: Strength outcome measures and the studies in which each was used

Direct Strength Measures
MRC score: The sum of MRC scale grades in four muscles, two in the right upper extremity and two in the right lower extremity. Maximum score 20. [32] Studies A, G
MRC index: The MRC score divided by the maximum possible score of 20, then multiplied by 100 to give a percentage representation of the MRC score. [32] Study B
Average muscle score: Called Strength score in previous trials. [33] This is the mean grade of 34 tested muscles on a ten point scale (modified from the MRC system).

Functional Measures
Timed gait: Number of seconds needed to walk 10 metres. Studies A, B
Timed stairs: Number of seconds needed to walk up four stairs. Studies A, B
Timed chair: Number of seconds needed to rise from seated in a chair to standing. Studies A, B
Timed Gower: Number of seconds needed to rise from a sitting position on the floor to standing. Studies A, B
Graded gait: A score assigned for 10 m of walking. Points assigned for increasing dysfunction with a range between zero and seven. [32] Studies A, B
Graded stairs: A score assigned for walking up four stairs. Points assigned for increasing dysfunction with a range between zero and seven. [32] Studies A, B
Graded chair: A score assigned for rising from a chair to standing. Points assigned for increasing dysfunction with a range between zero and six. [32] Studies A, B
Graded Gower: A score assigned for rising from a sitting position to standing. Points assigned for increasing dysfunction with a range between zero and seven. [32] Studies A, B

Others
Patient reported benefit: No description of measure available.

Of the studies eligible under the criteria and mandate of this review, only one study provided evidence of the efficacy [...] (Study B). This study had a placebo control and used 11 different methods of measuring efficacy. The sample size was relatively small, 28 patients in total, but the study was able to demonstrate significant differences between placebo and DFZ on seven of 11 strength measures. The other four tests all showed point estimates favoring DFZ. The second study demonstrating efficacy of DFZ over placebo (Study N) followed the placebo group for only three months, and the third study in this group claimed an improvement in strength. Unfortunately, given the minimal information in these latter two studies, the results could not be synthesized by meta-analysis. The evidence from these three randomized trials suggests a clinically and statistically significant benefit on measures of strength with the use of DFZ compared to no treatment (placebo).
The other eligible studies (Studies A, F and N) all compared DFZ to prednisone and, as demonstrated in Table 5 and discussed in the Results section, no reasonable conclusions can be drawn from these studies. None of the studies described what was considered a minimal clinically important difference or the power of the study to detect it. All the authors concluded that there was no substantial difference between DFZ and prednisone, with one author (Study A) stating that the two therapies were equally effective.
Weight gain was the side effect that received the most focus in the studies examined. Two studies (Studies A and N) provided numerical comparisons with statistical tests comparing the mean weight gain as a percentage of baseline weight. These two studies both found an approximately ten percent difference in weight gain favoring DFZ over prednisone. No age by treatment or severity by treatment effects were reported to determine who is at the greatest risk for weight related problems. Thus, the evidence from these two trials shows that weight gain is less of a problem on DFZ, and this should be considered in therapeutic decisions. Unfortunately, other side effects were infrequent and not often reported by treatment group, making conclusions difficult. A body of anecdotal and observational literature exists regarding side effects, but at the current time conclusions cannot be drawn from randomized trial sources or meta-analysis.

Table 5, Other Measures
Patient reported benefit: Patients reported benefit on both prednisone and DFZ. No numeric representation of benefit was reported.

Table 6, Other Measures
Time to loss of ambulation: DFZ 33.2 +/- 9 months; placebo 20.5 +/- 11 months; p < 0.005

Notes to Tables 5 and 6: A dash represents that this outcome measure was not used. MRC = Medical Research Council. * For MRC score, MRC index and muscle score, higher scores are more favorable. ** For Graded scores, a lower score is more favorable.
Note to Table 7: A dash represents that this side effect was not reported.

Quality assessment of randomized controlled trials for the purpose of systematic review and meta-analysis is difficult with numerous scales available [15]. In this case the Jadad scale was chosen as it is a well validated scale [10]. An indication of allocation concealment was also used as is suggested in the Cochrane guidelines [16]. The studies identified for this review were generally of low quality with an average Jadad score of 1.8 and none commenting on allocation concealment. This may be primarily due to the fact that three of the studies, with scores of 2 (Study N), 1 (Study F) and 0 (Study G) were available only in abstract form, which is not what the Jadad scale was ideally intended for. Furthermore, the trials were largely published prior to the widespread use of CONSORT guidelines. The quality of reporting of summary statistics was poor, perhaps even more so than expected from the Jadad scores. The data presentation and lack of summary statistics in most cases did not allow the reader a thorough exploration of the outcomes, and certainly provided no ability to carry out a meta-analysis. The clinical conclusions from the individual studies included in this review must be interpreted with the knowledge that some evidence exists to suggest that studies obtaining lower quality scores may be associated with increased estimate of benefit [17].
As in any systematic review, the limitations in drawing conclusions are directly dependent on the thoroughness of reporting in identified studies. Despite an intention to synthesize the data using meta-analytic techniques, this could not be accomplished due to the inability to obtain and extract data. In this respect the review is certainly limited, and any strong conclusions or changes in clinical practice should be tempered by the realization that the reporting of trial results was such that summary statistics were not available or consistently presented. This review does not use the flow diagram format suggested in the QUOROM statement, as it was felt that with such a small number of relevant studies a tabular format would be more informative. The approach to the reporting of the review otherwise adheres to the QUOROM statement [9].
Clearly, the most important component needed to determine the efficacy and safety of DFZ is the publication of the two large trials identified in this review. If these studies do not provide a meaningful answer to the efficacy and safety issues, then a further randomized trial would be necessary. A systematic review of observational studies, despite being practically and methodologically more challenging, may help to clarify the situation. Although randomized controlled trials are considered the highest quality of study, high quality observational studies may not overestimate treatment effects to a point of concern [18].