The Canadian prospective cohort study to understand progression in multiple sclerosis (CanProCo): rationale, aims, and study design

Background
Neurological disability progression occurs across the spectrum of people living with multiple sclerosis (MS). Although there are a handful of disease-modifying treatments approved for use in progressive phenotypes of MS, there are no treatments that substantially modify the course of clinical progression in MS. Characterizing the determinants of clinical progression can inform the development of novel therapeutic agents and treatment approaches that target progression in MS, which is one of the greatest unmet needs in clinical practice. Canada, having one of the world’s highest rates of MS and a publicly-funded health care system, represents an optimal country in which to carry out an in-depth analysis of progression. Accordingly, the overarching aim of the Canadian Prospective Cohort Study to Understand Progression in MS (CanProCo) is to evaluate a wide spectrum of factors associated with the clinical onset and rate of disease progression in MS, and to describe how these factors relate to one another to influence progression.

Methods
CanProCo is a prospective, observational cohort study with investigators specializing in epidemiology, neuroimaging, neuroimmunology, health services research and health economics. CanProCo’s study design was approved by an international review panel composed of content experts and key stakeholders. One thousand individuals with radiologically-isolated syndrome, relapsing-remitting MS, and primary-progressive MS within 10–15 years of disease onset will be recruited from 5 academic MS centres in Canada. Participants will undergo detailed clinical evaluation annually over 5 years (including advanced, app-based clinical data collection). In a subset of participants within 5–10 years of disease onset (n = 500), blood, cerebrospinal fluid, and research MRIs will be collected, allowing an integrated, in-depth evaluation of factors contributing to progression in MS from multiple perspectives. Factors of interest include biological measures (e.g. single-cell RNA-sequencing), MRI-based microstructural assessment, participant characteristics (self-reported, performance-based, clinician-assessed, health-system based), and micro- and macro-environmental factors.

Discussion
Halting the progression of MS remains a fundamental need to improve the lives of people living with MS. Achieving this requires leveraging transdisciplinary approaches to better characterize why clinical progression occurs. CanProCo is a pioneering multi-dimensional cohort study aiming to characterize these determinants to inform the development and implementation of efficacious and effective interventions.

Supplementary Information
The online version contains supplementary material available at 10.1186/s12883-021-02447-7.

Scenario with n=100 RIS
To assess a "micro" or "macro" factor of interest that is a continuous variable, we estimate the odds ratio (OR) that can be detected when the baseline factor level is increased by 1 standard deviation above its mean value. The calculation was adjusted by assuming that the independent variable of interest has a multiple correlation of 0.4 with the other independent variables in the logistic regression (an R-squared of 0.16).

Report Definitions
Power is the probability of rejecting a false null hypothesis. Here it has been set to 80% or 90%.
N is the size of the sample drawn from the population, here set to 100.
P0 is the probability of conversion to MS at the mean of X, where X is the continuous variable to be studied (baseline biomarker). Here set to 50% (at 5 years).
P1 is the probability of conversion to MS when X is increased to one standard deviation above the mean.
Odds Ratio is the odds ratio when P1 is in the numerator. That is, it is [P1/(1-P1)]/[P0/(1-P0)].
R-Squared is the R² achieved when X is regressed on the other independent variables in the regression.
Alpha is the probability of rejecting a true null hypothesis.
Beta is the probability of accepting a false null hypothesis.
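As a minimal sketch of the Odds Ratio definition above, the following Python helpers convert between P0, P1 and the odds ratio. The function names are illustrative and not part of the original report.

```python
def odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p0: float, p1: float) -> float:
    """Odds ratio with P1 in the numerator: [P1/(1-P1)] / [P0/(1-P0)]."""
    return odds(p1) / odds(p0)


def p1_from_odds_ratio(p0: float, or_value: float) -> float:
    """Invert the definition: the probability P1 implied by P0 and an odds ratio."""
    o1 = or_value * odds(p0)
    return o1 / (1.0 + o1)


if __name__ == "__main__":
    # P0 = 50% and P1 = 67% give an odds ratio of about 2.03
    print(round(odds_ratio(0.50, 0.67), 2))
```

The inverse helper is useful when a report quotes only P0 and the odds ratio and the corresponding P1 is needed.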

Summary Statements
A logistic regression of a binary response variable (conversion to MS) on a continuous, normally distributed variable (X) with a sample size of 100 observations achieves 90% power at a 0.05 significance level to detect a change in the probability of conversion to MS after 5 years from the value of 50% at the mean of X to 67% when X is increased to one standard deviation above the mean. This change corresponds to an odds ratio of 2. The calculation was adjusted by assuming that the variable of interest is correlated with the other variables included in the model, with an overall R-squared of 0.16. If we accept a power of 80%, the OR that we will be able to detect is 1.84.
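The figures in this summary can be approximately reproduced with the simplified sample-size formula for a continuous covariate from Hsieh, Bloch & Larsen (Stat Med 1998), n = (z_{1-alpha/2} + z_{power})² / [P0(1-P0)·(ln OR)²], inflated by 1/(1 - R-squared). The sketch below solves that formula for the OR. It assumes this is how the reported values were obtained, which the report does not state explicitly for the continuous case; the function name and arguments are illustrative.

```python
import math
from statistics import NormalDist


def detectable_or_continuous(n, p0, r2=0.0, alpha=0.05, power=0.90,
                             protective=False):
    """Smallest detectable OR per 1 SD increase of a continuous covariate.

    Assumed formula (Hsieh, Bloch & Larsen 1998, simplified form):
        n * (1 - r2) = (z_{1-alpha/2} + z_{power})^2 / (p0 * (1 - p0) * beta^2)
    where beta = ln(OR) and p0 is the event probability at the mean of X.
    """
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha / 2) + z(power)
    n_effective = n * (1 - r2)          # variance inflation for correlated covariates
    beta = z_sum / math.sqrt(n_effective * p0 * (1 - p0))
    return math.exp(-beta) if protective else math.exp(beta)


if __name__ == "__main__":
    for pw in (0.90, 0.80):
        or_value = detectable_or_continuous(100, 0.50, r2=0.16, power=pw)
        p1 = or_value * 1.0 / (1.0 + or_value)  # odds at p0 = 0.5 equal 1
        print(f"power {pw:.0%}: OR about {or_value:.2f}, P1 about {p1:.2f}")
```

With n = 100, P0 = 50% and R-squared = 0.16 this gives an OR close to 2.0 at 90% power and close to 1.84 at 80% power, in line with the values quoted above. Setting `protective=True` returns the reciprocal, which applies when the factor is expected to lower the risk.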
Using the same assumptions, for a binary biomarker, the calculations are as follows:
For this sample size estimation, data from the placebo arms of Phase III clinical trials were used, including patients who were selected for activity according to the mentioned criteria and who were treatment naïve.

According to the previous literature (De Stefano et al, JNNP 2014), the cut-off denoting a pathological percentage brain volume change (PBVC) over 1 year is -0.4% using the SIENA analysis method. According to a more recent paper by Opfer et al (Journal of Neurology 2018), in order to be sure that a patient has a brain volume loss greater than 0.4% over 1 year, the cut-off must be set to -0.94% (to allow for physiological fluctuations and measurement error).
The outcome is therefore defined as a disability progression or a pathological brain volume loss over 1 year. Data from placebo arms of clinical trials indicate that the percentage of patients with a disability progression or a PBVC below -0.94% (that is, a volume loss greater than 0.94%) is around 35% (data on file).
The first case assesses a factor represented by a continuous variable, and we estimate the OR we can detect when the baseline factor level is increased by 1 standard deviation above its mean value. The calculation was adjusted by assuming that the independent variable of interest has a multiple correlation of 0.4 with the other independent variables in the logistic regression (an R-squared of 0.16).

Summary Statements
A logistic regression of progression over 1 year (defined as an EDSS progression event or a PBVC<-0.94%) on a continuous, normally distributed variable (X) with a sample size of 200 observations achieves 90% power at a 0.05 significance level to detect a change in the risk of progression from the value of 35% at the mean of X to the value of 24% when X is increased to one standard deviation above the mean. This change corresponds to an odds ratio of 0.59. For a power of 80% the OR is 0.63.

Summary Statements
A logistic regression of progression over 1 year (defined as an EDSS progression event or a PBVC<-0.94%) on a binary independent variable (X) with a sample size of 200 observations (of which 50% are in the group X=0 and 50% are in the group X=1) achieves 90% power at a 0.05 significance level to detect a change in the risk of progression from the baseline value of 35% to 13.8%. This change corresponds to an odds ratio of 0.297. For a power of 80% the detectable OR is 0.36 (Hsieh, Bloch & Larsen, Stat Med 1998).
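The binary-covariate figures above can be checked against the sample-size formula for a binary covariate in Hsieh, Bloch & Larsen (Stat Med 1998). The sketch below assumes that the same 1/(1 - R-squared) inflation with R-squared = 0.16 was applied here as in the continuous case; the function name and arguments are illustrative.

```python
import math
from statistics import NormalDist


def required_n_binary(p_x0, p_x1, frac_x1=0.5, r2=0.0, alpha=0.05, power=0.90):
    """Total sample size for logistic regression on a binary covariate X.

    p_x0, p_x1 : event probabilities in the groups X=0 and X=1
    frac_x1    : proportion of the sample with X=1 (B in Hsieh et al.)
    r2         : R-squared of X regressed on the other covariates (assumed inflation)
    """
    z = NormalDist().inv_cdf
    b = frac_x1
    p_bar = (1 - b) * p_x0 + b * p_x1            # overall event probability
    null_term = z(1 - alpha / 2) * math.sqrt(p_bar * (1 - p_bar) / b)
    alt_term = z(power) * math.sqrt(
        p_x0 * (1 - p_x0) + p_x1 * (1 - p_x1) * (1 - b) / b
    )
    n = (null_term + alt_term) ** 2 / ((p_x0 - p_x1) ** 2 * (1 - b))
    return n / (1 - r2)


if __name__ == "__main__":
    # Baseline risk 35% versus 13.8%, balanced groups, 90% power
    print(round(required_n_binary(0.35, 0.138, r2=0.16)))
```

Under these assumptions the required total sample size comes out very close to the 200 observations stated in the summary.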

Subcohort 3: PPMS within 5 years of onset (n=100)
Outcome of interest: disease "progression" (as defined by EDSS according to typical clinical trial criteria OR brain atrophy rate).
SUMMARY: The detectable OR (difference in effect between people with PPMS who develop progression vs. those who do not) is 0.49-0.54 for a continuous variable and 0.17-0.23 for a binary variable, which is reasonable for most imaging and biological measures.
For this sample size estimation, data from a PPMS clinical trial were used, selecting patients with less than 5 years of disease duration (Sormani, personal data on file).
The outcome is therefore defined as a disability progression or a pathological brain volume loss over 1 year. Data from the PPMS trial indicate that the percentage of patients with a disability progression or a PBVC below -0.94% (that is, a volume loss greater than 0.94%) is around 40% (data on file).
The first case assesses a factor represented by a continuous variable, and we estimate the OR we can detect when the baseline factor level is increased by 1 standard deviation above its mean value. The calculation was adjusted by assuming that the independent variable of interest has a multiple correlation of 0.4 with the other independent variables in the logistic regression (an R-squared of 0.16).

Summary Statements
A logistic regression of progression over 1 year (defined as an EDSS progression event or a PBVC<-0.94%) on a continuous, normally distributed variable (X) with a sample size of 100 observations achieves 90% power at a 0.05 significance level to detect a change in the risk of progression from the value of 40% at the mean of X to the value of 24.5% when X is increased to one standard deviation above the mean. This change corresponds to an odds ratio of 0.49. For a power of 80% the OR is 0.54.

Summary Statements
A logistic regression of progression over 1 year (defined as an EDSS progression event or a PBVC<-0.94%) on a binary independent variable (X) with a sample size of 100 observations (of which 50% are in the group X=0 and 50% are in the group X=1) achieves 90% power at a 0.05 significance level to detect a change in the risk of progression from the baseline value of 40% to 10%. This change corresponds to an odds ratio of 0.17. For a power of 80% the detectable OR is 0.23.
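The detectable odds ratios for a binary variable can also be obtained directly by inverting the sample-size formula: fix n and search for the risk in the X=1 group at which the required sample size equals n. The sketch below does this by bisection, again assuming the Hsieh, Bloch & Larsen (Stat Med 1998) formula with a 1/(1 - R-squared) inflation at R-squared = 0.16, and assuming that the required sample size grows steadily as the two risks approach each other. All function names are illustrative.

```python
import math
from statistics import NormalDist

Z = NormalDist().inv_cdf


def required_n(p_x0, p_x1, power, frac_x1=0.5, r2=0.16, alpha=0.05):
    """Required total n for a binary covariate (assumed Hsieh et al. formula)."""
    b = frac_x1
    p_bar = (1 - b) * p_x0 + b * p_x1
    null_term = Z(1 - alpha / 2) * math.sqrt(p_bar * (1 - p_bar) / b)
    alt_term = Z(power) * math.sqrt(
        p_x0 * (1 - p_x0) + p_x1 * (1 - p_x1) * (1 - b) / b
    )
    return (null_term + alt_term) ** 2 / ((p_x0 - p_x1) ** 2 * (1 - b)) / (1 - r2)


def detectable_risk(n, p_x0, power):
    """Largest risk below p_x0 in group X=1 that is still detectable with n subjects."""
    lo, hi = 1e-9, p_x0 - 1e-9
    for _ in range(80):
        mid = (lo + hi) / 2
        if required_n(p_x0, mid, power) > n:
            hi = mid    # risks too close together: n subjects are not enough
        else:
            lo = mid    # detectable: try a smaller difference
    return lo


def odds_ratio(p0, p1):
    """Odds ratio with p1 in the numerator."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


if __name__ == "__main__":
    for pw in (0.90, 0.80):
        p1 = detectable_risk(100, 0.40, pw)
        print(f"power {pw:.0%}: risk about {p1:.3f}, OR about {odds_ratio(0.40, p1):.2f}")
```

For n = 100 and a baseline risk of 40%, this search returns odds ratios close to the 0.17 (90% power) and 0.23 (80% power) reported above.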