3:35pm - 3:55pm A non-parametric proportional risk model to assess a treatment effect in an application to randomized controlled trials
Lucia Ameis1, Oliver Kuß2, Annika Hoyer3, Kathrin Möllenhoff1
1Institute of Medical Statistics and Computational Biology (IMSB), Faculty of Medicine, University of Cologne, Cologne, Germany; 2German Diabetes Center, Leibniz Institute for Diabetes Research at Heinrich Heine University Düsseldorf, Institute for Biometrics and Epidemiology, Düsseldorf, Germany; 3Biostatistics and Medical Biometry, Medical School OWL, Bielefeld University, Bielefeld, Germany
Time-to-event analysis often relies on prior parametric assumptions or, if a non-parametric approach is chosen, on Cox’s proportional hazards model, which is inherently tied to the assumption of proportional hazards. Any violation of these assumptions limits the quality of the results. In particular, the proportional hazards assumption has recently been criticized for being rarely verified. In addition, most interpretations focus on the hazard ratio, which is often misinterpreted as the relative risk and has the restriction of being a conditional measure. Our approach introduces an alternative to the proportional hazards assumption and allows direct estimation of the relative risk as well as of the number needed to harm, an absolute measure, and therefore permits an easy and holistic interpretation.
In this talk, we propose a new non-parametric estimator to assess the relative risk of two groups to experience an event under the assumption that this relative risk is constant over time, namely the proportional risk assumption. More precisely, we first estimate the cumulative distribution functions of both groups by means of the Kaplan-Meier estimator and then combine their ratios at different time points to estimate the mean relative risk. We then combine the result with one of the estimated cumulative distribution functions to assess the number needed to harm. This makes it possible to interpret the treatment effect solely on the basis of Kaplan-Meier estimators and provides a flexible alternative to Cox's model if the proportional hazards assumption is violated.
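The two-step idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' estimator: the function names (`km_cdf`, `mean_relative_risk`, `number_needed_to_harm`) are hypothetical, the ratios are combined by a plain unweighted mean over a user-chosen time grid, and the number needed to harm is taken as the reciprocal of the risk difference implied by a constant relative risk.

```python
import numpy as np

def km_cdf(time, event, grid):
    """Kaplan-Meier estimate of F(t) = 1 - S(t) at each time point in grid."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])  # sorted distinct event times
    surv, steps = 1.0, []
    for t in event_times:
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & event)
        surv *= 1.0 - deaths / at_risk
        steps.append(surv)
    # Survival is 1 before the first event time, then a right-continuous step function.
    surv_path = np.concatenate(([1.0], np.array(steps)))
    idx = np.searchsorted(event_times, grid, side="right")
    return 1.0 - surv_path[idx]

def mean_relative_risk(time1, event1, time0, event0, grid):
    """Unweighted mean of F1(t) / F0(t) over the grid (assumed combination rule)."""
    f1 = km_cdf(time1, event1, grid)
    f0 = km_cdf(time0, event0, grid)
    keep = f0 > 0  # the ratio is undefined where the reference risk is zero
    return float(np.mean(f1[keep] / f0[keep]))

def number_needed_to_harm(rr, f0_t):
    """1 / (F1(t) - F0(t)) with F1(t) = rr * F0(t); negative values indicate benefit."""
    return 1.0 / ((rr - 1.0) * f0_t)

# Toy data without censoring: the exposed group fails twice as fast.
grid = [1.0, 2.0]
rr = mean_relative_risk([1, 1, 2, 2], [1, 1, 1, 1], [1, 2, 3, 4], [1, 1, 1, 1], grid)
f0_at_2 = km_cdf([1, 2, 3, 4], [1, 1, 1, 1], [2.0])[0]
nnh = number_needed_to_harm(rr, f0_at_2)
```

In practice the choice of time grid and of the weighting across time points matters, and a variance estimate would be needed for inference; neither is addressed in this sketch.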
We demonstrate the validity of the approach by means of a simulation study and present an application to data from a large randomized controlled trial investigating the effect of dapagliflozin on all-cause mortality.
3:55pm - 4:15pm Exhausting the type I error level in a group-sequential design with a closed testing procedure for progression-free and overall survival
Moritz Fabian Danzer1, Kaspar Rufibach2, Jan Beyersmann3, Rene Schmidt1
1University of Münster, Germany; 2Merck KGaA, Darmstadt, Germany; 3University of Ulm, Germany
In the planning and analysis of clinical trials with time-to-event endpoints, multiple outcomes that reflect the course of disease can be of high importance. This applies in particular to oncological trials in which both overall survival (OS) and progression-free survival (PFS) may be used as confirmatory endpoints where appropriate. In such settings, which the health authorities refer to as multiple primary endpoints, control of the family-wise error rate (FWER) can be of great importance [1,2].
Recently, a planning approach has been proposed that exploits the relationship between PFS and OS to potentially gain efficiency in trial designs [3]. In particular, it takes into account that information on the two endpoints may accrue at different speeds and that assuming proportional hazards for both endpoints simultaneously is not realistic. In such a design, the FWER can be controlled in a conservative way by splitting the overall significance level between the two endpoints in a weighted Bonferroni procedure.
In this work, we investigate the extent to which uniform improvements to this approach can be achieved. To this end, we identify two methods that can also be combined with each other. First, we want to exploit the dependence between the endpoints in the calculation of rejection bounds for endpoint-specific log-rank tests. For this, a model-free characterization of the joint distribution of the test statistics across different analysis times and endpoints will be crucial [4]. Second, we want to show to what extent this dependence can also be exploited in the context of a closed testing procedure within a group-sequential design [5].
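To illustrate why a closed testing procedure can uniformly improve on a weighted Bonferroni split, the following minimal sketch compares the two for a single analysis with two p-values. It is a deliberate simplification: it ignores the group-sequential structure and the dependence between the test statistics that the abstract proposes to exploit, and the function names and the equal default weights are assumptions made for illustration.

```python
def weighted_bonferroni(p_pfs, p_os, alpha=0.05, w_pfs=0.5):
    """Reject each endpoint only at its own share of the overall level."""
    w_os = 1.0 - w_pfs
    return {"PFS": p_pfs <= w_pfs * alpha, "OS": p_os <= w_os * alpha}

def closed_test(p_pfs, p_os, alpha=0.05, w_pfs=0.5):
    """Closed testing with a weighted Bonferroni test for the intersection.

    An elementary hypothesis is rejected if the intersection hypothesis is
    rejected and the elementary hypothesis itself is rejected at full level alpha.
    """
    w_os = 1.0 - w_pfs
    intersection = (p_pfs <= w_pfs * alpha) or (p_os <= w_os * alpha)
    return {
        "PFS": intersection and p_pfs <= alpha,
        "OS": intersection and p_os <= alpha,
    }

# Hypothetical p-values: OS misses its Bonferroni share but not the full level.
bonf = weighted_bonferroni(p_pfs=0.01, p_os=0.04)
closed = closed_test(p_pfs=0.01, p_os=0.04)
```

Every rejection made by the weighted Bonferroni split is also made by the closed test, while the reverse does not hold, which is the sense in which the improvement is uniform. Replacing the Bonferroni intersection test by one that uses the joint distribution of the statistics would be a further step, not shown here.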
While we will initially limit ourselves to calculations and simulations within our specific example of PFS and OS, the basic approach should also be applicable to other cases.
References: [1] FDA (U.S. Food and Drug Administration). Multiple endpoints in clinical trials: guidance for industry. 2017. [2] EMA (European Medicines Agency). Guideline on multiplicity issues in clinical trials. 2017. [3] Erdmann A, Beyersmann J, Rufibach K. Oncology clinical trial design planning based on a multistate model that jointly models progression-free and overall survival endpoints. arXiv preprint 2023. [4] Lin DY. Nonparametric sequential testing in clinical trials with incomplete multivariate observations. Biometrika 1991; 78(1):123-131. [5] Anderson KM, Guo Z, Zhao J, Sun LZ. A unified framework for weighted parametric group sequential design. Biometrical Journal 2022; 64(7):1219-1239
4:15pm - 4:35pm Sample size calculation based on differences of quantiles from right-censored data
Beatriz Farah1,2,5, Aurélien Latouche2,4, Olivier Bouaziz3, Xavier Paoletti2,5
1Université Paris Cité, France; 2Institut Curie, France; 3Université de Lille, France; 4Conservatoire National des Arts et Métiers, France; 5Université de Versailles Saint-Quentin-en-Yvelines, France
When evaluating a treatment effect, it is common to rely on the hazard ratio, typically by using the ubiquitous Cox model. In the presence of right-censoring, the hazard rate can be easily estimated from the observed data, which makes this model very appealing. In randomized clinical trials (RCTs), standard methods already exist to determine the sample size when the estimand is a hazard ratio (typically the hazard ratio comparing the effects of two treatments), based on either the log-rank test or the Cox model. However, in cancer studies, some treatments may have a late effect, and the proportional hazards assumption imposed by the Cox model no longer holds. We would therefore like to shift from the hazard ratio to the difference in quantiles of the failure time as the estimand because:
- It allows for quantifying different treatment effects across quantiles;
- Quantile regression doesn’t assume proportional hazards, making it appropriate for analyzing delayed treatment effects associated with immunotherapy;
- It offers a clinically interpretable way to measure the benefit of one treatment over another as a function of time.
Our goal is to propose a sample size formula for evaluating treatment effects by comparing pre-specified quantiles in each treatment group. A versatile method for testing equality of quantiles was proposed by Kosorok (1999); it allows either testing different quantiles simultaneously or testing the same quantile at different analysis times in a group-sequential clinical design. This method requires an estimator of the density of the distributions at the quantiles, for which we propose a Gaussian resampling method inspired by Lin et al. (2015). We studied the effect of the variance of the generated normal variables on the estimation of the density and showed that this parameter influences the quality of the estimation. As a result, we developed a method to choose this variance parameter from the data in an efficient way.
We propose an explicit expression for the power of the test, which allows us to derive a formula for computing the minimal sample size. Extensive simulation studies that compare the power of the proposed method with other approaches from the literature are also presented.
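The link between a power expression and a minimal sample size can be illustrated with the generic normal-approximation argument below. This is not the formula proposed in the talk: it assumes equal allocation, a two-sided test of a single quantile difference `delta`, and that the asymptotic variances `var0` and `var1` of the scaled quantile estimators in each arm are already known (in practice they depend on the density at the quantile, which is why a density estimator is needed). All names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def power_quantile_difference(n_per_arm, delta, var0, var1, alpha=0.05):
    """Approximate power of a two-sided Wald-type test of a quantile difference.

    var0 and var1 are the assumed asymptotic variances of sqrt(n) * (q_hat - q)
    in each arm; the negligible rejection probability in the opposite tail is ignored.
    """
    z = NormalDist()
    se = sqrt((var0 + var1) / n_per_arm)
    return z.cdf(abs(delta) / se - z.inv_cdf(1.0 - alpha / 2.0))

def sample_size_per_arm(delta, var0, var1, alpha=0.05, power=0.8):
    """Smallest n per arm reaching the target power under the same approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return ceil((z_alpha + z_beta) ** 2 * (var0 + var1) / delta ** 2)

# Hypothetical inputs: unit variances and a quantile difference of one time unit.
n = sample_size_per_arm(delta=1.0, var0=1.0, var1=1.0)
achieved = power_quantile_difference(n, delta=1.0, var0=1.0, var1=1.0)
```

Inverting the power expression in this way gives the minimal sample size directly; with right-censored data, the variances would additionally reflect the censoring distribution and the accrual pattern.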
4:35pm - 4:55pm One-sample survival tests for non-proportional hazards in oncology clinical trials: a simulation study
Chloé Szurewsky1, Guosheng Yin2, Gwénaël Le Teuff1
1CESP, INSERM U1018, Université Paris-Saclay, France; 2Department of Statistics and Actuarial Science, University of Hong Kong, Pokfulam Road, Hong Kong
In recent years, many one- and two-stage designs for single-arm trials with a time-to-event outcome have been proposed as an alternative to the randomized clinical trial, which may be infeasible for rare diseases in oncology (pediatric studies or personalized medicine). These designs rely on the one-sample log-rank test [1] (OSLRT) and its modified version [2] (mOSLRT) for comparing the survival curve of an experimental group to that of an external reference group. These tests are developed under the proportional hazards (PH) assumption, which may be violated, particularly when evaluating immunotherapies. We propose to adapt the OSLRT and evaluate alternatives for situations where PH does not hold. We extended Finkelstein's score test [1], developed under PH, by using a piecewise exponential [3] model with change-points (CPs) for an early, middle or delayed effect. For crossing hazards, we use an accelerated hazards model. We extended the restricted mean survival time (RMST) based test [4] to single-arm trials and constructed a maximum combination (maxCombo) test [5] combining the mOSLRT and the early- and delayed-effect score tests. The performance (type I error and power) of the developed tests is evaluated through a simulation study of a phase II single-arm trial with an accrual period of 3 years and a follow-up of 4 years. Survival times are generated from an exponential distribution for the reference group, assuming no sampling variability, and from a piecewise exponential distribution for the experimental group. We varied the sample size of the experimental group (from 20 to 200 patients), the exponential censoring rate (from 0 to 35%) and the relative treatment effect (hazard ratio from 0.5 to 1). For illustration, we applied these different approaches to pediatric trials for neuroblastoma.
Simulations show that the score tests are more conservative than the mOSLRT. When the data generation matches the model, the associated score test is the most powerful, even when the CPs are misspecified. The RMST-based test is more powerful than the mOSLRT only for an early effect with a censoring rate below 15%. The maxCombo test is conservative and has higher power than the mOSLRT for a sufficiently large sample size (n>50 or n>100), but lower power than the correctly specified score test under non-PH. To conclude, the score tests are efficient when the approximate values of the CPs are known, and the maxCombo test is an alternative when the CPs are unknown. Further research may be conducted to study the impact of the variability of the reference survival curve and of its survival distribution.
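For readers unfamiliar with the one-sample log-rank test, a minimal sketch under PH with an exponential reference curve is given below. The abstract does not spell out the statistics, so the forms used here are assumptions: the classical statistic (O - E) / sqrt(E) and, for the modified version, the variant with variance (O + E) / 2 that is commonly attributed to it. The function name and the toy data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_log_rank(times, events, ref_rate):
    """One-sample log-rank statistics against an exponential reference.

    times:    observed follow-up times (event or censoring) of the experimental group
    events:   1 if the event was observed, 0 if censored
    ref_rate: constant hazard of the reference group (assumed known, no variability)
    """
    observed = sum(events)
    # Expected events: reference cumulative hazard ref_rate * t summed over subjects.
    expected = ref_rate * sum(times)
    z_classic = (observed - expected) / sqrt(expected)
    z_modified = (observed - expected) / sqrt((observed + expected) / 2.0)
    # One-sided p-values: fewer events than expected favours the experimental group.
    norm = NormalDist()
    return {
        "observed": observed,
        "expected": expected,
        "z_oslrt": z_classic,
        "p_oslrt": norm.cdf(z_classic),
        "z_moslrt": z_modified,
        "p_moslrt": norm.cdf(z_modified),
    }

# Toy data: four patients, two events, reference hazard of 0.2 per year.
res = one_sample_log_rank(times=[2, 4, 6, 8], events=[1, 0, 1, 0], ref_rate=0.2)
```

Both statistics weight all follow-up time equally, which is why their power can suffer under early, delayed or crossing effects; the score, RMST-based and maxCombo tests discussed above are not reproduced in this sketch.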