1:45pm - 2:05pm Bootstrap-based inference for Pseudo-value regression models
Simon Mack1,2,3, Dennis Dobler1,3, Morten Overgaard4
1TU Dortmund University, Germany; 2Otto von Guericke University Magdeburg, Germany; 3Research Center Trustworthy Data Science and Security, Germany; 4Department of Public Health - Department of Biostatistics, Aarhus University, Denmark
Generalized estimating equations (GEE) are a popular method for modelling the effects of covariates on various estimands; they rely only on the specification of a functional relationship, without the need for restrictive distributional assumptions. However, if the response variable is not fully observable, e.g. in the case of time-to-event data, the GEE approach is not directly applicable. Andersen et al. (2003) proposed to replace the partially unobservable response variables by jackknife pseudo-observations, and Overgaard et al. (2017) showed that the resulting parameter estimates are consistent and asymptotically normal under very general conditions. For further inference about the parameter vector, an estimator of the asymptotic covariance matrix is necessary. But due to the dependence of the pseudo-observations, the limiting covariance matrix is highly complicated and the usual sandwich estimator seems to be inconsistent (Jacobsen and Martinussen, 2016; Overgaard et al., 2018). Overgaard et al. (2017) proposed an alternative estimator which incorporates the dependence of the pseudo-observations and performs well in medium to large samples. These results would in principle allow for the construction of tests for general linear hypotheses about the parameters. So far, however, mainly confidence intervals for individual parameters or simple contrasts, e.g. risk differences, have been considered. In this talk we aim to bridge this gap by introducing different test statistics for general linear hypotheses in pseudo-value regression models. To improve the small-sample performance of these tests, we discuss different bootstrap methods for pseudo-observations as well as possible extensions to multiple testing problems and simultaneous confidence intervals for contrasts.
Acknowledgements: We would like to thank Marc Ditzhaus for his invaluable collaboration and guidance in the early phase of this work. Sadly, he passed away and could not complete this work together with us.
References:
Per Kragh Andersen, John P. Klein, and Susanne Rosthøj. "Generalised linear models for correlated pseudo-observations, with applications to multi-state models." Biometrika 90.1 (2003): 15-27.
Morten Overgaard, Erik Thorlund Parner, and Jan Pedersen. "Asymptotic theory of generalized estimating equations based on jack-knife pseudo-observations." The Annals of Statistics 45.5 (2017): 1988-2015.
Martin Jacobsen and Torben Martinussen. "A note on the large sample properties of estimators based on generalized linear models for correlated pseudo-observations." Scandinavian Journal of Statistics 43.3 (2016): 845-862.
Morten Overgaard, Erik Thorlund Parner, and Jan Pedersen. "Estimating the variance in a pseudo-observation scheme with competing risks." Scandinavian Journal of Statistics 45.4 (2018): 923-940.
2:05pm - 2:25pm Implications of Pseudo-Observations in Prognostic Modelling: Addressing Left Truncation.
Nickson Murunga, Sarah Booth, Mark Rutherford
Biostatistics Research Group, Department of Population Health Sciences, University of Leicester
Background: Many time-to-event prognostic models use the Cox model, commonly assuming constant covariate effects over the full follow-up period (1). However, when the interest lies in estimating prognosis at specific future time points after diagnosis, pseudo-observations (POs), introduced by Andersen et al., can be used to directly model covariate effects on survival at the time point of interest (2).
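As an illustration of the jackknife construction behind POs, the following is a minimal sketch for the survival probability at a single time point under right censoring only. The function names `km_survival` and `pseudo_observations` are hypothetical and chosen for this example; the sketch is not the implementation used in the study.

```python
import numpy as np

def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) for right-censored data (no delayed entry)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Multiply the conditional survival factors over distinct event times up to t.
    for u in np.unique(time[event & (time <= t)]):
        at_risk = np.sum(time >= u)
        deaths = np.sum(event & (time == u))
        surv *= 1.0 - deaths / at_risk
    return surv

def pseudo_observations(time, event, t):
    """Jackknife pseudo-observations: n * S_hat(t) - (n - 1) * S_hat^(-i)(t)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = km_survival(time, event, t)
    po = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # leave individual i out
        po[i] = n * full - (n - 1) * km_survival(time[keep], event[keep], t)
    return po

# Illustrative data (made up): 1 = event observed, 0 = censored.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
events = [1, 0, 1, 1, 0]
po = pseudo_observations(times, events, t=2.5)
```

Without censoring, each PO reduces to the indicator that the individual survives beyond t, which is why POs can stand in for the partially unobserved response in a regression model.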
We aim to utilize POs in prognostic modelling for specified time points when the objective is to provide an up-to-date estimate of long-term survival, a problem that is typically addressed with period analysis. In this approach, only risk times and events within a defined recent period window inform survival estimates, providing more up-to-date predictions than standard full-cohort approaches, which may underestimate survival as prognosis improves over time (3).
However, defining this period window introduces left-truncated data, as individuals enter the study after a specified calendar time point. This complicates analysis because late-entry individuals receive POs without contributing to the Kaplan-Meier (K-M) estimate at the time of interest, potentially distorting survival probabilities (4).
This study aims to refine POs for better alignment with K-M estimates in left-truncated scenarios, enhancing prognostic accuracy and fully incorporating both right-censored and left-truncated data, resulting in updated survival estimates within period window analysis.
Methods: We demonstrate this approach with a prognostic model using historical data for colon cancer patients aged 18-99, diagnosed between 1975 and 1994. A stratified approach to POs is introduced to address left truncation in prognostic modelling. First, POs are calculated without delayed entry to build a model categorizing individuals into risk groups based on covariates. A period window then introduces delayed entry, and POs are recalculated within each risk group, averaged, and compared with K-M estimates for agreement. Once consistency is established, the new POs are used to update the baseline of the prognostic model in period analysis. We compare a range of choices for how many risk groups are necessary to achieve good agreement.
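The recalculation step can be sketched as follows, under stated assumptions: an individual is taken to be at risk at event time u when entry < u <= exit, POs are formed by leave-one-out within each risk group, and the group mean is compared with the group-specific K-M estimate. All function and variable names here are hypothetical, and this is an illustrative sketch rather than the procedure as implemented by the authors.

```python
import numpy as np

def km_delayed_entry(entry, exit_, event, t):
    """K-M estimate of S(t) with delayed entry (risk set: entry < u <= exit)."""
    entry = np.asarray(entry, dtype=float)
    exit_ = np.asarray(exit_, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    for u in np.unique(exit_[event & (exit_ <= t)]):
        at_risk = np.sum((entry < u) & (exit_ >= u))
        deaths = np.sum(event & (exit_ == u))
        surv *= 1.0 - deaths / at_risk
    return surv

def stratified_pseudo_obs(entry, exit_, event, group, t):
    """Leave-one-out pseudo-observations computed separately within each risk group."""
    entry = np.asarray(entry, dtype=float)
    exit_ = np.asarray(exit_, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group)
    po = np.empty(len(exit_))
    for g in np.unique(group):
        idx = np.flatnonzero(group == g)
        n_g = len(idx)
        full = km_delayed_entry(entry[idx], exit_[idx], event[idx], t)
        for i in idx:
            rest = idx[idx != i]  # group members other than individual i
            loo = km_delayed_entry(entry[rest], exit_[rest], event[rest], t)
            po[i] = n_g * full - (n_g - 1) * loo
    return po

def compare_with_km(entry, exit_, event, group, t):
    """Return a list of (group, mean PO, K-M estimate) for an agreement check."""
    entry = np.asarray(entry, dtype=float)
    exit_ = np.asarray(exit_, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group)
    po = stratified_pseudo_obs(entry, exit_, event, group, t)
    rows = []
    for g in np.unique(group):
        m = group == g
        rows.append((g, po[m].mean(), km_delayed_entry(entry[m], exit_[m], event[m], t)))
    return rows

# Illustrative data (made up): two risk groups, some individuals entering late.
entry = [0.0, 0.0, 1.5, 0.0, 2.0, 0.0]
exit_ = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
event = [1, 1, 1, 1, 0, 1]
group = [0, 0, 0, 1, 1, 1]
agreement = compare_with_km(entry, exit_, event, group, t=2.5)
```

Repeating the comparison for different numbers of risk groups gives one way to examine how fine the stratification needs to be before the averaged POs and the K-M estimates agree.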
Results and Discussion: This approach demonstrates that average PO estimates within risk groups align closely with K-M estimates, supporting the use of stratified POs in prognostic models with period window analysis and left truncation. Updated POs can refine the baseline of prognostic models, helping account for survival improvements over time. However, a more general method, independent of predefined risk groups or a model, remains necessary. Methods such as inverse probability of censoring weighting offer an alternative approach.