8:30am - 8:50am
Estimation and variable selection in a joint model of survival times and longitudinal data with random effects
Antoine Caillebotte1,2, Estelle Kuhn1,2, Sarah Lemler1,3
1Université Paris-Saclay; 2INRAE UR MaIAGE; 3CentraleSupélec, Laboratoire MICS
This work considers a joint survival and mixed-effects model to explain the survival time from longitudinal data and high-dimensional covariates. The longitudinal data are modeled using a nonlinear mixed-effects model, whose regression function serves as a link function incorporated into a Cox model as a covariate; in this way, the longitudinal data are related to the survival time. Additionally, the Cox model takes into account high-dimensional covariates. The main objectives of this research are two-fold: to identify the relevant covariates that contribute to explaining survival time, and to estimate all unknown parameters of the joint model, in particular the strength of the link between the two submodels. For that purpose, we consider the estimator obtained by maximizing a LASSO-penalized likelihood. To tackle the optimization problem, we implement a pre-conditioned stochastic gradient to handle the latent variables of the nonlinear mixed-effects model, combined with a proximal operator to manage the non-differentiability of the LASSO penalty. Variable selection relies on the LASSO penalization and therefore on a regularization parameter, chosen according to the eBIC criterion, which is better suited to the high-dimensional context. We provide a simulation study showcasing the performance of the proposed variable selection and parameter estimation method.
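The core of the optimization step described above can be illustrated with a small sketch. The proximal operator of the L1 (LASSO) penalty is soft-thresholding, and a pre-conditioned proximal gradient update applies a gradient step on the smooth likelihood part followed by that operator. This is a generic illustration of the technique, not the authors' implementation; the function names and the diagonal form of the pre-conditioner are assumptions for the example.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 norm (LASSO penalty):
    # shrinks each coordinate toward zero by t, truncating at zero.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proximal_gradient_step(beta, grad, step, lam, precond=None):
    # One pre-conditioned proximal gradient step: move along the
    # (stochastic) gradient of the smooth log-likelihood part, then
    # apply the prox of the non-differentiable L1 part.
    # `precond` is assumed diagonal here for simplicity.
    if precond is None:
        precond = np.ones_like(beta)
    return soft_threshold(beta - step * precond * grad, step * precond * lam)
```

Coordinates whose (pre-conditioned) update falls below the threshold are set exactly to zero, which is what makes the LASSO perform variable selection.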
8:50am - 9:10am
Sign-flip test for coefficients in the Cox regression model
Riccardo De Santis1, Jelle J. Goeman2, Hein Putter2, Livio Finos3
1University of Siena, Italy; 2Leiden University Medical Center, The Netherlands; 3University of Padova, Italy
The Cox regression model is a popular tool in survival analysis, whose aim is to quantify the impact of covariates on survival times. The relevance of the coefficients is usually assessed through a parametric test. However, the properties of this test are only asymptotic, and convergence to the nominal level can be slow. We propose a different approach to testing the coefficients, based on sign-flipping of the score contributions. Simulations show a faster convergence to the nominal level for the proposed method. Further, we embed the new test in a permutation-based framework to handle testing of multiple coefficients, which is especially relevant in high-dimensional problems. We show the potential of our proposal through a real data application in genomics.
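The general idea of a sign-flip test on score contributions can be sketched as follows: under the null hypothesis, the per-subject score contributions are (approximately) symmetric around zero, so randomly flipping their signs generates a reference distribution for the score statistic. This is a generic illustration of the resampling principle, not the authors' method for the Cox model; the function name and interface are assumptions for the example.

```python
import numpy as np

def sign_flip_pvalue(scores, n_flips=999, rng=None):
    # `scores`: per-subject score contributions for the coefficient
    # under H0 (their sum is the observed score statistic).
    # Randomly flip the signs of the contributions and count how often
    # the flipped statistic is at least as extreme as the observed one.
    rng = np.random.default_rng(rng)
    observed = abs(scores.sum())
    count = 1  # include the identity flip in the reference set
    for _ in range(n_flips):
        flips = rng.choice([-1.0, 1.0], size=scores.size)
        if abs((flips * scores).sum()) >= observed:
            count += 1
    return count / (n_flips + 1)
```

Because the reference distribution is generated from the data themselves, the test's level does not rely on the asymptotic normality of the score statistic.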
9:10am - 9:30am
Targeted learning with right-censored data using the state learner
Anders Munch, Thomas Gerds
University of Copenhagen, Denmark
Targeted or debiased machine learning provides a methodology for combining data-adaptive estimators with asymptotically valid inference for interpretable estimands. In particular, using a super learner as a data-adaptive model selector offers a general framework for obtaining valid statistical inference without relying on a single pre-specified model. Super learning evaluates model performance using cross-validation. However, cross-validation based on right-censored data typically depends on a pre-specified model for the censoring distribution, which can be challenging to provide, especially for observational data. To address this, we propose a new super learner, the state learner, which jointly evaluates the performance of models for both the outcome and censoring distributions. The state learner uses the data to select a pair of models that are optimal for predicting the state-occupation probabilities characterizing the observed data distribution. This approach readily extends to settings with competing risks and is particularly well suited for use in combination with targeted learning. We discuss the theoretical properties of the state learner and demonstrate how it can be integrated with targeted learning for estimation of low-dimensional, interpretable estimands in a competing risks model observed under right-censoring.
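The selection principle described above can be illustrated with a deliberately simplified sketch: the observed data at each time point occupy one of three states (at risk, event observed, censored), and a pair of outcome and censoring models implies predicted state-occupation probabilities, which can be scored against the observed state indicators with a Brier-type loss. This is only an illustration of the scoring idea under these simplifying assumptions; the actual state learner uses cross-validation and handles competing risks, and all function names here are hypothetical.

```python
import numpy as np

def observed_state(time, status, t):
    # State of the observed data at time t:
    # 0 = still at risk, 1 = event observed by t, 2 = censored by t.
    if time > t:
        return 0
    return 1 if status == 1 else 2

def state_brier_loss(pred, times, statuses, grid):
    # pred[i, j, s]: predicted probability that subject i occupies
    # state s at grid time j, as implied by one candidate model pair.
    loss = 0.0
    for i, (time, status) in enumerate(zip(times, statuses)):
        for j, t in enumerate(grid):
            ind = np.zeros(3)
            ind[observed_state(time, status, t)] = 1.0
            loss += np.sum((ind - pred[i, j]) ** 2)
    return loss / (len(times) * len(grid))

def select_pair(candidates, times, statuses, grid):
    # candidates maps an (outcome model, censoring model) pair name to
    # its predicted state-occupation probabilities; pick the best pair.
    return min(candidates,
               key=lambda name: state_brier_loss(candidates[name], times, statuses, grid))
```

The key point is that the loss is computed on the fully observed state process, so no pre-specified censoring model is needed to evaluate candidate models.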
9:30am - 9:50am
Deep Learning for Survival Analysis: A Review
Simon Wiegrebe
LMU Munich, Germany
The influx of deep learning (DL) techniques into the field of survival analysis in recent years has led to substantial methodological progress, for instance in learning from unstructured or high-dimensional data such as images, text, or omics data.
We conducted a comprehensive systematic review of 61 DL-based methods for time-to-event analysis, classifying all methods along a set of dimensions we defined. These dimensions include survival-related aspects, such as the ability to handle different survival tasks (different types of censoring and truncation schemes, competing risks, multi-state models, etc.) and non-proportional hazards (time-varying effects and features); neural-network-related aspects such as architecture; and estimation-related aspects (model family, target of estimation, etc.).
On the one hand, this work provides a snapshot of the current state of the field and identifies potential gaps for future research; on the other hand, it provides a scheme according to which future methods can be categorized.
From a technical perspective, we found that most methodologically innovative methods were survival-specific applications of novel methods developed in other areas of DL, such as computer vision or NLP. This usually yielded a more flexible estimation of associations of (structured and unstructured) features with the outcome, rather than solving survival-specific problems. Outcome types beyond right-censoring and competing risks, as well as the handling of missing values, were rarely addressed, and little attention has been paid to optimization (e.g., choice of optimizer, hyperparameter tuning, or neural architecture search). Additionally, some challenges are specifically DL-related, in particular batching.
From an application-centered perspective, DL-based survival methods have mostly been deployed to estimate patient survival from medical images or (multi-)omics data, with some methods explicitly motivated by a specific clinical use case (e.g., a specific cancer type). Other areas of application of DL-based survival methods included improved estimation of prognostic indices and of recurrence after cancer surgery.
In more general terms, we observed that the lack of openly accessible, high-dimensional, potentially multimodal datasets remains a major challenge for the development and training of novel DL-based survival methods.
In summary, deep survival methodology has advanced substantially in recent years and will certainly continue to benefit from developments in general DL, with major methodological advances likely to carry over. The results of our review are summarized in an interactive online table (https://survival-org.github.io/DL4Survival), which can be extended through pull requests.