SAMPLE SIZE ESTIMATION FOR CANCER PROGRESSION MODELS

Christian Netzer and Jörg Rahnenführer

Keywords

Genetic progression models, sample size, survival analysis, simulation study

Abstract

Human tumours are often associated with the accumulation of chromosomal alterations in the cancer cells. Identifying characteristic pathogenic routes improves the prediction of survival times and the choice of optimal therapy. The simplest model assumes independent alterations; progression is then measured by the count statistic, the total number of alterations. A more advanced model is the oncogenetic trees mixture model. An oncogenetic tree allows both independent and sequential relationships between alterations, and the mixture model divides the patients into groups with different progression paths. Progression along such a model can be quantified univariately by the genetic progression score (GPS). On real cancer data, the GPS was shown to discriminate better than the count statistic between patient subgroups with different survival prognoses. Here, in a simulation study, we evaluate the number of patients necessary to detect true relationships between genetic progression and survival time. We generate survival times correlated with the count statistic and with the GPS, respectively. If the simple model is the correct one, misspecification with the advanced model requires a sample size about 20% larger, independent of the number of events. In contrast, misspecification with the simple model requires a sample size that is 20% to 70% larger, with the excess growing as the number of events increases. Additionally, if the true data-generating model is the mixture model, the absolute numbers of patients required are more than twice as large, thus favouring the advanced modelling approach, especially in situations with limited model knowledge.
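The simulation idea for the simple model can be illustrated with a minimal sketch. It is not the authors' procedure: it covers only independent alterations and the count statistic (no oncogenetic trees, no GPS, no censoring), and it replaces a proper survival analysis with a Fisher z-test on the correlation between the count statistic and the log survival time. All function names and parameter values (`n_events`, `p_alt`, `beta`, the candidate sample sizes) are hypothetical choices made for illustration.

```python
import math
import random


def simulate_patient(rng, n_events=8, p_alt=0.3, beta=0.3):
    """Return (count statistic, survival time) for one hypothetical patient."""
    # Simple model: each of n_events alterations occurs independently.
    count = sum(rng.random() < p_alt for _ in range(n_events))
    # Assumed link: exponential survival time whose hazard grows with the count.
    hazard = math.exp(beta * count)
    return count, rng.expovariate(hazard)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def detects_association(rng, n_patients, z_crit=1.96, **model):
    """Two-sided Fisher z-test (stand-in for a survival model) on one data set."""
    data = [simulate_patient(rng, **model) for _ in range(n_patients)]
    counts = [c for c, _ in data]
    # Guard against log(0) for a (practically impossible) zero survival time.
    log_times = [math.log(max(t, 1e-12)) for _, t in data]
    r = max(min(pearson(counts, log_times), 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n_patients - 3)
    return abs(z) > z_crit


def estimated_power(n_patients, n_sim=500, seed=1, **model):
    """Fraction of simulated studies in which the association is detected."""
    rng = random.Random(seed)
    hits = sum(detects_association(rng, n_patients, **model) for _ in range(n_sim))
    return hits / n_sim


def required_sample_size(target_power=0.8, candidates=range(20, 401, 20), **model):
    """Smallest candidate sample size reaching the target power, else None."""
    for n in candidates:
        if estimated_power(n, **model) >= target_power:
            return n
    return None
```

Calling `required_sample_size(beta=0.3)` scans the candidate sizes and returns the first one whose estimated power reaches 80%. Comparing such numbers between a correctly specified and a misspecified progression score is the kind of comparison the abstract reports, although the study itself uses the models described above rather than this simplified test.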
