The often demand in the biostatistical research is to group patients depending on explanatory variables that are continuous. In some cases the requirement is to test overall survival of the subjects that suffer on a mutation in specific gene and have high expression (over expression) in other given gene. To visualize differences in the Kaplan-Meier estimates of survival curves between groups, first the discretization of continuous variable is performed. Problems caused by categorization of continuous variables are known and widely spread (Harrel, 2015), but in this case there appear a simplification requirement for the discretization. In this post I present the maxstat(maximally selected rank statistics) statistic to determine the optimal cutpoint for continuous variables, which was provided it in the survminer package by Alboukadel Kassambara kassambara.
In this post I will use data from TCGA study, that are provided in the RTCGA package Star and survminerStar package to determine the optimal cutpoint for continuous variable.